fix: reject host-less catalog URLs in base and preset validators (#3209)#3227
Open
Noor-ul-ain001 wants to merge 1 commit into
Open
fix: reject host-less catalog URLs in base and preset validators (#3209)#3227Noor-ul-ain001 wants to merge 1 commit into
Noor-ul-ain001 wants to merge 1 commit into
Conversation
…hub#3209) `CatalogStackBase._validate_catalog_url` (inherited by `IntegrationCatalog`) and `PresetCatalog._validate_catalog_url` checked `parsed.netloc`, which is truthy for host-less URLs like `https://:8080` (port only) or `https://user@` (userinfo only). Such URLs slipped past validation despite the error message promising "a valid URL with a host", then failed later with a confusing fetch error. Switch both validators to `parsed.hostname` (None for those inputs), matching the workflow, step, and bundler catalog validators that already do this. Add regression tests covering port-only and userinfo-only URLs for both validators. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Contributor
There was a problem hiding this comment.
Pull request overview
This PR fixes URL host validation in the remaining catalog URL validators that were incorrectly relying on urlparse(...).netloc (which can be truthy even when no host is present). It aligns these validators with the rest of the codebase by switching to parsed.hostname, and adds regression tests to prevent reintroducing the host-less URL acceptance bug.
Changes:
- Update
CatalogStackBase._validate_catalog_urlandPresetCatalog._validate_catalog_urlto reject host-less URLs by checkingparsed.hostnameinstead ofparsed.netloc. - Add parametrized regression tests covering port-only (
https://:8080...) and userinfo-only (https://user@...) host-less URLs for both validators.
Show a summary per file
| File | Description |
|---|---|
| src/specify_cli/catalogs.py | Switch catalog URL host validation from netloc to hostname to correctly reject host-less URLs. |
| src/specify_cli/presets/init.py | Align preset catalog URL validation with other validators by checking hostname to reject host-less URLs. |
| tests/integrations/test_integration_catalog.py | Add regression coverage ensuring IntegrationCatalog rejects host-less URLs that previously slipped through. |
| tests/test_presets.py | Add regression coverage ensuring PresetCatalog rejects host-less URLs that previously slipped through. |
Review details
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 4/4 changed files
- Comments generated: 0
- Review effort level: Low
mnriem
requested changes
Jun 29, 2026
mnriem
left a comment
Collaborator
There was a problem hiding this comment.
Please resolve conflicts by pulling in upstream/main
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Two catalog URL validators rejected host-less URLs incorrectly because they checked
parsed.netlocinstead ofparsed.hostname.netlocis truthy for URLs likehttps://:8080(port only) orhttps://user@(userinfo only), which have no host — so they slipped past validation even though the error message promises "a valid URL with a host". The bad URL was then accepted and only failed later with a confusing fetch error.Fixes #3209.
How
Switch both stragglers to
parsed.hostname(which isNonefor these inputs), aligning them with the sibling validators that already do this correctly:src/specify_cli/catalogs.py—CatalogStackBase._validate_catalog_url(inherited byIntegrationCatalog)src/specify_cli/presets/__init__.py—PresetCatalog._validate_catalog_urlThe workflow (
workflows/catalog.py), step, and bundler (bundler/commands_impl/catalog_config.py) validators already check.hostname— the bundler even documents why. This is purely an alignment fix; HTTPS/localhost behavior is unchanged.Tests
Added parametrized regression tests for both validators covering port-only (
https://:8080,https://:8080/catalog.json) and userinfo-only (https://user@,https://user:pass@) URLs:tests/integrations/test_integration_catalog.py::TestCatalogURLValidation::test_hostless_url_rejectedtests/test_presets.py::TestPresetCatalog::test_validate_catalog_url_hostless_rejectedVerified the new tests fail on
main(host-less URLs wrongly accepted) and pass with the fix. Existing HTTPS / HTTP-rejected / localhost-allowed / missing-host tests still pass.ruffclean.🤖 Generated with Claude Code