fix: reject host-less catalog URLs in base and preset validators (#3209) #3227

Open
Noor-ul-ain001 wants to merge 1 commit into github:main from Noor-ul-ain001:fix/3209-hostless-catalog-url

Conversation

@Noor-ul-ain001

Contributor

What

Two catalog URL validators wrongly accepted host-less URLs because they checked parsed.netloc instead of parsed.hostname. netloc is truthy for URLs like https://:8080 (port only) or https://user@ (userinfo only), which have no host, so they slipped past validation even though the error message promises "a valid URL with a host". The bad URL then failed only later, with a confusing fetch error.

Fixes #3209.

How

Switch both stragglers to parsed.hostname (which is None for these inputs), aligning them with the sibling validators that already do this correctly:

  • src/specify_cli/catalogs.py: CatalogStackBase._validate_catalog_url (inherited by IntegrationCatalog)
  • src/specify_cli/presets/__init__.py: PresetCatalog._validate_catalog_url

The workflow (workflows/catalog.py), step, and bundler (bundler/commands_impl/catalog_config.py) validators already check .hostname — the bundler even documents why. This is purely an alignment fix; HTTPS/localhost behavior is unchanged.

from urllib.parse import urlparse
p = urlparse("https://:8080")
print(bool(p.netloc), p.hostname)   # True None  <- netloc truthy, no host
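
For illustration, the hostname-based check described above could look roughly like the sketch below. This is not the project's code: the function name validate_catalog_url, the CatalogURLError type, the exact error wording, and the set of hosts treated as localhost are all assumptions made for the example.

```python
from urllib.parse import urlparse


class CatalogURLError(ValueError):
    """Hypothetical error type for this sketch; the project's exception may differ."""


# Assumption: which hosts count as "localhost" is not stated in this PR.
LOCAL_HOSTS = ("localhost", "127.0.0.1", "::1")


def validate_catalog_url(url: str) -> None:
    """Raise CatalogURLError unless url is HTTPS (or HTTP on localhost) with a host."""
    parsed = urlparse(url)
    # hostname is None for "https://:8080" and "https://user@", whereas
    # netloc would be ":8080" and "user@", both of which are truthy.
    host = parsed.hostname
    is_localhost = host in LOCAL_HOSTS
    if parsed.scheme != "https" and not (parsed.scheme == "http" and is_localhost):
        raise CatalogURLError("Catalog URL must use HTTPS")
    if not host:
        raise CatalogURLError("Catalog URL must be a valid URL with a host")
```

With this shape, validate_catalog_url("https://:8080") raises, while validate_catalog_url("https://example.com/catalog.json") returns without error.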

Tests

Added parametrized regression tests for both validators covering port-only (https://:8080, https://:8080/catalog.json) and userinfo-only (https://user@, https://user:pass@) URLs:

  • tests/integrations/test_integration_catalog.py::TestCatalogURLValidation::test_hostless_url_rejected
  • tests/test_presets.py::TestPresetCatalog::test_validate_catalog_url_hostless_rejected

Verified the new tests fail on main (host-less URLs wrongly accepted) and pass with the fix. Existing HTTPS / HTTP-rejected / localhost-allowed / missing-host tests still pass. ruff clean.
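
A rough, standalone sketch of what such a regression test exercises is shown below. It uses unittest's subTest rather than the parametrized tests in this PR, and has_host is a hypothetical stand-in for the validators' host check, not a function from the codebase.

```python
import unittest
from urllib.parse import urlparse

# The four host-less inputs named in this PR description.
HOSTLESS_URLS = [
    "https://:8080",
    "https://:8080/catalog.json",
    "https://user@",
    "https://user:pass@",
]


def has_host(url: str) -> bool:
    """Hypothetical stand-in for the fixed check: True only if a host is present."""
    return urlparse(url).hostname is not None


class TestHostlessURLs(unittest.TestCase):
    def test_hostless_url_rejected(self):
        for url in HOSTLESS_URLS:
            with self.subTest(url=url):
                # The old netloc-based check sees a truthy value here...
                self.assertTrue(urlparse(url).netloc)
                # ...while the hostname-based check reports no host.
                self.assertFalse(has_host(url))
```

Each URL runs as its own sub-test, so a failure names the offending input instead of stopping at the first one.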

🤖 Generated with Claude Code

…hub#3209)

`CatalogStackBase._validate_catalog_url` (inherited by `IntegrationCatalog`)
and `PresetCatalog._validate_catalog_url` checked `parsed.netloc`, which is
truthy for host-less URLs like `https://:8080` (port only) or `https://user@`
(userinfo only). Such URLs slipped past validation despite the error message
promising "a valid URL with a host", then failed later with a confusing fetch
error.

Switch both validators to `parsed.hostname` (None for those inputs), matching
the workflow, step, and bundler catalog validators that already do this.

Add regression tests covering port-only and userinfo-only URLs for both
validators.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Contributor

Pull request overview

This PR fixes URL host validation in the remaining catalog URL validators that were incorrectly relying on urlparse(...).netloc (which can be truthy even when no host is present). It aligns these validators with the rest of the codebase by switching to parsed.hostname, and adds regression tests to prevent reintroducing the host-less URL acceptance bug.

Changes:

  • Update CatalogStackBase._validate_catalog_url and PresetCatalog._validate_catalog_url to reject host-less URLs by checking parsed.hostname instead of parsed.netloc.
  • Add parametrized regression tests covering port-only (https://:8080...) and userinfo-only (https://user@...) host-less URLs for both validators.

Summary per file:

File | Description
src/specify_cli/catalogs.py | Switch catalog URL host validation from netloc to hostname to correctly reject host-less URLs.
src/specify_cli/presets/__init__.py | Align preset catalog URL validation with other validators by checking hostname to reject host-less URLs.
tests/integrations/test_integration_catalog.py | Add regression coverage ensuring IntegrationCatalog rejects host-less URLs that previously slipped through.
tests/test_presets.py | Add regression coverage ensuring PresetCatalog rejects host-less URLs that previously slipped through.

Review details

  • Files reviewed: 4/4 changed files
  • Comments generated: 0
  • Review effort level: Low

@mnriem left a comment

Collaborator

Please resolve conflicts by pulling in upstream/main

Development

Successfully merging this pull request may close these issues.

Host-less catalog URLs (e.g. https://:8080) pass URL validation in base and preset validators