Skip to content

Migrate JWT/JWK auto-refresh from RPCSessions to curl multi singleton#7989

Merged
achamayou merged 20 commits into
copilot/jwk-refresh-curl-multi-planfrom
copilot/copilotjwk-refresh-curl-multi-plan
Jun 29, 2026
Merged

Migrate JWT/JWK auto-refresh from RPCSessions to curl multi singleton#7989
achamayou merged 20 commits into
copilot/jwk-refresh-curl-multi-planfrom
copilot/copilotjwk-refresh-curl-multi-plan

Conversation

Copilot AI commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Migrates JWT/JWK auto-refresh outbound HTTP fetches away from the legacy RPCSessions::create_client() path and onto ccf::curl::CurlRequest attached to CurlmLibuvContextSingleton, following the curl multi singleton direction introduced in #7102.

Core changes

  • src/node/jwt_key_auto_refresh.h

    • Fetches both OpenID metadata and JWKS via curl multi singleton requests.
    • Removes the RPCSessions dependency from JwtKeyAutoRefresh.
    • Bounds metadata/JWKS response bodies with the configurable jwt.key_refresh_max_response_size node startup config setting (default 1 MB).
    • Sets bounded connect and transfer timeouts for outbound refresh requests.
    • Restricts curl protocols to HTTPS.
    • Verifies TLS peer certificates and hostnames using the configured governance CA bundle.
    • Clears libcurl's default CA path so validation is pinned to the configured CA bundle rather than also trusting system roots.
    • Reports curl connection/TLS failures through send_refresh_jwt_keys_error(), so refresh failure metrics include network-level fetch failures.
    • Uses weak ownership plus a stopped flag for scheduled tasks and curl callbacks so late async responses are ignored during shutdown instead of extending auto-refresh lifetime.
  • src/node/node_state.h

    • Constructs JwtKeyAutoRefresh without passing rpcsessions.
    • Passes the configured jwt.key_refresh_max_response_size value into auto-refresh.
  • src/http/curl.h

    • Lowers expected stop-path callback logs from fail to debug.
    • Checks curl_easy_getinfo and curl_multi_get_handles results in curl cleanup paths.
    • Checks curl_slist_append failures without leaking the existing list.

Metadata and JWKS behavior

  • Parses optional OpenID metadata issuer constraints inside the metadata parse error path, so malformed/non-string values are recorded as refresh failures rather than escaping the task.
  • Applies metadata issuer constraints to JWKS keys before constructing the internal SetJwtPublicSigningKeys request.
  • Validates jwks_uri before handing it to curl:
    • URI must parse successfully
    • scheme must be HTTPS
  • Allows jwks_uri host/port to differ from the issuer authority. OIDC Discovery requires HTTPS for jwks_uri, but does not require matching issuer/JWKS authorities; Google is a common provider where these authorities differ.

Configuration and docs

  • Adds jwt.key_refresh_max_response_size to startup config, JSON config handling, config schema, generated operations docs, test infra config generation, and JWT app docs.
  • Moves the PR changelog entries under 7.0.7 and updates python/pyproject.toml to 7.0.7.
  • Removes the temporary migration plan document.

Tests

Adds/updates JWT auto-refresh coverage for:

  • Initial auto-refresh on primary and after primary failover.
  • Curl connection failures incrementing refresh failure metrics.
  • TLS trust failures incrementing refresh failure metrics.
  • Invalid OpenID metadata issuer fields being recorded as refresh failures.
  • Cross-authority HTTPS jwks_uri succeeding.
  • Response size limit failures and recovery after restoring metadata.
  • Existing auto-refresh behavior and backup-primary auto-refresh coverage.

Validation run locally

  • ninja -C build logging curl_test
  • cd build && ../tests/.venv/bin/python ../tests/e2e_curl.py
  • Direct jwt_test.run_manual(args) against rebuilt build/samples/apps/logging/logging

Copilot AI and others added 3 commits June 25, 2026 20:50
Replace rpcsessions->create_client() HTTP fetches in JwtKeyAutoRefresh
with ccf::curl::CurlRequest attached to CurlmLibuvContextSingleton,
following the pattern established in #7102 for QuoteEndorsementsClient.

Changes:
- jwt_key_auto_refresh.h: inherit enable_shared_from_this, add
  send_curl_get() helper with CURLOPT_CAINFO_BLOB CA bundle support,
  migrate both metadata and JWKS fetches to curl with task-deferred
  response callbacks
- node_state.h: remove rpcsessions parameter from JwtKeyAutoRefresh
  constructor call
- CHANGELOG.md: add [Unreleased] entry for this change

Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>
Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>
Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>
Copilot AI changed the title [WIP] Migrate JWT/JWK auto-refresh to curl multi singleton client Migrate JWT/JWK auto-refresh from RPCSessions to curl multi singleton Jun 25, 2026
Copilot AI requested a review from achamayou June 25, 2026 20:54
The JWT issuer test server creates TLS certs using generate_cert which
hashes the CN (e.g. sha256("localhost")), so libcurl's hostname
verification fails when connecting to the server because neither the CN
nor any SAN matches the actual hostname.

Add an optional `san` parameter to `generate_cert` that, when provided,
adds a DNS Subject Alternative Name extension to the certificate. Update
JwtIssuer._generate_auth_data to pass the hostname as the SAN so that
libcurl can successfully verify the server certificate.

Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>
@achamayou

Copy link
Copy Markdown
Member

@copilot look at the test failures to improve your change.

…etrics

Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>

Copilot AI commented Jun 26, 2026

Copy link
Copy Markdown
Contributor Author

@copilot look at the test failures to improve your change.

Fixed three CI failures in commit Fix CI failures: clang-format, CHANGELOG version, JWT test baseline metrics:

  1. C/C++ format: Applied clang-format to src/node/jwt_key_auto_refresh.h (long lambda capture lists were not wrapped per style rules).

  2. Release notes: Moved the changelog entry from ## [Unreleased] into the existing ## [7.0.6] section under a new ### Changed heading, matching the version in python/pyproject.toml.

  3. programmability_and_jwt test failure: The new code calls send_refresh_jwt_keys_error() for curl-level connection failures, which is the observability improvement described in the PR. However, when the test runs a second time after a primary failover (line 794 in jwt_test.py), the new primary fires the auto-refresh while the issuer server is momentarily down — generating a failure count before the test's server starts. The fix captures a baseline_m snapshot right before start_openid_server and changes the assertions to delta comparisons (m["failures"] == baseline_m["failures"]) so pre-existing connection failures from between test invocations are excluded.

achamayou and others added 5 commits June 26, 2026 09:18
Use HTTPS-only curl requests with bounded timeouts for JWT/JWK auto-refresh, and add coverage for connection and TLS refresh failures.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Normalize the parsed jwks_uri scheme before enforcing HTTPS-only refreshes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@achamayou achamayou added the run-long-test Run Long Test job label Jun 26, 2026
@achamayou achamayou requested a review from cjen1-msft June 26, 2026 19:51
@achamayou achamayou marked this pull request as ready for review June 29, 2026 20:09
@achamayou achamayou requested a review from a team as a code owner June 29, 2026 20:09
@achamayou achamayou merged commit 5ce8852 into copilot/jwk-refresh-curl-multi-plan Jun 29, 2026
21 of 22 checks passed
@achamayou achamayou deleted the copilot/copilotjwk-refresh-curl-multi-plan branch June 29, 2026 20:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

run-long-test Run Long Test job

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants