diff --git a/CHANGELOG.md b/CHANGELOG.md index 5e2ef85f32c8..db2b7ea8fe4a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,15 @@ All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html). +## [7.0.7] + +[7.0.7]: https://github.com/microsoft/CCF/releases/tag/ccf-7.0.7 + +### Changed + +- JWT/JWK auto-refresh outbound HTTP fetches (OpenID metadata and JWKS) now use the curl multi singleton client introduced in #7102, replacing the previous `RPCSessions::create_client()` path. Connection and TLS failures are now counted in refresh failure metrics via `send_refresh_jwt_keys_error()`, improving observability of network-level refresh errors (#7989). +- JWT/JWK auto-refresh now supports configuring the maximum response body size for fetched OpenID metadata and JWKS via the `jwt.key_refresh_max_response_size` node startup config setting (#7989). + ## [7.0.6] [7.0.6]: https://github.com/microsoft/CCF/releases/tag/ccf-7.0.6 diff --git a/doc/build_apps/auth/jwt.rst b/doc/build_apps/auth/jwt.rst index 48dd1fcdd4c9..92e9856d877a 100644 --- a/doc/build_apps/auth/jwt.rst +++ b/doc/build_apps/auth/jwt.rst @@ -107,7 +107,7 @@ Now the issuer can be created with auto-refresh enabled: .. note:: - The key refresh interval is set via the ``jwt.key_refresh_interval_s`` configuration entry, where the default is 30 min (1800 seconds). + The key refresh interval is set via the ``jwt.key_refresh_interval`` configuration entry, where the default is 30 min (1800 seconds). The maximum response body size accepted when fetching OpenID metadata and JWKS is set via ``jwt.key_refresh_max_response_size``, where the default is 1 MB. Removing a token issuer ----------------------- diff --git a/doc/dev/jwk_refresh_curl_multi_migration_plan.md b/doc/dev/jwk_refresh_curl_multi_migration_plan.md deleted file mode 100644 index 5df21a83a738..000000000000 --- a/doc/dev/jwk_refresh_curl_multi_migration_plan.md +++ /dev/null @@ -1,182 +0,0 @@ -# JWK refresh migration to curl multi singleton - -This note captures the implementation plan for migrating JWT/JWK auto-refresh in CCF away from the current `RPCSessions::create_client()` outbound HTTP client and onto the curl multi singleton infrastructure introduced in microsoft/CCF#7102. - -## Reference PR - -PR microsoft/CCF#7102 introduced: - -- `ccf::curl::UniqueCURL` -- `ccf::curl::UniqueSlist` -- `ccf::curl::RequestBody` -- `ccf::curl::ResponseBody` -- `ccf::curl::ResponseHeaders` -- `ccf::curl::CurlRequest` -- `ccf::curl::CurlmLibuvContextSingleton` - -It migrated `src/node/quote_endorsements_client.h` from the enclave/host RPC client path to asynchronous curl requests attached to the singleton curl multi/libuv context. - -## Current JWK refresh flow - -The current JWK refresh implementation is in: - -- `src/node/jwt_key_auto_refresh.h` - -It uses `rpcsessions->create_client(...)` in two places: - -1. OpenID metadata fetch: - - GET `issuer + "/.well-known/openid-configuration"` - - parse `jwks_uri` -2. JWKS fetch: - - GET the parsed `jwks_uri` - - parse `JsonWebKeySet` - - call the internal `/node/jwt_keys/refresh` endpoint via `send_refresh_jwt_keys(...)` - -The internal update path should remain unchanged. - -## Implementation goals - -- Replace both external HTTP fetches with `ccf::curl::CurlRequest`. -- Enqueue asynchronous requests with `ccf::curl::CurlmLibuvContextSingleton::get_instance()->attach_request(...)`. -- Preserve the existing refresh, parsing, issuer constraint, and metrics behaviour. -- Improve observability where possible: curl connection/TLS failures should call `send_refresh_jwt_keys_error()` so they are reflected in refresh failure metrics. -- Keep changes bite-sized and testable. - -## Bite-sized implementation steps - -### 1. Add curl dependency to JWK refresh - -- Include `http/curl.h` from `src/node/jwt_key_auto_refresh.h`. -- Keep the existing implementation compiling unchanged. - -Verification: - -- Build affected targets. - -### 2. Add request-building helpers - -Add private helpers to `JwtKeyAutoRefresh` for: - -- defaulting URL ports to `443` for HTTPS -- building a full request URL from `http::URL` -- setting common curl options -- attaching CA bundle material to curl -- creating a bounded `ccf::curl::ResponseBody` - -Verification: - -- Unit-level compile checks. -- Existing JWT tests still pass before call sites are migrated. - -### 3. Implement CA bundle handling for curl - -The existing code reads a PEM CA bundle from KV and constructs `tls::CA`/`tls::Cert` for the old client. - -For curl, prefer in-memory CA bundle support if available: - -- `CURLOPT_CAINFO_BLOB` via `UniqueCURL::set_blob_opt(...)` - -If unsupported by the project's supported libcurl version, introduce a request-owned fallback that keeps temporary CA material alive for the async request lifetime. - -Verification: - -- Valid issuer cert signed by configured CA bundle succeeds. -- Wrong CA bundle fails and is counted as a refresh failure. - -### 4. Migrate OpenID metadata fetch first - -In `refresh_jwt_keys()`: - -- Preserve issuer iteration, `auto_refresh` checks, attempts accounting, and missing CA bundle handling. -- Replace `rpcsessions->create_client(...)`, `connect(...)`, and `send_request(...)` with a curl GET request. -- Callback should: - - call `send_refresh_jwt_keys_error()` on `curl_response != CURLE_OK` - - otherwise pass status code and response body to `handle_jwt_metadata_response(...)` - -Expected signature change: - -```cpp -void handle_jwt_metadata_response( - const std::string& issuer, - std::string ca_cert_bundle_pem, - ccf::http_status status, - std::vector&& data); -``` - -Verification: - -- Metadata happy path still reaches JWKS fetch. -- Invalid metadata still increments failure metrics. -- Metadata connection failure increments failure metrics. - -### 5. Migrate JWKS fetch - -In `handle_jwt_metadata_response(...)`: - -- Preserve HTTP status handling, metadata parsing, `jwks_uri` parsing, and issuer constraint extraction. -- Replace the old client with a curl GET request using the same CA bundle material. -- Callback should: - - call `send_refresh_jwt_keys_error()` on `curl_response != CURLE_OK` - - otherwise pass status code and body to `handle_jwt_jwks_response(...)` - -Verification: - -- Existing JWT auto-refresh happy path passes. -- Invalid JWKS response increments failure metrics. -- HTTP non-200 response still triggers existing error handling. - -### 6. Clean up dependencies - -Once both external fetches use curl: - -- Remove the `rpcsessions` member from `JwtKeyAutoRefresh` if unused. -- Remove the constructor parameter if unused. -- Update all construction sites. -- Remove stale includes and comments about connection errors not being tracked. - -Verification: - -- Full build catches all construction-site updates. -- No `rpcsessions->create_client` usage remains in `jwt_key_auto_refresh.h`. - -### 7. Tests - -Add or update focused tests around: - -- successful JWT key auto-refresh -- initial one-off refresh after adding an issuer -- invalid OpenID metadata -- invalid JWKS -- unavailable issuer endpoint / connection failure -- TLS trust failure with an incorrect CA bundle - -Existing tests to keep passing include: - -- `test_jwt_key_auto_refresh` -- `test_jwt_key_initial_refresh` -- `test_jwt_key_auto_refresh_entries` - -### 8. Validation order - -1. Build affected C++ targets. -2. Run curl tests introduced by microsoft/CCF#7102 where available. -3. Run JWT auto-refresh tests. -4. Run broader JWT test file. -5. Check shutdown for libuv handle leaks or async lifetime issues. - -## Risks and things to watch - -- Async callback captures must own all data they use. -- CA bundle material must outlive async curl requests. -- Hostname/SNI verification should use the actual target host, especially for `jwks_uri`. -- Response bodies should be bounded rather than using `SIZE_MAX` where possible. -- Curl connection failures will now be counted as refresh failures; this is an intentional observability improvement but should be noted in the PR. -- The internal `/node/jwt_keys/refresh` update path should remain unchanged. - -## Definition of done - -- `src/node/jwt_key_auto_refresh.h` no longer uses `rpcsessions->create_client(...)` for OpenID/JWKS external fetches. -- Metadata and JWKS requests use `ccf::curl::CurlRequest` and `CurlmLibuvContextSingleton`. -- Existing JWT auto-refresh behaviour remains intact. -- New failure-mode tests cover connection and TLS failures. -- The PR description explains the relationship to microsoft/CCF#7102 and calls out the metrics-observability improvement. diff --git a/doc/host_config_schema/host_config.json b/doc/host_config_schema/host_config.json index 30a48cdeaf69..ec7c776871ca 100644 --- a/doc/host_config_schema/host_config.json +++ b/doc/host_config_schema/host_config.json @@ -653,6 +653,11 @@ "type": "string", "default": "30min", "description": "Interval at which JWT keys for issuers registered with auto-refresh are automatically refreshed" + }, + "key_refresh_max_response_size": { + "type": "string", + "default": "1MB", + "description": "Maximum response body size accepted when fetching OpenID metadata and JWKS for JWT issuer auto-refresh" } }, "description": "This section includes configuration for JWT issuers automatic refresh", diff --git a/include/ccf/node/startup_config.h b/include/ccf/node/startup_config.h index 457b418947eb..3d6cef413798 100644 --- a/include/ccf/node/startup_config.h +++ b/include/ccf/node/startup_config.h @@ -65,6 +65,7 @@ namespace ccf struct JWT { ccf::ds::TimeString key_refresh_interval = {"30min"}; + ccf::ds::SizeString key_refresh_max_response_size = {"1MB"}; bool operator==(const JWT&) const = default; }; diff --git a/python/pyproject.toml b/python/pyproject.toml index bcfe2aae22da..bde210cda2b8 100644 --- a/python/pyproject.toml +++ b/python/pyproject.toml @@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta" [project] name = "ccf" -version = "7.0.6" +version = "7.0.7" authors = [ { name="CCF Team", email="CCF-Sec@microsoft.com" }, ] diff --git a/samples/constitutions/default/actions.js b/samples/constitutions/default/actions.js index 20c9bd69bb0e..d1d3477ccd1e 100644 --- a/samples/constitutions/default/actions.js +++ b/samples/constitutions/default/actions.js @@ -505,12 +505,10 @@ function checkReconfigurationType(config, new_config) { const from = config.reconfiguration_type; const to = new_config.reconfiguration_type; if (from !== to && to !== undefined) { - if ( - !( - (from === undefined || from === "OneTransaction") && - to === "TwoTransaction" - ) - ) { + if (!( + (from === undefined || from === "OneTransaction") && + to === "TwoTransaction" + )) { throw new Error( `Cannot change reconfiguration type from ${from} to ${to}.`, ); diff --git a/samples/minimal_ccf/app/actions.js b/samples/minimal_ccf/app/actions.js index d04e54c83494..2df7c76d4fc5 100644 --- a/samples/minimal_ccf/app/actions.js +++ b/samples/minimal_ccf/app/actions.js @@ -485,12 +485,10 @@ function checkReconfigurationType(config, new_config) { const from = config.reconfiguration_type; const to = new_config.reconfiguration_type; if (from !== to && to !== undefined) { - if ( - !( - (from === undefined || from === "OneTransaction") && - to === "TwoTransaction" - ) - ) { + if (!( + (from === undefined || from === "OneTransaction") && + to === "TwoTransaction" + )) { throw new Error( `Cannot change reconfiguration type from ${from} to ${to}.`, ); diff --git a/src/common/configuration.h b/src/common/configuration.h index 29dcc2e08aca..d11b775b26e6 100644 --- a/src/common/configuration.h +++ b/src/common/configuration.h @@ -73,7 +73,8 @@ namespace ccf DECLARE_JSON_TYPE_WITH_OPTIONAL_FIELDS(CCFConfig::JWT); DECLARE_JSON_REQUIRED_FIELDS(CCFConfig::JWT); - DECLARE_JSON_OPTIONAL_FIELDS(CCFConfig::JWT, key_refresh_interval); + DECLARE_JSON_OPTIONAL_FIELDS( + CCFConfig::JWT, key_refresh_interval, key_refresh_max_response_size); DECLARE_JSON_TYPE_WITH_OPTIONAL_FIELDS(CCFConfig::Attestation::Environment); DECLARE_JSON_REQUIRED_FIELDS(CCFConfig::Attestation::Environment); diff --git a/src/http/curl.h b/src/http/curl.h index 5d41e52d10fa..9d822db864b6 100644 --- a/src/http/curl.h +++ b/src/http/curl.h @@ -164,7 +164,18 @@ namespace ccf::curl void append(const char* str) { - p.reset(curl_slist_append(p.release(), str)); + auto* current = p.get(); + auto* updated = curl_slist_append(current, str); + if (updated == nullptr) + { + throw std::runtime_error("Error calling curl_slist_append"); + } + if (updated != current) + { + auto* released = p.release(); + (void)released; + p.reset(updated); + } } void append(const std::string& key, const std::string& value) @@ -586,10 +597,18 @@ namespace ccf::curl // retrieve the request data and attach a lifetime to it ccf::curl::CurlRequest* request = nullptr; - curl_easy_getinfo(easy, CURLINFO_PRIVATE, &request); + try + { + CHECK_CURL_EASY_GETINFO(easy, CURLINFO_PRIVATE, &request); + } + catch (const std::runtime_error&) + { + CHECK_CURL_MULTI(curl_multi_remove_handle, p.get(), easy); + throw; + } if (request == nullptr) { - curl_multi_remove_handle(p.get(), easy); + CHECK_CURL_MULTI(curl_multi_remove_handle, p.get(), easy); throw std::runtime_error( "CURLMSG_DONE received with no associated request data"); } @@ -597,7 +616,7 @@ namespace ccf::curl // detach the easy handle such that it can be cleaned up with the // destructor of CurlRequest - curl_multi_remove_handle(p.get(), easy); + CHECK_CURL_MULTI(curl_multi_remove_handle, p.get(), easy); // handle response inline. Note that if this is expensive, it should // defer its work to a task @@ -664,7 +683,7 @@ namespace ccf::curl if (self->is_stopping) { - LOG_FAIL_FMT("async_requests_callback called while stopping"); + LOG_DEBUG_FMT("async_requests_callback called while stopping"); return; } @@ -694,7 +713,7 @@ namespace ccf::curl if (self->is_stopping) { - LOG_FAIL_FMT("libuv_timeout_callback called while stopping"); + LOG_DEBUG_FMT("libuv_timeout_callback called while stopping"); return; } @@ -722,7 +741,7 @@ namespace ccf::curl if (self->is_stopping) { - LOG_FAIL_FMT("curl_timeout_callback called while stopping"); + LOG_DEBUG_FMT("curl_timeout_callback called while stopping"); return 0; } @@ -763,7 +782,7 @@ namespace ccf::curl if (self->is_stopping) { - LOG_FAIL_FMT( + LOG_DEBUG_FMT( "libuv_socket_poll_callback called on {} while stopped", socket_context->socket); return; @@ -957,27 +976,56 @@ namespace ccf::curl is_stopping = true; // remove, stop and cleanup all curl easy handles - std::unique_ptr easy_handles( - curl_multi_get_handles(curl_request_curlm), - // NOLINTNEXTLINE(bugprone-multi-level-implicit-pointer-conversion) - [](CURL** handles) { curl_free(static_cast(handles)); }); - // curl_multi_get_handles returns the handles as a null-terminated array - for (size_t i = 0; easy_handles.get()[i] != nullptr; ++i) + auto* easy_handles_raw = curl_multi_get_handles(curl_request_curlm); + if (easy_handles_raw == nullptr) + { + LOG_FAIL_FMT("Error calling curl_multi_get_handles while closing"); + } + else { - auto* easy = easy_handles.get()[i]; - curl_multi_remove_handle(curl_request_curlm, easy); - if (easy != nullptr) + std::unique_ptr easy_handles( + easy_handles_raw, + // NOLINTNEXTLINE(bugprone-multi-level-implicit-pointer-conversion) + [](CURL** handles) { curl_free(static_cast(handles)); }); + // curl_multi_get_handles returns the handles as a null-terminated array + for (size_t i = 0; easy_handles.get()[i] != nullptr; ++i) { - // attach a lifetime to the request - ccf::curl::CurlRequest* request = nullptr; - curl_easy_getinfo(easy, CURLINFO_PRIVATE, &request); - if (request == nullptr) + auto* easy = easy_handles.get()[i]; + const auto remove_res = + curl_multi_remove_handle(curl_request_curlm, easy); + if (remove_res != CURLM_OK) { LOG_FAIL_FMT( - "CURLMSG_DONE received with no associated request data"); + "Error calling curl_multi_remove_handle while closing: {} ({})", + remove_res, + curl_multi_strerror(remove_res)); + } + if (easy != nullptr) + { + // attach a lifetime to the request + ccf::curl::CurlRequest* request = nullptr; + const auto getinfo_res = + curl_easy_getinfo(easy, CURLINFO_PRIVATE, &request); + if (getinfo_res != CURLE_OK) + { + LOG_FAIL_FMT( + "Error calling curl_easy_getinfo while closing: {} ({})", + getinfo_res, + curl_easy_strerror(getinfo_res)); + curl_easy_cleanup(easy); + continue; + } + if (request == nullptr) + { + LOG_FAIL_FMT( + "CURL easy handle had no associated request data while " + "closing"); + curl_easy_cleanup(easy); + continue; + } + std::unique_ptr request_data_ptr(request); + curl_easy_cleanup(easy); } - std::unique_ptr request_data_ptr(request); - curl_easy_cleanup(easy); } } // Drain the deque rather than letting it destruct diff --git a/src/node/jwt_key_auto_refresh.h b/src/node/jwt_key_auto_refresh.h index 9738c36d70db..34724b614522 100644 --- a/src/node/jwt_key_auto_refresh.h +++ b/src/node/jwt_key_auto_refresh.h @@ -3,47 +3,96 @@ #pragma once #include "ccf/ds/json.h" +#include "ccf/ds/nonstd.h" #include "ccf/service/tables/jwt.h" +#include "http/curl.h" #include "http/http_builder.h" #include "http/http_rpc_context.h" #include "node/rpc/node_frontend.h" +#include "tasks/basic_task.h" +#include "tasks/task_system.h" #define FMT_HEADER_ONLY +#include #include namespace ccf { class JwtKeyAutoRefresh + : public std::enable_shared_from_this { private: size_t refresh_interval_s; NetworkState& network; std::shared_ptr consensus; - std::shared_ptr rpcsessions; std::shared_ptr rpc_map; ccf::crypto::ECKeyPairPtr node_sign_kp; ccf::crypto::Pem node_cert; std::atomic_size_t attempts; + std::atomic_bool stopped; + size_t max_response_size; ccf::tasks::Task periodic_refresh_task; + static constexpr long request_connection_timeout_s = 5; + static constexpr long request_response_timeout_s = 5; + + void send_curl_get( + const std::string& url, + const std::string& ca_bundle_pem, + ccf::curl::CurlRequest::ResponseCallback callback) + { + ccf::curl::UniqueCURL curl_handle; + curl_handle.set_opt(CURLOPT_HTTPGET, 1L); + curl_handle.set_opt(CURLOPT_CONNECTTIMEOUT, request_connection_timeout_s); + curl_handle.set_opt(CURLOPT_TIMEOUT, request_response_timeout_s); + // 1L enables peer certificate verification. See libcurl docs: + // https://curl.se/libcurl/c/CURLOPT_SSL_VERIFYPEER.html + curl_handle.set_opt(CURLOPT_SSL_VERIFYPEER, 1L); + // 2L requires the certificate name to match the requested host. + // See libcurl docs: + // https://curl.se/libcurl/c/CURLOPT_SSL_VERIFYHOST.html + curl_handle.set_opt(CURLOPT_SSL_VERIFYHOST, 2L); + curl_handle.set_opt(CURLOPT_PROTOCOLS_STR, "https"); + curl_handle.set_blob_opt( + CURLOPT_CAINFO_BLOB, + reinterpret_cast(ca_bundle_pem.data()), + ca_bundle_pem.size()); + curl_handle.set_opt(CURLOPT_CAPATH, nullptr); + + ccf::curl::UniqueSlist headers; + + auto request = std::make_unique( + std::move(curl_handle), + HTTP_GET, + url, + std::move(headers), + nullptr, + std::make_unique(max_response_size), + std::move(callback)); + + ccf::curl::CurlmLibuvContextSingleton::get_instance()->attach_request( + std::move(request)); + } + public: JwtKeyAutoRefresh( size_t refresh_interval_s, NetworkState& network, const std::shared_ptr& consensus, - const std::shared_ptr& rpcsessions, const std::shared_ptr& rpc_map, ccf::crypto::ECKeyPairPtr node_sign_kp, - ccf::crypto::Pem node_cert) : + ccf::crypto::Pem node_cert, + size_t max_response_size) : refresh_interval_s(refresh_interval_s), network(network), consensus(consensus), - rpcsessions(rpcsessions), rpc_map(rpc_map), node_sign_kp(std::move(node_sign_kp)), node_cert(std::move(node_cert)), - attempts(0) + attempts(0), + stopped(false), + max_response_size(max_response_size) {} ~JwtKeyAutoRefresh() @@ -53,19 +102,28 @@ namespace ccf void start() { + stopped.store(false); LOG_DEBUG_FMT("JWT key initial auto-refresh"); - periodic_refresh_task = ccf::tasks::make_basic_task([this]() { - if (!this->consensus->can_replicate()) + const auto self = weak_from_this(); + periodic_refresh_task = ccf::tasks::make_basic_task([self]() { + const auto self_sp = self.lock(); + if (self_sp == nullptr || self_sp->stopped.load()) + { + return; + } + + if (!self_sp->consensus->can_replicate()) { LOG_DEBUG_FMT("JWT key auto-refresh: Node is not primary, skipping"); } else { - this->refresh_jwt_keys(); + self_sp->refresh_jwt_keys(); } LOG_DEBUG_FMT( - "JWT key auto-refresh: Scheduling in {}s", this->refresh_interval_s); + "JWT key auto-refresh: Scheduling in {}s", + self_sp->refresh_interval_s); }); const std::chrono::seconds period(refresh_interval_s); @@ -74,6 +132,7 @@ namespace ccf void stop() { + stopped.store(true); if (periodic_refresh_task != nullptr) { periodic_refresh_task->cancel_task(); @@ -83,15 +142,22 @@ namespace ccf void schedule_once() { LOG_DEBUG_FMT("JWT key one-off refresh: Scheduling without delay"); - ccf::tasks::add_task(ccf::tasks::make_basic_task([this]() { - if (!this->consensus->can_replicate()) + const auto self = weak_from_this(); + ccf::tasks::add_task(ccf::tasks::make_basic_task([self]() { + const auto self_sp = self.lock(); + if (self_sp == nullptr || self_sp->stopped.load()) + { + return; + } + + if (!self_sp->consensus->can_replicate()) { LOG_DEBUG_FMT( "JWT key one-off refresh: Node is not primary, skipping"); } else { - this->refresh_jwt_keys(); + self_sp->refresh_jwt_keys(); } })); } @@ -166,9 +232,6 @@ namespace ccf return; } - // call internal endpoint to update keys - auto msg = SetJwtPublicSigningKeys{issuer, jwks}; - // For each key we leave the specified issuer constraint or set a common // one otherwise (if present). if (issuer_constraint.has_value()) @@ -182,12 +245,15 @@ namespace ccf } } + // call internal endpoint to update keys + auto msg = SetJwtPublicSigningKeys{issuer, jwks}; + send_refresh_jwt_keys(msg); } void handle_jwt_metadata_response( const std::string& issuer, - std::shared_ptr<::tls::CA> ca, + std::string ca_bundle_pem, ccf::http_status status, std::vector&& data) { @@ -211,10 +277,16 @@ namespace ccf std::string jwks_url_str; nlohmann::json metadata; + std::optional issuer_constraint{std::nullopt}; try { metadata = ccf::parse_json_safe(data); jwks_url_str = metadata.at("jwks_uri").get(); + const auto constraint = metadata.find("issuer"); + if (constraint != metadata.end()) + { + issuer_constraint = constraint->get(); + } } catch (const std::exception& e) { @@ -226,6 +298,10 @@ namespace ccf send_refresh_jwt_keys_error(); return; } + // Validate jwks_uri before handing it to libcurl; the parsed result is + // not used directly since the full URL string is passed to curl. The + // JWKS host/port may differ from the issuer authority; OIDC Discovery + // requires HTTPS here, but does not require matching authorities. ::http::URL jwks_url; try { @@ -234,56 +310,93 @@ namespace ccf catch (const std::invalid_argument& e) { LOG_FAIL_FMT( - "JWT key auto-refresh: Cannot parse jwks_uri for issuer '{}': {}", + "JWT key auto-refresh: Cannot parse jwks_uri for issuer '{}': {} " + "({})", issuer, - jwks_url_str); + jwks_url_str, + e.what()); send_refresh_jwt_keys_error(); return; } - auto jwks_url_port = !jwks_url.port.empty() ? jwks_url.port : "443"; - - auto ca_cert = std::make_shared<::tls::Cert>( - ca, std::nullopt, std::nullopt, jwks_url.host); - std::optional issuer_constraint{std::nullopt}; - const auto constraint = metadata.find("issuer"); - if (constraint != metadata.end()) + ccf::nonstd::to_lower(jwks_url.scheme); + if (jwks_url.scheme != "https") { - issuer_constraint = *constraint; + LOG_FAIL_FMT( + "JWT key auto-refresh: jwks_uri for issuer '{}' must use https: {}", + issuer, + jwks_url_str); + send_refresh_jwt_keys_error(); + return; } LOG_DEBUG_FMT( - "JWT key auto-refresh: Requesting JWKS at https://{}:{}{}", - jwks_url.host, - jwks_url_port, - jwks_url.path); - auto http_client = rpcsessions->create_client(ca_cert); - // Note: Connection errors are not signalled and hence not tracked in - // endpoint metrics currently. - http_client->connect( - std::string(jwks_url.host), - std::string(jwks_url_port), - [this, issuer, issuer_constraint]( - ccf::http_status status, - http::HeaderMap&&, - std::vector&& data) { - handle_jwt_jwks_response( - issuer, issuer_constraint, status, std::move(data)); - return true; - }); - ::http::Request r(jwks_url.path, HTTP_GET); - r.set_header(ccf::http::headers::HOST, std::string(jwks_url.host)); - http_client->send_request(std::move(r)); + "JWT key auto-refresh: Requesting JWKS at {}", jwks_url_str); + + const auto self = weak_from_this(); + auto response_callback = + [self, issuer, issuer_constraint]( + std::unique_ptr&& request, + CURLcode curl_response, + long status_code) { + auto http_status = static_cast(status_code); + auto response_body_sp = std::make_shared>( + request->get_response_body() != nullptr ? + std::move(request->get_response_body()->buffer) : + std::vector{}); + ccf::tasks::add_task( + ccf::tasks::make_basic_task([self, + issuer, + issuer_constraint, + curl_response, + http_status, + response_body_sp]() { + const auto self_sp = self.lock(); + if (self_sp == nullptr || self_sp->stopped.load()) + { + return; + } + + if (curl_response != CURLE_OK) + { + LOG_FAIL_FMT( + "JWT key auto-refresh: Failed to fetch JWKS for issuer '{}': " + "{} ({})", + issuer, + curl_easy_strerror(curl_response), + curl_response); + self_sp->send_refresh_jwt_keys_error(); + return; + } + self_sp->handle_jwt_jwks_response( + issuer, + issuer_constraint, + http_status, + std::move(*response_body_sp)); + })); + }; + + send_curl_get(jwks_url_str, ca_bundle_pem, std::move(response_callback)); } void refresh_jwt_keys() { + if (stopped.load()) + { + return; + } + auto tx = network.tables->create_read_only_tx(); auto* jwt_issuers = tx.ro(network.jwt_issuers); auto* ca_cert_bundles = tx.ro(network.ca_cert_bundles); jwt_issuers->foreach([this, &ca_cert_bundles]( const JwtIssuer& issuer, const JwtIssuerMetadata& metadata) { + if (stopped.load()) + { + return false; + } + if (!metadata.auto_refresh) { LOG_DEBUG_FMT( @@ -312,38 +425,59 @@ namespace ccf return true; } - auto metadata_url_str = issuer + "/.well-known/openid-configuration"; - auto metadata_url = ::http::parse_url_full(metadata_url_str); - auto metadata_url_port = - !metadata_url.port.empty() ? metadata_url.port : "443"; - - auto ca_pems = - crypto::split_x509_cert_bundle(ca_cert_bundle_pem.value()); - auto ca = std::make_shared<::tls::CA>(ca_pems); - auto ca_cert = std::make_shared<::tls::Cert>( - ca, std::nullopt, std::nullopt, metadata_url.host); + auto metadata_url = issuer + "/.well-known/openid-configuration"; LOG_DEBUG_FMT( - "JWT key auto-refresh: Requesting OpenID metadata at https://{}:{}{}", - metadata_url.host, - metadata_url_port, - metadata_url.path); - auto http_client = rpcsessions->create_client(ca_cert); - // Note: Connection errors are not signalled and hence not tracked in - // endpoint metrics currently. - http_client->connect( - std::string(metadata_url.host), - std::string(metadata_url_port), - [this, issuer, ca]( - ccf::http_status status, - ccf::http::HeaderMap&&, - std::vector&& data) { - handle_jwt_metadata_response(issuer, ca, status, std::move(data)); - return true; - }); - ::http::Request r(metadata_url.path, HTTP_GET); - r.set_header(ccf::http::headers::HOST, std::string(metadata_url.host)); - http_client->send_request(std::move(r)); + "JWT key auto-refresh: Requesting OpenID metadata at {}", + metadata_url); + + auto ca_bundle_pem = ca_cert_bundle_pem.value(); + + const auto self = weak_from_this(); + auto response_callback = + [self, issuer, ca_bundle_pem]( + std::unique_ptr&& request, + CURLcode curl_response, + long status_code) { + auto http_status = static_cast(status_code); + auto response_body_sp = std::make_shared>( + request->get_response_body() != nullptr ? + std::move(request->get_response_body()->buffer) : + std::vector{}); + ccf::tasks::add_task( + ccf::tasks::make_basic_task([self, + issuer, + ca_bundle_pem, + curl_response, + http_status, + response_body_sp]() { + const auto self_sp = self.lock(); + if (self_sp == nullptr || self_sp->stopped.load()) + { + return; + } + + if (curl_response != CURLE_OK) + { + LOG_FAIL_FMT( + "JWT key auto-refresh: Failed to fetch OpenID metadata for " + "issuer '{}': {} ({})", + issuer, + curl_easy_strerror(curl_response), + curl_response); + self_sp->send_refresh_jwt_keys_error(); + return; + } + self_sp->handle_jwt_metadata_response( + issuer, + ca_bundle_pem, + http_status, + std::move(*response_body_sp)); + })); + }; + + send_curl_get( + metadata_url, ca_bundle_pem, std::move(response_callback)); return true; }); } diff --git a/src/node/node_state.h b/src/node/node_state.h index 8e66c00b159b..738520612040 100644 --- a/src/node/node_state.h +++ b/src/node/node_state.h @@ -1419,10 +1419,10 @@ namespace ccf config.jwt.key_refresh_interval.count_s(), network, consensus, - rpcsessions, rpc_map, node_sign_kp, - self_signed_node_cert); + self_signed_node_cert, + config.jwt.key_refresh_max_response_size); jwt_key_auto_refresh->start(); network.tables->set_map_hook( diff --git a/tests/config.jinja b/tests/config.jinja index d4a232fe0cec..b6a0bf18b51d 100644 --- a/tests/config.jinja +++ b/tests/config.jinja @@ -98,7 +98,8 @@ }, "jwt": { - "key_refresh_interval": "{{ jwt_key_refresh_interval }}" + "key_refresh_interval": "{{ jwt_key_refresh_interval }}", + "key_refresh_max_response_size": "{{ jwt_key_refresh_max_response_size }}" }, "output_files": { "node_certificate_file": "{{ node_certificate_file or "node.pem" }}", diff --git a/tests/infra/crypto.py b/tests/infra/crypto.py index 8602562560da..ad48fb44dfb2 100644 --- a/tests/infra/crypto.py +++ b/tests/infra/crypto.py @@ -133,6 +133,7 @@ def generate_cert( ca=False, valid_from=None, validity_days=10, + san: Optional[str] = None, ) -> str: cn = cn or "dummy" if issuer_priv_key_pem is None: @@ -175,6 +176,11 @@ def generate_cert( x509.BasicConstraints(ca=True, path_length=None), critical=True, ) + if san is not None: + builder = builder.add_extension( + x509.SubjectAlternativeName([x509.DNSName(san)]), + critical=False, + ) cert = builder.sign(issuer_priv, hashes.SHA256(), default_backend()) diff --git a/tests/infra/e2e_args.py b/tests/infra/e2e_args.py index 9a135f836fe2..260fb9c924fa 100644 --- a/tests/infra/e2e_args.py +++ b/tests/infra/e2e_args.py @@ -112,6 +112,11 @@ def cli_args( action="append", default=[], ) + parser.add_argument( + "--jwt-key-refresh-max-response-size", + help="Maximum response body size accepted when fetching JWT issuer OpenID metadata and JWKS", + default="1MB", + ) parser.add_argument( "-o", "--network-only", diff --git a/tests/infra/jwt_issuer.py b/tests/infra/jwt_issuer.py index 76f38ab70681..1ef125ee8c27 100644 --- a/tests/infra/jwt_issuer.py +++ b/tests/infra/jwt_issuer.py @@ -158,7 +158,7 @@ def _generate_auth_data(self, cn=None): else: raise ValueError(f"Unsupported algorithm: {self._alg}") - cert = infra.crypto.generate_cert(key_priv, cn=cn, ca=True) + cert = infra.crypto.generate_cert(key_priv, cn=cn, ca=True, san=cn) return (key_priv, key_pub), cert def __init__( diff --git a/tests/infra/network.py b/tests/infra/network.py index 12ce0e16f9e8..16c6bf33274e 100644 --- a/tests/infra/network.py +++ b/tests/infra/network.py @@ -203,6 +203,7 @@ class Network: "max_open_sessions_hard", "forwarding_timeout_ms", "jwt_key_refresh_interval_s", + "jwt_key_refresh_max_response_size", "common_read_only_ledger_dir", "curve_id", "initial_node_cert_validity_days", diff --git a/tests/infra/remote.py b/tests/infra/remote.py index bc7d5769ab53..471445c1277e 100644 --- a/tests/infra/remote.py +++ b/tests/infra/remote.py @@ -313,6 +313,7 @@ def __init__( join_timer_s=None, sig_ms_interval=None, jwt_key_refresh_interval_s=None, + jwt_key_refresh_max_response_size="1MB", election_timeout_ms=None, consensus_update_timeout_ms=None, node_data_json_file=None, @@ -518,6 +519,7 @@ def __init__( join_timer=f"{join_timer_s}s" if join_timer_s else None, signature_interval_duration=f"{sig_ms_interval}ms", jwt_key_refresh_interval=f"{jwt_key_refresh_interval_s}s", + jwt_key_refresh_max_response_size=jwt_key_refresh_max_response_size, election_timeout=f"{election_timeout_ms}ms", message_timeout=f"{consensus_update_timeout_ms}ms", node_data_json_file=node_data_json_file, diff --git a/tests/jwt_test.py b/tests/jwt_test.py index ca7d67f7769e..4d40916fc883 100644 --- a/tests/jwt_test.py +++ b/tests/jwt_test.py @@ -4,7 +4,9 @@ import json import time import base64 +import socket import infra.network +import infra.jwt_issuer import infra.path import infra.proc import infra.net @@ -12,7 +14,11 @@ import infra.e2e_args import infra.proposal import suite.test_requirements as reqs -from infra.jwt_issuer import get_jwt_issuers, get_jwt_keys +from infra.jwt_issuer import ( + OpenIDProviderServer, + get_jwt_issuers, + get_jwt_keys, +) import ca_certs import ccf.ledger from ccf.tx_id import TxID @@ -371,6 +377,15 @@ def check_kv_jwt_key_matches(args, network, kid, key_pem): assert stored_key == key_pem, "input cert is not equal to stored cert" +def check_kv_jwt_key_constraint(args, network, kid, expected_constraint): + primary, _ = network.find_nodes() + latest_jwt_signing_keys = get_jwt_keys(args, primary) + + assert kid in latest_jwt_signing_keys + stored_constraint = latest_jwt_signing_keys[kid][0]["constraint"] + assert stored_constraint == expected_constraint + + def check_kv_jwt_keys_not_empty(args, network, issuer): primary, _ = network.find_nodes() latest_jwt_signing_keys = get_jwt_keys(args, primary) @@ -393,6 +408,234 @@ def get_jwt_refresh_endpoint_metrics(primary) -> dict: return r.body.json() +def get_unused_local_port(): + with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s: + s.bind(("localhost", 0)) + return s.getsockname()[1] + + +def add_auto_refresh_jwt_issuer(network, primary, issuer, ca_cert_bundle_name): + with tempfile.NamedTemporaryFile(prefix="ccf", mode="w+") as ca_cert_bundle_fp: + ca_cert_bundle_fp.write(issuer.tls_cert) + ca_cert_bundle_fp.flush() + network.consortium.set_ca_cert_bundle( + primary, ca_cert_bundle_name, ca_cert_bundle_fp.name + ) + + with tempfile.NamedTemporaryFile(prefix="ccf", mode="w+") as metadata_fp: + json.dump( + { + "issuer": issuer.name, + "auto_refresh": True, + "ca_cert_bundle_name": ca_cert_bundle_name, + }, + metadata_fp, + ) + metadata_fp.flush() + network.consortium.set_jwt_issuer(primary, metadata_fp.name) + + +def remove_all_jwt_issuers(network, args, primary): + for issuer in list(get_jwt_issuers(args, primary)): + network.consortium.remove_jwt_issuer(primary, issuer) + + +def check_refresh_failures_increased(primary, failures_before): + m = get_jwt_refresh_endpoint_metrics(primary) + assert m["failures"] > failures_before, m + + +def test_jwt_key_auto_refresh_connection_failure(network, args): + primary, _ = network.find_nodes() + remove_all_jwt_issuers(network, args, primary) + failures_before = get_jwt_refresh_endpoint_metrics(primary)["failures"] + issuer_host = "localhost" + issuer_port = get_unused_local_port() + issuer = infra.jwt_issuer.JwtIssuer( + f"https://{issuer_host}:{issuer_port}", cn=issuer_host + ) + + LOG.info("Add JWT issuer with auto-refresh pointing at an unavailable endpoint") + add_auto_refresh_jwt_issuer(network, primary, issuer, "jwt_connection_failure") + try: + with_timeout( + lambda: check_refresh_failures_increased(primary, failures_before), + timeout=5, + ) + finally: + network.consortium.remove_jwt_issuer(primary, issuer.name) + + +def test_jwt_key_auto_refresh_tls_failure(network, args): + primary, _ = network.find_nodes() + remove_all_jwt_issuers(network, args, primary) + failures_before = get_jwt_refresh_endpoint_metrics(primary)["failures"] + issuer = infra.jwt_issuer.JwtIssuer("https://localhost", cn="localhost") + wrong_ca_issuer = infra.jwt_issuer.JwtIssuer( + "https://localhost-wrong-ca", cn="localhost" + ) + ca_cert_bundle_name = "jwt_tls_failure" + + LOG.info("Add wrong CA cert for JWT issuer") + with tempfile.NamedTemporaryFile(prefix="ccf", mode="w+") as ca_cert_bundle_fp: + ca_cert_bundle_fp.write(wrong_ca_issuer.tls_cert) + ca_cert_bundle_fp.flush() + network.consortium.set_ca_cert_bundle( + primary, ca_cert_bundle_name, ca_cert_bundle_fp.name + ) + + LOG.info("Start OpenID endpoint server with a certificate signed by another CA") + with issuer.start_openid_server(0) as server: + issuer_name = f"https://localhost:{server.bind_port}" + with tempfile.NamedTemporaryFile(prefix="ccf", mode="w+") as metadata_fp: + json.dump( + { + "issuer": issuer_name, + "auto_refresh": True, + "ca_cert_bundle_name": ca_cert_bundle_name, + }, + metadata_fp, + ) + metadata_fp.flush() + network.consortium.set_jwt_issuer(primary, metadata_fp.name) + + try: + with_timeout( + lambda: check_refresh_failures_increased(primary, failures_before), + timeout=5, + ) + finally: + network.consortium.remove_jwt_issuer(primary, issuer_name) + + +def test_jwt_key_auto_refresh_invalid_metadata_issuer(network, args): + primary, _ = network.find_nodes() + remove_all_jwt_issuers(network, args, primary) + failures_before = get_jwt_refresh_endpoint_metrics(primary)["failures"] + issuer = infra.jwt_issuer.JwtIssuer("https://localhost", cn="localhost") + ca_cert_bundle_name = "jwt_invalid_metadata_issuer" + + LOG.info("Add CA cert for JWT issuer") + with tempfile.NamedTemporaryFile(prefix="ccf", mode="w+") as ca_cert_bundle_fp: + ca_cert_bundle_fp.write(issuer.tls_cert) + ca_cert_bundle_fp.flush() + network.consortium.set_ca_cert_bundle( + primary, ca_cert_bundle_name, ca_cert_bundle_fp.name + ) + + LOG.info("Start OpenID endpoint server with a non-string issuer metadata field") + with issuer.start_openid_server(0) as server: + issuer_name = f"https://localhost:{server.bind_port}" + server.metadata["issuer"] = {"unexpected": "object"} + + with tempfile.NamedTemporaryFile(prefix="ccf", mode="w+") as metadata_fp: + json.dump( + { + "issuer": issuer_name, + "auto_refresh": True, + "ca_cert_bundle_name": ca_cert_bundle_name, + }, + metadata_fp, + ) + metadata_fp.flush() + network.consortium.set_jwt_issuer(primary, metadata_fp.name) + + try: + with_timeout( + lambda: check_refresh_failures_increased(primary, failures_before), + timeout=5, + ) + finally: + network.consortium.remove_jwt_issuer(primary, issuer_name) + + +def test_jwt_key_auto_refresh_cross_authority_jwks_uri(network, args): + primary, _ = network.find_nodes() + remove_all_jwt_issuers(network, args, primary) + issuer_host = "localhost" + issuer_port = get_unused_local_port() + jwks_port = get_unused_local_port() + issuer = infra.jwt_issuer.JwtIssuer( + f"https://{issuer_host}:{issuer_port}", cn=issuer_host + ) + ca_cert_bundle_name = "jwt_cross_authority_jwks_uri" + kid = "cross_authority_jwks_uri" + + LOG.info("Add CA cert for JWT issuer") + with tempfile.NamedTemporaryFile(prefix="ccf", mode="w+") as ca_cert_bundle_fp: + ca_cert_bundle_fp.write(issuer.tls_cert) + ca_cert_bundle_fp.flush() + network.consortium.set_ca_cert_bundle( + primary, ca_cert_bundle_name, ca_cert_bundle_fp.name + ) + + LOG.info("Start OpenID endpoint server with cross-authority JWKS URI") + with issuer.start_openid_server(issuer_port, kid) as server, OpenIDProviderServer( + jwks_port, issuer.tls_priv, issuer.tls_cert, issuer.create_jwks(kid) + ) as jwks_server: + issuer_name = issuer.name + # Exercise OIDC-compatible metadata where JWKS are served from a + # different authority than the issuer metadata. + server.metadata["jwks_uri"] = f"https://localhost:{jwks_server.bind_port}/keys" + + with tempfile.NamedTemporaryFile(prefix="ccf", mode="w+") as metadata_fp: + json.dump( + { + "issuer": issuer_name, + "auto_refresh": True, + "ca_cert_bundle_name": ca_cert_bundle_name, + }, + metadata_fp, + ) + metadata_fp.flush() + network.consortium.set_jwt_issuer(primary, metadata_fp.name) + + try: + with_timeout( + lambda: check_kv_jwt_key_matches( + args, network, kid, issuer.key_pub_pem + ), + timeout=5, + ) + finally: + network.consortium.remove_jwt_issuer(primary, issuer_name) + + +def test_jwt_key_auto_refresh_response_size_limit(network, args): + primary, _ = network.find_nodes() + remove_all_jwt_issuers(network, args, primary) + failures_before = get_jwt_refresh_endpoint_metrics(primary)["failures"] + issuer = infra.jwt_issuer.JwtIssuer("https://localhost", cn="localhost") + ca_cert_bundle_name = "jwt_response_size_limit" + kid = "response_size_limit" + + LOG.info("Start OpenID endpoint server with oversized metadata") + with issuer.start_openid_server(0, kid) as server: + issuer.name = f"https://localhost:{server.bind_port}" + server.metadata["oversized_response"] = "x" * 4096 + add_auto_refresh_jwt_issuer(network, primary, issuer, ca_cert_bundle_name) + + try: + with_timeout( + lambda: check_refresh_failures_increased(primary, failures_before), + timeout=5, + ) + + LOG.info("Restore OpenID metadata and re-add JWT issuer") + del server.metadata["oversized_response"] + network.consortium.remove_jwt_issuer(primary, issuer.name) + add_auto_refresh_jwt_issuer(network, primary, issuer, ca_cert_bundle_name) + + with_timeout( + lambda: check_kv_jwt_key_matches( + args, network, kid, issuer.key_pub_pem + ), + timeout=5, + ) + finally: + network.consortium.remove_jwt_issuer(primary, issuer.name) + + @reqs.description("JWT with auto_refresh enabled") def test_jwt_key_auto_refresh(network, args): primary, _ = network.find_nodes() @@ -415,6 +658,11 @@ def test_jwt_key_auto_refresh(network, args): ) LOG.info("Start OpenID endpoint server") + # Capture baseline metrics before the server starts: any connection failures + # from a prior test run (e.g. after a primary failover when the server was + # briefly unavailable) will already be reflected here and must not be counted + # as failures introduced by this test. + baseline_m = get_jwt_refresh_endpoint_metrics(primary) with issuer.start_openid_server(issuer_port, kid) as server: # Send oversized headers with the payload that will cause the CCF client to # fail parsing and log an error. @@ -451,21 +699,22 @@ def assert_request_count_increased(): ), timeout=5, ) + check_kv_jwt_key_constraint(args, network, kid, issuer.name) - LOG.info("Check that JWT refresh has attempts and successes and no failures") + LOG.info("Check that JWT refresh has attempts and successes") m = get_jwt_refresh_endpoint_metrics(primary) - assert m["attempts"] > 0, m - assert m["successes"] > 0, m - assert m["failures"] == 0, m + assert m["attempts"] > baseline_m["attempts"], m + assert m["successes"] > baseline_m["successes"], m + failures = m["failures"] LOG.info("Serve invalid JWKS") server.jwks = {"foo": "bar"} - LOG.info("Check that JWT refresh endpoint has some failures") + LOG.info("Check that JWT refresh endpoint has more failures") def check_has_failures(): m = get_jwt_refresh_endpoint_metrics(primary) - assert m["failures"] > 0, m + assert m["failures"] > failures, m with_timeout(check_has_failures, timeout=5) @@ -812,6 +1061,11 @@ def run_manual(args): primary.stop() network.wait_for_new_primary(primary) test_jwt_key_initial_refresh(network, args) + test_jwt_key_auto_refresh_connection_failure(network, args) + test_jwt_key_auto_refresh_tls_failure(network, args) + test_jwt_key_auto_refresh_invalid_metadata_issuer(network, args) + test_jwt_key_auto_refresh_cross_authority_jwks_uri(network, args) + test_jwt_key_auto_refresh_response_size_limit(network, args) def run_ca_cert(args): diff --git a/tests/npm-app/src/endpoints/crypto.ts b/tests/npm-app/src/endpoints/crypto.ts index 94afa9730728..2bbadad891cd 100644 --- a/tests/npm-app/src/endpoints/crypto.ts +++ b/tests/npm-app/src/endpoints/crypto.ts @@ -97,9 +97,7 @@ interface RsaOaepAesKwpParams { } type WrapAlgoParams = - | RsaOaepParams - | RsaOaepAesKwpParams - | ccfcrypto.AesKwpParams; + RsaOaepParams | RsaOaepAesKwpParams | ccfcrypto.AesKwpParams; interface WrapKeyRequest { key: Base64; // typically an AES key diff --git a/tests/programmability.py b/tests/programmability.py index 7af0236ee453..804c69c363b2 100644 --- a/tests/programmability.py +++ b/tests/programmability.py @@ -652,6 +652,7 @@ def run(args): package="samples/apps/logging/logging", nodes=infra.e2e_args.min_nodes(cr.args, f=1), jwt_key_refresh_interval_s=100000, + jwt_key_refresh_max_response_size="4KB", issuer_port=12346, forwarding_timeout_ms=jwt_forwarding_timeout_ms, )