fix(sse-retry): gate retry timing only on the early side by maxisbey · Pull Request #375 · modelcontextprotocol/conformance

maxisbey · 2026-06-30T14:48:25Z

Problem

sse-retry flakes in python-sdk CI (~4–5% of runs, including recent failures on every push to main), always with the signature ✗ sse-retry: 2 passed, 0 failed, 1 warnings — the retry-timing check landing in its (700, 1000]ms "slightly late" WARNING band, which fails the scenario under the WARNING policy (#245). Example run: https://github.com/modelcontextprotocol/python-sdk/actions/runs/28443055485

The 2025-11-25 spec sentence is "The client MUST respect the retry field, waiting the given number of milliseconds before attempting to reconnect" — a lower bound. Neither MCP nor WHATWG SSE bounds reconnect delay from above. The measured delay is the client's sleep (≥ retry) plus EOF detection, GET propagation, and harness dispatch — all inflated by load, and suite mode runs every scenario's client concurrently. Measured locally with the python client: 503ms idle against the 700ms gate, 591–674ms with the suite pinned to 2 cores. The band gates the environment, not the client.

Change (one file, +11/−36)

Delete the slightly-late WARNING and >2× FAILURE bands: the timing check is now FAILURE only when the reconnect arrives earlier than retry − 50ms, otherwise SUCCESS, with the measured delay kept in details. Load can only push a compliant client deeper into the passing region, so the flake is structurally gone.
Demote the "could not measure timing" path from WARNING to INFO (the other spurious-warning path).
One-line fix: the initialize response negotiated protocolVersion: 2025-03-26, a version whose spec had no retry MUST; now 2025-11-25 (same issue 4686e06 fixed for server-sse-polling).

Strictly relaxing — no SDK that passes today can start failing on a pin bump; retry stays at 500ms so test duration is unchanged.

Trade-off accepted: the >2× band was the only (unsound) detector for clients that ignore the retry field — a client's ~1000ms default backoff sat near that threshold by coincidence, and a loaded runner pushed compliant clients over it. With the band gone, retry-field-ignorance is untested, which seems proportionate for a scenario that is removed in draft.

Verification

python-sdk client, real run: 3/3 SUCCESS (actualDelayMs: 502).
python-sdk client patched to ignore the retry hint (reconnects at its 1000ms fallback, previously red): now passes, delay recorded in details.
npm run test 328 passed; typecheck and lint clean.

_{AI Disclaimer}

pkg-pr-new · 2026-06-30T14:48:49Z

Open in StackBlitz

npx https://pkg.pr.new/@modelcontextprotocol/conformance@375

commit: 4265e85

pcarleton · 2026-06-30T15:02:44Z

can we fix this more surgically?

this test is irrelevant in 2026-07-28 and beyond, so i don't want to invest a bunch more into it
SDK's have passed at this for previous tiering, so every change to an existing suite means potentially noisy failures when an SDK bumps (maybe it gets better! but we'd need to test that)
Bumping timeouts to be super long means every test runs for at least that long. if we are going to do this, i'd rather we categorize the test as "long running" / tiering only, so we don't run it in CI by default, and then we just run it during tiering. that's not ideal either though
5+ line comments are ai smell imo

The retry MUST ("waiting the given number of milliseconds before attempting to reconnect") is a lower bound; no spec text bounds reconnect delay from above. The slightly-late WARNING and >2x FAILURE bands gated environment latency, which CI load inflates - in python-sdk CI the warning band fired on ~4-5% of runs under the suite's parallel client spawns. - delete the late bands: FAILURE only when the reconnect arrives earlier than retry minus the 50ms tolerance - demote the could-not-measure path from WARNING to INFO - negotiate protocolVersion 2025-11-25, the version whose MUST is under test (was 2025-03-26; same fix as 4686e06 for server-sse-polling) Strictly relaxing: no client that passes today can start failing.

maxisbey · 2026-06-30T18:22:46Z

Descoped — the diff is now one file, +11/−36:

deleted the two late-side bands (the 2025-11-25 MUST is "waiting the given number of milliseconds before attempting to reconnect" — a lower bound; neither MCP nor WHATWG SSE bounds reconnect delay from above), and demoted the can't-measure warning to INFO
retry stays at 500ms, so test duration is unchanged — no need for a long-running/tiering-only class
strictly relaxing: no SDK that passes today can start failing on a bump; flaky SDKs just stop flaking
kept one one-line fix: the initialize response negotiated 2025-03-26, a version with no retry MUST (same issue 4686e06 fixed for server-sse-polling)

Trade-off accepted: the >2× band was the only (unsound) detector for clients that ignore the retry field; with it gone that's untested, which seems right for a scenario that's removed in draft. PR description updated to match.

_{AI Disclaimer}

maxisbey marked this pull request as ready for review June 30, 2026 14:50

maxisbey force-pushed the sse-retry-load-monotone branch from 5905aa7 to 4265e85 Compare June 30, 2026 17:41

maxisbey changed the title ~~fix(sse-retry): make the retry-timing verdict one-sided and causally keyed~~ fix(sse-retry): gate retry timing only on the early side Jun 30, 2026

maxisbey mentioned this pull request Jul 1, 2026

De-flake conformance CI: solo re-verification, spawn-storm reduction, result artifacts modelcontextprotocol/python-sdk#3043

Merged

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(sse-retry): gate retry timing only on the early side#375

fix(sse-retry): gate retry timing only on the early side#375
maxisbey wants to merge 1 commit into
modelcontextprotocol:mainfrom
maxisbey:sse-retry-load-monotone

maxisbey commented Jun 30, 2026 •

edited

Loading

Uh oh!

pkg-pr-new Bot commented Jun 30, 2026 •

edited

Loading

Uh oh!

pcarleton commented Jun 30, 2026

Uh oh!

maxisbey commented Jun 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

maxisbey commented Jun 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Change (one file, +11/−36)

Verification

Uh oh!

pkg-pr-new Bot commented Jun 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pcarleton commented Jun 30, 2026

Uh oh!

maxisbey commented Jun 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

maxisbey commented Jun 30, 2026 •

edited

Loading

pkg-pr-new Bot commented Jun 30, 2026 •

edited

Loading