-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Add SDK MCP OAuth host token handlers #1669
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
+5,508
−41
Merged
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Some comments aren't visible on the classic Files Changed page.
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these necessarily tied together?
I'm thinking back to permission requests and tool invocation requests. Initially the only way you could handle them was to provide a handler that the SDK would invoke, but that was prohibitive, especially with regards to needing to be able to suspend the runtime and later resume it (possibly on a different machine) and supply the result of the operation in order to keep going. We made it so that providing the callback was optional; if you don't supply it, you're on your own for handling the events and calling the runtime rpc method to supply the results.
Is there a correlary here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, at least at the moment, register interest in the callback is necessary since the runtime has two MCP OAuth flows - the new one being introduced here and the in-process one currently used by the CLI; I'm working on a PR to switch the CLI to use the new SDK flow - as part of the CLI-over-SDK effort - but for now the interest registration is required in order for the runtime to know which path to take.
Regardless, even after we remove the CLI flow, I'm not sure how the "you're on your own" approach would work here MCP OAuth... The LLM can request a tool call at an arbitrary time, which could trigger an "oauth_required" request, which must be satisfied by the host; if this doesn't happen, the tool call can't proceed, and the turn is stuck. So I think it means the host must be informed (and it must respond in a timely manner)... I think that this oauth flow must indeed be rewired upon resume (even on a different machine), but it doesn't change the fact that an OAuth request can happen at any point, and the host must be able to handle it...
In other words, I think that any SDK host that potentially needs to support OAuth-authorizing MCP servers must register this callback, and if such a request is emitted without that callback being registered, I'm not sure what choice we have besides a fail-fast error... But I might have an incorrect mental model here - let me know how you see this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make sure I've understood correctly, all the SDK ends up doing is listening for a particular event, invoking the user's callback, and calling the exposed RPC method to supply the result, right? So SDK consumer could do the same thing without providing a callback: they could listen for the same event, do whatever they want in response (whatever they would have supplied in the callback), and then call that same RPC method, right?
The PR is using a callback being set as an indication that the client is handling oauth, but if the above is true, then those needn't be 1:1... supplying a callback could imply it, but not supplying a callback doesn't mean the client can't also handle it. My question is really: should there be a separate knob for opting-in to handling oauth beyond just supplying a callback.
It's fine if the answer is "not right now, we could always add one in the future if supplying a callback is problematic for some reason".
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EDIT: Maybe read my comment below, I think I understood better what you were asking
Here's how the flow works:
IIUC you're asking whether it makes sense to have a host opt into host-delegated OAuth without registering a callback. In that case, I can't see what would happen when the runtime receives a 401/403 HTTP error which is the trigger for OAuth - how would the host application be made aware that OAuth needs to happen? Like are you thinking of some sort of polling mechanism where the host application would periodically check if there are any pending OAuth requests? I guess that's theoretically possible, but would at the very least mean lots of latency no?
Are you asking because reinstating host callbacks with the runtime after resume is problematic? I looked at this and it seems that we have several other callbacks which IIRC are should also be reinstated after resume - but I could be wrong.
Happy to do a quick call tomorrow to discuss all this!
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I reread everything and I think I understand your concern... You're saying that in the SDK implementation here (not the runtime), we expose a single callback which must return the token; this would prevents e.g. an implementation that persists the various parameters to disk, and then some other processes loads those and calls handlePendingRequest with the token. In other words, this is less about separating the opt-in from the callback, it's more to have a more low-level split interaction where there's the callback invoked by the runtime doesn't have to return an access token "immediately", and the host application later invokes handlePendingRequest manually.
I can see that... And that's exactly how the low-level RPC API is actually structured (oauth_required is the low-level callback that returns nothing). If so, then I think all this should be achievable with the low-level APIs today:
There may be some issues around registering interest (and the callback) early enough during resume so as to avoid any race conditions... Also, specifically for the multiple process scenario (one process receives oauth_required, persists, another process actually completes with the access token), the runtime itself needs to remember/correlate the pending OAuth requests, and that's in-memory only; so a true resume will have lost that (assuming the runtime is in the same process as the host application).
tl;dr I think we have a nice high-level convenience callback for the common scenario (register a single async callback that returns the access token). In principle, users can still drop down to low-level RPC APIs to do the advanced (cross-process) thing, but it's likely we have some gaps around this - but they go beyond simple API shape here in the SDK, and we should probably think/design that separately (the main question is whether the high-level API as-is seems OK even if the advanced scenario isn't there yet).
Does this all make sense and correspond to what you're asking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup! Being able to write that code sample is what I was asking about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK great, understood.
Though as above, I suspect that for actual cross-process resume there'll still be some gaps/blockers... We can deal with those later.