Fix flaky DiagnosticTests.Session_TracksActivities by waiting for full server activity predicate#1495
Draft
Fix flaky DiagnosticTests.Session_TracksActivities by waiting for full server activity predicate#1495
Conversation
…orted server activity Agent-Logs-Url: https://github.com/modelcontextprotocol/csharp-sdk/sessions/267e60a2-c6e5-4d2d-96cb-830b16a80bd4 Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>
Copilot
AI
changed the title
[WIP] Fix flaky test in DiagnosticTests.Session_TracksActivities
Fix flaky DiagnosticTests.Session_TracksActivities by waiting for full server activity predicate
Apr 2, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Session_TracksActivitieswas racing against the OpenTelemetry in-memory exporter: it waited only for a server activity matchingDisplayName + Kind, but the server processes messages via fire-and-forget tasks so tags (gen_ai.tool.name,mcp.session.id, etc.) may not be populated when the activity first appears in the collection — causingAssert.Single(predicate)to find zero matches.Changes
isExpectedServerToolCall(including all required tags) before theusingblock so the same predicate drives both the wait and the assertion — no drift possible between the two.WaitForAsync(DisplayName + Kind)withWaitForAsync(() => activities.Any(isExpectedServerToolCall)), blocking until the activity is fully populated.WaitForAsyncnow catchesTaskCanceledExceptionand rethrows asXunitExceptionwith a descriptive message instead of surfacing a raw cancellation.Original prompt
CI is failing on PR #1449 due to a flaky test in
tests/ModelContextProtocol.Tests/DiagnosticTests.cs:DiagnosticTests.Session_TracksActivitiesfails withAssert.Single() Failure: The collection did not contain any matching itemsat line 57.Action run/job reference:
.github/workflows/ci-build-test.ymlModelContextProtocol.Tests.DiagnosticTests.Session_TracksActivities(net9.0)Goal:
Make the test deterministic by waiting for the specific server-side activity that the test later asserts on, rather than only waiting for any server activity with the same DisplayName. The server processes messages with fire-and-forget tasks and OpenTelemetry in-memory exporter is async, so tags/status may not be populated immediately.
Required changes:
tests/ModelContextProtocol.Tests/DiagnosticTests.cs, updateSession_TracksActivitiesto:tools/call DoubleValueactivity (DisplayName, Kind=Server, required tags likegen_ai.tool.name,mcp.method.name,gen_ai.operation.name, and presence ofmcp.protocol.versionandmcp.session.id, and expected status if applicable).WaitForAsync(() => activities.Any(predicate))to wait until that specific activity exists.Assert.Single(activities, predicate).Optional improvement (only if minimal and clearly helpful):
WaitForAsyncto throw a clearer xUnit exception on timeout rather than surfacing aTaskCanceledException.Acceptance criteria:
PR title suggestion:
Fix flaky DiagnosticTests.Session_TracksActivities by waiting for exported server activity
The following is the prior conversation context from the user's chat exploration (may be truncated):
User: Please find a solution for this failing job. Use the logs, job definition, and any referenced files where the failure occurred. Keep your response focused on the solution and include code suggestions when appropriate.
Assistant: The job is failing because one test is flaky on net9.0:
ModelContextProtocol.Tests.DiagnosticTests.Session_TracksActivitiesAssert.Single() Failure: The collection did not contain any matching itemstests/ModelContextProtocol.Tests/DiagnosticTests.cs:line 57(commit7ff9d2fa44b60bcaee4091c0470ff70178c2b4a6)Why it fails
Session_TracksActivitiescorrectly waits for the server activity to appear:…but immediately after that it does:
The exporter is async and the server processes messages with fire-and-forget tasks (your comment in the test says so). On some runs/framework versions, the activity exists but its tags/status may not be populated yet, or it may not match the predicate at that exact moment—so
Assert.Single(predicate)finds zero matching items and fails.Solution (make the test wait for the exact predicate it asserts)
Update the test to wait until the specific server activity that matches the later
Assert.Singlepredicate exists (not just “any server activity with that name”).A minimal, focused change is:
WaitForAsyncon that predicateAssert.SingleSuggested patch (edit
tests/ModelContextProtocol.Tests/DiagnosticTests.cs)Replace the current wait (lines ~36–42) with:
Then keep the assertion, but reuse the predicate so it cannot drift:
Optional hardening (recommended)
Your
WaitForAsynccan also fail with an uncaughtTaskCanceledExceptionwhich produces noisier failures. If you want clearer diagnostics, change it to throw anXunitExceptionwith a helpful message when it t...This pull request was created from Copilot chat.