Fix flaky DiagnosticTests.Session_TracksActivities by waiting for full server activity predicate by Copilot · Pull Request #1495 · modelcontextprotocol/csharp-sdk

Copilot · 2026-04-02T16:45:07Z

Session_TracksActivities was racing against the OpenTelemetry in-memory exporter: it waited only for a server activity matching DisplayName + Kind, but the server processes messages via fire-and-forget tasks so tags (gen_ai.tool.name, mcp.session.id, etc.) may not be populated when the activity first appears in the collection — causing Assert.Single(predicate) to find zero matches.

Changes

Shared predicate: Extract isExpectedServerToolCall (including all required tags) before the using block so the same predicate drives both the wait and the assertion — no drift possible between the two.
Wait on full predicate: Replace the loose WaitForAsync(DisplayName + Kind) with WaitForAsync(() => activities.Any(isExpectedServerToolCall)), blocking until the activity is fully populated.
Clearer timeout failure: WaitForAsync now catches TaskCanceledException and rethrows as XunitException with a descriptive message instead of surfacing a raw cancellation.

Func<Activity, bool> isExpectedServerToolCall = a =>
    a.DisplayName == "tools/call DoubleValue" &&
    a.Kind == ActivityKind.Server &&
    a.Status == ActivityStatusCode.Unset &&
    a.Tags.Any(t => t.Key == "gen_ai.tool.name"       && t.Value == "DoubleValue") &&
    a.Tags.Any(t => t.Key == "mcp.method.name"         && t.Value == "tools/call") &&
    a.Tags.Any(t => t.Key == "gen_ai.operation.name"   && t.Value == "execute_tool") &&
    a.Tags.Any(t => t.Key == "mcp.protocol.version"    && !string.IsNullOrEmpty(t.Value)) &&
    a.Tags.Any(t => t.Key == "mcp.session.id"          && !string.IsNullOrEmpty(t.Value));

await WaitForAsync(
    () => activities.Any(isExpectedServerToolCall),
    failureMessage: "Timed out waiting for the expected server tool-call activity ...");

var serverToolCall = Assert.Single(activities, a => isExpectedServerToolCall(a));

Original prompt

CI is failing on PR #1449 due to a flaky test in tests/ModelContextProtocol.Tests/DiagnosticTests.cs: DiagnosticTests.Session_TracksActivities fails with Assert.Single() Failure: The collection did not contain any matching items at line 57.

Action run/job reference:

Run: 23910228803
Job: 69732069649
Workflow: .github/workflows/ci-build-test.yml
Failure: ModelContextProtocol.Tests.DiagnosticTests.Session_TracksActivities (net9.0)

Goal:
Make the test deterministic by waiting for the specific server-side activity that the test later asserts on, rather than only waiting for any server activity with the same DisplayName. The server processes messages with fire-and-forget tasks and OpenTelemetry in-memory exporter is async, so tags/status may not be populated immediately.

Required changes:

In tests/ModelContextProtocol.Tests/DiagnosticTests.cs, update Session_TracksActivities to:
- Define a predicate that matches the expected server tools/call DoubleValue activity (DisplayName, Kind=Server, required tags like gen_ai.tool.name, mcp.method.name, gen_ai.operation.name, and presence of mcp.protocol.version and mcp.session.id, and expected status if applicable).
- Call WaitForAsync(() => activities.Any(predicate)) to wait until that specific activity exists.
- Then call Assert.Single(activities, predicate).
Keep the rest of the test’s assertions intact.

Optional improvement (only if minimal and clearly helpful):

Improve WaitForAsync to throw a clearer xUnit exception on timeout rather than surfacing a TaskCanceledException.

Acceptance criteria:

CI test job passes reliably across target frameworks/OS.
The updated test continues to validate the intended tracing semantics (client/server activities, session IDs, traceparent propagation).

PR title suggestion:
Fix flaky DiagnosticTests.Session_TracksActivities by waiting for exported server activity

The following is the prior conversation context from the user's chat exploration (may be truncated):

User: Please find a solution for this failing job. Use the logs, job definition, and any referenced files where the failure occurred. Keep your response focused on the solution and include code suggestions when appropriate.

Assistant: The job is failing because one test is flaky on net9.0:

Failing test: ModelContextProtocol.Tests.DiagnosticTests.Session_TracksActivities
Failure: Assert.Single() Failure: The collection did not contain any matching items
Location: tests/ModelContextProtocol.Tests/DiagnosticTests.cs:line 57 (commit 7ff9d2fa44b60bcaee4091c0470ff70178c2b4a6)

Why it fails

Session_TracksActivities correctly waits for the server activity to appear:

await WaitForAsync(() => activities.Any(a =>
    a.DisplayName == "tools/call DoubleValue" && a.Kind == ActivityKind.Server));

…but immediately after that it does:

var serverToolCall = Assert.Single(activities, a => /* ... */ a.Kind == ActivityKind.Server);

The exporter is async and the server processes messages with fire-and-forget tasks (your comment in the test says so). On some runs/framework versions, the activity exists but its tags/status may not be populated yet, or it may not match the predicate at that exact moment—so Assert.Single(predicate) finds zero matching items and fails.

Solution (make the test wait for the exact predicate it asserts)

Update the test to wait until the specific server activity that matches the later Assert.Single predicate exists (not just “any server activity with that name”).

A minimal, focused change is:

Add a predicate for the expected server tool-call activity
WaitForAsync on that predicate
Only then call Assert.Single

Suggested patch (edit `tests/ModelContextProtocol.Tests/DiagnosticTests.cs`)

Replace the current wait (lines ~36–42) with:

Func<Activity, bool> isExpectedServerToolCall = a =>
    a.DisplayName == "tools/call DoubleValue" &&
    a.Kind == ActivityKind.Server &&
    a.Status == ActivityStatusCode.Unset &&
    a.Tags.Any(t => t.Key == "gen_ai.tool.name" && t.Value == "DoubleValue") &&
    a.Tags.Any(t => t.Key == "mcp.method.name" && t.Value == "tools/call") &&
    a.Tags.Any(t => t.Key == "gen_ai.operation.name" && t.Value == "execute_tool") &&
    a.Tags.Any(t => t.Key == "mcp.protocol.version" && !string.IsNullOrEmpty(t.Value)) &&
    a.Tags.Any(t => t.Key == "mcp.session.id" && !string.IsNullOrEmpty(t.Value));

await WaitForAsync(() => activities.Any(isExpectedServerToolCall));

Then keep the assertion, but reuse the predicate so it cannot drift:

var serverToolCall = Assert.Single(activities, isExpectedServerToolCall);

Optional hardening (recommended)

Your WaitForAsync can also fail with an uncaught TaskCanceledException which produces noisier failures. If you want clearer diagnostics, change it to throw an XunitException with a helpful message when it t...

This pull request was created from Copilot chat.

…orted server activity Agent-Logs-Url: https://github.com/modelcontextprotocol/csharp-sdk/sessions/267e60a2-c6e5-4d2d-96cb-830b16a80bd4 Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>

Initial plan

6baa64f

Copilot AI assigned Copilot and ericstj Apr 2, 2026

Copilot started work on behalf of ericstj April 2, 2026 16:45 View session

Fix flaky DiagnosticTests.Session_TracksActivities by waiting for exp…

bb39e15

…orted server activity Agent-Logs-Url: https://github.com/modelcontextprotocol/csharp-sdk/sessions/267e60a2-c6e5-4d2d-96cb-830b16a80bd4 Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Fix flaky test in DiagnosticTests.Session_TracksActivities~~ Fix flaky DiagnosticTests.Session_TracksActivities by waiting for full server activity predicate Apr 2, 2026

Copilot AI requested a review from ericstj April 2, 2026 16:53

Copilot finished work on behalf of ericstj April 2, 2026 16:53

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix flaky DiagnosticTests.Session_TracksActivities by waiting for full server activity predicate#1495

Fix flaky DiagnosticTests.Session_TracksActivities by waiting for full server activity predicate#1495
Copilot wants to merge 2 commits intomainfrom
copilot/fix-flaky-diagnostic-test

Copilot AI commented Apr 2, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Copilot AI commented Apr 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

Why it fails

Solution (make the test wait for the exact predicate it asserts)

Suggested patch (edit tests/ModelContextProtocol.Tests/DiagnosticTests.cs)

Optional hardening (recommended)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Copilot AI commented Apr 2, 2026 •

edited

Loading

Suggested patch (edit `tests/ModelContextProtocol.Tests/DiagnosticTests.cs`)