
docs: add multi-tenancy guide, example server, and OAuth e2e tests #2312

Closed
andylim-duo wants to merge 44 commits into modelcontextprotocol:main from andylim-duo:feature/multi-tenant-docs-testing

Conversation

@andylim-duo

Summary

Iteration 6 of the multi-tenancy implementation plan — documentation, example server, and end-to-end tests.

  • Multi-tenancy guide (docs/multi-tenancy.md): Comprehensive documentation covering architecture (with mermaid flow diagram), static and dynamic tenant provisioning (onboarding/offboarding, feature flags, plugin systems), authentication setup with TokenVerifier and identity provider configuration (Duo Security, Auth0, Okta, Microsoft Entra ID, custom JWT), session isolation, security considerations, and backward compatibility.

  • Example server (examples/servers/simple-multi-tenant/): Two-tenant demo (Acme analytics, Globex content) with tenant-scoped tools, resources, prompts, and a shared whoami tool that reads Context.tenant_id. Includes README with usage instructions and in-memory testing guide.

  • OAuth e2e tests (test_multi_tenancy_oauth_e2e.py): Full HTTP stack integration tests proving tenant isolation through the real auth middleware: bearer token → AuthContextMiddleware → tenant_id_var → session manager → handler → tenant-scoped response. Uses StubTokenVerifier to bypass OAuth while exercising the production middleware path.

  • Migration guide update (docs/migration.md): Documents multi-tenancy as a breaking change with migration instructions.

Test Plan

  • All 13 multi-tenancy tests pass (8 in-memory e2e + 5 OAuth e2e)
  • Full test suite passes (1219 passed, 98 skipped, 1 xfailed)
  • pyright passes with 0 errors across the entire project
  • ruff check and ruff format pass cleanly

Add tenant_id support to AuthorizationCode, RefreshToken, and AccessToken
models to enable multi-tenant isolation. This is the foundation for
tenant-scoped authentication, allowing tokens to be associated with
specific tenants.

The field is optional (defaults to None) for backward compatibility.
Thread tenant_id from authentication tokens through the request
lifecycle to enable tenant-scoped operations in handlers.

Changes:
- Add tenant_id field to base RequestContext (inherited by ServerRequestContext)
- Add get_tenant_id() helper in auth_context module to extract tenant from auth
- Populate tenant_id in both ServerRequestContext instantiation sites in lowlevel/server.py
- Add tenant_id property with getter/setter to ServerSession

This is iteration 2 of the multi-tenancy implementation, building on
the tenant_id field added to auth tokens in iteration 1.
Use proper async with pattern to ensure ServerSession's internal
streams are cleaned up correctly, preventing resource warnings.
feat(auth): add tenant_id field to authentication token models
Add two tests to verify tenant_id doesn't leak between:
- Concurrent async requests using the auth contextvar
- Separate ServerSession instances

These tests validate critical security properties for multi-tenant
deployments where isolation between tenants must be guaranteed.
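
The isolation property these tests check can be sketched with the standard library alone. This is an illustration, not the PR's test code: `tenant_id_var` is a local stand-in for the auth contextvar, `handle_request` is a hypothetical handler, and `asyncio.sleep(0)` is used as the yield point between the two concurrent requests.

```python
import asyncio
import contextvars

# Local stand-in for the contextvar that carries the tenant (assumption).
tenant_id_var = contextvars.ContextVar("tenant_id", default=None)


async def handle_request(tenant, seen):
    """Hypothetical handler: bind a tenant, yield, then read it back."""
    tenant_id_var.set(tenant)
    # Yield to the event loop so the other request runs in between.
    await asyncio.sleep(0)
    seen[tenant] = tenant_id_var.get()


async def main():
    seen = {}
    # gather() wraps each coroutine in its own task, and each task
    # runs in its own copy of the current context.
    await asyncio.gather(
        handle_request("acme", seen),
        handle_request("globex", seen),
    )
    return seen


results = asyncio.run(main())
```

Each task reads back the value it set, and the value set inside a task does not leak into the context that called `asyncio.run`.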
…tring

Document the purpose and usage of the tenant_id field for multi-tenant
server deployments.
…uest

Wire up session.tenant_id so it is set automatically from the auth
contextvar on the first authenticated request (set-once semantics).
This connects RequestContext.tenant_id and ServerSession.tenant_id,
ensuring the session is bound to a tenant for its lifetime.
Extract _simulate_tenant_binding helper to avoid pyright narrowing
session.tenant_id to a literal type after assertion, which caused the
subsequent `is None` check to be flagged as always-False.
…on handling

Add two E2E tests using Client(server) that exercise the session.tenant_id
set-once binding in lowlevel/server.py, covering the previously uncovered
branches in _handle_request (line 456) and _handle_notification (line 504).
This line is now covered by the E2E tenant notification test.
Make the tenant_id setter raise ValueError if attempting to change
to a different value once already set. This prevents accidental tenant
reassignment which could be a security issue. Setting to the same
value is allowed (idempotent).
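
A minimal sketch of the set-once rule described above, assuming a simplified `Session` class as a stand-in for ServerSession (the real class carries much more state, and its exact error message is not shown in this PR text):

```python
from typing import Optional


class Session:
    """Hypothetical stand-in for ServerSession; only the tenant binding."""

    def __init__(self) -> None:
        self._tenant_id: Optional[str] = None

    @property
    def tenant_id(self) -> Optional[str]:
        return self._tenant_id

    @tenant_id.setter
    def tenant_id(self, value: Optional[str]) -> None:
        # Once bound, only re-assigning the same value is accepted.
        if self._tenant_id is not None and value != self._tenant_id:
            raise ValueError(
                f"tenant_id is already set to {self._tenant_id!r} "
                "and cannot be changed"
            )
        self._tenant_id = value


session = Session()
session.tenant_id = "acme"   # first binding
session.tenant_id = "acme"   # idempotent, no error
```

Assigning `"globex"` to the same session afterwards raises `ValueError` in this sketch.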
…action

Move tenant identification to a transport-agnostic contextvar
(tenant_id_var) in the shared layer, removing the hard dependency from
lowlevel/server.py on the auth middleware module.

AuthContextMiddleware now sets tenant_id_var alongside auth_context_var,
and the core server reads from the shared contextvar instead of calling
get_tenant_id() from the auth module. This keeps the dependency direction
correct (auth → shared, server → shared) and allows other transports to
set tenant_id_var through their own mechanisms.
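
The layering described above can be sketched as follows. The names `run_with_tenant` and `whoami` are hypothetical, and `tenant_id_var` is declared locally rather than imported from the SDK; the point is only that the transport side sets the contextvar and the handler side reads it, with neither importing the other.

```python
import contextvars
from typing import Callable, Optional

# Shared layer: a transport-agnostic contextvar (local stand-in).
tenant_id_var = contextvars.ContextVar("tenant_id", default=None)


def run_with_tenant(tenant_id: Optional[str], handler: Callable[[], str]) -> str:
    """Transport side (hypothetical): bind the tenant for one request."""
    # Set unconditionally, even when tenant_id is None, so a value left
    # over from an outer scope can never be observed by the handler.
    token = tenant_id_var.set(tenant_id)
    try:
        return handler()
    finally:
        # Restore whatever was there before this request.
        tenant_id_var.reset(token)


def whoami() -> str:
    """Server side (hypothetical): read the tenant from the shared var."""
    return tenant_id_var.get() or "<no tenant>"


acme_result = run_with_tenant("acme", whoami)
anonymous_result = run_with_tenant(None, whoami)
```

Any other transport could call something like `run_with_tenant` with a tenant obtained through its own mechanism, which is the flexibility the commit message points to.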
Remove unused MockApp() and AuthContextMiddleware(app) that were
immediately overwritten by AuthContextMiddleware(TenantCheckApp()).
Use checkpoint() for deterministic context switching instead of a
fixed-duration sleep in the tenant isolation test.
Import checkpoint directly from anyio.lowlevel to fix pyright
reportAttributeAccessIssue on the lazy submodule.
…ontext

feat(auth): add tenant_id to session and request context
…ager, and PromptManager

Change internal storage from simple name-keyed dicts to composite
(tenant_id, name) keyed dicts, enabling the same resource name to exist
independently under different tenants.

All public methods gain a keyword-only `tenant_id: str | None = None`
parameter. Existing callers are unaffected — the default None scope
preserves backward compatibility with single-tenant usage.
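
As a rough illustration of composite-key storage with a keyword-only `tenant_id`, assuming a simplified `CompositeKeyToolStore` class that is not the SDK's ToolManager:

```python
from typing import Callable, Optional


class CompositeKeyToolStore:
    """Hypothetical store keyed by (tenant_id, name)."""

    def __init__(self) -> None:
        self._tools: dict[tuple[Optional[str], str], Callable[[], str]] = {}

    def add_tool(
        self, name: str, fn: Callable[[], str], *, tenant_id: Optional[str] = None
    ) -> None:
        self._tools[(tenant_id, name)] = fn

    def get_tool(
        self, name: str, *, tenant_id: Optional[str] = None
    ) -> Optional[Callable[[], str]]:
        return self._tools.get((tenant_id, name))

    def list_tools(self, *, tenant_id: Optional[str] = None) -> list[str]:
        # Filters every stored key to find the ones for this tenant.
        return sorted(name for (tid, name) in self._tools if tid == tenant_id)


store = CompositeKeyToolStore()
store.add_tool("report", lambda: "acme report", tenant_id="acme")
store.add_tool("report", lambda: "globex report", tenant_id="globex")
store.add_tool("ping", lambda: "pong")  # single-tenant caller, default scope
```

The same name `report` resolves to a different callable per tenant, while the call without `tenant_id` keeps working against the `None` scope.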
On Windows CI runners, several tests fail intermittently due to timing
sensitivity and unhandled warnings from TerminateProcess() cleanup.

- Replace single-check file-growth assertions with polling loops inside
  anyio.fail_after(5) timeouts in process cleanup tests, so slow runners
  get multiple chances to observe the process has stopped
- Increase subprocess.run() timeout from 20s to 60s in test_command_execution
  to accommodate slow CI runners
- Add PytestUnraisableExceptionWarning filter on Windows alongside existing
  ResourceWarning filters, since Windows TerminateProcess() prevents graceful
  transport cleanup

Github-Issue:#6
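
The polling approach from the first bullet can be sketched with the standard library. The tests themselves use `anyio.fail_after(5)`; this example substitutes `asyncio.wait_for` as an analogous timeout, and `wait_until` and its parameters are hypothetical names.

```python
import asyncio


async def wait_until(predicate, *, timeout=5.0, interval=0.05):
    """Poll predicate() until it is true or the timeout expires."""

    async def poll():
        while not predicate():
            # Give a slow runner another chance instead of failing
            # on the first check.
            await asyncio.sleep(interval)

    # Raises asyncio.TimeoutError if the condition never holds.
    await asyncio.wait_for(poll(), timeout)


# Usage: a condition that only becomes true on the third check.
calls = {"n": 0}


def process_stopped():
    calls["n"] += 1
    return calls["n"] >= 3


asyncio.run(wait_until(process_stopped, timeout=1.0, interval=0.001))
```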
Add pragma: no branch to the polling loop conditionals in process
cleanup tests. These false branches (loop retry) are only exercised
on slow Windows CI runners and cannot be covered locally.

Github-Issue:#6
fix(test): stabilize flaky Windows CI tests
Integrate tenant_id into MCPServer server layer, completing the plumbing
from request context through to tenant-scoped storage in managers.

- Add tenant_id property to Context class, reading from request context
- Update all 7 private _handle_* methods to pass ctx.tenant_id to public methods
- Add keyword-only tenant_id parameter to all public methods (list_tools,
  call_tool, list_resources, read_resource, list_resource_templates,
  list_prompts, get_prompt, add_tool, remove_tool, add_resource, add_prompt)
- All new parameters default to None for full backward compatibility

This is iteration 4 of 6 in the multi-tenancy implementation plan.
…truction

Replace plain dict with Experimental() to satisfy pyright type checking.
Verify that each tenant's tool returns its own distinct result content,
not just that results are non-None.
…-handlers

feat(server): thread tenant_id through MCPServer handlers to managers
…missing APIs

Convert ToolManager, ResourceManager, and PromptManager from flat
tuple-keyed dicts to nested `{tenant_id: {name: T}}` dicts, so a tenant's
scope is found with one O(1) lookup instead of an O(n) scan over all
entries. Add duplicate-warning support
for resource templates, add remove_resource and remove_prompt methods,
document thread-safety constraints, and centralize Context construction
in tests via a make_context fixture.

Github-Issue:#8
Move the duplicated MakeContext type alias from test_multi_tenancy_managers.py
and test_resource_manager.py into conftest.py as the single source of truth.

Github-Issue:#8
Remove the inner dict from the outer storage when the last entry for a
tenant is deleted, preventing unbounded accumulation of empty dicts in
long-running servers with transient tenants.

Github-Issue:#8
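
A minimal sketch of nested per-tenant storage with pruning of empty scopes, assuming a generic `TenantScopedRegistry` class (hypothetical; the SDK's managers have richer APIs and their own error handling):

```python
from typing import Any, Optional


class TenantScopedRegistry:
    """Hypothetical registry stored as {tenant_id: {name: item}}."""

    def __init__(self) -> None:
        self._items: dict[Optional[str], dict[str, Any]] = {}

    def add(self, name: str, item: Any, *, tenant_id: Optional[str] = None) -> None:
        self._items.setdefault(tenant_id, {})[name] = item

    def names(self, *, tenant_id: Optional[str] = None) -> list[str]:
        # One dict lookup reaches the tenant's scope directly.
        return sorted(self._items.get(tenant_id, {}))

    def remove(self, name: str, *, tenant_id: Optional[str] = None) -> None:
        scope = self._items.get(tenant_id)
        if scope is None or name not in scope:
            raise KeyError(name)
        del scope[name]
        if not scope:
            # Drop the empty inner dict so transient tenants do not
            # leave entries behind in a long-running server.
            del self._items[tenant_id]


registry = TenantScopedRegistry()
registry.add("a", 1, tenant_id="acme")
registry.add("b", 2, tenant_id="acme")
registry.remove("a", tenant_id="acme")  # scope persists, "b" remains
```

Removing `"b"` as well would delete the `"acme"` scope from the outer dict entirely.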
Add tests for ResourceManager and PromptManager where a tenant has
multiple items and only one is removed, exercising the branch where the
scope dict persists. Fixes 100% branch coverage requirement in CI.

Github-Issue:#8
refactor(managers): O(1) tenant lookups and missing APIs
Prevent cross-tenant session hijacking by validating that the
authenticated tenant matches the session's bound tenant on every
request. Sessions created without a tenant (no auth) remain accessible
to all requests for backward compatibility.

Adds a parallel _session_tenants dict that records the tenant_id from
tenant_id_var at session creation time. On existing session lookup, a
mismatch returns 404 (same as "session not found" to avoid information
leakage). Tenant mappings are cleaned up alongside sessions on all
exit paths: idle timeout, crash, and shutdown.
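
The lookup rule above can be sketched as a small status function. `SessionTable`, `create`, `status_for`, and `close` are hypothetical names; only the `_session_tenants` mapping and the 404-on-mismatch behaviour come from the description.

```python
from typing import Optional


class SessionTable:
    """Hypothetical stand-in for the session manager's bookkeeping."""

    def __init__(self) -> None:
        self._sessions: dict[str, object] = {}
        self._session_tenants: dict[str, Optional[str]] = {}

    def create(self, session_id: str, tenant_id: Optional[str]) -> None:
        self._sessions[session_id] = object()
        # Record the tenant seen at creation time (None when unauthenticated).
        self._session_tenants[session_id] = tenant_id

    def status_for(self, session_id: str, request_tenant: Optional[str]) -> int:
        if session_id not in self._sessions:
            return 404
        bound = self._session_tenants.get(session_id)
        # Sessions created without a tenant stay open to every request.
        if bound is not None and bound != request_tenant:
            # Same status as "not found", so existence is not revealed.
            return 404
        return 200

    def close(self, session_id: str) -> None:
        # Remove both records together on every exit path.
        self._sessions.pop(session_id, None)
        self._session_tenants.pop(session_id, None)


table = SessionTable()
table.create("s-acme", "acme")
table.create("s-open", None)
```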

Includes 7 tests covering bidirectional isolation, same-tenant reuse,
backward compatibility, unauthenticated access rejection, and cleanup.
100% branch coverage on streamable_http_manager.py.
Add tests for _extract_session_id and _extract_status that exercise
the "skip non-start messages" and "not found" paths, replacing pragma
suppressions with actual coverage.
- Split tenant mismatch log: WARNING with session ID only, DEBUG for
  tenant values to avoid leaking sensitive data at default log levels
- _set_tenant/_reset_tenant helpers always set/reset the contextvar
  unconditionally, avoiding subtle bugs when tenant is None
- Wrap all blocking-session tests in try/finally to ensure stop.set()
  runs even on assertion failures, preventing test hangs
…manager

feat(session-manager): tenant validation on session access
Add comprehensive documentation and testing for the multi-tenant MCP
server feature (Iteration 6 of the multi-tenancy implementation plan).

Multi-tenancy guide (docs/multi-tenancy.md):
- Architecture overview with mermaid flow diagram
- Static and dynamic tenant provisioning (onboarding, feature flags, plugins)
- Authentication setup with TokenVerifier and identity provider configuration
  (Duo Security, Auth0, Okta, Microsoft Entra ID, custom JWT)
- Session isolation semantics
- Security considerations and backward compatibility

Example server (examples/servers/simple-multi-tenant/):
- Two-tenant demo (Acme analytics, Globex content) with tenant-scoped
  tools, resources, and prompts
- Shared whoami tool demonstrating Context.tenant_id
- README with usage instructions and in-memory testing guide

OAuth e2e tests (test_multi_tenancy_oauth_e2e.py):
- Full HTTP stack test: bearer token -> AuthContextMiddleware ->
  tenant_id_var -> session manager -> handler -> tenant-scoped response
- StubTokenVerifier for testing without a real OAuth server
- ASGI lifespan helper for httpx.ASGITransport
- Tests for tenant tool isolation, tool invocation, whoami identity,
  and unauthenticated request rejection

Also updates migration.md with multi-tenancy breaking change notes.
@andylim-duo
Author

Apologies for the erroneous PR — this was opened against the upstream repo by mistake. It has been closed.
