Commit c03a81f

docs: clarify Streamable HTTP stateless mode semantics and usage
1 parent 62575ed commit c03a81f

1 file changed

Lines changed: 146 additions & 1 deletion

README.md

@@ -1243,7 +1243,7 @@ Note that `uv run mcp run` or `uv run mcp dev` only supports server using FastMC

### Streamable HTTP Transport

- > **Note**: Streamable HTTP transport is the recommended transport for production deployments. Use `stateless_http=True` and `json_response=True` for optimal scalability.
+ > **Note**: Streamable HTTP transport is the recommended transport for production deployments. For serverless and load-balanced environments, consider using `stateless_http=True` and `json_response=True`. See [Understanding Stateless Mode](#understanding-stateless-mode) for guidance on choosing between stateful and stateless operation.

<!-- snippet-source examples/snippets/servers/streamable_config.py -->
```python
@@ -1355,6 +1355,151 @@ The streamable HTTP transport supports:
- JSON or SSE response formats
- Better scalability for multi-node deployments

#### Understanding Stateless Mode

The Streamable HTTP transport can operate in two modes: **stateful** (default) and **stateless**. The mode determines which MCP features are available and how the server can be deployed.

##### What "Stateless" Means

In **stateless mode** (`stateless_http=True`), each HTTP request creates an independent MCP session that exists only for the duration of that request:

- **No session tracking**: No `Mcp-Session-Id` header is used or required
- **Per-request lifecycle**: Each request gets a fresh transport and server session, which are torn down once the request has been handled
- **No state persistence**: No information is retained between requests
- **No event store**: Resumability features are disabled

In **stateful mode** (default), by contrast:

- A session persists across multiple requests
- The `Mcp-Session-Id` header links requests to an existing session
- Server state (e.g., subscriptions, context) is maintained between calls
- Event stores can provide resumability if the connection drops
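
The difference in where state lives can be illustrated with a toy model. This is a sketch only and not how the SDK is implemented: `StatefulHandler`, `StatelessHandler`, and the per-session call counter are hypothetical names chosen for illustration.

```python
import uuid
from typing import Optional


class StatefulHandler:
    """Toy model: state is kept per session ID across requests."""

    def __init__(self) -> None:
        self.sessions: dict = {}

    def handle(self, session_id: Optional[str]) -> tuple:
        # A request without a session ID starts a new session.
        if session_id is None:
            session_id = uuid.uuid4().hex
            self.sessions[session_id] = {"calls": 0}
        # A known session ID finds the state left by earlier requests;
        # an unknown one raises KeyError in this toy model.
        state = self.sessions[session_id]
        state["calls"] += 1
        return session_id, state["calls"]


class StatelessHandler:
    """Toy model: every request starts from fresh state."""

    def handle(self) -> int:
        state = {"calls": 0}  # created for this request only
        state["calls"] += 1
        return state["calls"]  # the state is discarded on return


stateful = StatefulHandler()
sid, first = stateful.handle(None)
_, second = stateful.handle(sid)
print(first, second)  # 1 2: the counter survives between requests

stateless = StatelessHandler()
print(stateless.handle(), stateless.handle())  # 1 1: nothing is retained
```

In the stateful model the second request must reach the process that holds `sessions`, which is why that mode needs session affinity behind a load balancer.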
1377+
1378+
##### MCP Features Impacted by Stateless Mode
1379+
1380+
When running in stateless mode, certain MCP features are unavailable or behave differently:
1381+
1382+
| Feature | Stateful Mode | Stateless Mode |
1383+
|---------|---------------|----------------|
1384+
| **Server Notifications** | ✅ Supported | ❌ Not available<sup>1</sup> |
1385+
| **Resource Subscriptions** | ✅ Supported | ❌ Not available<sup>1</sup> |
1386+
| **Multi-turn Context** | ✅ Maintained | ❌ Lost between requests<sup>2</sup> |
1387+
| **Long-running Tools** | ✅ Can use notifications for progress | ⚠️ Must complete within request timeout |
1388+
| **Event Resumability** | ✅ With event store | ❌ Not applicable |
1389+
| **Tools/Resources/Prompts** | ✅ Fully supported | ✅ Fully supported |
1390+
| **Concurrent Requests** | ⚠️ One per session | ✅ Unlimited<sup>3</sup> |
1391+
1392+
<sup>1</sup> Server-initiated notifications require a persistent connection to deliver updates
1393+
<sup>2</sup> Each request starts fresh; client must provide all necessary context
1394+
<sup>3</sup> Each request is independent, enabling horizontal scaling
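
The table can also be read as a lookup: list the features a server needs and check whether stateless mode provides them. The sketch below encodes the rows above as a plain dictionary; the keys and the `missing_features` and `pick_mode` helpers are hypothetical and are not part of the SDK.

```python
# Mirrors the feature table; keys are illustrative, not SDK identifiers.
SUPPORT = {
    "server_notifications": {"stateful": True, "stateless": False},
    "resource_subscriptions": {"stateful": True, "stateless": False},
    "multi_turn_context": {"stateful": True, "stateless": False},
    # Stateful support assumes an event store has been configured.
    "event_resumability": {"stateful": True, "stateless": False},
    "tools_resources_prompts": {"stateful": True, "stateless": True},
}


def missing_features(required: list, mode: str) -> list:
    """Return the required features that `mode` does not provide."""
    return [name for name in required if not SUPPORT[name][mode]]


def pick_mode(required: list) -> str:
    """Prefer stateless; fall back to stateful if a feature is missing."""
    if missing_features(required, "stateless"):
        return "stateful"
    return "stateless"


print(pick_mode(["tools_resources_prompts"]))
print(pick_mode(["tools_resources_prompts", "resource_subscriptions"]))
```

The first call selects stateless mode; the second selects stateful mode because resource subscriptions are not available without a session.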
1395+
1396+
##### When to Use Stateless Mode
1397+
1398+
**Stateless mode is ideal for:**
1399+
1400+
- **Serverless Deployments**: AWS Lambda, Cloud Functions, or similar FaaS platforms where instances are ephemeral
1401+
- **Load-Balanced Multi-Node**: Deploying across multiple servers without sticky sessions
1402+
- **Stateless APIs**: Services where each request is self-contained (e.g., data lookups, calculations)
1403+
- **High Concurrency**: Scenarios requiring many simultaneous independent operations
1404+
- **Simplified Operations**: Avoiding session management complexity
1405+
1406+
**Use stateful mode when:**
1407+
1408+
- Server needs to push notifications to clients (e.g., progress updates, real-time events)
1409+
- Resources require subscriptions with change notifications
1410+
- Tools maintain conversation state across multiple turns
1411+
- Long-running operations need to report progress asynchronously
1412+
- Connection resumability is required
1413+
1414+
##### Example: Stateless Configuration

```python
from mcp.server.fastmcp import FastMCP

# Stateless server - each request is independent
mcp = FastMCP(
    "StatelessAPI",
    stateless_http=True,  # Enable stateless mode
    json_response=True,  # Recommended for stateless
)


@mcp.tool()
def calculate(a: int, b: int, operation: str) -> int:
    """Stateless calculation tool."""
    operations = {"add": a + b, "multiply": a * b}
    if operation not in operations:
        raise ValueError(f"Unsupported operation: {operation}")
    return operations[operation]


# Each request will:
# 1. Start a fresh transport and server session
# 2. Process the calculate tool call
# 3. Return the result
# 4. Tear down the transport and session
```
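
From the client side, a call to this server is a single self-contained POST. The sketch below builds such a request with the standard library only. The URL is a hypothetical local address, `build_tool_call` is a helper invented for this example, and it assumes the server accepts a `tools/call` request without a prior `initialize` exchange when running statelessly; verify that against the SDK version in use.

```python
import json
import urllib.request

# Assumption: the server above is reachable here (hypothetical address).
MCP_URL = "http://localhost:8000/mcp"


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC `tools/call` POST request."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients advertise both response formats.
            "Accept": "application/json, text/event-stream",
            # No Mcp-Session-Id header: the request carries no session.
        },
        method="POST",
    )


request = build_tool_call("calculate", {"a": 2, "b": 3, "operation": "add"})

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Because the request carries everything the tool needs, it can be retried or routed to any instance without coordination.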
1438+
1439+
##### Deployment Patterns
1440+
1441+
**Pattern 1: Pure Stateless (Recommended)**
1442+
1443+
```python
1444+
# Best for: Serverless, auto-scaling environments
1445+
mcp = FastMCP("MyServer", stateless_http=True, json_response=True)
1446+
1447+
# Clients can connect to any instance
1448+
# Load balancer doesn't need session affinity
1449+
```
1450+
1451+
**Pattern 2: Stateful with Sticky Sessions**
1452+
1453+
```python
1454+
# Best for: When you need notifications but have load balancing
1455+
mcp = FastMCP("MyServer", stateless_http=False) # Default
1456+
1457+
# Load balancer must use sticky sessions based on Mcp-Session-Id header
1458+
# ALB/NGINX can route by header value to maintain session affinity
1459+
```
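
Header-based affinity comes down to mapping each session ID to the same backend every time. The following is a minimal sketch of that idea, not the algorithm of any particular load balancer; the backend addresses and the `pick_backend` helper are hypothetical.

```python
import hashlib

# Hypothetical backend addresses, for illustration only.
BACKENDS = ["10.0.0.1:8000", "10.0.0.2:8000", "10.0.0.3:8000"]


def pick_backend(session_id: str, backends: list) -> str:
    """Map a session ID to one backend deterministically."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer bucket selector.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


session = "3f2a9c"  # value a client would send in Mcp-Session-Id
print(pick_backend(session, BACKENDS) == pick_backend(session, BACKENDS))  # True
```

Plain modulo hashing reassigns many sessions when the backend list changes; consistent hashing reduces that churn, at the cost of a more involved implementation.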
1460+
1461+
**Pattern 3: Hybrid Approach**
1462+
1463+
```python
1464+
# Deploy both modes side-by-side
1465+
stateless_mcp = FastMCP("StatelessAPI", stateless_http=True)
1466+
stateful_mcp = FastMCP("StatefulAPI", stateless_http=False)
1467+
1468+
app = Starlette(routes=[
1469+
Mount("/api/stateless", app=stateless_mcp.streamable_http_app()),
1470+
Mount("/api/stateful", app=stateful_mcp.streamable_http_app()),
1471+
])
1472+
```
1473+
1474+
##### Technical Details
1475+
1476+
**Session Lifecycle in Stateless Mode:**
1477+
1478+
1. Client sends HTTP POST request to `/mcp` endpoint
1479+
2. Server creates ephemeral `StreamableHTTPServerTransport` (no session ID)
1480+
3. Server initializes fresh `Server` instance with `stateless=True` flag
1481+
4. Request is processed using the ephemeral transport
1482+
5. Response is sent back to client
1483+
6. Transport and server instance are immediately terminated
1484+
1485+
**Performance Characteristics:**
1486+
1487+
- **Initialization overhead**: Each request pays the cost of server initialization
1488+
- **Memory efficiency**: No long-lived sessions consuming memory
1489+
- **Scalability**: Excellent horizontal scaling with no state synchronization
1490+
- **Latency**: Slightly higher per-request latency due to initialization
1491+
1492+
**Stateless Mode Checklist:**
1493+
1494+
When designing for stateless mode, ensure:
1495+
1496+
- ✅ Tools are self-contained and don't rely on previous calls
1497+
- ✅ All required context is passed in each request
1498+
- ✅ Tools complete synchronously within request timeout
1499+
- ✅ No server notifications or subscriptions are needed
1500+
- ✅ Client handles any necessary state management
1501+
- ✅ Operations are idempotent where possible
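
Two checklist items, passing all context in each request and keeping operations idempotent, can be shown with plain functions. This is a sketch under assumed names: the cart example and both functions are hypothetical, and they are left undecorated so the snippet runs without the SDK.

```python
# Anti-pattern for stateless mode: relies on memory from earlier calls.
_cart: list = []


def add_item_stateful(item: str) -> list:
    # The list lives in one process's memory, so the next call may not
    # see it if another instance serves that request.
    _cart.append(item)
    return list(_cart)


# Stateless-friendly: the client sends the current cart with every call.
def add_item_stateless(cart: list, item: str) -> list:
    """Return a new cart; adding an item already present is a no-op."""
    if item in cart:
        return list(cart)
    return [*cart, item]


cart = add_item_stateless(["apple"], "pear")
print(cart)  # ['apple', 'pear']
print(add_item_stateless(cart, "pear"))  # ['apple', 'pear'], safe to retry
```

Since the second function never mutates shared data, a retried or duplicated request produces the same result regardless of which instance handles it.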
1502+
13581503
#### CORS Configuration for Browser-Based Clients
13591504

13601505
If you'd like your server to be accessible by browser-based MCP clients, you'll need to configure CORS headers. The `Mcp-Session-Id` header must be exposed for browser clients to access it:
