perf: add schema caching to avoid repeated reflection (#685)

SamMorrowDrums · omgitsads · almaleksia · web-flow · commit c2c7edc3f026 · 2026-01-20T20:16:46.000-05:00
## Summary

This PR adds schema caching that dramatically reduces the cost of tool registration in stateless server patterns.

**Contributors:** Thanks to @omgitsads and @almaleksia for collaborating on and researching this solution!

## Problem

In stateless deployments like `github/github-mcp-server`, a new `*mcp.Server` is created for each incoming request. This means `AddTool` is called repeatedly for the same tools, causing:

1. **For typed handlers**: `jsonschema.ForType()` reflection is called every time
2. **For pre-defined schemas**: `schema.Resolve()` is called every time

## Solution

Add a `schemaCache` that stores:

- Schemas by `reflect.Type` (for auto-generated schemas from typed handlers)
- Resolved schemas by schema pointer (for pre-defined schemas)

The cache is:

- Concurrent-safe using `sync.Map`
- Unbounded (typical MCP servers have <100 tools)
- Global (lives across server instances)

## Benchmark Results

```
BenchmarkAddToolTypedHandler-22           977080      1223 ns/op     1208 B/op      21 allocs/op
BenchmarkAddToolTypedHandlerNoCache-22      6764    161463 ns/op    39262 B/op    1072 allocs/op
```

| Metric | With Cache | Without Cache | Improvement |
|--------|------------|---------------|-------------|
| Time | 1,223 ns/op | 161,463 ns/op | **132x faster** |
| Allocations | 21 allocs | 1,072 allocs | **51x fewer** |
| Memory | 1,208 B/op | 39,262 B/op | **32x less** |

## Files Changed

- `mcp/schema_cache.go` - New cache implementation
- `mcp/server.go` - Modified `setSchema` to use cache
- `mcp/schema_cache_test.go` - Unit tests for caching behavior
- `mcp/schema_cache_benchmark_test.go` - Benchmarks

## Impact for Integrators

**Automatic** - no code changes required. Integrators using:

- Typed handlers (`AddTool[In, Out]`) → cache by type
- Pre-defined schemas (`Tool{InputSchema: schema}`) → cache by pointer

Both patterns benefit from caching after the first call.

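The type-keyed half of the cache described under Solution can be sketched in isolation. This is a minimal illustration under stated assumptions, not the code in `mcp/schema_cache.go`: `schema`, `typeCache`, `schemaFor`, `greetInput`, and `demo` are hypothetical names, the "schema" is reduced to a list of field names, and the build counter exists only to make the cache hit observable.

```go
package main

import (
	"fmt"
	"reflect"
	"sync"
	"sync/atomic"
)

// schema is a hypothetical stand-in for a generated JSON schema.
type schema struct {
	Fields []string
}

// typeCache memoizes one schema per Go type. reflect.Type values are
// comparable, so they can be used directly as sync.Map keys.
type typeCache struct {
	byType sync.Map     // reflect.Type -> *schema
	builds atomic.Int64 // number of reflection passes, for demonstration only
}

// schemaFor returns the cached schema for struct type t, building it by
// reflection on the first call. It panics if t is not a struct type.
func (c *typeCache) schemaFor(t reflect.Type) *schema {
	if v, ok := c.byType.Load(t); ok {
		return v.(*schema)
	}
	c.builds.Add(1)
	s := &schema{}
	for i := 0; i < t.NumField(); i++ {
		s.Fields = append(s.Fields, t.Field(i).Name)
	}
	// If two goroutines miss at the same time, both build, but
	// LoadOrStore makes every caller return the entry stored first.
	v, _ := c.byType.LoadOrStore(t, s)
	return v.(*schema)
}

type greetInput struct {
	Name string
}

// demo requests the same type twice, as two per-request servers would,
// and reports whether the pointer was reused and how many builds ran.
func demo() (same bool, builds int64) {
	c := &typeCache{}
	rt := reflect.TypeOf(greetInput{})
	first := c.schemaFor(rt)
	second := c.schemaFor(rt)
	return first == second, c.builds.Load()
}

func main() {
	same, builds := demo()
	fmt.Println(same, builds) // prints "true 1"
}
```

The cache added in this change uses separate `Load` and `Store` calls. `LoadOrStore` is a substitution made in this sketch so that concurrent misses converge on a single entry.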
---

## Real-World Performance Validation

The following benchmarks were conducted using `github/github-mcp-server` (a production MCP server with ~130 tools) to validate the performance impact in a real-world scenario.

### Test Environment

- **Server:** github-mcp-server (stateless HTTP deployment)
- **Tools registered per request:** ~130 tools
- **Test methodology:** Python benchmark client, 30 iterations (latency), 100 iterations (stress)
- **Date:** December 4, 2025

### Configurations Tested

| Configuration | Description |
|--------------|-------------|
| **main (mcp-go)** | Original implementation using mcp-go library |
| **go-sdk (no cache)** | go-sdk WITHOUT schema caching (broken) |
| **go-sdk (with cache)** | go-sdk WITH this PR's schema caching fix |

### Latency Test Results (n=30)

#### Operation: `initialize`

| Configuration | P50 | P99 | Status |
|--------------|-----|-----|--------|
| main (mcp-go) | 11.48ms | 14.00ms | ✅ Baseline |
| go-sdk (no cache) | 20.47ms | 25.30ms | 🔴 **+78% REGRESSION** |
| go-sdk (with cache) | 10.36ms | 14.33ms | ✅ **FIXED (-10%)** |

#### Operation: `tools/list`

| Configuration | P50 | P99 | Status |
|--------------|-----|-----|--------|
| main (mcp-go) | 13.46ms | 22.47ms | ✅ Baseline |
| go-sdk (no cache) | 22.98ms | 29.11ms | 🔴 **+71% REGRESSION** |
| go-sdk (with cache) | 14.15ms | 15.91ms | ✅ **FIXED (+5%)** |

#### Operation: `prompts/list`

| Configuration | P50 | P99 | Status |
|--------------|-----|-----|--------|
| main (mcp-go) | 11.56ms | 13.81ms | ✅ Baseline |
| go-sdk (no cache) | 21.00ms | 26.42ms | 🔴 **+82% REGRESSION** |
| go-sdk (with cache) | 10.77ms | 15.25ms | ✅ **FIXED (-7%)** |

### Stress Test Results (n=100)

| Configuration | P50 (initialize) | P50 (tools/list) | P50 (prompts/list) |
|--------------|------------------|------------------|-------------------|
| main (mcp-go) | 12.06ms | 14.30ms | 11.34ms |
| go-sdk (no cache) | 20.06ms | 23.44ms | 19.59ms |
| go-sdk (with cache) | **11.83ms** | **14.58ms** | **11.05ms** |

### Memory/Allocation Comparison (from pprof)

| Configuration | Total Allocations | Comparison |
|--------------|-------------------|------------|
| main (mcp-go) | 355.91 MB | Baseline |
| go-sdk (no cache) | **1208.70 MB** | 🔴 **3.4x MORE allocations** |

#### Top Allocation Sources - go-sdk WITHOUT cache (broken)

| Function | Size | % | Issue |
|----------|------|---|-------|
| `jsonschema.UnmarshalJSON` | 324.60 MB | 27% | 🚨 Schema re-parsing |
| `encoding/json.Unmarshal` | 341.10 MB | 28% | JSON deserialization |
| `jsonschema.resolve` | 219.51 MB | 18% | 🚨 Schema re-resolution |
| `jsonschema.MarshalJSON` | 92.52 MB | 8% | Schema JSON encoding |

### Root Cause Analysis

The `google/jsonschema-go` library regenerates JSON schemas on every request instead of caching them. In a server with ~130 tools, this causes:

- **70-80% latency regression** on all MCP operations
- **3.4x more memory allocations** per request
- **~70% of all allocations** are schema-related operations

### Key Findings

1. ✅ **REGRESSION CONFIRMED:** go-sdk without schema caching is 70-80% slower than mcp-go
2. ✅ **FIX VERIFIED:** Schema caching restores performance to baseline (or better)
3. ✅ **MEMORY IMPACT:** 3.4x reduction in allocations with caching
4. ✅ **PRODUCTION READY:** Fixed version performs equivalently to mcp-go baseline

---------

Co-authored-by: Adam Holt <4619+omgitsads@users.noreply.github.com>
Co-authored-by: Ksenia Bobrova <1885174+almaleksia@users.noreply.github.com>
diff --git a/docs/server.md b/docs/server.md
@@ -403,6 +403,23 @@ _See [mcp/tool_example_test.go](../mcp/tool_example_test.go) for the full
 example, or [examples/server/toolschemas](examples/server/toolschemas/main.go)
 for more examples of customizing tool schemas._
 
+**Stateless server deployments:** Some deployments create a new
+[`Server`](https://pkg.go.dev/github.com/modelcontextprotocol/go-sdk/mcp#Server)
+for each incoming request, re-registering tools every time. To avoid repeated
+schema generation, create a
+[`SchemaCache`](https://pkg.go.dev/github.com/modelcontextprotocol/go-sdk/mcp#SchemaCache)
+and share it across server instances:
+
+```go
+var schemaCache = mcp.NewSchemaCache() // create once at startup
+
+func handleRequest(w http.ResponseWriter, r *http.Request) {
+    s := mcp.NewServer(impl, &mcp.ServerOptions{SchemaCache: schemaCache})
+    mcp.AddTool(s, myTool, myHandler)
+    // ...
+}
+```
+
 ## Utilities
 
 ### Completion
diff --git a/internal/docs/server.src.md b/internal/docs/server.src.md
@@ -193,6 +193,23 @@ _See [mcp/tool_example_test.go](../mcp/tool_example_test.go) for the full
 example, or [examples/server/toolschemas](examples/server/toolschemas/main.go)
 for more examples of customizing tool schemas._
 
+**Stateless server deployments:** Some deployments create a new
+[`Server`](https://pkg.go.dev/github.com/modelcontextprotocol/go-sdk/mcp#Server)
+for each incoming request, re-registering tools every time. To avoid repeated
+schema generation, create a
+[`SchemaCache`](https://pkg.go.dev/github.com/modelcontextprotocol/go-sdk/mcp#SchemaCache)
+and share it across server instances:
+
+```go
+var schemaCache = mcp.NewSchemaCache() // create once at startup
+
+func handleRequest(w http.ResponseWriter, r *http.Request) {
+    s := mcp.NewServer(impl, &mcp.ServerOptions{SchemaCache: schemaCache})
+    mcp.AddTool(s, myTool, myHandler)
+    // ...
+}
+```
+
 ## Utilities
 
 ### Completion
diff --git a/mcp/schema_cache.go b/mcp/schema_cache.go
@@ -0,0 +1,69 @@
+// Copyright 2025 The Go MCP SDK Authors. All rights reserved.
+// Use of this source code is governed by an MIT-style
+// license that can be found in the LICENSE file.
+
+package mcp
+
+import (
+	"reflect"
+	"sync"
+
+	"github.com/google/jsonschema-go/jsonschema"
+)
+
+// A SchemaCache caches JSON schemas to avoid repeated reflection and resolution.
+//
+// This is useful for stateless server deployments (one [Server] per request)
+// where tools are re-registered on every request. Without caching, each
+// [AddTool] call triggers expensive reflection-based schema generation.
+//
+// A SchemaCache is safe for concurrent use by multiple goroutines.
+//
+// # Trade-offs
+//
+// The cache is unbounded: it stores one entry per unique Go type or schema
+// pointer. For typical MCP servers with a fixed set of tools, memory usage
+// is negligible. However, if tool input types are generated dynamically,
+// the cache will grow without bound.
+//
+// The cache uses pointer identity for pre-defined schemas. If a schema's
+// contents change but the pointer remains the same, stale resolved schemas
+// may be returned. In practice, this is not an issue because tool schemas
+// are typically defined once at startup.
+type SchemaCache struct {
+	byType   sync.Map // reflect.Type -> *cachedSchema
+	bySchema sync.Map // *jsonschema.Schema -> *jsonschema.Resolved
+}
+
+type cachedSchema struct {
+	schema   *jsonschema.Schema
+	resolved *jsonschema.Resolved
+}
+
+// NewSchemaCache creates a new [SchemaCache].
+func NewSchemaCache() *SchemaCache {
+	return &SchemaCache{}
+}
+
+func (c *SchemaCache) getByType(t reflect.Type) (*jsonschema.Schema, *jsonschema.Resolved, bool) {
+	if v, ok := c.byType.Load(t); ok {
+		cs := v.(*cachedSchema)
+		return cs.schema, cs.resolved, true
+	}
+	return nil, nil, false
+}
+
+func (c *SchemaCache) setByType(t reflect.Type, schema *jsonschema.Schema, resolved *jsonschema.Resolved) {
+	c.byType.Store(t, &cachedSchema{schema: schema, resolved: resolved})
+}
+
+func (c *SchemaCache) getBySchema(schema *jsonschema.Schema) (*jsonschema.Resolved, bool) {
+	if v, ok := c.bySchema.Load(schema); ok {
+		return v.(*jsonschema.Resolved), true
+	}
+	return nil, false
+}
+
+func (c *SchemaCache) setBySchema(schema *jsonschema.Schema, resolved *jsonschema.Resolved) {
+	c.bySchema.Store(schema, resolved)
+}
diff --git a/mcp/schema_cache_test.go b/mcp/schema_cache_test.go
@@ -0,0 +1,208 @@
+// Copyright 2025 The Go MCP SDK Authors. All rights reserved.
+// Use of this source code is governed by an MIT-style
+// license that can be found in the LICENSE file.
+
+package mcp
+
+import (
+	"context"
+	"reflect"
+	"testing"
+
+	"github.com/google/jsonschema-go/jsonschema"
+)
+
+func TestSchemaCacheByType(t *testing.T) {
+	cache := NewSchemaCache()
+
+	type TestInput struct {
+		Name string `json:"name"`
+	}
+
+	rt := reflect.TypeFor[TestInput]()
+
+	if _, _, ok := cache.getByType(rt); ok {
+		t.Error("expected cache miss for new type")
+	}
+
+	schema := &jsonschema.Schema{Type: "object"}
+	resolved, err := schema.Resolve(nil)
+	if err != nil {
+		t.Fatalf("failed to resolve schema: %v", err)
+	}
+	cache.setByType(rt, schema, resolved)
+
+	gotSchema, gotResolved, ok := cache.getByType(rt)
+	if !ok {
+		t.Error("expected cache hit after set")
+	}
+	if gotSchema != schema {
+		t.Error("schema mismatch")
+	}
+	if gotResolved != resolved {
+		t.Error("resolved schema mismatch")
+	}
+}
+
+func TestSchemaCacheBySchema(t *testing.T) {
+	cache := NewSchemaCache()
+
+	schema := &jsonschema.Schema{
+		Type: "object",
+		Properties: map[string]*jsonschema.Schema{
+			"query": {Type: "string"},
+		},
+	}
+
+	if _, ok := cache.getBySchema(schema); ok {
+		t.Error("expected cache miss for new schema")
+	}
+
+	resolved, err := schema.Resolve(nil)
+	if err != nil {
+		t.Fatalf("failed to resolve schema: %v", err)
+	}
+	cache.setBySchema(schema, resolved)
+
+	gotResolved, ok := cache.getBySchema(schema)
+	if !ok {
+		t.Error("expected cache hit after set")
+	}
+	if gotResolved != resolved {
+		t.Error("resolved schema mismatch")
+	}
+
+	// Different pointer should miss (cache uses pointer identity).
+	schema2 := &jsonschema.Schema{Type: "object"}
+	if _, ok = cache.getBySchema(schema2); ok {
+		t.Error("expected cache miss for different schema pointer")
+	}
+}
+
+func TestSetSchemaCachesGeneratedSchemas(t *testing.T) {
+	cache := NewSchemaCache()
+
+	type TestInput struct {
+		Query string `json:"query"`
+	}
+
+	rt := reflect.TypeFor[TestInput]()
+
+	var sfield1 any
+	var rfield1 *jsonschema.Resolved
+	if _, err := setSchema[TestInput](&sfield1, &rfield1, cache); err != nil {
+		t.Fatalf("setSchema failed: %v", err)
+	}
+
+	cachedSchema, cachedResolved, ok := cache.getByType(rt)
+	if !ok {
+		t.Fatal("schema not cached after first setSchema call")
+	}
+
+	var sfield2 any
+	var rfield2 *jsonschema.Resolved
+	if _, err := setSchema[TestInput](&sfield2, &rfield2, cache); err != nil {
+		t.Fatalf("setSchema failed on second call: %v", err)
+	}
+
+	if sfield2.(*jsonschema.Schema) != cachedSchema {
+		t.Error("expected cached schema to be returned")
+	}
+	if rfield2 != cachedResolved {
+		t.Error("expected cached resolved schema to be returned")
+	}
+}
+
+func TestSetSchemaCachesProvidedSchemas(t *testing.T) {
+	cache := NewSchemaCache()
+
+	schema := &jsonschema.Schema{
+		Type: "object",
+		Properties: map[string]*jsonschema.Schema{
+			"query": {Type: "string"},
+		},
+	}
+
+	var sfield1 any = schema
+	var rfield1 *jsonschema.Resolved
+	if _, err := setSchema[map[string]any](&sfield1, &rfield1, cache); err != nil {
+		t.Fatalf("setSchema failed: %v", err)
+	}
+
+	cachedResolved, ok := cache.getBySchema(schema)
+	if !ok {
+		t.Fatal("resolved schema not cached after first setSchema call")
+	}
+	if rfield1 != cachedResolved {
+		t.Error("expected same resolved schema")
+	}
+
+	var sfield2 any = schema
+	var rfield2 *jsonschema.Resolved
+	if _, err := setSchema[map[string]any](&sfield2, &rfield2, cache); err != nil {
+		t.Fatalf("setSchema failed on second call: %v", err)
+	}
+
+	if rfield2 != cachedResolved {
+		t.Error("expected cached resolved schema to be returned")
+	}
+}
+
+func TestSetSchemaNilCache(t *testing.T) {
+	type TestInput struct {
+		Query string `json:"query"`
+	}
+
+	var sfield1 any
+	var rfield1 *jsonschema.Resolved
+	if _, err := setSchema[TestInput](&sfield1, &rfield1, nil); err != nil {
+		t.Fatalf("setSchema failed: %v", err)
+	}
+
+	var sfield2 any
+	var rfield2 *jsonschema.Resolved
+	if _, err := setSchema[TestInput](&sfield2, &rfield2, nil); err != nil {
+		t.Fatalf("setSchema failed on second call: %v", err)
+	}
+
+	if sfield1 == nil || sfield2 == nil {
+		t.Error("expected schemas to be generated")
+	}
+	if rfield1 == nil || rfield2 == nil {
+		t.Error("expected resolved schemas to be generated")
+	}
+}
+
+func TestAddToolWithSharedCache(t *testing.T) {
+	cache := NewSchemaCache()
+
+	type GreetInput struct {
+		Name string `json:"name" jsonschema:"the name to greet"`
+	}
+
+	type GreetOutput struct {
+		Message string `json:"message"`
+	}
+
+	handler := func(ctx context.Context, req *CallToolRequest, in GreetInput) (*CallToolResult, GreetOutput, error) {
+		return &CallToolResult{}, GreetOutput{Message: "Hello, " + in.Name}, nil
+	}
+
+	tool := &Tool{
+		Name:        "greet",
+		Description: "Greet someone",
+	}
+
+	// Simulate stateless server pattern: new server per request, shared cache.
+	for i := 0; i < 3; i++ {
+		s := NewServer(&Implementation{Name: "test", Version: "1.0"}, &ServerOptions{
+			SchemaCache: cache,
+		})
+		AddTool(s, tool, handler)
+	}
+
+	rt := reflect.TypeFor[GreetInput]()
+	if _, _, ok := cache.getByType(rt); !ok {
+		t.Error("expected schema to be cached by type after multiple AddTool calls")
+	}
+}
diff --git a/mcp/server.go b/mcp/server.go
diff --git a/mcp/server_test.go b/mcp/server_test.go