@analytically (Contributor)
Replace container/list with custom typed doubly-linked list and add node pooling to eliminate allocations in steady-state eviction cycles.

Optimizations:

  • Custom typed linked list eliminates interface{} boxing and type assertions
  • Sentinel head/tail nodes remove nil checks in list operations
  • Node freelist reuses evicted nodes (zero allocations in steady-state)
  • clear() instead of map reallocation reduces GC pressure
  • Slice length reset preserves capacity for reuse

Benchmark improvements (capacity=512, matching pgx default):

  • High hit rate (99% cache hits): 9% faster
  • Steady-state churn: 59% faster, zero allocations (was 4 allocs/op)
  • DeallocateAll cycle: 29% faster
  • Put with eviction: 22% faster, 50% fewer allocations

The largest gains are in churn scenarios where the cache is at capacity and statements are constantly evicted and added, which is exactly what happens in production with a fixed-size prepared statement cache.
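To make the core data-structure change concrete, here is a minimal sketch of a typed doubly-linked list with sentinel head/tail nodes. The type and function names (`stmtDesc`, `lruList`, `pushFront`) are invented for illustration and do not match the pgx source; the point is that a typed `sd` field avoids `interface{}` boxing, and sentinels guarantee `prev`/`next` are never nil:

```go
package main

import "fmt"

// stmtDesc stands in for a prepared-statement description (hypothetical).
type stmtDesc struct{ sql string }

// node is a typed list node: sd is accessed directly, with no
// interface{} boxing or type assertion.
type node struct {
	key        string
	sd         *stmtDesc
	prev, next *node
}

// lruList uses sentinel head/tail nodes, so every real node always has
// valid prev and next pointers and list operations need no nil checks.
type lruList struct {
	head, tail *node
}

func newLRUList() *lruList {
	l := &lruList{head: &node{}, tail: &node{}}
	l.head.next = l.tail
	l.tail.prev = l.head
	return l
}

// pushFront links n immediately after the head sentinel.
func (l *lruList) pushFront(n *node) {
	n.prev = l.head
	n.next = l.head.next
	l.head.next.prev = n
	l.head.next = n
}

// remove unlinks n; the sentinels guarantee n.prev and n.next exist.
func (l *lruList) remove(n *node) {
	n.prev.next = n.next
	n.next.prev = n.prev
	n.prev, n.next = nil, nil
}

func main() {
	l := newLRUList()
	a, b := &node{key: "a"}, &node{key: "b"}
	l.pushFront(a)
	l.pushFront(b)
	fmt.Println(l.head.next.key) // b
	l.remove(b)
	fmt.Println(l.head.next.key) // a
}
```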

@analytically (Contributor, Author)
LRU Cache Optimization: 59% faster, zero allocations in steady-state

Get() — 9% faster

| Aspect | Original | V2 |
|---|---|---|
| Type assertion | Yes (`interface{}` to `*SD`) | No (typed field) |
| Validation checks | `e.list != l`, `l.root.next == e` | `node.prev == c.head` |
| Nil checks in list ops | Multiple | None (sentinels guarantee valid pointers) |
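A sketch of how such a `Get` path can work (hypothetical names, not the actual pgx code): the map stores typed `*node` values, so the value is read by direct field access, and the `node.prev == c.head` comparison replaces the original's element validation before a move-to-front:

```go
package main

import "fmt"

type node struct {
	key        string
	val        int
	prev, next *node
}

type cache struct {
	head, tail *node // sentinels
	m          map[string]*node
}

func newCache() *cache {
	c := &cache{head: &node{}, tail: &node{}, m: map[string]*node{}}
	c.head.next = c.tail
	c.tail.prev = c.head
	return c
}

// pushFront links n right after the head sentinel.
func (c *cache) pushFront(n *node) {
	n.prev, n.next = c.head, c.head.next
	c.head.next.prev = n
	c.head.next = n
}

// Get returns the typed value directly (no type assertion) and
// promotes the node unless it is already at the front.
func (c *cache) Get(key string) (int, bool) {
	n, ok := c.m[key]
	if !ok {
		return 0, false
	}
	if n.prev != c.head { // not already most recently used
		n.prev.next = n.next
		n.next.prev = n.prev
		c.pushFront(n)
	}
	return n.val, true
}

func main() {
	c := newCache()
	n := &node{key: "x", val: 42}
	c.pushFront(n)
	c.m["x"] = n
	v, ok := c.Get("x")
	fmt.Println(v, ok) // 42 true
}
```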

Put() — 22% faster, 50% fewer allocs

| Aspect | Original | V2 |
|---|---|---|
| Node allocation | Every `Put` | Only when freelist empty |
| Steady-state allocs | 1 per `Put` | 0 per `Put` |
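The freelist idea can be sketched as follows (again with invented names; `allocNode`/`freeNode` mirror the helpers mentioned in the summary table but this is not the pgx implementation). At capacity, every `Put` first evicts a node onto the freelist and then pops it right back off, so no heap allocation occurs:

```go
package main

import "fmt"

type node struct {
	key        string
	val        int
	prev, next *node
}

type cache struct {
	head, tail *node // sentinels
	m          map[string]*node
	free       *node // singly linked freelist of recycled nodes (via next)
	cap, len   int
}

func newCache(capacity int) *cache {
	c := &cache{head: &node{}, tail: &node{}, m: map[string]*node{}, cap: capacity}
	c.head.next = c.tail
	c.tail.prev = c.head
	return c
}

// allocNode pops from the freelist; it allocates only when empty.
func (c *cache) allocNode() *node {
	if n := c.free; n != nil {
		c.free, n.next = n.next, nil
		return n
	}
	return &node{}
}

// freeNode zeroes the node and pushes it onto the freelist.
func (c *cache) freeNode(n *node) {
	*n = node{next: c.free}
	c.free = n
}

func (c *cache) Put(key string, val int) {
	if c.len == c.cap { // evict the least recently used (back of list)
		lru := c.tail.prev
		lru.prev.next = c.tail
		c.tail.prev = lru.prev
		delete(c.m, lru.key)
		c.freeNode(lru)
		c.len--
	}
	n := c.allocNode() // reuses the node just evicted, if any
	n.key, n.val = key, val
	n.prev, n.next = c.head, c.head.next
	c.head.next.prev = n
	c.head.next = n
	c.m[key] = n
	c.len++
}

func main() {
	c := newCache(1)
	c.Put("a", 1)
	first := c.m["a"]
	c.Put("b", 2) // evicts "a"; allocNode hands back the same node
	fmt.Println(c.m["b"] == first) // true
}
```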

RemoveInvalidated() — 50% faster

| Aspect | Original | V2 |
|---|---|---|
| Slice | Discards backing array | Reuses existing capacity |
| Map | New allocation | Cleared in place |
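Both reuse idioms are plain Go and can be shown in isolation: the `clear` builtin (Go 1.21+) empties a map without dropping its buckets, and re-slicing to `[:0]` resets a slice's length while keeping its backing array:

```go
package main

import "fmt"

func main() {
	// clear() empties the map in place, keeping its allocated buckets,
	// instead of replacing it with a freshly allocated map.
	m := map[string]int{"a": 1, "b": 2}
	clear(m)
	fmt.Println(len(m)) // 0

	// s[:0] resets length but keeps the backing array, so appends up
	// to the old capacity allocate nothing; `s = nil` would discard it.
	s := make([]int, 0, 8)
	s = append(s, 1, 2, 3)
	s = s[:0]
	fmt.Println(len(s), cap(s)) // 0 8
}
```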

InvalidateAll() — 29% faster

| Aspect | Original | V2 |
|---|---|---|
| Per-node type assertion | Yes | No |
| Node memory | Lost to GC | Recycled to freelist |
| Map | New allocation | Cleared in place |
| List | New allocation | Reset pointers only |
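A hedged sketch of such an `InvalidateAll` (hypothetical names, not the pgx source): one pass recycles every node into the freelist, then the sentinel pointers are reset and the map is cleared in place, so neither the list nor the map is reallocated:

```go
package main

import "fmt"

type node struct {
	key        string
	prev, next *node
}

type cache struct {
	head, tail *node // sentinels
	m          map[string]*node
	free       *node // freelist, linked via next
	len        int
}

func newCache() *cache {
	c := &cache{head: &node{}, tail: &node{}, m: map[string]*node{}}
	c.head.next = c.tail
	c.tail.prev = c.head
	return c
}

func (c *cache) pushFront(n *node) {
	n.prev, n.next = c.head, c.head.next
	c.head.next.prev = n
	c.head.next = n
}

func (c *cache) InvalidateAll() {
	for n := c.head.next; n != c.tail; {
		next := n.next
		*n = node{next: c.free} // recycle to freelist instead of losing to GC
		c.free = n
		n = next
	}
	c.head.next = c.tail // reset pointers only; no new list
	c.tail.prev = c.head
	clear(c.m) // cleared in place; no new map
	c.len = 0
}

func main() {
	c := newCache()
	for _, k := range []string{"a", "b"} {
		n := &node{key: k}
		c.pushFront(n)
		c.m[k] = n
		c.len++
	}
	c.InvalidateAll()
	fmt.Println(len(c.m), c.len, c.free != nil) // 0 0 true
}
```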

Assembly: Get() — 13% smaller

| Metric | Original | V2 |
|---|---|---|
| Code size | 380 bytes | 329 bytes (13% smaller) |

Summary of Optimizations

| Optimization | Location | Impact |
|---|---|---|
| Typed node field | `lruNodeV2.sd` vs `Element.Value` | No type assertion; direct field access |
| Sentinel nodes | head/tail dummy nodes | Eliminates nil checks in all list operations |
| Node pooling | `freelist`, `allocNode()`, `freeNode()` | Zero allocations in steady-state churn |
| Map reuse | `clear(c.m)` vs `make()` | Reduces GC pressure |
| Slice capacity reuse | `[:0]` vs `= nil` | Preserves backing array |
| Explicit length | `c.len` field vs `c.l.Len()` | Avoids method-call overhead |

@analytically (Contributor, Author)
Net memory result: ~40% less memory for node storage, and the freelist retains evicted nodes instead of creating GC pressure. In theory the freelist can grow without bound, but in practice it is capped by the cache capacity, since nodes only cycle between active use and the freelist.

Signed-off-by: Mathias Bogaert <[email protected]>
@jackc jackc merged commit 777dec0 into jackc:master Jan 4, 2026
14 checks passed
@jackc (Owner) commented Jan 4, 2026:

👍

@analytically analytically deleted the perf/lru branch January 4, 2026 19:16