Reduce contention in attribute metadata caches by Symmetricity · Pull Request #912 · systemed/tilemaker

Symmetricity · 2026-06-04T10:20:15Z

This PR is AI generated.

Summary

This reduces lock contention and repeated metadata work in two hot paths used
while writing OpenMapTiles attributes:

increase the small thread-local AttributeStore lookup caches from 64 to 256
entries
cache vector-layer metadata updates per worker thread so repeated writes of
the same layer/key/type do not repeatedly take vectorLayerMetadataMutex

On a warmed Austria OpenMapTiles run, measured with the same changes stacked on
#902, this reduced wall time by about 4.9%, system CPU by about 43%,
voluntary context switches by about 44%, and peak RSS by about 0.9%.

The patch also applies cleanly to current upstream master independently of
#902, which is how this PR is submitted.

Background

Large OpenMapTiles runs spend a lot of time repeatedly assigning attributes and
recording vector-layer metadata.

AttributeStore already has small thread-local direct-mapped caches to avoid
falling through to the shared sharded stores. This PR keeps that existing
approach, but makes the cache slightly larger so common attribute pairs and
sets survive more collision patterns.

OsmLuaProcessing::setVectorLayerMetadata() records the type of each
layer/key pair for the final vector_layers metadata table. The shared write
still needs a mutex, but most calls repeat the same layer/key/type combination
many times from the same worker thread. This PR skips the mutex path for those
same-thread exact repeats.

Implementation

The AttributeStore change is deliberately small:

cachedAttributePairPointers and cachedAttributePairIndexes grow from 64
to 256 entries per thread
cachedAttributeSetPointers and cachedAttributeSetIndexes grow from 64 to
256 entries per thread
the lookup behavior, shard selection, and locking behavior are otherwise
unchanged

The vector-layer metadata change adds a thread-local cache keyed by layer and
attribute name:

the cache is reset if the worker starts using a different LayerDefinition
instance
exact same-thread repeats of the same layer/key/type return before taking
vectorLayerMetadataMutex
first sightings and type changes still update the shared
layers.layers[layer].attributeMap under the existing mutex

This keeps the shared metadata map as the source of truth and does not remove
the synchronization that protects actual shared writes.

Performance

The performance numbers below were measured with these cache changes stacked on
#902, so they show the cost of these changes after the allocation-churn work in
#902. No separate upstream-master timing run was made for this PR.

Runtime fixture:

fixture: Austria Geofabrik extract
profile: OpenMapTiles profile
output: PMTiles
store: no --store
threads: 8
warmup: PBF plus coastline/landcover sidecar files
method: six alternating warmed runs with /usr/bin/time -v

Combined result:

metric	#902 mean	#902 + this PR mean	delta	delta %
wall time	32.795 s	31.177 s	-1.618 s	-4.93%
user CPU	189.337 s	198.053 s	+8.717 s	+4.60%
system CPU	41.465 s	23.575 s	-17.890 s	-43.14%
peak RSS	4,523,051 KiB	4,480,713 KiB	-42,337 KiB	-0.94%
minor faults	1,223,322	1,222,723	-599	-0.05%
filesystem outputs	1,111,761	1,111,955	+193	+0.02%
voluntary context switches	333,560	185,866	-147,695	-44.28%
involuntary context switches	2,438	2,132	-306	-12.54%

The user-CPU total increased, but the system-time and context-switch reductions
more than offset that in wall-clock time on this fixture.

Possible Regressions

The intended output behavior is unchanged.

Potential costs:

the AttributeStore thread-local caches retain a few more pointer/index
entries per worker thread
the vector-layer metadata cache keeps a small per-thread map of layer/key
types already seen by that worker
if a profile intentionally writes the same layer/key with different types,
type changes still go through the existing mutex and update the shared map

This PR does not make the final metadata type-resolution semantics stricter or
more deterministic. It only avoids repeated same-thread writes that would have
stored the same type again.

Testing

Build:

cmake -S . -B build -DCMAKE_BUILD_TYPE=RelWithDebInfo
cmake --build build --parallel 8

Code style:

git diff --check

Semantic checks were run during candidate acceptance on the submitted/open PR
stack used for output comparisons:

fixture: Liechtenstein Geofabrik extract
profile: OpenMapTiles profile

AttributeStore TLS cache size:
  baseline repeat changed_tiles 0
  candidate repeat changed_tiles 0
  baseline -> candidate changed_tiles 0

Vector layer metadata cache:
  baseline repeat changed_tiles 0
  candidate repeat changed_tiles 0
  baseline -> candidate changed_tiles 0

Attribute pair and set lookups are very frequent during profile processing. The existing 64-entry thread-local caches leave many repeated lookups to fall back to the shared stores on larger fixtures. Use a still-small 256-entry cache so more hot lookups are served locally without changing generated tile semantics. On the Austria fixture this was neutral to slightly faster and reduced native RSS in warmed alternating runs. Co-authored-by: Codex <noreply@openai.com>

Layer metadata is shared across Lua processing threads and still needs the mutex added by the thread-safety fix in PR systemed#761. On larger fixtures, repeated Attribute() calls make that coarse lock a contention point. Remember metadata types already recorded by the current thread for the active LayerDefinition, and skip the lock only for exact same-type repeats. This preserves the protected shared map update for first sightings and type changes while avoiding redundant lock traffic. Measured with these cache changes stacked on PR systemed#902, the combined Austria fixture result reduced wall time by about 4.9%, system CPU by about 43%, and voluntary context switches by about 44% without changing decoded Liechtenstein output in the candidate semantic checks. Co-authored-by: Codex <noreply@openai.com>

systemed · 2026-06-04T13:37:56Z

Thank you!

As it happens the repeated writes issue is being discussed over in #877. I have extracted the core of this PR (with some reference to the discussion in #877) and created a single, focused PR at #913.

We can do the 64->256 cache size change in a separate commit.

Symmetricity · 2026-06-04T13:46:55Z

No worries! Sorry for not checking for related issues. I normally do, but just skipped it in this case for some reason.

Want me to close this PR and submit another one for the cache size change?

systemed · 2026-06-04T13:59:35Z

Yes please!

Symmetricity · 2026-06-04T14:10:59Z

Closed as requested. Replacement PR #914 contains only the AttributeStore 64->256 lookup-cache size change.

Symmetricity and others added 2 commits June 4, 2026 12:18

Symmetricity force-pushed the perf/attribute-metadata-contention branch from c910c03 to 05f5e43 Compare June 4, 2026 10:23

systemed mentioned this pull request Jun 4, 2026

Optimise locking when storing types per key #913

Open

systemed mentioned this pull request Jun 4, 2026

Performance regression during PBF scanning/parsing between 2024-02-10 and 2026-02-07 (high system CPU, ~30% slower) #877

Open

Symmetricity mentioned this pull request Jun 4, 2026

Increase AttributeStore lookup cache size #914

Merged

Symmetricity closed this Jun 4, 2026

Symmetricity deleted the perf/attribute-metadata-contention branch June 4, 2026 15:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reduce contention in attribute metadata caches#912

Reduce contention in attribute metadata caches#912
Symmetricity wants to merge 2 commits into
systemed:masterfrom
Symmetricity:perf/attribute-metadata-contention

Symmetricity commented Jun 4, 2026

Uh oh!

systemed commented Jun 4, 2026

Uh oh!

Symmetricity commented Jun 4, 2026 •

edited

Loading

Uh oh!

systemed commented Jun 4, 2026

Uh oh!

Symmetricity commented Jun 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Symmetricity commented Jun 4, 2026

Summary

Background

Implementation

Performance

Possible Regressions

Testing

Uh oh!

systemed commented Jun 4, 2026

Uh oh!

Symmetricity commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

systemed commented Jun 4, 2026

Uh oh!

Symmetricity commented Jun 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Symmetricity commented Jun 4, 2026 •

edited

Loading