perf: use unsynchronized StringBuilderWriter in std.deepJoin by He-Pin · Pull Request #889 · databricks/sjsonnet

He-Pin · 2026-06-03T05:36:22Z

Motivation

std.deepJoin writes each Val.Str chunk into a java.io.StringWriter inside a tight loop. StringWriter's backing StringBuffer pays a monitor enter/exit on every write/append call, which on a typical deepJoin walk over a deeply nested array can be hundreds of thousands of synchronized writes — wasted overhead in single-threaded jsonnet evaluation.

TomlRenderer and FastMaterializeJsonRenderer already use the unsynchronized package-private StringBuilderWriter for the same reason (#874, #875). std.deepJoin was explicitly left as a follow-up in #875's description ("std.deepJoin keeps StringWriter (separate concern)") — this PR is that follow-up.

Modification

Single change in ManifestModule.scala: swap the new StringWriter() in DeepJoin.evalRhs for new StringBuilderWriter(). No other code changes; output is byte-identical.

Result

Scala Native, hyperfine A/B against master (fc292fa6). Workload: a 50,000-row array of 10 pre-allocated strings → 2 MB of deepJoin output, render-dominated. Four interleaved-order passes, --warmup 10 --min-runs 100 --shell=none:

pass	order	before mean	after mean	before min	after min	min ratio
1	before → after	35.1 ± 16.5 ms	32.2 ± 19.1 ms	23.1 ms	18.7 ms	1.24×
2	after → before	43.7 ± 30.6 ms	29.9 ± 25.3 ms	25.7 ms	20.3 ms	1.27×
3	before → after	30.3 ± 8.5 ms	29.5 ± 7.1 ms	24.6 ms	20.8 ms	1.18×
4	after → before	32.6 ± 7.6 ms	28.0 ± 6.8 ms	24.0 ms	20.7 ms	1.16×

After is faster in every one of the 4 passes; mean is noisy on the host but min values are tight at 1.16–1.27× faster (best observed 18.7 vs 23.1 ms, ~19% reduction). Output byte-identical (2,000,000 bytes both sides).

Test plan

./mill __.reformat
./mill 'sjsonnet.jvm[3.3.7]'.test — 519/519 pass
Scala Native A/B hyperfine — 4 interleaved-order passes, all positive; output byte-identical

Independent of #875; can land in either order. After both land, the import java.io.StringWriter in ManifestModule.scala can be removed in a small cleanup.

Motivation: `std.deepJoin` writes each Val.Str chunk into a `java.io.StringWriter` inside a tight loop. StringWriter's backing `StringBuffer` pays a monitor enter/exit on every `write`/`append` call, which on a typical deepJoin walk over a deeply nested array can be hundreds of thousands of synchronized writes. `TomlRenderer` and `FastMaterializeJsonRenderer` already use the unsynchronized package-private `StringBuilderWriter` for the same reason (see databricks#874, databricks#875). deepJoin was explicitly left as a follow-up in databricks#875's description ("std.deepJoin keeps StringWriter (separate concern)") — this is that follow-up. Modification: Swap the single `new StringWriter()` in `DeepJoin.evalRhs` for `new StringBuilderWriter()`. No other changes; output is byte-identical. Result: Scala Native hyperfine, A/B against master (`fc292fa6`). Workload: a 50000-row array of 10-string rows → 2 MB of deepJoin output, render-dominated. Four interleaved-order passes (`--warmup 10 --min-runs 100 --shell=none`): | pass | order | before mean | after mean | before min | after min | min ratio | |---|---|---:|---:|---:|---:|---:| | 1 | before → after | 35.1 ± 16.5 ms | 32.2 ± 19.1 ms | 23.1 ms | 18.7 ms | 1.24x | | 2 | after → before | 43.7 ± 30.6 ms | 29.9 ± 25.3 ms | 25.7 ms | 20.3 ms | 1.27x | | 3 | before → after | 30.3 ± 8.5 ms | 29.5 ± 7.1 ms | 24.6 ms | 20.8 ms | 1.18x | | 4 | after → before | 32.6 ± 7.6 ms | 28.0 ± 6.8 ms | 24.0 ms | 20.7 ms | 1.16x | After is faster in every one of the 4 passes; min values are tight at 1.16-1.27x faster. Output byte-identical (2,000,000 bytes both sides).

stephenamar-db merged commit e4d44ce into databricks:master Jun 3, 2026
5 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: use unsynchronized StringBuilderWriter in std.deepJoin#889

perf: use unsynchronized StringBuilderWriter in std.deepJoin#889
stephenamar-db merged 1 commit into
databricks:masterfrom
He-Pin:perf/deepjoin-stringbuilder-writer

He-Pin commented Jun 3, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

He-Pin commented Jun 3, 2026

Motivation

Modification

Result

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants