Reduce allocation rate when marshaling OTLP data#11296
dougqh left a comment
Overall, looks good to me.
There was one thing that Claude pointed out...
- Behavior change in sampling-priority / process-tag emission (medium). Worth confirming intent.
  - Old code: if (i == 0 || i == len - 1) metaWriter.includeSamplingTags(); before visitSpan(spans[i]). Because visitSpan writes the previous span via completeSpan, the flag actually applies to span i-1 once consumed. Net effect across two traces [a,b]+[c,d]: a, b, c get _sampling_priority_v1; d does not. Process tags go on a.
  - New code: sampling tag on the last-written span of each trace boundary (a and c in the same example), plus process+sampling on whichever span completeScope finalizes (the one that ends up first in payload). So b and d lose their sampling tags relative to old behavior.
The functional invariant "≥1 span per trace carries sampling priority" still holds, and the new placement is arguably cleaner (the old emission on second-to-last looked accidental). But there's no test asserting which spans carry these tags, so a downstream agent expecting redundancy across spans wouldn't catch a regression here. Recommend confirming with the agent-side OTLP→Datadog ingest team that the new placement is sufficient.
I haven't fully thought this through myself, but I thought it good to double check.
If you think it is fine, then I'm okay with it.
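The old-code behavior described in the review above can be reproduced with a small toy model. This is only a sketch of the reviewer's description, not the actual writer code: the class `OldSamplingTagModel`, the string-based spans, and the assumption that the flag is cleared when a span is written are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (hypothetical) of the old flag handling as described in the review. */
final class OldSamplingTagModel {
  private boolean includeSamplingTags; // pending flag, consumed by the next written span
  private String pending;              // span that was visited but not yet written
  final List<String> tagged = new ArrayList<>();

  void includeSamplingTags() {
    includeSamplingTags = true;
  }

  // Visiting a span first writes the previously visited one.
  void visitSpan(String span) {
    completeSpan();
    pending = span;
  }

  // Assumption: writing a span consumes the flag if it is set.
  void completeSpan() {
    if (pending == null) {
      return;
    }
    if (includeSamplingTags) {
      tagged.add(pending);
      includeSamplingTags = false;
    }
    pending = null;
  }

  /** Replays the old loop over several traces and returns the spans that got the tag. */
  static List<String> run(String[][] traces) {
    OldSamplingTagModel model = new OldSamplingTagModel();
    for (String[] spans : traces) {
      int len = spans.length;
      for (int i = 0; i < len; i++) {
        if (i == 0 || i == len - 1) {
          model.includeSamplingTags();
        }
        model.visitSpan(spans[i]);
      }
    }
    model.completeSpan(); // flush the final span
    return model.tagged;
  }
}
```

Under these assumptions, running the model on the traces [a,b] and [c,d] tags a, b, and c but not d, which matches the net effect stated in the review.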
Yes, the old code was a relic of porting the Datadog protocol code over. The new code cleans that up by removing the "last span" notion. (Note that some of the expected OTLP behaviour is still being defined, but this passes the current system tests.)
What Does This Do
Marshals proto messages into a single prepending buffer instead of many small byte arrays, and moves the decision of whether to export a span into OtlpTraceCollector (avoids re-allocations).
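To illustrate the idea of a prepending buffer, here is a minimal sketch. It is not the class used in this PR: the name `PrependingBuffer` and all of its methods are hypothetical. The buffer is filled from the end toward the front, so a nested message body is written first and its length prefix and field tag are prepended afterwards, without serializing the child into a temporary byte array just to learn its size.

```java
import java.util.Arrays;

/** Hypothetical sketch of a buffer that grows toward the front. */
final class PrependingBuffer {
  private byte[] buf;
  private int pos; // index of the first used byte; data lives in buf[pos..buf.length)

  PrependingBuffer(int capacity) {
    buf = new byte[capacity];
    pos = capacity;
  }

  int size() {
    return buf.length - pos;
  }

  // Make room for n more bytes in front, keeping existing data at the tail.
  private void ensure(int n) {
    if (pos >= n) {
      return;
    }
    int used = size();
    int newCap = Math.max(buf.length * 2, used + n);
    byte[] grown = new byte[newCap];
    System.arraycopy(buf, pos, grown, newCap - used, used);
    buf = grown;
    pos = newCap - used;
  }

  void prependByte(int b) {
    ensure(1);
    buf[--pos] = (byte) b;
  }

  void prependBytes(byte[] src) {
    ensure(src.length);
    pos -= src.length;
    System.arraycopy(src, 0, buf, pos, src.length);
  }

  // Reserve the varint's width in front, then write its bytes in normal order.
  void prependVarint(int value) {
    int len = varintSize(value);
    ensure(len);
    pos -= len;
    int i = pos;
    while ((value & ~0x7F) != 0) {
      buf[i++] = (byte) ((value & 0x7F) | 0x80);
      value >>>= 7;
    }
    buf[i] = (byte) value;
  }

  static int varintSize(int value) {
    int n = 1;
    while ((value & ~0x7F) != 0) {
      n++;
      value >>>= 7;
    }
    return n;
  }

  byte[] toByteArray() {
    return Arrays.copyOfRange(buf, pos, buf.length);
  }

  /** Encodes field 1 (nested message) containing field 2 (string "hi"). */
  static byte[] encodeExample() {
    PrependingBuffer out = new PrependingBuffer(16);
    int before = out.size();
    byte[] text = {'h', 'i'};
    out.prependBytes(text);            // innermost payload first
    out.prependVarint(text.length);    // then its length
    out.prependByte((2 << 3) | 2);     // then the tag: field 2, length-delimited
    int nestedLen = out.size() - before;
    out.prependVarint(nestedLen);      // nested length is known only now
    out.prependByte((1 << 3) | 2);     // tag: field 1, length-delimited
    return out.toByteArray();
  }
}
```

In this sketch the only allocations are the backing array and any doubling when it fills up, so the cost no longer scales with the number of nested messages.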
Motivation
Avoid O(n) allocations when serializing spans, metrics, or logs.
Additional Notes
Tests have been updated and cleaned up with the assistance of Claude.
Contributor Checklist
- Assign the type: and (comp: or inst:) labels in addition to any other useful labels
- Avoid using close, fix, or any linking keywords when referencing an issue. Use solves instead, and assign the PR milestone to the issue

Jira ticket: [PROJ-IDENT]

Note: Once your PR is ready to merge, add it to the merge queue by commenting /merge. /merge -c cancels the queue request. /merge -f --reason "reason" skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.