feat(vt): expose carrier frames and fix truncated traces for virtual threads on JDK 21+ #450
Open
CI Test Results (run #23944623269): 32 total, 32 passed, 0 failed. Updated 2026-04-03 11:51:44 UTC.
With filter= (all-threads mode) the loop called next() twice per iteration, skipping every other thread. Remove the spurious second call.
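The skip pattern described in this commit can be illustrated with a minimal sketch. The actual loop in wallClock.cpp is not shown here, so ThreadList, visitBuggy and visitFixed below are hypothetical stand-ins; only the next() name comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the profiler's thread list (not the real class).
class ThreadList {
  public:
    explicit ThreadList(std::vector<int> tids) : _tids(std::move(tids)), _pos(0) {}
    bool hasNext() const { return _pos < _tids.size(); }
    int next() { return _tids[_pos++]; }

  private:
    std::vector<int> _tids;
    std::size_t _pos;
};

// Buggy shape: the iterator is advanced twice per iteration,
// so every other thread id is consumed without being visited.
std::vector<int> visitBuggy(ThreadList list) {
    std::vector<int> visited;
    while (list.hasNext()) {
        visited.push_back(list.next());
        if (list.hasNext()) {
            list.next();  // spurious second advance: this thread is skipped
        }
    }
    return visited;
}

// Fixed shape: exactly one next() call per iteration.
std::vector<int> visitFixed(ThreadList list) {
    std::vector<int> visited;
    while (list.hasNext()) {
        visited.push_back(list.next());
    }
    return visited;
}
```

With five thread ids, the buggy loop visits only the first, third and fifth, while the fixed loop visits all five.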
On JDK 21+ with cstack=vmx, walkVM now crosses the continuation boundary to include carrier-thread (ForkJoinWorkerThread) frames in wall-clock profiles of virtual threads.

Two detection paths:
- Thawed boundary: cont_entry_return_pc is the return PC of the bottommost thawed frame; traverse enterSpecial to reach the carrier stack.
- Frozen boundary: cont_returnBarrier is the return PC when frozen frames remain in the StackChunk; switch to the carrier stack directly.

Both addresses are resolved via gHotSpotVMStructs (JDK 27+) or C++ symbol lookup fallback (JDK 21-26). A speculative fallback fires when neither boundary is detected but isCarryingVirtualThread() confirms a mounted VT.

carrier_frames is now enabled automatically with cstack=vmx. The wextend=vt_carrier option is removed.
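As a rough sketch of the detection logic in this commit message, the return PC can be compared against the two resolved addresses. This is not the walkVM code: ContAddresses, ContBoundary, classifyReturnPc and shouldTrySpeculative are hypothetical names, and treating an address of 0 as "not resolved" is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Which continuation boundary, if any, a return PC points at.
enum class ContBoundary { None, Thawed, Frozen };

// Hypothetical holder for the two addresses named in the commit message.
// A value of 0 means the address could not be resolved (assumption).
struct ContAddresses {
    std::uintptr_t entry_return_pc;  // cont_entry_return_pc
    std::uintptr_t return_barrier;   // cont_returnBarrier
};

// Classify the return PC of the bottommost thawed virtual-thread frame.
ContBoundary classifyReturnPc(std::uintptr_t ret_pc, const ContAddresses& addrs) {
    if (ret_pc == 0) {
        return ContBoundary::None;
    }
    if (addrs.entry_return_pc != 0 && ret_pc == addrs.entry_return_pc) {
        return ContBoundary::Thawed;  // all frames thawed: go through enterSpecial
    }
    if (addrs.return_barrier != 0 && ret_pc == addrs.return_barrier) {
        return ContBoundary::Frozen;  // frozen frames remain in the StackChunk
    }
    return ContBoundary::None;
}

// The speculative fallback applies only when no boundary was detected
// but the carrier is known to have a virtual thread mounted.
bool shouldTrySpeculative(ContBoundary boundary, bool carrying_virtual_thread) {
    return boundary == ContBoundary::None && carrying_virtual_thread;
}
```

Keeping the classification separate from the unwinding step makes each of the three outcomes (thawed, frozen, speculative) easy to count and test on its own.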
…ation samples
GetStackTrace on a virtual thread stops at the continuation boundary and never
includes carrier frames. Without a synthetic root the trace appears truncated
to the UI backend ("Missing Frames"). Append a BCI_NATIVE_FRAME named
"JVM Continuation" when isCarryingVirtualThread() is true on the carrier.
Same fix applied to the ASGCT path (fp/dwarf cstack modes): AsyncGetCallTrace
also stops at the continuation boundary for mounted virtual threads.
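The synthetic-root idea can be sketched as follows. This is an illustration under stated assumptions, not the profiler's code: Frame, kBciNativeFrame (a placeholder value standing in for BCI_NATIVE_FRAME), appendContinuationRoot and the max_depth handling are all hypothetical, and the trace is assumed to be ordered from the top frame to the root.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical simplified frame; the real profiler uses its own frame type.
struct Frame {
    int bci;
    std::string name;
};

// Placeholder marker value for this sketch; the real BCI_NATIVE_FRAME
// constant is defined by the profiler.
const int kBciNativeFrame = -10;

// Appends a synthetic "JVM Continuation" root when the carrier has a virtual
// thread mounted. Returns true if the frame was appended.
bool appendContinuationRoot(std::vector<Frame>& trace, std::size_t max_depth,
                            bool carrying_virtual_thread) {
    if (!carrying_virtual_thread) {
        return false;  // platform thread or JDK < 21: the trace ends normally
    }
    if (trace.size() >= max_depth) {
        return false;  // assumption: never grow the trace past its depth limit
    }
    trace.push_back(Frame{kBciNativeFrame, "JVM Continuation"});
    return true;
}
```

Because the same helper shape works for any trace that stops at the continuation boundary, the JVMTI and ASGCT paths can share it.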
VirtualThreadWallClockTest exercises both continuation unwind paths with cstack=vmx:
- samplesCarrierFramesFromCpuBoundVT: all frames thawed (cont_entry_return_pc boundary)
- samplesCarrierFramesFromBlockingVT: frozen frames in StackChunk (cont_returnBarrier boundary)

Both tests assert that ForkJoinWorkerThread carrier frames appear in wall-clock samples. Skipped on non-vmx cstack modes and JDK < 21 via assumeTrue.
What does this PR do?
Three related fixes for virtual thread profiling on JDK 21+:
- cstack=vmx continuation unwind: walkVM now crosses the continuation boundary to include carrier-thread (ForkJoinWorkerThread) frames in wall-clock profiles. Enabled automatically with cstack=vmx; no opt-in flag needed.
- Allocation samples fix (JVMTI): GetStackTrace on a virtual thread returns only the VT's logical stack. Without a root frame the UI backend attributes the trace to "Missing Frames". A synthetic "JVM Continuation" frame is now appended at the boundary.
- Wall-clock fp/dwarf fix (ASGCT): AsyncGetCallTrace also stops at the continuation boundary for mounted VTs. The same synthetic frame is appended.

Also fixes a wallClock.cpp bug where thread_list->next() was called twice per iteration, silently skipping every other thread in filter= (all-threads) mode.

Background: how Project Loom continuations work
A virtual thread runs on top of a carrier thread (a ForkJoinWorkerThread). The JVM transfers execution between the two using a special stub called enterSpecial. When a VT is scheduled, enterSpecial thaws (copies from heap to stack) some or all of the VT's frames and jumps into them. The object that tracks this is a ContinuationEntry linked into the carrier's stack.

Frames can be in two states:
- Thawed: copied from the heap onto the stack.
- Frozen: held in a StackChunk object on the heap, not on the stack at all.

walkVM path: crossing the continuation boundary (cstack=vmx)

    flowchart TD
        A["myMethod() ← top of VT logical stack"] --> B["callerMethod()"]
        B --> C["..."]
        C --> D["VirtualThread.run() ← bottommost thawed VT frame"]
        D -- "ret PC = ContinuationEntry::_return_pc, all VT frames thawed (CPU-bound VT, never yielded)" --> E
        D -- "ret PC = cont_returnBarrier, frozen frames remain in StackChunk (VT parked and just remounted)" --> E
        E["enterSpecial (RuntimeBlob JDK 21-26 · nmethod JDK 27+)"] --> F["Continuation.run()"]
        F --> G["ForkJoinPool.runWorker()"]
        G --> H["ForkJoinWorkerThread.run() ← carrier frames"]

The return PC of the bottommost thawed VT frame identifies the boundary:

- ContinuationEntry::_return_pc: all VT frames are thawed (CPU-bound VT that never yielded).
- StubRoutines::_cont_returnBarrier: frozen frames remain in the StackChunk (the VT parked and was just remounted).

Both PCs are resolved from gHotSpotVMStructs on JDK 27+, with C++ mangled-symbol lookup as a fallback for JDK 21-26.

Deriving the enterSpecial frame pointer

    flowchart TD
        A{"type_size() > 0?"}
        A -- "yes: JDK 27+, or JDK 21-26 builds that expose ContinuationEntry in gHotSpotVMStructs" --> B["Use ContinuationEntry layout via vmStructs: entry_fp from struct field"]
        A -- "no: JDK 21-26 builds that do not expose it" --> C{"all frames thawed?"}
        C -- "yes (ret PC = cont_entry_return_pc)" --> D["entry_fp = fp (fp already holds the enterSpecial frame FP)"]
        C -- "no (ret PC = cont_returnBarrier)" --> E["entry_fp = *(void**)fp (saved caller FP at *fp leads to enterSpecial FP)"]

JVMTI / ASGCT path: synthetic continuation root frame
For profiling modes that cannot cross the continuation boundary, a synthetic root frame is appended to prevent truncated-trace artefacts in the UI.
    flowchart TD
        subgraph JVMTI["JVMTI · allocation samples"]
            J1["myMethod() ⋮ Continuation.enter() ← GetStackTrace stops here"]
        end
        subgraph ASGCT["ASGCT · fp / dwarf wall-clock"]
            A1["myMethod() ⋮ bottommost reachable frame ← AsyncGetCallTrace stops here"]
        end
        J1 & A1 --> D{"isCarryingVirtualThread() on carrier thread?"}
        D -- "yes · mounted VT on JDK 21+" --> S["JVM Continuation\nBCI_NATIVE_FRAME appended"]
        D -- "no · non-VT thread or JDK < 21" --> E["trace ends normally"]

isCarryingVirtualThread() reads JavaThread::_cont_entry directly via SafeAccess. This field is present in gHotSpotVMStructs on all JDK 21+ builds (unlike the ContinuationEntry struct layout, which is absent before JDK 27).

Motivation
Wall-clock profiles of JDK 21+ virtual-thread applications showed truncated stack traces — carrier thread frames were never visible, and allocation / fp-mode traces were attributed to "Missing Frames" in the UI.
Additional Notes
- Enabled automatically with cstack=vmx; no separate opt-in flag.
- DDPROF_DISABLE_CONT_UNWIND=1 env var (debug builds only) skips both walkVM unwind paths and emits the synthetic frame; used by negative tests.
- WALKVM_CONT_BARRIER_HIT, WALKVM_ENTER_SPECIAL_HIT, WALKVM_CONT_ENTRY_NULL, WALKVM_CONT_SPECULATIVE_HIT.
- VMContinuationEntry is in DECLARE_V21_TYPES_DO and explicitly excluded from verify_offsets(): ContinuationEntry is not in gHotSpotVMTypes before JDK 27, so asserting type_size() > 0 would abort on any JDK 21-26 build.

How to test the change?
Integration tests in VirtualThreadWallClockTest (JDK 21+ only, cstack=vmx):

- samplesCarrierFramesFromCpuBoundVT: asserts that at least one sample shows ForkJoinWorkerThread carrier frames from a CPU-bound virtual thread (thawed-boundary path).
- samplesCarrierFramesFromBlockingVT: asserts that carrier frames are visible after repeated park/unpark cycles (frozen-boundary path; skipped on JDK 25+ pending verification).

Run with:

    ./gradlew ddprof-test:testRelease

For Datadog employees:
@DataDog/security-design-and-guidance.