
feat(vt): expose carrier frames and fix truncated traces for virtual threads on JDK 21+ #450

Open
jbachorik wants to merge 5 commits into main from jb/continuations

@jbachorik (Collaborator) commented Mar 31, 2026

What does this PR do?

Three related fixes for virtual thread profiling on JDK 21+:

  1. cstack=vmx continuation unwind: walkVM now crosses the continuation boundary to include carrier-thread (ForkJoinWorkerThread) frames in wall-clock profiles. Enabled automatically with cstack=vmx; no opt-in flag needed.

  2. Allocation samples fix (JVMTI): GetStackTrace on a virtual thread returns only the VT's logical stack. Without a root frame the UI backend attributes the trace to "Missing Frames". A synthetic "JVM Continuation" frame is now appended at the boundary.

  3. Wall-clock fp/dwarf fix (ASGCT): AsyncGetCallTrace also stops at the continuation boundary for mounted VTs. The same synthetic frame is appended.

Also fixes a wallClock.cpp bug where thread_list->next() was called twice per iteration, silently skipping every other thread in filter= (all-threads) mode.
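To illustrate the double-advance bug, here is a minimal sketch. `ThreadList`, `collectBuggy` and `collectFixed` are hypothetical stand-ins, not the profiler's actual classes; only the `next()` name comes from the description above.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the profiler's thread list iterator.
class ThreadList {
public:
    explicit ThreadList(std::vector<int> tids) : _tids(std::move(tids)) {}
    bool hasNext() const { return _pos < _tids.size(); }
    // Returns the next thread id, or -1 when the list is exhausted.
    int next() { return hasNext() ? _tids[_pos++] : -1; }
private:
    std::vector<int> _tids;
    std::size_t _pos = 0;
};

// Buggy shape: the iterator is advanced twice per loop iteration,
// so every other thread is silently skipped.
std::vector<int> collectBuggy(ThreadList list) {
    std::vector<int> seen;
    while (list.hasNext()) {
        seen.push_back(list.next());
        list.next();  // spurious second advance
    }
    return seen;
}

// Fixed shape: exactly one advance per iteration.
std::vector<int> collectFixed(ThreadList list) {
    std::vector<int> seen;
    while (list.hasNext()) {
        seen.push_back(list.next());
    }
    return seen;
}
```

With thread ids 1, 2, 3, 4 the buggy loop visits only 1 and 3, while the fixed loop visits all four.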


Background: how Project Loom continuations work

A virtual thread runs on top of a carrier thread (a ForkJoinWorkerThread). The JVM transfers execution between the two using a special stub called enterSpecial. When a VT is scheduled, enterSpecial thaws (copies from heap to stack) some or all of the VT's frames and jumps into them. The object that tracks this is a ContinuationEntry linked into the carrier's stack.

Frames can be in two states:

  • Thawed — live on the carrier stack, visible to normal stack walking.
  • Frozen — stored in a StackChunk object on the heap, not on the stack at all.

walkVM path: crossing the continuation boundary (cstack=vmx)

flowchart TD
    A["myMethod()
← top of VT logical stack"] --> B["callerMethod()"]
    B --> C["..."]
    C --> D["VirtualThread.run()
← bottommost thawed VT frame"]
    D -- "ret PC = ContinuationEntry::_return_pc
all VT frames thawed
(CPU-bound VT, never yielded)" --> E
    D -- "ret PC = cont_returnBarrier
frozen frames remain in StackChunk
(VT parked and just remounted)" --> E
    E["enterSpecial
(RuntimeBlob JDK 21–26 · nmethod JDK 27+)"] --> F["Continuation.run()"]
    F --> G["ForkJoinPool.runWorker()"]
    G --> H["ForkJoinWorkerThread.run()
← carrier frames"]

The return PC of the bottommost thawed VT frame identifies the boundary:

| Return PC | Meaning |
| --- | --- |
| ContinuationEntry::_return_pc | All VT frames are thawed: a CPU-bound VT that never yielded |
| StubRoutines::_cont_returnBarrier | Frozen frames remain in a StackChunk: the VT parked and was just remounted |

Both PCs are resolved from gHotSpotVMStructs on JDK 27+, with C++ mangled-symbol lookup as a fallback for JDK 21-26.
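The classification can be sketched as a plain address comparison. `ContBoundary`, `ContAddresses` and `classifyReturnPc` are hypothetical names for illustration; the sketch assumes both addresses were already resolved (or left null when resolution failed) and does not show the lookup itself.

```cpp
// Which kind of continuation boundary a return PC indicates.
enum class ContBoundary { None, Thawed, Frozen };

// Hypothetical holder for the two resolved addresses; either may be
// nullptr if resolution failed on this JDK build.
struct ContAddresses {
    const void* entry_return_pc;  // ContinuationEntry::_return_pc
    const void* return_barrier;   // StubRoutines::_cont_returnBarrier
};

// Classifies the return PC of the bottommost thawed VT frame.
ContBoundary classifyReturnPc(const void* ret_pc, const ContAddresses& a) {
    // A null PC must never match an unresolved (null) address.
    if (ret_pc == nullptr) return ContBoundary::None;
    if (ret_pc == a.entry_return_pc) return ContBoundary::Thawed;
    if (ret_pc == a.return_barrier) return ContBoundary::Frozen;
    return ContBoundary::None;
}
```

Returning `None` for any other PC leaves room for the speculative fallback mentioned in the commit messages, which this sketch does not model.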

Deriving the enterSpecial frame pointer

flowchart TD
    A{"type_size() > 0?"}
    A -- "yes — JDK 27+, or JDK 21–26 builds that
expose ContinuationEntry in gHotSpotVMStructs" --> B["Use ContinuationEntry layout via vmStructs
entry_fp from struct field"]
    A -- "no — JDK 21–26 builds that do not expose it" --> C{"all frames thawed?"}
    C -- "yes (ret PC = cont_entry_return_pc)" --> D["entry_fp = fp
fp already holds the enterSpecial frame FP"]
    C -- "no (ret PC = cont_returnBarrier)" --> E["entry_fp = *(void**)fp
saved caller FP at *fp leads to enterSpecial FP"]
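The decision tree above can be sketched roughly as follows. `EntryLayout`, its `fp_offset` field and `deriveEntryFp` are assumptions made for illustration; in particular, how the frame pointer is read from the ContinuationEntry layout is not spelled out in this description, and the real code would presumably use guarded reads (SafeAccess) rather than plain dereferences.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical view of the ContinuationEntry layout from vmStructs.
struct EntryLayout {
    std::size_t type_size;  // 0 when the layout is not exposed
    std::size_t fp_offset;  // assumed offset of the field holding the FP
};

// Returns the enterSpecial frame pointer, or nullptr if it cannot be derived.
void* deriveEntryFp(const EntryLayout& layout, const void* cont_entry,
                    void* fp, bool all_frames_thawed) {
    if (layout.type_size > 0) {
        // Layout is known: read the FP from the struct field.
        if (cont_entry == nullptr) return nullptr;
        void* entry_fp = nullptr;
        std::memcpy(&entry_fp,
                    static_cast<const char*>(cont_entry) + layout.fp_offset,
                    sizeof(entry_fp));
        return entry_fp;
    }
    if (fp == nullptr) return nullptr;
    // Thawed boundary: fp already holds the enterSpecial frame FP.
    if (all_frames_thawed) return fp;
    // Frozen boundary: the saved caller FP stored at *fp leads to it.
    return *static_cast<void**>(fp);
}
```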

JVMTI / ASGCT path: synthetic continuation root frame

For profiling modes that cannot cross the continuation boundary, a synthetic root frame is appended to prevent truncated-trace artefacts in the UI.

flowchart TD
    subgraph JVMTI["JVMTI · allocation samples"]
        J1["myMethod()
⋮
Continuation.enter()
← GetStackTrace stops here"]
    end
    subgraph ASGCT["ASGCT · fp / dwarf wall-clock"]
        A1["myMethod()
⋮
bottommost reachable frame
← AsyncGetCallTrace stops here"]
    end
    J1 & A1 --> D{"isCarryingVirtualThread()
on carrier thread?"}
    D -- "yes · mounted VT on JDK 21+" --> S["JVM Continuation\nBCI_NATIVE_FRAME appended"]
    D -- "no · non-VT thread or JDK < 21" --> E["trace ends normally"]

isCarryingVirtualThread() reads JavaThread::_cont_entry directly via SafeAccess. This field is present in gHotSpotVMStructs on all JDK 21+ builds (unlike ContinuationEntry struct layout, which is absent before JDK 27).


Motivation

Wall-clock profiles of JDK 21+ virtual-thread applications showed truncated stack traces — carrier thread frames were never visible, and allocation / fp-mode traces were attributed to "Missing Frames" in the UI.

Additional Notes

  • Carrier-frame unwind is enabled automatically with cstack=vmx; no separate opt-in flag.
  • DDPROF_DISABLE_CONT_UNWIND=1 env var (debug builds only) skips both walkVM unwind paths and emits the synthetic frame — used by negative tests.
  • New counters: WALKVM_CONT_BARRIER_HIT, WALKVM_ENTER_SPECIAL_HIT, WALKVM_CONT_ENTRY_NULL, WALKVM_CONT_SPECULATIVE_HIT.
  • VMContinuationEntry is in DECLARE_V21_TYPES_DO and explicitly excluded from verify_offsets(): ContinuationEntry is not in gHotSpotVMTypes before JDK 27, so asserting type_size() > 0 would abort on any JDK 21-26 build.
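As a rough illustration of the DDPROF_DISABLE_CONT_UNWIND switch, the check could look like the sketch below. The function names are hypothetical, and the debug-build gating mentioned above is not modeled here.

```cpp
#include <cstdlib>
#include <cstring>

// Pure helper: treats only the exact value "1" as "disabled".
bool isContUnwindDisabledValue(const char* value) {
    return value != nullptr && std::strcmp(value, "1") == 0;
}

// Reads the environment variable named in the notes above.
bool isContUnwindDisabled() {
    return isContUnwindDisabledValue(std::getenv("DDPROF_DISABLE_CONT_UNWIND"));
}
```

Splitting the value check from the `getenv` call keeps the parsing logic testable without mutating the process environment.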

How to test the change?

Integration tests in VirtualThreadWallClockTest (JDK 21+ only, cstack=vmx):

  • samplesCarrierFramesFromCpuBoundVT — asserts at least one sample shows ForkJoinWorkerThread carrier frames from a CPU-bound virtual thread (thawed-boundary path).
  • samplesCarrierFramesFromBlockingVT — asserts carrier frames visible after repeated park/unpark cycles (frozen-boundary path; skipped on JDK 25+ pending verification).

Run with: ./gradlew ddprof-test:testRelease

For Datadog employees:

  • If this PR touches code that signs or publishes builds or packages, or handles credentials of any kind, I have requested a review from @DataDog/security-design-and-guidance.
  • This PR does not touch any of that.
  • JIRA: SCP-1110

@jbachorik jbachorik added the AI label Mar 31, 2026
@jbachorik jbachorik force-pushed the jb/continuations branch 2 times, most recently from 19ae1ee to 8ed7897 Compare March 31, 2026 11:31
dd-octo-sts bot commented Mar 31, 2026

CI Test Results

Run: #23944623269 | Commit: 2094319 | Duration: 21m 29s (longest job)

All 32 test jobs passed

Status Overview

JDK glibc-aarch64/debug glibc-amd64/debug musl-aarch64/debug musl-amd64/debug
8 - - -
8-ibm - - -
8-j9 - -
8-librca - -
8-orcl - - -
11 - - -
11-j9 - -
11-librca - -
17 - -
17-graal - -
17-j9 - -
17-librca - -
21 - -
21-graal - -
21-librca - -
25 - -
25-graal - -
25-librca - -

Legend: ✅ passed | ❌ failed | ⚪ skipped | 🚫 cancelled

Summary: Total: 32 | Passed: 32 | Failed: 0


Updated: 2026-04-03 11:51:44 UTC

dd-octo-sts bot commented Mar 31, 2026

Scan-Build Report

User: runner@runnervmrg6be
Working Directory: /home/runner/work/java-profiler/java-profiler/ddprof-lib/src/test/make
Command Line: make -j4 all
Clang Version: Ubuntu clang version 18.1.3 (1ubuntu1)
Date: Fri Apr 3 11:29:01 2026

Bug Summary

| Bug Type | Quantity |
| --- | --- |
| All Bugs | 1 |
| Logic error | |
| Stack address stored into global variable | 1 |
Reports

| Bug Group | Bug Type | File | Function/Method | Line, Path Length |
| --- | --- | --- | --- | --- |
| Logic error | Stack address stored into global variable | stackWalker.cpp | walkVM | 99437 |

@jbachorik jbachorik changed the title fix(walkvm): unwind through virtual thread continuation boundaries feat(walkvm): unwind through virtual thread continuation boundaries Apr 1, 2026
With filter= (all-threads mode) the loop called next() twice per iteration,
skipping every other thread. Remove the spurious second call.

On JDK 21+ with cstack=vmx, walkVM now crosses the continuation boundary to
include carrier-thread (ForkJoinWorkerThread) frames in wall-clock profiles of
virtual threads.

Two detection paths:
- Thawed boundary: cont_entry_return_pc is the return PC of the bottommost
  thawed frame; traverse enterSpecial to reach the carrier stack.
- Frozen boundary: cont_returnBarrier is the return PC when frozen frames
  remain in the StackChunk; switch to the carrier stack directly.

Both addresses are resolved via gHotSpotVMStructs (JDK 27+) or C++ symbol
lookup fallback (JDK 21-26). A speculative fallback fires when neither
boundary is detected but isCarryingVirtualThread() confirms a mounted VT.

carrier_frames is now enabled automatically with cstack=vmx. The wextend=vt_carrier
option is removed.

…ation samples

GetStackTrace on a virtual thread stops at the continuation boundary and never
includes carrier frames. Without a synthetic root the trace appears truncated
to the UI backend ("Missing Frames"). Append a BCI_NATIVE_FRAME named
"JVM Continuation" when isCarryingVirtualThread() is true on the carrier.

Same fix applied to the ASGCT path (fp/dwarf cstack modes): AsyncGetCallTrace
also stops at the continuation boundary for mounted virtual threads.

VirtualThreadWallClockTest exercises both continuation unwind paths with cstack=vmx:
- samplesCarrierFramesFromCpuBoundVT: all frames thawed (cont_entry_return_pc boundary)
- samplesCarrierFramesFromBlockingVT: frozen frames in StackChunk (cont_returnBarrier boundary)

Both tests assert ForkJoinWorkerThread carrier frames appear in wall-clock samples.
Skipped on non-vmx cstack modes and JDK < 21 via assumeTrue.

@jbachorik jbachorik marked this pull request as ready for review April 3, 2026 11:49
@jbachorik jbachorik requested a review from a team as a code owner April 3, 2026 11:49
@jbachorik jbachorik changed the title feat(walkvm): unwind through virtual thread continuation boundaries feat(vt): expose carrier frames and fix truncated traces for virtual threads on JDK 21+ Apr 3, 2026