Move extraction to gen job and add artifact audit trail #116

Merged
mattleibow merged 4 commits into main from dev/workflow-extract-in-gen-job
May 21, 2026

@mattleibow commented May 21, 2026

Summary

Moves placeholder extraction from the agent runtime into the mechanical regenerate-stubs CI job, adds an artifact audit trail, and strengthens the agent prompt to prevent session termination failures.

Companion PR: mono/SkiaSharp#4030 (merge guard, anchored regex, SKILL.md fixes)

Context

Reviewed PR #115 and found the automated docs pipeline had systemic issues:

  • False-positive regex extracting already-documented members
  • No merge guard, which allowed the agent to invent fields
  • No audit trail for comparing the JSON before and after the agent ran
  • Agent session termination when waiting for background sub-agents

Workflow Changes

Extraction moved to regenerate-stubs job (mechanical)

  • Extraction now runs in the Windows gen job alongside mdoc, not in the agent container
  • Produces docs-extracted artifact (7-day retention) as an immutable baseline
  • Agent downloads pre-extracted JSON — no extraction at runtime

Artifact audit trail

  • Pre-agent step: copies original JSON to /tmp/gh-aw/agent/docs-work-original/
  • Post-step: copies final JSON to /tmp/gh-aw/agent/docs-work-final/
  • Both are uploaded as part of the agent artifact for diffing
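
A minimal sketch of how the two uploaded folders could be diffed once the agent artifact is downloaded. It assumes each JSON file holds a single top-level object; the function names `diff_json_dirs` and `_load` are illustrative and are not part of the workflow or of docs-tool.ps1.

```python
import json
from pathlib import Path


def _load(path):
    # A file missing on one side is treated as an empty object.
    return json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}


def diff_json_dirs(original_dir, final_dir):
    """Return {relative_file: {key: (before, after)}} for each changed top-level key.

    Assumption: every JSON file contains one top-level object.
    """
    original_dir, final_dir = Path(original_dir), Path(final_dir)
    names = {p.relative_to(original_dir) for p in original_dir.rglob("*.json")}
    names |= {p.relative_to(final_dir) for p in final_dir.rglob("*.json")}
    changes = {}
    for name in sorted(names):
        before = _load(original_dir / name)
        after = _load(final_dir / name)
        changed = {
            key: (before.get(key), after.get(key))
            for key in sorted(set(before) | set(after))
            if before.get(key) != after.get(key)
        }
        if changed:
            changes[name.as_posix()] = changed
    return changes
```

Pointing it at local copies of docs-work-original/ and docs-work-final/ would list only the keys the agent touched, which is the comparison the audit trail is meant to enable.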

Agent prompt hardening (session termination fix)

  • Problem: In runs 2 and 5, the orchestrator launched background agents, said "waiting" in text, ended its turn with NO active tool call, and the Copilot CLI terminated the session. All work was lost.
  • Fix: Explicit rules require read_agent(wait=true) in the same response as the agent launch, and document a sequential read_agent call for each agent when several are launched. The FORBIDDEN pattern (launch → text → end turn) is called out explicitly. Budget fallback: skip Phase 5 if past 10 minutes without a merge.
  • COMPLETION GATE: Session cannot end without create_pull_request or noop call.
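
The rules above can be expressed as a small check over a recorded session. This is only a sketch under stated assumptions: the session format (a list of turns, each a list of `(tool_name, args)` tuples) and the launch tool name `launch_agent` are hypothetical, since the PR does not name the launch tool. Only `read_agent`, `create_pull_request`, and `noop` come from the description.

```python
SAFE_OUTPUTS = {"create_pull_request", "noop"}
LAUNCH_TOOL = "launch_agent"  # hypothetical name; the PR does not name the launch tool


def check_session(turns):
    """Return a list of rule violations for a recorded session.

    `turns` is a list of turns; each turn is a list of (tool_name, args) tuples.
    """
    problems = []
    for index, calls in enumerate(turns, start=1):
        launched = sum(1 for name, _ in calls if name == LAUNCH_TOOL)
        waited = sum(
            1
            for name, args in calls
            if name == "read_agent" and args.get("wait") is True
        )
        # Forbidden pattern: agents launched, but the turn ends without
        # a blocking read_agent call for each of them.
        if launched > waited:
            problems.append(
                f"turn {index}: launched {launched} agent(s) but waited on {waited}"
            )
    # Completion gate: some turn must contain a safe output call.
    all_calls = [name for calls in turns for name, _ in calls]
    if not any(name in SAFE_OUTPUTS for name in all_calls):
        problems.append("session ended without create_pull_request or noop")
    return problems
```

A session shaped like run 5 (three reviewers launched, no blocking read) would report both violations, while a launch paired with `read_agent(wait=true)` and followed by `create_pull_request` would report none.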

Validation Runs

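| Run | Duration | Result | PR | Notes |
| --- | --- | --- | --- | --- |
| Run 1 | 25m | Succeeded | #117 | First validation |
| Run 2 | 10m | Failed | | Session terminated (writer wait) |
| Run 3 | 25m | Succeeded | #118 | Second validation |
| Run 4 | 23m | Succeeded | #119 | Single-agent fix in place |
| Run 5 | 13m | Failed | | Session terminated (reviewer wait) |
| Run 6 | 16m | Succeeded | #120 | Multi-agent fix applied |
| Run 7 | 23m | Succeeded | #121 | Multi-agent fix ✅ |
| Run 8 | 20m | Succeeded | #122 | Multi-agent fix ✅ |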

Results: 3/3 post-fix runs succeeded (100%). Pre-fix had 2/5 failures (40% failure rate).

- Extract placeholders mechanically in the gen job (Windows) instead of
  pre-agent-steps, making it an immutable artifact baseline
- Upload extracted JSON as 'docs-extracted' artifact (7-day retention)
- Pre-agent-steps now downloads pre-extracted JSON instead of running extract
- Copy original JSON to /tmp/gh-aw/agent/docs-work-original/ for auto-upload
- Post-step copies final JSON to /tmp/gh-aw/agent/docs-work-final/
- This gives full audit trail: diff original vs final = exactly what the agent changed

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@learn-build-service-prod

PoliCheck Scan Report

The following report lists PoliCheck issues in PR files. Before you merge the PR, you must fix all severity-1 and severity-2 issues. The AI Review Details column lists suggestions for either removing or replacing the terms. If you find a false positive result, mention it in a PR comment and include this text: #policheck-false-positive. This feedback helps reduce false positives in future scans.

✅ No issues found

More information about PoliCheck

Information: PoliCheck | Severity Guidance | Term
For any questions: Try searching the learn.microsoft.com contributor guides or post your question in the Learn support channel.

@learn-build-service-prod

Learn Build status updates of commit 2bbaeea:

✅ Validation status: passed

| File | Status | Preview URL | Details |
| --- | --- | --- | --- |
| .github/workflows/auto-api-docs-writer.md | ✅ Succeeded | | |

For more details, please refer to the build report.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@learn-build-service-prod

Learn Build status updates of commit abb99d0:

✅ Validation status: passed

| File | Status | Preview URL | Details |
| --- | --- | --- | --- |
| .github/workflows/auto-api-docs-writer.lock.yml | ✅ Succeeded | | |
| .github/workflows/auto-api-docs-writer.md | ✅ Succeeded | | |

For more details, please refer to the build report.

…ination

The orchestrator in run 2 launched the writer agent, said 'waiting', and
ended its turn with no tool call — causing the runtime to terminate the
session. The writer did complete (2.1M tokens consumed) but the orchestrator
never read its result.

Fix: explicit rules requiring read_agent(wait=true) in the same turn as
agent launch, plus a completion gate that prevents exiting without a
safe output call.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@learn-build-service-prod

Learn Build status updates of commit 45608f2:

✅ Validation status: passed

| File | Status | Preview URL | Details |
| --- | --- | --- | --- |
| .github/workflows/auto-api-docs-writer.lock.yml | ✅ Succeeded | | |
| .github/workflows/auto-api-docs-writer.md | ✅ Succeeded | | |

For more details, please refer to the build report.

Run 5 reproduced the run 2 failure: orchestrator launched 3 reviewer agents,
said 'waiting', ended its turn with no active tool call, and the session
terminated (13m 44s). The writer phase DID use read_agent correctly (our
previous fix worked) but the parallel reviewers triggered the same pattern.

Fixes:
- Explicit multi-agent pattern: call read_agent sequentially for each agent
- FORBIDDEN pattern documented explicitly (launch → text → end turn)
- Budget fallback: skip Phase 5 if past 10 minutes without merge
- Stronger wording: session WILL terminate (not 'may')

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@learn-build-service-prod

Learn Build status updates of commit 716704d:

✅ Validation status: passed

| File | Status | Preview URL | Details |
| --- | --- | --- | --- |
| .github/workflows/auto-api-docs-writer.lock.yml | ✅ Succeeded | | |
| .github/workflows/auto-api-docs-writer.md | ✅ Succeeded | | |

For more details, please refer to the build report.

mattleibow added a commit to mono/SkiaSharp that referenced this pull request May 21, 2026
[api-docs] Fix false-positive extraction and add merge guard (#4030)

Context: mono/SkiaSharp-API-docs#115
Companion: mono/SkiaSharp-API-docs#116

Review of the automated docs output (PR #115) revealed three systemic
pipeline issues: the extract regex matched legitimate prose containing
"to be added" (e.g. SKPath.AddPath's "elements to be added to the
current path"), the merge step had no guard against agent-invented
fields, and the writer produced wrong domain facts (gamma 2.8 vs 2.2
for BT.470, contradictory bit-packing for Bgra10101010XR).
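
The false positive can be illustrated with a short sketch. The anchored pattern is the one this commit lists under docs-tool.ps1; the unanchored pattern is only an assumed shape of the old behaviour, and the sketch uses Python's `re` rather than the PowerShell matching in the real script, whose flags and case handling are not shown here.

```python
import re

# Assumed shape of the old, unanchored match (the original regex is not shown).
UNANCHORED = re.compile(r"to be added", re.IGNORECASE)

# Anchored pattern from the commit message: the whole text must be the placeholder.
ANCHORED = re.compile(r"^\s*To be added\.?\s*$")


def is_placeholder(text):
    """True only when the entire text is the 'To be added' placeholder."""
    return ANCHORED.match(text) is not None


prose = "The elements to be added to the current path."
print(UNANCHORED.search(prose) is not None)  # the loose pattern flags real prose
print(is_placeholder(prose))                 # the anchored pattern does not
print(is_placeholder("To be added."))        # a genuine placeholder still matches
```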

docs-tool.ps1:
  * Anchor extraction regex to `^\s*To be added\.?\s*$` — only full-text
    placeholder matches trigger extraction
  * Record `_extractedKeys` metadata during extract so the merge phase
    knows which fields were originally placeholders
  * Add merge guard that rejects any agent-added fields not present in
    the original extract (prevents invented documentation)
  * Exclude manifest.json from merge processing
  * Fix PowerShell falsy-empty-array check (`@()` is falsy in boolean
    context; use .PSObject.Properties existence test instead)
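
A minimal sketch of the merge guard idea, written in Python rather than PowerShell. It assumes `_extractedKeys` is a flat list of field names stored alongside the fields; the real layout in docs-tool.ps1 may differ, and `guarded_merge` is an illustrative name. The presence test on `_extractedKeys` mirrors the existence check described above, since an empty list is falsy in Python as well.

```python
def guarded_merge(original, agent_output):
    """Merge agent edits, rejecting any field that was not an extracted placeholder.

    Assumption: original["_extractedKeys"] is a flat list of field names.
    """
    # Test for the key's existence, not its truthiness: an empty list is
    # valid metadata and means no field may be changed.
    if "_extractedKeys" not in original:
        raise KeyError("extract has no _extractedKeys metadata")
    allowed = set(original["_extractedKeys"])
    rejected = sorted(
        key for key in agent_output
        if key != "_extractedKeys" and key not in allowed
    )
    if rejected:
        raise ValueError(f"agent changed fields outside the extract: {rejected}")
    merged = dict(original)
    merged.update({k: v for k, v in agent_output.items() if k in allowed})
    return merged
```

With an extract of `{"summary": "To be added.", "remarks": "Existing text.", "_extractedKeys": ["summary"]}`, an agent result that sets only `summary` merges cleanly, while one that adds a new field or rewrites `remarks` raises `ValueError`.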

SKILL.md:
  * Writer prompt: JSON integrity rules (never add/remove/rename fields),
    trust hierarchy for native type facts (header > reference > knowledge)
  * Factual verifier prompt: standard value verification against
    skia-patterns.md reference file
  * Phase 2: simplified for automated workflow awareness

Validated across 8 workflow runs — 0 tooling regressions post-fix.

Co-authored-by: Matthew Leibowitz <mattleibow@live.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@mattleibow merged commit f68ee7d into main May 21, 2026
25 of 31 checks passed
@mattleibow deleted the dev/workflow-extract-in-gen-job branch May 21, 2026 19:49