🤖 fix: handle zero-budget goals and live cost previews#3271
Conversation
|
/coder-agents-review |
|
@codex review Please review the goal budget and live cost tracking changes. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 041a83edbc
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
041a83e to
b70b1bc
Compare
|
@codex review Addressed the preview reset issue and added regression coverage. Please take another look. |
|
Codex Review: Didn't find any major issues. Delightful! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
There was a problem hiding this comment.
Clean design on the preview/record split. The snapshot-only preview path avoids double-counting by re-reading the committed base on each event instead of maintaining an in-memory accumulator. Zero-budget normalization is applied at every write boundary (creation, intent resolution, status transitions, restart recovery), and the defense-in-depth layering in applyBudgetDrivenStatus catches any path that forgets to normalize early. Test coverage is genuine: each new test would fail on the base SHA.
Severity count: 1 P2, 11 P3, 4 Nit.
The P2 is a read-path gap: getGoal returns raw budgetCents: 0 for pre-existing goals that haven't been recovered yet, which leaks into goalPrompts.ts as a false "budget exhausted" signal. The normalization covers every write path but not the read path. Single-line fix in toGoalSnapshot or readGoalFile.
The P3s cluster into three themes: (1) test gaps for the new zero-budget behavior at three entry points (restart migration, command palette, parser), (2) documentation for the new public API surface (hasGoalBudgetLimit, normalizeGoalBudgetCents, previewStreamAccounting, StreamAccountingInput.costUsd), and (3) structural concerns (type guard unsoundness, normalization asymmetry between createGoal and applyMutableFields, cost-accumulation duplication across three methods).
The preview-after-record race (DEREM-2) is theoretical: Killua traced the lock registration order and showed that under the current event dispatch model, preview always chains ahead of record. The three reviewers who flagged it (Hisoka, Nami, Pariston) are right that the ordering guarantee is implicit, not contractual. A per-stream flag or updatedAtMs guard would make the guarantee explicit.
"Shall I show you what happens under concurrent dispatch? ♥" (Hisoka)
src/node/services/workspaceGoalService.ts:1174
P2 [DEREM-1] getGoal returns the raw readGoalFile result without normalization. Pre-existing goals on disk with budgetCents: 0 (created under old semantics as budget_limited) flow un-normalized to toGoalSnapshot and through to:
goalPrompts.ts:69,81: computesbudgetLimitReached = truefor anybudgetCents != nullwhere cost >= budget. Injects "Budget limit: $0.00" and "budget exhausted" into the agent system prompt. The model receives a false stop signal on what should be an active, unbudgeted goal.GoalTab.tsx:251,AgentListItem.tsx:443: renders "$0.00" budget instead of "No budget."
The window is narrow (between app update restart and per-workspace recovery via normalizeGoalLimits), but the prompt injection is the dangerous case: the agent refuses work.
"The schema accepts 0 via
.nonnegative().nullable(). Normalizing inreadGoalFileor adding.transform(v => v === 0 ? null : v)to the schema'sbudgetCentsfield closes every downstream consumer in one place." (Mafuuu)
Fix: normalize in toGoalSnapshot with budgetCents: normalizeGoalBudgetCents(goal?.budgetCents), or in readGoalFile after the Zod parse.
🤖
src/node/services/workspaceGoalService.ts:1003
P3 [DEREM-4] applyMutableFields writes raw input.budgetCents ?? null (where 0 ?? null evaluates to 0), deferring normalization to applyBudgetDrivenStatus. createGoal normalizes eagerly at line 336 with normalizeGoalBudgetCents(input.budgetCents).
This creates an implicit invariant: every write path that constructs a GoalRecordV1 with a raw budgetCents must route through applyBudgetDrivenStatus, and applyBudgetDrivenStatus must never skip its normalization preamble (lines 1035-1043). If applyBudgetDrivenStatus is refactored to drop the normalization (someone may assume createGoal is the canonical site), updates persist budgetCents: 0 to disk.
Fix: normalize at the point of assignment:
...(Object.hasOwn(input, "budgetCents")
? { budgetCents: normalizeGoalBudgetCents(input.budgetCents) }
: {}),The normalization inside applyBudgetDrivenStatus becomes a harmless idempotent layer. (Meruem P3, Knov P3, Pariston)
🤖
🤖 This review was automatically generated with Coder Agents.
b70b1bc to
ab5d66e
Compare
|
/coder-agents-review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ab5d66e0ec
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
ab5d66e to
b89c014
Compare
|
@codex review Addressed the compaction preview feedback. Please take another look. |
|
/coder-agents-review |
There was a problem hiding this comment.
15 of 16 posted findings from Round 1 are addressed in ab5d66e. Good work.
One finding remains unaddressed:
DEREM-4 (P3, workspaceGoalService.ts:1006) applyMutableFields still writes { budgetCents: input.budgetCents ?? null } without normalizeGoalBudgetCents. createGoal normalizes at point of assignment (line 337); the update path does not. The finding was body-folded in the R1 review (line was outside the diff hunk), so there was no threaded comment to reply to. The code at line 1006 is unchanged in ab5d66e.
The current code works correctly because applyBudgetDrivenStatus normalizes downstream. The concern is maintenance: the asymmetry creates an implicit invariant that every write path must route through applyBudgetDrivenStatus to avoid persisting budgetCents: 0.
To unblock the next review round: either apply normalizeGoalBudgetCents at line 1006, or reply acknowledging the asymmetry as intentional (the panel will evaluate).
🤖 This review was automatically generated with Coder Agents.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b89c0146b9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
b89c014 to
88233b9
Compare
|
@codex review Addressed the mapped pricing preview feedback. Please take another look. |
|
/coder-agents-review |
|
Codex Review: Didn't find any major issues. Swish! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
There was a problem hiding this comment.
Round 3: still blocked on the same finding from Round 2.
DEREM-4 (P3, workspaceGoalService.ts:1006) applyMutableFields writes { budgetCents: input.budgetCents ?? null } without normalizeGoalBudgetCents. This was body-folded in R1 (line outside the diff hunk) and called out explicitly in the R2 review body. The new commit b89c0146b only forwards isCompaction to the preview path; applyMutableFields is unchanged.
To unblock: apply normalizeGoalBudgetCents at line 1006, or reply here acknowledging the asymmetry as intentional.
🤖 This review was automatically generated with Coder Agents.
There was a problem hiding this comment.
Round 4: blocked for the third consecutive round on the same finding.
DEREM-4 (P3, workspaceGoalService.ts:1006) applyMutableFields writes { budgetCents: input.budgetCents ?? null } without normalizeGoalBudgetCents. This has been called out in R1 (review body), R2, and R3. The code is unchanged across all three follow-up commits.
This is a P3 maintenance concern, not a correctness issue (the downstream applyBudgetDrivenStatus normalizes). But the review loop cannot proceed without acknowledgment. A one-line reply ("intentional, applyBudgetDrivenStatus is the canonical normalization site") is sufficient to unblock.
🤖 This review was automatically generated with Coder Agents.
|
/coder-agents-review DEREM-4 is now addressed by normalizing |
88233b9 to
031c25e
Compare
|
@codex review Addressed the remaining mutable budget normalization finding. Please take another look. |
|
Codex Review: Didn't find any major issues. 🎉 ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
/coder-agents-review |
There was a problem hiding this comment.
All 16 R1 findings verified as resolved. The zero-budget normalization boundary is now complete: every write path normalizes, every read path heals legacy data, every budget gate uses the canonical helper. The preview/record split is clean: applyCostAccounting eliminates the duplication, restoreGoalAccountingSnapshot handles the error path, and the metadataModel resolution aligns preview pricing with final accounting.
Severity count: 3 P3 new. No P2+.
The main new finding is DEREM-22: the updatedAtMs staleness guard in previewStreamAccounting is broader than its primary purpose (blocking post-record double-counting). It also blocks previews after any mid-stream goal mutation (budget edits, child attribution), silencing the live cost display for the rest of the stream. The durable accounting is always correct. The fix would be a targeted guard (e.g., per-stream flag or createdAtMs-only check) that preserves preview liveness after non-accounting mutations.
The remaining two P3s are untested defensive branches in previewStreamAccounting (paused/complete, compaction, stale-stream guards). No behavior is wrong; these are regression locks.
"The
updatedAtMsguard IS necessary for the post-recording case. But the preview reads fresh disk state on every call; the base is always current. A targeted flag would prevent the double-count without killing previews after unrelated mutations." (Hisoka)
🤖 This review was automatically generated with Coder Agents.
There was a problem hiding this comment.
Round 6: blocked on the three R5 findings. No code pushed since R5 (head SHA unchanged).
Open findings:
- DEREM-22 (P3)
updatedAtMsguard inpreviewStreamAccountingis overbroad; silences live cost display after child attribution or budget edits mid-stream. - DEREM-23 (P3) No test for
previewStreamAccountingpaused/complete guard. - DEREM-24 (P3) No test for
previewStreamAccountingcompaction skip and stale-stream guard.
All three are P3. DEREM-22 is a design concern (live preview goes dark after unrelated goal mutations). DEREM-23 and DEREM-24 are test coverage gaps for defensive branches. To unblock: push fixes, or reply on each thread acknowledging the current behavior as intentional.
🤖 This review was automatically generated with Coder Agents.
031c25e to
4dfa7a8
Compare
|
/coder-agents-review Addressed DEREM-22 through DEREM-24 and added regression coverage. Please take another look. |
|
@codex review Addressed the latest coder-agents-review findings around preview completion guards and coverage. Please take another look. |
|
Codex Review: Didn't find any major issues. Already looking forward to the next diff. ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
There was a problem hiding this comment.
All 19 findings from R1 and R5 are resolved. CI green. The DEREM-22 fix is clean: replacing the overbroad updatedAtMs guard with recordedStreamStartedAtMsByWorkspace preserves live cost display through mid-stream goal mutations while still blocking post-record double-counting. The new test covers the mid-stream budget edit scenario end-to-end. DEREM-23 and DEREM-24 branch coverage is solid.
The zero-budget normalization chain is complete (19 posted findings, 5 dropped, 0 open). The preview/record split is correct and well-tested. Good work across 7 rounds.
"Shall I show you what happens under concurrent dispatch? ♥" (Hisoka, R1)
🤖 This review was automatically generated with Coder Agents.
Summary
Fixes goal budget handling so a $0 budget disables dollar limits instead of immediately exhausting the goal, and updates goal cost display to preview live stream cost while final accounting remains durable and single-counted.
Background
A default or explicit goal budget of $0 was treated as an already-exhausted budget. Goal cost also only updated at stream end, leaving the Goal tab and workspace pill stale during multi-step turns.
Implementation
Validation
make static-checkbun test src/common/utils/goals/budgetPricing.test.ts src/browser/utils/goals/resolveGoalSetIntent.test.ts src/node/services/workspaceGoalService.test.ts src/browser/utils/chatCommands.test.ts src/browser/utils/commands/sources.test.ts src/browser/features/RightSidebar/GoalTab.test.tsx src/browser/features/Settings/Sections/GoalsSection.test.tsxmake typecheckmake lintRisks
Low-to-medium: goal accounting touches stream lifecycle and workspace activity snapshots. The live preview path avoids durable writes and final accounting remains unchanged except for zero-budget normalization.
Generated with
mux• Model:openai:gpt-5.5• Thinking:high• Cost:$44.07