Skip to content

🤖 fix: handle zero-budget goals and live cost previews#3271

Merged
ThomasK33 merged 1 commit into
mainfrom
goal-tracking-fecb
May 12, 2026
Merged

🤖 fix: handle zero-budget goals and live cost previews#3271
ThomasK33 merged 1 commit into
mainfrom
goal-tracking-fecb

Conversation

@ThomasK33
Copy link
Copy Markdown
Member

Summary

Fixes goal budget handling so a $0 budget disables dollar limits instead of immediately exhausting the goal, and updates goal cost display to preview live stream cost while final accounting remains durable and single-counted.

Background

A default or explicit goal budget of $0 was treated as an already-exhausted budget. Goal cost also only updated at stream end, leaving the Goal tab and workspace pill stale during multi-step turns.

Implementation

  • Normalize zero-cent goal budgets to no dollar limit across goal creation/defaults and budgeted-model gates.
  • Add a live goal accounting preview from cumulative usage deltas that updates activity snapshots without writing final goal accounting.
  • Preserve final stream-end accounting as the source of truth to avoid double-counting.

Validation

  • make static-check
  • bun test src/common/utils/goals/budgetPricing.test.ts src/browser/utils/goals/resolveGoalSetIntent.test.ts src/node/services/workspaceGoalService.test.ts src/browser/utils/chatCommands.test.ts src/browser/utils/commands/sources.test.ts src/browser/features/RightSidebar/GoalTab.test.tsx src/browser/features/Settings/Sections/GoalsSection.test.tsx
  • make typecheck
  • make lint

Risks

Low-to-medium: goal accounting touches stream lifecycle and workspace activity snapshots. The live preview path avoids durable writes and final accounting remains unchanged except for zero-budget normalization.


Generated with mux • Model: openai:gpt-5.5 • Thinking: high • Cost: $44.07

@ThomasK33
Copy link
Copy Markdown
Member Author

/coder-agents-review

@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Please review the goal budget and live cost tracking changes.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 041a83edbc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/node/services/workspaceGoalService.ts
@ThomasK33 ThomasK33 force-pushed the goal-tracking-fecb branch from 041a83e to b70b1bc Compare May 12, 2026 11:31
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed the preview reset issue and added regression coverage. Please take another look.

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Delightful!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Copy link
Copy Markdown

@coder-agents-review coder-agents-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean design on the preview/record split. The snapshot-only preview path avoids double-counting by re-reading the committed base on each event instead of maintaining an in-memory accumulator. Zero-budget normalization is applied at every write boundary (creation, intent resolution, status transitions, restart recovery), and the defense-in-depth layering in applyBudgetDrivenStatus catches any path that forgets to normalize early. Test coverage is genuine: each new test would fail on the base SHA.

Severity count: 1 P2, 11 P3, 4 Nit.

The P2 is a read-path gap: getGoal returns raw budgetCents: 0 for pre-existing goals that haven't been recovered yet, which leaks into goalPrompts.ts as a false "budget exhausted" signal. The normalization covers every write path but not the read path. Single-line fix in toGoalSnapshot or readGoalFile.

The P3s cluster into three themes: (1) test gaps for the new zero-budget behavior at three entry points (restart migration, command palette, parser), (2) documentation for the new public API surface (hasGoalBudgetLimit, normalizeGoalBudgetCents, previewStreamAccounting, StreamAccountingInput.costUsd), and (3) structural concerns (type guard unsoundness, normalization asymmetry between createGoal and applyMutableFields, cost-accumulation duplication across three methods).

The preview-after-record race (DEREM-2) is theoretical: Killua traced the lock registration order and showed that under the current event dispatch model, preview always chains ahead of record. The three reviewers who flagged it (Hisoka, Nami, Pariston) are right that the ordering guarantee is implicit, not contractual. A per-stream flag or updatedAtMs guard would make the guarantee explicit.

"Shall I show you what happens under concurrent dispatch? ♥" (Hisoka)


src/node/services/workspaceGoalService.ts:1174

P2 [DEREM-1] getGoal returns the raw readGoalFile result without normalization. Pre-existing goals on disk with budgetCents: 0 (created under old semantics as budget_limited) flow un-normalized to toGoalSnapshot and through to:

  • goalPrompts.ts:69,81: computes budgetLimitReached = true for any budgetCents != null where cost >= budget. Injects "Budget limit: $0.00" and "budget exhausted" into the agent system prompt. The model receives a false stop signal on what should be an active, unbudgeted goal.
  • GoalTab.tsx:251, AgentListItem.tsx:443: renders "$0.00" budget instead of "No budget."

The window is narrow (between app update restart and per-workspace recovery via normalizeGoalLimits), but the prompt injection is the dangerous case: the agent refuses work.

"The schema accepts 0 via .nonnegative().nullable(). Normalizing in readGoalFile or adding .transform(v => v === 0 ? null : v) to the schema's budgetCents field closes every downstream consumer in one place." (Mafuuu)

Fix: normalize in toGoalSnapshot with budgetCents: normalizeGoalBudgetCents(goal?.budgetCents), or in readGoalFile after the Zod parse.

🤖

src/node/services/workspaceGoalService.ts:1003

P3 [DEREM-4] applyMutableFields writes raw input.budgetCents ?? null (where 0 ?? null evaluates to 0), deferring normalization to applyBudgetDrivenStatus. createGoal normalizes eagerly at line 336 with normalizeGoalBudgetCents(input.budgetCents).

This creates an implicit invariant: every write path that constructs a GoalRecordV1 with a raw budgetCents must route through applyBudgetDrivenStatus, and applyBudgetDrivenStatus must never skip its normalization preamble (lines 1035-1043). If applyBudgetDrivenStatus is refactored to drop the normalization (someone may assume createGoal is the canonical site), updates persist budgetCents: 0 to disk.

Fix: normalize at the point of assignment:

...(Object.hasOwn(input, "budgetCents")
  ? { budgetCents: normalizeGoalBudgetCents(input.budgetCents) }
  : {}),

The normalization inside applyBudgetDrivenStatus becomes a harmless idempotent layer. (Meruem P3, Knov P3, Pariston)

🤖

🤖 This review was automatically generated with Coder Agents.

Comment thread src/node/services/agentSession.ts
Comment thread src/common/utils/goals/budgetPricing.ts Outdated
Comment thread src/node/services/workspaceGoalService.ts
Comment thread src/browser/utils/commands/sources.ts
Comment thread src/browser/features/RightSidebar/GoalTab.tsx Outdated
Comment thread src/node/services/workspaceGoalService.ts Outdated
Comment thread src/browser/utils/goals/resolveGoalSetIntent.ts
Comment thread src/node/services/workspaceGoalService.ts
Comment thread src/browser/features/RightSidebar/GoalTab.tsx Outdated
Comment thread src/node/services/agentSession.ts Outdated
@ThomasK33 ThomasK33 force-pushed the goal-tracking-fecb branch from b70b1bc to ab5d66e Compare May 12, 2026 12:00
@ThomasK33
Copy link
Copy Markdown
Member Author

/coder-agents-review

@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed coder-agents-review feedback and pushed ab5d66e. Please review the updated goal budget/live preview changes.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ab5d66e0ec

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/node/services/agentSession.ts
@ThomasK33 ThomasK33 force-pushed the goal-tracking-fecb branch from ab5d66e to b89c014 Compare May 12, 2026 12:06
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed the compaction preview feedback. Please take another look.

@ThomasK33
Copy link
Copy Markdown
Member Author

/coder-agents-review

Copy link
Copy Markdown

@coder-agents-review coder-agents-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

15 of 16 posted findings from Round 1 are addressed in ab5d66e. Good work.

One finding remains unaddressed:

DEREM-4 (P3, workspaceGoalService.ts:1006) applyMutableFields still writes { budgetCents: input.budgetCents ?? null } without normalizeGoalBudgetCents. createGoal normalizes at point of assignment (line 337); the update path does not. The finding was body-folded in the R1 review (line was outside the diff hunk), so there was no threaded comment to reply to. The code at line 1006 is unchanged in ab5d66e.

The current code works correctly because applyBudgetDrivenStatus normalizes downstream. The concern is maintenance: the asymmetry creates an implicit invariant that every write path must route through applyBudgetDrivenStatus to avoid persisting budgetCents: 0.

To unblock the next review round: either apply normalizeGoalBudgetCents at line 1006, or reply acknowledging the asymmetry as intentional (the panel will evaluate).

🤖 This review was automatically generated with Coder Agents.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b89c0146b9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/node/services/agentSession.ts Outdated
@ThomasK33 ThomasK33 force-pushed the goal-tracking-fecb branch from b89c014 to 88233b9 Compare May 12, 2026 12:20
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed the mapped pricing preview feedback. Please take another look.

@ThomasK33
Copy link
Copy Markdown
Member Author

/coder-agents-review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Swish!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Copy link
Copy Markdown

@coder-agents-review coder-agents-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round 3: still blocked on the same finding from Round 2.

DEREM-4 (P3, workspaceGoalService.ts:1006) applyMutableFields writes { budgetCents: input.budgetCents ?? null } without normalizeGoalBudgetCents. This was body-folded in R1 (line outside the diff hunk) and called out explicitly in the R2 review body. The new commit b89c0146b only forwards isCompaction to the preview path; applyMutableFields is unchanged.

To unblock: apply normalizeGoalBudgetCents at line 1006, or reply here acknowledging the asymmetry as intentional.

🤖 This review was automatically generated with Coder Agents.

Copy link
Copy Markdown

@coder-agents-review coder-agents-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round 4: blocked for the third consecutive round on the same finding.

DEREM-4 (P3, workspaceGoalService.ts:1006) applyMutableFields writes { budgetCents: input.budgetCents ?? null } without normalizeGoalBudgetCents. This has been called out in R1 (review body), R2, and R3. The code is unchanged across all three follow-up commits.

This is a P3 maintenance concern, not a correctness issue (the downstream applyBudgetDrivenStatus normalizes). But the review loop cannot proceed without acknowledgment. A one-line reply ("intentional, applyBudgetDrivenStatus is the canonical normalization site") is sufficient to unblock.

🤖 This review was automatically generated with Coder Agents.

@ThomasK33
Copy link
Copy Markdown
Member Author

/coder-agents-review

DEREM-4 is now addressed by normalizing budgetCents in applyMutableFields before schema parsing.

@ThomasK33 ThomasK33 force-pushed the goal-tracking-fecb branch from 88233b9 to 031c25e Compare May 12, 2026 13:09
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed the remaining mutable budget normalization finding. Please take another look.

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. 🎉

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@ThomasK33
Copy link
Copy Markdown
Member Author

/coder-agents-review

Copy link
Copy Markdown

@coder-agents-review coder-agents-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All 16 R1 findings verified as resolved. The zero-budget normalization boundary is now complete: every write path normalizes, every read path heals legacy data, every budget gate uses the canonical helper. The preview/record split is clean: applyCostAccounting eliminates the duplication, restoreGoalAccountingSnapshot handles the error path, and the metadataModel resolution aligns preview pricing with final accounting.

Severity count: 3 P3 new. No P2+.

The main new finding is DEREM-22: the updatedAtMs staleness guard in previewStreamAccounting is broader than its primary purpose (blocking post-record double-counting). It also blocks previews after any mid-stream goal mutation (budget edits, child attribution), silencing the live cost display for the rest of the stream. The durable accounting is always correct. The fix would be a targeted guard (e.g., per-stream flag or createdAtMs-only check) that preserves preview liveness after non-accounting mutations.

The remaining two P3s are untested defensive branches in previewStreamAccounting (paused/complete, compaction, stale-stream guards). No behavior is wrong; these are regression locks.

"The updatedAtMs guard IS necessary for the post-recording case. But the preview reads fresh disk state on every call; the base is always current. A targeted flag would prevent the double-count without killing previews after unrelated mutations." (Hisoka)

🤖 This review was automatically generated with Coder Agents.

Comment thread src/node/services/workspaceGoalService.ts Outdated
Comment thread src/node/services/workspaceGoalService.ts
Comment thread src/node/services/workspaceGoalService.ts
Copy link
Copy Markdown

@coder-agents-review coder-agents-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round 6: blocked on the three R5 findings. No code pushed since R5 (head SHA unchanged).

Open findings:

  • DEREM-22 (P3) updatedAtMs guard in previewStreamAccounting is overbroad; silences live cost display after child attribution or budget edits mid-stream.
  • DEREM-23 (P3) No test for previewStreamAccounting paused/complete guard.
  • DEREM-24 (P3) No test for previewStreamAccounting compaction skip and stale-stream guard.

All three are P3. DEREM-22 is a design concern (live preview goes dark after unrelated goal mutations). DEREM-23 and DEREM-24 are test coverage gaps for defensive branches. To unblock: push fixes, or reply on each thread acknowledging the current behavior as intentional.

🤖 This review was automatically generated with Coder Agents.

@ThomasK33 ThomasK33 force-pushed the goal-tracking-fecb branch from 031c25e to 4dfa7a8 Compare May 12, 2026 14:12
@ThomasK33
Copy link
Copy Markdown
Member Author

/coder-agents-review

Addressed DEREM-22 through DEREM-24 and added regression coverage. Please take another look.

@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed the latest coder-agents-review findings around preview completion guards and coverage. Please take another look.

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Already looking forward to the next diff.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Copy link
Copy Markdown

@coder-agents-review coder-agents-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All 19 findings from R1 and R5 are resolved. CI green. The DEREM-22 fix is clean: replacing the overbroad updatedAtMs guard with recordedStreamStartedAtMsByWorkspace preserves live cost display through mid-stream goal mutations while still blocking post-record double-counting. The new test covers the mid-stream budget edit scenario end-to-end. DEREM-23 and DEREM-24 branch coverage is solid.

The zero-budget normalization chain is complete (19 posted findings, 5 dropped, 0 open). The preview/record split is correct and well-tested. Good work across 7 rounds.

"Shall I show you what happens under concurrent dispatch? ♥" (Hisoka, R1)

🤖 This review was automatically generated with Coder Agents.

@ThomasK33 ThomasK33 added this pull request to the merge queue May 12, 2026
Merged via the queue into main with commit b9630ab May 12, 2026
24 checks passed
@ThomasK33 ThomasK33 deleted the goal-tracking-fecb branch May 12, 2026 14:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant