refactor(run-control): unify target resolution and SSE handling#2731
refactor(run-control): unify target resolution and SSE handling#2731dgageot wants to merge 2 commits intodocker:mainfrom
Conversation
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟡 NEEDS ATTENTION
Two medium-severity bugs found in the new code. No high-severity issues.
| return r, nil | ||
| } | ||
| } | ||
| return Record{}, fmt.Errorf("no live run with pid %d", pid) |
There was a problem hiding this comment.
[MEDIUM] ErrNoRun sentinel not returned for "target not found" — docstring/implementation mismatch
The doc comment on ErrNoRun states:
"ErrNoRun is returned when no live run can be found that satisfies the caller's request (empty registry, or no record matches the target)."
But the implementation only returns ErrNoRun in the two empty-registry paths (lines 160 and 170). When the registry is non-empty and the specific target is not found, all three miss-paths (pid, addr, session-id) return plain fmt.Errorf(...) values that are not wrapped with ErrNoRun.
// pid miss — plain error, not ErrNoRun
return Record{}, fmt.Errorf("no live run with pid %d", pid)
// addr miss — plain error, not ErrNoRun
return Record{}, fmt.Errorf("no live run at %s", target)
// session miss — plain error, not ErrNoRun
return Record{}, fmt.Errorf("no live run matches %q ...", target)Any caller that does errors.Is(err, runregistry.ErrNoRun) to detect "no matching run" will silently miss these three cases. Given the docstring promises the sentinel for all "not found" outcomes, these paths should either wrap with fmt.Errorf("…: %w", ErrNoRun) or the docstring should be corrected to document the current (narrower) behaviour.
| cmd.Stdin = bytes.NewReader(payload) | ||
| if err := cmd.Run(); err != nil { | ||
| slog.Warn("on-event hook failed", "command", command, "error", err) | ||
| out, err := cmd.CombinedOutput() |
There was a problem hiding this comment.
[MEDIUM] CombinedOutput() is incompatible with "tail-f-style listener" hooks — goroutine leak
cmd.CombinedOutput() blocks until the subprocess exits and its stdout/stderr reaches EOF. A "tail-f-style listener" (the example in the comment directly above this line) never exits, so this call would block indefinitely.
Because the caller spawns a fresh goroutine per event (go runEventHook(...)), every event that fires while a long-running hook is active leaks another goroutine. CombinedOutput() also buffers all subprocess output in memory, so an infinite-output hook also causes unbounded heap growth.
The comment explicitly documents the intent that such hooks should be allowed to run, but the implementation contradicts that intent. Consider using cmd.Start() + non-blocking reads (or simply discarding output for "fire-and-forget" hooks) so the goroutine can finish promptly without waiting for process exit.
90e088b to
ad7b51c
Compare
|
Addressed review feedback:
|
Continues the work merged in #2714 by simplifying and hardening the run
control plane (
run --listen,send,watch,proto,serve mcp --attach).What this changes
One target resolver everywhere. New
runregistry.Find(target)is thesingle entry point used by
send,watch,proto, andserve mcp --attach. It resolves an empty target to the latest live run, a numerictarget by PID, an
http(s)://host:porttarget by address, and anythingelse as a (partial) session id. Exact session-id matches always win over
substring matches, and ambiguous substring matches return a clear error
instead of silently picking one. Replaces the per-command
resolveTargetand the previous PID-only behaviour of
--to/--attach.Shared SSE helper.
cmd/root/sse.gocollapses the duplicated SSEclient code from
watch.goandproto.gointoopenEventStream+readEventStream. Payloads are surfaced asjson.RawMessageso consumerscan either forward the bytes verbatim or decode once.
protonow forwards the raw event bytes instead of doing a redundantunmarshal/marshal round-trip.
io.LimitReader(resp.Body, 4 KiB)so amisbehaving server can't push an unbounded payload into client memory.
readEventStreamreturnsctx.Err()on cancellation;watchmapscontext.Canceledback to a clean exit so Ctrl+C doesn't print aspurious error.
No double responses in
proto.dispatchProtoreports whether italready produced a typed reply (e.g.
transcript); the genericackisskipped in that case.
Atomic registry writes.
runregistry.Writenow writes through asibling temp file +
Renameso concurrent readers never see torn JSON.Latestis simplified withslices.MaxFunc.Cross-platform event hooks.
--on-eventshell commands now go throughshellpath.DetectShell()(PowerShell on Windows,$SHELLon Unix)instead of hard-coded
sh -c. Captured stdout/stderr is logged onfailure, and the use of
context.Background()is documented (hooksintentionally outlive app shutdown so the last event still flushes).
Tests
runregistrycoversFindfor pid / addr / session id, ambiguity, theexact-match-vs-substring regression, and the empty-registry case.
cmd/root/sse_test.gocovers happy path, error-body cap, and ctxcancellation.
dispatchProtosignature.task lintandtask testare green.