Skip to content

fix(cli): verify forward listener before success#1880

Open
shiju-nv wants to merge 3 commits into
NVIDIA:mainfrom
shiju-nv:fix/forward-listener-health
Open

fix(cli): verify forward listener before success#1880
shiju-nv wants to merge 3 commits into
NVIDIA:mainfrom
shiju-nv:fix/forward-listener-health

Conversation

@shiju-nv

@shiju-nv shiju-nv commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Summary

This PR makes openshell forward report success only after the local SSH forward listener is actually reachable. It also makes background forwards fail closed when the forked SSH process cannot be tracked, so OpenShell does not leave users with an unusable or unmanageable forward.

Related Issue

Fixes #1878

Changes

  • Wait for a connectable local listener before printing foreground forward success.
  • Fail background forwarding when OpenShell cannot discover the forked SSH process PID.
  • Probe the local listener before writing the background forward PID file.
  • Terminate the tracked background SSH process if listener health never becomes reachable.
  • Use connectable loopback probe hosts for wildcard binds such as 0.0.0.0 and ::.
  • Add focused unit coverage for reachable and missing forward listeners.
  • Document that OpenShell prints forwarded URLs only after listener health is proven, and that background forwards must be tracked for forward list and forward stop.

Testing

  • mise run pre-commit passes
  • Unit tests added/updated
  • E2E tests added/updated (if applicable)

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Architecture docs updated (if applicable)

Wait for a connectable local forward listener before reporting
foreground forwarding success.

Fail background forwarding when the forked SSH process cannot be
tracked, probe the listener before writing the PID file, and
terminate the tracked SSH process if the listener never opens.

Document that forwarded URLs are printed only after listener
health is proven.

Signed-off-by: Shiju <shiju@nvidia.com>
@shiju-nv shiju-nv requested review from a team, derekwaynecarr and mrunalp as code owners June 11, 2026 22:48
@johntmyers johntmyers self-assigned this Jun 11, 2026
@johntmyers

Copy link
Copy Markdown
Collaborator

gator-agent

PR Review Status

Validation: This PR is project-valid because it directly fixes reproducible openshell forward behavior from #1878: success should only be reported after a reachable local listener exists, and background forwards should fail closed when they cannot be tracked.
Head SHA: c7ac0b929c9aa61fada4e522ba0abffc84e5a090

Review findings:

  • crates/openshell-cli/src/ssh.rs:513: terminate_forward_pid shells out to kill through PATH. Please use nix::sys::signal::kill directly and guard against unsafe PID values before signaling.
  • crates/openshell-cli/tests/sandbox_create_lifecycle_integration.rs:1417: the happy-path test was updated, but the PR still needs the direct regression test from Sandbox forward reports success without a reachable local listener #1878 where fake ssh exits 0, no background PID is discoverable, and no listener exists. Please assert sandbox_forward(..., background = true, ...) returns an error.

Docs: Fern docs were updated under docs/sandboxes/manage-sandboxes.mdx; no navigation update appears necessary for this inline behavior note.

Next state: gator:in-review

@johntmyers johntmyers added the gator:in-review Gator is reviewing or awaiting PR review feedback label Jun 11, 2026
Signed-off-by: Shiju <shiju@nvidia.com>
@johntmyers

Copy link
Copy Markdown
Collaborator

gator-agent

Re-check After Author Update

I re-evaluated latest head 301909f13c11d6dce811c7c0b8f005bf54a08774 after shiju-nv's June 12, 2026 follow-up commit.

Disposition: partially resolved.

Resolved items:

  • terminate_forward_pid no longer shells out to kill; it now uses nix::sys::signal::kill with PID range guards.
  • The background regression from Sandbox forward reports success without a reachable local listener #1878 is now covered: fake ssh exits 0, fake pgrep finds no PID, no listener exists, and sandbox_forward(..., background = true, ...) must return an error without writing a PID file.

Remaining items:

  • crates/openshell-cli/src/ssh.rs:381: before terminating the PID returned by find_ssh_forward_pid, please re-validate that it still matches the expected SSH forward, similar to the existing pid_matches_forward guard used by stop_forward. Otherwise a stale or spoofed same-user PID could be signaled. Security mapping: CWE-20.
  • crates/openshell-cli/src/ssh.rs:37: the old 2s foreground grace period is now a hard listener-readiness deadline for both foreground and background startup. Please either raise/rename this as an explicit readiness timeout, for example 10s, or explain why 2s is sufficient for slow SSH auth/proxy startup.

Non-blocking suggestion: consider adding a direct foreground regression test where fake ssh exits 0 without opening a listener, since the foreground path changed materially too.

Docs: Fern docs were updated under docs/sandboxes/manage-sandboxes.mdx; no navigation update appears necessary.

Next state: gator:in-review

Signed-off-by: Shiju <shiju@nvidia.com>
@shiju-nv shiju-nv force-pushed the fix/forward-listener-health branch from 1678fe7 to 4583b5d Compare June 12, 2026 07:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

gator:in-review Gator is reviewing or awaiting PR review feedback

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Sandbox forward reports success without a reachable local listener

2 participants