Skip to content

fix: eliminate test port allocation race by running uvicorn in-thread#2263

Draft
maxisbey wants to merge 2 commits intomainfrom
max-158-port-allocation
Draft

fix: eliminate test port allocation race by running uvicorn in-thread#2263
maxisbey wants to merge 2 commits intomainfrom
max-158-port-allocation

Conversation

@maxisbey
Copy link
Contributor

@maxisbey maxisbey commented Mar 10, 2026

Summary

Replaces the subprocess + port-race fixture pattern in tests/shared/test_streamable_http.py with an in-thread uvicorn helper that pre-binds the listening socket, eliminating the TOCTOU race that caused intermittent CI failures under pytest-xdist.

The Problem

@pytest.fixture
def event_server_port() -> int:
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]  # ← Port released! Race window opens

@pytest.fixture
def event_server(event_server_port, event_store):
    proc = multiprocessing.Process(target=run_server, kwargs={"port": event_server_port, ...})
    proc.start()
    wait_for_server(event_server_port)  # ← May see WRONG server
    ...

Between getsockname() returning and the subprocess rebinding, another pytest-xdist worker can claim the port — causing httpx.ConnectError: All connection attempts failed (job 63508538456) on the SSE auto-reconnect tests.

PR #1528 attempted worker-specific port ranges but was closed as only reducing (not eliminating) collision probability.

The Fix

tests/test_helpers.py now provides run_uvicorn_in_thread(app) which:

  1. Binds a listening socket with port=0 and calls listen() before starting the server thread
  2. Passes that socket to uvicorn via server.run(sockets=[sock])
  3. Yields the URL immediately — the port is known before the server thread even runs, and the kernel's listen queue buffers any connections that arrive during uvicorn startup

No polling, no race window. The socket is held atomically from bind() until sock.close() at fixture teardown.

Scope

Migrated the four test_streamable_http.py fixtures (basic_server, event_server, json_response_server, context_aware_server) that share create_app(). These include the SSE auto-reconnect tests (test_server_close_sse_stream_via_context, test_streamable_http_client_auto_reconnects, test_streamable_http_multiple_reconnections, test_standalone_get_stream_reconnection, test_streamable_http_client_respects_retry_interval) which genuinely need real TCP to exercise connection lifecycle.

Other files with the same pattern (test_sse.py, test_ws.py, test_http_unicode.py, test_integration.py, security tests) are deliberately left alone here — they don't need real TCP and are being converted to in-memory/ASGITransport in sibling PRs. wait_for_server() is kept for them.

Coverage

Running the server in-process means coverage now tracks transport code that was previously subprocess-invisible. Adjusted # pragma: no cover# pragma: lax no cover on a handful of branches in streamable_http.py and transport_security.py that are now partially covered depending on which tests exercise them.

AI Disclaimer

@maxisbey maxisbey force-pushed the max-158-port-allocation branch 2 times, most recently from a53ffa9 to 4a62434 Compare March 18, 2026 16:00
The previous pattern picked a free port via socket.bind(0), released it,
then started a uvicorn subprocess hoping to rebind — a TOCTOU race that
caused intermittent CI failures when pytest-xdist workers stole the port
between release and rebind (ConnectError, 404 against wrong server).

Added run_uvicorn_in_thread() which pre-binds the listening socket with
port=0 and passes it to uvicorn via server.run(sockets=[sock]). The port
is held atomically from bind until shutdown and is known before the
server thread even starts — no polling, no race. The kernel's listen
queue buffers any connections that arrive during uvicorn startup.

Migrated the four test_streamable_http.py fixtures (basic_server,
event_server, json_response_server, context_aware_server) that share
create_app(). These include the SSE auto-reconnect tests that genuinely
need real TCP to exercise connection lifecycle.

Running the server in-process means coverage now tracks transport code
that was previously subprocess-invisible; adjusted pragmas accordingly
(targeted no-cover on unreached error paths, lax no-cover on
timing-dependent branches).

wait_for_server() is kept for files not touched by this PR.
…t mid-drain

Without timeout_graceful_shutdown, uvicorn's shutdown() waits indefinitely
for open connections to close. The SSE reconnection tests in
test_streamable_http.py can leave streams open at fixture teardown, so
the 5s thread.join times out and abandons the thread mid-shutdown.

On Windows Proactor, the abandoned connection transports still have
pending Overlapped Recv operations when the event loop is torn down.
GC later finds them during an unrelated test, surfacing as
PytestUnraisableExceptionWarning. Previously hidden by subprocess
isolation.

timeout_graceful_shutdown=1 gives uvicorn a bounded window to drain
connections, then cancels remaining tasks via asyncio — transports
unwind through CancelledError instead of being abandoned.
@maxisbey maxisbey force-pushed the max-158-port-allocation branch from 4a62434 to 7e091ba Compare March 18, 2026 16:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant