feat: enable full peer registration and reachability over WebSocket#657
feat: enable full peer registration and reachability over WebSocket#657nour-s wants to merge 5 commits into
Conversation
Previously, clients connecting via WebSocket (e.g. through a reverse
proxy or Cloudflare tunnel) could only initiate connections — they could
not register as hosts or receive incoming connections. This made
WebSocket-only deployments one-directional.
Root cause: four places in the code assumed every registered peer is
reachable via UDP, which is not true for WebSocket-only peers.
Changes:
1. Handle RegisterPk in the WebSocket path (handle_tcp, ws=true)
The RegisterPk arm unconditionally returned NOT_SUPPORT for all
TCP/WebSocket connections. This patch runs the same full registration
logic that the UDP path uses, then stores the WebSocket sink in
tcp_punch and returns true to keep the connection alive. This allows
the peer to receive subsequent server-pushed messages.
2. Route punch-hole requests to WebSocket peers via their stored sink
(handle_udp_punch_hole_request, RequestRelay in handle_tcp)
When the target peer registered over WebSocket it has no real UDP
port, so its stored socket_addr has port 0. Both handlers now check
for port == 0 and call send_to_tcp instead of enqueuing a UDP
datagram that would be silently dropped.
3. Make send_to_tcp fall back to UDP and re-insert the sink
- If the target has no TCP/WS sink in tcp_punch, fall back to UDP so
that responses to UDP-registered peers (e.g. the punch-hole
response sent back to the initiator) are not silently dropped.
- After successfully delivering a message to a WebSocket peer,
re-insert the sink so the peer can receive future messages without
needing to reconnect.
4. Same UDP fallback in send_to_tcp_sync for consistency.
How port == 0 works as a sentinel:
handle_listener_inner extracts the real client IP from X-Real-IP /
X-Forwarded-For when accepting a WebSocket upgrade, but there is no
corresponding real UDP port, so addr is set to "<real-ip>:0". UDP
peers always have a non-zero port, making port == 0 an unambiguous
indicator that a peer registered over WebSocket.
End-to-end flow enabled by this patch (both directions):
WebSocket peer A ──WS──► Cloudflare ──► hbbs :21118
UDP peer B ──UDP──► hbbs :21116
A → B: unchanged (already worked; A's sink is stored on PunchHoleRequest)
B → A: hbbs delivers PunchHole to A via stored WS sink (patch 2),
A responds with PunchHoleSent, hbbs delivers PunchHoleResponse
to B via UDP fallback (patch 3), both connect to relay → bridged.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughWebSocket clients gain full RegisterPk/RegisterPeer/Online handling with validation, rate limiting, and stored connection sinks; message routing treats encoded port 0 as a WebSocket target, sending via stored TCP/WS sink (re-inserted synchronously) and falling back to UDP for non-WS peers. ChangesWebSocket Peer Registration and Delivery
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/rendezvous_server.rs (2)
985-997:⚠️ Potential issue | 🟠 Major | ⚡ Quick winApply the
port() == 0WebSocket routing on the TCP punch-hole path too.
handle_udp_punch_hole_request()now special-cases WebSocket targets, but this sibling path still always queues UDP whento_addrexists. A TCP/WS initiator still can't reach a WebSocket-registered target here becauseport 0is sent to UDP instead ofsend_to_tcp().Suggested fix
async fn handle_tcp_punch_hole_request( &mut self, addr: SocketAddr, ph: PunchHoleRequest, key: &str, ws: bool, ) -> ResultType<()> { let (msg, to_addr) = self.handle_punch_hole_request(addr, ph, key, ws).await?; if let Some(addr) = to_addr { - self.tx.send(Data::Msg(msg.into(), addr))?; + if addr.port() == 0 { + self.send_to_tcp(msg, addr).await; + } else { + self.tx.send(Data::Msg(msg.into(), addr))?; + } } else { self.send_to_tcp_sync(msg, addr).await?; } Ok(()) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/rendezvous_server.rs` around lines 985 - 997, handle_tcp_punch_hole_request currently forwards to UDP whenever handle_punch_hole_request returns a to_addr, which breaks WebSocket targets encoded as port()==0; replicate the UDP-path special-case: after calling handle_punch_hole_request in handle_tcp_punch_hole_request, check if let Some(addr) = to_addr and if ws && addr.port() == 0 then call send_to_tcp_sync(msg, addr).await? (or the async TCP send used elsewhere) instead of tx.send(Data::Msg(...)); otherwise keep the existing tx.send(Data::Msg(...)) behavior, and preserve the existing else branch that calls send_to_tcp_sync for the original addr. Ensure you reference handle_tcp_punch_hole_request, handle_punch_hole_request, tx.send(Data::Msg(...)), and send_to_tcp_sync when making the change.
969-980:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReinsert the sink after
send_to_tcp_sync.This path consumes the stored sink and never puts it back. After the first
RelayResponseor error reply, the WebSocket peer stays connected but can no longer receive later server-push messages.Suggested fix
async fn send_to_tcp_sync( &mut self, msg: RendezvousMessage, addr: SocketAddr, ) -> ResultType<()> { let addr_v4 = try_into_v4(addr); let mut sink = self.tcp_punch.lock().await.remove(&addr_v4); if sink.is_some() { Self::send_to_sink(&mut sink, msg).await; + if let Some(s) = sink { + self.tcp_punch.lock().await.insert(addr_v4, s); + } } else { self.tx.send(Data::Msg(msg.into(), addr)).ok(); } Ok(()) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/rendezvous_server.rs` around lines 969 - 980, send_to_tcp_sync currently removes the sink from self.tcp_punch and never puts it back, so after the first send the peer stops receiving pushes; fix by reinserting the sink after calling Self::send_to_sink: when you do let mut sink = self.tcp_punch.lock().await.remove(&addr_v4); if sink.is_some() { call Self::send_to_sink(&mut sink, msg).await; then lock self.tcp_punch again and insert the sink back (e.g. if let Some(s) = sink { self.tcp_punch.lock().await.insert(addr_v4, s); }), ensuring the sink is re-stored whether send_to_sink succeeds or fails; leave the else branch that uses self.tx.send(...) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/rendezvous_server.rs`:
- Around line 934-949: The code removes the sink from tcp_punch before spawning
a task which creates a time window where concurrent sends see no sink and fall
back to UDP; instead, in send_to_tcp obtain a clone of the sink without removing
the map entry (e.g., look up/get and clone the Sink value referenced by
tcp_punch for addr_v4), spawn the task using that cloned sink and leave the
original map entry intact (so other sends still find the sink); update
references to tcp, send_to_sink, and tcp_punch.lock().await.remove/insert
accordingly so you no longer remove then re-insert the tcp entry.
---
Outside diff comments:
In `@src/rendezvous_server.rs`:
- Around line 985-997: handle_tcp_punch_hole_request currently forwards to UDP
whenever handle_punch_hole_request returns a to_addr, which breaks WebSocket
targets encoded as port()==0; replicate the UDP-path special-case: after calling
handle_punch_hole_request in handle_tcp_punch_hole_request, check if let
Some(addr) = to_addr and if ws && addr.port() == 0 then call
send_to_tcp_sync(msg, addr).await? (or the async TCP send used elsewhere)
instead of tx.send(Data::Msg(...)); otherwise keep the existing
tx.send(Data::Msg(...)) behavior, and preserve the existing else branch that
calls send_to_tcp_sync for the original addr. Ensure you reference
handle_tcp_punch_hole_request, handle_punch_hole_request,
tx.send(Data::Msg(...)), and send_to_tcp_sync when making the change.
- Around line 969-980: send_to_tcp_sync currently removes the sink from
self.tcp_punch and never puts it back, so after the first send the peer stops
receiving pushes; fix by reinserting the sink after calling Self::send_to_sink:
when you do let mut sink = self.tcp_punch.lock().await.remove(&addr_v4); if
sink.is_some() { call Self::send_to_sink(&mut sink, msg).await; then lock
self.tcp_punch again and insert the sink back (e.g. if let Some(s) = sink {
self.tcp_punch.lock().await.insert(addr_v4, s); }), ensuring the sink is
re-stored whether send_to_sink succeeds or fails; leave the else branch that
uses self.tx.send(...) unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
Without this, the in-memory last_reg_time for WebSocket-registered peers was never updated after the initial get_or() call, causing them to appear offline after REG_TIMEOUT (30s) even with an active connection.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/rendezvous_server.rs`:
- Around line 660-665: When an existing peer registers via WebSocket the code
currently skips update_pk(...) when `changed` is false, leaving the old nonzero
`socket_addr` and never flipping the transport marker; after the `if let Some(p)
= self.pm.get_in_memory(&id).await { ... }` block, also update the in-memory
peer's socket address when the incoming registration carries a WebSocket sink:
detect the registered record (e.g. `rk.sink.is_some()` or equivalent) and set
`p.write().await.socket_addr = addr` (or update only when `p.read().await.port()
== 0`) so WebSocket routing recognizes the peer; keep the `last_reg_time` update
but add this socket_addr refresh instead of relying solely on `last_reg_time`.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
Four bugs prevented Windows-initiated connections from reaching a
WebSocket-registered peer (e.g. Mac behind Cloudflare):
1. socket_addr not updated when changed=false in RegisterPk handler.
update_pk is only called when the peer's pk/uuid changed, so returning
peers kept socket_addr=0.0.0.0:0 across server restarts. Fixed by
always writing socket_addr=addr after the update_pk branch.
2. handle_tcp_punch_hole_request routed all punch hole targets via UDP.
When the target's socket_addr has port=0 (WebSocket sentinel), this
caused an OS error 22 (invalid argument) that crashed and restarted
the UDP listener. Fixed by applying the same port==0 → send_to_tcp
routing already present in handle_udp_punch_hole_request.
3. send_to_tcp and send_to_tcp_sync fell back to UDP for port-0 targets
when the WebSocket sink was not found (peer temporarily disconnected).
Sending UDP to port 0 is invalid; added a port != 0 guard so the
message is silently dropped instead.
4. OnlineRequest and RegisterPeer messages on a WebSocket connection
fell into the _ => {} catch-all and returned false, closing the
connection and evicting the peer's sink from tcp_punch. Added explicit
ws=true branches: RegisterPeer updates socket_addr/last_reg_time and
keeps the connection open; OnlineRequest builds the online bitmask,
sends OnlineResponse via tcp_punch (since the sink was moved there
during RegisterPk), and keeps the connection open.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/rendezvous_server.rs (1)
1026-1040:⚠️ Potential issue | 🟠 Major | ⚡ Quick winSink permanently removed — subsequent messages to WebSocket peer will be lost.
send_to_tcp_syncremoves the sink at line 1033 but never re-inserts it after sending. For WebSocket peers (port == 0), this means all future server-initiated messages to this address will be silently dropped until the client sends anotherRegisterPeerheartbeat that re-stores the sink.This is particularly problematic for
RelayResponse(line 537), where the peer may need to receive additional messages after the relay is established.🐛 Proposed fix — re-insert sink after sending
async fn send_to_tcp_sync( &mut self, msg: RendezvousMessage, addr: SocketAddr, ) -> ResultType<()> { let addr_v4 = try_into_v4(addr); let mut sink = self.tcp_punch.lock().await.remove(&addr_v4); if sink.is_some() { Self::send_to_sink(&mut sink, msg).await; + // Re-insert sink so WebSocket peer can receive future messages + if let Some(s) = sink { + self.tcp_punch.lock().await.insert(addr_v4, s); + } } else if addr.port() != 0 { self.tx.send(Data::Msg(msg.into(), addr)).ok(); } Ok(()) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/rendezvous_server.rs` around lines 1026 - 1040, send_to_tcp_sync currently removes the WebSocket sink from self.tcp_punch (via lock().await.remove(&addr_v4)) and sends once via Self::send_to_sink, but never re-inserts it, so future server-initiated messages are lost; change send_to_tcp_sync (and uses of tcp_punch/tcp_punch.lock().await) to, after awaiting Self::send_to_sink(&mut sink, msg).await when sink.is_some(), re-insert the sink back into the tcp_punch map (e.g., put the Option back under addr_v4) so the sink persists for future messages; ensure ownership is handled correctly when moving the sink out and back, and preserve the existing branch that falls back to self.tx.send(Data::Msg(...)) for non-zero ports.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/rendezvous_server.rs`:
- Around line 1026-1040: send_to_tcp_sync currently removes the WebSocket sink
from self.tcp_punch (via lock().await.remove(&addr_v4)) and sends once via
Self::send_to_sink, but never re-inserts it, so future server-initiated messages
are lost; change send_to_tcp_sync (and uses of tcp_punch/tcp_punch.lock().await)
to, after awaiting Self::send_to_sink(&mut sink, msg).await when sink.is_some(),
re-insert the sink back into the tcp_punch map (e.g., put the Option back under
addr_v4) so the sink persists for future messages; ensure ownership is handled
correctly when moving the sink out and back, and preserve the existing branch
that falls back to self.tx.send(Data::Msg(...)) for non-zero ports.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/rendezvous_server.rs (1)
719-728:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRace condition: spawned task creates a window for concurrent message loss.
This handler removes the sink from
tcp_punch, then spawns an async task to send and re-insert. This is the same race condition pattern that was fixed insend_to_tcp(lines 993-1001) by eliminating the spawn. During the window between removal and the spawned re-insertion, another concurrent send to the same WebSocket peer will find no sink and silently drop the message (sinceport() == 0).For consistency with the
send_to_tcpfix, await inline rather than spawning.🔧 Proposed fix
let addr_v4 = try_into_v4(addr); let tcp_punch = self.tcp_punch.clone(); let mut stored = tcp_punch.lock().await.remove(&addr_v4); if stored.is_some() { - tokio::spawn(async move { - Self::send_to_sink(&mut stored, msg_out).await; - if let Some(s) = stored { - tcp_punch.lock().await.insert(addr_v4, s); - } - }); + Self::send_to_sink(&mut stored, msg_out).await; + if let Some(s) = stored { + tcp_punch.lock().await.insert(addr_v4, s); + } } else { Self::send_to_sink(sink, msg_out).await; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/rendezvous_server.rs` around lines 719 - 728, The code removes the sink from tcp_punch and then tokio::spawn's an async block, which creates a race where concurrent sends see no sink; instead, mirror the send_to_tcp fix by handling the send inline: call try_into_v4(addr), lock tcp_punch and remove into mut stored, then directly await Self::send_to_sink(&mut stored, msg_out).await (no tokio::spawn), and after the await, if let Some(s) = stored { tcp_punch.lock().await.insert(addr_v4, s); } so the remove/send/reinsert sequence happens without a concurrent window that drops messages.
🧹 Nitpick comments (2)
src/rendezvous_server.rs (2)
568-683: 🏗️ Heavy liftConsider extracting shared registration logic to reduce duplication.
The WebSocket
RegisterPkflow duplicates approximately 100 lines of validation and state-update logic from the UDP handler (lines 342-425). This includes UUID/IP validation, rate limiting checks, IP change tracking, and peer state updates. While functional, this duplication creates a maintenance burden—future bug fixes or policy changes must be applied in both places.Consider extracting the common validation and state-update logic into a shared helper function that both handlers call, with the transport-specific response delivery handled by the caller.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/rendezvous_server.rs` around lines 568 - 683, Extract the duplicated registration/validation/state-update logic into a shared helper (e.g., validate_and_update_registration or handle_register_pk_common) that both the UDP and WebSocket flows call: move UUID/IP checks, check_ip_blocker call, reg_pk rate-limit manipulation (using peer.read().await.reg_pk), IP change tracking using IP_CHANGES and IP_CHANGE_DUR, and the logic that computes (changed, ip_changed) and invokes pm.update_pk(id, peer, addr, rk.uuid, rk.pk, ip). Return a result struct indicating success/failure and flags (changed, ip_changed, ip, id, peer) so the caller can perform transport-specific responses via Self::send_to_sink, update socket_addr/last_reg_time in-memory, and insert into tcp_punch (using try_into_v4) for WebSocket sinks; keep transport-specific sink handling and response construction out of the helper.
996-998: 💤 Low valueClarify: a small race window still exists during the async send.
The comment states "no window where a concurrent send sees the map entry missing," but there is still a window between the
remove(line 993) andinsert(line 1000) whilesend_to_sinkawaits. A concurrent caller for the same address could observe the missing entry and either fall back to UDP (if port ≠ 0) or silently drop (if port == 0).This is likely acceptable given that
Sinkcannot be cloned and holding the lock during send would block unrelated peers. However, the comment should acknowledge the limitation rather than claim there's no window.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/rendezvous_server.rs` around lines 996 - 998, Update the misleading comment around the remove/insert + await section in rendezvous_server.rs to acknowledge the race window: explain that between the map.remove(...) and map.insert(...) while awaiting send_to_sink(...) there is a brief period where a concurrent send may observe the entry missing and fall back to UDP or drop the message; mention this is a tradeoff because Sink is not Clone and holding the lock during the await would block other peers, so the current approach accepts the small race. Refer to send_to_sink, Sink, and the map.remove/map.insert operations when editing the comment.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/rendezvous_server.rs`:
- Around line 719-728: The code removes the sink from tcp_punch and then
tokio::spawn's an async block, which creates a race where concurrent sends see
no sink; instead, mirror the send_to_tcp fix by handling the send inline: call
try_into_v4(addr), lock tcp_punch and remove into mut stored, then directly
await Self::send_to_sink(&mut stored, msg_out).await (no tokio::spawn), and
after the await, if let Some(s) = stored {
tcp_punch.lock().await.insert(addr_v4, s); } so the remove/send/reinsert
sequence happens without a concurrent window that drops messages.
---
Nitpick comments:
In `@src/rendezvous_server.rs`:
- Around line 568-683: Extract the duplicated
registration/validation/state-update logic into a shared helper (e.g.,
validate_and_update_registration or handle_register_pk_common) that both the UDP
and WebSocket flows call: move UUID/IP checks, check_ip_blocker call, reg_pk
rate-limit manipulation (using peer.read().await.reg_pk), IP change tracking
using IP_CHANGES and IP_CHANGE_DUR, and the logic that computes (changed,
ip_changed) and invokes pm.update_pk(id, peer, addr, rk.uuid, rk.pk, ip). Return
a result struct indicating success/failure and flags (changed, ip_changed, ip,
id, peer) so the caller can perform transport-specific responses via
Self::send_to_sink, update socket_addr/last_reg_time in-memory, and insert into
tcp_punch (using try_into_v4) for WebSocket sinks; keep transport-specific sink
handling and response construction out of the helper.
- Around line 996-998: Update the misleading comment around the remove/insert +
await section in rendezvous_server.rs to acknowledge the race window: explain
that between the map.remove(...) and map.insert(...) while awaiting
send_to_sink(...) there is a brief period where a concurrent send may observe
the entry missing and fall back to UDP or drop the message; mention this is a
tradeoff because Sink is not Clone and holding the lock during the await would
block other peers, so the current approach accepts the small race. Refer to
send_to_sink, Sink, and the map.remove/map.insert operations when editing the
comment.
send_to_tcp previously removed the sink from tcp_punch before spawning a task, creating a window where a back-to-back call saw no entry and fell back to UDP (undeliverable for port-0 WebSocket peers). Replace the spawn with an inline await so the re-insert is atomic relative to the calling task and no concurrent caller can observe a missing entry. Apply the same re-insert-after-send fix to send_to_tcp_sync, which was also consuming the sink without putting it back.
3ab6160 to
9b558e4
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/rendezvous_server.rs (1)
721-728:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftKeep the sink entry stable while the send is in flight.
Line 993 and Line 1032 still reintroduce the same race in a smaller window: the entry is removed before
send_to_sink(...).await, so a concurrent delivery can still observe “no sink” and drop the next message. Forport() == 0peers that becomes a silent loss, and theOnlineRequestpath at Line 721 has the same pattern.Suggested direction
- tcp_punch: Arc<Mutex<HashMap<SocketAddr, Sink>>>, + tcp_punch: Arc<Mutex<HashMap<SocketAddr, Arc<TokioMutex<Sink>>>>>, - let mut tcp = self.tcp_punch.lock().await.remove(&addr_v4); - if tcp.is_some() { - Self::send_to_sink(&mut tcp, msg).await; - if let Some(s) = tcp { - self.tcp_punch.lock().await.insert(addr_v4, s); - } + let sink = self.tcp_punch.lock().await.get(&addr_v4).cloned(); + if let Some(sink) = sink { + let mut sink = sink.lock().await; + Self::send_to_sink_locked(&mut *sink, msg).await;That keeps lookups stable and serializes on the sink itself instead of on map membership. Apply the same model to the
OnlineRequestresponse path too.Also applies to: 991-1001, 1026-1037
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/rendezvous_server.rs` around lines 721 - 728, The code removes the sink entry from tcp_punch (remove(&addr_v4)) before awaiting send_to_sink, which creates a race where concurrent deliveries can observe “no sink” and drop messages; instead keep the map entry stable and serialize on the sink object itself: change the map value to an Arc/Box containing a per-sink Mutex or send-queue and do a lookup (e.g., tcp_punch.lock().await.get(&addr_v4) or entry API) to clone the Arc/handle, then call Self::send_to_sink with that cloned handle (locking the per-sink mutex inside send_to_sink) so the map membership is not removed while awaiting; apply the same pattern to the OnlineRequest response path and the other occurrences noted (the blocks around send_to_sink and the tcp_punch remove calls) so entries are never transiently removed before async sends complete.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@src/rendezvous_server.rs`:
- Around line 721-728: The code removes the sink entry from tcp_punch
(remove(&addr_v4)) before awaiting send_to_sink, which creates a race where
concurrent deliveries can observe “no sink” and drop messages; instead keep the
map entry stable and serialize on the sink object itself: change the map value
to an Arc/Box containing a per-sink Mutex or send-queue and do a lookup (e.g.,
tcp_punch.lock().await.get(&addr_v4) or entry API) to clone the Arc/handle, then
call Self::send_to_sink with that cloned handle (locking the per-sink mutex
inside send_to_sink) so the map membership is not removed while awaiting; apply
the same pattern to the OnlineRequest response path and the other occurrences
noted (the blocks around send_to_sink and the tcp_punch remove calls) so entries
are never transiently removed before async sends complete.
…nd_to_tcp comment The OnlineRequest response path had the same remove-spawn-reinsert race as send_to_tcp: the sink was taken from tcp_punch before spawning, leaving a window where a concurrent delivery would find no entry and silently drop. Fixed by awaiting the send inline, matching the pattern used in send_to_tcp. Also update the send_to_tcp comment to accurately acknowledge the small race window that remains between remove and re-insert during the await, rather than claiming no window exists.
|
Close since it is our Pro feature. |
Problem
Clients that can only reach hbbs via WebSocket (e.g. through a Cloudflare tunnel or any HTTP reverse proxy) could not register as hosts. They showed "Connecting to RustDesk server" permanently and could not receive incoming connections. Only the direction from the WebSocket peer to a UDP-registered peer worked.
Root cause
Four places in
rendezvous_server.rsassumed every registered peer is reachable via UDP:handle_tcpreturnedNOT_SUPPORTforRegisterPkon all TCP/WebSocket connections.handle_udp_punch_hole_requestalways enqueued the punch-hole message on the UDP channel, even when the target peer has no real UDP port.RequestRelayarm inhandle_tcpdid the same.send_to_tcp/send_to_tcp_syncsilently dropped messages when the target had no stored TCP sink, with no fallback for UDP peers.How WebSocket peers are detected
handle_listener_inneralready extracts the real client IP fromX-Real-IP/X-Forwarded-Forwhen accepting a WebSocket upgrade, but there is no real UDP port, soaddrbecomes<real-ip>:0. UDP peers always have a non-zero port. Port == 0 is therefore an unambiguous sentinel for a WebSocket-registered peer.Changes
handle_tcp(RegisterPk,ws=true)tcp_punch; returntrueto keep the connection alive for future server-push messageshandle_udp_punch_hole_request,RequestRelaypeer_addr.port() == 0; callsend_to_tcpinstead of the UDP channelsend_to_tcpsend_to_tcp_syncEnd-to-end connection flow (both directions)
Setup: WebSocket peer A behind a reverse proxy, UDP peer B on a local network.
A → B (unchanged): A opens a WebSocket, sends
PunchHoleRequest; sink stored intcp_punch; hbbs deliversFetchLocalAddrto B via UDP; B responds via TCP withPunchHoleSent; hbbs deliversPunchHoleResponseto A via the stored sink.B → A (new): B sends
PunchHoleRequestvia UDP; hbbs detectsA.socket_addr.port() == 0and deliversPunchHoleto A viasend_to_tcp(sink re-inserted after use); A connects to relay and sendsPunchHoleSent;handle_hole_sentcallssend_to_tcpfor B's addr — no sink found, falls back to UDP → B receivesPunchHoleResponse; both sides connect to relay → bridged.Testing
Tested end-to-end with:
allow-websocket = Y, connecting to hbbs via a Cloudflare tunnel (WebSocket over port 443)Both directions (
A → BandB → A) established relay connections successfully. Prior to this patch,B → Aalways timed out.Summary by CodeRabbit
New Features
Improvements
Bug Fixes