Skip to content

fix: extract target normalisation and fix IPv6 parsing crashes#521

Merged
sleipnir merged 6 commits into
elixir-grpc:masterfrom
arctarus:fix/ipv6-normalize-target-crash
Apr 29, 2026
Merged

fix: extract target normalisation and fix IPv6 parsing crashes#521
sleipnir merged 6 commits into
elixir-grpc:masterfrom
arctarus:fix/ipv6-normalize-target-crash

Conversation

@arctarus
Copy link
Copy Markdown
Contributor

The bug

GRPC.Client.Connection contained two private functions, normalize_target_and_opts/2 and split_host_port/1, responsible for parsing and normalising the raw target string passed to GRPC.Stub.connect/2.

Both functions had untested edge cases around IPv6 addresses that caused runtime crashes in production:

1. normalize_target_and_opts/2CaseClauseError on IPv6 targets

The function delegated schemeless targets to a cond block that only handled two shapes: "a string with one colon" (host:port) and "a string with no colon" (unix socket). When an IPv6 address was passed — which always contains multiple colons — no branch matched and Elixir raised a CaseClauseError, crashing the Connection process before a connection was even attempted.

Affected inputs included any of:

GRPC.Stub.connect("[::1]:50051", [])       # bracketed IPv6 — crash
GRPC.Stub.connect("::1:50051", [])         # bare IPv6 with port — crash
GRPC.Stub.connect("2001:db8::1:8080", [])  # full IPv6 address — crash

2. split_host_port/1 — incorrect parsing of bracketed IPv6 without port

When a bracketed IPv6 address was provided with no port (e.g. "[::1]"), the regex for the [addr]:port pattern did not match. The fallback called strip_scheme/1, which splits on the first : character. Applied to "[::1]" this produced the host ":1]" — a silently wrong value that would cause a connection failure at the adapter level with a confusing error.

Why these bugs matter

Both failures happen before any network activity. The Connection process is a GenServer: a CaseClauseError during init/1 means the process never starts, and the supervision tree restarts it immediately with the same arguments, causing a restart loop. The bracketed-IPv6-without-port bug silently constructs a malformed host and attempts to connect to it, producing a cryptic adapter error. Neither failure points back to IPv6 as the root cause, making both hard to diagnose in production.

While most Kubernetes clusters today connect to gRPC services via DNS names rather than raw IP addresses, there are real and growing scenarios where a caller passes an IPv6 address directly to GRPC.Stub.connect/2: local development and testing ([::1]:50051 is the natural loopback on any IPv6-enabled machine), dual-stack clusters (stable since Kubernetes 1.21, increasingly enabled by cloud providers), telco and edge deployments (5G core network functions are specified over IPv6, where gRPC is a primary protocol), and bare-metal clusters going IPv6-first to avoid RFC 1918 exhaustion.

A bug that causes a silent restart loop with no actionable error message warrants a fix even if it affects a minority of users today.

The fix

Extraction to GRPC.Client.Connection.Target

The normalisation logic was extracted from Connection into a dedicated internal module, GRPC.Client.Connection.Target, with two public functions:

  • normalize/2 — takes a raw target string and an optional %GRPC.Credential{}, returns {norm_target, scheme, cred}.
  • split_host_port/1 — takes a normalised target string, returns {host, port}.

This separation keeps Connection focused on lifecycle concerns and makes the parsing logic independently testable.

IPv6 handling in normalize/2

The schemeless branch now explicitly handles three shapes before falling through:

  1. Bracketed IPv6 with port"[::1]:50051" matched by ~r/^\[([^\]]+)\]:(\d+)$/, yielding "ipv4:::1:50051".
  2. Bracketed IPv6 without port"[::1]" matched by a fallback ~r/\[([^\]]+)\]/ regex, stripping the brackets and returning the default port.
  3. Bare IPv6 with port"2001:db8::1:8080" handled by inspecting whether the last colon-separated segment is an integer, treating it as the port and joining the remaining segments as the address.

Bracketed IPv6 fallback in split_host_port/1

The _ -> fallback when the [addr]:port regex does not match now uses a second regex ~r/\[([^\]]+)\]/ to extract the address from the brackets, rather than calling strip_scheme/1 which blindly splits on the first :.

Interface simplification

normalize/2 previously accepted a keyword() opts list and returned the full opts back to the caller with :cred injected. The function only ever read one key from opts (opts[:cred]). The return type now reflects the module's actual concern:

# before
normalize(target, opts) :: {norm_target, norm_opts, scheme}

# after
normalize(target, cred) :: {norm_target, scheme, cred}

Connection now calls Target.normalize(target, opts[:cred]) and reassembles its own state directly, keeping opts management in Connection where it belongs.

The nimble_options dependency that was previously added to validate opts[:cred] was removed, as it is no longer needed — the type is enforced by the is_struct(cred, GRPC.Credential) guard clause.

Tests

A new test file, grpc/test/grpc/client/connection/target_test.exs, provides exhaustive coverage of Target.normalize/2 and Target.split_host_port/1, including all IPv6 input shapes, scheme inference, credential propagation, and the ArgumentError raised when a credential is passed alongside an http:// target.

The test suite was written first as failing tests to reproduce the original crashes, then made green by the implementation. The bracketed-IPv6-without-port bug in split_host_port/1 was itself discovered by the new tests during the review pass.

@arctarus arctarus force-pushed the fix/ipv6-normalize-target-crash branch 4 times, most recently from 35635e0 to 9aa9e12 Compare April 13, 2026 20:50
Copy link
Copy Markdown
Collaborator

@sleipnir sleipnir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thanks for the PR.

Could you reduce the number of comments in the code that don't translate into some high-level explanation? I've seen many comments that are self-explanatory within the code itself, below the comment.

Comment thread grpc/test/grpc/client/connection/target_test.exs Outdated
Comment thread grpc/test/grpc/client/connection/target_test.exs Outdated
Comment thread grpc/lib/grpc/client/connection/target.ex Outdated
Comment thread grpc/lib/grpc/client/connection/target.ex Outdated
Comment thread grpc/lib/grpc/client/connection/target.ex Outdated
Comment thread grpc/lib/grpc/client/connection/target.ex Outdated
`GRPC.Client.Connection` crashed with a `CaseClauseError` when any IPv6
target was passed (e.g. `"[::1]:50051"`, `"::1:50051"`, `"2001:db8::1:8080"`),
because the schemeless branch only handled single-colon strings. A second
bug in `split_host_port/1` silently produced a malformed host (`":1]"`)
for bracketed IPv6 addresses without an explicit port.

Both bugs are fixed by extracting the parsing logic to a dedicated
`GRPC.Client.Connection.Target` module:

- `normalize/2` now handles bracketed IPv6, bare IPv6, and all resolver
  scheme prefixes explicitly.
- `split_host_port/1` uses a bracket-aware regex fallback instead of
  splitting on the first colon when no port is present.

The function signatures are also simplified: `normalize/2` now accepts
`cred :: GRPC.Credential.t() | nil` directly instead of a full opts
keyword list, and returns `{norm_target, scheme, cred}` instead of
threading opts through. The `nimble_options` dependency is removed.

An exhaustive test suite is added for `Target`, covering all IPv6 input
shapes, scheme inference, credential propagation, and error cases. The
bracketed-IPv6-without-port bug in `split_host_port/1` was itself
discovered during the test review pass.

Made-with: Cursor
Address PR review feedback from sleipnir: strip narrating comments
that restate what the code does and decorative section-divider banners
that duplicate structure already provided by `describe` blocks and
`defp` visibility.

Made-with: Cursor
@sleipnir
Copy link
Copy Markdown
Collaborator

@arctarus We also have an error in the CI.

@arctarus arctarus force-pushed the fix/ipv6-normalize-target-crash branch from 0a7cc48 to d1de4f1 Compare April 28, 2026 14:45
The Target.normalize/2 refactor changed the return value from
{norm_target, norm_opts, scheme} to {norm_target, scheme, cred},
but the connect_opts assignment was not updated, causing a
compilation error.

Made-with: Cursor
@arctarus
Copy link
Copy Markdown
Contributor Author

@sleipnir feedback applied and errors fixed! Let me know if there is something else I can do.

Comment thread grpc/lib/grpc/client/connection/target.ex Outdated
Comment thread grpc/lib/grpc/client/connection/target.ex Outdated
Comment thread grpc/lib/grpc/client/connection/endpoint_resolver.ex Outdated
Comment thread grpc/lib/grpc/client/connection/endpoint_resolver.ex Outdated
Rename GRPC.Client.Connection.Target to
GRPC.Client.Connection.EndpointResolver for clarity. Update all
references in Connection and tests, and rename the source and test
files accordingly.

Made-with: Cursor
normalize/2 previously hardcoded the "ipv4:" resolver prefix for all
addresses, including IPv6. This routed IPv6 targets to the wrong
resolver. Introduce resolver_prefix/1 to detect colons in the host
and emit "ipv6:" when appropriate.

Made-with: Cursor
Replace the string heuristic with :inet.parse_address/1 to detect
the address family, matching the existing pattern in GRPC.Stub.

Made-with: Cursor
@arctarus arctarus force-pushed the fix/ipv6-normalize-target-crash branch from 91f474d to 138a94d Compare April 28, 2026 15:46
@sleipnir sleipnir merged commit 8c784a2 into elixir-grpc:master Apr 29, 2026
7 checks passed
@sleipnir
Copy link
Copy Markdown
Collaborator

@arctarus thank you
Great job!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants