resample: preserve integer precision in nodata mask comparison (#2570)#2592
Open
brendancol wants to merge 2 commits into
Open
resample: preserve integer precision in nodata mask comparison (#2570)#2592brendancol wants to merge 2 commits into
brendancol wants to merge 2 commits into
Conversation
`_resolve_nodata` cast the sentinel through `float()` and `_apply_nodata_mask` compared the data array against that float. For int64 values above 2**53 the cast is lossy, so distinct integers collided after rounding and valid cells were masked to NaN (e.g. nodata=2**60+1 also masked a valid 2**60 cell). Fix: keep the sentinel in the input's native dtype when the input is integer/bool, and compare in integer space before promoting the output to float32. Float inputs still route through float() so NaN semantics keep working. The masking helper feeds every backend (numpy / cupy / dask+numpy / dask+cupy), so the change lives in one place.
brendancol
commented
May 28, 2026
Contributor
Author
brendancol
left a comment
There was a problem hiding this comment.
PR Review: resample integer nodata precision fix
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
xrspatial/resample.py:1162-1165-- thetry/except TypeErroraround the NaN check is dead code.isinstance(nodata, float)already gates the path, andnp.isnanon a Python float will never raise TypeError. Drop the try wrapper and keep theif isinstance(...) and np.isnan(...)check on its own line.
Nits (optional improvements)
xrspatial/resample.py:1166--np.asarray(nodata).astype(agg.dtype).item()will silently truncate a non-integer float sentinel (e.g.nodata=-9999.5on int32 data becomes -9999). Realistically nobody passes a fractional sentinel, but a one-line check likeif isinstance(nodata, float) and not nodata.is_integer(): raise ValueError(...)would catch the obvious user error. Skip if you'd rather keep the helper minimal.xrspatial/tests/test_resample.py:1456--test_int64_sentinel_above_2pow53_preserves_distinct_valuesis the headline regression test. Consider assertingout.values[1, 1] == np.float32(1 << 60)so a future regression that masks the cell to NaN AND a regression that swaps the surviving value both get caught. The currentnp.isfinitecheck is correct but a bit loose.
What looks good
- The split between float-input and integer-input paths in
_resolve_nodatais the right call. It keeps the float NaN short-circuit working while routing integer comparisons through the input dtype. _apply_nodata_maskcomputes the mask before promoting to float32, which is exactly the surgical change the bug needs. The mask carries over correctly toagg.where(mask)after the cast.- Backend coverage is right: the masking helper sits upstream of every per-backend dispatch path, so numpy, cupy, dask+numpy, and dask+cupy all inherit the fix automatically. The PR body confirms numpy and dask+numpy were exercised locally.
- Test coverage hits the exact failure mode (int64 sentinel above 2**53), the explicit-kwarg and attr paths, and the unaffected float branches.
Checklist
- Algorithm matches reference / intent (integer equality in input dtype)
- All implemented backends produce consistent results (single shared helper)
- NaN handling is correct (float NaN short-circuit preserved; NaN sentinel on int data is a no-op)
- Edge cases are covered by tests (int64 > 2**53, int32, uint16, float NaN, float finite)
- No dask chunk boundary concerns (no neighborhood op in this change)
- No premature materialization or unnecessary copies
- Benchmark: not applicable for a correctness fix in a helper
- README feature matrix: not applicable, no new function or backend
- Docstrings updated to reflect the new dispatch on input dtype
…en test (#2570) - Remove the `try/except TypeError` wrapper around the NaN check; the `isinstance(nodata, float)` gate makes the except branch unreachable. - Raise ValueError when a fractional float sentinel (e.g. -9999.5) is passed for an integer input -- silent truncation to int would mask cells the caller never asked to mask. - Tighten the int64-above-2**53 regression test to assert the surviving cell holds the right value, not just that it's finite.
brendancol
commented
May 28, 2026
Contributor
Author
brendancol
left a comment
There was a problem hiding this comment.
Follow-up review
Re-ran the review after the address-nits commit (1adea6f). All previous findings are resolved:
- Suggestion (dead try/except): FIXED -- the wrapper is gone; the
isinstance+np.isnancheck stands alone. - Nit (silent fractional truncation): FIXED --
_resolve_nodatanow raisesValueError("nodata=... is not representable in integer dtype ...")when a non-integer float sentinel is passed for an integer input. Addedtest_fractional_float_sentinel_on_int_input_raisesto lock it in. - Nit (loose
isfiniteassertion): FIXED -- the int64-above-2**53 regression test now asserts the surviving cell equalsnp.float32(1 << 60), so masking-to-NaN and value-swap regressions both get caught.
Full test_resample.py suite passes locally (185 tests). No remaining blockers, suggestions, or nits.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #2570.
Summary
_resolve_nodatano longer routes integer sentinels throughfloat(); it casts to the input dtype instead so the equality test in_apply_nodata_maskruns in integer space.nodata = 2**60 + 1), valid cells that share the rounded float representation (e.g.2**60) are no longer masked out.float()so the existing NaN short-circuit keeps working.Backend coverage
The fix lives in the shared masking helper that every backend goes through (numpy / cupy / dask+numpy / dask+cupy). Verified on numpy and dask+numpy locally.
Test plan
TestNodataIntegerPrecisionclass inxrspatial/tests/test_resample.pycovering:_FillValueattrTestNodatacases still passtest_resample.pysuite (184 tests) greentest_resample_coverage_2026_05_27.pyandtest_resample_signature_annot_2544.pygreen