V2 quantizer: fix IO-boundary shared clusters left in float #20291

Open: rascani wants to merge 1 commit into pytorch:main from rascani:export-D108662081

@rascani rascani commented Jun 15, 2026

Summary:
Shared-op clusters (e.g. `cat`, `view`, `reshape`) on the quantized IO boundary were silently left in float by the composable TOSA quantizer (`_TOSAQuantizerV2`), causing them to fall off the Ethos-U integer delegate onto CPU.

`SharedQspecQuantizer` propagates a qspec only from already-quantized neighbors. A cluster whose only quantized neighbors are a uint8 model input (intentionally skipped by `_skip_shared_qspec_from_io` to confine uint8 to the IO boundary) and/or an input-state placeholder with no `output_qspec` had no qspec to propagate, so it was rejected and remained in float.
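
For reviewers who want the failure mode in miniature, here is a toy model of the pre-fix behavior described above. It is only a sketch: `Neighbor`, `collect_adjacent_qspecs`, and `annotate_before_fix` are hypothetical names invented for illustration, qspecs are modeled as plain strings, and none of this is the actual ExecuTorch code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Neighbor:
    """Hypothetical stand-in for a node adjacent to a shared-op cluster."""
    name: str
    qspec: Optional[str]       # e.g. "int8", "uint8", or None if there is no qspec
    is_model_io: bool = False  # True for nodes on the model IO boundary


def collect_adjacent_qspecs(neighbors: List[Neighbor]) -> List[str]:
    """Gather the qspecs a cluster could inherit, applying the two filters
    described in the summary (toy version)."""
    qspecs = []
    for n in neighbors:
        if n.qspec is None:
            continue  # e.g. an input-state placeholder with no output qspec
        if n.is_model_io and n.qspec == "uint8":
            continue  # uint8 is confined to the IO boundary, so it is skipped
        qspecs.append(n.qspec)
    return qspecs


def annotate_before_fix(neighbors: List[Neighbor]) -> Optional[str]:
    """Return the qspec the cluster would adopt, or None if it stays in float."""
    qspecs = collect_adjacent_qspecs(neighbors)
    return qspecs[0] if qspecs else None


# A cluster next to an int8 op inherits int8.
interior = [Neighbor("conv", "int8")]
# A cluster whose only neighbors are a uint8 input and a state placeholder
# ends up with nothing to inherit.
boundary = [
    Neighbor("x", "uint8", is_model_io=True),
    Neighbor("state", None, is_model_io=True),
]
interior_result = annotate_before_fix(interior)
boundary_result = annotate_before_fix(boundary)
```

In this toy, `interior_result` is `"int8"` while `boundary_result` is `None`, which corresponds to the cluster being rejected and left in float.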

The fix adds `_is_quantized_io_boundary`, which detects annotated `placeholder`/`output` nodes that signal the cluster is on the quantized data path even when their qspec is filtered. `_get_shared_clique` now returns a `touches_quantized_io` flag alongside the usual results. When `_annotate_shared_cluster` finds an empty `adjacent_qspecs` but a boundary-touching cluster, it initiates quantization from the global config input-activation qspec instead of rejecting. `_TOSAQuantizerV2.set_global` now also propagates to `shared_qspec_quantizer.global_config` so the fallback is wired automatically.
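
The decision logic in the paragraph above can be sketched roughly as follows. This is a simplified, self-contained approximation rather than the code in this PR: `Node`, `summarize_neighbors`, and the string-valued qspecs are assumptions made for illustration, and the function names only loosely mirror `_is_quantized_io_boundary` and `_annotate_shared_cluster`.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical, simplified graph node (not torch.fx.Node)."""
    name: str
    op: str                      # "placeholder", "output", or "call_function"
    annotated: bool = False      # True if the node carries a quantization annotation
    qspec: Optional[str] = None  # None models a filtered or missing qspec


def is_quantized_io_boundary(node: Node) -> bool:
    """An annotated placeholder/output marks the quantized data path,
    even when its qspec is not usable for propagation."""
    return node.op in ("placeholder", "output") and node.annotated


def summarize_neighbors(neighbors: List[Node]) -> Tuple[List[str], bool]:
    """Return (adjacent_qspecs, touches_quantized_io) for a cluster's neighbors."""
    adjacent = [n.qspec for n in neighbors if n.qspec is not None]
    touches = any(is_quantized_io_boundary(n) for n in neighbors)
    return adjacent, touches


def annotate_shared_cluster(
    neighbors: List[Node], global_input_qspec: Optional[str]
) -> Optional[str]:
    """Pick the qspec for a shared cluster, or None to leave it in float."""
    adjacent, touches = summarize_neighbors(neighbors)
    if adjacent:
        return adjacent[0]         # normal path: inherit from a quantized neighbor
    if touches and global_input_qspec is not None:
        return global_input_qspec  # fallback: seed from the global config
    return None                    # reject: no evidence of a quantized data path


# uint8 model input whose qspec was filtered out, modeled here as qspec=None.
boundary = [Node("x", "placeholder", annotated=True)]
float_only = [Node("y", "placeholder")]

seeded = annotate_shared_cluster(boundary, global_input_qspec="int8")
rejected = annotate_shared_cluster(float_only, global_input_qspec="int8")
```

Here `seeded` is `"int8"` and `rejected` is `None`. The sketch also shows why the global config has to reach the shared-qspec quantizer: with `global_input_qspec=None` the boundary cluster would still be rejected.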

This restores the correctness fix from D107320847, which was abandoned because its other fix (parameter-operand weight misclassification) had already been resolved via the `is_weight` `PARAMETER_TARGETS` refactor.

This change was developed with assistance from Claude.

Differential Revision: D108662081

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell

@rascani rascani requested a review from digantdesai as a code owner June 15, 2026 22:37
pytorch-bot Bot commented Jun 15, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20291

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures

As of commit d405304 with merge base e257a71 (image):

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed label Jun 15, 2026
meta-codesync Bot commented Jun 15, 2026

@rascani has exported this pull request. If you are a Meta employee, you can view the originating Diff in D108662081.

@github-actions github-actions Bot added the ciflow/trunk and module: arm labels Jun 15, 2026
@rascani rascani requested a review from AdrianLundell June 15, 2026 22:38
@github-actions

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with `release notes:`. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
`@pytorchbot label "release notes: none"`

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Labels: ciflow/trunk, CLA Signed, meta-exported, module: arm
