V2 quantizer: fix IO-boundary shared clusters left in float #20291
Summary:

Shared-op clusters (e.g. `cat`, `view`, `reshape`) on the quantized IO boundary were silently left in float by the composable TOSA quantizer (`_TOSAQuantizerV2`), causing them to fall off the Ethos-U integer delegate onto CPU.

`SharedQspecQuantizer` propagates a qspec only from already-quantized neighbors. A cluster whose only quantized neighbors are a uint8 model input (intentionally skipped by `_skip_shared_qspec_from_io` to confine uint8 to the IO boundary) and/or an input-state placeholder with no `output_qspec` had no qspec to propagate, so it was rejected and remained in float.

The fix adds `_is_quantized_io_boundary`, which detects annotated `placeholder`/`output` nodes that signal the cluster is on the quantized data path even when their qspec is filtered. `_get_shared_clique` now returns a `touches_quantized_io` flag alongside the usual results. When `_annotate_shared_cluster` finds an empty `adjacent_qspecs` but a boundary-touching cluster, it initiates quantization from the global config input-activation qspec instead of rejecting. `_TOSAQuantizerV2.set_global` now also propagates to `shared_qspec_quantizer.global_config` so the fallback is wired automatically.

This restores the correctness fix from D107320847, which was abandoned because its other fix (parameter-operand weight misclassification) had already been resolved via the `is_weight` `PARAMETER_TARGETS` refactor.

This change was developed with assistance from Claude.

Differential Revision: D108662081
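For reviewers, the fallback decision described above can be sketched roughly as follows. This is a simplified, standalone illustration and not the code in this PR: `QSpec`, `Node`, `usable_qspec`, and `choose_cluster_qspec` are hypothetical stand-ins, the `is_uint8_io` field is an assumed shorthand for what `_skip_shared_qspec_from_io` filters, and the real quantizer works on `torch.fx` nodes rather than these dataclasses.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class QSpec:
    """Hypothetical stand-in for a quantization spec."""
    dtype: str


@dataclass
class Node:
    """Hypothetical stand-in for a graph node next to a shared-op cluster."""
    name: str
    op: str  # "placeholder", "output", or "call_function"
    annotated: bool = False
    output_qspec: Optional[QSpec] = None
    is_uint8_io: bool = False  # assumed marker for a uint8 model input/output


def is_quantized_io_boundary(node: Node) -> bool:
    # An annotated placeholder/output signals the quantized data path,
    # even if its qspec is filtered out below.
    return node.op in ("placeholder", "output") and node.annotated


def usable_qspec(node: Node) -> Optional[QSpec]:
    # uint8 IO qspecs are skipped so uint8 stays confined to the IO boundary;
    # placeholders without an output_qspec contribute nothing either.
    if node.is_uint8_io:
        return None
    return node.output_qspec


def choose_cluster_qspec(
    neighbors: List[Node], global_input_qspec: Optional[QSpec]
) -> Optional[QSpec]:
    adjacent_qspecs = [q for n in neighbors if (q := usable_qspec(n)) is not None]
    touches_quantized_io = any(is_quantized_io_boundary(n) for n in neighbors)
    if adjacent_qspecs:
        return adjacent_qspecs[0]  # usual path: propagate from a neighbor
    if touches_quantized_io and global_input_qspec is not None:
        return global_input_qspec  # new path: start from the global config
    return None  # cluster is rejected and stays in float
```

In this sketch, a cluster whose only neighbor is an annotated uint8 input gets the global input-activation qspec instead of `None`, which mirrors the behavior change. If no global config is set, the cluster is still rejected, which is why wiring `set_global` through to the shared-qspec quantizer matters.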
Dr. CI: ❌ 3 new failures as of commit d405304 with merge base e257a71. See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20291
@rascani has exported this pull request. If you are a Meta employee, you can view the originating Diff in D108662081.
cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell