PoC Cortex_m backend: Add support for CMSIS-NN scratch buffers #16580

mansnils · 2026-01-14T12:43:06Z

Use exir.memory.alloc for CMSIS-NN scratch buffers. This is ideal since the allocation has a TensorSpec and gets memory planned, but creates no additional operator overhead.
Use the CMSIS-NN pybind wrapper to get the correct buffer size.

Improves: #16041

Memory consumption for the conv2d_x3 unit test case, with and without the patch. Without patch:

I [executorch:arm_executor_runner.cpp:842 log_mem_status()] model_pte_program_size: 10208 bytes.
I [executorch:arm_executor_runner.cpp:843 log_mem_status()] model_pte_loaded_size: 10208 bytes.
I [executorch:arm_executor_runner.cpp:848 log_mem_status()] input_file_allocator_used: 10976 / 62914560 free: 62903584 ( used: 0 % )
I [executorch:arm_executor_runner.cpp:860 log_mem_status()] method_allocator_used: 5433 / 62914560 free: 62909127 ( used: 0 % )
I [executorch:arm_executor_runner.cpp:867 log_mem_status()] method_allocator_planned: 2560 bytes
I [executorch:arm_executor_runner.cpp:871 log_mem_status()] method_allocator_loaded: 2857 bytes
I [executorch:arm_executor_runner.cpp:875 log_mem_status()] method_allocator_input: 16 bytes
I [executorch:arm_executor_runner.cpp:876 log_mem_status()] method_allocator_executor: 0 bytes
I [executorch:arm_executor_runner.cpp:879 log_mem_status()] temp_allocator: 2097152

With patch:

I [executorch:arm_executor_runner.cpp:846 log_mem_status()] model_pte_program_size: 10336 bytes.
I [executorch:arm_executor_runner.cpp:847 log_mem_status()] model_pte_loaded_size: 10336 bytes.
I [executorch:arm_executor_runner.cpp:852 log_mem_status()] input_file_allocator_used: 11104 / 62914560 free: 62903456 ( used: 0 % )
I [executorch:arm_executor_runner.cpp:864 log_mem_status()] method_allocator_used: 6414 / 62914560 free: 62908146 ( used: 0 % )
I [executorch:arm_executor_runner.cpp:871 log_mem_status()] method_allocator_planned: 2560 bytes
I [executorch:arm_executor_runner.cpp:875 log_mem_status()] method_allocator_loaded: 3838 bytes
I [executorch:arm_executor_runner.cpp:879 log_mem_status()] method_allocator_input: 16 bytes
I [executorch:arm_executor_runner.cpp:880 log_mem_status()] method_allocator_executor: 0 bytes
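
The deltas between the two dumps can be computed directly from the logged values. A minimal sketch, with the numbers copied from the logs above; the dictionary and function names are illustrative only and not part of the patch. temp_allocator is left out because it only appears in the first dump.

```python
# Values in bytes, copied from the two log_mem_status() dumps above.
without_patch = {
    "model_pte_program_size": 10208,
    "input_file_allocator_used": 10976,
    "method_allocator_used": 5433,
    "method_allocator_planned": 2560,
    "method_allocator_loaded": 2857,
}
with_patch = {
    "model_pte_program_size": 10336,
    "input_file_allocator_used": 11104,
    "method_allocator_used": 6414,
    "method_allocator_planned": 2560,
    "method_allocator_loaded": 3838,
}


def deltas(before, after):
    """Return after - before for every counter present in `before`."""
    return {key: after[key] - before[key] for key in before}


delta = deltas(without_patch, with_patch)
for name, value in delta.items():
    print(f"{name}: {value:+d} bytes")
```

The planned size is unchanged, while the program size and the loaded size are the two counters that grow.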

Summary:

The big temp_allocator used for scratch is removed by the patch. It is no longer used except for Linear/FC, but this is a PoC/draft PR anyway.
method_allocator_planned is 2560 bytes in both cases => This means there is 100% reuse of the scratch buffers.
The model size increases by 128 bytes and method_allocator_loaded by 981 bytes => This can be explained by the increased number of planned objects that describe scratch reuse (more metadata). This metadata should stay the same size if the scratch buffer sizes increase, so I think this is acceptable.
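
Why planned scratch buffers can leave the planned size unchanged can be illustrated with a toy lifetime-based planner. This is only a sketch under stated assumptions: it is a simple greedy first-fit planner, not ExecuTorch's actual memory planning algorithm, and the op schedule, buffer names and sizes are hypothetical, chosen so that each scratch buffer fits into slack left by the largest activations. If a scratch buffer were live at the peak step, the arena would grow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Buffer:
    name: str
    size: int       # bytes
    first_use: int  # index of the first op that touches the buffer
    last_use: int   # index of the last op that touches the buffer


def overlaps(a, b):
    """True if the two buffers are live during at least one common op."""
    return a.first_use <= b.last_use and b.first_use <= a.last_use


def plan(buffers):
    """Greedy first-fit placement, largest buffers first.

    Returns ({name: offset}, arena_size).
    """
    placed = []  # list of (buffer, offset)
    for buf in sorted(buffers, key=lambda b: -b.size):
        # Address ranges taken by buffers that are live at the same time.
        busy = sorted(
            (off, off + other.size)
            for other, off in placed
            if overlaps(buf, other)
        )
        offset = 0
        for start, end in busy:
            if offset + buf.size <= start:
                break  # the gap before this range is large enough
            offset = max(offset, end)
        placed.append((buf, offset))
    offsets = {b.name: off for b, off in placed}
    arena = max((off + b.size for b, off in placed), default=0)
    return offsets, arena


# Hypothetical schedule: op 0 is a pre-processing op, ops 1..3 are convolutions.
activations = [
    Buffer("x", 1024, 0, 0),
    Buffer("a0", 1024, 0, 1),
    Buffer("a1", 512, 1, 2),
    Buffer("a2", 512, 2, 3),
    Buffer("a3", 512, 3, 3),
]
# Each scratch buffer is live only during its own convolution.
scratch = [
    Buffer("scratch_conv1", 256, 1, 1),
    Buffer("scratch_conv2", 384, 2, 2),
    Buffer("scratch_conv3", 256, 3, 3),
]

_, arena_plain = plan(activations)
offsets_all, arena_all = plan(activations + scratch)
arena_dedicated = arena_plain + sum(s.size for s in scratch)

print("arena without scratch:", arena_plain)
print("arena with planned scratch:", arena_all)
print("arena with dedicated scratch memory:", arena_dedicated)
```

With these hypothetical numbers both plans need 2048 bytes, whereas giving every scratch buffer dedicated memory would need 2944 bytes.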

cc @freddan80 @per @zingo @oscarandersson8218 @digantdesai

Use exir.memory.alloc for CMSIS-NN scratch buffers, which is ideal since it has a TensorSpec and gets memory planned but creates no additional operator overhead. Use CMSIS-NN pybind wrapper to get correct buffer size. Change-Id: Ia7ec8eda87833888a0639b480e531fd17818298a

pytorch-bot · 2026-01-14T12:43:21Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16580

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

ROCm runners linux.rocm.gpu.gfx942.1.b failing

❌ 2 New Failures, 1 Unrelated Failure

As of commit c197701 with merge base 8e8d97e:

NEW FAILURES - The following jobs have failed:

Lint / lintrunner / linux-job (gh)
>>> Lint for backends/cortex_m/passes/convert_to_cortex_m_pass.py:
pull / unittest-arm-backend-with-no-deps (test_run_tosa) / linux-job (gh)
RuntimeError: Command docker exec -t c982d77b54332529a967f59bdf7011371dff5ef95487c26a93f37f58c5a468ea /exec failed with exit code 1

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

pull / android / run-emulator (gh) (#16137)
Timeout waiting for emulator to boot.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2026-01-14T12:44:01Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

mansnils added the partner: arm For backend delegation, kernels, demo, etc. from the 3rd-party partner, Arm label Jan 14, 2026

meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jan 14, 2026
