Test build perf v2 #835

Draft
mjp41 wants to merge 30 commits into microsoft:main from mjp41:test-build-perf-v2

mjp41 (Member) commented Mar 28, 2026

No description provided.

mjp41 added 19 commits March 27, 2026 11:49
Pure file move. allocconfig.h contains only compile-time constants with
dependencies only on bits.h and mitigations.h (both in ds_core/).
Moving it enables sizeclassstatic.h (next step) to live in ds_core/
without pulling in ds/.

Changes:
- git mv ds/allocconfig.h -> ds_core/allocconfig.h
- Add #include bits.h and mitigations.h to moved file
- Update include paths in ds/ds.h, ds/mpmcstack.h
- Add allocconfig.h to ds_core/ds_core.h umbrella

Create ds_core/sizeclassstatic.h with compile-time sizeclass functions
extracted from mem/sizeclasstable.h:
- smallsizeclass_t (size_t typedef)
- size_to_sizeclass_const()
- NUM_SMALL_SIZECLASSES
- sizeclass_to_size_const()
- is_small_sizeclass()

Move sizeclasstable.h from mem/ to ds/ (depends on PAL, not allocator).
Remove redundant sizeclasstable.h includes from mem/ files.

Convert smallsizeclass_t from a size_t typedef to a struct with:
- Explicit construction from size_t (no implicit conversion from size_t)
- Implicit conversion to size_t (for array indexing, comparisons)
- Pre/post increment operators

Provides type safety: alloc(size_t) (byte size) cannot be confused
with alloc(smallsizeclass_t) (sizeclass index).

Update all iteration sites to use explicit construction.
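
The struct described above could look roughly like the following. This is a minimal sketch, not the code from the commit: the member name `raw` and the helpers `is_sizeclass_overload`, `count_sizeclasses` and `demo` are hypothetical, and C++14 or later is assumed for the `constexpr` increment operators.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the strongly typed sizeclass index.
struct smallsizeclass_t
{
  std::size_t raw{0};

  constexpr smallsizeclass_t() = default;

  // Explicit: a plain size_t never silently becomes a sizeclass.
  explicit constexpr smallsizeclass_t(std::size_t v) : raw(v) {}

  // Implicit: a sizeclass can still index arrays and be compared.
  constexpr operator std::size_t() const
  {
    return raw;
  }

  // Pre-increment.
  constexpr smallsizeclass_t& operator++()
  {
    ++raw;
    return *this;
  }

  // Post-increment.
  constexpr smallsizeclass_t operator++(int)
  {
    smallsizeclass_t old = *this;
    ++raw;
    return old;
  }
};

// With a typedef these two overloads would collide; with the struct
// they are distinct and overload resolution picks by intent.
inline bool is_sizeclass_overload(std::size_t)
{
  return false;
}

inline bool is_sizeclass_overload(smallsizeclass_t)
{
  return true;
}

// An iteration site using explicit construction.
inline std::size_t count_sizeclasses(std::size_t n)
{
  std::size_t visited = 0;
  for (smallsizeclass_t sc(0); sc < n; ++sc)
    visited++;
  return visited;
}

inline void demo()
{
  int table[8] = {};
  smallsizeclass_t sc(3);
  table[sc] = 1; // implicit conversion to size_t for indexing
  assert(table[3] == 1);
  // smallsizeclass_t bad = 3; // would not compile: constructor is explicit
}
```

Passing a bare integer selects the byte-size overload, because no implicit conversion to the struct exists.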

Add alloc(smallsizeclass_t) overload to corealloc and globalalloc that
skips the dynamic sizeclass lookup. Refactor small_alloc to accept
(smallsizeclass_t, size_t) and keep the single-arg (size_t) version
as a forwarding overload.

Update alloc<size>() to use compile-time sizeclass when the size is
a known small sizeclass.

Add libc::malloc_small(smallsizeclass_t) and
libc::malloc_small_zero(smallsizeclass_t) as non-template entry points.
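
The forwarding arrangement described above can be sketched as follows. Everything here is a toy under stated assumptions: the 16-byte-step sizeclass mapping, the `Allocation` record, and the `dynamic_lookups` counter are hypothetical and exist only to show that the sizeclass overload never performs the dynamic lookup.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical strongly typed sizeclass index.
struct smallsizeclass_t
{
  std::size_t raw;
};

// Toy mapping (assumption): sizeclass i holds objects up to 16 * (i + 1) bytes.
constexpr std::size_t toy_sizeclass_to_size(smallsizeclass_t sc)
{
  return 16 * (sc.raw + 1);
}

constexpr smallsizeclass_t toy_size_to_sizeclass(std::size_t size)
{
  return smallsizeclass_t{size == 0 ? 0 : (size + 15) / 16 - 1};
}

// Counts how often the dynamic size -> sizeclass lookup runs.
static std::size_t dynamic_lookups = 0;

// Stand-in for a returned allocation; records what was requested.
struct Allocation
{
  smallsizeclass_t sc;
  std::size_t size;
};

// Core path: the sizeclass is already known.
inline Allocation small_alloc(smallsizeclass_t sc, std::size_t size)
{
  return Allocation{sc, size};
}

// Single-argument version kept as a forwarding overload.
inline Allocation small_alloc(std::size_t size)
{
  ++dynamic_lookups;
  return small_alloc(toy_size_to_sizeclass(size), size);
}

// Entry points: by byte size, or directly by sizeclass.
inline Allocation alloc(std::size_t size)
{
  return small_alloc(size);
}

inline Allocation alloc(smallsizeclass_t sc)
{
  return small_alloc(sc, toy_sizeclass_to_size(sc));
}

inline void demo()
{
  Allocation by_class = alloc(smallsizeclass_t{0});
  assert(by_class.size == 16);
}
```

A caller that already knows its sizeclass at compile time pays nothing for the lookup, while existing byte-size callers keep working unchanged.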

These are general-purpose data structures, not allocator-specific.
Moving them reduces the dependency surface of mem/.

Mark dealloc(), debug_teardown(), and all libc namespace functions with
SNMALLOC_USED_FUNCTION so they emit linkable symbols when compiled into
a static library. No behavioural change.
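
As an illustration of the idea, a "used" marker on an inline function might be sketched like this. The macro `SKETCH_USED_FUNCTION` and the function `add_one` are hypothetical; the actual definition of SNMALLOC_USED_FUNCTION is not shown in this thread, and the empty fallback for other compilers is an assumption.

```cpp
#include <cassert>

// Hypothetical macro: on GCC/Clang, ask the compiler to emit the
// function even when nothing in this translation unit references it.
#if defined(__GNUC__) || defined(__clang__)
#  define SKETCH_USED_FUNCTION __attribute__((used))
#else
#  define SKETCH_USED_FUNCTION
#endif

namespace sketch_libc
{
  // Without the attribute, an unreferenced inline function may never
  // be emitted, so the archive would carry no symbol to link against.
  SKETCH_USED_FUNCTION inline int add_one(int x)
  {
    return x + 1;
  }
}

inline void demo()
{
  assert(sketch_libc::add_one(41) == 42);
}
```

Behaviour at the call site is identical with or without the attribute; only symbol emission changes.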

Add __malloc_start_pointer() and __malloc_last_byte_pointer() to libc.h,
alongside the existing __malloc_end_pointer(). These provide the
external_pointer functionality through the libc API surface.

Add snmalloc_testlib.h (thin header with API declarations) and
snmalloc_testlib.cc (single TU compiling full allocator + wrappers).

CMake: build_test_library() creates a static library per flavour.
All test executables link against it (ODR-safe). No tests converted
yet — this just adds the infrastructure.

Convert simple alloc/dealloc tests to use snmalloc_testlib.h instead
of snmalloc.h:
- func/first_operation
- func/teardown
- func/multi_atexit
- func/multi_threadatexit

Replace MAX_SMALL_SIZECLASS_BITS with max_small_sizeclass_bits().
Replace alloc<Zero> with alloc<ZeroMem::YesZero>.
Add pal/pal.h include to testlib header for report_fatal_error.

Convert perf tests that use only the public API:
- perf/startup
- perf/contention
- perf/large_alloc
- perf/low_memory

Replace DefaultPal::tick() with pal_tick(), Aal::pause() with pal_pause().

Convert tests that use only the public API to snmalloc_testlib.h:
- func/bits, func/memory, func/memory_usage, func/pool
- func/protect_fork, func/redblack, func/statistics
- perf/external_pointer, perf/lotsofthreads, perf/post_teardown
- perf/singlethread

Replace DefaultPal/Aal calls with pal_*() wrappers.
Replace external_pointer<> with libc __malloc_*_pointer().
Replace alloc<Zero/Uninit> with alloc<ZeroMem::YesZero/NoZero>.
Fix include order in ds/ds.h for pool.h/pooled.h dependencies.

Convert tests that need allocator internals (sizeclasses, pagemap, etc.)
to use snmalloc_core.h alongside snmalloc_testlib.h:
- func/malloc (replace our_* with testlib_* prefix)
- func/pagemap
- func/release-rounding
- func/sizeclass

Move report_fatal_error() and message() template definitions from
pal/pal.h into ds_core/helpers.h. They now call error() and
message_impl() which are forward-declared non-template functions
in ds_core/defines.h, with implementations in pal/pal.h that
delegate to DefaultPal.

This breaks the dependency of ds_core/ code on pal/: any code
that includes ds_core/helpers.h can call report_fatal_error()
without needing PAL headers. The linker resolves error() and
message_impl() from the translation unit that includes pal.h.

Update testlib.h to include pal/pal_consts.h instead of pal/pal.h.
Fix memory.cc to use ds_aal/ds_aal.h for address_cast.
Fix protect_fork.cc to include ds_aal/prevent_fork.h explicitly.
Revert low-memory.cc to snmalloc.h (needs PalNotificationObject).
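
The decoupling of report_fatal_error() and message() from the PAL can be sketched in a single file. This is an illustration under assumptions: the `sketch` namespace and `last_output` are hypothetical, `std::ostringstream` is used purely for brevity (an allocator's own error path would not want to allocate), and `error()` here records the text instead of aborting so that the example can be checked. C++17 is assumed for the fold expression.

```cpp
#include <cassert>
#include <sstream>
#include <string>

namespace sketch
{
  // Role of ds_core/defines.h: non-template hooks, declared only.
  void error(const char* str);
  void message_impl(const char* str);

  // Role of ds_core/helpers.h: templates that format their arguments
  // and call the hooks. No PAL header is needed to compile these.
  template<typename... Args>
  void message(Args&&... args)
  {
    std::ostringstream out;
    (out << ... << args);
    message_impl(out.str().c_str());
  }

  template<typename... Args>
  void report_fatal_error(Args&&... args)
  {
    std::ostringstream out;
    (out << ... << args);
    error(out.str().c_str());
  }

  // Role of pal/pal.h: the one translation unit that includes the PAL
  // supplies the definitions. Here they only record the text.
  static std::string last_output;

  void error(const char* str)
  {
    last_output = std::string("fatal: ") + str;
  }

  void message_impl(const char* str)
  {
    last_output = str;
  }
}

inline void demo()
{
  sketch::message("slab count = ", 3);
  assert(sketch::last_output == "slab count = 3");
}
```

In a multi-file build the last section would live in a different translation unit, and the linker would connect the templates to it.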

The header doesn't use errno itself. Tests that need it can
include it directly.

Replace snmalloc_testlib.h with ds_core/helpers.h + ds_aal/prevent_fork.h.
This test only uses PreventFork and message(), no allocator API.

These tests had indented #include directives under #ifdef RUN_TEST,
which the earlier sed-based conversion missed. Convert them now.

Create ds_core/sizeclassconfig.h with the mitigation-independent
sizeclass constants (INTERMEDIATE_BITS, MIN_ALLOC_STEP_SIZE, etc.).
This file depends only on bits.h — no mitigations.h.

Update sizeclassstatic.h to include sizeclassconfig.h instead of
allocconfig.h. allocconfig.h now includes sizeclassconfig.h and
keeps only the mitigation-dependent constants (MIN_OBJECT_COUNT,
DEALLOC_BATCH_*, MAX_SLAB_SPAN_*, etc.).

This makes the testlib header chain fully mitigation-independent:
testlib.h -> sizeclassstatic.h -> sizeclassconfig.h -> bits.h

Move mitigations.h, allocconfig.h, and cheri.h from ds_core/ to a new
mitigations/ directory. These files depend on SNMALLOC_CHECK_CLIENT
compile-time flags and don't belong in ds_core/ (which should be
mitigation-independent).

New include hierarchy:
  ds_core/ → aal/ → ds_aal/ → mitigations/ → pal/ → ds/ → mem/

ds_core/ is now fully mitigation-independent. The testlib header
(which includes only ds_core/ headers) produces identical output
regardless of SNMALLOC_CHECK_CLIENT, enabling tests to be compiled
once and linked against both fast and check testlib variants.

Tests that only include snmalloc_testlib.h are mitigation-independent:
their compiled output is identical regardless of SNMALLOC_CHECK_CLIENT.
Compile these into a shared OBJECT library and link against both the
fast and check testlib variants, saving 15 redundant compilations.

Build targets: 171 (down from 186). Test count unchanged: 56 func.

@SchrodingerZhu (Collaborator)

We can also try modules if speed is a concern.

mjp41 (Member, Author) commented Mar 28, 2026

> We can also try modules if speed is a concern.

Mostly there is a lot of redundant work in the build, and I wanted to experiment with improving that.

mjp41 and others added 9 commits March 28, 2026 22:28
Use early return instead of nested if to make MSVC's flow analysis
see that 'last' is always initialized before use in restore().
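
The shape of that change can be illustrated with a generic example. This is not the restore() code from the PR: `Node` and `last_value` are hypothetical, and only the pattern carries over, namely exiting early so that `last` is initialised at its declaration on the single remaining path.

```cpp
#include <cassert>

// Hypothetical singly linked list node.
struct Node
{
  Node* next;
  int value;
};

// Returns the value of the final node, or -1 for an empty list.
inline int last_value(const Node* head)
{
  // Early return: nothing below runs without at least one node.
  if (head == nullptr)
    return -1;

  // 'last' is initialised where it is declared, so no path reaches a
  // use of it before a write. The nested-if form would declare it
  // first and assign it only inside the branch.
  const Node* last = head;
  while (last->next != nullptr)
    last = last->next;

  return last->value;
}

inline void demo()
{
  Node only{nullptr, 7};
  assert(last_value(&only) == 7);
}
```

Both forms compute the same result; the early-return form simply makes the initialisation visible to a conservative flow analysis.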

The testlib static library was missing -fsanitize and
SNMALLOC_THREAD_SANITIZER_ENABLED when sanitizers are enabled,
causing TSan false positives in tests linked against it.

MSVC ARM64EC generates exit-thunks that require strong definitions of
called functions, not just address-taken references. Replace the
volatile pointer array with a cold function that calls through to
dealloc/debug_teardown, ensuring the compiler emits full definitions
of these inline functions in the testlib TU.

The previous fix replaced address-taking with call-through, which
broke x64 MSVC (address-taking was needed for COMDAT emission).
ARM64EC needs call-through for exit-thunk generation. Do both.
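
Doing both at once might look like the sketch below. The names `inline_api_a`, `inline_api_b`, `force_emit_table`, `force_emit_calls` and `SKETCH_NOINLINE` are hypothetical, and the two toy functions stand in for the real inline entry points; whether a given toolchain needs either mechanism is exactly the platform-specific question discussed above.

```cpp
#include <cassert>

// Toy inline functions standing in for header-defined API entry points.
inline int inline_api_a(int x)
{
  return x + 1;
}

inline int inline_api_b(int x)
{
  return x * 2;
}

// 1. Address-taking: storing the addresses in a volatile table means
//    the compiler must materialise an out-of-line body for each one.
using fn_t = int (*)(int);
fn_t volatile force_emit_table[] = {&inline_api_a, &inline_api_b};

#if defined(_MSC_VER)
#  define SKETCH_NOINLINE __declspec(noinline)
#else
#  define SKETCH_NOINLINE __attribute__((noinline))
#endif

// 2. Call-through: a non-inlined helper that really calls each
//    function, for toolchains where an address alone is not enough.
SKETCH_NOINLINE int force_emit_calls(int seed)
{
  return inline_api_a(seed) + inline_api_b(seed);
}

inline void demo()
{
  assert(force_emit_table[0](3) == 4);
  assert(force_emit_calls(3) == 10);
}
```

Neither helper needs to be called at run time for its emission effect; the demo calls them only to show that they behave like the originals.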

The new mitigations/ directory was missing from the install(DIRECTORY)
rules, causing vcpkg consumers to fail with a missing-header error.

Same exit-thunk issue as dealloc/debug_teardown: ARM64EC needs
call-through in addition to address-taking for all inline functions
that tests may call across the testlib boundary.
mjp41 added 2 commits March 29, 2026 16:38
…spec(noinline)

Instead of complex per-function workarounds for ARM64EC exit-thunks,
make SNMALLOC_USED_FUNCTION actually work on MSVC by defining it as
__declspec(noinline). This prevents the compiler from inlining the
functions away, ensuring standalone definitions exist in the testlib
archive for the linker to resolve.

Remove the #ifdef _MSC_VER block with duplicated function bodies and
the _force_emit_calls/_force_emit_libc_calls call-through helpers.
Keep the address-taking arrays as belt-and-suspenders.

Revert SNMALLOC_USED_FUNCTION change (caused C4141 warning conflict
with SNMALLOC_FAST_PATH_INLINE). Instead, use NOINLINE wrapper
functions inside testlib.cc under #ifdef _MSC_VER. Each wrapper
calls the corresponding inline function, forcing MSVC to emit a
standalone definition that ARM64EC exit-thunks can reference.
GCC/Clang path unchanged (uses __attribute__((used)) via address-taking).