Fix audio buffer memory leak: use in-place del instead of slice-copy #228

Open

aliev wants to merge 4 commits into main from fix/audio-track-buffer-leak

Conversation

@aliev
Member

@aliev aliev commented Mar 24, 2026

Why

pymalloc allocates memory for small objects in large blocks (1MB arenas) to reduce syscalls. An arena can only be returned to the OS when all objects inside it are dead.

AudioStreamTrack.recv() runs every 20ms (50x/sec) and slices the buffer:

self._buffer = self._buffer[self._bytes_per_frame:]
  1. A new bytearray is allocated (new memory from the arena)
  2. Remaining data is copied into it
  3. The old bytearray's refcount drops to 0 and is freed

But "freed" means freed inside pymalloc, not returned to the OS. The old object lived in an arena alongside other objects. If any of those neighbors are still alive, the entire arena (1MB) stays reserved.

At 50 allocations/sec, arenas become fragmented: each one has a mix of live and dead blocks, so none of them can be fully released. RSS grows with every call and never comes back.

Arena (1MB)
├── Pool 1: [free][free][free][free]     <- all free
├── Pool 2: [free][LIVE][free][free]     <- 1 live object
├── Pool 3: [free][free][free][free]     <- all free
└── Pool 4: [free][free][LIVE][free]     <- 1 live object

Result: 64 bytes of live data, 1MB stuck in RSS.
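The churn rate is easy to see in isolation. A minimal standalone sketch (not the library code) of the old slice pattern, using the PR's numbers for 48kHz mono s16, where one 20ms frame is 960 samples x 2 bytes = 1920 bytes:

```python
# Sketch of the old slice-copy pattern: one second of 48kHz mono s16
# audio, consumed in 20ms frames (960 samples * 2 bytes = 1920 bytes).
buf = bytearray(48_000 * 2)  # 1 second of audio
bytes_per_frame = 1920       # one 20ms frame

slice_copies = 0
while len(buf) >= bytes_per_frame:
    buf = buf[bytes_per_frame:]  # allocates a brand-new bytearray each pass
    slice_copies += 1

print(slice_copies)  # 50 short-lived bytearrays for one second of audio
```

Each of those 50 throwaway bytearrays lands in whatever arena pymalloc has open at the time, which is exactly the interleaving that keeps arenas pinned.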

The fix uses in-place deletion:

del self._buffer[:self._bytes_per_frame]

This calls memmove to shift data within the same buffer. No new object, no malloc, no GC, no arena fragmentation.
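A quick interpreter check (independent of the library) confirms the identity difference between the two approaches:

```python
# Slice-copy: produces a different bytearray object every time.
old = bytearray(b"x" * 3840)
sliced = old[1920:]
assert sliced is not old          # new object, new allocation

# In-place deletion: same object, remaining bytes shifted down via memmove.
buf = bytearray(b"x" * 3840)
before = buf                      # keep a real reference for the identity check
del buf[:1920]
assert buf is before              # still the same bytearray
assert len(buf) == 1920
```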

Changes

  • Replace self._buffer = self._buffer[n:] with del self._buffer[:n] in two places (lines 133 and 195)

Memray profiling results (single call, 48kHz mono s16)

Before: audio_track.py:195 (recv) -> 900MB  (#1 allocator by size)
After:  audio_track.py            -> not in top 5

Summary by CodeRabbit

  • Refactor

    • Improved audio buffer handling to avoid reallocating the underlying buffer when consuming frames or trimming excess data, reducing memory churn and improving runtime efficiency.
  • Tests

    • Added automated tests verifying the internal audio buffer object identity and correct byte-length after data consumption and overflow trimming to prevent regressions.

bytearray slice assignment (self._buffer = self._buffer[n:]) creates
a new bytearray every 20ms. At 48kHz mono s16, that's 50 allocations
per second, each copying the remaining buffer. pymalloc retains these
in arena pools and never returns the memory to the OS.

del self._buffer[:n] removes data in-place without allocating a new
object. Memray profiling showed this single line accounted for 900MB
of allocations per call.
@coderabbitai
Copy link

coderabbitai bot commented Mar 24, 2026

No actionable comments were generated in the recent review. 🎉


📥 Commits

Reviewing files that changed from the base of the PR and between 1ba0b9d and b33a7a3.

📒 Files selected for processing (1)
  • tests/test_audio_stream_track.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/test_audio_stream_track.py

📝 Walkthrough

Walkthrough

Replaces buffer re-slicing with in-place deletion on the audio track's bytearray (using del) in write and recv, and adds async tests that assert the track's _buffer object identity is preserved after recv() and overflow trimming.

Changes

Audio Buffer Optimization (getstream/video/rtc/audio_track.py):
Replaced buffer re-slicing (self._buffer = self._buffer[n:]) with in-place deletion (del self._buffer[:n]) in write and recv to avoid creating a new bytearray.

Tests, buffer identity & overflow (tests/test_audio_stream_track.py):
Added async tests that capture id(track._buffer), call recv() and write data that triggers overflow trimming, and assert that the _buffer identity and expected remaining length are unchanged after the operations.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I nibble at the bytes with care,
I trim the front but leave it there.
No new array, the buffer stays,
I hop along in tidy ways.
🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)

Description Check: ✅ Passed. Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check: ✅ Passed. The title clearly and concisely describes the main change: replacing slice-copy buffer operations with in-place deletion to fix a memory leak in the audio buffer.
Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@aliev aliev marked this pull request as ready for review March 24, 2026 19:31
@aliev aliev requested a review from dangusev March 24, 2026 20:02
Assert that recv() and overflow trim modify the same bytearray
object rather than creating a new one via slice-copy.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/test_audio_stream_track.py (1)

306-355: Use fixtures for AudioStreamTrack setup in these new tests.

Both new tests duplicate track setup and direct object construction. Consider a fixture for the mono/s16 track variants used here.

As per coding guidelines, "Use fixtures to inject objects in tests; test client fixtures can use the Stream API client".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_audio_stream_track.py` around lines 306 - 355, The two tests
test_recv_does_not_reallocate_buffer and
test_buffer_overflow_does_not_reallocate duplicate AudioStreamTrack
construction; extract a pytest fixture (e.g., mono_s16_track) that yields an
AudioStreamTrack configured for mono s16 (sample_rate=48000, channels=1,
format="s16"), and update test_recv_does_not_reallocate_buffer to accept that
fixture instead of constructing track locally; for
test_buffer_overflow_does_not_reallocate add a separate fixture or a
parameterized fixture (e.g., mono_s16_track_with_buffer_size or use request
param) to supply audio_buffer_size_ms=100 while keeping the same fixture
pattern, then remove direct constructions and use the fixtures in both tests.
Ensure fixture names match the test parameters and yield a ready-to-use track
for write/recv operations.


📥 Commits

Reviewing files that changed from the base of the PR and between 7262262 and 4c2969d.

📒 Files selected for processing (1)
  • tests/test_audio_stream_track.py

Comment on lines +330 to +358
```python
async def test_buffer_overflow_does_not_reallocate(self):
    """Test that buffer overflow trims in-place without creating a new buffer object."""
    track = AudioStreamTrack(
        sample_rate=48000, channels=1, format="s16", audio_buffer_size_ms=100
    )

    # Write 50ms of data first to get a buffer reference
    samples_50ms = np.zeros(2400, dtype=np.int16)
    pcm = PcmData(
        samples=samples_50ms,
        sample_rate=48000,
        format=AudioFormat.S16,
        channels=1,
    )
    await track.write(pcm)
    buffer_id = id(track._buffer)

    # Write 200ms of data (exceeds 100ms limit, triggers overflow trim)
    samples_200ms = np.zeros(9600, dtype=np.int16)
    pcm_large = PcmData(
        samples=samples_200ms,
        sample_rate=48000,
        format=AudioFormat.S16,
        channels=1,
    )
    await track.write(pcm_large)

    assert id(track._buffer) == buffer_id, "overflow trim should modify buffer in-place, not create a new one"
```


⚠️ Potential issue | 🟡 Minor

Assert trimmed length to prove overflow path executed.

At Line 357, identity-only validation can pass even if overflow trimming fails but _buffer is still the same object. Add a length assertion for the 100ms cap.

Suggested test hardening

```diff
         await track.write(pcm_large)

         assert id(track._buffer) == buffer_id, "overflow trim should modify buffer in-place, not create a new one"
+        expected_max_bytes = int(0.1 * 48000) * 2  # 100ms * sample_rate * bytes_per_sample
+        assert len(track._buffer) == expected_max_bytes, "buffer should be trimmed to configured max size"
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_audio_stream_track.py` around lines 330 - 358, The test
test_buffer_overflow_does_not_reallocate only checks identity of track._buffer,
which can pass even if trimming didn't occur; after writing pcm_large, also
assert the buffer length equals the 100ms cap (compute as sample_rate * channels
* audio_buffer_size_ms / 1000) to prove the overflow path executed — e.g., for
sample_rate=48000, channels=1, audio_buffer_size_ms=100 the expected length is
4800 samples; add this length assertion immediately after the second await
track.write(pcm_large) alongside the existing id(track._buffer) check.

id() can be reused after deallocation, and `obj is int` is always
False. Save a reference before the operation and compare with `is`
after to correctly verify the buffer is the same object.
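The pitfall this commit message describes can be sketched with a plain bytearray (no library code involved):

```python
buf = bytearray(b"0123456789")

# Fragile: an id() value may be reused by a new object once the original
# is freed, and comparing an object to a saved int with `is` is always False.
saved_id = id(buf)

# Robust: hold a real reference before the operation, then compare with `is`.
before = buf
del buf[:4]           # in-place trim
assert buf is before  # identity verified against the reference, not the id
assert bytes(buf) == b"456789"
```

Keeping `before` alive also guarantees the original object cannot be deallocated mid-test, which is what makes the identity check meaningful.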
