
Commit 8782f26

Add benchmarks framework

Add performance benchmarking infrastructure for fromager, including GitHub Actions workflows for nightly and on-demand runs.

Signed-off-by: Michael Yochpaz <myochpaz@redhat.com>

1 parent f6e78fb

14 files changed: 3641 additions & 0 deletions

.github/workflows/benchmarks-nightly.yml

Lines changed: 84 additions & 0 deletions

```yaml
name: Nightly Integration Benchmarks

on:
  schedule:
    - cron: "0 2 * * *" # 2 AM UTC daily
  workflow_dispatch: # Allow manual trigger

jobs:
  integration-benchmark:
    runs-on: ubuntu-latest

    services:
      local-pypi:
        image: pypiserver/pypiserver:latest
        ports:
          - 8080:8080

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install uv
        uses: astral-sh/setup-uv@v4

      - name: Install dependencies
        run: uv sync --extra benchmark

      - name: Download packages for local PyPI
        run: |
          mkdir -p packages
          uv pip download -r benchmarks/requirements/packages.txt -d packages
          # Note: In production, these would be copied to the pypiserver volume

      - name: Run integration benchmarks with CodSpeed
        uses: CodSpeedHQ/action@v3
        env:
          CODSPEED_VALGRIND_ARGS: "--trace-children=yes --trace-children-skip=/bin/sh,/usr/bin/git"
          UV_INDEX_URL: "http://localhost:8080/simple"
          UV_NO_PROGRESS: "1"
        with:
          token: ${{ secrets.CODSPEED_TOKEN }}
          run: uv run pytest benchmarks/ --codspeed -m "integration"

      - name: Generate benchmark JSON (fallback)
        if: always()
        env:
          UV_INDEX_URL: "http://localhost:8080/simple"
          UV_NO_PROGRESS: "1"
        run: |
          uv run pytest benchmarks/ \
            --benchmark-only \
            --benchmark-json=integration-benchmark-results.json \
            -m "integration" || true

      - name: Upload benchmark results
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: integration-benchmark-results
          path: integration-benchmark-results.json
          retention-days: 90

      - name: Run memory profiling on integration tests
        if: always()
        env:
          UV_INDEX_URL: "http://localhost:8080/simple"
          UV_NO_PROGRESS: "1"
        run: |
          uv run pytest benchmarks/ \
            --memray \
            --memray-bin-path=integration-memray-results \
            -m "integration" || true

      - name: Upload memory results
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: integration-memory-results
          path: integration-memray-results/
          retention-days: 90
```

.github/workflows/benchmarks.yml

Lines changed: 87 additions & 0 deletions

```yaml
name: Benchmarks

on:
  pull_request:
    types: [labeled, synchronize]
  push:
    branches: [main]

jobs:
  benchmark-cpu:
    # Only run on PRs with 'run-benchmarks' label OR on main branch
    if: |
      github.event_name == 'push' ||
      contains(github.event.pull_request.labels.*.name, 'run-benchmarks')

    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install uv
        uses: astral-sh/setup-uv@v4

      - name: Install dependencies
        run: uv sync --extra benchmark

      - name: Run benchmarks with CodSpeed
        uses: CodSpeedHQ/action@v3
        with:
          token: ${{ secrets.CODSPEED_TOKEN }}
          run: uv run pytest benchmarks/ --codspeed -m "not slow and not integration"

      - name: Generate benchmark JSON (fallback)
        if: always()
        run: |
          uv run pytest benchmarks/ \
            --benchmark-only \
            --benchmark-json=benchmark-results.json \
            -m "not slow and not integration" || true

      - name: Upload benchmark results
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: benchmark-results
          path: benchmark-results.json
          retention-days: 30

  benchmark-memory:
    # Only run on PRs with 'run-benchmarks' label
    if: contains(github.event.pull_request.labels.*.name, 'run-benchmarks')

    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install uv
        uses: astral-sh/setup-uv@v4

      - name: Install dependencies
        run: uv sync --extra benchmark

      - name: Run memory benchmarks
        run: |
          uv run pytest benchmarks/ \
            --memray \
            --memray-bin-path=memray-results \
            -m "not slow and not integration"

      - name: Upload memory results
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: memory-results
          path: memray-results/
          retention-days: 30
```

benchmarks/README.md

Lines changed: 186 additions & 0 deletions

# Fromager Benchmarks

Performance benchmarks for Fromager, a tool for rebuilding complete dependency trees of Python wheels from source.

## Why Benchmarks?

Fromager recursively resolves and builds entire dependency trees from source. A typical bootstrap processes hundreds of packages, each requiring version resolution, source acquisition, patching, and wheel building. Performance regressions in Fromager's core logic compound across these operations.

**The challenge:** Wall-clock benchmarks in shared CI environments vary 10-20% due to noise. A genuine 5% regression becomes indistinguishable from CI variance. These benchmarks focus on pure Python operations where measurements are stable and regressions are detectable.

---

## Quick Start

```bash
# Install dependencies
uv sync --extra benchmark

# Run all benchmarks
uv run pytest benchmarks/

# Fast benchmarks only (skip slow and integration)
uv run pytest benchmarks/ -m "not slow and not integration"

# Integration benchmarks only
uv run pytest benchmarks/ -m "integration"

# With memory profiling
uv run pytest benchmarks/ --memray

# Compare against baseline
uv run pytest benchmarks/ --benchmark-save=baseline
# ... make changes ...
uv run pytest benchmarks/ --benchmark-compare=baseline

# Export to JSON
uv run pytest benchmarks/ --benchmark-json=results.json
```

---

## Understanding Output

```
-------------------------------- benchmark: 3 tests --------------------------------
Name                                  Mean      StdDev    Rounds
------------------------------------------------------------------------------------
test_constraint_add_and_check         0.85ms    0.05ms    100
test_graph_serialization              1.20ms    0.08ms    100
test_python_version_matching_hot      0.12ms    0.01ms    200
------------------------------------------------------------------------------------
```

**Key metrics:**

- **Mean** — Primary comparison metric
- **StdDev** — Low values indicate reliable measurements
- **Rounds** — More rounds = more statistical confidence

**Comparison output:**

```
Name                            Mean (now)   Mean (base)   Ratio
--------------------------------------------------------------------
test_constraint_add_and_check   0.87ms       0.85ms        1.02x
```

- **Ratio < 1.0** — Faster (improvement)
- **Ratio > 1.15** — Investigate before merging
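
If you want that threshold enforced rather than just reported, pytest-benchmark's `--benchmark-compare-fail` option can fail the comparison run outright. A minimal sketch applying the 15% guideline above to the mean:

```bash
# Fail the run if any benchmark's mean regresses more than 15% vs the saved baseline
uv run pytest benchmarks/ --benchmark-compare=baseline --benchmark-compare-fail=mean:15%
```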
---

## Adding Benchmarks

Create test functions in `test_*.py` files using the `benchmark` fixture:

```python
def test_constraint_satisfaction(benchmark):
    """Benchmark Fromager's constraint checking."""
    from fromager.constraints import Constraints
    from packaging.version import Version

    constraints = Constraints()
    constraints.add_constraint("numpy>=1.20,<2.0")

    versions = [Version(v) for v in ["1.19.0", "1.25.0", "2.0.0"]]

    def check_all():
        return [constraints.is_satisfied_by("numpy", v) for v in versions]

    result = benchmark(check_all)
    assert result == [False, True, False]
```

**Guidelines:**

- Keep setup outside the benchmark function
- Assert correctness to ensure the benchmark actually works
- Mark slow benchmarks with `@pytest.mark.slow`
- Add metadata with `benchmark.extra_info["key"] = value` (see the sketch below)
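
A minimal sketch combining the last two guidelines. The workload (bulk `packaging` version parsing) is illustrative only, not one of the repository's actual benchmarks:

```python
import pytest
from packaging.version import Version


@pytest.mark.slow
def test_parse_many_versions(benchmark):
    """Hypothetical slow benchmark: parse a large batch of version strings."""
    # Setup stays outside the timed function.
    raw = [f"1.{minor}.{patch}" for minor in range(100) for patch in range(100)]

    def parse_all():
        return [Version(v) for v in raw]

    result = benchmark(parse_all)

    # Assert correctness so a silently broken benchmark fails loudly.
    assert len(result) == 10_000

    # Attach metadata; it is carried into --benchmark-json exports.
    benchmark.extra_info["version_count"] = len(raw)
```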
---

## Advanced Features

### Benchmark Categories

| Category    | File                  | Characteristics                       |
|-------------|-----------------------|---------------------------------------|
| Component   | `test_resolution.py`  | Fast, pure Python, no subprocess      |
| Integration | `test_integration.py` | Slow, uses fixtures, network-isolated |
| Memory      | Any with `--memray`   | Tracks allocations and peak memory    |

### Markers

- `@pytest.mark.slow` — Skip with `-m "not slow"`
- `@pytest.mark.integration` — Requires fixtures (local PyPI, uv shim)
- `@pytest.mark.memory` — Memory-focused benchmarks
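
Custom markers must be registered so pytest does not warn about them. A sketch of how that registration might look in `benchmarks/conftest.py`; the repository may instead register them in `pytest.ini`:

```python
# Hypothetical registration sketch; the actual conftest.py/pytest.ini may differ.
def pytest_configure(config):
    config.addinivalue_line("markers", "slow: long-running benchmark")
    config.addinivalue_line("markers", "integration: needs local PyPI / uv shim fixtures")
    config.addinivalue_line("markers", "memory: memory-focused benchmark")
```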
### Integration Fixtures

The `fixtures/` module provides isolation for realistic benchmarks:

- **`local_pypi`** — Session-scoped local PyPI server for network isolation
- **`configured_env`** — Configures environment to use local PyPI
- **`uv_shim`** — Creates mock uv binary for subprocess isolation
- **`subprocess_timer`** — Measures subprocess execution time and overhead
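
A sketch of the shape such a benchmark might take. The fixture names come from the list above, but their return values are assumptions made for illustration (here, that `local_pypi` yields the server's base URL):

```python
import urllib.request

import pytest


@pytest.mark.integration
def test_index_fetch_against_local_pypi(benchmark, local_pypi, configured_env):
    """Hypothetical integration benchmark; fixture semantics are assumed."""

    def fetch_index():
        # pypiserver exposes its package index under /simple/.
        with urllib.request.urlopen(f"{local_pypi}/simple/") as resp:
            return resp.read()

    page = benchmark(fetch_index)
    assert page  # the index page should not be empty
```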
### Memory Profiling

Memory benchmarks use pytest-memray (non-Windows):

```bash
uv run pytest benchmarks/ --memray
uv run pytest benchmarks/ --memray --memray-bin-path=./memray-results
```
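
Beyond profiling, pytest-memray can also enforce allocation ceilings via its `limit_memory` marker. A minimal sketch; the threshold and workload are illustrative, not taken from this repository:

```python
import pytest


@pytest.mark.limit_memory("50 MB")  # illustrative threshold
def test_dict_build_stays_bounded():
    # Fails if peak allocations during the test exceed the limit above.
    data = {str(i): list(range(100)) for i in range(10_000)}
    assert len(data) == 10_000
```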
### CI Integration

Benchmarks run automatically via GitHub Actions:

- **`benchmarks.yml`** — PRs with `run-benchmarks` label or push to main
- **`benchmarks-nightly.yml`** — Nightly integration benchmarks (2 AM UTC)

CodSpeed provides noise-resistant CI benchmarks via instruction counting rather than wall-clock time.

---

## Directory Structure

```
benchmarks/
├── README.md              # This file
├── conftest.py            # Shared fixtures and markers
├── pytest.ini             # Benchmark configuration
├── fixtures/              # Reusable fixture modules
│   ├── __init__.py
│   ├── pypi_server.py     # Local PyPI server
│   ├── uv_shim.py         # Subprocess isolation
│   └── metrics.py         # Timing collectors
├── requirements/          # Package requirements for local PyPI
│   └── packages.txt
├── test_resolution.py     # Component benchmarks
└── test_integration.py    # Integration benchmarks (slow)
```

---

## Troubleshooting

**High variance:** Close resource-intensive applications. Increase rounds:

```bash
uv run pytest benchmarks/ --benchmark-min-rounds=20
```

**Missing module:** Install dependencies with `uv sync --extra benchmark`

**Debug without timing:** Run benchmarks as regular tests:

```bash
uv run pytest benchmarks/ --benchmark-disable
```

---

## Resources

- [pytest-benchmark documentation](https://pytest-benchmark.readthedocs.io/)
- [CodSpeed documentation](https://docs.codspeed.io/)
- [pytest-memray documentation](https://pytest-memray.readthedocs.io/)
