Commit 039e864

Merge pull request #51 from benoitc/refactor/remove-auto-io-pool
Remove auto-started io pool to reduce memory usage
2 parents cfaa452 + 01d0911 commit 039e864
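
Migration sketch (not part of this commit): code that relied on the auto-started io pool can start it explicitly at boot. The module name `my_app_pools` and the function `ensure_io_pool/0` are hypothetical. The size `10` and mode `worker` mirror the defaults of the removed `io_pool_size` and `io_pool_mode` options. The sketch assumes `py_context_router:pool_started/1` returns a boolean and `py_context_router:start_pool/3` returns `{ok, Contexts}` or `{error, Reason}`, as the changed files suggest.

```erlang
%% Hypothetical helper module; not part of erlang_python.
-module(my_app_pools).
-export([ensure_io_pool/0]).

%% Start the io pool unless it is already running.
%% Returns ok, or {error, {io_pool_start_failed, Reason}} if the start fails.
ensure_io_pool() ->
    case py_context_router:pool_started(io) of
        true ->
            %% Pool already exists, nothing to do.
            ok;
        false ->
            %% 10 contexts in worker mode: the old io pool defaults.
            case py_context_router:start_pool(io, 10, worker) of
                {ok, _Contexts} -> ok;
                {error, Reason} -> {error, {io_pool_start_failed, Reason}}
            end
    end.
```

Calling `my_app_pools:ensure_io_pool()` after `application:ensure_all_started(erlang_python)` and before any `py:register_pool(io, ...)` call should approximate the previous behavior.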

7 files changed

Lines changed: 84 additions & 85 deletions

CHANGELOG.md

Lines changed: 14 additions & 18 deletions
@@ -43,6 +43,11 @@
 
 ### Changed
 
+- **Removed auto-started io pool** - The io pool is no longer started automatically at
+application startup to reduce memory usage. Users who need a dedicated I/O pool can
+create one manually via `py_context_router:start_pool(io, 10, worker)`. The configuration
+options `io_pool_size` and `io_pool_mode` have been removed.
+
 - **Removed py_event_router** - Removed legacy `py_event_router` module. The `py_event_worker`
 now handles all event loop functionality including FD events, timers, and task processing.
 This simplifies the architecture by consolidating event handling into a single worker process.
@@ -226,31 +231,22 @@
 - `py:dup_fd/1` - Duplicate fd for independent ownership
 - Prevents double-close issues when passing sockets to Python reactor
 
-- **Dual Pool Support** - Separate pools for CPU-bound and I/O-bound operations
-- `default` pool - For quick CPU-bound operations, sized to number of schedulers
-- `io` pool - For I/O-bound operations, larger pool (default: 10) for concurrency
-- `py:call(io, Module, Func, Args)` - Execute on the io pool
-- `py:call(io, Module, Func, Args, Kwargs)` - Execute with kwargs on io pool
-- Registration-based routing (no call site changes needed):
-- `py:register_pool(io, requests)` - Route all `requests.*` calls to io pool
-- `py:register_pool(io, {aiohttp, get})` - Route specific function to io pool
-- `py:unregister_pool(Module)` - Remove module registration
-- `py:unregister_pool({Module, Func})` - Remove function registration
-- Automatic routing: `py:call(requests, get, [Url])` goes to io pool when registered
+- **Custom Pool Support** - Create pools on demand for CPU-bound and I/O-bound operations
+- `default` pool - Automatically started, sized to number of schedulers
 - `py_context_router:start_pool/2,3` - Start named pools programmatically
 - `py_context_router:stop_pool/1` - Stop a named pool
 - `py_context_router:pool_started/1` - Check if a pool is running
 - `py_context_router:get_context(Pool)` - Get context from a named pool
 - `py_context_router:num_contexts(Pool)` - Get pool size
 - `py_context_router:contexts(Pool)` - Get all contexts in a pool
 - `py_context_router:lookup_pool(Module, Func)` - Query pool routing
-- Configuration via application env:
-```erlang
-{erlang_python, [
-{io_pool_size, 10}, % Size of io pool (default: 10)
-{io_pool_mode, worker} % Mode for io pool (default: auto)
-]}.
-```
+- `py:call(PoolName, Module, Func, Args)` - Execute on a specific pool
+- Registration-based routing (no call site changes needed):
+- `py:register_pool(io, requests)` - Route all `requests.*` calls to io pool
+- `py:register_pool(io, {aiohttp, get})` - Route specific function to io pool
+- `py:unregister_pool(Module)` - Remove module registration
+- `py:unregister_pool({Module, Func})` - Remove function registration
+- Automatic routing: `py:call(requests, get, [Url])` goes to io pool when registered
 - Backward compatible: existing code using `py:call/3,4,5` works unchanged
 - New test suite: `test/py_pool_SUITE.erl`
 

docs/distributed.md

Lines changed: 1 addition & 4 deletions
@@ -45,7 +45,7 @@ erlang_python integrates with Erlang's distribution to run Python code on remote
 | Async tasks | `rpc:call(Node, py_event_loop, create_task, ...)` |
 | Venv management | `py:ensure_venv/2,3` |
 | Data streaming | `py_channel` API |
-| Pool routing | Dual pool (default/io) |
+| Pool routing | Custom pools on demand |
 
 ## Basic Remote Execution
 
@@ -496,9 +496,6 @@ Configure nodes for distributed Python:
 %% More contexts for dedicated Python workers
 {num_contexts, 16},
 
-%% Larger IO pool for network operations
-{io_pool_size, 20},
-
 %% Higher concurrency limit
 {max_concurrent, 100}
 ]},

docs/getting-started.md

Lines changed: 7 additions & 4 deletions
@@ -466,11 +466,14 @@ end.
 
 See [Security](security.md) for details on blocked operations and recommended alternatives.
 
-## Dual Pool Support
+## Custom Pool Support
 
-erlang_python provides two pools to separate CPU-bound and I/O-bound operations:
+erlang_python lets you create pools on demand to separate CPU-bound and I/O-bound operations:
 
 ```erlang
+%% Create io pool for I/O-bound operations
+{ok, _} = py_context_router:start_pool(io, 10, worker).
+
 %% Register entire modules to io pool
 py:register_pool(io, requests).
 py:register_pool(io, psycopg2).
@@ -484,7 +487,7 @@ py:register_pool(io, {db, query}). %% Only db.query goes to io pool
 {ok, Rows} = py:call(db, query, [Sql]). %% -> io pool (callable registered)
 ```
 
-This prevents slow HTTP requests from blocking quick math operations. See [Dual Pool Support](pools.md) for configuration and advanced usage.
+This prevents slow HTTP requests from blocking quick math operations. See [Pool Support](pools.md) for configuration and advanced usage.
 
 ## Zero-Copy Buffers
 
@@ -517,7 +520,7 @@ See [Buffer API](buffer.md) for zero-copy memoryview access and fast substring s
 
 ## Next Steps
 
-- See [Dual Pool Support](pools.md) for separating CPU and I/O operations
+- See [Pool Support](pools.md) for separating CPU and I/O operations
 - See [Type Conversion](type-conversion.md) for detailed type mapping
 - See [Context Affinity](context-affinity.md) for preserving Python state
 - See [Streaming](streaming.md) for working with generators

docs/migration.md

Lines changed: 4 additions & 11 deletions
@@ -533,15 +533,16 @@ ok = py:deactivate_venv().
 {ok, #{<<"active">> := true, <<"venv_path">> := Path}} = py:venv_info().
 ```
 
-### Dual Pool Support
+### Custom Pool Support
 
-Separate pools for CPU-bound and I/O-bound operations:
+Create pools on demand for CPU-bound and I/O-bound operations:
 
 ```erlang
 %% Default pool - CPU-bound operations (sized to schedulers)
 {ok, Result} = py:call(math, sqrt, [16]).
 
-%% IO pool - I/O-bound operations (larger pool, default 10)
+%% Create io pool for I/O-bound operations
+{ok, _} = py_context_router:start_pool(io, 10, worker).
 {ok, Response} = py:call(io, requests, get, [Url]).
 
 %% Registration-based routing (no call site changes)
@@ -552,14 +553,6 @@ py:register_pool(io, {aiohttp, get}), % Route specific function
 {ok, Response} = py:call(requests, get, [Url]). % Goes to io pool
 ```
 
-Configuration in `sys.config`:
-```erlang
-{erlang_python, [
-{io_pool_size, 10}, % Size of io pool (default: 10)
-{io_pool_mode, worker} % Mode for io pool (default: auto)
-]}.
-```
-
 ## Performance Improvements
 
 The v2.0 release includes significant performance improvements:

docs/pools.md

Lines changed: 41 additions & 22 deletions
@@ -1,17 +1,17 @@
-# Dual Pool Support
+# Pool Support
 
-This guide covers erlang_python's dual pool architecture for separating CPU-bound and I/O-bound Python operations.
+This guide covers erlang_python's pool architecture for separating CPU-bound and I/O-bound Python operations.
 
 ## Overview
 
-erlang_python provides two separate pools of Python contexts:
+erlang_python provides a `default` pool that starts automatically, and allows you to create additional pools on demand:
 
 | Pool | Purpose | Default Size | Use Case |
 |------|---------|--------------|----------|
 | `default` | Quick CPU-bound operations | Number of schedulers | Math, string processing, data transformation |
-| `io` | Slow I/O-bound operations | 10 | HTTP requests, database queries, file I/O |
+| custom pools | User-defined pools | User-defined | HTTP requests, database queries, GPU work |
 
-This separation prevents slow I/O operations from blocking quick CPU operations.
+Create pools on demand to separate slow I/O operations from blocking quick CPU operations.
 
 ## Architecture
 
@@ -28,19 +28,31 @@ This separation prevents slow I/O operations from blocking quick CPU operations.
 │ ┌──────────────┴──────────────┐ │
 │ ▼ ▼ │
 │ ┌────────────────┐ ┌────────────────┐ │
-│ │ default pool │ │ io pool │ │
-│ │ (N contexts) │ │ (10 contexts) │ │
+│ │ default pool │ │ custom pools │ │
+│ │ (N contexts) │ │ (on demand) │ │
 │ └────────────────┘ └────────────────┘ │
 │ │ │ │
 │ ┌────────┴────────┐ ┌────────┴────────┐ │
 │ ▼ ▼ ▼ ▼ ▼ ▼ │
-│ Ctx1 Ctx2 CtxN Ctx1 Ctx2 Ctx10
+│ Ctx1 Ctx2 CtxN Ctx1 Ctx2 CtxN
 │ (math) (json) (...) (http) (db) (...) │
 └──────────────────────────────────────────────────────────────────┘
 ```
 
 ## Basic Usage
 
+### Creating Custom Pools
+
+Create pools on demand for specific workloads:
+
+```erlang
+%% Create an io pool for I/O-bound operations
+{ok, _Contexts} = py_context_router:start_pool(io, 10, worker).
+
+%% Create a gpu pool for ML workloads
+{ok, _} = py_context_router:start_pool(gpu, 2, worker).
+```
+
 ### Explicit Pool Selection
 
 Specify the pool directly in the call:
@@ -49,7 +61,7 @@ Specify the pool directly in the call:
 %% Use default pool (quick operations)
 {ok, 4.0} = py:call(default, math, sqrt, [16]).
 
-%% Use io pool (slow operations)
+%% Use io pool (after creating it)
 {ok, Response} = py:call(io, requests, get, [Url]).
 
 %% With keyword arguments
@@ -153,31 +165,30 @@ default = py_context_router:lookup_pool(json, dumps). %% Function override
 
 ## Configuration
 
-Configure pool sizes via application environment:
+Configure default pool size via application environment:
 
 ```erlang
 %% sys.config
 [
 {erlang_python, [
 %% Default pool size (default: erlang:system_info(schedulers))
-{default_pool_size, 8},
-
-%% IO pool size (default: 10)
-{io_pool_size, 20},
-
-%% IO pool mode: auto | subinterp | worker (default: auto)
-{io_pool_mode, worker}
+{default_pool_size, 8}
 ]}
 ].
 ```
 
 ### Runtime Configuration
 
 ```erlang
-%% Start additional custom pool
+%% Start io pool for I/O-bound operations
+{ok, _} = py_context_router:start_pool(io, 10, worker).
+
+%% Start GPU pool for ML operations
 {ok, _} = py_context_router:start_pool(gpu, 2, worker).
 
-%% Register GPU operations
+%% Register operations to route to specific pools
+ok = py:register_pool(io, requests).
+ok = py:register_pool(io, psycopg2).
 ok = py:register_pool(gpu, torch).
 ok = py:register_pool(gpu, tensorflow).
 ```
@@ -189,6 +200,9 @@ ok = py:register_pool(gpu, tensorflow).
 ```erlang
 %% At application startup
 init_pools() ->
+%% Create io pool for I/O-bound operations
+{ok, _} = py_context_router:start_pool(io, 10, worker),
+
 %% Register I/O-heavy modules
 py:register_pool(io, requests),
 py:register_pool(io, httpx),
@@ -212,7 +226,8 @@ handle_request(UserId) ->
 ### ML Pipeline with I/O
 
 ```erlang
-%% Register I/O operations
+%% Create and configure io pool
+{ok, _} = py_context_router:start_pool(io, 10, worker),
 py:register_pool(io, boto3), %% S3 access
 py:register_pool(io, requests), %% API calls
 
@@ -254,11 +269,15 @@ process_batch(Items) ->
 
 ```erlang
 %% Check pool status
-true = py_context_router:pool_started(default),
+true = py_context_router:pool_started(default).
+false = py_context_router:pool_started(io). %% Not started yet
+
+%% Start io pool
+{ok, _} = py_context_router:start_pool(io, 10, worker).
 true = py_context_router:pool_started(io).
 
 %% Check pool sizes
-DefaultSize = py_context_router:num_contexts(default),
+DefaultSize = py_context_router:num_contexts(default).
 IoSize = py_context_router:num_contexts(io).
 
 %% List all registrations

src/py_context_init.erl

Lines changed: 8 additions & 19 deletions
@@ -15,21 +15,18 @@
 %%% @doc Initializes the context router during application startup.
 %%%
 %%% This module provides a supervisor-compatible start function that
-%%% initializes the context pools and returns `ignore' (since no
+%%% initializes the context pool and returns `ignore' (since no
 %%% process needs to stay running after initialization).
 %%%
 %%% == Pools ==
 %%%
-%%% Two pools are started by default:
-%%% - `default' - For quick CPU-bound operations, sized to number of schedulers
-%%% - `io' - For I/O-bound operations, larger pool (default: 10) for concurrency
+%%% The `default' pool is started automatically, sized to number of schedulers.
+%%% Additional pools can be created on demand via `py_context_router:start_pool/3'.
 %%%
-%%% Pool sizes can be configured via application env:
+%%% Pool size can be configured via application env:
 %%% ```
 %%% {erlang_python, [
-%%% {default_pool_size, 4}, % Number of contexts (default: schedulers)
-%%% {io_pool_size, 10}, % I/O pool size (default: 10)
-%%% {io_pool_mode, worker} % Mode for io pool (default: worker)
+%%% {default_pool_size, 4} % Number of contexts (default: schedulers)
 %%% ]}.
 %%% '''
 %%% @private
@@ -53,17 +50,9 @@ start_link(Opts) ->
 
 case py_context_router:start_pool(default, DefaultSize, DefaultMode) of
 {ok, _DefaultContexts} ->
-%% Start I/O pool if configured
-IoSize = application:get_env(erlang_python, io_pool_size, 10),
-IoMode = application:get_env(erlang_python, io_pool_mode, worker),
-case py_context_router:start_pool(io, IoSize, IoMode) of
-{ok, _IoContexts} ->
-%% The contexts are supervised by py_context_sup
-%% We don't need a process here, just return ignore
-ignore;
-{error, IoReason} ->
-{error, {io_pool_start_failed, IoReason}}
-end;
+%% The contexts are supervised by py_context_sup
+%% We don't need a process here, just return ignore
+ignore;
 {error, Reason} ->
 {error, {default_pool_start_failed, Reason}}
 end.

test/py_pool_SUITE.erl

Lines changed: 9 additions & 7 deletions
@@ -1,6 +1,6 @@
-%%% @doc Common Test suite for dual pool support.
+%%% @doc Common Test suite for pool support.
 %%%
-%%% Tests the default and io pool separation to ensure:
+%%% Tests the default pool and custom pools to ensure:
 %%% - Pools are independent
 %%% - Pool-based calls work correctly
 %%% - Backward compatibility is maintained
@@ -58,6 +58,8 @@ all() ->
 
 init_per_suite(Config) ->
 application:ensure_all_started(erlang_python),
+%% Create io pool manually (no longer auto-started)
+{ok, _} = py_context_router:start_pool(io, 10, worker),
 Config.
 
 end_per_suite(_Config) ->
@@ -73,15 +75,15 @@ end_per_testcase(_TestCase, _Config) ->
 %% Test cases
 %% ============================================================================
 
-%% @doc Verify both pools are started on application start.
+%% @doc Verify pools are started (default auto, io from init_per_suite).
 test_pools_started(_Config) ->
-%% Default pool should be started
+%% Default pool should be started automatically
 true = py_context_router:pool_started(default),
 DefaultSize = py_context_router:num_contexts(default),
 true = DefaultSize > 0,
 ct:pal("Default pool size: ~p", [DefaultSize]),
 
-%% IO pool should be started
+%% IO pool was started in init_per_suite
 true = py_context_router:pool_started(io),
 IoSize = py_context_router:num_contexts(io),
 true = IoSize > 0,
@@ -191,10 +193,10 @@ test_pool_sizes(_Config) ->
 ExpectedDefault = erlang:system_info(schedulers),
 ct:pal("Default pool: expected ~p, actual ~p", [ExpectedDefault, DefaultSize]),
 
-%% IO pool should be 10 by default (from py_context_init)
+%% IO pool was created with size 10 in init_per_suite
 IoSize = py_context_router:num_contexts(io),
 ct:pal("IO pool size: ~p", [IoSize]),
-true = IoSize > 0,
+10 = IoSize,
 
 %% Verify we can list all contexts from each pool
 DefaultContexts = py_context_router:contexts(default),
