Skip to content

Latest commit

 

History

History
917 lines (720 loc) · 45 KB

File metadata and controls

917 lines (720 loc) · 45 KB

Changelog

2.4.0 (Unreleased)

Added

  • Context thread affinity - Contexts in MULTI_EXECUTOR mode are now assigned a fixed executor thread at creation. All operations (call, eval, exec) from the same context run on the same OS thread, preventing thread state corruption in libraries like numpy and PyTorch that have thread-local state.

Changed

  • py:execution_mode/0 now returns actual mode - Returns worker (default), owngil, free_threaded, or multi_executor based on actual configuration instead of Python capability. Previously returned subinterp even when using worker mode.

  • Removed obsolete subinterp test references - Test suites updated to reflect the removal of subinterpreter mode. Tests now use worker or owngil modes.

2.3.1 (2026-04-01)

Fixed

  • Executor affinity for numpy/torch - Workers are now assigned a fixed executor thread at creation. All calls from the same worker go to the same executor, preventing thread state corruption in libraries like numpy and PyTorch that have thread-local state. Fixes segfaults when using sentence-transformers or other ML libraries.

2.3.0 (2026-03-29)

Removed

  • ASGI/WSGI Support - The py_asgi and py_wsgi modules have been removed
    • py_asgi:run/4,5 - ASGI application runner
    • py_wsgi:run/3,4 - WSGI application runner
    • For web framework integration, use py:call with event loop contexts or the Channel API
    • See Migration Guide for alternatives

Added

  • SharedDict - Process-scoped shared dictionaries for cross-process state
    • py:shared_dict_new/0 - Create a new SharedDict
    • py:shared_dict_get/2,3 - Get value with optional default
    • py:shared_dict_set/3 - Set key-value pair
    • py:shared_dict_del/2 - Delete a key
    • py:shared_dict_keys/1 - List all keys
    • py:shared_dict_destroy/1 - Explicit cleanup
    • Python access via erlang.SharedDict with dict-like interface
    • Mutex-protected for concurrent access (~300k ops/sec)
    • Pickle serialization for complex types
    • See SharedDict documentation for details

2.2.0 (2026-03-24)

Added

  • OWN_GIL Mode - True parallel Python execution with Python 3.14+ subinterpreters

    • Each subinterpreter runs with its own GIL (Py_GIL_OWN) in a dedicated thread
    • Full isolation between interpreters (separate namespaces, modules, state)
    • py_context:start_link(N, owngil) to create OWN_GIL contexts
    • Enables true parallelism for CPU-bound Python workloads
    • See OWN_GIL Internals for architecture details
  • Process-Bound Python Environments - Per-Erlang-process Python namespaces

    • Each Erlang process gets isolated Python globals/locals
    • State persists across calls within the same process
    • Automatic cleanup when Erlang process terminates
    • See Process-Bound Environments for details
  • Event Loop Pool - Process affinity for parallel async execution

    • py_event_loop_pool distributes async tasks across multiple event loops
    • Scheduler-affinity routing for cache-friendly execution
    • Supports worker, subinterp, and owngil modes
  • ByteChannel API - Raw byte streaming without term serialization

    • py_byte_channel:new/0,1 - Create byte channels
    • py_byte_channel:send/2 - Send raw bytes
    • py_byte_channel:recv/1,2 - Receive bytes
    • Python ByteChannel class with sync/async iteration
    • Ideal for HTTP bodies, file streaming, binary protocols
  • PyBuffer API - Zero-copy buffer for streaming input

    • py_buffer:new/0,1 - Create buffers with optional max size
    • py_buffer:write/2 - Write data to buffer
    • Python PyBuffer class with file-like interface (read, readline, readlines)
    • Non-blocking reads for async I/O patterns
    • See Buffer API for details
  • True streaming API - New py:stream_start/3,4 and py:stream_cancel/1 functions for event-driven streaming from Python generators. Unlike py:stream/3,4 which collects all values at once, stream_start sends {py_stream, Ref, {data, Value}} messages as values are yielded. Supports both sync and async generators. Useful for LLM token streaming, real-time data feeds, and processing large sequences incrementally.

  • erlang.whereis(name) - Lookup registered Erlang process PIDs from Python

    • Returns erlang.Pid object or None if not registered
    • Enables Python code to discover and message named processes
  • erlang.schedule_inline(callback) - Inline continuation scheduling

    • Release dirty scheduler and continue with callback in same context
    • Preserves globals/locals across the continuation
    • Useful for cooperative long-running tasks
  • py:spawn_call/3,4,5 - Fire-and-forget with result delivery

    • Executes Python call asynchronously
    • Sends {py_result, Ref, Result} to caller when complete
    • Non-blocking alternative to py:call for async patterns
  • Explicit bytes conversion - {bytes, Binary} tuple for round-trip safety

    • Erlang binaries convert to Python str by default
    • Use {bytes, Binary} to force Python bytes type
    • Ensures correct handling for binary protocols
  • Import caching API - Lazy module import with caching

    • py:import/1,2 - Import and cache modules
    • py:add_import/1,2 - Register imports applied to all contexts
    • py:add_path/1 - Add to sys.path across all contexts
    • Per-interpreter caching with generation tracking
  • Per-interpreter preload code - Execute code in new interpreters

    • Configure via {erlang_python, [{preload_code, <<"import mylib">>}]}
    • Code runs with inherited globals from main interpreter
    • Useful for initializing common imports/state

Fixed

  • Channel notification for create_task - Fixed async channel receive hanging when using py_event_loop:create_task. The event_loop_add_pending() now sends task_ready to the worker, not just pthread_cond_signal. Also fixed Python 3.9 compatibility in ByteChannel (Optional[bytes] instead of bytes | None)

  • Channel waiter race condition - Fixed waiter_exists errors during fast async iteration. Waiter state is now cleared before releasing mutex, preventing race where callback fires before channel_send clears has_waiter

  • Event Loop Isolation and Resource Safety - Three fixes for event loop and atom handling

    • Single-loop-per-interpreter enforcement - Prevents multiple ErlangEventLoop instances from causing event confusion. Added _has_loop_ref() check that detects running loops; attempting to create a second loop while one is running raises RuntimeError
    • Atom creation safety - Added Python-level caching with configurable limit (10000 default, ERLANG_PYTHON_MAX_ATOMS env var) to prevent BEAM atom table exhaustion from untrusted code. The erlang.atom() API now goes through the cached wrapper; internal _atom() NIF still available
    • Global capsule resource leak - Added global_loop_capsule_destructor that properly calls enif_release_resource() when capsule is garbage collected. Previously NULL destructor caused reference leaks on each ErlangEventLoop creation
  • Python 3.14 venv activation - Fixed .pth file processing in subinterpreters. Python 3.14 stricter module isolation prevented sys._venv_site_packages from persisting across eval/exec calls. Now embeds site-packages path directly in the exec code string

  • OWN_GIL Safety Fixes - Critical fixes for OWN_GIL subinterpreter mode

    • Mutex leak in erlang module - async_futures_mutex now always destroyed in erlang_module_free() regardless of pipe_initialized flag
    • ABBA deadlock prevention - Fixed lock ordering in event_loop_down() and event_loop_destructor() to acquire GIL before namespaces_mutex, matching the normal execution path and preventing deadlocks
    • Dangling env pointer detection - Added interp_id validation in owngil_execute_*_with_env() functions to detect and reject env resources created by a different interpreter, returning {error, env_wrong_interpreter}
    • OWN_GIL callback documentation - Documented that erlang.call() from OWN_GIL contexts uses thread_worker_call() rather than suspension/resume protocol; re-entrant calls to the same OWN_GIL context are not supported

Changed

  • py:cast is now fire-and-forget - py:cast/3,4,5 no longer returns a reference. For async calls with result delivery, use the new py:spawn_call/3,4,5 instead.

  • OWN_GIL requires Python 3.14+ - The OWN_GIL subinterpreter mode requires Python 3.14 or later due to C extension compatibility issues in earlier versions. Use worker or subinterp modes for Python 3.12-3.13.

  • Removed auto-started io pool - The io pool is no longer started automatically at application startup to reduce memory usage. Users who need a dedicated I/O pool can create one manually via py_context_router:start_pool(io, 10, worker). The configuration options io_pool_size and io_pool_mode have been removed.

  • Removed py_event_router - Removed legacy py_event_router module. The py_event_worker now handles all event loop functionality including FD events, timers, and task processing. This simplifies the architecture by consolidating event handling into a single worker process. The py_nif:set_shared_router/1 function has been removed.

  • Config-based initialization - Import and path configuration via application environment

    • Configure imports: {erlang_python, [{imports, [{json, dumps}]}]}
    • Configure paths: {erlang_python, [{paths, ["/path/to/modules"]}]}
    • Applied immediately to all running interpreters
    • See Imports documentation for details

Performance

  • Direct NIF channel operations - Channel send/receive bypass erlang.call() overhead for up to 1760x speedup in raw throughput benchmarks

  • nif_process_ready_tasks optimization - ~15% improvement in async task processing

    • Replace asyncio.iscoroutine() with PyCoro_CheckExact C API
    • Use stack buffers for module/func strings
    • Cache asyncio.events module
    • Pool ErlNifEnv allocations with mutex protection

2.1.0 (2026-03-12)

Added

  • Async Task API - uvloop-inspired task submission from Erlang

    • py_event_loop:run/3,4 - Blocking run of async Python functions
    • py_event_loop:create_task/3,4 - Non-blocking task submission with reference
    • py_event_loop:await/1,2 - Wait for task result with timeout
    • py_event_loop:spawn_task/3,4 - Fire-and-forget task execution
    • Thread-safe submission via enif_send (works from dirty schedulers)
    • Message-based result delivery via {async_result, Ref, Result}
    • See Async Task API docs for details
  • erlang.spawn_task(coro) - Spawn async tasks from both sync and async contexts

    • Works in sync code called by Erlang (where asyncio.get_running_loop() fails)
    • Returns asyncio.Task for optional await/cancel (fire-and-forget pattern)
    • Automatically wakes up the event loop in sync context
  • Explicit Scheduling API - Control dirty scheduler release from Python

    • erlang.schedule(callback, *args) - Release scheduler, continue via Erlang callback
    • erlang.schedule_py(module, func, args, kwargs) - Release scheduler, continue in Python
    • erlang.consume_time_slice(percent) - Check if NIF time slice exhausted
    • ScheduleMarker type for cooperative long-running tasks
    • See Scheduling API docs
  • Distributed Python Execution - Documentation and Docker demo

    • Run Python across Erlang nodes using rpc:call
    • Docker Compose setup for testing distributed patterns
    • See Distributed Execution docs

Changed

  • Event Loop Performance Optimizations
    • Growable pending queue with capacity doubling (256 to 16384)
    • Snapshot-detach pattern to reduce mutex contention
    • Callable cache (64 slots) avoids PyImport/GetAttr per task
    • Task wakeup coalescing with atomic flag
    • Drain-until-empty loop for faster task processing

Fixed

  • ensure_venv now always installs dependencies, even if venv exists
  • erlang.sleep() timing in sync context
  • time() returns fresh value when loop not running
  • Handle pooling bugs in ErlangEventLoop
  • Task wakeup race causing batch task stalls

2.0.0 (2026-03-09)

Added

  • Virtual Environment Management - Automatic venv creation and activation

    • py:ensure_venv/2,3 - Create venv if missing, then activate
    • Automatically detects Python executable
    • Supports pip install of dependencies
  • File Descriptor Duplication - Safe socket handoff from Erlang to Python

    • py:dup_fd/1 - Duplicate fd for independent ownership
    • Prevents double-close issues when passing sockets to Python reactor
  • Custom Pool Support - Create pools on demand for CPU-bound and I/O-bound operations

    • default pool - Automatically started, sized to number of schedulers
    • py_context_router:start_pool/2,3 - Start named pools programmatically
    • py_context_router:stop_pool/1 - Stop a named pool
    • py_context_router:pool_started/1 - Check if a pool is running
    • py_context_router:get_context(Pool) - Get context from a named pool
    • py_context_router:num_contexts(Pool) - Get pool size
    • py_context_router:contexts(Pool) - Get all contexts in a pool
    • py_context_router:lookup_pool(Module, Func) - Query pool routing
    • py:call(PoolName, Module, Func, Args) - Execute on a specific pool
    • Registration-based routing (no call site changes needed):
      • py:register_pool(io, requests) - Route all requests.* calls to io pool
      • py:register_pool(io, {aiohttp, get}) - Route specific function to io pool
      • py:unregister_pool(Module) - Remove module registration
      • py:unregister_pool({Module, Func}) - Remove function registration
      • Automatic routing: py:call(requests, get, [Url]) goes to io pool when registered
    • Backward compatible: existing code using py:call/3,4,5 works unchanged
    • New test suite: test/py_pool_SUITE.erl
  • Channel API - Bidirectional message passing between Erlang and Python

    • py_channel:new/0,1 - Create channels with optional backpressure (max_size)
    • py_channel:send/2 - Send Erlang terms to Python (returns busy on backpressure)
    • py_channel:close/1 - Close channel, signals StopIteration to Python
    • Python Channel class with sync and async interfaces:
      • channel.receive() - Blocking receive (suspends Python, yields to Erlang)
      • channel.try_receive() - Non-blocking receive
      • await channel.async_receive() - Asyncio-compatible receive
      • for msg in channel: - Sync iteration
      • async for msg in channel: - Async iteration
    • erlang.channel.reply(pid, term) - Send messages to Erlang processes
    • Zero-copy IOQueue buffering via enif_ioq
    • 8x faster than Reactor for small messages, 2x faster for 16KB messages
  • OWN_GIL Subinterpreter Thread Pool - True parallelism with Python 3.12+ subinterpreters

    • Each subinterpreter runs in its own thread with its own GIL (Py_GIL_OWN)
    • Thread pool manages N subinterpreters for parallel Python execution
    • py:context(N) returns the Nth context PID for explicit context selection
    • py_context_router provides scheduler-affinity routing for automatic distribution
    • Cast operations are 25-30% faster compared to worker mode
    • Full isolation between subinterpreters (separate namespaces, modules, state)
    • New C files: py_subinterp_pool.c, py_subinterp_pool.h
  • erlang.reactor module - FD-based protocol handling for building custom servers

    • reactor.Protocol - Base class for implementing protocols
    • reactor.serve(sock, protocol_factory) - Serve connections using a protocol
    • reactor.run_fd(fd, protocol_factory) - Handle a single FD with a protocol
    • Integrates with Erlang's enif_select for efficient I/O multiplexing
    • Zero-copy buffer management for high-throughput scenarios
    • Supports SHARED_GIL subinterpreters via py_reactor_context
    • Each reactor context has isolated protocol factory when using mode=subinterp
  • ETF encoding for PIDs and References - Full Erlang term format support

    • Erlang PIDs encode/decode properly in ETF binary format
    • Erlang References encode/decode properly in ETF binary format
    • Enables proper serialization for distributed Erlang communication
  • PID serialization - Erlang PIDs now convert to erlang.Pid objects in Python and back to real PIDs when returned to Erlang. Previously, PIDs fell through to None (Erlang→Python) or string representation (Python→Erlang).

  • erlang.send(pid, term) - Fire-and-forget message passing from Python to Erlang processes. Uses enif_send() directly with no suspension or blocking. Raises erlang.ProcessError if the target process is dead.

  • erlang.ProcessError - New exception for dead/unreachable process errors. Subclass of Exception, so it's catchable with except Exception or except erlang.ProcessError.

  • Audit hook sandbox - Block dangerous operations when running inside Erlang VM

    • Uses Python's sys.addaudithook() (PEP 578) for low-level blocking
    • Blocks: os.fork, os.system, os.popen, os.exec*, os.spawn*, subprocess.Popen
    • Raises RuntimeError with clear message about using Erlang ports instead
    • Automatically installed when py_event_loop NIF is available
  • Process-per-context architecture - Each Python context runs in dedicated process

    • py_context_process - Gen_server managing a single Python context
    • py_context_sup - Supervisor for context processes
    • py_context_router - Routes calls to appropriate context process
    • Improved isolation between contexts
    • Better crash recovery and resource management
  • Worker thread pool - High-throughput Python operations

    • Configurable pool size for parallel execution
    • Efficient work distribution across threads
  • py:contexts_started/0 - Helper to check if contexts are ready

Changed

  • py:call_async renamed to py:cast - Follows gen_server convention where call is synchronous and cast is asynchronous. The semantics are identical, only the name changed.

  • Unified erlang Python module - Consolidated callback and event loop APIs

    • erlang.run(coro) - Run coroutine with ErlangEventLoop (like uvloop.run)
    • erlang.new_event_loop() - Create new ErlangEventLoop instance
    • erlang.install() - Install ErlangEventLoopPolicy (deprecated in 3.12+)
    • erlang.EventLoopPolicy - Alias for ErlangEventLoopPolicy
    • Removed separate erlang_asyncio module - all functionality now in erlang
  • Async worker backend replaced with event loop model - The pthread+usleep polling async workers have been replaced with an event-driven model using py_event_loop and enif_select:

    • Removed py_async_worker.erl and py_async_worker_sup.erl
    • Removed py_async_worker_t and async_pending_t structs from C code
    • Deprecated async_worker_new, async_call, async_gather, async_stream NIFs
    • Added py_event_loop_pool.erl for managing event loop-based async execution
    • Added py_event_loop:run_async/2 for submitting coroutines to event loops
    • Added nif_event_loop_run_async NIF for direct coroutine submission
    • Added _run_and_send wrapper in Python for result delivery via erlang.send()
    • Internal change: py:async_call/3,4 and py:await/1,2 API unchanged
  • SuspensionRequired base class - Now inherits from BaseException instead of Exception. This prevents ASGI/WSGI middleware except Exception handlers from intercepting the suspension control flow used by erlang.call().

  • Per-interpreter isolation in py_event_loop.c - Removed global state for proper subinterpreter support. Each interpreter now has isolated event loop state.

  • ErlangEventLoopPolicy always returns ErlangEventLoop - Previously only returned ErlangEventLoop for main thread; now consistent across all threads.

Deprecated

  • py_asgi module - Deprecated in favor of the Channel API (py_channel) or Reactor API (erlang.reactor). The module still works but will be removed in a future release.

  • py_wsgi module - Deprecated in favor of the Channel API (py_channel) or Reactor API (erlang.reactor). The module still works but will be removed in a future release.

Removed

  • Context affinity functions - Removed py:bind, py:unbind, py:is_bound, py:with_context, and py:ctx_* functions. The new py_context_router provides automatic scheduler-affinity routing. For explicit context control, use py_context_router:bind_context/1 and py_context:call/5.

  • Signal handling support - Removed add_signal_handler/remove_signal_handler from ErlangEventLoop. Signal handling should be done at the Erlang VM level. Methods now raise NotImplementedError with guidance.

  • Subprocess support - ErlangEventLoop raises NotImplementedError for subprocess_shell and subprocess_exec. Use Erlang ports (open_port/2) for subprocess management instead.

Fixed

  • py_reactor_context now extends erlang module in subinterpreters - Previously, py_reactor_context with mode=subinterp would fail to import erlang.reactor because the erlang module extension was not applied. Now calls py_context:extend_erlang_module_in_context/1 after context creation.

  • FD stealing and UDP connected socket issues - Fixed file descriptor handling for UDP sockets in connected mode

  • Context test expectations - Updated tests for Python contextvars behavior

  • Unawaited coroutine warnings - Fixed warnings in test suite

  • Timer scheduling for standalone ErlangEventLoop - Fixed timer callbacks not firing for loops created outside the main event loop infrastructure

  • Subinterpreter cleanup and thread worker re-registration - Fixed cleanup issues when subinterpreters are destroyed and recreated

  • ProcessError exception class identity in subinterpreters - Fixed exception class mismatch when raising erlang.ProcessError in subinterpreter contexts. The exception class is now looked up from the current interpreter's erlang module at runtime instead of using a global variable.

  • Thread worker handlers not re-registering after app restart - Workers now properly re-register when application restarts

  • Timeout handling - Improved timeout handling across the codebase

  • Eval locals_term initialization - Fixed uninitialized variable in eval

  • Two race conditions in worker pool - Fixed concurrent access issues

  • activate_venv/1 now processes .pth files - Uses site.addsitedir() instead of sys.path.insert() so that editable installs (uv, pip -e, poetry) work correctly. New paths are moved to the front of sys.path for proper priority.

  • deactivate_venv/0 now restores sys.path - The previous implementation used py:eval with semicolon-separated statements which silently failed (eval only accepts expressions). Switched to py:exec for correct statement execution.

Performance

  • Async coroutine latency reduced from ~10-20ms to <1ms - The event loop model eliminates pthread polling overhead
  • Zero CPU usage when idle - Event-driven instead of usleep-based polling
  • No extra threads - Coroutines run on the existing event loop infrastructure

1.8.1 (2026-02-25)

Fixed

  • ASGI scope caching bug - HTTP method was not treated as a dynamic field in the scope template cache. This caused incorrect method values when the same path was accessed with different HTTP methods (e.g., GET /path followed by POST /path would return method="GET" for both requests).

1.8.0 (2026-02-25)

Added

  • ASGI NIF Optimizations - Six optimizations for high-performance ASGI request handling

    • Direct Response Tuple Extraction - Extract (status, headers, body) directly without generic conversion
    • Pre-Interned Header Names - 16 common HTTP headers cached as PyBytes objects
    • Cached Status Code Integers - 14 common HTTP status codes cached as PyLong objects
    • Zero-Copy Request Body - Large bodies (≥1KB) use buffer protocol for zero-copy access
    • Scope Template Caching - Thread-local cache of 64 scope templates keyed by path hash
    • Lazy Header Conversion - Headers converted on-demand for requests with ≥4 headers
  • erlang_asyncio Module - Asyncio-compatible primitives using Erlang's native scheduler

    • erlang_asyncio.sleep(delay, result=None) - Sleep using Erlang's erlang:send_after/3
    • erlang_asyncio.run(coro) - Run coroutine with ErlangEventLoop
    • erlang_asyncio.gather(*coros) - Run coroutines concurrently
    • erlang_asyncio.wait_for(coro, timeout) - Wait with timeout
    • erlang_asyncio.wait(fs, timeout, return_when) - Wait for multiple futures
    • erlang_asyncio.create_task(coro) - Create background task
    • erlang_asyncio.ensure_future(coro) - Wrap coroutine in Future
    • erlang_asyncio.shield(arg) - Protect from cancellation
    • erlang_asyncio.timeout - Context manager for timeouts
    • Event loop functions: get_event_loop(), new_event_loop(), set_event_loop(), get_running_loop()
    • Re-exports: TimeoutError, CancelledError, ALL_COMPLETED, FIRST_COMPLETED, FIRST_EXCEPTION
  • Erlang Sleep NIF - Synchronous sleep primitive for Python

    • py_event_loop._erlang_sleep(delay_ms) - Sleep using Erlang timer
    • Releases GIL during sleep, no Python event loop overhead
    • Uses pthread condition variables for efficient blocking
    • py_nif:dispatch_sleep_complete/2 - NIF to signal sleep completion
  • Scalable I/O Model - Worker-per-context architecture

    • py_event_worker - Dedicated worker process per Python context
    • Combined FD event dispatch and reselect via handle_fd_event_and_reselect NIF
    • Sleep tracking with sleeps map in worker state
  • New Test Suite - test/py_erlang_sleep_SUITE.erl with 8 tests

    • test_erlang_sleep_available - Verify NIF is exposed
    • test_erlang_sleep_basic - Basic functionality
    • test_erlang_sleep_zero - Zero delay returns immediately
    • test_erlang_sleep_accuracy - Timing accuracy
    • test_erlang_asyncio_module - Module functions present
    • test_erlang_asyncio_gather - Concurrent execution
    • test_erlang_asyncio_wait_for - Timeout support
    • test_erlang_asyncio_create_task - Background tasks

Performance

  • ASGI marshalling optimizations - 40-60% improvement for typical ASGI workloads
    • Direct response extraction: 5-10% improvement
    • Pre-interned headers: 3-5% improvement
    • Cached status codes: 1-2% improvement
    • Zero-copy body buffers: 10-15% for large bodies (≥1KB)
    • Scope template caching: 15-20% for repeated paths
    • Lazy header conversion: 5-10% for apps accessing few headers
  • Eliminates event loop overhead for sleep operations (~0.5-1ms saved per call)
  • Sub-millisecond timer precision via BEAM scheduler (vs 10ms asyncio polling)
  • Zero CPU when idle - event-driven, no polling

1.7.1 (2026-02-23)

Fixed

  • Hex package missing priv directory - Added explicit files configuration to include priv/erlang_loop.py and other necessary files in the hex.pm package

1.7.0 (2026-02-23)

Added

  • Shared Router Architecture for Event Loops

    • Single py_event_router process handles all event loops
    • Timer and FD messages include loop identity for correct dispatch
    • Eliminates need for per-loop router processes
    • Handle-based Python C API using PyCapsule for loop references
  • Per-Loop Capsule Architecture - Each ErlangEventLoop instance has its own isolated capsule

    • Dedicated pending queue per loop for proper event routing
    • Full asyncio support (timers, FD operations) with correct loop isolation
    • Safe for multi-threaded Python applications where each thread needs its own loop
    • See docs/asyncio.md for usage and architecture details

1.6.1 (2026-02-22)

Fixed

  • ASGI headers now correctly use bytes instead of str - Fixed ASGI spec compliance issue where headers were being converted to Python str objects instead of bytes. The ASGI specification requires headers to be list[tuple[bytes, bytes]]. This was causing authentication failures and form parsing issues with frameworks like Starlette and FastAPI, which search for headers using bytes keys (e.g., b"content-type").
    • Added explicit header handling in asgi_scope_from_map() to bypass generic conversion
    • Headers are now correctly converted using PyBytes_FromStringAndSize()
    • Supports both list [name, value] and tuple {name, value} header formats from Erlang
    • Fixes GitHub issue #1

1.6.0 (2026-02-22)

Added

  • Python Logging Integration - Forward Python's logging module to Erlang's logger

    • py:configure_logging/0,1 - Setup Python logging to forward to Erlang
    • erlang.ErlangHandler - Python logging handler that sends to Erlang
    • erlang.setup_logging(level, format) - Configure logging from Python
    • Fire-and-forget architecture using enif_send() for non-blocking messaging
    • Level filtering at NIF level for performance (skip message creation for filtered logs)
    • Log metadata includes module, line number, and function name
    • Thread-safe - works from any Python thread
  • Distributed Tracing - Collect trace spans from Python code

    • py:enable_tracing/0, py:disable_tracing/0 - Enable/disable span collection
    • py:get_traces/0 - Retrieve collected spans
    • py:clear_traces/0 - Clear collected spans
    • erlang.Span(name, **attrs) - Context manager for creating spans
    • erlang.trace(name) - Decorator for tracing functions
    • Span events via span.event(name, **attrs)
    • Automatic parent/child span linking via thread-local storage
    • Error status capture with exception details
    • Duration tracking in microseconds
  • New Erlang modules

    • py_logger - gen_server receiving log messages from Python workers
    • py_tracer - gen_server collecting and managing trace spans
  • New C source

    • c_src/py_logging.c - NIF implementations for logging and tracing
  • Documentation and examples

    • docs/logging.md - Logging and tracing documentation
    • examples/logging_example.erl - Working escript example
    • Updated docs/getting-started.md with logging/tracing section
  • New test suite

    • test/py_logging_SUITE.erl - 9 tests for logging and tracing
  • ATOM_NIL for Elixir nil compatibility in type conversions

Performance

  • Type conversion optimizations - Faster Python ↔ Erlang marshalling

    • Use enif_is_identical for atom comparison instead of strcmp
    • Use PyLong_AsLongLongAndOverflow to avoid exception machinery
    • Cache numpy.ndarray type at init for fast isinstance checks
    • Stack allocate small tuples/maps (≤16 elements) to avoid heap allocation
    • Use enif_make_map_from_arrays for O(n) map building vs O(n²) puts
    • Reorder type checks for web workloads (strings/dicts first)
    • UTF-8 decode with bytes fallback for invalid sequences
  • Fire-and-forget NIF architecture - Log and trace calls never block Python execution

    • Uses enif_send() to dispatch messages asynchronously to Erlang processes
    • Python code continues immediately after sending, no round-trip wait
  • NIF-level log filtering - Messages below threshold are discarded before term creation

    • Volatile bool flags for O(1) receiver availability checks
    • Level threshold stored in C global, no Erlang callback needed
  • Minimal term allocation - Direct Erlang term building without intermediate structures

    • Timestamps captured at NIF level using enif_monotonic_time()

Fixed

  • Python 3.12+ event loop thread isolation - Fixed asyncio timeouts on Python 3.12+
    • ErlangEventLoop now only used for main thread; worker threads get SelectorEventLoop
    • Async worker threads bypass the policy to create SelectorEventLoop directly
    • Per-call ErlNifEnv for thread-safe timer scheduling in free-threaded mode
    • Fail-fast error handling in erlang_loop.py instead of silent hangs
    • Added gil_acquire()/gil_release() helpers to avoid GIL double-acquisition

1.5.0 (2026-02-18)

Added

  • py_asgi module - Optimized ASGI request handling with:

    • Pre-interned Python string keys (15+ ASGI scope keys)
    • Cached constant values (http type, HTTP versions, methods, schemes)
    • Thread-local response pooling (16 slots per thread, 4KB initial buffer)
    • Direct NIF path bypassing generic py:call()
    • ~60-80% throughput improvement over py:call()
    • Configurable runner module via runner option
    • Sub-interpreter and free-threading (Python 3.13+) support
  • py_wsgi module - Optimized WSGI request handling with:

    • Pre-interned WSGI environ keys
    • Direct NIF path for marshalling
    • ~60-80% throughput improvement over py:call()
    • Sub-interpreter and free-threading support
  • Web frameworks documentation - New documentation at docs/web-frameworks.md

1.4.0 (2026-02-18)

Added

  • Erlang-native asyncio event loop - Custom asyncio event loop backed by Erlang's scheduler

    • ErlangEventLoop class in priv/erlang_loop.py
    • Sub-millisecond latency via Erlang's enif_select (vs 10ms polling)
    • Zero CPU usage when idle - no busy-waiting or polling overhead
    • Full GIL release during waits for better concurrency
    • Native Erlang scheduler integration for I/O events
    • Event loop policy via get_event_loop_policy()
  • TCP support for asyncio event loop

    • create_connection() - TCP client connections
    • create_server() - TCP server with accept loop
    • _ErlangSocketTransport - Non-blocking socket transport with write buffering
    • _ErlangServer - TCP server with serve_forever() support
  • UDP/datagram support for asyncio event loop

    • create_datagram_endpoint() - Create UDP endpoints with full parameter support
    • _ErlangDatagramTransport - Datagram transport implementation
    • Parameters: local_addr, remote_addr, reuse_address, reuse_port, allow_broadcast
    • DatagramProtocol callbacks: datagram_received(), error_received()
    • Support for both connected and unconnected UDP
    • New NIF helpers: create_test_udp_socket, sendto_test_udp, recvfrom_test_udp, set_udp_broadcast
    • New test suite: test/py_udp_e2e_SUITE.erl
  • Asyncio event loop documentation

    • New documentation: docs/asyncio.md
    • Updated docs/getting-started.md with link to asyncio documentation

Performance

  • Event loop optimizations
    • Fixed run_until_complete callback removal bug (was using two different lambda references)
    • Cached ast.literal_eval lookup at module initialization (avoids import per callback)
    • O(1) timer cancellation via handle-to-callback_id reverse map (was O(n) iteration)
    • Detach pending queue under mutex, build Erlang terms outside lock (reduced contention)
    • O(1) duplicate event detection using hash set (was O(n) linear scan)
    • Added PERF_BUILD cmake option for aggressive optimizations (-O3, LTO, -march=native)

1.3.2 (2026-02-17)

Fixed

  • torch/PyTorch introspection compatibility - Fixed AttributeError: 'erlang.Function' object has no attribute 'endswith' when importing torch or sentence_transformers in contexts where erlang_python callbacks are registered.
    • Root cause: torch does dynamic introspection during import, iterating through Python's namespace and calling .endswith() on objects. The erlang module's __getattr__ was returning ErlangFunction wrappers for any attribute access.
    • Solution: Added C-side callback name registry. Now __getattr__ only returns ErlangFunction wrappers for actually registered callbacks. Unregistered attributes raise AttributeError (normal Python behavior).
    • New test: test_callback_name_registry in py_reentrant_SUITE.erl

1.3.1 (2026-02-16)

Fixed

  • Hex.pm packaging - Added files section to app.src to include build scripts (do_cmake.sh, do_build.sh) and other necessary files in the hex.pm package

1.3.0 (2026-02-16)

Added

  • Asyncio Support - New erlang.async_call() for asyncio-compatible callbacks
    • await erlang.async_call('func', arg1, arg2) - Call Erlang from async Python code
    • Integrates with asyncio event loop via add_reader()
    • No exceptions raised for control flow (unlike erlang.call())
    • Releases dirty NIF thread while waiting (non-blocking)
    • Works with FastAPI, Starlette, aiohttp, and other ASGI frameworks
    • Supports concurrent calls via asyncio.gather()
    • New test: test_async_call in py_reentrant_SUITE.erl
    • New test module: test/py_test_async.py
    • Updated documentation: docs/threading.md - Added Asyncio Support section

Fixed

  • Flag-based callback detection in replay path - Fixed SuspensionRequired exceptions leaking when ASGI middleware catches and re-raises exceptions. The replay path in nif_resume_callback_dirty now uses flag-based detection (checking tl_pending_callback) instead of exception-type detection.

Changed

  • C code optimizations and refactoring
    • Thread safety fixes: Used pthread_once for async callback initialization, fixed mutex held during Python calls in async event loop thread
    • Timeout handling: Added read_with_timeout() and read_length_prefixed_data() helpers with proper timeouts on all blocking pipe reads (30s for callbacks, 10s for spawns)
    • Code deduplication: Merged create_suspended_state() and create_suspended_state_from_existing() into unified create_suspended_state_ex(), extracted build_pending_callback_exc_args() and build_suspended_result() helpers
    • Performance: Optimized list conversion using enif_make_list_cell() to build lists directly without temporary array allocation
    • Removed unused make_suspended_term() function

1.2.0 (2026-02-15)

Added

  • Context Affinity - Bind Erlang processes to dedicated Python workers for state persistence

    • py:bind() / py:unbind() - Bind current process to a worker, preserving Python state
    • py:bind(new) - Create explicit context handles for multiple contexts per process
    • py:with_context(Fun) - Scoped helper with automatic bind/unbind
    • Context-aware functions: py:ctx_call/4-6, py:ctx_eval/2-4, py:ctx_exec/2
    • Automatic cleanup via process monitors when bound processes die
    • O(1) ETS-based binding lookup for minimal overhead
    • New test suite: test/py_context_SUITE.erl
  • Python Thread Support - Any spawned Python thread can now call erlang.call() without blocking

    • Supports threading.Thread, concurrent.futures.ThreadPoolExecutor, and any other Python threads
    • Each spawned thread lazily acquires a dedicated "thread worker" channel
    • One lightweight Erlang process per Python thread handles callbacks
    • Automatic cleanup when Python thread exits via pthread_key_t destructor
    • New module: py_thread_handler.erl - Coordinator and per-thread handlers
    • New C file: py_thread_worker.c - Thread worker pool management
    • New test suite: test/py_thread_callback_SUITE.erl
    • New documentation: docs/threading.md - Threading support guide
  • Reentrant Callbacks - Python→Erlang→Python callback chains without deadlocks

    • Exception-based suspension mechanism interrupts Python execution cleanly
    • Callbacks execute in separate processes to prevent worker pool exhaustion
    • Supports arbitrarily deep nesting (tested up to 10+ levels)
    • Transparent to users - erlang.call() works the same, just without deadlocks
    • New test suite: test/py_reentrant_SUITE.erl
    • New examples: examples/reentrant_demo.erl and examples/reentrant_demo.py

Changed

  • Callback handlers now spawn separate processes for execution, allowing workers to remain available for nested py:eval/py:call operations
  • Modular C code structure - Split monolithic py_nif.c (4,335 lines) into logical modules for better maintainability:
    • py_nif.h - Shared header with types, macros, and declarations
    • py_convert.c - Bidirectional type conversion (Python ↔ Erlang)
    • py_exec.c - Python execution engine and GIL management
    • py_callback.c - Erlang callback support and asyncio integration
    • Uses #include approach for single compilation unit (no build changes needed)

Fixed

  • Multiple sequential erlang.call() - Fixed infinite loop when Python code makes multiple sequential erlang.call() invocations in the same function. The replay mechanism now falls back to blocking pipe behavior for subsequent calls after the first suspension, preventing the infinite replay loop.
  • Memory safety in C NIF - Fixed memory leaks and added NULL checks
    • nif_async_worker_new: msg_env now freed on pipe/thread creation failure
    • multi_executor_stop: shutdown requests now properly freed after join
    • create_suspended_state: binary allocations cleaned up on failure paths
    • Added NULL checks on all enif_alloc_resource and enif_alloc_env calls
  • Dialyzer warnings - Added {suspended, ...} return type to NIF specs for worker_call, worker_eval, and resume_callback functions
  • Dead code removal - Cleaned up unused code discovered during code review:
    • Removed execute_direct() function in py_exec.c (duplicated inline logic)
    • Removed unused ref field from async_pending_t struct in py_nif.h
    • Removed worker_recv/2 from py_nif.erl (declared but never implemented in C)

Documentation

  • Doxygen-style C documentation - Added documentation to all C source files:
    • Architecture overview with execution mode diagrams
    • Type mapping tables for conversions
    • GIL management patterns and best practices
    • Suspension/resume flow diagrams for callbacks
    • Function-level @param, @return, @pre, @warning, @see annotations

1.1.0 (2026-02-15)

Added

  • Shared State API - ETS-backed storage for sharing data between Python workers

    • state_set/get/delete/keys/clear accessible from Python via from erlang import ...
    • py:state_store/fetch/remove/keys/clear from Erlang
    • Atomic counters with state_incr/decr (Python) and py:state_incr/decr (Erlang)
    • New example: examples/shared_state_example.erl
  • Native Python Import Syntax for Erlang callbacks

    • from erlang import my_func; my_func(args) - most Pythonic
    • erlang.my_func(args) - attribute-style access
    • erlang.call('my_func', args) - legacy syntax still works
  • Module Reload - Reload Python modules across all workers during development

    • py:reload(module) uses importlib.reload() to refresh modules from disk
    • py_pool:broadcast for sending requests to all workers
  • Documentation improvements

    • Added shared state section to getting-started, scalability, and ai-integration guides
    • Added embedding caching example using shared state
    • Added hex.pm badges to README

Fixed

  • Memory safety - Added NULL checks to all enif_alloc() calls in NIF code
  • Worker resilience - Fixed crash in py_subinterp_pool:terminate when workers undefined
  • Streaming example - Fixed to work with worker pool design (workers don't share namespace)
  • ETS table ownership - Moved py_callbacks table creation to supervisor for resilience

Changed

  • Created py_util module to consolidate duplicate code (to_binary/1, send_response/3, normalize_timeout/1-2)
  • Consolidated async_await/2 to call await/2 reducing duplication

1.0.0 (2026-02-14)

Initial release of erlang_python - Execute Python from Erlang/Elixir using dirty NIFs.

Features

  • Python Integration

    • Call Python functions with py:call/3-5
    • Evaluate expressions with py:eval/1-3
    • Execute statements with py:exec/1-2
    • Stream from Python generators with py:stream/3-4
  • Multiple Execution Modes (auto-detected)

    • Free-threaded Python 3.13+ (no GIL, true parallelism)
    • Sub-interpreters Python 3.12+ (per-interpreter GIL)
    • Multi-executor for older Python versions
  • Worker Pools

    • Main worker pool for synchronous calls
    • Async worker pool for asyncio coroutines
    • Sub-interpreter pool for parallel execution
  • Erlang/Elixir Callbacks

    • Register functions callable from Python via py:register_function/2-3
    • Python code calls back with erlang.call('name', args...)
  • Virtual Environment Support

    • Activate venvs with py:activate_venv/1
    • Use isolated package dependencies
  • Rate Limiting

    • ETS-based semaphore prevents overload
    • Configurable max concurrent operations
  • Type Conversion

    • Automatic conversion between Erlang and Python types
    • Integers, floats, strings, lists, tuples, maps/dicts, booleans
  • Memory Management

    • Access Python GC stats with py:memory_stats/0
    • Force garbage collection with py:gc/0-1
    • Memory tracing with py:tracemalloc_start/stop

Examples

  • semantic_search.erl - Text embeddings and similarity search
  • rag_example.erl - Retrieval-Augmented Generation with Ollama
  • ai_chat.erl - Interactive LLM chat
  • erlang_concurrency.erl - 10x speedup with BEAM processes
  • elixir_example.exs - Full Elixir integration demo

Documentation

  • Getting Started guide
  • AI Integration guide
  • Type Conversion reference
  • Scalability and performance tuning
  • Streaming with generators