Fix event loop timer triggering and async task completion #32

Merged
benoitc merged 6 commits into main from fix/event-loop-trigger
Mar 16, 2026

@benoitc benoitc commented Mar 16, 2026

Summary

  • Fix async tasks with asyncio.sleep() never completing
  • Add erlang.atom() for explicit atom creation in Python
  • Fix reentrancy when erlang.run() used inside py:exec
  • Prevent shared global capsule destruction on close()

benoitc added 6 commits March 16, 2026 00:32
After dispatching timer or FD events, the worker now sends task_ready to
itself to trigger _run_once processing. Also modified process_ready_tasks
to call _run_once when there are pending events (not just new coroutines).

Without this fix, asyncio.sleep and other async operations would never
complete because the event loop wasn't being driven after timer events.

Changes:
- py_event_worker: Send task_ready after handling {timeout, ...} and
  {select, ...} messages to trigger event loop processing
- py_event_loop.c: Call _run_once when pending_count > 0 in addition
  to when coros_scheduled > 0
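
A toy model of the mailbox flow may help show why the self-sent task_ready matters. Everything in it (ToyWorker, the tuple message shapes, the completed list) is a hypothetical stand-in, not the code in py_event_worker or py_event_loop.c. Only the names task_ready, timeout, select, pending_count and coros_scheduled come from the commit message.

```python
from collections import deque


class ToyWorker:
    """Hypothetical model of the worker mailbox, not the project's code."""

    def __init__(self, notify_after_event):
        self.mailbox = deque()
        self.pending_count = 0      # events dispatched but not yet processed
        self.coros_scheduled = 0    # new coroutines waiting to start
        self.completed = []
        self.notify_after_event = notify_after_event

    def handle(self, msg):
        kind = msg[0]
        if kind in ("timeout", "select"):
            # Dispatching a timer or FD event only marks work as pending.
            self.pending_count += 1
            if self.notify_after_event:
                # The fix: nudge ourselves so the loop actually gets driven.
                self.mailbox.append(("task_ready",))
        elif kind == "task_ready":
            # Run the loop for pending events as well as for new coroutines.
            if self.coros_scheduled > 0 or self.pending_count > 0:
                self.run_once()

    def run_once(self):
        # Stand-in for _run_once: resume whatever was waiting on the events.
        while self.pending_count:
            self.pending_count -= 1
            self.completed.append("sleep finished")
        self.coros_scheduled = 0

    def drain(self):
        while self.mailbox:
            self.handle(self.mailbox.popleft())


def simulate(notify_after_event):
    worker = ToyWorker(notify_after_event)
    worker.mailbox.append(("timeout", 1))
    worker.drain()
    return worker.completed
```

In this model, simulate(True) records a finished sleep, while simulate(False) leaves the list empty because nothing ever asks the loop to run after the timer fires. That mirrors the hang described above.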

Timer events weren't triggering task completion because:
- dispatch_timer didn't notify worker to process ready tasks
- Python strings became binaries, but await expected atoms
- ErlangRef conversion happened before resource type checks

Changes:
- Add global worker PID storage for timer dispatch notification
- Add nif_set_shared_worker NIF and py_nif:set_shared_worker/1
- Add ErlangAtomObject type and erlang.atom() function
- Modify _run_and_send to use atoms for message keys
- Move enif_is_ref check after resource type checks
- Add _get_ready_count method for event loop processing
- Fix lazy loop creation to use ErlangEventLoop directly

All py_async_task_SUITE (26) and py_SUITE (50) tests pass.
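
The string-versus-atom mismatch can be illustrated without the real bindings. The sketch below is an assumption-laden stand-in: Atom and describe_term are hypothetical names, not the ErlangAtomObject type or the erlang.atom() function, and the output is only a textual rendering of the Erlang term a value would map to.

```python
class Atom:
    """Hypothetical stand-in for an explicit atom wrapper type."""

    __slots__ = ("name",)

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, Atom) and other.name == self.name

    def __hash__(self):
        return hash(("atom", self.name))


def describe_term(value):
    """Render, as Erlang source text, what a Python value would turn into."""
    if isinstance(value, Atom):
        return value.name                  # atom, e.g. ok
    if isinstance(value, str):
        return '<<"' + value + '">>'       # binary, e.g. <<"ok">>
    raise TypeError(f"unsupported value: {value!r}")
```

A receiver that pattern-matches on the atom ok will not match the binary <<"ok">>, which is why message keys need to be sent as explicit atoms rather than plain Python strings.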

- Skip _run_once when Python loop is already running to prevent
  reentrancy issues when erlang.run() is used inside py:exec
- Don't destroy global loop capsule on close() since it's shared
  between multiple Python ErlangEventLoop instances
- Clean up events_module in early return path

Fixes py_async_e2e_SUITE hangs and crashes.
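
The reentrancy guard can be sketched with a plain asyncio loop. The helper name drive_once is hypothetical, and the real check lives in the project's own loop implementation. The sketch relies only on the standard loop.is_running() and loop.run_until_complete() calls.

```python
import asyncio


def drive_once(loop):
    """Hypothetical helper: step the loop only if nothing else is driving it."""
    if loop.is_running():
        # Something (for example a nested run call) already owns the loop.
        # Stepping it again here would be a reentrant call, so skip.
        return False
    # Safe to drive: run one trivial coroutine to completion.
    loop.run_until_complete(asyncio.sleep(0))
    return True


async def call_from_inside(loop):
    # Called while the loop is running, so the guard should decline.
    return drive_once(loop)
```

Without the guard, calling run_until_complete() on a loop that is already running raises RuntimeError in standard asyncio, so skipping the step is the conservative choice.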

dispatch_timer may be called on a loop different from the one the global
worker manages. The worker already sends task_ready to itself after handling
the timer timeout, so the global send was redundant and incorrect.

This error is expected when the loop is being destroyed while
task_ready messages are still in the worker's mailbox.

- Add _has_loop_ref() to prevent concurrent loops while allowing
  sequential replacement (checks is_running(), not just existence)
- Add _clear_loop_ref() called on loop close for proper cleanup
- Add global_loop_capsule_destructor to fix resource leak
- Rename atom() to _atom() in C, add Python wrapper with cache
  and configurable limit (ERLANG_PYTHON_MAX_ATOMS, default 10000)
- Use enif_make_existing_atom() first to avoid duplicate atoms
- Fix venv .pth file processing for Python 3.14 subinterpreters
  by embedding site-packages path directly in exec code
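
A cached wrapper with a creation limit, as described in the list above, might look roughly like the following. This is a minimal sketch under stated assumptions: make_atom_factory and the create_atom callback are hypothetical, the RuntimeError is a guess at the failure mode, and only the ERLANG_PYTHON_MAX_ATOMS variable and its default of 10000 come from the commit message. The limit matters because Erlang atoms are never garbage collected, so unbounded creation from Python could exhaust the atom table.

```python
import os


def make_atom_factory(create_atom, max_atoms=None):
    """Wrap a low-level atom constructor with a cache and a creation limit.

    create_atom is a hypothetical stand-in for the C-level _atom() function.
    """
    if max_atoms is None:
        max_atoms = int(os.environ.get("ERLANG_PYTHON_MAX_ATOMS", "10000"))
    cache = {}

    def atom(name):
        try:
            # Repeated names reuse the cached object and cost nothing new.
            return cache[name]
        except KeyError:
            pass
        if len(cache) >= max_atoms:
            # Assumed behaviour: refuse rather than grow without bound.
            raise RuntimeError(
                f"atom limit of {max_atoms} reached; refusing to create {name!r}"
            )
        cache[name] = create_atom(name)
        return cache[name]

    return atom
```

Passing the constructor in as an argument keeps the sketch testable without the NIF. In the real module the wrapper would presumably call _atom() directly.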
@benoitc benoitc merged commit 1b87ea2 into main Mar 16, 2026
11 checks passed
@benoitc benoitc deleted the fix/event-loop-trigger branch March 16, 2026 08:55