finecode-dev
diff --git a/.github/workflows/ci-cd.yml b/.github/workflows/ci-cd.yml
Lines changed: 9 additions & 8 deletions
diff --git a/.mcp.json b/.mcp.json
Lines changed: 9 additions & 0 deletions
diff --git a/docs/adr/0001-use-adr.md b/docs/adr/0001-use-adr.md
Lines changed: 47 additions & 0 deletions
diff --git a/docs/adr/0002-port-file-discovery-for-wm-server.md b/docs/adr/0002-port-file-discovery-for-wm-server.md
Lines changed: 67 additions & 0 deletions
diff --git a/docs/adr/0003-process-isolation-per-extension-environment.md b/docs/adr/0003-process-isolation-per-extension-environment.md
Lines changed: 110 additions & 0 deletions
diff --git a/docs/adr/0004-auto-shutdown-on-disconnect-timeout.md b/docs/adr/0004-auto-shutdown-on-disconnect-timeout.md
Lines changed: 95 additions & 0 deletions
@@ -31,11 +31,11 @@ jobs:
       fail-fast: false
       max-parallel: 1
       matrix:
-        os: [ubuntu-24.04, macos-13, windows-2022]
+        os: [ubuntu-24.04, macos-15, windows-2022]
         include:
           - os: ubuntu-24.04
             name: Linux
-          - os: macos-13
+          - os: macos-15
             name: macOS
           - os: windows-2022
             name: Windows
@@ -82,12 +82,13 @@ jobs:
           python -m finecode run build_artifact
         shell: bash
 
-      # - name: Run unit tests
-      #   if: ${{ !cancelled() }}
-      #   run: |
-      #     source .venvs/dev_workspace/bin/activate
-      #     python -m finecode run test
-      #   shell: bash
+      - name: Run unit tests
+        if: ${{ !cancelled() }}
+        run: |
+          source .venvs/dev_workspace/bin/activate
+          # TODO: test with all supported python versions
+          python -m finecode run run_tests
+        shell: bash
 
       - name: Publish to TestPyPI and verify
         if: runner.os == 'Linux' && github.event_name == 'workflow_dispatch' && inputs.publish_testpypi
 
@@ -0,0 +1,9 @@
+{
+  "mcpServers": {
+    "finecode": {
+      "type": "stdio",
+      "command": ".venvs/dev_workspace/bin/python",
+      "args": ["-m", "finecode", "start-mcp"]
+    }
+  }
+}
@@ -0,0 +1,47 @@
+# ADR-0001: Use ADRs for architecture decisions
+
+- **Status:** accepted
+- **Date:** 2026-03-19
+- **Deciders:** @Aksem
+- **Tags:** meta
+
+## Context
+
+FineCode has several important architectural decisions that
+are currently documented implicitly across code, comments, and CLAUDE.md. When
+new contributors or AI agents work on the codebase, they lack visibility into
+*why* decisions were made, what alternatives were considered, and what
+constraints must be preserved.
+
+As the project grows and automated testing is introduced, we need a lightweight
+way to record decisions so they can be referenced, reviewed, and superseded
+over time.
+
+## Related ADRs Considered
+
+None — this is the first ADR.
+
+## Decision
+
+We will use Architecture Decision Records stored in `docs/adr/` following a
+simplified [MADR](https://adr.github.io/madr/) (Markdown Any Decision Records)
+template. The required sections are Context, Related ADRs Considered, Decision,
+and Consequences. Each ADR is a sequentially numbered Markdown file.
+
+The template also documents optional sections (Alternatives Considered, Risks,
+Related Decisions, References, Implementation Notes, Review Date) that can be
+added when they provide value, but are not required.
+
+ADRs are immutable once accepted. Changed decisions produce a new ADR that
+supersedes the previous one.
+
+## Consequences
+
+- Every architecturally significant decision gets a permanent, discoverable
+  record with its rationale.
+- New contributors and AI agents can understand *why* the codebase is shaped
+  the way it is.
+- Slightly more process overhead per decision — mitigated by keeping the
+  template minimal.
+- Existing implicit decisions can be backfilled as ADRs when they become
+  relevant.
@@ -0,0 +1,67 @@
+# ADR-0002: Port-file discovery for the WM server
+
+- **Status:** accepted
+- **Date:** 2026-03-19
+- **Deciders:** @Aksem
+- **Tags:** ipc, wm-server
+
+## Context
+
+The WM (Workspace Manager) server binds to a random available TCP port on
+startup to avoid conflicts between multiple instances (e.g. different workspaces,
+test runs). Clients such as the LSP server, MCP server, and CLI commands are
+started independently and need a way to find the WM server's port without
+prior coordination or a hard-coded value.
+
+Two modes of use must be supported:
+
+- **Shared mode**: a single long-lived WM server shared by multiple clients in the
+  same workspace (the typical IDE session).
+- **Dedicated mode**: a private WM server started by one client (e.g. MCP,
+  CLI `run`) that must not interfere with the shared instance.
+
+## Related ADRs Considered
+
+None: the port-discovery mechanism has no overlap with other ADRs at the time of writing.
+
+## Decision
+
+The WM server writes its listening port as a plain text number to a
+*discovery file* immediately after binding:
+
+- **Shared discovery file** (default): `{venv}/cache/finecode/wm_port`, where
+  `{venv}` is the virtual environment in which the FineCode WM server is installed.
+- **Dedicated discovery file**: a caller-specified path passed via
+  `--port-file`. Dedicated instances write to this path instead, leaving the
+  shared file untouched.
+
+Clients discover the server by reading the file and probing the TCP connection.
+The probe distinguishes a live server from a stale file left by a crashed
+process. The file is deleted on any clean or signal-driven shutdown, and the
+directory that holds the discovery file is created recursively (including
+parent directories) on first startup.
+
+## Consequences
+
+- **No port conflicts**: random binding means multiple WM instances (different
+  workspaces, concurrent test runs) coexist without configuration.
+- **Stale-file resilience**: the client verifies the TCP connection, not
+  just file existence, so a crashed server does not block future starts.
+- **Test isolation**: each e2e test can pass its own file path as
+  the dedicated port file, running a private WM instance without touching the
+  developer's live shared server or conflicting with other tests.
+- **Cross-process discovery**: any process that can read a file can find the
+  WM, regardless of parent–child relationship (IDE extensions, CLI tools, MCP
+  hosts).
+- **Crash cleanup gap**: if the server process is killed with SIGKILL or
+  crashes before it can clean up, the discovery file is not removed. Clients
+  handle this via the TCP probe, but the stale file persists on disk until the
+  next server start overwrites it.
+
+### Alternatives Considered
+
+- **Fixed/configured port**: eliminates the discovery file but requires port
+  coordination across concurrent instances and breaks test isolation.
+- **Unix domain socket file**: the socket path serves as both identity and
+  transport endpoint, avoiding the TCP-probe step. Rejected because Unix
+  sockets are not available on Windows.
+- **Environment variable**: works only for direct child processes; IDE
+  extensions and independently launched CLI commands cannot inherit it.
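Reviewer note: the write/read/probe flow described in ADR-0002 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not FineCode's actual code: the function names `write_port_file` and `discover_port`, the loopback host `127.0.0.1`, and the probe timeout are hypothetical.

```python
import socket
from pathlib import Path
from typing import Optional


def write_port_file(port_file: Path, port: int) -> None:
    """Server side: record the listening port as plain text.

    Parent directories are created recursively if they do not exist yet.
    """
    port_file.parent.mkdir(parents=True, exist_ok=True)
    port_file.write_text(str(port))


def discover_port(port_file: Path, timeout: float = 0.5) -> Optional[int]:
    """Client side: return the port of a live server, or None.

    None covers both a missing/unreadable file and a stale file whose
    server is no longer accepting connections.
    """
    try:
        port = int(port_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        return None
    try:
        # The TCP probe is what tells a live server apart from a stale file.
        # Host is assumed to be loopback for this sketch.
        with socket.create_connection(("127.0.0.1", port), timeout=timeout):
            return port
    except OSError:
        return None
```

A dedicated instance would call `write_port_file` with the caller-specified `--port-file` path, while a shared instance would use the default path under `{venv}/cache/finecode/`.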
@@ -0,0 +1,110 @@
+# ADR-0003: One Extension Runner process per project execution environment
+
+- **Status:** accepted
+- **Date:** 2026-03-19
+- **Deciders:** @Aksem
+- **Tags:** architecture, extension-runner
+
+## Context
+
+FineCode executes action handlers contributed by extensions. Each handler
+declares the **execution environment** (`env`) it runs in and its own set of
+dependencies. An execution environment is a named, isolated context serving a
+specific purpose (e.g. `runtime` for the project's own runtime code,
+`dev_workspace` for workspace tooling, `dev_no_runtime` for dev tools without
+runtime deps). In Python, each execution environment is materialized as a
+project-local virtual environment.
+
+The **Extension Runner (ER)** is a language-agnostic concept: a process that
+executes handler code inside a specific execution environment. The current
+implementation, `finecode_extension_runner`, is Python-specific. Future
+implementations for other languages (e.g. JavaScript, Rust) would follow the
+same one-process-per-execution-environment model.
+
+The primary requirement is to separate dependencies needed by the project's
+own runtime from dependencies needed only by tooling. FineCode must be able to
+run project code in one execution environment and run development tooling in
+other execution environments without forcing them into a single shared
+dependency set.
+
+Once execution environments are isolated, they can also be made more
+fine-grained by purpose. This allows tooling dependencies to be grouped
+according to their role and makes it possible to move tools with incompatible
+dependency requirements into separate execution environments when needed.
+
+The Workspace Manager (WM) is a long-running server that must stay stable
+across the full user session. A handler bug, crash, or blocking call in one
+execution environment must not take down the WM or interfere with other
+execution environments.
+
+## Related ADRs Considered
+
+None: the process isolation model has no overlap with other ADRs at the time of writing.
+
+## Decision
+
+Each execution environment in a project runs as an independent
+**Extension Runner (ER)** subprocess. In the Python implementation, the ER is
+launched using the interpreter from the corresponding project-local virtual
+environment, so each ER has a fully isolated dependency set.
+
+Key properties of this design:
+
+- **One ER per (project, execution environment) pair.** ERs are keyed by
+  `(project_dir_path, env_name)` in the WM's workspace context.
+- **Lazy startup with bootstrap exception.** An ER is started only when the
+  first action request requiring its execution environment arrives, then cached
+  and reused for subsequent requests. The `dev_workspace` execution
+  environment is the exception because it must be started first to resolve
+  presets for other execution environments.
+- **JSON-RPC over TCP.** Each ER binds to a random loopback port on startup
+  and advertises it to the WM. The WM connects via TCP and communicates using
+  JSON-RPC with Content-Length framing (the same wire format as LSP).
+- **Independent lifecycle.** An ER can crash and be restarted without
+  affecting the WM or ERs for other execution environments. Shutdown is
+  cooperative: the WM sends `shutdown` + `exit` JSON-RPC calls; the ER exits
+  cleanly.
+- **`dev_workspace` bootstrap execution environment.** The `dev_workspace`
+  execution environment is always started first; it resolves presets for all
+  other execution environments before they are configured or started.
+
+## Consequences
+
+- **Dependency isolation**: project runtime dependencies and tooling
+  dependencies are kept separate, and tooling can be split further into
+  purpose-specific execution environments when conflicts or different
+  dependency sets require it.
+- **Fault isolation**: a crash or hang in one ER does not affect the WM or
+  other ERs. The WM can restart a failed ER independently.
+- **Startup cost**: launching a Python subprocess and importing handler modules
+  takes time. Mitigated by lazy startup and long-lived reuse.
+- **Higher memory usage**: running multiple ER processes per project uses more
+  RAM than a single shared process. The overhead is expected to be acceptable
+  relative to the benefits of dependency isolation, fault isolation, and
+  long-lived per-environment state.
+- **One virtual environment per execution environment per project**:
+  `prepare-envs` must create and populate the project-local virtual
+  environment for each declared execution environment before the ER can start.
+  Missing virtual environments result in `RunnerStatus.NO_VENV` rather than a
+  crash.
+- **`dev_workspace` is a prerequisite**: preset resolution depends on the
+  `dev_workspace` ER being available. Actions in other execution environments
+  cannot be configured until `dev_workspace` is initialized.
+
+### Alternatives Considered
+
+- **Single shared process for all handlers**: eliminates subprocess overhead
+  but forces runtime code and tooling into one shared dependency set, makes
+  fine-grained environment separation impractical, and means one handler crash
+  can corrupt or kill the entire tool.
+- **Thread per handler invocation**: handlers run in the same process and
+  virtual environment. No dependency isolation; a blocking or crashing handler
+  affects all others.
+- **In-process plugin loading**: simplest architecture but handlers can import
+  conflicting packages and accidentally mutate shared WM state.
+- **New subprocess per handler invocation**: full isolation per call, but
+  Python startup cost makes interactive use (e.g. format-on-save) too slow.
+  It also prevents effective in-process caching between calls because each
+  invocation starts with cold process state. The long-lived ER model amortizes
+  startup cost across many invocations and allows caches to be retained in
+  process when appropriate.
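Reviewer note: ADR-0003 specifies JSON-RPC with Content-Length framing, the same wire format as LSP. A minimal sketch of that framing is shown here for readers unfamiliar with it. The helper names `encode_message` and `decode_message` are hypothetical and are not taken from `finecode_extension_runner`.

```python
import json
from typing import Optional, Tuple


def encode_message(payload: dict) -> bytes:
    """Serialize a JSON-RPC message with an LSP-style Content-Length header."""
    body = json.dumps(payload).encode("utf-8")
    # Content-Length counts bytes of the encoded body, not characters.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def decode_message(buffer: bytes) -> Optional[Tuple[dict, bytes]]:
    """Try to read one framed message from the start of `buffer`.

    Returns (message, remaining_bytes), or None if the buffer does not yet
    hold a complete message.
    """
    header_end = buffer.find(b"\r\n\r\n")
    if header_end == -1:
        return None  # header block is not complete yet
    length = None
    for line in buffer[:header_end].split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    body_start = header_end + 4
    body_end = body_start + length
    if len(buffer) < body_end:
        return None  # body is not complete yet
    message = json.loads(buffer[body_start:body_end].decode("utf-8"))
    return message, buffer[body_end:]
```

Because TCP is a byte stream, a reader would keep appending received bytes to a buffer and call `decode_message` until it returns `None`, which is why the function hands back the unconsumed remainder.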
@@ -0,0 +1,95 @@
+# ADR-0004: Auto-shutdown on disconnect timeout
+
+- **Status:** accepted
+- **Date:** 2026-03-19
+- **Deciders:** @Aksem
+- **Tags:** lifecycle, wm-server
+
+## Context
+
+The WM server is a long-running process started on demand by clients (LSP
+server, MCP server, CLI). Clients may terminate without sending an explicit
+shutdown request — for example, when the IDE is force-closed, crashes, or the
+extension is reloaded. Without a self-termination mechanism, the WM would run
+indefinitely as a ghost process, holding the discovery file and consuming
+resources.
+
+Clients may also intentionally stop or restart the WM through an explicit
+shutdown request. This ADR addresses the complementary case where no such
+request is sent and the WM must determine on its own when to exit.
+
+Two distinct scenarios require handling:
+
+1. **No client ever connects** — the WM started successfully but the client
+   failed to connect (e.g. misconfiguration, client crash during startup).
+2. **Last client disconnects** — a normal session end or unexpected client
+   termination.
+
+## Related ADRs Considered
+
+Reviewed [ADR-0002](0002-port-file-discovery-for-wm-server.md) — related topic:
+the WM's shutdown flow performs the discovery-file cleanup defined there.
+
+## Decision
+
+The WM server uses two independent timeout-based shutdown mechanisms:
+
+- **No-client timeout** (default 30 s): started immediately after the server
+  begins listening. If no client connects within this window, the WM performs
+  its normal shutdown and exits.
+- **Disconnect timeout** (default 30 s): started when the last client
+  disconnects. If no client reconnects within this window, the WM performs its
+  normal shutdown and exits.
+
+These timeouts complement, rather than replace, explicit shutdown requests used
+by clients that intentionally stop or restart the WM.
+
+Both timeout paths use the WM's normal shutdown flow, including discovery-file
+cleanup (see [ADR-0002](0002-port-file-discovery-for-wm-server.md)).
+
+The disconnect timeout is configurable so that tests and dedicated instances
+can use a shorter grace period when needed.
+
+Using the same 30-second default for both timeouts keeps lifecycle behavior
+simple and provides a reasonable reconnection window for IDE extension reloads
+and brief transient disconnects without leaving orphaned processes running for
+long.
+
+## Consequences
+
+- **Ghost process prevention**: the WM exits automatically once the last client
+  has disconnected and the grace period has elapsed, without requiring clients
+  to explicitly decide when the shared WM should stop. This is the primary
+  defense against orphaned processes after IDE close or crash.
+- **Reconnection window**: the grace period allows clients to reconnect within
+  the timeout — for example, after an IDE extension reload or a brief
+  disconnection. The WM does not need to be restarted for each reconnection.
+- **Warm reuse across brief idle gaps**: the grace period allows a shared WM
+  to survive short pauses between independent clients, such as sequential CLI
+  commands, preserving in-process state and caches between commands and
+  reducing restart overhead.
+- **Connection-driven lifecycle**: shutdown depends on client liveness rather
+  than completion of previously requested work. Once no clients remain past the
+  grace period, the WM exits through its normal shutdown path.
+- **Discovery file cleanup**: normal shutdown removes the discovery file, so a
+  stale file is never left behind after a timeout-driven shutdown (unlike a
+  SIGKILL).
+
+### Alternatives Considered
+
+- **Immediate shutdown on last disconnect**: safe but breaks IDE extension
+  reload scenarios and brief idle gaps between independent clients, such as
+  sequential CLI commands using a shared WM.
+- **Never auto-shutdown (persistent daemon)**: the WM runs until explicitly
+  stopped. This requires external process management and makes it harder to
+  reason about lifecycle in tests and CI.
+- **Client heartbeat / keepalive**: client sends periodic pings; WM shuts down
+  if pings stop. More precise than a fixed timeout for detecting dead connected
+  clients, but it still does not answer how long the WM should remain alive
+  when no clients are connected at all. Shared-WM use cases with brief idle
+  gaps between clients, such as sequential CLI commands, would still require a
+  grace-period timeout or a different persistent-daemon policy. It also
+  requires all clients to implement the heartbeat protocol.
+- **Parent PID tracking**: WM monitors its parent process and exits when the
+  parent dies. Does not work when the WM is started independently of its client
+  (e.g. shared WM).
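Reviewer note: the two timeouts in ADR-0004 can be modeled as a single re-armable timer driven by connection events. The sketch below assumes an `asyncio`-based server. The class `ShutdownTimers` and its method names are hypothetical and only illustrate the state machine, not the WM's real implementation.

```python
import asyncio
from typing import Optional


class ShutdownTimers:
    """Signals shutdown when no client is connected for long enough."""

    def __init__(self, no_client_timeout: float = 30.0,
                 disconnect_timeout: float = 30.0) -> None:
        self.no_client_timeout = no_client_timeout
        self.disconnect_timeout = disconnect_timeout
        # Set when the server should run its normal shutdown flow.
        self.shutdown_requested = asyncio.Event()
        self._clients = 0
        self._timer: Optional[asyncio.TimerHandle] = None

    def start(self) -> None:
        """Call once the server begins listening (arms the no-client timeout)."""
        self._arm(self.no_client_timeout)

    def client_connected(self) -> None:
        """A client connected: cancel whichever timeout is pending."""
        self._clients += 1
        self._cancel()

    def client_disconnected(self) -> None:
        """A client left: arm the disconnect timeout if it was the last one."""
        self._clients -= 1
        if self._clients == 0:
            self._arm(self.disconnect_timeout)

    def _arm(self, delay: float) -> None:
        self._cancel()
        loop = asyncio.get_running_loop()
        self._timer = loop.call_later(delay, self.shutdown_requested.set)

    def _cancel(self) -> None:
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None
```

The server's main coroutine would `await timers.shutdown_requested.wait()` and then run the same shutdown path used for explicit shutdown requests, including discovery-file cleanup. Tests and dedicated instances could pass a shorter `disconnect_timeout`.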