132 changes: 132 additions & 0 deletions plugins/claude-code/DESIGN.md
@@ -623,3 +623,135 @@ Docs done 2026-05-28; dogfood is the remaining (human) step.
- **Subagent memory bundling** — explore `memory: project|user` on dedicated BM subagents.
- **Statusline** — small visible presence (active project, last write).
- **`/basic-memory:bm-promote`** — review auto-memory MEMORY.md, graduate observations into BM with proper schema.

## 14. Harness WAL — event capture as producer envelopes

**Status:** v0 shipped (issue [#997](https://github.com/basicmachines-co/basic-memory/issues/997))
**Related:** SPEC-55 (Agent Memory Pipeline — Producer Framework), SPEC-61 (Event-Driven Memory Routines)

### 14.1 What it is

The harness WAL (write-ahead log) is a shared, opt-in path that normalizes
supported hook events into Basic Memory **producer envelopes**: structured event
records that stamp each checkpoint with its source, session, hook, and an
idempotency key. This is the producer side of SPEC-55; it feeds SPEC-61 memory
routines without turning Basic Memory into an agent runtime.

### 14.2 The envelope schema

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HarnessEnvelope:
    event_type: str        # "session_started", "compaction_imminent", "session_ended"
    source: str            # "claude-code" or "codex"
    session_id: str        # harness session identifier
    turn_id: str | None    # turn identifier when available (Codex)
    timestamp: str         # ISO 8601
    cwd: str               # working directory
    project_hint: str      # basicMemory.primaryProject
    hook_name: str         # "SessionStart", "PreCompact"
    idempotency_key: str   # sha256(source:session_id:hook:timestamp_minute)[:16]
    payload_summary: dict  # safe, redacted payload excerpt
```

The module lives at `plugins/shared/harness_envelope.py` — stdlib-only, no install
step. Both plugins import it via `sys.path.insert`.

### 14.3 V0 event types

V0 captures only events exposed through existing hooks:

| Event | Hook | Plugin | What it does |
| ---------------------- | ------------- | ------------ | ---------------------------------------------- |
| `session_started` | SessionStart | Both | Logs to local event log (read-only hook) |
| `compaction_imminent` | PreCompact | Both | Stamps provenance onto the SessionNote/CodexSession |
| `session_ended` | (future) | (future) | Defined but not captured in v0 |

Events like `tool_called`, `file_changed`, `test_ran` require `PostToolUse` hooks
that don't currently exist — deferred to v1.

### 14.4 What gets stamped on checkpoints

PreCompact checkpoints gain two additions:

1. **Frontmatter fields** — `envelope_source`, `envelope_event`, `envelope_hook`,
`idempotency_key` (and `envelope_turn_id` when available). These make checkpoints
queryable by source and dedup-safe.

2. **Provenance observations** — appended to the `## Observations` section:
```markdown
- [source] claude-code/abc-123-def
- [hook] PreCompact
- [event] compaction_imminent at 2026-06-13T16:48:00+00:00
- [idempotency] b7ff76ee5df8cc0a
```
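
A minimal sketch of how the two stamps above could be rendered from an envelope. The names `EnvelopeSketch`, `frontmatter_fields`, and `provenance_observations` are hypothetical stand-ins; the shared module's real `to_frontmatter_fields` and `to_provenance_observations` are not shown in this section and may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EnvelopeSketch:
    """Hypothetical subset of the envelope fields needed for stamping."""
    event_type: str
    source: str
    session_id: str
    timestamp: str
    hook_name: str
    idempotency_key: str
    turn_id: Optional[str] = None


def frontmatter_fields(env: EnvelopeSketch) -> dict:
    """Build the frontmatter keys listed above; the turn id is optional."""
    fields = {
        "envelope_source": env.source,
        "envelope_event": env.event_type,
        "envelope_hook": env.hook_name,
        "idempotency_key": env.idempotency_key,
    }
    if env.turn_id:
        fields["envelope_turn_id"] = env.turn_id
    return fields


def provenance_observations(env: EnvelopeSketch) -> list:
    """Render the four observation lines in the format shown above."""
    return [
        f"- [source] {env.source}/{env.session_id}",
        f"- [hook] {env.hook_name}",
        f"- [event] {env.event_type} at {env.timestamp}",
        f"- [idempotency] {env.idempotency_key}",
    ]
```

Keeping the observations as plain strings lets a hook extend its note body with the returned list, as `pre-compact.sh` does with the real helper.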

### 14.5 Local event log

When `captureEvents: true` is set, both SessionStart and PreCompact append a JSONL
record to `<cwd>/.basic-memory/events.jsonl`. This log:

- Is append-only (no reads during hook execution)
- Is capped at 1000 lines (configurable via `eventRetention`); oldest half rotates out
- Feeds future SPEC-61 memory routines (nightly coalescing, session summaries)
- Never blocks the hook — write failures are silently swallowed
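
The log behavior described above can be sketched roughly as follows. This is an illustration under assumptions, not the shared module's code: `append_event` and `rotate_log` are hypothetical names, and splitting rotation into its own function is a choice made here so that the append path itself never reads the log. Where the real module triggers rotation is not stated in this section.

```python
import json
from pathlib import Path

LOG_RELATIVE_PATH = Path(".basic-memory") / "events.jsonl"


def append_event(record: dict, cwd: str) -> None:
    """Append one JSONL record without reading the log; never raise."""
    try:
        log_path = Path(cwd) / LOG_RELATIVE_PATH
        log_path.parent.mkdir(parents=True, exist_ok=True)
        with log_path.open("a", encoding="utf-8") as handle:
            handle.write(json.dumps(record, sort_keys=True) + "\n")
    except (OSError, TypeError, ValueError):
        pass  # a failed write must not block the hook


def rotate_log(cwd: str, retention: int = 1000) -> None:
    """Drop the oldest half of the log once it exceeds `retention` lines."""
    try:
        log_path = Path(cwd) / LOG_RELATIVE_PATH
        lines = log_path.read_text(encoding="utf-8").splitlines(keepends=True)
        if len(lines) > retention:
            newest = lines[len(lines) // 2:]
            log_path.write_text("".join(newest), encoding="utf-8")
    except OSError:
        pass  # rotation is best-effort as well
```

Dropping half the file at once, rather than one line per append, keeps rewrites of the log infrequent.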

### 14.6 Configuration

New keys in `basicMemory` (both `.claude/settings.json` and `.codex/basic-memory.json`):

| Key | Type | Default | Description |
| ---------------- | ---------- | ------- | ---------------------------------------------- |
| `captureEvents` | `boolean` | `false` | Opt-in for local event log (events.jsonl) |
| `redactKeys` | `string[]` | `[]` | Extra key patterns to redact from payloads |
| `redactPaths` | `string[]` | `[]` | Extra path prefixes to redact |
| `eventRetention` | `number` | `1000` | Max lines in the local event log before rotation |

### 14.7 Privacy and redaction

The envelope module applies layered redaction before any payload summary is stored:

1. **Key-pattern deny list** — keys matching `SECRET`, `TOKEN`, `KEY`, `PASSWORD`,
`CREDENTIAL`, `AUTH` (case-insensitive) are replaced with `[REDACTED]`
2. **Secret-value detection** — values matching `[A-Za-z0-9_]+=.{20,}` (environment
secret pattern) are replaced with `[REDACTED]`
3. **Path deny list** — values starting with `~/.ssh/`, `~/.aws/`, `~/.gnupg/` are
replaced with `[REDACTED_PATH]`. Extended via `redactPaths` config.
4. **Truncation** — any single value over 500 chars is truncated

Constraints from the issue:
- Capture is opt-in (tied to plugin installation + `captureEvents` config)
- Never captures hidden chain-of-thought or private model reasoning
- Prefers summaries and metadata over raw transcript dumps
- Fails fast on missing project mapping (no `primaryProject` → no capture)
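
The four redaction layers listed earlier in this section could be composed along these lines. This is a sketch under assumptions: `redact` is a hypothetical name, key patterns are matched as case-insensitive substrings, the secret-value regex is applied with `search`, and truncation adds no marker. The real module may choose differently on each point.

```python
import re

KEY_PATTERNS = ("SECRET", "TOKEN", "KEY", "PASSWORD", "CREDENTIAL", "AUTH")
PATH_PREFIXES = ("~/.ssh/", "~/.aws/", "~/.gnupg/")
SECRET_VALUE = re.compile(r"[A-Za-z0-9_]+=.{20,}")
MAX_VALUE_CHARS = 500


def redact(payload: dict, extra_keys=(), extra_paths=()) -> dict:
    """Apply key, value, path, and length rules to a flat payload dict."""
    key_patterns = [p.upper() for p in (*KEY_PATTERNS, *extra_keys)]
    path_prefixes = (*PATH_PREFIXES, *extra_paths)
    cleaned = {}
    for key, value in payload.items():
        text = str(value)
        if any(pattern in key.upper() for pattern in key_patterns):
            cleaned[key] = "[REDACTED]"        # layer 1: key-pattern deny list
        elif SECRET_VALUE.search(text):
            cleaned[key] = "[REDACTED]"        # layer 2: NAME=long-value secrets
        elif text.startswith(path_prefixes):
            cleaned[key] = "[REDACTED_PATH]"   # layer 3: path deny list
        elif len(text) > MAX_VALUE_CHARS:
            cleaned[key] = text[:MAX_VALUE_CHARS]  # layer 4: truncation
        else:
            cleaned[key] = value
    return cleaned
```

Substring matching is deliberately broad: a key such as `monkey` would also be redacted because it contains `KEY`. Over-redaction is cheaper than a leak for a payload summary.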

### 14.8 Idempotency

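The `idempotency_key` is a 16-character hex string: the first 16 hex digits of
`sha256(source:session_id:hook:timestamp_minute)`, where `timestamp_minute` is the
event timestamp truncated to the minute. Minute granularity means:

- Repeated hooks within the same clock minute for the same session produce the same key
- A hook that fires in a later clock minute produces a distinct key (new event),
  even if it fires only seconds after the previous one
- The producer needs no persistent state; the key is derived from the event itself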

The key is written into the note's frontmatter so downstream consumers can detect
and skip duplicates.
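
A minimal sketch of the key derivation, assuming the timestamp is ISO 8601 with a numeric offset and that the minute bucket is formatted as `YYYY-MM-DDTHH:MM`. The function name and the bucket format are assumptions; only the `source:session_id:hook:timestamp_minute` shape and the 16-character truncation come from this section.

```python
import hashlib
from datetime import datetime


def idempotency_key(source: str, session_id: str, hook: str, timestamp: str) -> str:
    """Hash the event identity at minute granularity and keep 16 hex chars."""
    # Assumed bucket format: drop seconds and offset, keep the wall-clock minute.
    minute = datetime.fromisoformat(timestamp).strftime("%Y-%m-%dT%H:%M")
    raw = f"{source}:{session_id}:{hook}:{minute}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]
```

Two calls with timestamps `16:48:05` and `16:48:55` yield the same key, while `16:49:01` yields a different one, which is the bucket-boundary behavior the list above describes.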

### 14.9 ToolLedger schema (forward compatibility)

Both plugins ship a `schemas/tool-ledger.md` picoschema defining the ToolLedger
note type. V0 does not produce ToolLedger notes — the schema exists so that when
`PostToolUse` hooks become available (v1), the note shape is already defined and
schema validation works immediately.

### 14.10 Forward compatibility

- **SPEC-55 Producer Framework** — the envelope shape aligns with SPEC-55's producer
envelope contract. When the Producer SDK lands, the shared module becomes a thin
adapter over the SDK rather than a standalone implementation.
- **SPEC-61 Memory Routines** — the local event log (`events.jsonl`) is the input
feed for event-driven routines. The nightly coalescing routine reads the log,
groups events by session, and produces enriched artifacts.
- **SPEC-56 Consolidation** — provenance observations enable consolidation to trace
which sessions produced which notes, supporting the dream-mode merge.
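
As an illustration of the grouping step a coalescing routine might start with, the sketch below reads the JSONL log and buckets events by `session_id`. The function name is hypothetical and SPEC-61 routines are not defined in this section, so treat this as one possible shape rather than the planned implementation.

```python
import json
from collections import defaultdict
from pathlib import Path


def group_events_by_session(log_path: str) -> dict:
    """Return {session_id: [event, ...]} from a JSONL event log."""
    path = Path(log_path)
    if not path.exists():
        return {}
    sessions = defaultdict(list)
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip a torn or partial line
        if isinstance(event, dict):
            sessions[event.get("session_id", "unknown")].append(event)
    return dict(sessions)
```

Skipping unparseable lines keeps a reader tolerant of a write that was interrupted mid-line.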

70 changes: 69 additions & 1 deletion plugins/claude-code/hooks/pre-compact.sh
@@ -34,7 +34,12 @@ else
exit 0
fi

BM_HOOK_INPUT="$input" BM_BIN="$BM" python3 <<'PY' 2>/dev/null || exit 0
# Resolve the hook script's own directory so the inline Python can find the
# shared envelope module (plugins/shared/). __file__ is '<stdin>' inside a
# heredoc, so the Python code can't locate itself — we pass the real path.
hook_dir="$(CDPATH= cd -- "$(dirname -- "$0")" && pwd)"

BM_HOOK_INPUT="$input" BM_BIN="$BM" BM_HOOK_DIR="$hook_dir" python3 <<'PY' 2>/dev/null || exit 0
import json
import os
import re
@@ -43,6 +48,28 @@ import subprocess
import sys
from datetime import datetime

# --- Load the shared envelope module (lives two directories up in plugins/shared/) ---
# Trigger: this hook wants to stamp provenance and idempotency on the checkpoint.
# Why: the envelope normalizes hook events so downstream consumers (recall,
# consolidation, memory routines) can trace where each note came from.
# Constraint: __file__ is '<stdin>' inside a bash heredoc, so the hook script's
# real directory is passed in via the BM_HOOK_DIR environment variable.
_hook_dir = os.environ.get("BM_HOOK_DIR", "")
if _hook_dir:
    _shared_dir = os.path.join(_hook_dir, "..", "..", "shared")
    sys.path.insert(0, os.path.normpath(_shared_dir))
Comment on lines +58 to +60

P1: Bundle the shared envelope helper with plugin packages

When the hooks run from the installed plugin package (for example the documented Claude sparse install includes .claude-plugin plugins/claude-code, and the Codex manifest points at plugins/codex), there is no sibling plugins/shared directory in that package. This lookup therefore points outside the packaged plugin, the ImportError is swallowed, and _HAS_ENVELOPE stays false, so installed users never get envelope frontmatter/provenance or captureEvents logging. Vendor the helper inside each plugin or include plugins/shared in the packaged/installable source.


try:
    from harness_envelope import (
        COMPACTION_IMMINENT,
        append_to_event_log,
        create_envelope,
        to_frontmatter_fields,
        to_provenance_observations,
    )
    _HAS_ENVELOPE = True
except ImportError:
    _HAS_ENVELOPE = False

# May be a single binary ("basic-memory") or a multi-token launcher
# ("uvx basic-memory"); split so it prepends cleanly onto the write command.
bm_cmd = shlex.split(os.environ.get("BM_BIN") or "basic-memory")
@@ -81,6 +108,9 @@ def load_settings(directory):
cfg = load_settings(cwd)
primary_project = (cfg.get("primaryProject") or "").strip()
capture_folder = (cfg.get("captureFolder") or "sessions").strip()
capture_events = bool(cfg.get("captureEvents", False))
redact_keys = cfg.get("redactKeys") or []
redact_paths = cfg.get("redactPaths") or []

# Trigger: no project pinned for this Claude Code project.
# Why: a checkpoint must land somewhere intentional. Writing to the default graph
@@ -182,6 +212,32 @@ frontmatter = [
]
if session_id:
    frontmatter.append(f"claude_session_id: {session_id}")

# --- Harness envelope: stamp provenance and idempotency onto the checkpoint ---
# Trigger: the shared envelope module is available (always, unless the shared/
# directory is missing). Why: provenance makes each checkpoint traceable
# to its source hook, session, and exact event. Idempotency prevents
# duplicate notes when the hook fires more than once in the same minute.
envelope = None
if _HAS_ENVELOPE:
    try:
        envelope = create_envelope(
            event_type=COMPACTION_IMMINENT,
            source="claude-code",
            session_id=session_id or "unknown",
            cwd=cwd,
            project_hint=primary_project,
            hook_name="PreCompact",
            timestamp=iso,
            payload_summary={"opening": clip(opening, 200)} if opening else {},
            redact_keys=redact_keys,
            redact_paths=redact_paths,
        )
        for key, value in to_frontmatter_fields(envelope).items():
            frontmatter.append(f"{key}: {value}")
    except Exception:
        pass  # envelope creation failure is non-fatal

frontmatter += ["capture: extractive", "---"]

body = [
@@ -206,6 +262,18 @@ body += [
"- [next_step] Review this checkpoint and continue where the thread left off",
]

# --- Append envelope provenance observations ---
# These stamp the note with its producer source so downstream consumers can
# trace provenance without storing the full raw event.
if _HAS_ENVELOPE and envelope:
    body += to_provenance_observations(envelope)

# --- Log the event locally for coalescing ---
# Trigger: captureEvents is enabled. Why: the local event log feeds future
# memory routines (SPEC-61) without requiring the note to carry every detail.
if _HAS_ENVELOPE and envelope and capture_events:
    append_to_event_log(envelope, cwd)

content = "\n".join(frontmatter + body)

# --- Write the checkpoint (best-effort) ---
45 changes: 44 additions & 1 deletion plugins/claude-code/hooks/session-start.sh
@@ -45,7 +45,9 @@ fi
# the brief. Python is a guaranteed dependency (basic-memory requires it) and
# avoids brittle shell JSON wrangling. The payload and binary path cross over via
# the environment to sidestep argument-quoting issues.
BM_HOOK_INPUT="$input" BM_BIN="$BM" python3 <<'PY' 2>/dev/null || exit 0
hook_dir="$(CDPATH= cd -- "$(dirname -- "$0")" && pwd)"

BM_HOOK_INPUT="$input" BM_BIN="$BM" BM_HOOK_DIR="$hook_dir" python3 <<'PY' 2>/dev/null || exit 0
import json
import os
import re
@@ -54,6 +56,25 @@ import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# --- Load the shared envelope module (lives two directories up in plugins/shared/) ---
# SessionStart is read-only (no note writes), so the envelope is only used for
# local event logging when captureEvents is enabled.
# Constraint: __file__ is '<stdin>' inside a bash heredoc, so the hook script's
# real directory is passed in via the BM_HOOK_DIR environment variable.
_hook_dir = os.environ.get("BM_HOOK_DIR", "")
if _hook_dir:
    _shared_dir = os.path.join(_hook_dir, "..", "..", "shared")
    sys.path.insert(0, os.path.normpath(_shared_dir))
try:
    from harness_envelope import (
        SESSION_STARTED,
        append_to_event_log,
        create_envelope,
    )
    _HAS_ENVELOPE = True
except ImportError:
    _HAS_ENVELOPE = False

# May be a single binary ("basic-memory") or a multi-token launcher
# ("uvx basic-memory"); split so it prepends cleanly onto each command list.
bm_cmd = shlex.split(os.environ.get("BM_BIN") or "basic-memory")
@@ -114,6 +135,8 @@ recall_prompt = cfg.get("recallPrompt") or default_prompt
# Without this, setup writes them but they never reach Claude (dead config).
placement_conventions = (cfg.get("placementConventions") or "").strip()
capture_folder = (cfg.get("captureFolder") or "sessions").strip()
capture_events = bool(cfg.get("captureEvents", False))
session_id = payload.get("session_id") or ""

# --- Resolve the shared/team read set ---
# secondaryProjects (read-only recall sources) + teamProjects keys (share targets,
@@ -290,4 +313,24 @@ elif not primary_project:

lines += ["", "---", recall_prompt]
print("\n".join(lines))

# --- Log the session_started event locally (opt-in) ---
# Trigger: captureEvents is enabled and the envelope module is available.
# Why: the local event log records session starts for later coalescing by memory
# routines (SPEC-61). This is separate from the brief printed above — it's
# durable metadata, not context for the current session.
# Outcome: a JSONL line is appended to <cwd>/.basic-memory/events.jsonl.
if _HAS_ENVELOPE and capture_events and primary_project:
    try:
        envelope = create_envelope(
            event_type=SESSION_STARTED,
            source="claude-code",
            session_id=session_id or "unknown",
            cwd=cwd,
            project_hint=primary_project,
            hook_name="SessionStart",
        )
        append_to_event_log(envelope, cwd)
    except Exception:
        pass  # event logging failure is non-fatal
PY
49 changes: 49 additions & 0 deletions plugins/claude-code/schemas/tool-ledger.md
@@ -0,0 +1,49 @@
---
title: Tool Ledger
type: schema
entity: ToolLedger
version: 1
schema:
summary?: string, what tools were used and their overall outcome
tool_call?(array): string, tool name with abbreviated args summary
tool_result?(array): string, tool outcome summary (success or failure reason)
file_changed?(array): string, paths created or modified by tool calls
decision?(array): string, decisions made based on tool results
settings:
validation: warn
frontmatter:
project: string, the Basic Memory project this ledger belongs to
session_id?: string, harness session identifier
started: string, when the first tool call was recorded
ended?: string, when the last tool call was recorded
status?(enum, lifecycle of the ledger): [open, closed]
type: tool_ledger
source?: string, harness source (claude-code or codex)
idempotency_key?: string, dedup key from the producer envelope
---

# Tool Ledger

A **ToolLedger** records the sequence of tool calls and their outcomes during
an agent session. It complements a SessionNote by capturing the *mechanical*
work — which tools were invoked, what files they touched, what failed — rather
than the *narrative* summary of what happened.

ToolLedger notes are found by structured recall:
`search_notes(metadata_filters={"type": "tool_ledger"}, after_date="7d")`.

## What Goes In A ToolLedger

- **summary** — one paragraph of what the tool sequence accomplished.
- **tool_call** — each significant tool invocation with abbreviated arguments.
- **tool_result** — the outcome of each call (pass/fail/partial).
- **file_changed** — paths touched, useful for resume and conflict detection.
- **decision** — any decisions that emerged from tool results.

## When It's Written

V0 defines the schema for forward compatibility. Actual ToolLedger notes will
be produced when PostToolUse hooks become available (v1). For now, tool-level
observations can be included in SessionNote checkpoints.

Validation is `warn` so ledger creation never blocks the user's flow.
4 changes: 4 additions & 0 deletions plugins/claude-code/settings.example.json
@@ -9,6 +9,10 @@
"recallPrompt": "You have Basic Memory available for this project. Before answering recall questions (\"what did we decide\", \"where did we leave off\"), search the graph first — prefer structured filters (search_notes with type/status). When the user makes a material decision, capture it as a note with type: decision. Cite permalinks when referencing prior work.",
"preCompactCapture": "extractive",
"placementConventions": null,
"captureEvents": false,
"redactKeys": [],
"redactPaths": [],
"eventRetention": 1000,
"teamProjects": {
"my-team/notes": { "promoteFolder": "shared" }
}