Use this guide when your repo already has Codex instruction files and you want CodexOpt to improve them safely.
CodexOpt works with the same files Codex loads:
- AGENTS.md
- .codex/skills/**/SKILL.md
- .agents/skills/**/SKILL.md
Run this from the repo where you use Codex:
uv run codexopt improve
This command:
- finds AGENTS.md and SKILL.md files
- mines starter tasks from git history and skill descriptions
- runs the reflective optimizer in preview mode
- shows what would change
- writes review artifacts under .codexopt/
The default preview stays offline. It does not spend Codex or API budget unless you ask it to.
Use live mode when you want CodexOpt to evaluate actual Codex behavior:
uv run codexopt improve --live
Live mode uses codex exec as the optimizer and judge. CodexOpt evaluates the
candidate instruction file, captures feedback from the run, proposes a focused
rewrite, and keeps the rewrite only when it improves held-out tasks.
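The acceptance rule described above can be illustrated with a small sketch. This is not CodexOpt's implementation: `accept_candidate`, `keyword_score`, and the `min_gain` parameter are hypothetical names, and the toy scorer stands in for whatever signal a live run actually produces.

```python
from statistics import mean
from typing import Callable, Sequence


def accept_candidate(
    baseline: str,
    candidate: str,
    validation_tasks: Sequence[str],
    score: Callable[[str, str], float],
    min_gain: float = 0.0,
) -> bool:
    """Return True only if the candidate beats the baseline on held-out tasks."""
    if not validation_tasks:
        # Without held-out tasks there is nothing to validate against.
        return False
    baseline_score = mean(score(baseline, task) for task in validation_tasks)
    candidate_score = mean(score(candidate, task) for task in validation_tasks)
    return candidate_score > baseline_score + min_gain


def keyword_score(instructions: str, task: str) -> float:
    """Toy scorer (assumption): fraction of task words found in the instructions."""
    words = [word.strip(".,").lower() for word in task.split()]
    if not words:
        return 0.0
    text = instructions.lower()
    return sum(word in text for word in words) / len(words)


held_out = ["Update changelog entries", "Add regression tests"]
old = "Keep commits small."
new = "Keep commits small. Update changelog entries and add regression tests."
print(accept_candidate(old, new, held_out, keyword_score))
```

A strict "greater than" comparison means a rewrite that merely ties the baseline is rejected, which matches the idea of keeping a rewrite only when it improves held-out tasks.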
After reviewing the preview, apply validated changes:
uv run codexopt improve --live --apply
CodexOpt writes backups before changing files.
Write a markdown report after any run:
uv run codexopt report --output codexopt-report.md
The report shows:
- files found
- files improved
- validation score movement
- accepted reflective edits
- sampled feedback that led to the edit
- fallback notes when CodexOpt had to use a weaker signal
Use this flow when you want more control than the single improve command gives you:
uv run codexopt init
uv run codexopt scan
uv run codexopt benchmark
uv run codexopt optimize skills --engine reflective
uv run codexopt apply --kind skills --dry-run
uv run codexopt report --output codexopt-report.md
Review the dry-run diff, then apply:
uv run codexopt apply --kind skills
For AGENTS.md:
uv run codexopt optimize agents --engine reflective --file AGENTS.md
uv run codexopt apply --kind agents --dry-run
Task evidence tells CodexOpt what “better” means for your repo.
Create tasks.md:
- Update changelog entries for patch releases.
- Add regression tests before changing parser behavior.
- Summarize risky changes in the final response.
Reference it in codexopt.yaml:
evidence:
  task_files:
    - tasks.md
Then run:
uv run codexopt improve
CodexOpt uses these tasks for train and validation splits. A candidate must improve the held-out validation score before it can win.
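To make the train and validation split concrete, here is a minimal sketch of how a bullet list like tasks.md could be divided. The function names, the 34% validation fraction, and the fixed seed are assumptions for illustration; the guide does not state how CodexOpt actually splits tasks.

```python
import random


def parse_tasks(markdown: str) -> list[str]:
    """Collect the text of every '- ' bullet line in a markdown task file."""
    return [
        line[2:].strip()
        for line in markdown.splitlines()
        if line.startswith("- ")
    ]


def split_tasks(
    tasks: list[str],
    validation_fraction: float = 0.34,  # assumed ratio, not a CodexOpt default
    seed: int = 0,  # fixed seed keeps the split reproducible across runs
) -> tuple[list[str], list[str]]:
    """Return (train, validation) with at least one validation task when possible."""
    shuffled = list(tasks)
    random.Random(seed).shuffle(shuffled)
    if len(shuffled) < 2:
        return shuffled, []
    n_validation = max(1, round(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


TASKS_MD = """\
- Update changelog entries for patch releases.
- Add regression tests before changing parser behavior.
- Summarize risky changes in the final response.
"""

train, validation = split_tasks(parse_tasks(TASKS_MD))
print(len(train), len(validation))
```

Holding the validation tasks out of optimization is what makes the final score meaningful: a candidate that only fits the training tasks gains nothing on the held-out ones.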
If you do not have task evidence yet, generate a starter file:
uv run codexopt tasks init
Review the generated codexopt-tasks.json, trim anything noisy, then add it to
evidence.task_files.
Use command rollouts when a deterministic verifier can decide whether a skill supports a workflow.
Create skill-rollouts.json:
[
  {
    "name": "release-skill-smoke",
    "description": "Verify the release skill mentions changelog and tests.",
    "command": "python scripts/verify_release_skill.py",
    "timeout_seconds": 30,
    "expected_stdout_contains": "ok"
  }
]
Reference it:
evidence:
  task_files:
    - skill-rollouts.json
Run:
uv run codexopt improve
CodexOpt copies the repo to a temporary directory, writes the candidate
SKILL.md, runs the verifier, and uses pass rate as a strong reward signal.
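The rollout above points at scripts/verify_release_skill.py but does not show its contents. One possible shape for such a verifier is sketched below. The skill path `.codex/skills/release/SKILL.md` and the required terms are assumptions chosen to match the rollout description; adjust both to your repo.

```python
from pathlib import Path

# Hypothetical location of the skill under test; the "release" directory is assumed.
SKILL_PATH = Path(".codex/skills/release/SKILL.md")
# Terms the rollout description says the skill should mention.
REQUIRED_TERMS = ("changelog", "test")


def missing_terms(text: str, required: tuple[str, ...] = REQUIRED_TERMS) -> list[str]:
    """Return the required terms that do not appear in the text (case-insensitive)."""
    lowered = text.lower()
    return [term for term in required if term not in lowered]


def main(path: Path = SKILL_PATH) -> int:
    """Print 'ok' and return 0 when the skill passes, otherwise explain and return 1."""
    if not path.is_file():
        print(f"missing file: {path}")
        return 1
    missing = missing_terms(path.read_text(encoding="utf-8"))
    if missing:
        print("missing terms: " + ", ".join(missing))
        return 1
    print("ok")
    return 0
```

To use this as the rollout command, save it under the script path named in the rollout and end the file with `raise SystemExit(main())`. The failure messages deliberately avoid the substring "ok" so they cannot satisfy expected_stdout_contains by accident.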
Use Codex rollouts when you want to test how Codex behaves with a candidate skill.
Create codex-rollouts.json:
[
  {
    "name": "codex-release-notes",
    "backend": "codex",
    "description": "Ask Codex to use the candidate release skill on a release-note task.",
    "codex_prompt": "Use the local release skill to update CHANGELOG.md for a patch release.",
    "timeout_seconds": 120,
    "expected_final_response_contains": "CHANGELOG.md",
    "expected_command_contains": "git status",
    "expected_file_change": "CHANGELOG.md",
    "expected_file_contains": {
      "path": "CHANGELOG.md",
      "contains": "Patch"
    }
  }
]
Run live mode:
uv run codexopt improve --live
CodexOpt runs codex exec --json in a temporary repo copy and records the
trajectory:
- final response
- command executions
- file changes
- token usage
- errors
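As a rough illustration of how the expected_* fields in a Codex rollout could be compared against a recorded trajectory, consider the sketch below. The `Trajectory` class and its field names are hypothetical; the guide does not document CodexOpt's internal trajectory format or the event schema of codex exec --json.

```python
from dataclasses import dataclass, field


@dataclass
class Trajectory:
    """Hypothetical record of one rollout; field names are illustrative only."""
    final_response: str = ""
    commands: list[str] = field(default_factory=list)
    # Maps each changed file path to its content after the run.
    changed_files: dict[str, str] = field(default_factory=dict)


def check_expectations(rollout: dict, trajectory: Trajectory) -> dict[str, bool]:
    """Evaluate each expectation present in the rollout against the trajectory."""
    results: dict[str, bool] = {}
    if "expected_final_response_contains" in rollout:
        needle = rollout["expected_final_response_contains"]
        results["final_response"] = needle in trajectory.final_response
    if "expected_command_contains" in rollout:
        needle = rollout["expected_command_contains"]
        results["command"] = any(needle in cmd for cmd in trajectory.commands)
    if "expected_file_change" in rollout:
        results["file_change"] = rollout["expected_file_change"] in trajectory.changed_files
    if "expected_file_contains" in rollout:
        spec = rollout["expected_file_contains"]
        # Only changed files are inspected in this sketch; unchanged files fail the check.
        content = trajectory.changed_files.get(spec["path"], "")
        results["file_contains"] = spec["contains"] in content
    return results


rollout = {
    "expected_final_response_contains": "CHANGELOG.md",
    "expected_command_contains": "git status",
    "expected_file_change": "CHANGELOG.md",
    "expected_file_contains": {"path": "CHANGELOG.md", "contains": "Patch"},
}
trajectory = Trajectory(
    final_response="Updated CHANGELOG.md for the patch release.",
    commands=["git status --short", "git diff"],
    changed_files={"CHANGELOG.md": "## Patch\n- Fixed parser edge case.\n"},
)
print(check_expectations(rollout, trajectory))
```

Returning one boolean per expectation, instead of a single pass or fail, keeps enough detail to explain which part of the run fell short.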
CodexOpt now includes SkillOpt-style discipline in the Codex workflow:
- train and validation task splits
- bounded edits
- validation-gated acceptance
- rollout-based reward when available
- textual feedback that drives reflective mutation
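"Bounded edits" in the list above can be read as a limit on how much a single rewrite may change. The sketch below shows one way to measure that with Python's difflib. The line-based metric and the `max_changed_lines` budget are assumptions, not CodexOpt's documented rule.

```python
import difflib


def changed_line_count(original: str, candidate: str) -> int:
    """Count lines replaced, inserted, or deleted between two versions of a file."""
    matcher = difflib.SequenceMatcher(
        a=original.splitlines(),
        b=candidate.splitlines(),
        autojunk=False,
    )
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # A replaced span counts as the larger of its old and new lengths.
            changed += max(i2 - i1, j2 - j1)
    return changed


def within_edit_budget(original: str, candidate: str, max_changed_lines: int = 5) -> bool:
    """Return True when the candidate stays inside the allowed edit size."""
    return changed_line_count(original, candidate) <= max_changed_lines


before = "a\nb\nc"
after = "a\nB\nc\nd"
print(changed_line_count(before, after))
```

A small budget keeps each accepted rewrite easy to review in the dry-run diff and makes it simpler to attribute a score change to a specific edit.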
For most users, the entry point is still simple:
uv run codexopt improve --live