feat(tasks): capability-driven tasks — declarative steps, non-coding workflows, migrate existing task types #248

@krokoko

Description

Component

Agent (Python runtime), API or orchestration, Scripts / CLI

Describe the feature

Introduce capability-driven tasks: instead of hardcoding workflows in Python (pipeline.py branches on task_type) and a fixed enum (new_task | pr_iteration | pr_review), the platform loads a capability file that declares the ordered steps to run inside the agent container — prompts, tool profiles, pre/post hooks, hydration requirements, and terminal outcomes.

Existing coding workflows become capability files too: new_task, pr_iteration, and pr_review are the first shipped capabilities, not special cases in core code. New domains (research, document drafting, data analysis, email triage, etc.) are new capability files, not new orchestrator branches.

Inspired by the AKW vision in #99 (blueprint-driven, task-mode-agnostic agent loop) but scoped to current ABCA architecture: reuse durable orchestration, AgentCore isolation, Cedar HITL, Guardrails, and the agent asset registry (#246) as the home for published capability artifacts.

Use case

  • Non-coding tasks: Submit work with no GitHub repo — e.g. "summarise these papers", "draft a design doc from attachments" — without forcing clone/PR scaffolding.
  • Composable workflows: Operators and blueprint authors define steps declaratively (hydrate → agent run → verify → deliver) instead of patching pipeline.py for every new workflow.
  • One platform, many domains: Coding (PR open/review/iterate) and knowledge work share admission, memory, policy, cost limits, and observability — only the capability file differs.
  • Gradual migration: Shipped task types keep working via capability files that encode today's behavior; callers can migrate from task_type enum to capability_ref over time.

Proposed solution

Capability file (conceptual schema)

A versioned document (YAML or JSON — format TBD in design PR) describing at minimum:

Field              Purpose
-----------------  ----------------------------------------------------------------------
id, version        Stable identity; pinned via registry (#246) or repo-local path
domain             e.g. coding | knowledge | hybrid (drives admission defaults)
requires_repo      Whether GitHub clone / PR finalization is mandatory
steps              Ordered pipeline phases executed in the container
tool_profile       Allowed tools / MCP servers / Cedar policy module refs
prompt             System prompt fragment or reference to registry prompt asset
hydration          Which context sources to assemble (issue, PR, memory, attachments, URLs)
post_hooks         e.g. verify_build, verify_lint, ensure_pr, custom deliverables
terminal_outcomes  What "done" means (PR URL, review JSON, artifact upload, comment)
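As a rough illustration of the fields above, a loaded capability file could be validated into a typed object along these lines. This is only a sketch: the field types, the `parse_capability` helper, the required-field set, and the `PR_REVIEW_DOC` values are all assumptions made for the example, since the real schema and file format are deferred to the design PR.

```python
from dataclasses import dataclass
from typing import Any

# Assumed domain values, taken from the "e.g." in the field table.
ALLOWED_DOMAINS = {"coding", "knowledge", "hybrid"}
REQUIRED_FIELDS = ("id", "version", "domain", "requires_repo", "steps")
LIST_FIELDS = ("steps", "tool_profile", "hydration", "post_hooks", "terminal_outcomes")


@dataclass(frozen=True)
class Capability:
    """In-memory form of a capability file (field types are illustrative)."""
    id: str
    version: str
    domain: str
    requires_repo: bool
    steps: tuple[str, ...]
    tool_profile: tuple[str, ...] = ()
    prompt: str = ""
    hydration: tuple[str, ...] = ()
    post_hooks: tuple[str, ...] = ()
    terminal_outcomes: tuple[str, ...] = ()


def parse_capability(doc: dict[str, Any]) -> Capability:
    """Validate an already-decoded YAML/JSON document and build a Capability."""
    missing = [name for name in REQUIRED_FIELDS if name not in doc]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    if doc["domain"] not in ALLOWED_DOMAINS:
        raise ValueError(f"unknown domain: {doc['domain']!r}")
    if not isinstance(doc["requires_repo"], bool):
        raise ValueError("requires_repo must be true or false")
    if not doc["steps"]:
        raise ValueError("steps must list at least one step")
    scalars = {k: doc[k] for k in ("id", "version", "domain", "requires_repo", "prompt") if k in doc}
    lists = {k: tuple(doc[k]) for k in LIST_FIELDS if k in doc}
    return Capability(**scalars, **lists)


# Hypothetical content for the pr_review capability; the values are placeholders.
PR_REVIEW_DOC = {
    "id": "pr_review",
    "version": "1.0.0",
    "domain": "coding",
    "requires_repo": True,
    "steps": ["clone_repo", "hydrate_context", "run_agent", "post_review"],
    "hydration": ["pr", "memory"],
    "terminal_outcomes": ["review_json"],
}

capability = parse_capability(PR_REVIEW_DOC)
```

Freezing the dataclass keeps the resolved capability immutable once admission has accepted it, which may help when the same bundle is passed to the compute session and recorded for audit.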

Example step kinds (extensible):

  • clone_repo / skip_repo — coding vs knowledge path
  • hydrate_context — shared hydration contract
  • run_agent — Claude Agent SDK loop with capability prompt + tools
  • verify_build / verify_lint — optional quality gates
  • ensure_pr / post_review / deliver_artifact — domain-specific completion
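One possible shape for a runner over these step kinds is a lookup table from step name to handler, executed in order. The sketch below is an assumption-laden illustration, not the proposed implementation: `StepContext`, `STEP_HANDLERS`, `run_steps`, the `COMPLETED` status string, and the error dictionary layout are all hypothetical, and the handlers only record that they ran.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class StepContext:
    """Hypothetical per-task state shared between steps."""
    task_id: str
    build_ok: bool = True
    log: list[str] = field(default_factory=list)


StepHandler = Callable[[StepContext], None]
STEP_HANDLERS: dict[str, StepHandler] = {}


def _recording_handler(kind: str) -> StepHandler:
    """Placeholder handler that only records that the step ran."""
    def handler(ctx: StepContext) -> None:
        ctx.log.append(kind)
    return handler


for _kind in ("clone_repo", "skip_repo", "hydrate_context", "run_agent",
              "ensure_pr", "post_review", "deliver_artifact"):
    STEP_HANDLERS[_kind] = _recording_handler(_kind)


def verify_build(ctx: StepContext) -> None:
    """Quality gate: raise to stop the pipeline when the build is broken."""
    if not ctx.build_ok:
        raise RuntimeError("build failed")
    ctx.log.append("verify_build")


STEP_HANDLERS["verify_build"] = verify_build


def run_steps(steps: list[str], ctx: StepContext) -> dict:
    """Run steps in declared order; stop at the first failure."""
    # Reject unknown step kinds before running anything, so a typo in the
    # capability file cannot leave the task half-executed.
    unknown = [kind for kind in steps if kind not in STEP_HANDLERS]
    if unknown:
        return {"status": "FAILED",
                "error": {"step": unknown[0], "reason": "unknown step kind"}}
    for kind in steps:
        try:
            STEP_HANDLERS[kind](ctx)
        except Exception as exc:
            return {"status": "FAILED", "error": {"step": kind, "reason": str(exc)}}
    return {"status": "COMPLETED", "error": None}


knowledge_ctx = StepContext(task_id="t-1")
knowledge_result = run_steps(
    ["skip_repo", "hydrate_context", "run_agent", "deliver_artifact"], knowledge_ctx
)
```

With a table like this, adding a new step kind means registering one more handler rather than adding a branch to the pipeline.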

Platform changes (high level)

  1. Create-task API: Accept capability_ref (registry ID + semver constraint) or legacy task_type (maps to built-in capability for backward compatibility).
  2. Orchestrator: Resolve capability at admission; skip repo pre-flight when requires_repo: false; pass resolved capability bundle to compute session.
  3. Agent pipeline: Replace if task_type == "pr_review" branches with a step runner that interprets the capability file's steps list (deterministic phases + agent invocation).
  4. Shipped capabilities: Publish new_task, pr_iteration, pr_review as first-party capability files — behavior parity with today is an acceptance criterion.
  5. Example non-coding capability: One reference capability (e.g. document_draft or web_research) proving repo-optional execution end-to-end.
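Items 1 and 2 above (accepting either capability_ref or legacy task_type, then deciding whether repo pre-flight runs) might look roughly like this at admission time. Everything named here is hypothetical: `admit`, `Admission`, and the in-memory `CATALOG` standing in for the registry (#246) are illustrative only. Rejecting requests that set both fields is an assumption the issue does not make, and semver constraint matching is left out of the sketch.

```python
from dataclasses import dataclass
from typing import Optional

# Legacy enum values map one-to-one onto the built-in capabilities.
LEGACY_TASK_TYPES = {
    "new_task": "new_task",
    "pr_iteration": "pr_iteration",
    "pr_review": "pr_review",
}

# Hypothetical stand-in for the registry: capability id -> requires_repo.
CATALOG = {
    "new_task": True,
    "pr_iteration": True,
    "pr_review": True,
    "document_draft": False,
}


@dataclass(frozen=True)
class Admission:
    capability_id: str
    requires_repo: bool
    run_repo_preflight: bool
    deprecated_alias_used: bool


def admit(capability_ref: Optional[str] = None,
          task_type: Optional[str] = None,
          repo: Optional[str] = None) -> Admission:
    """Resolve the capability for a create-task request and plan pre-flight."""
    # Assumption: exactly one of the two selectors must be present.
    if (capability_ref is None) == (task_type is None):
        raise ValueError("provide exactly one of capability_ref or task_type")
    if task_type is not None:
        if task_type not in LEGACY_TASK_TYPES:
            raise ValueError(f"unknown task_type: {task_type!r}")
        capability_id = LEGACY_TASK_TYPES[task_type]
    else:
        # Only the id part is used here; the "@constraint" suffix is ignored.
        capability_id = capability_ref.partition("@")[0]
    if capability_id not in CATALOG:
        raise LookupError(f"unknown capability: {capability_id!r}")
    requires_repo = CATALOG[capability_id]
    if requires_repo and repo is None:
        raise ValueError(f"capability {capability_id!r} requires a repo")
    return Admission(
        capability_id=capability_id,
        requires_repo=requires_repo,
        run_repo_preflight=requires_repo,  # skipped when requires_repo is false
        deprecated_alias_used=task_type is not None,
    )


legacy = admit(task_type="pr_review", repo="example-org/example-repo")
knowledge = admit(capability_ref="document_draft@^1.0.0")
```

Tracking `deprecated_alias_used` separately would let the API keep serving legacy callers while still measuring how much traffic has migrated.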

Phasing

Phase  Deliverable
-----  ---------------------------------------------------------------------------
0      Capability schema + step runner skeleton; design doc
1      Migrate new_task to capability file; legacy task_type alias
2      Migrate pr_iteration, pr_review; parity tests
3      Repo-optional knowledge capability + CLI/API capability_ref submit path
4      Registry-native capabilities (#246); inline/repo-local capabilities for dev

Acceptance criteria

  • Capability file schema documented with step types and validation rules.
  • Step runner executes a capability's steps in order inside the container; failures surface as today (terminal FAILED with structured error).
  • new_task, pr_iteration, pr_review each implemented as capability files with no behavioral regression vs current shipped paths (covered by existing tests, plus the CDK integ-tests from #236 when available).
  • At least one non-coding reference capability runs end-to-end without repo (attachments + description sufficient).
  • POST /v1/tasks accepts capability_ref; legacy task_type continues to work (deprecated alias mapping documented).
  • CLI: bgagent submit --capability <id>@<constraint> ... (or equivalent); help text explains migration from --task-type.
  • Resolved capability id/version recorded on task metadata for audit and eval.
  • Types synced: cdk/src/handlers/shared/types.ts and cli/src/types.ts.
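To make the `<id>@<constraint>` form and the "resolved id/version recorded on task metadata" criterion more concrete, here is a minimal sketch of one way to resolve a reference against published versions. The issue does not fix the constraint grammar, so this assumes only two forms: an exact `MAJOR.MINOR.PATCH` version, and a simplified caret range (`^1.1.0` meaning same major, at least that version). The function names and metadata keys are hypothetical, and pre-release tags and npm's special-casing of `^0.x` are deliberately ignored.

```python
def parse_capability_ref(ref: str) -> tuple[str, str]:
    """Split '<id>@<constraint>' into its two parts."""
    cap_id, sep, constraint = ref.partition("@")
    if not cap_id or not sep or not constraint:
        raise ValueError(f"expected <id>@<constraint>, got {ref!r}")
    return cap_id, constraint


def parse_version(text: str) -> tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into a comparable tuple."""
    parts = text.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {text!r}")
    return int(parts[0]), int(parts[1]), int(parts[2])


def matches(version: tuple[int, int, int], constraint: str) -> bool:
    """Simplified matching: '^X.Y.Z' (same major, >= X.Y.Z) or an exact version."""
    if constraint.startswith("^"):
        floor = parse_version(constraint[1:])
        return version[0] == floor[0] and version >= floor
    return version == parse_version(constraint)


def resolve(ref: str, published: dict[str, list[str]]) -> dict[str, str]:
    """Pick the highest published version that satisfies the constraint."""
    cap_id, constraint = parse_capability_ref(ref)
    candidates = [parse_version(v) for v in published.get(cap_id, [])]
    matching = [v for v in candidates if matches(v, constraint)]
    if not matching:
        raise LookupError(f"no published version of {cap_id!r} satisfies {constraint!r}")
    best = max(matching)
    # Hypothetical metadata keys recorded on the task for audit and eval.
    return {
        "capability_id": cap_id,
        "capability_version": ".".join(str(n) for n in best),
        "capability_constraint": constraint,
    }


# Hypothetical published versions for the web_research reference capability.
PUBLISHED = {"web_research": ["1.0.0", "1.1.0", "1.4.2", "2.0.0"]}
task_metadata = resolve("web_research@^1.1.0", PUBLISHED)
```

Recording the original constraint alongside the resolved version means an audit can distinguish "the caller pinned this version" from "the platform picked it".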

Other information

  • Inspired by: #99 (AKW integration — task_mode, blueprint registry, knowledge tasks). This issue is the focused, incremental track aligned with current codebase — not the full AKW merge (Mem0, ToolBuilderAgent, Trust graduation, etc. remain out of scope or separate issues).
  • Depends on / pairs with: #246 (registry stores published capabilities); #245 (attribution on resolved capability); #236 (E2E verification).
  • Roadmap alignment: Tool capability tiers (extended tool surface per repo); Agent asset registry capability descriptors.
  • Out of scope for this issue:

Acknowledgements

  • I may be able to implement this feature
  • This might be a breaking change

Metadata

Labels

  • P1: Priority 1, high priority
  • agent-runtime: Python agent container (pipeline, runner, hooks, prompts, tools, Dockerfile)
  • approved: When an issue has been approved and ready
  • enhancement: New feature or request
  • orchestration: Task lifecycle, REST API handlers, orchestrator Lambdas, durable execution
  • registry: Agent asset registry (capabilities, skills, plugins, MCP servers, blueprints)
