
OSS agent execution harness

Route code tasks to cheap workers without losing control.

Use one strong supervisor model as the architect, then route bounded coding work to cheap local workers with deterministic verification, policy guardrails, and operator control.

Standalone OSS, optimized for solo power users running local or mixed-model execution.

agent-harness://quickstart (local profile)
harness init --profile local
harness doctor
harness demo --list-tasks
harness demo --dry-run --task ts-isUUID

# then run one bounded task
harness run tasks/ts-lib/ts-isUUID.json \
  --repo test-repos/ts-lib
3 supported install modes: npm, Docker, source
4 first-run flows: init, doctor, demo, docs
25 CLI integration tests protecting onboarding
OSS: standalone delivery before hosted expansion

What it does

Productize agent delegation without turning it into a black box.

Agent Harness is the execution layer between a high-skill supervisor and a pool of cheaper workers. The point is not raw autonomy. The point is repeatable bounded execution with evidence, routing, verification, and recovery you can actually trust.


Bounded worker execution

Delegate narrow coding tasks to local or cheap workers while the supervisor stays focused on planning and review.


Deterministic verification

Check expected text, file scope, policy, and test commands before a task is allowed to pass.


Profile-driven first run

Generate current-schema configs with local, distributed, or semantic profiles instead of hand-assembling JSON from source code.


Operator trust surfaces

Doctor checks, reports, approvals, audit logs, and structured artifacts make failures and recoveries explainable.


Routing and provider flexibility

Mix Ollama, LM Studio, Anthropic, OpenAI, and OpenAI-compatible endpoints behind one model registry and router.
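To make the single-registry idea concrete, here is a sketch of what a mixed-provider registry could look like. The field names and model identifiers below are hypothetical illustrations, not the harness's actual schema; `harness init` generates the real, current-schema config.

```json
{
  "_note": "hypothetical shape for illustration; harness init generates the real schema",
  "models": {
    "supervisor": { "provider": "anthropic", "model": "claude-sonnet" },
    "worker-local": {
      "provider": "ollama",
      "model": "qwen2.5-coder:14b",
      "baseUrl": "http://localhost:11434"
    },
    "worker-hosted": {
      "provider": "openai-compatible",
      "baseUrl": "https://api.example.com/v1",
      "model": "small-coder"
    }
  },
  "routing": {
    "default": "worker-local",
    "escalateTo": "supervisor"
  }
}
```

The design point is that workers and the supervisor sit behind one registry, so swapping a local Ollama worker for a hosted endpoint is a config change, not a workflow change.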


Distributed when needed

Run a local single-user harness today, then move to controller plus workers without changing the product story.

How it works

Start local, add sophistication only where it buys reliability.

1

Choose an install mode

Start with npm, Docker Compose, or local-source dev mode. Those are the canonical supported paths.

2

Generate and validate config

Use `harness init --profile ...` to create a current config, then `harness doctor` to verify paths, git, and model connectivity.

3

Run a bounded task

Point the harness at a repo and one or more task definitions. A supervisor can route, verify, escalate, or require approval.
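For intuition, a bounded task definition along the lines of `tasks/ts-lib/ts-isUUID.json` would encode both the work and its verification contract. The field names below are hypothetical and shown only to illustrate the shape; consult the shipped example tasks for the real schema.

```json
{
  "_note": "hypothetical fields for illustration; see the repo's example tasks for the real schema",
  "id": "ts-isUUID",
  "prompt": "Implement isUUID(value: string): boolean in src/isUUID.ts",
  "fileScope": ["src/isUUID.ts", "test/isUUID.test.ts"],
  "verify": {
    "expectText": ["export function isUUID"],
    "testCommand": "npm test -- isUUID"
  },
  "onFail": "escalate"
}
```

A definition like this is what lets the verifier check expected text, file scope, and test commands deterministically before the task is allowed to pass.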

4

Inspect artifacts and recover

Use reports, audit logs, approvals, and demo flows to understand outcomes and recover from the common failure cases.

Supervisor agent via MCP or SDK
Worker adapters for local and hosted models
Git worktree isolation
Deterministic verifier
Policy and approvals
Reports, audit, and recovery artifacts

First-run shape

Start with `harness init`, `harness doctor`, and `harness demo`. Expand to distributed workers, semantic context, and approvals only when your workflow actually needs them.

Positioning

Not another general chat UI. A harness for execution quality.

| Capability | Agent Harness | Aider | OpenHands | Cursor |
| --- | --- | --- | --- | --- |
| Standalone OSS first | Yes | No | No | No |
| Local-worker orchestration | Yes | No | Yes | No |
| Structured approvals and policy | Yes | No | No | No |
| Profile-driven init and doctor | Yes | No | No | No |
| Distributed controller and workers | Yes | No | Yes | No |
| Soft-stable reports and CLI surface | Yes | No | No | No |

Current product path

Standalone OSS first, team operation later.

The roadmap prioritizes reliable weekly usage, benchmarkable quality, and operator trust before any hosted-control-plane story.

OSS Local

$0

Best-supported path today. Single-user standalone operation with local or mixed-model execution.

  • npm, Docker Compose, and source install paths
  • Profile-driven init and doctor
  • Task execution, verification, approvals, and reports
  • Docs and examples for local, distributed, semantic, and regulated flows
  • Open source repo and direct file-level control
Read Quickstart

Power User

Current focus

Phase 12 and 13 work: release hardening, evidence-rich reliability, and first-run trust surfaces.

  • Release workflow and compatibility policy
  • Structured recovery and evidence plans
  • Benchmark-driven routing improvements
  • Expanded examples and smoke coverage
  • Operator-first failure and recovery guidance
View Roadmap

Team-Ready Later

Planned

Shared-controller and multi-operator features are planned, but not allowed to distort the standalone product path.

  • Remote controller patterns
  • Auditable operator actions
  • Export and backup flows
  • Storage abstraction for future DB backends
  • Standalone mode remains first-class
Read Docs

FAQ

Practical answers before platform mythology.

Is Agent Harness open source?

Yes. The current product direction is standalone OSS first, with npm, Docker Compose, and source installs as the canonical supported paths.

What problem does it solve?

It gives supervisors a reliable way to delegate bounded coding work to cheaper workers while keeping verification, policy, routing, and recovery under operator control.

Do I need local models?

No, but that is the primary wedge. Agent Harness also supports hosted and OpenAI-compatible providers through the shared adapter layer.

What is the best first run?

Generate a config with `harness init --profile local`, run `harness doctor`, then inspect or run the built-in `harness demo` scenario.

Is this a hosted team product?

Not in this cycle. Team-ready operation is planned later, but the current roadmap explicitly keeps standalone local use as the primary product path.

Where should I start if I want to evaluate it quickly?

Use the install guide, quickstart, and examples docs. They map directly to the current first-run CLI and tested example configs.

Get a real first run in under 10 minutes.

Start with the tested quickstart flow, inspect one demo task, and validate the example configs before you wire it into a bigger agent stack.