AGENTS Nº 006
The Harness Above the Harnesses
Omnigent is Databricks' bid to become the orchestration layer above coding agents — one YAML spec driving Claude Code, Codex, Cursor, and Pi behind a governance engine that agents cannot talk their way around.
Two weeks old, 568,000 lines of code, and the pedigree of Spark and MLflow behind it: Omnigent is what happens when Databricks decides the interesting layer isn’t another coding agent, but the thing that governs all of them. It abstracts agent “harnesses” — Claude Code, Codex, Cursor, Pi — behind a single declarative YAML spec, wraps every action in a policy engine, and makes sessions portable across laptop, browser, and phone.
The Premise
The claim is bold category-creation: a meta-harness. Your agent spec contains zero harness-specific code; one field swaps the backend. Above it sits a governance layer that gates every action as ALLOW, DENY, or ASK and that, crucially, agents cannot weaken from the inside. Whether that claim survives contact with the code is exactly what the study tested.
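To make the "one field swaps the backend" idea concrete, here is a minimal Python sketch. It is not Omnigent's schema: the `AgentSpec` class, its field names, and the harness identifier strings are all hypothetical, chosen only to illustrate a spec whose sole harness-specific content is a single value.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AgentSpec:
    """Hypothetical stand-in for a declarative agent spec (not Omnigent's real schema)."""
    name: str
    instructions: str
    harness: str  # assumed: the one field that selects the backend


# Identifier strings below are illustrative guesses, not confirmed values.
spec = AgentSpec(
    name="reviewer",
    instructions="Review the diff and flag risky changes.",
    harness="claude-code",
)

# Swapping the backend touches exactly one field; everything else is carried over.
same_agent_on_codex = replace(spec, harness="codex")
```

The point of the sketch is the shape of the change: if the premise holds, moving an agent between harnesses should be a one-value diff.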
main advanced 70 PR numbers during the 40-minute study.
The Machine
Four cooperating layers. Clients (CLI, mobile-first web UI, native macOS app) attach to one session. A FastAPI server over Postgres — 58 API paths, real auth with argon2id, JWT, OIDC, and three-level RBAC — streams events over SSE and WebSockets. The policy engine registers twenty handlers across six phases, evaluated stricter-wins with DENY short-circuiting. And the meta-harness itself runs nineteen executor modules, either in-process via vendor SDKs or by driving the real vendor CLI through terminal emulation inside an OS sandbox, with a secretless credential proxy so child agents only ever see synthetic placeholders.
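The "stricter-wins with DENY short-circuiting" rule can be sketched in a few lines of Python. This is an illustration of the composition rule only, under stated assumptions: the `Decision` enum, the `evaluate` function, and the two example handlers are hypothetical, and the sketch ignores the six phases the real engine is described as having.

```python
from enum import IntEnum
from typing import Callable, Iterable


class Decision(IntEnum):
    """Ordered by strictness so that max() picks the stricter outcome."""
    ALLOW = 0
    ASK = 1
    DENY = 2


Handler = Callable[[dict], Decision]


def evaluate(handlers: Iterable[Handler], action: dict) -> Decision:
    """Combine handler decisions: the strictest wins, and DENY stops evaluation."""
    verdict = Decision.ALLOW
    for handler in handlers:
        decision = handler(action)
        if decision is Decision.DENY:
            return Decision.DENY  # short-circuit: later handlers are never consulted
        verdict = max(verdict, decision)
    return verdict


# Hypothetical handlers, for illustration only.
def ask_on_shell(action: dict) -> Decision:
    return Decision.ASK if action.get("tool") == "shell" else Decision.ALLOW


def deny_force_push(action: dict) -> Decision:
    return Decision.DENY if "push --force" in action.get("command", "") else Decision.ALLOW
```

Because the outcome depends on strictness rather than position, listing `ask_on_shell` before `deny_force_push` does not let the ASK mask the DENY.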
The Test Drive
Five deterministic, model-free experiments. Install: six seconds with uv, at the price of a
458 MB virtualenv — 225 MB of which is the bundled Claude Agent SDK. The policy decision suite
went 15 for 15, including failing closed on an unknown model. Engine composition went 4 for 4,
confirming that a DENY beats an ASK even when the ASK is declared first — the strongest evidence
for the governance thesis. The one blemish: version skew. The flagship examples on main fail
validation against the stable 0.2.0 release, because the README advertises harnesses the shipped
validator rejects. Tellingly, a fix for the exact skew the study surfaced landed upstream within
the hour.
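"Failing closed on an unknown model" means that a missing rule resolves to the strictest outcome rather than the most permissive one. A minimal Python sketch of that behavior, assuming a simple lookup table; the model names and the `decide_for_model` function are hypothetical and not taken from Omnigent:

```python
# Hypothetical per-model rules; names are placeholders, not real model identifiers.
MODEL_RULES = {
    "example-small": "ALLOW",
    "example-large": "ASK",
}


def decide_for_model(model: str) -> str:
    """Return the configured decision, or DENY when the model is not recognized."""
    # Failing closed: the default for anything unlisted is the strictest outcome.
    return MODEL_RULES.get(model, "DENY")
```

The design choice is in the default: a typo or a newly released model name produces a refusal that someone will notice, not a silent ALLOW.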
The Fine Print
The scan found no critical issues and no backdoors — and some genuinely strong engineering: a
fail-loud sandbox that raises rather than silently degrading, the secretless credential proxy,
and no third-party analytics at all. Two things to watch: a starlette dependency pin that blocks
the fix for six known vulnerabilities in the server path, and the fact that sandboxing is opt-in
per spec — the flagship orchestrator runs with sandbox: none, isolated instead by git worktrees
and a blast-radius policy.
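The difference between a fail-loud sandbox and a silently degrading one is easiest to see in code. The Python sketch below is an assumption-laden illustration, not Omnigent's implementation: `resolve_sandbox`, `SandboxUnavailable`, and the backend name `os-sandbox` are hypothetical; only the explicit `none` opt-out mirrors the `sandbox: none` setting mentioned above.

```python
class SandboxUnavailable(RuntimeError):
    """Raised when a requested sandbox backend cannot be provided."""


def resolve_sandbox(requested: str, available: set[str]) -> str:
    """Return the sandbox to use, raising instead of quietly falling back."""
    if requested == "none":
        # Explicit opt-out in the spec: the author chose to run unsandboxed.
        return "none"
    if requested not in available:
        # Fail loud: never downgrade to "none" behind the user's back.
        raise SandboxUnavailable(
            f"sandbox {requested!r} requested but not available on this host"
        )
    return requested
```

For example, `resolve_sandbox("os-sandbox", available=set())` raises `SandboxUnavailable`, while `resolve_sandbox("none", available=set())` returns `"none"` because the spec asked for it. Running without isolation is then always a visible decision in the spec, never a side effect of a missing dependency.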
Agents cannot silently weaken their own governance.
The Verdict
“Meta-harness” is partly marketing, but it is marketing backed by real engineering. The orchestration and policy layers are adoptable today; the cross-device server story deserves a pilot, not a standard. The dominant risk is not safety but velocity: main advanced 70 PR numbers during the 40-minute study, and what you adopt on Monday may be renamed by Friday.