A cross-platform AI coding-agent plugin — 39 skills & 43 agents implementing a "compound engineering" workflow — shipped with a TypeScript converter CLI that retargets it to 10+ agent platforms from one source.
| Field | Details |
|---|---|
| Creators | Every Inc. (every.to) — maintained by Kieran Klaassen (@kieranklaassen) & Trevin Chow (@tmchow, ~60% of commits). Productizes Every's internal "how we code with agents" methodology. |
| Language | TypeScript on the Bun runtime (CLI). Product content is Markdown + YAML frontmatter; co-located Python/Bash skill scripts. |
| Stats | 21,169 ★ · 1,557 forks · 95 open issues · created Oct 2025 · actively pushed (daily). |
| Lines of code | CLI: ~8,900 LOC TypeScript across ~46 src/ files + ~55 test files (1,678 tests). Plugin: 39 skill dirs, 43 agent definitions, ~190 reference/asset files. |
| License | MIT — permissive, commercial-friendly, no copyleft obligations. |
| Security | Low risk. 2 runtime deps · no eval/secrets · hardened path & symlink handling. (full report → slide 11) |
The name says "plugin," but the repository ships two tightly-coupled artifacts. Understanding the split is the key to the whole codebase.
plugins/compound-engineering/ — 39 user-invoked Skills (slash commands) and 43 dispatched Agents (subagents) that encode an opinionated engineering loop: plan deeply, review rigorously, and codify every lesson so the next task is easier.
src/ — a Bun/TypeScript compiler that parses the Claude-format plugin once and emits native bundles for Codex, OpenCode, Gemini, Pi, Kiro, Copilot, Droid & Qwen. Authored once, runs everywhere.
Thesis: The plugin is the payload; the CLI is the distribution layer that frees the payload from any single vendor.
Like a transpiler: one parser, an intermediate representation (the Bundle), and a writer per target. Adding a platform = one converter + one writer, nothing else touched.
parsers/claude.ts reads the manifest, agents, skills, hooks & MCP into a typed ClaudePlugin.
converters/claude-to-* map tools, models, hooks & permissions into a per-target Bundle (in-memory IR).
targets/*.ts emit each Bundle to the platform's real paths with merge semantics.
Tools, permissions, hook events & model names are mapped by lookup tables, never by convention. targets/index.ts is the registry that drives --to and --also.
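The parser, converter and writer split can be pictured with a minimal sketch. Every type and table entry here is a simplified, hypothetical stand-in (the real ClaudePlugin and Bundle are richer), and the writer only returns the paths it would write rather than touching disk.

```typescript
// Hypothetical, simplified shapes; not the repository's actual types.
interface ClaudePlugin {
  agents: { name: string; tools: string[] }[];
}

interface Bundle {
  files: { path: string; content: string }[];
}

interface Target {
  convert(plugin: ClaudePlugin): Bundle; // plugin -> in-memory IR
  write(bundle: Bundle, outDir: string): string[]; // IR -> output paths
}

// Tool names go through an explicit lookup table (assumed values).
const OPENCODE_TOOLS = new Map<string, string>([
  ["Read", "read"],
  ["Bash", "bash"],
]);

// Registry keyed by target name: adding a platform means adding one entry.
const targets = new Map<string, Target>([
  [
    "opencode",
    {
      convert: (plugin) => ({
        files: plugin.agents.map((agent) => ({
          path: `agents/${agent.name}.md`,
          // Tools with no table entry are dropped instead of guessed.
          content: agent.tools
            .map((tool) => OPENCODE_TOOLS.get(tool))
            .filter((tool): tool is string => tool !== undefined)
            .join(", "),
        })),
      }),
      write: (bundle, outDir) => bundle.files.map((file) => `${outDir}/${file.path}`),
    },
  ],
]);

function convertTo(targetName: string, plugin: ClaudePlugin, outDir: string): string[] {
  const target = targets.get(targetName);
  if (target === undefined) throw new Error(`Unknown target: ${targetName}`);
  return target.write(target.convert(plugin), outDir);
}
```

In this sketch a new platform never touches the parser or the other targets, which is the property the transpiler analogy is pointing at.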
Installs track an install-manifest; removed files move to legacy-backup/<ts>/. opencode.json is deep-merged with a .bak, never clobbered.
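A deep merge of this kind can be sketched as below. The conflict policy is an assumption, not something the source states: incoming values win on a key collision, arrays are replaced whole, and keys present only in the existing config are always preserved.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

function isPlainObject(value: Json): value is { [key: string]: Json } {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively merges `incoming` into `existing` without mutating either input.
function deepMerge(existing: Json, incoming: Json): Json {
  if (!isPlainObject(existing) || !isPlainObject(incoming)) return incoming;
  const merged: { [key: string]: Json } = { ...existing };
  for (const [key, value] of Object.entries(incoming)) {
    const current = existing[key];
    merged[key] = current === undefined ? value : deepMerge(current, value);
  }
  return merged;
}
```

Because the function returns a new object, the caller can still write the untouched original out as the .bak copy before saving the merged result.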
ASCII path sanitization, traversal guards, and symlink-ownership checks (only unlink CE-managed links) are enforced by dedicated test suites.
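As an illustration of the first two guards, here is a small sketch using plain string handling. The function names and the exact slug rule (lowercase letters, digits, hyphens) are assumptions, and symlink-ownership checks are left out because they need filesystem access.

```typescript
// Reduce a name to a conservative ASCII slug (assumed rule, not the repo's exact one).
function sanitizeName(name: string): string {
  return name
    .normalize("NFKD") // split accented letters into base letter + combining mark
    .replace(/[^\x00-\x7F]/g, "") // drop everything outside ASCII
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Resolve a relative POSIX-style path under `root`, rejecting anything that escapes it.
function resolveInside(root: string, relative: string): string {
  if (relative.startsWith("/")) throw new Error(`Absolute path rejected: ${relative}`);
  const parts: string[] = [];
  for (const segment of relative.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      if (parts.length === 0) throw new Error(`Path escapes root: ${relative}`);
      parts.pop();
    } else {
      parts.push(segment);
    }
  }
  return [root.replace(/\/+$/, ""), ...parts].join("/");
}
```

A `..` segment is allowed only while it stays inside the root, so `skills/../agents/a.md` resolves normally while `../etc/passwd` throws.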
"80% planning & review, 20% execution." Each skill hands a durable artifact to the next; the cycle closes by writing the lesson back into the repo.
brainstorm defines what to build → plan defines how → work executes. Each reads the prior artifact; none is required, but each sharpens the next.
/ce-product-pulse reports what users actually experienced over a window, feeding real signal back into the next strategy and brainstorm.
Supporting skills: ce-debug · ce-doc-review · ce-simplify-code · ce-commit · ce-resolve-pr-feedback · ce-setup · ce-dhh-rails-style · ce-frontend-design · ce-gemini-imagegen · …
/ce-code-review is the showcase: it spawns parallel reviewer sub-agents that return structured JSON, then merges, dedups & gates them. Five reusable primitives make it work:
14 single-lens reviewers. 4 always-on (correctness, testing, maintainability, standards) + conditionals (security, performance, migrations…) selected by agent judgment of the diff, not keyword match.
Findings self-score on a discrete 0/25/50/75/100 scale, each tied to a behavioral test. Default gate suppresses <75 — killing "confident false positives."
Every finding is tagged gated_auto / manual / advisory — encoding how safely a fix may be applied before any code is touched.
Semantic tiers — extraction / generation / ceiling — named per agent so model IDs never hardcode. High-stakes personas inherit the ceiling model; scouts run cheap.
Bulky JSON is written to /tmp/…/run-id/; the orchestrator carries only a compact gist, loading detail from disk only when validating. Scales to many agents without context blowup.
Conditional logic lives in references/ and is pulled in on demand, keeping SKILL.md lean at session-load time — progressive disclosure for prompts.
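The merge, dedup and gate step described above could look roughly like this. Only the confidence scale, the default gate of 75 and the three autofix classes come from the text; the Finding fields and the dedup key are hypothetical.

```typescript
type Confidence = 0 | 25 | 50 | 75 | 100;
type AutofixClass = "gated_auto" | "manual" | "advisory";

// Hypothetical finding shape for illustration.
interface Finding {
  reviewer: string;
  file: string;
  line: number;
  title: string;
  confidence: Confidence;
  autofix: AutofixClass;
}

// Merge findings from all reviewers, keep the highest-confidence copy of each
// duplicate, then drop anything below the gate.
function mergeFindings(perReviewer: Finding[][], gate: Confidence = 75): Finding[] {
  const best = new Map<string, Finding>();
  for (const findings of perReviewer) {
    for (const finding of findings) {
      // Assumed dedup key: same location and same normalized title.
      const key = `${finding.file}:${finding.line}:${finding.title.trim().toLowerCase()}`;
      const current = best.get(key);
      if (current === undefined || finding.confidence > current.confidence) {
        best.set(key, finding);
      }
    }
  }
  return Array.from(best.values()).filter((finding) => finding.confidence >= gate);
}
```

A discrete scale keeps the gate a simple comparison: a finding either clears 75 or is suppressed, with no fuzzy middle to tune.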
The differentiator isn't the agents — it's the memory loop. /ce-compound turns every solved problem into a queryable Learning in docs/solutions/ with structured YAML frontmatter.
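Structured frontmatter is what makes a Learning queryable: a lookup can rank documents on metadata alone and open only the best candidates. The sketch below assumes a frontmatter schema (tags, component) and a scoring rule that are purely illustrative; the source does not list the real fields.

```typescript
// Hypothetical frontmatter fields for a Learning in docs/solutions/.
interface LearningMeta {
  path: string;
  title: string;
  tags: string[];
  component: string;
}

// Score a document from frontmatter only, so no body is read in the first pass.
function scoreLearning(meta: LearningMeta, queryTags: string[], component?: string): number {
  const tagHits = meta.tags.filter((tag) => queryTags.includes(tag)).length;
  const componentHit = component !== undefined && meta.component === component ? 1 : 0;
  return tagHits + 2 * componentHit; // assumed weighting
}

// Return only strong matches, best first; the caller reads just these files in full.
function strongMatches(
  all: LearningMeta[],
  queryTags: string[],
  component?: string,
  minScore = 2,
): LearningMeta[] {
  return all
    .map((meta) => ({ meta, score: scoreLearning(meta, queryTags, component) }))
    .filter((entry) => entry.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .map((entry) => entry.meta);
}
```

The cost of the first pass grows with the number of frontmatter blocks rather than with total document length, which is why the approach holds up as the collection grows.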
ce-learnings-researcher filters by frontmatter, then reads only strong matches, so it scales to hundreds of docs.
/ce-compound-refresh classifies stale docs as Keep / Update / Consolidate / Replace / Delete against the live code.
… ce-learnings-researcher, so past lessons resurface inside future work.
Conversion is genuine semantic remapping. Capabilities degrade gracefully where a target lacks a primitive (e.g. among conversion targets, hooks survive only on OpenCode).
| Target | Agents | Skills | Hooks | MCP / Perms |
|---|---|---|---|---|
| Claude Code | native | native | full | full |
| OpenCode | .md + inferred temperature | copied dirs | → TS plugin | merged into opencode.json |
| Codex | .toml custom agents | native + Bun step | suppressed | suppressed (ADR) |
| Gemini CLI | .md | copied dirs | suppressed | mcp.json |
| Pi / Kiro | .md / steering YAML | copied dirs | n/a | mcporter.json / mcp.json |
| Copilot · Droid · Qwen | native (Claude-compatible) | native | suppressed | config |
Model mapping: "model: sonnet" → anthropic/claude-sonnet-4-6 for provider-prefixed targets; subagents drop the field and inherit the session model.
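A table-driven version of that rule might look like the sketch below. Only the sonnet entry is taken from the text; the function name, the option flags and the choice to throw on an unmapped alias are assumptions.

```typescript
// Lookup table for provider-prefixed targets. Only the sonnet row comes from the source.
const MODEL_ALIASES = new Map<string, string>([
  ["sonnet", "anthropic/claude-sonnet-4-6"],
]);

interface AgentFrontmatter {
  name: string;
  model?: string;
}

interface MappingOptions {
  providerPrefixed: boolean; // target expects "provider/model" identifiers
  isSubagent: boolean; // subagents inherit the session model
}

// Returns the model string to emit, or undefined when the field should be omitted.
function mapModel(agent: AgentFrontmatter, options: MappingOptions): string | undefined {
  if (options.isSubagent || agent.model === undefined) return undefined;
  if (!options.providerPrefixed) return agent.model;
  const mapped = MODEL_ALIASES.get(agent.model);
  // Assumed policy: fail loudly rather than emit a guessed identifier.
  if (mapped === undefined) throw new Error(`No model mapping for "${agent.model}"`);
  return mapped;
}
```

Keeping the aliases in one table means a model rename is a one-line change, at the cost of having to remember to make it.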
bun run … convert --to {opencode,gemini,codex} all succeeded:
OpenCode → 43 agents · 38 skills · merged config
Codex → 43 agents · native skill tree
Gemini → 43 agents · 189 skill files
(38 vs 39 skills: one is excluded by ce_platforms filtering — exactly as designed.)
The converter infers a low temperature for review personas — not a passthrough copy.
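The skill-count difference noted above comes from platform filtering, which can be sketched in a few lines. The ce_platforms field name is from the source; its semantics here (absent means ship everywhere, present means allowlist) and the sample skill names are assumptions.

```typescript
// Hypothetical skill metadata; only the ce_platforms field name comes from the source.
interface SkillMeta {
  name: string;
  ce_platforms?: string[];
}

// Assumed semantics: no ce_platforms means "ship everywhere"; otherwise it is an allowlist.
function skillsFor(target: string, skills: SkillMeta[]): SkillMeta[] {
  return skills.filter(
    (skill) => skill.ce_platforms === undefined || skill.ce_platforms.includes(target),
  );
}

// Example with made-up skills: the second one is restricted to a single platform.
const sampleSkills: SkillMeta[] = [
  { name: "ce-example-a" },
  { name: "ce-example-b", ce_platforms: ["claude"] },
];
const forOpenCode = skillsFor("opencode", sampleSkills).map((skill) => skill.name);
```

Under these assumptions a converted target can legitimately report one skill fewer than the source plugin without anything having failed.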
Every skill is a portable directory — no cross-skill imports. That single constraint is what makes the converter possible at all.
Markdown skills are treated as source code: linted by contract tests (ce- prefix, shell safety, relative paths) and versioned via semantic-release.
When a platform can't pick models per-agent, cost falls back to read budgets + output caps — graceful degradation baked into the design.
Skills give an intelligent agent hard rules + judgment room — deliberately under-prescribed, the opposite of brittle scripted flows.
31 real Learnings already in-repo, including one about the team's own release-version drift. The methodology debugged itself.
Agents return a compact gist + a disk dossier — the pattern that lets one orchestrator fan out to a dozen agents without drowning in context.
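The gist-plus-dossier pattern can be sketched as follows. The shapes and names are hypothetical, and an in-memory Map stands in for the run directory so the example stays self-contained; a real implementation would write JSON files to disk.

```typescript
// Hypothetical shapes for illustration.
interface Dossier {
  agent: string;
  findings: { title: string; detail: string }[];
}

interface Gist {
  agent: string;
  findingCount: number;
  titles: string[];
  dossierKey: string; // where the full detail lives
}

// Stand-in for the per-run directory on disk.
class DossierStore {
  private readonly files = new Map<string, string>();

  // Persist the bulky payload and hand back only a compact summary.
  save(runId: string, dossier: Dossier): Gist {
    const dossierKey = `${runId}/${dossier.agent}.json`;
    this.files.set(dossierKey, JSON.stringify(dossier));
    return {
      agent: dossier.agent,
      findingCount: dossier.findings.length,
      titles: dossier.findings.map((finding) => finding.title),
      dossierKey,
    };
  }

  // Load full detail only when the orchestrator needs to validate a finding.
  load(gist: Gist): Dossier {
    const raw = this.files.get(gist.dossierKey);
    if (raw === undefined) throw new Error(`Missing dossier: ${gist.dossierKey}`);
    return JSON.parse(raw) as Dossier;
  }
}
```

The orchestrator's context then grows with the number of gists, not with the size of each agent's output.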
2 runtime deps (citty, js-yaml): tiny supply-chain surface, no postinstall.
Bun.spawn only for git with array args (no shell injection).
secrets.ts is itself a detector that warns on MCP env vars.
… (claude-opus-4-6…) are a maintenance burden, updated by hand.
Yes: as a reference architecture, and as a daily driver if you live on a supported platform.
Bottom line: A production-grade, defensively-engineered codebase whose prompt-as-source-code discipline is more instructive than most "agent framework" repos with 10× the code. The methodology's payoff is plausible and well-architected; verify it on your own work before betting the team on it.