nyx-coder
An autonomous coding agent that reads your repo, makes the change, runs the tests, iterates until they pass, and then opens the PR. Self-hosted, on any model, in a sandbox.
nyx-coder is built around verification. It does not call a change done because the model felt confident: it first runs your repo's own test gate, and inside Nyx it hands the result to a Reviewer that opens a real browser and checks that the change works on desktop and mobile. The edge is the verified, autonomous delivery pipeline, not the model, so you can point it at a cheap model for routine work or a frontier one when quality matters by changing a single env var.
A complete coding loop, confined and verifiable.
safe by default
Every path and shell command is confined to the task workdir, with a policy that denies destructive or network commands.
small diffs
A codex-style patch channel with fuzzy and multi-occurrence matching, dry-run, and clear conflict reports.
long runs
Folds older turns into a summary so the context window never overflows on a big build.
ships it
Diff summaries, auto-commit, and opt-in branch, push, and PR. Never pushes without your say-so.
teammate
Watches a repo's issues, fixes them through the loop, and opens a PR that closes them. Opt-in, allowlisted.
any provider
planner / worker / vision / image / embedding, each swappable by env. Bring your own key.
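As a rough illustration of the "safe by default" confinement in the list above, the sketch below resolves every requested path against the workdir and rejects anything that escapes it, then checks shell commands against a deny-list. The names `confinePath`, `isCommandAllowed`, and the `DENIED` patterns are hypothetical and not taken from nyx-coder's source. The real policy is presumably richer; it would also have to deal with symlinks, for instance.

```typescript
import * as path from "node:path";

// Hypothetical helper: resolve `requested` inside `workdir` and refuse
// any result that lands outside it (e.g. "../secrets" or "/etc/passwd").
export function confinePath(workdir: string, requested: string): string {
  const root = path.resolve(workdir);
  const target = path.resolve(root, requested);
  const rel = path.relative(root, target);
  const escapes =
    rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error(`path escapes workdir: ${requested}`);
  }
  return target;
}

// Hypothetical deny-list: a few destructive or network commands.
// An assumption for illustration, not the agent's actual policy.
const DENIED: RegExp[] = [/\brm\s+-\w*r/, /\bsudo\b/, /\bcurl\b/, /\bwget\b/];

export function isCommandAllowed(command: string): boolean {
  return !DENIED.some((pattern) => pattern.test(command));
}
```

A lexical check like this only covers paths as written. A production sandbox would also resolve symlinks (for example with `fs.realpath`) before comparing.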
You hand it a task. It works a loop, in a sandbox, and does not stop until the tests pass.
In plain terms: it reads the relevant files to understand the code, makes a small patch instead of a full-file rewrite, runs the tests, and if they fail it tries again with the failure in hand. Only when the gate is green does it commit or open a PR. Everything happens inside a confined workdir, so it cannot touch anything outside the task, and it runs on the Nyx spine, so if a step ever hangs the system skips it and moves on instead of getting stuck.
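The loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not nyx-coder's implementation: `LoopDeps`, `proposePatch`, `applyPatch`, `runGate`, and `runUntilGreen` are hypothetical names, and the step ceiling of 30 simply mirrors the documented `--max` default.

```typescript
// Hypothetical types for illustration only.
type GateResult = { passed: boolean; output: string };

interface LoopDeps {
  // Ask the model for a small patch; the last failure (if any) is fed back in.
  proposePatch(goal: string, lastFailure: string | null): Promise<string>;
  // Apply the patch inside the confined workdir.
  applyPatch(patch: string): Promise<void>;
  // Run the repo's own test gate.
  runGate(): Promise<GateResult>;
}

export async function runUntilGreen(
  goal: string,
  deps: LoopDeps,
  maxSteps = 30,
): Promise<{ ok: boolean; steps: number }> {
  let lastFailure: string | null = null;
  for (let step = 1; step <= maxSteps; step++) {
    const patch = await deps.proposePatch(goal, lastFailure);
    await deps.applyPatch(patch);
    const gate = await deps.runGate();
    if (gate.passed) {
      // Only a green gate counts as done; commit or PR would follow here.
      return { ok: true, steps: step };
    }
    lastFailure = gate.output; // retry with the failure in hand
  }
  return { ok: false, steps: maxSteps }; // halted at the step ceiling
}
```

Passing the dependencies in as an interface keeps the control flow testable without a model or a real repo, in the same spirit as the network-free test suite.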
vs most agents
Others write code and stop. nyx-coder runs your real test gate, and a browser review, before calling anything done.
vs IDE assistants
Not a pair-programmer waiting on you. Hand it a goal or an issue and walk away; it finishes and opens a PR.
vs cloud tools
Self-hosted, your machine, your models. No code leaves your box and the cost is just your API key.
vs one-model tools
Swap the model per role with an env var. Cheap for routine work, frontier when it matters.
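To make the per-role swap concrete, here is a small sketch of how a role could be mapped to a model through the environment. The four variable names are the ones this README lists in its configuration example. `resolveModel`, `modelForTier`, and the choice to throw on an unset variable are assumptions made for illustration. The embedding role is left out because its variable name is not given here.

```typescript
type Role = "planner" | "worker" | "vision" | "image";

// Env var names as listed in this README.
const ENV_KEYS: Record<Role, string> = {
  planner: "NYX_MODEL_PLANNER",
  worker: "NYX_MODEL_WORKER",
  vision: "NYX_MODEL_VISION",
  image: "NYX_MODEL_IMAGE",
};

type Env = Record<string, string | undefined>;

// Hypothetical resolver: failing loudly on a missing variable is an
// assumption, the real agent may fall back to a default instead.
export function resolveModel(role: Role, env: Env): string {
  const key = ENV_KEYS[role];
  const value = env[key]?.trim();
  if (!value) {
    throw new Error(`${key} is not set`);
  }
  return value;
}

// The CLI's --tier flag: fast = worker model, smart = planner model.
export function modelForTier(tier: "fast" | "smart", env: Env): string {
  return resolveModel(tier === "fast" ? "worker" : "planner", env);
}
```

In a Node process the `env` argument would typically be `process.env`. Taking it as a parameter keeps the function easy to test.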
The heavy compute (the model) lives in the API. The agent itself is small: about 5,000 lines, with zero runtime native dependencies.
The test suite is network-free, so it runs without an API key.
    # clone, then
    npm install
    npm test                      # 157 tests, no key needed
    cp .env.example .env          # add OPENROUTER_API_KEY
    npm run cli -- "add a sum() function in src/math.ts with a test"
There is no interactive prompt and there are no slash-commands like /clear or /model. nyx-coder is non-interactive on purpose: you drive it in one of two ways, a one-shot CLI or a small HTTP surface, and you pick the models by env var, not by command.
    # one-shot CLI: a task in, verified code out
    npm run cli -- "task description" --workdir . --tier fast --max 30
    #   --workdir   confine every file and command to this dir (default: .)
    #   --tier      fast = worker model, smart = planner model (default: smart)
    #   --max       step ceiling before it halts (default: 30)

    npm test             # run the 157-test gate (no key needed)
    npm run typecheck    # tsc --noEmit

    # or drive it over HTTP (src/server.ts, bound on 127.0.0.1)
    POST /runs   {"goal":"...","workdir":"."}    # start a run
    GET  /runs                                   # list runs
    GET  /runs/:id                               # one run plus its result
    GET  /runs/:id/events                        # its full event trace
    GET  /health                                 # { ok: true }

    # choose the model per role, any OpenAI-compatible provider
    NYX_MODEL_PLANNER=deepseek/deepseek-v4-pro        # high-level reasoning
    NYX_MODEL_WORKER=deepseek/deepseek-v4-flash       # routine, low-level steps
    NYX_MODEL_VISION=qwen/qwen-2.5-vl-72b-instruct    # screenshots, design refs
    NYX_MODEL_IMAGE=black-forest-labs/flux-1.1-pro    # image generation
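For callers that prefer the HTTP surface, the following sketch shows one way a client might start a run and read its event trace. The routes and the request body come from the listing above. The helper names `startRun` and `getRunEvents`, the `FetchLike` type, and the treatment of responses as opaque JSON are assumptions, since the response shapes are not documented here.

```typescript
// Minimal shape of the fetch function this sketch needs, so it can be
// exercised with a stub. Node 18+'s global fetch satisfies it.
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// POST /runs with {"goal": ..., "workdir": ...} to start a run.
export async function startRun(
  fetchImpl: FetchLike,
  baseUrl: string,
  goal: string,
  workdir = ".",
): Promise<unknown> {
  const res = await fetchImpl(`${baseUrl}/runs`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ goal, workdir }),
  });
  if (!res.ok) throw new Error(`POST /runs failed with status ${res.status}`);
  return res.json(); // response shape is not documented here, so left as unknown
}

// GET /runs/:id/events to fetch the full event trace of one run.
export async function getRunEvents(
  fetchImpl: FetchLike,
  baseUrl: string,
  runId: string,
): Promise<unknown> {
  const res = await fetchImpl(
    `${baseUrl}/runs/${encodeURIComponent(runId)}/events`,
  );
  if (!res.ok) throw new Error(`GET events failed with status ${res.status}`);
  return res.json();
}
```

With Node 18 or later, the built-in global `fetch` can be passed as `fetchImpl`. The server binds to 127.0.0.1, but the port is not stated here, so the base URL is left to the caller.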
Why no REPL? nyx-coder is built to be driven by Nyx, not typed at. The orchestrator hands it a goal and walks away, so the surface is a single call that runs to a verified result. A human-facing interactive mode with slash-commands is a clean future add, not something it needs to do its job.
nyx-coder is the coding agent inside Nyx, a self-hostable autonomous-agent system: an orchestrator that runs a team of doers (coder, researcher, reviewer) on a self-healing spine that cannot wedge. The assistant routes and aggregates; nyx-coder does the building.