Onyx is designed to be agent-first. The CLI handles credentials, local state, eval execution, and sync; the agent uses those primitives to run the research loop.

Start a New Research Direction

Use /onyx from an agent that has the Onyx skill installed:
/onyx Tune my PID controller gains, minimize tracking error
The agent should turn that request into a branch, a repo-side research brief, an eval script, a baseline, and then a loop of measured experiments.

Prompt Shape

Short prompts work, but prompts that name the metric, its direction, the files in scope, and any constraints help the agent set up better evals. For example, a short prompt:
/onyx Tune my PID controller gains, minimize error

What the Agent Does

1. Clarifies the setup
   The agent asks for any missing goal, metric, direction, files in scope, constraints, or stop conditions.

2. Creates the branch
   It creates or resumes an append-only onyx/{name} git branch.

3. Writes `onyx.md`
   It creates a research brief that future agents and humans can read.

4. Writes `eval.sh`
   It creates a repeatable measurement script that prints METRIC name=value.

5. Runs experiments
   It edits code, commits attempts, runs evals, logs results, and pushes or syncs.
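
To make the `eval.sh` step concrete, here is a minimal sketch of a measurement script. Only the `METRIC name=value` output line comes from this page. The metric name `tracking_error`, the helper functions, and the sample data are assumptions made for illustration; a real script would replace `sample_errors` with the actual simulation or test run.

```shell
#!/bin/sh
# Hypothetical eval.sh sketch. Only the "METRIC name=value" line format is
# taken from the docs; the metric name and the sample data are made up.
set -e

# Read one number per line from stdin and print the mean absolute value
# as a single METRIC line. Exits non-zero if there is no input.
emit_tracking_error() {
  awk '{ v = ($1 < 0) ? -$1 : $1; sum += v; n++ }
       END { if (n == 0) exit 1
             printf "METRIC tracking_error=%.4f\n", sum / n }'
}

# Stand-in for a real run: per-step tracking errors (assumed data).
sample_errors() {
  printf '%s\n' 0.12 -0.08 0.05 -0.03
}

sample_errors | emit_tracking_error
```

With the sample data this prints `METRIC tracking_error=0.0700`. Keeping the measurement deterministic and printing exactly one line per metric makes repeated runs comparable.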

Steer an Active Run

Edit onyx/onyx.md when you want to change how the agent behaves. An edit there is more durable than a chat message because a future agent can still read it after a context reset. Good steering edits include:
  • add or remove files in scope;
  • add constraints such as “no new dependencies” or “do not change the hardware interface”;
  • define secondary metrics to watch;
  • summarize approaches that failed;
  • add a new promising idea;
  • set stop conditions.
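
As a sketch of what such an edit might look like from the shell, the snippet below appends a few notes to the brief. The "Steering notes" heading and the exact wording are assumptions; this page does not prescribe a layout for onyx.md, so adapt it to whatever structure the agent already wrote.

```shell
# Hypothetical sketch: append steering notes to the research brief.
# The heading and bullet wording below are illustrative assumptions,
# not a format required by Onyx.
brief="onyx/onyx.md"
mkdir -p "$(dirname "$brief")"

cat >> "$brief" <<'EOF'

## Steering notes
- Constraint: no new dependencies.
- Constraint: do not change the hardware interface.
- Stop condition: stop when the primary metric stops improving.
EOF
```

Appending rather than overwriting keeps the agent's earlier notes intact, which matters because the brief is what survives a context reset.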
Then tell the agent:
/onyx Continue with the updated onyx.md guidance

Under the Hood

The agent uses these CLI primitives:
onyx branch create
onyx exp run
onyx exp log
onyx push
onyx sync
onyx exp list
You can run them manually when debugging, but normal users should start from /onyx.
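
When debugging by hand, it can help to check what the eval script prints before involving the CLI at all. The sketch below extracts METRIC lines from arbitrary output. It uses only standard `sed`, the `parse_metrics` helper is a hypothetical name, and it is not how the Onyx CLI parses results internally, which this page does not describe.

```shell
# Hypothetical helper: turn "METRIC name=value" lines on stdin into
# "name value" pairs and ignore everything else.
parse_metrics() {
  sed -n 's/^METRIC \([A-Za-z_][A-Za-z0-9_]*\)=\(.*\)$/\1 \2/p'
}

# Stand-in for the output of an eval script (assumed sample lines).
printf '%s\n' 'building...' 'METRIC tracking_error=0.0700' 'done' | parse_metrics
```

For the sample input this prints `tracking_error 0.0700`. If nothing is printed for a real script, the METRIC line is probably missing or malformed.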

Resuming Work

On resume, the agent reads:
  • onyx/onyx.md;
  • recent git history;
  • onyx status;
  • onyx exp list --limit 20;
  • any queued local state under .git/onyx/.
This lets it continue from the best result rather than blindly building on the latest commit.
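
The idea of continuing from the best result instead of the latest commit can be sketched as follows. The input format (one "commit value" pair per line), the commit ids, and the assumption that lower is better are all hypothetical; this is not the output format of `onyx exp list`.

```shell
# Hypothetical helper: read "commit value" lines from stdin and print the
# line with the lowest value (assumes the metric is being minimized).
best_result() {
  awk 'NR == 1 || $2 + 0 < best { best = $2 + 0; line = $0 }
       END { if (NR > 0) print line }'
}

# Assumed sample history, oldest first; the latest attempt is not the best.
printf '%s\n' 'a1b2c3d 0.0910' 'e4f5a6b 0.0700' 'c7d8e9f 0.0840' | best_result
```

Here the latest entry is `c7d8e9f`, but the helper prints `e4f5a6b 0.0700`, the attempt a resumed run would want to build on.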