When you run /onyx ..., the agent creates the repo-side files it needs to run auto research. You can review and edit these files to steer the loop.
onyx/onyx.md: Research Brief and Steering File
onyx.md is the durable context for the agent. It should explain what the agent is optimizing, how to measure progress, what files are in scope, and what constraints matter.
The agent creates the first version. You can edit it at any time between runs.
Common sections:
- objective;
- primary metric and direction;
- secondary metrics or tradeoffs;
- how to run the eval;
- files in scope;
- off-limits files or APIs;
- constraints;
- what has already been tried.
onyx/eval.sh: Measurement Script
eval.sh is the repeatable measurement entry point. The agent creates it so onyx exp run can produce comparable results.
It must print at least one metric line:
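The exact metric line format is not reproduced in this section, so the following is only a minimal sketch of what an eval script could look like. The `METRIC name=value` shape, the `duration_ms` metric name, and the `run_benchmark` workload are all assumptions made for illustration; match whatever format your generated eval.sh already uses.

```shell
#!/usr/bin/env bash
# Sketch of an eval script. ASSUMPTION: the "METRIC name=value" line shape
# is a placeholder, not a format confirmed by this document.
set -euo pipefail

# Hypothetical workload; replace with your project's real benchmark command.
run_benchmark() {
  local i
  for ((i = 0; i < 100000; i++)); do :; done
}

# Print one metric line in the assumed placeholder format.
emit_metric() {
  printf 'METRIC %s=%s\n' "$1" "$2"
}

# Time the workload in whole milliseconds.
# Note: %N (nanoseconds) requires GNU date.
measure_ms() {
  local start end
  start=$(date +%s%N)
  run_benchmark
  end=$(date +%s%N)
  echo $(( (end - start) / 1000000 ))
}

emit_metric duration_ms "$(measure_ms)"
```

Keeping the workload, the timing, and the metric output in separate functions makes it easier to swap the benchmark command without touching how the result is reported.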
Optional: onyx/checks.sh
The agent may create checks.sh when your constraints require correctness backpressure. Checks run after a passing eval and do not affect eval timing. A run whose checks fail is recorded as checks_failed.
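As a rough sketch, a checks script might run a handful of correctness checks and fail if any of them fails. This assumes a non-zero exit status is how failure is signaled, which this section does not state; `check_unit_tests` and `check_lint` are hypothetical placeholders for your project's real test commands.

```shell
#!/usr/bin/env bash
# Sketch of a checks script. ASSUMPTION: a non-zero exit status signals
# that the checks failed.
set -euo pipefail

# Hypothetical placeholder checks; replace with real test commands.
check_unit_tests() { true; }
check_lint() { true; }

# Run every named check, report each failure on stderr, and return
# non-zero if at least one check failed.
run_checks() {
  local failed=0 check
  for check in "$@"; do
    if ! "$check"; then
      echo "check failed: $check" >&2
      failed=1
    fi
  done
  return "$failed"
}

run_checks check_unit_tests check_lint
```

Running every check before returning, instead of stopping at the first failure, gives the agent the full list of what broke in a single pass.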
How to Steer the Agent
Prefer editing onyx.md when you want to change agent behavior. Prefer editing eval.sh when the measurement itself is wrong or missing useful metrics.
Examples:
| You want to change | Edit |
|---|---|
| What strategy the agent should try next | onyx/onyx.md |
| Which files are safe to modify | onyx/onyx.md |
| The benchmark command | onyx/eval.sh |
| Metric parsing | onyx/eval.sh |
| Correctness tests after successful evals | onyx/checks.sh |
Protected During Measurement
During a measured run, agents should not modify:
- onyx/eval.sh;
- onyx/checks.sh.
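If you want to confirm by hand that the protected files were left alone, one option is to compare checksums taken before and after a run. This is a manual sketch using the standard `cksum` utility, not a description of how onyx enforces the rule; the `snapshot` helper is a hypothetical name introduced here.

```shell
#!/usr/bin/env bash
# Sketch of a manual integrity check for protected files.
# ASSUMPTION: this is a user-side check, not part of onyx itself.
set -euo pipefail

# Print one cksum line (CRC, size, path) for each given file that exists.
# Missing files are skipped, since checks.sh is optional.
snapshot() {
  local f
  for f in "$@"; do
    if [ -f "$f" ]; then
      cksum "$f"
    fi
  done
}

# Usage sketch:
#   before=$(snapshot onyx/eval.sh onyx/checks.sh)
#   ... measured run happens here ...
#   after=$(snapshot onyx/eval.sh onyx/checks.sh)
#   [ "$before" = "$after" ] || echo "protected file changed" >&2
```

In a git repository, `git diff --quiet -- onyx/eval.sh onyx/checks.sh` gives a similar answer relative to the last commit.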