Agent Index
Start here when using UniRL documentation as coding-agent context.
This page is optimized for future agents that need to read and modify UniRL safely. It is the human-readable guide to agent context; /llms.txt is the machine-readable discovery endpoint.
How Agents Use These Docs
Agents should treat the docs as a routing layer, not as a replacement for source inspection:
- Open /llms.txt or /md/agents/index.md to discover the maintained documentation surface.
- Use the task table below to choose the closest rendered docs page and package README.
- Read the nearby implementation before editing.
- Prefer /md/<docs-slug>/index.md for focused Markdown context and /llms-full.txt only when a single-file corpus is useful.
Do not add /llms.txt as a docs category. It is a root-level access path for tools and agents, while this Agents section is the visible documentation category.
First Principles
- Treat python -m unirl.train_diffusion --config-name=<domain>/<recipe> (and train_vlm / train_pe / train_unified_model) as the maintained runtime entry.
- Treat the bucketed examples/<domain>/<recipe>.yaml files as the authoritative configuration surface.
- Treat package READMEs as local contracts near the code they describe.
- Do not infer runtime behavior from stale scratch docs or ignored local files unless the user explicitly points to them.
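The relationship between a bucketed recipe file and the launch command can be sketched as below. This is a minimal illustration, not UniRL code: the helper names are hypothetical, the demo domain and recipe names are placeholders, and it assumes the other entry modules accept the same --config-name form as train_diffusion.

```python
from pathlib import PurePosixPath

# Entry modules named in this guide; the helpers below are hypothetical.
ENTRY_MODULES = ("train_diffusion", "train_vlm", "train_pe", "train_unified_model")


def config_name_from_recipe(recipe_path: str) -> str:
    """Turn examples/<domain>/<recipe>.yaml into <domain>/<recipe>."""
    path = PurePosixPath(recipe_path)
    parts = path.parts
    if len(parts) != 3 or parts[0] != "examples" or path.suffix != ".yaml":
        raise ValueError(
            f"expected examples/<domain>/<recipe>.yaml, got {recipe_path!r}"
        )
    return f"{parts[1]}/{path.stem}"


def launch_argv(entry: str, recipe_path: str) -> list[str]:
    """Build the argument list for the maintained runtime entry."""
    if entry not in ENTRY_MODULES:
        raise ValueError(f"unknown entry module: {entry!r}")
    config_name = config_name_from_recipe(recipe_path)
    return ["python", "-m", f"unirl.{entry}", f"--config-name={config_name}"]


if __name__ == "__main__":
    # Placeholder names; substitute a real recipe from examples/.
    print(" ".join(launch_argv("train_diffusion", "examples/demo_domain/demo_recipe.yaml")))
```

Rejecting paths outside examples/<domain>/ keeps an agent from launching a scratch or ignored local config by accident.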
Reading Order by Task
| Task | Read first |
|---|---|
| Run or validate a recipe | /en/docs/getting-started/first-run, then the launchers in examples/ |
| Understand configuration | /en/docs/configuration/hydra, then unirl/config/README.md |
| Pick an experiment | /en/docs/configuration/experiments, then examples/<domain>/<name>.yaml |
| Understand runtime flow | /en/docs/architecture/overview, then unirl/README.md |
| Work on rollout engines | unirl/rollout/README.md |
| Work on the train stack or a training backend | /en/docs/architecture/trainer-v2, then unirl/train/readme.md |
| Work on GRPO / NFT / DPPO loss logic | unirl/algorithms/README.md |
| Work on SDE kernels, sigma schedules, or log-probability paths | unirl/sde/README.md |
| Work on rewards | /en/docs/guides/rewards, then unirl/reward/README.md |
| Add or debug trainer-to-rollout weight sync | unirl/distributed/weight_sync/README.md |
| Prepare prompt data | /en/docs/guides/data-preparation |
| Add or mount data/model artifacts | /en/docs/guides/data-and-models |
| Debug multinode runs | /en/docs/guides/multinode |
Machine-Readable Endpoints
Use these endpoints instead of scraping rendered HTML:
| Endpoint | Purpose |
|---|---|
| /llms.txt | compact discovery index and access guidance |
| /llms-full.txt | full generated Markdown corpus |
| /md/agents/index.md | this page as Markdown |
| /md/configuration/hydra/index.md | one focused configuration page |
These outputs are generated from the same MDX source as the Fumadocs site, so human and agent documentation stay aligned. Keep endpoint details here instead of duplicating them across the docs sidebar.
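A tool that starts from a rendered page path in the task table can derive the matching Markdown endpoint roughly as follows. The helper is hypothetical, and the /en/docs/ prefix is an assumption taken from the paths shown on this page; other locales or routes may differ.

```python
# Assumption: rendered pages live under this prefix, as in the task table.
RENDERED_PREFIX = "/en/docs/"


def markdown_endpoint(rendered_path: str) -> str:
    """Map /en/docs/<docs-slug> to /md/<docs-slug>/index.md."""
    if not rendered_path.startswith(RENDERED_PREFIX):
        raise ValueError(f"not a rendered docs path: {rendered_path!r}")
    slug = rendered_path[len(RENDERED_PREFIX):].strip("/")
    if not slug:
        raise ValueError("empty docs slug")
    return f"/md/{slug}/index.md"


if __name__ == "__main__":
    print(markdown_endpoint("/en/docs/configuration/hydra"))
```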
Safe Editing Policy
When editing the framework:
- Identify the owning package from the task table.
- Read that package README and the closest existing implementation.
- Prefer a typed config dataclass near the implementation over ad hoc string parsing.
- Add or update one recipe only when the feature changes runnable behavior.
- Run a Hydra compose check before launching Ray work.
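One way to run the compose check from the last step is Hydra's --cfg job flag, which prints the composed job config instead of running the task. The sketch below assumes the UniRL entry modules are standard Hydra applications that honor this flag; the helper names and the demo config name are hypothetical.

```python
import subprocess
import sys


def compose_check_argv(module: str, config_name: str) -> list[str]:
    """Arguments that ask Hydra to print the composed config and exit."""
    return [
        sys.executable,
        "-m",
        module,
        f"--config-name={config_name}",
        "--cfg",
        "job",  # show the job config instead of running the entry point
    ]


def compose_ok(module: str, config_name: str) -> bool:
    """Return True when composition succeeds; echo stderr on failure."""
    result = subprocess.run(
        compose_check_argv(module, config_name),
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        print(result.stderr, file=sys.stderr)
    return result.returncode == 0


if __name__ == "__main__":
    # Placeholder config name; substitute a real <domain>/<recipe>.
    if not compose_ok("unirl.train_diffusion", "demo_domain/demo_recipe"):
        sys.exit("config does not compose; fix it before launching Ray work")
```

A failing compose surfaces missing keys or bad defaults in seconds, before any cluster resources are requested.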