Agent Index

This page is optimized for future agents that need to read and modify UniRL safely. It is the human-readable guide to agent context; /llms.txt is the machine-readable discovery endpoint.

How Agents Use These Docs

Agents should treat the docs as a routing layer, not as a replacement for source inspection:

Open /llms.txt or /md/agents/index.md to discover the maintained documentation surface.
Use the task table below to choose the closest rendered docs page and package README.
Read the nearby implementation before editing.
Prefer /md/<docs-slug>/index.md for focused Markdown context and /llms-full.txt only when a single-file corpus is useful.

Do not add /llms.txt as a docs category. It is a root-level access path for tools and agents, while this Agents section is the visible documentation category.
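As a rough illustration of the discovery step, an agent could pull the link targets out of /llms.txt before deciding what to read. The sketch below assumes the file lists pages as ordinary Markdown links, which is the usual llms.txt convention. The sample text and the extract_links helper are hypothetical, and the real file content may differ.

```python
import re

# Matches Markdown links of the form [title](target).
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def extract_links(llms_txt: str) -> list[tuple[str, str]]:
    """Return (title, target) pairs for every Markdown link in the text."""
    return LINK_RE.findall(llms_txt)


# Hypothetical sample; the real /llms.txt content may differ.
SAMPLE = """\
# UniRL

## Docs

- [Agent Index](/md/agents/index.md): routing guide for agents
- [Hydra configuration](/md/configuration/hydra/index.md): config composition
"""

links = extract_links(SAMPLE)
for title, target in links:
    print(f"{title} -> {target}")
```

The extracted targets are only a starting point; the nearby implementation still needs to be read before editing.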

First Principles

Treat python -m unirl.train_diffusion --config-name=<domain>/<recipe> (and likewise train_vlm, train_pe, and train_unified_model) as the maintained runtime entry points.
Treat the bucketed examples/<domain>/<recipe>.yaml files as the authoritative configuration surface.
Treat package READMEs as local contracts near the code they describe.
Do not infer runtime behavior from stale scratch docs or ignored local files unless the user explicitly points to them.
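The first two principles can be sketched as a small helper that pairs an entry module with its recipe file. This is a minimal sketch, not part of UniRL: launch_command and recipe_path are hypothetical names, the unirl.<entry> module path for the three sibling entries is assumed from the train_diffusion example, and my_domain/my_recipe are placeholder values.

```python
from pathlib import Path

# Entry modules named on this page; their location under the unirl package
# is assumed from the train_diffusion example.
ENTRY_MODULES = ("train_diffusion", "train_vlm", "train_pe", "train_unified_model")


def recipe_path(repo_root: Path, domain: str, recipe: str) -> Path:
    """Locate the bucketed recipe file: examples/<domain>/<recipe>.yaml."""
    return repo_root / "examples" / domain / f"{recipe}.yaml"


def launch_command(entry: str, domain: str, recipe: str) -> list[str]:
    """Build the argv for a maintained entry point, rejecting unknown entries."""
    if entry not in ENTRY_MODULES:
        raise ValueError(f"unknown entry module: {entry!r}")
    return ["python", "-m", f"unirl.{entry}", f"--config-name={domain}/{recipe}"]


# Placeholder domain and recipe names, for illustration only.
print(recipe_path(Path("."), "my_domain", "my_recipe").as_posix())
print(" ".join(launch_command("train_diffusion", "my_domain", "my_recipe")))
```

Rejecting unknown entry names keeps an agent from guessing at launchers that this page does not list.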

Reading Order by Task

Task	Read first
Run or validate a recipe	`/en/docs/getting-started/first-run`, then the launchers in `examples/`
Understand configuration	`/en/docs/configuration/hydra`, then `unirl/config/README.md`
Pick an experiment	`/en/docs/configuration/experiments`, then `examples/<domain>/<name>.yaml`
Understand runtime flow	`/en/docs/architecture/overview`, then `unirl/README.md`
Work on rollout engines	`unirl/rollout/README.md`
Work on the train stack or a training backend	`/en/docs/architecture/trainer-v2`, then `unirl/train/README.md`
Work on GRPO / NFT / DPPO loss logic	`unirl/algorithms/README.md`
Work on SDE kernels, sigma schedules, or log-probability paths	`unirl/sde/README.md`
Work on rewards	`/en/docs/guides/rewards`, then `unirl/reward/README.md`
Add or debug trainer-to-rollout weight sync	`unirl/distributed/weight_sync/README.md`
Prepare prompt data	`/en/docs/guides/data-preparation`
Add or mount data/model artifacts	`/en/docs/guides/data-and-models`
Debug multinode runs	`/en/docs/guides/multinode`

Machine-Readable Endpoints

Use these endpoints instead of scraping rendered HTML:

Endpoint	Purpose
`/llms.txt`	compact discovery index and access guidance
`/llms-full.txt`	full generated Markdown corpus
`/md/agents/index.md`	this page as Markdown
`/md/configuration/hydra/index.md`	one focused configuration page

These outputs are generated from the same MDX source as the Fumadocs site, so human and agent documentation stay aligned. Keep endpoint details here instead of duplicating them across the docs sidebar.
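A rendered docs path from the task table can be turned into its focused Markdown endpoint with a small helper. The sketch below assumes that /en/docs/<slug> always corresponds to /md/<slug>/index.md, which matches the configuration/hydra example on this page but is not confirmed for every page. The markdown_endpoint name is hypothetical.

```python
DOCS_PREFIX = "/en/docs/"


def markdown_endpoint(docs_path: str) -> str:
    """Map a rendered docs path to its assumed Markdown endpoint."""
    if not docs_path.startswith(DOCS_PREFIX):
        raise ValueError(f"not a rendered docs path: {docs_path!r}")
    # Drop the prefix and any trailing slash to recover the docs slug.
    slug = docs_path[len(DOCS_PREFIX):].strip("/")
    if not slug:
        raise ValueError("docs path has no slug")
    return f"/md/{slug}/index.md"


print(markdown_endpoint("/en/docs/configuration/hydra"))
```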

Safe Editing Policy

When editing the framework:

Identify the owning package from the task table.
Read that package README and the closest existing implementation.
Prefer a typed config dataclass near the implementation over ad hoc string parsing.
Add or update one recipe only when the feature changes runnable behavior.
Run a Hydra compose check before launching Ray work.
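One way to run the compose check in the last step is Hydra's --cfg job flag, which prints the composed config and exits without running the task function. The sketch below assumes the entry modules are standard Hydra applications, as the --config-name flag suggests. The function names and the placeholder config name are hypothetical.

```python
import subprocess
from collections.abc import Sequence


def compose_check_command(
    module: str, config_name: str, overrides: Sequence[str] = ()
) -> list[str]:
    """Build an argv that asks Hydra to print the composed job config and exit."""
    return [
        "python", "-m", module,
        f"--config-name={config_name}",
        "--cfg", "job",   # Hydra flag: show the job config instead of running
        "--resolve",      # also resolve interpolations, so broken ones surface
        *overrides,
    ]


def run_compose_check(
    module: str, config_name: str, overrides: Sequence[str] = ()
) -> bool:
    """Return True when the config composes cleanly (exit code 0)."""
    cmd = compose_check_command(module, config_name, overrides)
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        print(result.stderr)
    return result.returncode == 0


# Placeholder config name, for illustration only; nothing is executed here.
print(" ".join(compose_check_command("unirl.train_diffusion", "my_domain/my_recipe")))
```

Because the check stops after composition, a malformed recipe or a bad override fails in seconds instead of after Ray workers have been scheduled.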