Agent Index

Start here when using UniRL documentation as coding-agent context.

This page is optimized for future agents that need to read and modify UniRL safely. It is the human-readable guide to agent context; /llms.txt is the machine-readable discovery endpoint.

How Agents Use These Docs

Agents should treat the docs as a routing layer, not as a replacement for source inspection:

  1. Open /llms.txt or /md/agents/index.md to discover the maintained documentation surface.
  2. Use the task table below to choose the closest rendered docs page and package README.
  3. Read the nearby implementation before editing.
  4. Prefer /md/<docs-slug>/index.md for focused Markdown context and /llms-full.txt only when a single-file corpus is useful.
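Step 4 can be sketched as a small path helper. This is a minimal sketch, not part of UniRL: the function name `markdown_path` is hypothetical, and the mapping from a rendered `/en/docs/<slug>` page to `/md/<slug>/index.md` is an assumption inferred from the paths listed on this page.

```python
DOCS_PREFIX = "/en/docs/"       # rendered-page prefix seen in the reading-order entries
FULL_CORPUS = "/llms-full.txt"  # single-file corpus endpoint


def markdown_path(rendered_path: str, want_single_file: bool = False) -> str:
    """Return the Markdown path to read for a rendered docs page (hypothetical helper).

    The focused per-page Markdown is preferred; the full corpus is returned
    only when the caller explicitly asks for a single file.
    """
    if want_single_file:
        return FULL_CORPUS
    if not rendered_path.startswith(DOCS_PREFIX):
        raise ValueError(f"not a rendered docs path: {rendered_path!r}")
    # Assumption: the docs slug is whatever follows the /en/docs/ prefix.
    slug = rendered_path[len(DOCS_PREFIX):].strip("/")
    if not slug:
        raise ValueError("rendered path has no docs slug")
    return f"/md/{slug}/index.md"


print(markdown_path("/en/docs/configuration/hydra"))
# prints: /md/configuration/hydra/index.md
```

Raising on unknown paths, rather than silently falling back to the full corpus, keeps the focused page as the default and makes the larger download an explicit choice.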

Do not add /llms.txt as a docs category. It is a root-level access path for tools and agents, while this Agents section is the visible documentation category.

First Principles

  • Treat python -m unirl.train_diffusion --config-name=<domain>/<recipe>, and the sibling entries train_vlm, train_pe, and train_unified_model, as the maintained runtime entry points.
  • Treat the bucketed examples/<domain>/<recipe>.yaml files as the authoritative configuration surface.
  • Treat package READMEs as local contracts near the code they describe.
  • Do not infer runtime behavior from stale scratch docs or ignored local files unless the user explicitly points to them.
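The entry-point convention above can be captured in a small command builder. This is a sketch under stated assumptions: `build_command`, `my_domain`, and `my_recipe` are hypothetical names, and the optional `--cfg job` flag is Hydra's standard "print the composed job config and exit" option, which only applies if the UniRL entries are ordinary `@hydra.main` applications (suggested by `--config-name`, but not confirmed here).

```python
import shlex

# Entry modules named on this page.
ENTRY_MODULES = ("train_diffusion", "train_vlm", "train_pe", "train_unified_model")


def build_command(entry: str, domain: str, recipe: str, dry_run: bool = False) -> list:
    """Build the argv for a maintained runtime entry (hypothetical helper)."""
    if entry not in ENTRY_MODULES:
        raise ValueError(f"unknown entry module: {entry!r}")
    argv = ["python", "-m", f"unirl.{entry}", f"--config-name={domain}/{recipe}"]
    if dry_run:
        # Assumption: Hydra's --cfg flag is honored by the entry point.
        argv += ["--cfg", "job"]
    return argv


# Placeholder domain and recipe names, not real UniRL recipes.
print(shlex.join(build_command("train_diffusion", "my_domain", "my_recipe", dry_run=True)))
# prints: python -m unirl.train_diffusion --config-name=my_domain/my_recipe --cfg job
```

Building the argv as a list and passing it to `subprocess.run` avoids shell quoting problems; `shlex.join` is used here only to display the command.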

Reading Order by Task

| Task | Read first |
| --- | --- |
| Run or validate a recipe | /en/docs/getting-started/first-run, then the launchers in examples/ |
| Understand configuration | /en/docs/configuration/hydra, then unirl/config/README.md |
| Pick an experiment | /en/docs/configuration/experiments, then examples/<domain>/<name>.yaml |
| Understand runtime flow | /en/docs/architecture/overview, then unirl/README.md |
| Work on rollout engines | unirl/rollout/README.md |
| Work on the train stack or a training backend | /en/docs/architecture/trainer-v2, then unirl/train/readme.md |
| Work on GRPO / NFT / DPPO loss logic | unirl/algorithms/README.md |
| Work on SDE kernels, sigma schedules, or log-probability paths | unirl/sde/README.md |
| Work on rewards | /en/docs/guides/rewards, then unirl/reward/README.md |
| Add or debug trainer-to-rollout weight sync | unirl/distributed/weight_sync/README.md |
| Prepare prompt data | /en/docs/guides/data-preparation |
| Add or mount data/model artifacts | /en/docs/guides/data-and-models |
| Debug multinode runs | /en/docs/guides/multinode |

Machine-Readable Endpoints

Use these endpoints instead of scraping rendered HTML:

| Endpoint | Purpose |
| --- | --- |
| /llms.txt | compact discovery index and access guidance |
| /llms-full.txt | full generated Markdown corpus |
| /md/agents/index.md | this page as Markdown |
| /md/configuration/hydra/index.md | one focused configuration page |

These outputs are generated from the same MDX source as the Fumadocs site, so human and agent documentation stay aligned. Keep endpoint details here instead of duplicating them across the docs sidebar.
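Fetching one of these endpoints directly might look like the sketch below. It uses only the Python standard library; the helper names and the `https://docs.example.org` base URL are placeholders, since this page does not state where the docs site is hosted, and UTF-8 encoding of the responses is assumed.

```python
from urllib.parse import urljoin
from urllib.request import urlopen

# Endpoint paths listed on this page; the dictionary keys are illustrative labels.
ENDPOINTS = {
    "index": "/llms.txt",
    "full": "/llms-full.txt",
    "agents": "/md/agents/index.md",
    "hydra": "/md/configuration/hydra/index.md",
}


def endpoint_url(base_url: str, name: str) -> str:
    """Resolve a root-level endpoint against the site base URL."""
    return urljoin(base_url, ENDPOINTS[name])


def fetch_text(base_url: str, name: str, timeout: float = 10.0) -> str:
    """Download one endpoint as text (assumes a UTF-8 response body)."""
    with urlopen(endpoint_url(base_url, name), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Placeholder host; substitute the real docs origin.
print(endpoint_url("https://docs.example.org/en/docs/", "index"))
# prints: https://docs.example.org/llms.txt
```

Because every endpoint path starts with `/`, `urljoin` resolves it from the site root regardless of which rendered page the base URL points at.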

Safe Editing Policy

When editing the framework:

  1. Identify the owning package from the task table.
  2. Read that package README and the closest existing implementation.
  3. Prefer a typed config dataclass near the implementation over ad hoc string parsing.
  4. Add or update one recipe only when the feature changes runnable behavior.
  5. Run a Hydra compose check before launching Ray work.
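Step 3 can be illustrated with a standard-library dataclass. This is a generic sketch, not a UniRL config: `ExampleFeatureConfig`, its fields, and `from_mapping` are hypothetical names chosen only to show why typed fields with validation are preferable to parsing strings by hand.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class ExampleFeatureConfig:
    """Hypothetical typed config; field names are illustrative only."""

    enabled: bool = False
    num_steps: int = 10
    scale: float = 1.0

    def __post_init__(self) -> None:
        # Validate once at construction instead of at every use site.
        if self.num_steps <= 0:
            raise ValueError(f"num_steps must be positive, got {self.num_steps}")
        if self.scale <= 0:
            raise ValueError(f"scale must be positive, got {self.scale}")


def from_mapping(raw: Mapping[str, Any]) -> ExampleFeatureConfig:
    """Build the config from a plain mapping, e.g. a loaded YAML section.

    Unknown keys raise TypeError, so a misspelled option fails loudly.
    """
    return ExampleFeatureConfig(**raw)


cfg = from_mapping({"enabled": True, "num_steps": 4})
print(cfg.num_steps, cfg.scale)
# prints: 4 1.0
```

Keeping the dataclass next to the implementation it configures means a reader of the package sees the accepted options and their defaults in one place.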
