# UniRL

> Agent-readable index for UniRL, built from commit: unknown.

- [UniRL Documentation](/en/docs): Agent-first documentation for the UniRL distributed reinforcement learning framework.
- Getting Started
  - [Installation](/en/docs/getting-started/installation): Install UniRL and the optional documentation site.
  - [TransferQueue Installation](/en/docs/getting-started/transfer-queue-installation): Optional rollout→trainer data-plane bus (Simple and Mooncake backends).
  - [First Run](/en/docs/getting-started/first-run): Compose and launch a UniRL experiment recipe.
  - [Docs Site README](/en/docs/getting-started/readme-docs-site): Fumadocs site commands, structure, and maintenance notes.
- Architecture
  - [Concepts & Glossary](/en/docs/architecture/concepts): The core mental model and the domain terms used across UniRL docs and recipes.
  - [Overview](/en/docs/architecture/overview): The main runtime loop, per-domain trainers, rollout engines, train stack, and sync boundaries.
  - [Trainer & Training Stack](/en/docs/architecture/trainer-v2): The single-controller per-domain trainer, the FSDP train stack, and the flat conf recipe shape.
  - [Roadmap](/en/docs/architecture/roadmap): Near-term direction across the Infra, Algorithm, and Model tracks — baselines, goals, and TODOs.
- Configuration
  - [Hydra Configuration](/en/docs/configuration/hydra): How UniRL composes, validates, and overrides runtime configuration.
  - [Experiment Recipes](/en/docs/configuration/experiments): Recipes in the bucketed examples/ tree and how to select one per entrypoint.
- Guides
  - [Data and Models](/en/docs/guides/data-and-models): Prompt data contracts, local datasets, model packages, and checkpoint mounts.
  - [Data Preparation](/en/docs/guides/data-preparation): Prompt file formats, the per-prompt schema, image/condition inputs, and how prompts expand into rollout groups.
  - [Rewards](/en/docs/guides/rewards): Reward service, local and remote backends, and extension points.
  - [Evaluation](/en/docs/guides/evaluation): How quality is measured today (reward scores), the eval plumbing that exists, and what is not wired yet.
  - [Extending UniRL](/en/docs/guides/extending): Where to add models, rollout engines, train-side algorithms, rewards, training backends, and recipes.
  - [Multinode Runs](/en/docs/guides/multinode): Launchers, Ray startup, cluster geometry, and pre-run checks.
  - [Geneval MMCV Setup](/en/docs/guides/geneval-mmcv-setup): Optional MMCV and MMDetection installation for Geneval/OpenMMLab workflows.
- Agents
  - [Agent Index](/en/docs/agents): Start here when using UniRL documentation as coding-agent context.
  - [Agent Task Recipes](/en/docs/agents/task-recipes): Common coding-agent tasks mapped to files, checks, and likely risks.
- Others
  - [GitHub Issues Workflow](/en/docs/others/github-issues-workflow): Issue title, template, labeling, project board, and gh CLI conventions.

## Agent Access Patterns

- Treat this file as a compact discovery endpoint, not as a docs category.
- Start with `/md/agents/index.md` for task routing in Markdown, or `/en/docs/agents` when reading the rendered site.
- Use `/llms-full.txt` for a single-file Markdown corpus.
- Use `/md/<docs-slug>/index.md` to fetch a single page as Markdown when focused context is better than the full corpus.
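
The path patterns above can be sketched as a small helper. This is a minimal illustration, not part of UniRL: the function name `markdown_path` is hypothetical, and it assumes that `<docs-slug>` is whatever follows `/en/docs/` in the rendered path, which this file only confirms for the `agents` page.

```python
RENDERED_PREFIX = "/en/docs"  # prefix used by the rendered-site links in this index


def markdown_path(rendered_path: str) -> str:
    """Map a rendered docs path to its per-page Markdown path.

    Assumption: <docs-slug> is the part of the path after /en/docs/.
    """
    if not rendered_path.startswith(RENDERED_PREFIX + "/"):
        raise ValueError(f"not a page under {RENDERED_PREFIX}/: {rendered_path!r}")
    slug = rendered_path[len(RENDERED_PREFIX):].strip("/")
    if not slug:
        # The Markdown path for the docs root is not documented here, so refuse to guess.
        raise ValueError("empty docs slug")
    return f"/md/{slug}/index.md"


# The one pairing this index states explicitly.
print(markdown_path("/en/docs/agents"))
```

Prepend the site's own origin to the returned path before fetching; the origin is not stated in this file.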

## Authoritative Runtime Entry

- Training entrypoints: `python -m unirl.train_diffusion --config-name=<domain>/<recipe>` (the `train_vlm`, `train_pe`, and `train_unified_model` entrypoints follow the same pattern).
- Recipes: self-contained `examples/<domain>/<recipe>.yaml` files grouped by trainer domain (one subdirectory each), selected with `--config-name=<domain>/<recipe>`.
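
As a rough sketch of how the two conventions above fit together, the helper below assembles the launch command and the matching recipe path. The function names are hypothetical, `my_domain` and `my_recipe` are placeholders rather than real recipes, and it assumes the other entrypoints are modules under the `unirl` package like `train_diffusion`.

```python
import shlex

# Entrypoint names listed in this index; treating them all as modules under
# the `unirl` package is an assumption extrapolated from `unirl.train_diffusion`.
ENTRYPOINTS = ("train_diffusion", "train_vlm", "train_pe", "train_unified_model")


def build_train_command(entrypoint: str, domain: str, recipe: str) -> list[str]:
    """Return the argv for launching one recipe with one entrypoint."""
    if entrypoint not in ENTRYPOINTS:
        raise ValueError(f"unknown entrypoint: {entrypoint!r}")
    return ["python", "-m", f"unirl.{entrypoint}", f"--config-name={domain}/{recipe}"]


def recipe_path(domain: str, recipe: str) -> str:
    """Return the recipe file that --config-name=<domain>/<recipe> refers to."""
    return f"examples/{domain}/{recipe}.yaml"


# Placeholder names for illustration only.
cmd = build_train_command("train_diffusion", "my_domain", "my_recipe")
print(shlex.join(cmd))
print(recipe_path("my_domain", "my_recipe"))
```

Returning an argv list rather than a single string keeps the command safe to pass to `subprocess.run` without shell quoting concerns.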