Hydra Configuration
How UniRL composes, validates, and overrides runtime configuration.
UniRL uses Hydra. Recipes are self-contained YAML files under examples/, bucketed by trainer domain (diffusion/, vlm/, llm/, pe/, unified_model/); each training run selects one with --config-name:
python -m unirl.train_diffusion --config-name=<domain>/<recipe>
Composition
A recipe instantiates each runtime component directly by _target_ (FSDP backend, train stack, rollout engine, reward service, algorithm, data source, …). Config classes are plain @dataclasses defined next to the code that consumes them; there is no ConfigStore and no registration decorator. A recipe wires a component by pointing its _target_ at the component class and nesting a config: block under it, and that block is itself instantiated through its own _target_. See the generated Config Package README for the instantiation and validation contracts.
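To make the _target_ mechanism concrete, here is a minimal stdlib-only sketch of the idea: import a dotted path, then call it with the remaining keys, recursing into nested blocks. It is not Hydra's implementation (in practice hydra.utils.instantiate does this work), and the recipe fragment is a stand-in: types.SimpleNamespace plays the role of both a component class and its config dataclass so the example runs without UniRL installed.

```python
import importlib
from typing import Any


def locate(dotted_path: str) -> Any:
    """Import 'package.module.Name' and return the Name attribute."""
    module_path, _, attr = dotted_path.rpartition(".")
    return getattr(importlib.import_module(module_path), attr)


def instantiate(node: Any) -> Any:
    """Recursively build objects from dicts that carry a _target_ key."""
    if isinstance(node, list):
        return [instantiate(item) for item in node]
    if not isinstance(node, dict):
        return node  # plain scalar: pass through unchanged
    # Build nested blocks first so the outer target receives real objects.
    kwargs = {k: instantiate(v) for k, v in node.items() if k != "_target_"}
    if "_target_" not in node:
        return kwargs  # ordinary mapping, nothing to construct
    return locate(node["_target_"])(**kwargs)


# Stand-in recipe fragment (assumption): a component with a nested config:
# block, where both levels name the class to build through _target_.
recipe = {
    "_target_": "types.SimpleNamespace",
    "name": "rollout_engine",
    "config": {"_target_": "types.SimpleNamespace", "batch_size": 8},
}

component = instantiate(recipe)
print(component.name, component.config.batch_size)
```

Because the target is resolved by import path at instantiation time, nothing has to be registered ahead of time; a class only needs to be importable.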
Where Knobs Belong
Keep recipe-defining choices in examples/<domain>/<recipe>.yaml. Keep cluster-local paths, model mounts, output directories, and WandB identity in launcher environment variables or CLI overrides (recipes interpolate them with ${oc.env:...}). A new typed runtime component should define its config @dataclass next to the implementation; a recipe then references it by _target_ (no registration step).
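The way a cluster-local knob resolves can be sketched in a few lines of Python. This is an illustration of the lookup order only, not UniRL or Hydra code. The key output_dir and the variable UNIRL_OUTPUT_DIR are hypothetical names; in a recipe, the equivalent would be an interpolation of the form ${oc.env:UNIRL_OUTPUT_DIR,./outputs}, where the second argument is the YAML default.

```python
import os
from typing import Dict, List, Mapping, Optional


def parse_overrides(argv: List[str]) -> Dict[str, str]:
    """Collect simple 'key=value' arguments, skipping '--flag=value' options."""
    pairs = (arg.split("=", 1) for arg in argv if "=" in arg and not arg.startswith("-"))
    return {key: value for key, value in pairs}


def resolve_knob(
    key: str,
    yaml_default: str,
    env_var: str,
    cli_overrides: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the CLI override if present, else the env var, else the YAML default."""
    environ = os.environ if environ is None else environ
    if key in cli_overrides:
        return cli_overrides[key]
    if env_var in environ:
        return environ[env_var]
    return yaml_default


# Hypothetical launch: the launcher exports a path, the user overrides it on the CLI.
argv = ["--config-name=diffusion/example", "output_dir=/tmp/debug"]
launcher_env = {"UNIRL_OUTPUT_DIR": "/scratch/run1"}

print(resolve_knob("output_dir", "./outputs", "UNIRL_OUTPUT_DIR", parse_overrides(argv), launcher_env))
```

Dropping the output_dir=/tmp/debug argument would fall back to the launcher's value, and clearing the environment as well would fall back to the recipe default.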
Override precedence is:
CLI Hydra override > launcher env var > YAML default
Runtime Contracts
Cross-component validators run before Ray worker creation. The implementation details and validator list live in the generated Config Package README.
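The fail-fast pattern described here can be sketched as follows. Every name in the block is hypothetical (TrainConfig, RolloutConfig, the batch-size rule, and the launch helper are illustrations, not UniRL's validators); the real list lives in the Config Package README. The point is only the ordering: all cross-component checks run on the composed config, and worker startup is reached only if none of them report an error.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class TrainConfig:  # hypothetical component config
    global_batch_size: int = 64


@dataclass
class RolloutConfig:  # hypothetical component config
    num_engines: int = 4


@dataclass
class RunConfig:  # hypothetical composed config
    train: TrainConfig
    rollout: RolloutConfig


def batch_divides_across_engines(cfg: RunConfig) -> List[str]:
    """Hypothetical rule that spans two components: trainer and rollout."""
    if cfg.rollout.num_engines <= 0:
        return ["rollout.num_engines must be positive"]
    if cfg.train.global_batch_size % cfg.rollout.num_engines:
        return ["train.global_batch_size must be divisible by rollout.num_engines"]
    return []


VALIDATORS: List[Callable[[RunConfig], List[str]]] = [batch_divides_across_engines]


def validate(cfg: RunConfig) -> None:
    """Run every validator and report all failures in one exception."""
    errors = [msg for check in VALIDATORS for msg in check(cfg)]
    if errors:
        raise ValueError("invalid configuration:\n" + "\n".join(errors))


def launch(cfg: RunConfig, start_workers: Callable[[RunConfig], Any]) -> Any:
    """Validate first, so a bad config never reaches worker creation."""
    validate(cfg)
    return start_workers(cfg)
```

Collecting every error before raising means one failed launch surfaces all mismatches at once, which matters when each retry costs a cluster allocation.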
sync and the tensor transport solve different problems. sync (cfg.sync) sends trainer weights back to dedicated rollout engines, while the tensor transport (unirl/distributed/tensor/) is the data plane for moving bulky rollout outputs between workers.
Use a compose check before launching a large job:
python -m unirl.train_diffusion --config-name=<domain>/<recipe> --cfg job --resolve