NN topology catalog (supported stacks)
This page lists every routing path the training code can build from YAML `nn` + `training.algorithm` + (IQN only) `btr:`. It complements the narrative pages IQN architecture, PPO actor-critic architecture, GRPO: network and training, BTR options (IQN + paper extras), and the field-by-field Neural network YAML (nn) — full reference in the Configuration Guide.
DPO and GRPO reuse the same `nn` routing and built modules as PPO (`get_wiring("dpo" | "grpo")` → `ppo_wiring`). For training semantics, see the Configuration Guide (DPO configuration (`dpo:`), GRPO configuration (`grpo:`)) and GRPO: network and training.
Authoritative schema: `config_files/nn_schema.py` (`NnConfig`). Factories: `trackmania_rl/agents/policy_models/multimodal_torch_fusion.py` (`TorchMultimodalActorCritic`, `build_multimodal_fusion_uncompiled`), `ppo_wiring.py`, `iqn.py` (`build_iqn_network_uncompiled`), and `hf_actor_critic.py`.
The vision branch name used in code comes from `infer_vis_branch(nn.vis)` in `nn_schema`: `none` (`no_image`), `cnn`, `native_transformer` (`transformer` with `use_hf_backbone: false`), and `hf_transformer` (`transformer` with `use_hf_backbone: true`).
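As a quick orientation, the four `nn.vis` shapes and the branch name each resolves to can be sketched as below. The key nesting is schematic (the exact layout lives in `nn_schema.py`); only the branch names themselves are taken from the mapping above.

```yaml
# Schematic nn.vis shapes and the branch infer_vis_branch() reports.
nn:
  vis: none                 # -> no_image (float-only input)
# ---
nn:
  vis: cnn                  # -> cnn (conv stem)
# ---
nn:
  vis: transformer          # -> native_transformer
  transformer:
    use_hf_backbone: false
# ---
nn:
  vis: transformer          # -> hf_transformer (Hugging Face vision backbone)
  transformer:
    use_hf_backbone: true
```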
Warning
Most nested nn models use Pydantic extra="ignore". Unknown or misspelled keys under nn.* are silently dropped at load — they do not error. Prefer this catalog + Neural network YAML (nn) — full reference over guesswork.
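The silent-drop behavior can be mimicked in a few lines. The class and helper below are hypothetical stand-ins, not the project's real schema; they only illustrate why a misspelled key vanishes without an error.

```python
from dataclasses import dataclass, fields

@dataclass
class FloatMlpCfg:
    """Stand-in for a nested nn.* model (hypothetical, not the real schema)."""
    hidden_dim: int = 256
    layers: int = 2

def load_ignoring_extras(cls, raw: dict):
    # Mirrors Pydantic extra="ignore": unknown keys are dropped, not rejected.
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in raw.items() if k in known})

# The typo "hiden_dim" is silently discarded, so the default (256) wins.
cfg = load_ignoring_extras(FloatMlpCfg, {"hiden_dim": 512, "layers": 3})
```

This is exactly the failure mode the warning describes: the run proceeds with defaults and the intended override never takes effect.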
1. fusion_mode: none (no multimodal stack)
| Algorithm | Vision (effective) | Built module(s) | Notes |
|---|---|---|---|
| IQN |  |  | Optional … |
| IQN |  |  | Float-only; image tensor can be zeros at runtime. |
| IQN |  |  | Requires … |
| IQN |  | — | Not wired: … |
| PPO |  |  | CNN kwargs only from … |
| PPO |  |  | HF CLS + float MLP + shared trunk + policy/value heads. |
| PPO |  |  | Pitfall: no conv stem is built → float-only behavior (image side zeros). For native patch vision use … |
2. Multimodal fusion modes
Here `nn.fusion_mode` is one of `vision_transformer`, `post_concat`, or `unified`.
Shared body: `TorchMultimodalActorCritic` (`multimodal_torch_fusion.py`).
- PPO — `include_policy_heads=True` (trunk + `policy_head`/`value_head`).
- IQN — `include_policy_heads=False`; wrapped by `IQNSharedBackboneNetwork` + `iqn_fc` + dueling heads (same quantile path as classic IQN after the fusion hidden).
Float MLP width for fusion builds: `nn.encoder.mlp.hidden_dim` if set, else `nn.float.mlp.hidden_dim` (`float_hidden_dim_effective()`).
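The fallback rule can be stated as a one-line function. The signature below is illustrative (the real `float_hidden_dim_effective()` lives on the schema object); only the precedence is taken from the text.

```python
from typing import Optional

def float_hidden_dim_effective(encoder_mlp_hidden: Optional[int],
                               float_mlp_hidden: int) -> int:
    """nn.encoder.mlp.hidden_dim wins if set, else nn.float.mlp.hidden_dim.
    Illustrative signature only; the real helper reads these off the config."""
    return encoder_mlp_hidden if encoder_mlp_hidden is not None else float_mlp_hidden
```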
Fusion trunk kind (after early tokens / concat): `nn.encoder.fusion_encoder` if set, else inferred by `infer_fusion_encoder` in `nn_schema`:

- If `fusion_encoder` is set → use it (must agree with `encoder.transformer.use_hf_backbone`; the schema forbids `native_transformer` + HF backbone on the same encoder slot).
- Else, if `encoder.transformer.use_hf_backbone: true` → `hf_embedding` (HF model driven with `inputs_embeds`, e.g. BERT-class; path from `encoder.transformer.model_name_or_path` or `encoder.hf_embedding`).
- Else, if `fusion_mode == vision_transformer` → `linear` (concat embeddings → `bridge` Linear to `decoder.dense_hidden_dimension`).
- Else → `native_transformer` (`torch.nn.TransformerEncoder` on the fusion sequence; `n_layers: 0` means no encoder layer — optional blocks are skipped via `_make_encoder_optional`).
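The precedence above can be paraphrased as a small decision function. This is a sketch, not the real `infer_fusion_encoder` (which also runs the compatibility validation mentioned above); names and signature are illustrative.

```python
from typing import Optional

def infer_fusion_encoder_sketch(fusion_mode: str,
                                use_hf_backbone: bool,
                                explicit: Optional[str] = None) -> str:
    """Paraphrase of the documented precedence; illustrative only."""
    if explicit is not None:
        return explicit                # explicit fusion_encoder wins
    if use_hf_backbone:
        return "hf_embedding"          # HF model driven via inputs_embeds
    if fusion_mode == "vision_transformer":
        return "linear"                # concat embeddings -> bridge Linear
    return "native_transformer"        # torch.nn.TransformerEncoder trunk
```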
Explicit kinds `mlp` / `cnn` / `hf_embedding` use `nn.encoder.fusion_mlp`, `fusion_cnn`, and `hf_embedding` respectively (see Neural network YAML (nn) — full reference).
vision_transformer mode
Image branch and float MLP outputs are fused (default trunk `linear` unless overridden).
| Vision branch | Image path (if …) | Fusion path |
|---|---|---|
| `cnn` |  | Default … |
| `native_transformer` |  | Same as above after pooling / embedding. |
| `hf_transformer` | HF vision backbone + optional … | Same default … |
| `none` | No image tokens | Float-only side still participates in concat / sequence as implemented. |
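A minimal `vision_transformer` config might look like the sketch below. The field names appear on this page; the values and exact nesting are illustrative, not a verified working file.

```yaml
# Hypothetical minimal fusion config; values and nesting are illustrative.
nn:
  fusion_mode: vision_transformer
  vis: cnn                      # image branch; 'none' keeps only the float side
  encoder:
    # fusion_encoder omitted -> inferred 'linear' for vision_transformer
    mlp:
      hidden_dim: 256           # float MLP width (float_hidden_dim_effective)
  decoder:
    dense_hidden_dimension: 512 # target width of the bridge Linear
```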
post_concat mode
Tokenize vision + float, then fusion trunk.
| Mode | Behavior (simplified) | Typical vision |
|---|---|---|
|  | Image branch (CNN / native / HF) and float MLP produce a single fused vector → projected to … | CNN, native patch stack, or HF with … |
| `token_sequence` | Vision contributes one or many tokens at … | CNN → one vision token; native patches → many; HF with … |
unified mode
Joint sequence over image token(s) and learned float token(s).
| Vision branch | Image tokens | Constraints |
|---|---|---|
| `cnn` | One image token (conv → Linear to …) | Floats → … |
| `native_transformer` | Patch grid tokens at … | Schema enforces … |
| `hf_transformer` | N tokens from HF backbone (count derived from processor / backbone); projected to … | Optional native … |
`float_feature_extractor` (2× MLP on floats) is omitted for `unified` and for `post_concat` + `token_sequence` + `float_token_input: raw` — floats enter tokenization directly where that path applies.
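The `post_concat` + `token_sequence` + raw-float-token combination mentioned above might be spelled roughly as follows. The key nesting here is a guess for illustration; only the value names (`token_sequence`, `float_token_input: raw`) come from this page, so check `nn_schema.py` for the real layout.

```yaml
# Hypothetical sketch; exact key nesting may differ from nn_schema.py.
nn:
  fusion_mode: post_concat
  encoder:
    post_concat:
      mode: token_sequence    # vision contributes one or many tokens
      float_token_input: raw  # floats enter tokenization directly;
                              # float_feature_extractor is omitted
```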
3. IQN decoder and BTR on heads
Applies to the classic `IQN_Network` and the shared-backbone IQN (multimodal / HF vision).
Slots:

- `decoder.advantage` and `decoder.value`: either `mlp` or `transformer` (not both per slot). Aliases: `mlp.layers` ↔ `n_hidden_layers`; `hidden` ↔ `hidden_dim`.
- Transformer slot: a native `torch.nn.TransformerEncoder` on the chunked state; the schema requires `decoder.shared_input: post_tau` if any slot uses `transformer`.
- BTR dense-head flags (LayerNorm, NoisyNet, `noisy_sigma0`) apply via `iqn_btr_mlp_head_kw_from_config` (see BTR options (IQN + paper extras)).
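Putting the slot rules together, a mixed decoder might look like the sketch below. Field names and aliases come from the rules above; the values and exact nesting are illustrative placeholders.

```yaml
# Illustrative decoder slots; values are placeholders.
nn:
  decoder:
    shared_input: post_tau    # required once any slot uses 'transformer'
    advantage:
      transformer:
        n_layers: 2
    value:
      mlp:
        layers: 2             # alias of n_hidden_layers
        hidden: 256           # alias of hidden_dim
```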
4. Warm start and checkpoints
- Multimodal PPO: `nn.init_from_pretrained` — a Rulka fusion `save_pretrained` dir; loaded after build in `make_multimodal_fusion_network_pair` (unless skipped via a utility flag; see Neural network YAML (nn) — full reference).
- Multimodal IQN: the same directory format may exist, but automatic hub load is not guaranteed to mirror PPO — prefer continuing from `weights1.torch` / an explicit load in your workflow.
- Hub JSON may carry `rulka_transformers.vis_cnn` for CNN stems; older bundles without it fall back to default conv kwargs (see PPO actor-critic architecture).
5. Reference YAML files
| File | Role |
|---|---|
|  | IQN … |
|  | IQN + … |
|  | PPO baseline (…) |
|  | Minimal PPO CNN + float MLP. |
|  | PPO … |
|  | PPO … |
There is no single YAML file covering every cell of the tables above; combine Neural network YAML (nn) — full reference with the closest example and edit the `nn` fields.
See also
IQN architecture — IQN routing and tensors
PPO actor-critic architecture — PPO variants A/B/C
GRPO: network and training — GRPO training (same stacks as PPO)
BTR options (IQN + paper extras) — BTR flags on IQN
Neural network YAML (nn) — full reference in Configuration Guide
`config_files/nn_schema.py` — validators for mutually exclusive options and geometry