TrackMania RL - Documentation
Welcome to the TrackMania RL project documentation!
This is a fork and extension of the original Linesight project, adapted for reinforcement learning experiments in Trackmania Nations Forever.
The project trains an AI agent to drive in Trackmania Nations Forever using reinforcement learning. The default algorithm is IQN (Implicit Quantile Networks, a distributional off-policy method). Policy-optimization alternatives are PPO (on-policy clipped actor-critic), DPO (preference-based, sharing the PPO network), and GRPO (group-relative returns; see GRPO: network and training). All algorithms can use a CNN image head, a Hugging Face vision backbone, or the shared native multimodal fusion graph (nn.fusion_mode); see Model architectures and the Configuration Guide.
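As a rough illustration of how an experiment might be wired together, here is a minimal YAML sketch. Only the algorithm names and the nn.fusion_mode option are taken from this page; every other key name and value below is a hypothetical placeholder, not the project's actual schema (see the Configuration Guide and the nn reference for the real keys).

```yaml
# Hypothetical config sketch -- key names are illustrative, not authoritative.
algorithm: iqn          # default; alternatives described above: ppo | dpo | grpo
nn:
  # Selects how image and float observations are combined.
  # The option name comes from this page; the values are assumed.
  fusion_mode: native   # e.g. a CNN head or a Hugging Face vision backbone instead
```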
Key Features:
- Distributional RL with IQN (Implicit Quantile Network), the default
- Optional on-policy PPO, DPO, and GRPO with the shared TM rollout pipeline and PPO-style actor-critic (see PPO configuration (ppo:), DPO configuration (dpo:), and GRPO configuration (grpo:) in the Configuration Guide; architecture diagrams under Model architectures)
- Modular configuration system for easy experimentation
- Support for multiple parallel game instances
- Hot-reloadable training parameters
- TensorBoard integration for monitoring
- Virtual checkpoint system for dense progress tracking
All runs produced by this project are Tool Assisted. They must not be submitted to the Official Leaderboards.
User Documentation:
- Installation
- Getting started
- Custom training
- Configuration Guide
- Quick Start
- Configuration Structure (YAML)
- Neural network YAML (nn): full reference
- Environment Configuration
- Neural Network Configuration
- Training Configuration
- Memory Configuration
- Exploration Configuration
- Rewards Configuration
- Map Cycle Configuration
- Performance Configuration
- Advanced Topics
- Troubleshooting
- Further Reading
- Game inputs and float observation vector
- TMNF replay download and frame capture
- Pipeline: steps to run (in order)
- How it works (details)
- Module layout (replays_tmnf)
- Modes and options (download)
- Pipeline (--track-ids)
- Filter tracks (step 3): filter_track_ids_no_respawn.py
- Filter tracks (step 3a): filter_track_ids_custom_maptype.py
- Main arguments (download)
- Extracting map from replay
- Frame capture (capture_replays_tmnf.py)
- Examples (from project root)
- Level 0 visual pretraining on captured frames
- API (TMNF-X / ManiaExchange)
- Publishing dataset to Hugging Face Hub
- TensorBoard Metrics Reference
- User FAQ
- Troubleshooting
Dev Documentation:
Model Architectures:
Experiments:
Community tips & tricks