TrackMania RL - Documentation

Welcome to the TrackMania RL project documentation!

This is a fork and extension of the original Linesight project, adapted for reinforcement learning experiments in Trackmania Nations Forever.

The project trains an AI agent to drive in Trackmania Nations Forever using reinforcement learning. The default algorithm is IQN (Implicit Quantile Networks), a distributional off-policy method. Three policy-optimization alternatives are available: PPO (on-policy clipped actor-critic), DPO (preference-based, using the same network as PPO), and GRPO (group-relative returns; see GRPO: network and training). Each algorithm can use a CNN image head, a Hugging Face vision model, or a shared native multimodal fusion graph (nn.fusion_mode); see Model architectures and Configuration Guide.
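
To make "group-relative returns" concrete, here is a minimal sketch of the general GRPO idea: each rollout's return is scored against the mean and spread of the other rollouts in its group, rather than against a learned value baseline. The function name, the eps constant, and the sample returns are hypothetical and used for illustration only; the project's actual implementation may differ, so treat GRPO: network and training as the authoritative description.

```python
from statistics import fmean, pstdev


def group_relative_advantages(returns, eps=1e-8):
    """Score each return relative to its own group (illustrative GRPO-style baseline).

    `returns` holds the total reward of every rollout in one group.
    `eps` is an assumed small constant that avoids division by zero
    when all rollouts in the group score the same.
    """
    mean = fmean(returns)
    std = pstdev(returns)  # population standard deviation of the group
    return [(r - mean) / (std + eps) for r in returns]


# Hypothetical returns from four rollouts collected under the same conditions.
advantages = group_relative_advantages([10.0, 12.0, 8.0, 10.0])
```

Rollouts that beat the group average receive a positive advantage and those below it a negative one, so the policy update needs no separate critic to provide a baseline.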

Key Features:

All runs produced by this project are Tool Assisted. They must not be submitted to the Official Leaderboards.

User Documentation:

Experiments:

Community tips & tricks