AI and TM enthusiasts:

tmrl enables you to train AIs in TrackMania with minimal effort. Tutorial for you guys here, video of a pre-trained AI here, and a beginner introduction to the SAC algorithm here.

ML developers / roboticists:

tmrl is a Python library designed to facilitate the implementation of ad-hoc RL pipelines for industrial applications, most notably real-time control. Minimal example here, full tutorial here, and documentation here.

ML developers who are TM enthusiasts with no interest in learning this huge thing:

tmrl provides an easy-to-use Gymnasium environment for TrackMania. Fast-track for you guys here.

Everyone:

tmrl hosts the TrackMania Roborace League, a vision-based AI competition where participants design real-time self-racing AIs in the TrackMania 2020 video game.

The TMRL project

Introduction

tmrl is a Python framework designed to help you train Artificial Intelligences (AIs) through deep Reinforcement Learning (RL) in real-time applications (robots, video games, high-frequency control...).

As a fun and safe robot proxy for vision-based autonomous driving, tmrl features a readily implemented example pipeline for the TrackMania 2020 racing video game.

Note: In the context of RL, an AI is called a policy.

User features (TrackMania example pipeline):

Training algorithms: tmrl comes with a readily implemented example pipeline that lets you easily train policies in TrackMania 2020 with state-of-the-art deep Reinforcement Learning algorithms such as Soft Actor-Critic (SAC) and Randomized Ensembled Double Q-Learning (REDQ). These algorithms store collected samples in a large dataset called a replay memory. In parallel, these samples are used to train an artificial neural network (the policy) that maps observations (images, speed...) to relevant actions (gas, brake, steering angle...).
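
To make the idea concrete, here is a minimal sketch of what a replay memory does: a fixed-capacity buffer of transitions with uniform random sampling. This is a simplified stand-in for illustration, not tmrl's actual implementation (which notably supports observation histories and custom samplers):

```python
# Minimal replay memory sketch: store transitions, sample uniformly.
import random
from collections import deque

class ReplayMemory:
    def __init__(self, capacity):
        # deque with maxlen evicts the oldest samples once capacity is reached
        self.buffer = deque(maxlen=capacity)

    def append(self, obs, act, rew, next_obs, done):
        self.buffer.append((obs, act, rew, next_obs, done))

    def sample(self, batch_size):
        # uniform sampling without replacement within one batch
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

memory = ReplayMemory(capacity=1000)
for step in range(10):
    memory.append(obs=step, act=0.0, rew=1.0, next_obs=step + 1, done=False)
batch = memory.sample(4)  # a training batch of 4 transitions
```

In an off-policy algorithm such as SAC, the collection loop appends transitions while the training loop repeatedly samples such batches to update the policy.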

Analog control from screenshots: The tmrl example pipeline trains policies that drive from raw screenshots captured in real time. For beginners, we also provide simpler rangefinder ("LIDAR") observations, which are less potent but easier to learn from. The example pipeline controls the game via a virtual gamepad, which enables analog actions.
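
As a rough illustration of what "analog actions" means here, the sketch below maps a continuous policy output in [-1, 1] per dimension (gas, brake, steering) to gamepad-style ranges. The function name and value ranges are assumptions for illustration; in the real pipeline, a library such as vgamepad then sends such values to the game through an emulated controller:

```python
# Hypothetical helper: clamp a continuous (gas, brake, steer) action and
# map it to analog gamepad ranges (triggers in [0, 1], stick in [-1, 1]).
def action_to_gamepad(action):
    gas, brake, steer = (max(-1.0, min(1.0, a)) for a in action)
    return {
        "right_trigger": max(0.0, gas),   # gas pedal: trigger range [0, 1]
        "left_trigger": max(0.0, brake),  # brake pedal: trigger range [0, 1]
        "stick_x": steer,                 # steering: stick x-axis in [-1, 1]
    }

# e.g., an out-of-range gas value gets clamped, negative brake maps to 0
controls = action_to_gamepad([1.5, -0.2, -0.5])
```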

Models: To process LIDAR measurements, the tmrl example pipeline uses a Multi-Layer Perceptron (MLP). To process raw camera images (snapshots), it uses a Convolutional Neural Network (CNN). These models learn the physics of the game from histories of observations equally spaced in time.
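
The sketch below shows how a history of equally spaced LIDAR scans can be flattened and fed through a small MLP. The layer sizes, the number of beams, and the history length are illustrative assumptions, not tmrl's actual architecture:

```python
# Toy MLP over a history of LIDAR scans (NumPy, forward pass only).
import numpy as np

rng = np.random.default_rng(0)

def mlp_forward(x, weights):
    # alternate affine layers and ReLU activations; last layer stays linear
    for w, b in weights[:-1]:
        x = np.maximum(x @ w + b, 0.0)
    w, b = weights[-1]
    return x @ w + b

history_len, lidar_beams = 4, 19                 # assumed: 4 stacked scans of 19 beams
obs_history = rng.normal(size=(history_len, lidar_beams))
x = obs_history.reshape(-1)                      # flatten the history into one vector

layer_sizes = [history_len * lidar_beams, 64, 64, 3]  # 3 outputs: gas, brake, steer
weights = [(rng.normal(size=(n_in, n_out)) * 0.1, np.zeros(n_out))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

action = mlp_forward(x, weights)
```

Stacking several past observations is what lets a memoryless model infer dynamics such as speed and acceleration from position-like measurements.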

Developer features (real-world applications in Python):

Python library: tmrl is a complete framework designed to help you successfully implement ad-hoc RL pipelines for real-world applications. It features secure remote training and fine-grained customizability, and it is fully compatible with real-time environments (e.g., robots). It is based on a single-server / multiple-clients architecture, which enables collecting samples locally from one to arbitrarily many workers and training remotely on a High-Performance Computing cluster. A complete tutorial for adapting this to your specific application is provided here.
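
To illustrate the single-server / multiple-clients idea, the sketch below uses threads and an in-process queue as a stand-in for the network layer: several rollout workers push samples to a shared channel while one trainer consumes them. This is only a conceptual analogy; tmrl actually transfers objects between machines over the Internet (via tlspyo), not through a local queue:

```python
# Conceptual sketch: many sample-collecting workers, one training consumer.
import queue
import threading

sample_queue = queue.Queue()  # stands in for the network server

def rollout_worker(worker_id, n_samples):
    # each worker collects samples locally and ships them to the trainer
    for step in range(n_samples):
        sample_queue.put((worker_id, step))

def trainer(n_total, collected):
    # the trainer consumes samples (a real trainer would update the policy here)
    for _ in range(n_total):
        collected.append(sample_queue.get())

collected = []
workers = [threading.Thread(target=rollout_worker, args=(i, 5)) for i in range(3)]
trainer_thread = threading.Thread(target=trainer, args=(15, collected))
for t in workers + [trainer_thread]:
    t.start()
for t in workers + [trainer_thread]:
    t.join()
```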

TrackMania Gymnasium environment: tmrl comes with a Gymnasium environment for TrackMania 2020, based on rtgym. Once the library is installed, it is easy to use this environment in your own training framework. More information here.
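
The environment follows the standard Gymnasium API (reset, and step returning the 5-tuple of observation, reward, terminated, truncated, info). Since the real environment needs the game running, the sketch below shows the interaction loop against a trivial stand-in environment:

```python
# Standard Gymnasium-style interaction loop against a dummy environment.
class StandInEnv:
    def reset(self):
        self.t = 0
        return [0.0], {}                                # (observation, info)

    def step(self, action):
        self.t += 1
        obs, rew = [float(self.t)], 1.0
        terminated, truncated = self.t >= 10, False     # episode ends after 10 steps
        return obs, rew, terminated, truncated, {}      # Gymnasium 5-tuple

env = StandInEnv()
obs, info = env.reset()
total_reward, done = 0.0, False
while not done:
    action = [1.0, 0.0, 0.0]  # e.g., full gas, no brake, no steering
    obs, rew, terminated, truncated, info = env.step(action)
    total_reward += rew
    done = terminated or truncated
```

With the TrackMania environment, the same loop applies: only the environment construction and the policy computing `action` change.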

External libraries: tmrl gave birth to sub-projects of more general interest, which were cut out and packaged as standalone Python libraries. In particular, rtgym enables implementing Gymnasium environments in real-time applications, vgamepad enables emulating virtual game controllers, and tlspyo enables transferring Python objects over the Internet in a secure fashion.

TMRL in the media:

In the French show Underscore_ (2022-06-08), we used a vision-based (LIDAR) policy to play against the TrackMania world champions. Spoiler: our policy lost by a wide margin (as expected); the superhuman target was set to about 32 s on the tmrl-test track, while the trained policy had a mean performance of about 45.5 s. The Gymnasium environment that we used for the show is available here.

In 2023, we were invited to Ubisoft Montreal to give a talk describing how video games could serve as visual simulators for vision-based autonomous driving in the near future.

Installation

Detailed instructions for installation are provided at this link.

Getting started