MAchine Learning and Interactive Systems

rl library
  by Frezza-Buet Herve, Geist Matthieu


The rl library provides C++ classes and templates for the robust and efficient design of reinforcement learning applications. Templates are used so that the code fits the mathematical formalism of reinforcement learning theory.

Version 3.00.00 has been released. It dramatically simplifies the code and gets rid of complex concepts.

Cite our work...

You can use the rl library freely for research, software design, and education. Please cite the library as follows:

H. FREZZA-BUET, M. GEIST, "A C++ Template-Based Reinforcement Learning Library: Fitting the Code to the Mathematics". In Journal of Machine Learning Research, 14:625-628, 2013.

JMLR Publication

The library has been published in JMLR. The published version is rl-2.04.00, which is not the latest release. For the latest (recommended) release, see the next section. We still provide the sources and the documentation for the published version.

Download and installation

Since version 2.04.00, the rl source code consists only of a set of headers, so it is quite easy to install manually. Nevertheless, we provide some packaging support.
- You can download the latest rl-*.tar.gz library and install it from sources (see here). You can also use yum if you are using Fedora (see here).


You can find the rl documentation here. It is generated with Doxygen, and its examples are numbered. Reading them in the suggested order gives you an up-to-date user manual.


This example implements a 50-armed bandit. An agent chooses one of the arms according to a Q-function that assigns a value to each arm. The example plots histograms of the agent's choices (20000 choices are made).

This is the Q-function:

This is the histogram for the random agent:

This is the histogram for the greedy agent:

This is the histogram for the epsilon-greedy agent:

These are the histograms for the softmax agent as the temperature varies:
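
The four agents above differ only in how they turn the Q-values into a choice. As a rough illustration, here is a minimal standalone C++ sketch of these selection rules. It does not use the rl library: every name in it (greedy, random_arm, epsilon_greedy, softmax, histogram) is hypothetical and chosen for this sketch only, and the library's own interface is described in the Doxygen examples.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <iterator>
#include <random>
#include <vector>

// NOTE: illustrative helpers only, not the rl library API.

// Greedy agent: index of the arm with the largest Q-value (first one on ties).
std::size_t greedy(const std::vector<double>& q) {
  assert(!q.empty());
  return static_cast<std::size_t>(
      std::distance(q.begin(), std::max_element(q.begin(), q.end())));
}

// Random agent: every arm is equally likely, the Q-values are ignored.
template <typename RNG>
std::size_t random_arm(const std::vector<double>& q, RNG& gen) {
  assert(!q.empty());
  std::uniform_int_distribution<std::size_t> pick(0, q.size() - 1);
  return pick(gen);
}

// Epsilon-greedy agent: explore with probability epsilon, otherwise exploit.
template <typename RNG>
std::size_t epsilon_greedy(const std::vector<double>& q, double epsilon, RNG& gen) {
  std::bernoulli_distribution explore(epsilon);
  return explore(gen) ? random_arm(q, gen) : greedy(q);
}

// Softmax agent: P(a) is proportional to exp(Q(a) / temperature).
template <typename RNG>
std::size_t softmax(const std::vector<double>& q, double temperature, RNG& gen) {
  assert(!q.empty() && temperature > 0.0);
  const double q_max = *std::max_element(q.begin(), q.end());
  std::vector<double> weights(q.size());
  for (std::size_t a = 0; a < q.size(); ++a) {
    // Subtracting q_max leaves the probabilities unchanged and avoids overflow.
    weights[a] = std::exp((q[a] - q_max) / temperature);
  }
  std::discrete_distribution<std::size_t> pick(weights.begin(), weights.end());
  return pick(gen);
}

// Counts how often each arm is chosen when policy() is called nb_choices times.
template <typename Policy>
std::vector<std::size_t> histogram(std::size_t nb_arms, std::size_t nb_choices,
                                   Policy policy) {
  std::vector<std::size_t> counts(nb_arms, 0);
  for (std::size_t i = 0; i < nb_choices; ++i) ++counts[policy()];
  return counts;
}
```

For instance, with a std::mt19937 generator gen and a vector q of 50 Q-values, histogram(q.size(), 20000, [&] { return softmax(q, 0.5, gen); }) would count 20000 softmax choices. A low temperature concentrates the choices on the best arm, while a high temperature spreads them almost uniformly.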


The cliff-walking problem, solved with SARSA and Q-learning.
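
The two algorithms differ only in the value they bootstrap on, which the following sketch tries to make explicit. It uses the textbook tabular update rules on a plain table and is independent of the rl library: the QTable alias, the function names, and the parameters alpha (learning rate) and gamma (discount factor) are assumptions made for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// NOTE: illustrative code only, not the rl library API.

// Tabular Q-function, indexed as q[state][action].
using QTable = std::vector<std::vector<double>>;

// SARSA (on-policy): the target uses the action a_next that the agent
// actually selected in the next state.
void sarsa_update(QTable& q, std::size_t s, std::size_t a, double r,
                  std::size_t s_next, std::size_t a_next,
                  double alpha, double gamma) {
  assert(s < q.size() && s_next < q.size());
  const double target = r + gamma * q[s_next][a_next];
  q[s][a] += alpha * (target - q[s][a]);
}

// Q-learning (off-policy): the target uses the best action in the next
// state, whatever the agent does afterwards.
void q_learning_update(QTable& q, std::size_t s, std::size_t a, double r,
                       std::size_t s_next, double alpha, double gamma) {
  assert(s < q.size() && s_next < q.size() && !q[s_next].empty());
  const double best = *std::max_element(q[s_next].begin(), q[s_next].end());
  const double target = r + gamma * best;
  q[s][a] += alpha * (target - q[s][a]);
}

// Transition into a terminal state: there is nothing to bootstrap on,
// so the target is the reward alone (same rule for both algorithms).
void terminal_update(QTable& q, std::size_t s, std::size_t a, double r,
                     double alpha) {
  assert(s < q.size());
  q[s][a] += alpha * (r - q[s][a]);
}
```

Because SARSA evaluates the exploring policy it actually follows, while Q-learning evaluates the greedy one, the two are generally expected to learn different paths along the cliff when exploration is active.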


The command

rl-example-003-003-mountain-car-ktdsarsa learnandmovie 300

generates the Q-function movie. Try it with 100 episodes instead of 300; it works as well.

The command

rl-example-003-003-mountain-car-ktdsarsa test bottom

generates an episode like this