Presentations

21 February 2018, Shannon amphitheatre, building 660
23 February 2018, room 2014, 2nd floor, Shannon building (660)

https://docs.google.com/spreadsheets/d/1eidQleMOdpmXbr3tGaKUdhM0avHUGwBQTtW-lh2QrL4/edit?usp=sharing

Last year's exam

2016-2017 AIC_RL_Exam_16.pdf

Projects

To do: code, experiments, analysis, written report.

Copying and pasting existing programs from the Internet will have consequences, which may include a mark of 0.

Projects involve at most 3 students, except for the last two (Halite and Alesia: 4 students).
Projects are due on February 15th, 23:59 GMT+1.

Each group must produce:

A report of about 2 pages (max. 3 pages, excluding references), as TeX and .pdf files, in the ICML 2017 format, including a description of the approach, the results, and a comparison with other algorithms or the state of the art (when possible). Students unable to write TeX may submit a .doc(x) document together with its .pdf. (Description | ICML2017 TeX package)
The code of your implemented approach. This code should work "out of the box"; add a notice/README listing the required packages/libraries, with special notes if needed. Submitting code taken from the Internet with little or no modification could lead to unwanted consequences.

You can discuss your project's problems and ideas, and ask for more information, at: diviyan (at) lri (dot) fr

The subjects are the following (in increasing order of difficulty):

Mountain car problem (compare two approaches)
Inverted pendulum (compare two representations of the problem)
The acrobot
Octopus
TD-Gammon
Bicycle: balancing + moving forward
Anti-Imitation Policy learning: reproduce an experiment from mainDIVA.pdf
halite.io
Game of Alesia (see Approximate Dynamic Programming for Two-Player Zero-Sum Markov Games, ICML 15): Alesia_game.zip

Pointers

Evaluation

Written exam, 12 February
Graded lab sessions
Projects

13 nov, Michele Sebag

Generalities RL_2017_Cours1.pdf
Value functions RL_2017_Cours2.pdf

20 nov. MS + DK

Value functions (continued), model-free settings RL_2017_Cours3.pdf

27 nov. DK

4 dec. no class

11 dec. MS

8 jan MS + DK

Multi-armed bandits: revised course RL_2017_Cours4_revised.pdf

15 jan.

Lecture: Function Approximation Cours_RL_15_Jan_2018.pdf

24 jan.

Lecture by Mehdi Khamassi
Lecture: Direct Policy Search RL_2017_Cours5.pdf

Date to be set: paper presentations

Neural Optimizer Search with Reinforcement Learning ICML 2017
Boosted Fitted Q-Iteration
Constrained Policy Optimization
~~Curiosity-driven Exploration by Self-supervised Prediction~~ Pauline Brunet and Quentin Bouchut
~~The K-armed Dueling Bandits Problem~~ Zizhao Li and Xudong Zhang
DARLA: Improving Zero-Shot Transfer in Reinforcement Learning
~~Coordinated Multi-Agent Imitation Learning~~ Ghiles SIDI SAID and Amine BIAD
Local Bayesian Optimization of Motor Skills
~~Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning~~ Thomas Gauthier and Pereira Abou Rejaili Rodrigo
~~Designing Neural Network Architectures using Reinforcement Learning~~ Mohamed Ali Dargouth and Walid Belrhalmia
~~Robot gains Social Intelligence through Multimodal Deep Reinforcement Learning~~ Eden Belouadah
Abstraction Selection in Model-based Reinforcement Learning
Universal Value Function Approximators
Deterministic Policy Gradient Algorithms
Dynamic Programming Boosting for Discriminative Macro-Action Discovery

Deep RL: DQN, AlphaGo Zero, AlphaZero

~~Playing Atari with Deep Reinforcement Learning~~ Vincent Boyer and Ludovic Kun
~~[|]~~, Zhengying Liu
~~Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm~~ Thomas Foltete and Guillaume Collin
~~Deep Reinforcement Learning from Self-Play in Imperfect-Information Games~~ Adrien Pavao and Eleonor Bartenlian

ID	Name	Comment	Uploaded	Size	Downloads
1727	AIC_RL_Exam_16.pdf	RL Exam AIC 2016-2017	sebag, Wed. 07 Feb 2018, 15:59	118.81 KB	834
1726	RL_2017_Cours5.pdf	DPS lecture	sebag, Wed. 24 Jan 2018, 00:47	1.71 MB	988
1724	Cours_RL_15_Jan_2018.pdf	RL lecture: Func. Approx	sebag, Mon. 15 Jan 2018, 01:14	2.05 MB	1034
1722	RL_2017_Cours4_revised.pdf		sebag, Sun. 07 Jan 2018, 21:46	2.13 MB	858
1721	Alesia_game.zip	Alesia	sebag, Thu. 04 Jan 2018, 01:19	3.10 KB	658
1720	mainDIVA.pdf	DiVA	sebag, Thu. 04 Jan 2018, 01:17	1.12 MB	815
1718	RL_2017_Cours3.pdf	RL lecture 3	sebag, Mon. 20 Nov 2017, 10:50	924.81 KB	941
1716	RL_2017_Cours2.pdf	RL lecture 2	sebag, Mon. 13 Nov 2017, 22:31	833.02 KB	707
1715	RL_2017_Cours1.pdf	RL lecture 1	sebag, Mon. 13 Nov 2017, 22:31	1.48 MB	782

2017 Module Reinforcement Learning, Michele Sebag, Diviyan Kalainathan, Laurent Cetinsoy