Project: Reinforcement Learning
Pretraining
Literature
Multi-Task Reinforcement Learning: Shaping and Feature Selection, Matthijs Snel and Shimon Whiteson, EWRL 2011.
The Cacla algorithm
Insights in Reinforcement Learning, recent dissertation by Hado van Hasselt
This algorithm uses a value function (critic) and an action function (actor), both implemented as continuous function approximators. The value function is trained on the temporal-difference error, as usual. When the TD error is positive, the action function is trained toward the action that was actually taken; when the TD error is negative, no actor update is made. The algorithm is shown to be competitive with, and in some cases advantageous over, other algorithms. The hardest problem considered is a double pendulum.
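The update rule above can be sketched as follows. This is a minimal illustration, not van Hasselt's reference implementation: it assumes linear function approximators over a state feature vector and Gaussian exploration, with hyperparameter names (alpha, beta, sigma) chosen here for illustration.

```python
import numpy as np

class Cacla:
    """Sketch of the Cacla actor-critic update described above.

    Both the critic V(s) and the actor A(s) are linear function
    approximators over a state feature vector phi(s); actions are
    continuous scalars.
    """

    def __init__(self, n_features, alpha=0.1, beta=0.1, gamma=0.99):
        self.v = np.zeros(n_features)   # critic weights: V(s) = v . phi
        self.a = np.zeros(n_features)   # actor weights:  A(s) = a . phi
        self.alpha, self.beta, self.gamma = alpha, beta, gamma

    def action(self, phi, sigma=0.1):
        # Gaussian exploration around the actor's current output.
        return self.a @ phi + np.random.normal(0.0, sigma)

    def update(self, phi, action, reward, phi_next, done):
        # Critic: standard TD update toward r + gamma * V(s').
        target = reward + (0.0 if done else self.gamma * (self.v @ phi_next))
        delta = target - self.v @ phi
        self.v += self.alpha * delta * phi
        # Actor: only on a positive TD error, move the actor's output
        # toward the action that was actually taken. No update otherwise.
        if delta > 0:
            self.a += self.beta * (action - self.a @ phi) * phi
        return delta
```

The asymmetry is the key design choice: the actor learns only from actions that turned out better than expected, which is what distinguishes Cacla from actor-critic methods that scale the actor update by the (possibly negative) TD error.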
Control of Power Systems
Literature
Using SVM Regression
Knowledge-Based Support-Vector Regression for Reinforcement Learning, by Maclin, Shavlik, Walker, and Torrey
Power Systems
- Ernst, Glavic, Capitanescu, and Wehenkel, Reinforcement Learning Versus Model Predictive Control: A Comparison on a Power System Problem. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, Vol. 39, No. 2, April 2009. Contains a small test problem that is simple to implement.
- Ernst, Glavic, and Wehenkel, Power Systems Stability Control: Reinforcement Learning Framework. IEEE Transactions on Power Systems, Vol. 19, No. 1, February 2004.
- Ernst, Glavic, Geurts, and Wehenkel, Approximate Value Iteration in the Reinforcement Learning Context. Application to Electrical Power System Control. International Journal of Emerging Electric Power Systems, Volume , Issue 1, 2005.
Hard Problems: Good Higher-Dimensional Problems
HIV Infection Benchmark Problem, by Ernst.