Machine Learning Group



Project: Reinforcement Learning

Pretraining

Literature

Multi-Task Reinforcement Learning: Shaping and Feature Selection, Matthijs Snel and Shimon Whiteson, EWRL 2011.

The Cacla Algorithm

Insights in Reinforcement Learning, recent dissertation by Hado van Hasselt

This algorithm uses a value function and an action function, both implemented as continuous function approximators. The value function is trained on the temporal-difference (TD) error, as usual. The action function is trained toward the action actually taken whenever the TD error is positive; no update is made when the TD error is negative. Cacla is shown to be competitive with, and in some respects superior to, other continuous-action algorithms. The hardest problem considered is a double pendulum.
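The update rule above can be sketched as follows. This is a minimal, illustrative Python implementation, assuming linear function approximators over a user-supplied feature vector and Gaussian exploration around the actor's output; the class and parameter names are our own, not from the dissertation.

```python
import numpy as np

class Cacla:
    """Minimal Cacla sketch: linear critic V(s) and linear actor A(s)."""

    def __init__(self, n_features, alpha=0.1, beta=0.1, gamma=0.95, sigma=0.1):
        self.v = np.zeros(n_features)          # critic weights: V(s) = v . phi(s)
        self.w = np.zeros(n_features)          # actor weights:  A(s) = w . phi(s)
        self.alpha, self.beta = alpha, beta    # critic / actor learning rates
        self.gamma, self.sigma = gamma, sigma  # discount factor, exploration noise

    def act(self, phi, rng):
        # Gaussian exploration around the deterministic actor output.
        return float(self.w @ phi) + rng.normal(0.0, self.sigma)

    def update(self, phi, action, reward, phi_next, done):
        # Standard TD(0) error for the critic.
        v_next = 0.0 if done else float(self.v @ phi_next)
        delta = reward + self.gamma * v_next - float(self.v @ phi)
        # The critic always moves toward the TD target.
        self.v += self.alpha * delta * phi
        # The actor moves toward the *taken* action only when the TD error
        # is positive; otherwise it is left unchanged (the defining rule).
        if delta > 0:
            self.w += self.beta * (action - float(self.w @ phi)) * phi
        return delta
```

The asymmetry in the actor update is the key design choice: an action is reinforced only when it turned out better than expected, so the actor learns from successful exploration without ever being pushed toward a worse action.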

Control of Power Systems

Literature

Using SVM Regression

Power Systems

  • Ernst, Glavic, Capitanescu, and Wehenkel, "Reinforcement Learning Versus Model Predictive Control: A Comparison on a Power System Problem," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 39, no. 2, April 2009. Contains a small test problem that is simple to implement.
  • Ernst, Glavic, and Wehenkel, "Power Systems Stability Control: Reinforcement Learning Framework," IEEE Transactions on Power Systems, vol. 19, no. 1, February 2004.
  • Ernst, Glavic, Geurts, and Wehenkel, "Approximate Value Iteration in the Reinforcement Learning Context: Application to Electrical Power System Control," International Journal of Emerging Electric Power Systems, Volume , Issue 1, 2005.

Hard Problems: Good High(er) Dimensional Problems