Date Topic Material
Jan 30 Course Overview OpenAI's Spinning Up as a Deep RL Researcher
Spinning Up with PyTorch
Feb 5 Vanilla Policy Gradient.
Industrial Control Benchmark.
REINFORCE Policy Gradients From Scratch In Numpy by Sam Kirkiles
Reinforcement Learning: An Introduction, Chapter 13
A Benchmark Environment Motivated by Industrial Control Problems by Daniel Hein, et al.
Feb 12 Continuous-action Vanilla Policy Gradient, by Anurag and Chuck Tar file for IDS benchmark code (updated April 8)
RL — Policy Gradient Explained. by Jonathan Hui
Using Gaussian Function for Policy Gradient Methods by Al-Sharif, Muhammad
Feb 19 Kush Unity: A General Platform for Intelligent Agents
Feb 26 Abhishek Training handwritten digits -
OpenAI text generator algorithm held private
Potential Harm from AI -
Technology too dangerous to release to public?
Fake Face Generator
March 5 Sadaf Cooperatively Learning Human Values
March 12 Chuck and Anurag Proximal Policy Optimization
March 26 Apoorv and Dan Elliott (connecting remotely) RL for Dots and Boxes Game
Ensembles in RL (Dan)
April 2 Dan Elliott, Ensembles in RL The Wisdom of the Crowd: Reliable Deep Reinforcement Learning Through Ensembles of Q-functions, Dan, 2018
April 9
April 18 Anurag on learning cooperation Towards Cooperation in Sequential Prisoner's Dilemmas: a Deep Multiagent Reinforcement Learning Approach
April 23 Kush on hierarchical RL Data-Efficient Hierarchical Reinforcement Learning
April 30 Sadaf Quantifying Generalization in Reinforcement Learning
May 7
start.txt · Last modified: 2019/05/16 23:56 by anderson