|Jan 30||Course Overview|| OpenAI's Spinning Up as a Deep RL Researcher
Spinning Up with PyTorch
|Feb 5|| Vanilla Policy Gradient.|
Industrial Control Benchmark.
| REINFORCE Policy Gradients From Scratch In Numpy by Sam Kirkiles
Reinforcement Learning: An Introduction, Chapter 13
A Benchmark Environment Motivated by Industrial Control Problems by Daniel Hein, et al.
|Feb 12||Continuous-action Vanilla Policy Gradient, by Anurag and Chuck|| Tar file for IDS benchmark code (updated April 8)
RL — Policy Gradient Explained, by Jonathan Hui
Using Gaussian Function for Policy Gradient Methods by Al-Sharif, Muhammad
|Feb 19||Kush||Unity: A General Platform for Intelligent Agents|
|Feb 26||Abhishek|| Training handwritten digits - https://medium.com/coinmonks/handwritten-digit-prediction-using-convolutional-neural-networks-in-tensorflow-with-keras-and-live-5ebddf46dc8
OpenAI text generator algorithm held private
Potential Harm from AI - https://www.theverge.com/2019/2/21/18234500/ai-ethics-debate-researchers-harmful-programs-openai
Technology too dangerous to release to public?
Fake Face Generator
|March 5||Sadaf||Cooperatively Learning Human Values|
|March 12||Chuck and Anurag||Proximal Policy Optimization|
|March 26||Apoorv and Dan Elliott (connecting remotely)|| RL for Dots and Boxes Game
Ensembles in RL (Dan)
|April 2||Dan Elliott, Ensembles in RL||The Wisdom of the Crowd: Reliable Deep Reinforcement Learning Through Ensembles of Q-functions, Dan Elliott, 2018|
|April 18||Anurag on learning cooperation||Towards Cooperation in Sequential Prisoner's Dilemmas: a Deep Multiagent Reinforcement Learning Approach|
|April 23||Kush on hierarchical RL||Data-Efficient Hierarchical Reinforcement Learning|
|April 30||Sadaf||Quantifying Generalization in Reinforcement Learning|