Before I took a Ph.D.-level course in reinforcement learning with Dr. Satinder Singh, one of the founding thinkers in the field, I studied the subject on my own with the help of the Sutton & Barto book. I was able to do this through Michigan Aerospace, where several projects seemed to me like a good fit for reinforcement learning.
These studies turned into Black Oak, a repository on my GitHub that contains implementations of Trust Region Policy Optimization and Proximal Policy Optimization, along with some work from a free online course. My work from the reinforcement learning class is unpublished. I am very interested in reinforcement learning because it addresses the prescriptive problem in machine learning: given a state, what is the optimal action to take? This line of thought clearly has the potential to make a far-reaching impact.