http://nicksphere.i2p/specification-gaming/commit
v=meE5aaRJ0Zs,"Chrabaszcz et al, 2018",Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari,https://arxiv.org/pdf/1802.08842.pdf,Sudhanshu Kasewa,
+Reward modeling - Hero,"""The agent has learned to exploit a fault in the reward model (the model rewards actions that seem to lead to shooting a spider, but barely miss it).""",https://www.youtube.com/watch?v=Ehc3lsQqewU&feature=youtu.be&t=52,"Ibarz et al, 2018",Reward learning from human...