http://nicksphere.i2p/specification-gaming/tree/examples.csv
The RL algorithm has managed to exploit the classifier by moving the robot arm in a peculiar way, since the classifier was not trained on this specific kind of negative examples. " " " ,https://bair.berkeley.edu/static/blog/end_to_end/pr2_classifier.gif, " Singh, 2019 " , " End-to-End Deep Reinforcement Learning
without Reward Engineering " ,https://bair.berkeley.edu/blog/2019/05/28/end-to-end/#problem-with-classifiers,Jan Leike,
Go pass,A reimplementation of AlphaGo learns to pass forever if passing is an...