Wisecrack

3y

The following paper combines recurrent neural nets for vision with methods from reinforcement learning research:

https://proceedings.neurips.cc/pape...

Apparently an agent learned to catch a ball 85% of the time without ever being explicitly told to track it. The RL algorithm rewarded the agent only for a successful catch. From that reward signal alone, the system worked out its own policy, and that policy had it tracking the ball, all on its own.
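
For anyone curious what "rewarded only for the catch" can look like in code, here is a rough toy sketch. To be clear, this is not the paper's method: it swaps the recurrent vision model for a plain lookup-table policy trained with REINFORCE, and the grid size, the function names (`run_episode`, `train`, `catch_rate`) and the hyperparameters are all made up for illustration. The only feedback is a 1 or 0 at the end of each episode, and nothing in the code tells the paddle to follow the ball.

```python
import math
import random

WIDTH, STEPS = 5, 4       # hypothetical toy grid: 5 columns, 4 moves per episode
ACTIONS = (-1, 0, 1)      # move paddle left, stay, move right


def softmax(prefs):
    """Turn action preferences into probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def run_episode(theta, rng, greedy=False):
    """Play one episode. The reward is 1.0 only if the paddle ends under the ball."""
    ball = rng.randrange(WIDTH)      # column the ball falls in
    paddle = rng.randrange(WIDTH)    # starting column of the paddle
    trajectory = []
    for _ in range(STEPS):
        state = (ball, paddle)
        probs = softmax(theta[state])
        if greedy:
            action = probs.index(max(probs))
        else:
            action = rng.choices(range(len(ACTIONS)), weights=probs)[0]
        trajectory.append((state, action, probs))
        # Moves past the edge of the grid are clipped.
        paddle = min(WIDTH - 1, max(0, paddle + ACTIONS[action]))
    reward = 1.0 if paddle == ball else 0.0
    return trajectory, reward


def train(episodes=20000, lr=0.5, seed=0):
    """REINFORCE with a running-average baseline (assumed hyperparameters)."""
    rng = random.Random(seed)
    theta = {(b, p): [0.0, 0.0, 0.0] for b in range(WIDTH) for p in range(WIDTH)}
    baseline = 0.0
    for _ in range(episodes):
        trajectory, reward = run_episode(theta, rng)
        advantage = reward - baseline
        baseline += 0.01 * (reward - baseline)
        for state, action, probs in trajectory:
            for a in range(len(ACTIONS)):
                # Gradient of log softmax: indicator(a == action) - pi(a | state)
                grad = (1.0 if a == action else 0.0) - probs[a]
                theta[state][a] += lr * advantage * grad
    return theta


def catch_rate(theta, trials=2000, seed=1):
    """Fraction of catches when always taking the most probable action."""
    rng = random.Random(seed)
    caught = sum(run_episode(theta, rng, greedy=True)[1] for _ in range(trials))
    return caught / trials
```

Running `print(catch_rate(train()))` gives the trained catch rate. Compare it against an all-zeros table, which just slams the paddle left every time. Whatever "move toward the ball" behaviour shows up in `theta` comes purely from the end-of-episode reward.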

Behold, the very infancy of the paperclip maximizer problem.

random

ai

research

Ranter

Comments

3

arcadesdude

6307

3y

Great case against micro-managers too. If machines can work backwards from the goal and figure stuff out without being given every instruction, then we can too!
0

Ranchonyx

10404

3y

Okay, now I'm wondering what use that has for humanity.
1

Midnight-shcode

4433

3y

@Ranchonyx infinite paperclips, obviously

devRant © 2021 Hexical Labs LLC