This Curious AI Beats Many Games...and Gets Addicted to the TV

Pick up cool perks on our Patreon page: https://www.patreon.com/TwoMinutePapers
Crypto and PayPal links are available below. Thank you very much for your generous support!
PayPal: https://www.paypal.me/TwoMinutePapers
Bitcoin: 13hhmJnLEzwXgmgJN7RB6bWVdT7WkrFAHh
Ethereum: 0x002BB163DfE89B7aD0712846F1a1E53ba6136b5A
LTC: LM8AUh5bGcNgzq6HaV1jeaJrFvmKxxgiXg

The paper "Large-Scale Study of Curiosity-Driven Learning" is available here:
Paper - https://pathak22.github.io/large-scale-curiosity/
Blog post - https://blog.openai.com/reinforcement-learning-with-prediction-based-rewards/

We would like to thank our generous Patreon supporters who make Two Minute Papers possible: 313V, Andrew Melnychuk, Angelos Evripiotis, Anthony Vdovitchenko, Brian Gilman, Christian Ahlin, Christoph Jadanowski, Dennis Abts, Emmanuel, Eric Haddad, Eric Martel, Evan Breznyik, Geronimo Moralez, John De Witt, Kjartan Olason, Lorin Atzberger, Marten Rauschenberg, Maurits van Mastrigt, Michael Albrecht, Michael Jensen, Morten Punnerud Engelstad, Nader Shakerin, Owen Skarpness, Raul Araújo da Silva, Rob Rowe, Robin Graham, Ryan Monsurate, Shawn Azman, Steef, Steve Messina, Sunil Kim, Thomas Krcmar, Torsten Reil, Zach Boldyga.
https://www.patreon.com/TwoMinutePapers

Thumbnail background image credit: https://pixabay.com/photo-3774381/
Splash screen/thumbnail design: Felícia Fehér - http://felicia.hu

Károly Zsolnai-Fehér's links:
Facebook: https://www.facebook.com/TwoMinutePapers/
Twitter: https://twitter.com/karoly_zsolnai
Web: https://cg.tuwien.ac.at/~zsolnai/

Comments

FunkyPrince : We are all here watching images on a screen showing a neural network watching images on a screen.

Valérian : This AI will probably get addicted to two minute papers :)

Moby Motion : Fascinating. I remember thinking about this during my time in paediatrics. Curiosity is so vital that it’s tested if we’re worried about a child’s development. One test is to cover a toy with a blanket - and you expect the child to uncover it to see what’s underneath. I can’t wait until we’re sticking toys in front of AIs to check they’re developing normally ;)

Anders Söderberg : A curiosity-driven general AI huh... "What's this do?" *performs Mengele-type medical experiments* "What's that do?" *presses nuclear button* "Ooh, that thing has numbers on it!"

Max Xu : I simply cannot move on through my day if there is an unwatched Two Minute Papers. -- On a serious note: we can develop a Teacher AI to pair with this curiosity-driven student AI. The teacher picks information from the internet and/or automatically generates the environment to feed to the student.

Kram1032 : With a relatively simple change of protocol they actually managed to get rid of the couch potato effect. All they have to do is feed the whole input not only into a network trained as usual, but also into a *second* network that's randomly initialized, fixed (i.e. not trained), but otherwise the same. The base network then needs to learn to predict the output of the random network. If it's awful at that, that's probably because it's in a novel state. And so it's a novelty impulse, but one not really dependent on TV screens. That particular variant managed to occasionally actually beat a stage of Montezuma's Revenge. This success is not yet consistent, but given how badly other algorithms did on that game, that's extremely promising for this avenue. I believe that variant was actually a response to this paper you're covering here? Since these happen so quickly, you really ought to start quoting these papers with not just their release *year* but their release *month* too. The year barely even means anything anymore.
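
Editor's sketch of the mechanism described in the comment above: a fixed, randomly initialized network and a trained predictor, with the prediction error used as a novelty bonus. This is only an illustration under loud assumptions, not the authors' implementation: the "networks" here are tiny linear maps in pure Python, and the names `NoveltyBonus`, `bonus`, and `update` are hypothetical.

```python
import random


def make_linear(n_in, n_out, rng):
    """Return an n_out x n_in weight matrix with random entries (assumed toy 'network')."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def forward(weights, x):
    """Apply a linear map to the observation vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


class NoveltyBonus:
    """Hypothetical helper: prediction error against a fixed random network."""

    def __init__(self, n_in, n_out=4, lr=0.05, seed=0):
        rng = random.Random(seed)
        self.target = make_linear(n_in, n_out, rng)     # random, fixed, never trained
        self.predictor = make_linear(n_in, n_out, rng)  # trained to imitate the target
        self.lr = lr

    def bonus(self, obs):
        """Mean squared prediction error; stays large for unfamiliar observations."""
        t = forward(self.target, obs)
        p = forward(self.predictor, obs)
        return sum((pi - ti) ** 2 for pi, ti in zip(p, t)) / len(t)

    def update(self, obs):
        """One gradient step pulling the predictor toward the target on obs."""
        t = forward(self.target, obs)
        p = forward(self.predictor, obs)
        for row, pi, ti in zip(self.predictor, p, t):
            err = pi - ti
            for j, xj in enumerate(obs):
                row[j] -= self.lr * err * xj  # constant factors folded into lr


# Usage: repeated visits to one observation shrink its bonus,
# while an observation never trained on keeps its original bonus.
novelty = NoveltyBonus(n_in=4)
familiar = [1.0, 0.0, 0.5, 0.0]
unseen = [0.0, 1.0, 0.0, 0.5]
for _ in range(200):
    novelty.update(familiar)
bonus_familiar = novelty.bonus(familiar)
bonus_unseen = novelty.bonus(unseen)
```

Because the target here is a deterministic function of the observation, the predictor can in principle learn it exactly, so the bonus for a repeatedly seen state decays instead of staying high forever.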

Luiz Gagliardi : Wow, this was certainly one of the most interesting episodes for me. I'm so *_curious_* to find out what's gonna happen next.

Thomas M : Add in a function that causes the algo to seek out different types of stimuli as it gets more satisfied with a given type and you will have something a hell of a lot like an animal/human. Eventually, you get bored of the TV and want to play a game, or go for a walk, or go get some food, etc. AGI creeps ever closer.

Xavier X : just loving that mathematical definition of curiosity as the maximizing of surprise!👌 so simple and yet so omg could be onto something huge here!

Fernando Trebien : Learning based on curiosity giving better results than that based on reward. I think the achievement here goes beyond computing.

Vineeth Bhaskara : all of these ideas were proposed already by Schmidhuber in 1991 (Feb and Nov). Any curiosity paper is basically taking the old and applying compute to solve harder/interesting games. Schmidhuber doesn't seem to be credited enough though in these curiosity-family papers, unfortunately.

cmilkau : I want to step in and at least once, say thank you for taking the time and making proper audio transcriptions. This is a rare and much appreciated favour.

HAL NineOoO : It's an amazing time to be alive because it is the last time to be alive.

Fyloeu : Maybe incorporate a life span variable, not many algorithms' lives are eternal. They should be aware of their mortality and stop wasting time watching so much TV, muahahaha!

Itschotsch : Give such an AI access to Wikipedia or YouTube and see what articles/videos it clicks on next.

Eschelaun : I've heard of curiosity based AI and I'm glad to see it succeed on larger scale tests. Very interesting!

Danish Joshi : DUDE! THIS is the most creepy paper presented on this channel till date!

Piotr Joniec : 3:33 The AI should also have a "boredom" factor, which would counterbalance this. A human player would also stop to observe those flashing images, but after a while they would get bored and move on.
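
Editor's sketch of the "boredom" factor suggested above, as one possible reading: the surprise reward from a stimulus is scaled by an interest level that decays while the agent keeps attending to that stimulus and slowly recovers while it attends to something else. Everything here is an assumption for illustration, including the class name `BoredomScaler`, the `decay` and `recovery` parameters, and the idea that stimuli can be identified by a key; nothing like this is confirmed to be in the paper.

```python
class BoredomScaler:
    """Hypothetical boredom factor: repeated exposure to a stimulus lowers its reward."""

    def __init__(self, decay=0.9, recovery=0.05):
        self.decay = decay        # multiplier applied to interest after each exposure
        self.recovery = recovery  # interest regained per step spent elsewhere
        self.interest = {}        # stimulus key -> interest level in (0, 1]

    def scale(self, key, surprise):
        """Return the surprise reward scaled by current interest in this stimulus."""
        # Every other stimulus slowly becomes interesting again.
        for other in self.interest:
            if other != key:
                self.interest[other] = min(1.0, self.interest[other] + self.recovery)
        level = self.interest.get(key, 1.0)
        # Watching the same thing again makes it less rewarding next time.
        self.interest[key] = level * self.decay
        return surprise * level


# Usage: staring at the TV pays less and less, while a new corridor pays in full.
scaler = BoredomScaler(decay=0.5)
tv_rewards = [scaler.scale("tv", 1.0) for _ in range(3)]
corridor_reward = scaler.scale("corridor", 1.0)
```

The design choice is deliberately simple: a multiplicative decay means the TV never becomes worthless, only progressively less attractive than unexplored alternatives.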

Pockets : TIL I'm an AI

Shaun Kennedy : So all we need to do to stop the robot overlords is turn the television on? Suddenly Skynet isn't so scary.

Ebumbaya ' : Man... I love it when similarities between humans and AI become so apparent... Really makes you feel weird when even those things that we think define us as humans arise in AIs.

EZCAPE Records : This seems very much like us: IMO, explosions, physics objects interacting, and flashing screens would be the most "interesting" things, the same ones humans seem to follow.

Kilgore Trout : Jürgen Schmidhuber introduced the concept of artificial curiosity 20 years ago.

Simo Vihinen : Hahahahaaaa loot boxes are saved! It's not gambling if it's not a human playing.

Hououin Kyouma : Great we figured out Curiosity is needed for better AI but not for students in schools; Great Job mankind;;

Sinan Akkoyun : The only thing that I don't understand is your intro but I like it :)

Shall NotWither : What a useful examination of human nature!( in the scope of education no less.)

SalaHyena : So, I'm not alone in sometimes just trying to catch all of the different voicelines/tv-programs in videogames? I've listened to in-game radiostations long enough for them to loop. Same with the in-game TVs. That's actually just a very human reaction, if the AI doesn't have any kind of priorities and context for their curiosity. The AI is essentially me on my first playthrough of Deus Ex: Human Revolution, stopping to watch all the newsbroadcasts for lorebits and worldbuilding.

Lugmillord : Wow, this is scratching on some human-like traits.

biao zhou : Could you upload some videos about Generative Adversarial Network?

Alacorn : Skynet will be defeated by Soap Operas, what a time to be alive. :)

VAN17INO6 : Oh, so that's why Sims just stare at random images forever

Jay Kalokar : For the first time, I've clicked on video this fast. EVER.

如月.飛羽 : I'm gonna use this on myself and orient my learning around curiosity explicitly now. Thanks for the insight :P

happysmash27 : Often, I complete video game levels out of curiosity for what comes next, so this makes sense.

MTRredux : "I wonder how it would do in modern games with loot boxes" lol...Imagine EA makes a stock market bot that can learn patterns of the market to make a profit, and then they use it to get addicted to their games and it spends all the money it earns on loot boxes!

Zijkhal : It's absolutely mindblowing to see these new inventions come out one after the other. However, I am among those who believe it is not so easy to develop an AI anywhere near human level of intelligence. After all, the first human-level intelligence took hundreds of millions of years to develop after the more complex organic bodies needed to sustain a brain developed. Building a higher-intelligence being is not as easy as just throwing more of this or that at it; you'll encounter newer and newer problems to solve, each harder and harder. And each solution to those problems will result in a new limitation to the development of intelligence (like your TV addiction example). On the flipside, though, I could see that some of these issues could be circumvented by manually setting some parameters for the AI in question (like reason to live, motivation, etc.), but that is where such AIs can become very dangerous. As long as they are similar to humans, they (human-level AI) can be expected to have a similar spread in ethics and behaviour, but the moment we start to intervene, we could cause a lot of them to go into paperclip-maximizing mode.

Will : Improvements to kick out from addiction to novelty (he typed after spending far too long watching YouTube):
* A boredom modifier (could just be a random reward to change activity pattern after a time period)
* Combination with other methods, such as reinforcement scores against wider goals/values like "score maximiser" or "level completer" or "leaderboard topper" or "item collector" (i.e. give it sight of points, badges and leaderboards)
* The ability to select different games (from a list) based on the above

zuluknob : It would be interesting if they "reversed" this network, as they did with deep dream. A curious nn that eats its own tail.

lennart rolland : Curious AI: "Wonder what happens if I exterminate all human beings"

Julien Flutto : The AI are getting cleverer and cleverer... Good Video👌😍

Amipotsophspond : Hey, I have an idea, Pixel Blasé: the agent would get bored of seeing the same random pixel for too long and would black it out for a time. This is done with a dropout filter layer that only drops out pixels that change more than x times over y frames; it would change that pixel to black for z frames. Maybe "change" could even be defined as falling outside a range of similar colors: if it was grayscale, a change only counts if it is more than + or - 10 away from the last pixel in that location. I think it would slow movement down through the maze, but it might push through an area with a TV on the wall. I think in games like the bowling one it would behave randomly for the times the score pixels are blacked out. Here is some pseudocode of the idea that blocks out for only 1 frame; the amount of blackout could be handled outside the function.

def PixelBlase(ListOfGrayColorsShown, Range=10, AmountOfChangeNeed=5):
    NumberOfTimesChanged = 0
    LastGrayColor = 0
    for i in ListOfGrayColorsShown:
        # Count a change only when the new value leaves the +/- Range band
        # around the previous value of this pixel.
        if i < LastGrayColor - Range or i > LastGrayColor + Range:
            NumberOfTimesChanged += 1
        LastGrayColor = i
    if NumberOfTimesChanged >= AmountOfChangeNeed:
        return 0  # the pixel changed too often: black it out
    else:
        return LastGrayColor

Post-scarcity: let's get there as fast as possible and end the horror of poverty. Universal basic income is the capitalist thing to do; your customers need money to buy your products. UBI is successful if they spend the money on products or services. Being a customer is a job: selecting the business that will get money for products and services. Like farming, this is a job that was in the past done by everyone for free. No government involvement is needed for UBI, only AI investors that invest a fund. Where does the money come from and who is selected first for UBI? Industries that want to have people that buy their products contribute to these funds, people that pay to belong to UBI contribute to these funds, and a lottery brings people into the fund: buying tickets contributes to the funds, and a lottery grows a fund quickly.
Clearly, competing funds would use capitalism to make the best UBI. No authority or tax is needed. The cost of products is driven down by competition and post-scarcity is achieved. Do your part today; you know yourself, so you know the best way you can help. Maybe what you do helps, maybe it does not, but with everyone trying we will get there faster.

Soul-Burn : The blog post actually shows that they solve the noisy TV problem! Most of the paper also talks about Montezuma's Revenge, showing that the agent succeeds better than an average human and sometimes succeeds visiting all rooms in the level. Amazing piece of work.

0dWHOHWb0 : Odd, it didn't show up in my sub list

Danny B : DO because of Curiosity, KEEP DOING because the enjoyment of progress, FINISH doing to feel satisfied, START AGAIN or START SIMILAR because of the desire to feel the satisfaction felt before. That is my definition of an intrinsic reward system. Neural network that.

Bartek Juszczak : Can we please put this on a robot? Or even RC car with some sensors.

TheQwampa : A curious AI... Well, we are one step closer to SkyNet.

TechnoBabble : What I find interesting is how simply they modeled curiosity, it seems curiosity is just the avoidance of boredom/redundancy. I never thought of it that way.

axel roijers : I think to solve the TV problem the model needs a concept of boredom. As in, it grows curiosity over time but loses curiosity when its itch is being scratched, so to say.