This Curious AI Beats Many Games...and Gets Addicted to the TV

Two Minute Papers : Apologies for the delay...creating this episode has been "eventful". :) This is why: https://twitter.com/karoly_zsolnai/status/1063476734962659329

Moby Motion : Fascinating. I remember thinking about this during my time in paediatrics. Curiosity is so vital that it’s tested if we’re worried about a child’s development. One test is to cover a toy with a blanket - and you expect the child to uncover it to see what’s underneath. I can’t wait until we’re sticking toys in front of AIs to check they’re developing normally ;)

FunkyPrince : We are all here watching images on a screen showing a neural network watching images on a screen.

Valérian : This AI will probably get addicted to two minute papers :)

Anders Söderberg : A curiosity-driven general AI huh... "What's this do?" *performs Mengele-type medical experiments* "What's that do?" *presses nuclear button* "Ooh, that thing has numbers on it!"

lIIlIllIlIl : *AI after watching TV for too long:* _Huh.. I know kung fu_

Xavier X : just loving that mathematical definition of curiosity as the maximizing of surprise!👌 so simple and yet so omg could be onto something huge here!
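
For reference, one common way to write "curiosity as surprise" is as a prediction error. This is a sketch under assumed notation, not a formula quoted from the video: $\phi$ is a feature embedding of an observation, $f$ is a learned forward model, $s_t$ and $a_t$ are the state and action at step $t$, and $r^{\text{int}}_t$ is the intrinsic reward.

$$
r^{\text{int}}_t = \left\lVert f\big(\phi(s_t), a_t\big) - \phi(s_{t+1}) \right\rVert_2^2
$$

The agent is rewarded when its own prediction of the next observation is wrong, so it is drawn toward transitions it cannot yet predict. A screen showing unpredictable content keeps this error high indefinitely, which is the TV addiction from the title.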

Kram1032 : With a relatively simple change of protocol they actually managed to get rid of the couch potato effect. All they have to do is to feed the whole input not only into a network trained as usual, but also into a *second* network that's randomly initialized, fixed (i.e. not trained), but otherwise the same. The base network then needs to learn to predict the output of the random network. If it's awful at that, that's probably because it's in a novel state. And so it's a novelty impulse, but one not really dependent on TV screens. That particular variant managed to occasionally actually beat a stage of Montezuma's Revenge. This success is not yet consistent, but given how badly other algorithms did on that game, that's extremely promising for this avenue. I believe that variant was actually a response to this paper you're covering here? Since these happen so quickly, you really ought to start quoting these papers with not just their release *year* but their release *month* too. The year barely even means anything anymore.
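
The two-network idea in the comment above can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the class name `RandomNetworkNovelty`, the single linear layer standing in for each network, and the learning rate are hypothetical and not taken from the paper, which uses deep networks on game frames. The sketch shows only the core mechanism: a fixed random "target" network, a trained "predictor" network, and the prediction error used as a novelty bonus.

```python
import random


def make_linear(n_in, n_out, rng):
    """Return an n_out x n_in weight matrix with random entries in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def forward(weights, x):
    """Apply a linear layer (no bias) to the input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


class RandomNetworkNovelty:
    """Toy novelty bonus: error of a trained predictor against a fixed random net."""

    def __init__(self, n_in, n_out=4, lr=0.05, seed=0):
        rng = random.Random(seed)
        self.target = make_linear(n_in, n_out, rng)     # random, never trained
        self.predictor = make_linear(n_in, n_out, rng)  # trained to imitate target
        self.lr = lr

    def novelty(self, x):
        """Squared prediction error; large for states the predictor has not seen."""
        t = forward(self.target, x)
        p = forward(self.predictor, x)
        return sum((pi - ti) ** 2 for pi, ti in zip(p, t))

    def update(self, x):
        """One gradient step moving the predictor's output toward the target's."""
        t = forward(self.target, x)
        p = forward(self.predictor, x)
        for row, pi, ti in zip(self.predictor, p, t):
            err = pi - ti
            for j, xj in enumerate(x):
                row[j] -= self.lr * 2.0 * err * xj


# Usage: visit one state many times, then compare it with an unvisited state.
model = RandomNetworkNovelty(n_in=3)
familiar, unseen = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
for _ in range(200):
    model.update(familiar)
print(model.novelty(familiar), model.novelty(unseen))
```

In this toy setup the bonus for the repeatedly visited state shrinks toward zero while the unvisited state keeps its bonus. Because the target is a deterministic function of the current input, the error can in principle be driven down for any input that has been seen often enough.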

Vineeth Bhaskara : all of these ideas were proposed already by Schmidhuber in 1991 (Feb and Nov). Any curiosity paper is basically taking the old and applying compute to solve harder/interesting games. Schmidhuber doesn't seem to be credited enough though in these curiosity-family papers, unfortunately.

Thomas M : Add in a function that causes the algo to seek out different types of stimuli as it gets more satisfied with a given type and you will have something a hell of a lot like an animal/human. Eventually, you get bored of the TV and want to play a game, or go for a walk, or go get some food, etc. AGI creeps ever closer.

MaGetzUb : What a creepy time to be alive indeed.

Max Xu : I simply cannot move on through my day if there is an unwatched two minute papers. -- on a serious note: we can develop a Teacher AI to pair with this curiosity-driven student AI. The teacher picks information from the internet and/or automatically generates the environment to feed to the student.

Fyloeu : Maybe incorporate a life span variable, not many algorithms' lives are eternal. They should be aware of their mortality and stop wasting time watching so much TV, muahahaha!

cmilkau : I want to step in and at least once, say thank you for taking the time and making proper audio transcriptions. This is a rare and much appreciated favour.

Luuucy : solution: factor in the repetitiveness of the data as well as its unpredictability. Something that is unpredictable but is just the same thing over and over would be less interesting than something that is unpredictable but has a lot more variety.

Itschotsch : Give such an AI access to Wikipedia or YouTube and see what articles/videos it clicks on next.

Luiz Gagliardi : Wow, this was certainly one of the most interesting episodes for me. I'm so *_curious_* to find out what's gonna happen next.

Omar Cusma Fait : _\__/_ [ • O •] (• _• ) / > > 🎁 < \

janvierbam : TIL I'm an AI

Henry Miller : Mind blowing. Every time I watch a video here. Just, wow.

Shaun Kennedy : So all we need to do to stop the robot overlords is turn the television on? Suddenly Skynet isn't so scary.

Fernando Trebien : Learning based on curiosity giving better results than that based on reward. I think the achievement here goes beyond computing.

Hououin Kyouma : Great we figured out Curiosity is needed for better AI but not for students in schools; Great Job mankind;;

Ebumbaya ' : Man... I love it when similarities between humans and AI become so apparent... Really makes you feel weird when even those things that we think define us as humans arise in AIs.

curtisw0234 : Is curiosity a good algorithm to beat a game? Or does a good game reward curiosity?

Eschelaun : I've heard of curiosity based AI and I'm glad to see it succeed on larger scale tests. Very interesting!

Kilgore Trout : Jürgen Schmidhuber introduced the concept of artificial curiosity 20 years ago.

Danish Joshi : DUDE! THIS is the most creepy paper presented on this channel till date!

HAL NineOoO : It's an amazing time to be alive because it is the last time to be alive.

Simo Vihinen : Hahahahaaaa loot boxes are saved! It's not gambling if it's not a human playing.

Shall NotWither : What a useful examination of human nature! (In the scope of education, no less.)

Jay Kalokar : For the first time, I've clicked on a video this fast. EVER.

Zbigniew Chlebicki : I wonder if you can defeat the addiction by introducing "restlessness" - curiosity about its own actions in addition to curiosity about the environment. Then it would probably have problems with puzzles that require patience, but these seem to be rare.

Lugmillord : Wow, this is scratching on some human-like traits.

Ecci Ecci : There is another kind of reward that works for every task... I am glad I have never seen it used anywhere, so when I have time I can experiment with it :) It's so simple, which is why I wonder why it's not used and other techniques like curiosity are developed instead (though curiosity could still improve my idea a lot).. You can see from how the AI behaves (repeating stupid actions) that they totally ignore this fundamental measurement or intrinsic score... Not trying to be arrogant.. But it seems so obvious to me lol

Fernando Lener : I hope AI doesn't ask the question: what happens if a human gets stabbed?

Pavement : Are these agents based on the Free Energy principle, or is it something different?

Julien Flutto : The AI are getting cleverer and cleverer... Good Video👌😍

Tymski : I'm addicted to youtube! Help!

Zijkhal : It's absolutely mindblowing to see these new inventions come out one after the other. However, I am among those who believe it is not so easy to develop an AI anywhere near human level of intelligence. After all, the first human level intelligence took hundreds of millions of years to develop after the more complex organic bodies needed to sustain a brain developed. Building a being of higher intelligence is not as easy as just throwing more of this or that at it; you'll encounter newer and newer problems to solve, each harder than the last. And each solution to those problems will result in a new limitation to the development of intelligence (like your TV addiction example). On the flip side, though, I could see that some of these issues could be circumvented by manually setting some parameters for the AI in question (like reason to live, motivation, etc), but that is where such AIs can become very dangerous. As long as they are similar to humans, they (human level AI) can be expected to have a similar spread in ethics and behaviour, but the moment we start to intervene, we could cause a lot of them to go into paperclip maximizing mode.

TheQwampa : A curious AI... Well, we are one step closer to SkyNet.

André Pfitzner : Perfectly predictable behavior. Intelligent people are not addicted to TV in general. Because intelligent people like to LEARN new things, and not just WATCH new things. Also, there are people who prefer to play videogames, and other people who prefer to watch tv. So I think that people who prefer to play videogame are not just curious, but they like to have CONTROL of what is happening on the screen. TV does not give any control for the user, TV just guides the mind of the user through a pleasing mental experience with little to learn, because TV is for the masses, and most people don't like to learn new things most of the time. If they preferred that they would be learning something new, by definition.

Promethor : I don't know much about physiology or AIs, but in humans doesn't our attention get lower over time while concentrating, and doesn't our reward system need stronger and stronger stimuli over time when repeating something? Take someone who likes to do dangerous activities because of the adrenaline rush, for example: over time, if the same action is repeated often, that person usually has to do something even more extreme to get the same rush as before. So having an AI that is curious but gets less and less interested in a specific topic over time could work? Making it seek a new kind of experience instead. I don't know, just a thought ... But after a while it should be curious again about previous subjects imo. Maybe also a kind of priority system that makes the AI remember there is something else that needs its focus...

Amipotsophspond : Hey, I have an idea, Pixel Blase: the agent would get bored of seeing the same randomly changing pixel for too long and would black it out for a time. This is done with a dropout filter layer that only drops out pixels that change more than x times over y frames; it would change that pixel to black for z frames. Maybe "change" could even be defined as falling outside a range of similar colors: in grayscale, a change would only count if it is more than + or - 10 away from the last value of the pixel in that location. I think it would slow movement through the maze, but it might push through an area with a TV on the wall. In games like bowling, I think it would behave randomly while the score pixels are blacked out. Here is some pseudocode of the idea that blacks out for only 1 frame; the duration of the blackout could be handled outside the function.

    def PixelBlase(ListOfGrayColorsShown, Range=10, AmountOfChangeNeed=5):
        # Count how often this pixel's gray value jumps by more than Range.
        NumberOfTimesChanged = 0
        LastGrayColor = 0
        for i in ListOfGrayColorsShown:
            # A change is a jump of more than Range in either direction.
            if i < LastGrayColor - Range or i > LastGrayColor + Range:
                NumberOfTimesChanged += 1
            LastGrayColor = i
        if NumberOfTimesChanged >= AmountOfChangeNeed:
            return 0  # the pixel changed too often: black it out
        else:
            return LastGrayColor

Post-scarcity: let's get there as fast as possible and end the horror of poverty. Universal basic income is the capitalist thing to do: your customers need money to buy your products. UBI is successful if they spend the money on products or services. Being a customer is a job: selecting the businesses that will get money for products and services. Like farming, this is a job that was in the past done by everyone for free. No government involvement is needed for UBI, only AI investors that invest a fund. Where does the money come from, and who is selected first for UBI? Industries that want people to buy their products contribute to these funds, people pay to belong to UBI, and a lottery brings people into the fund: buying tickets contributes to the fund, and a lottery grows a fund quickly. Clearly, competing funds would use capitalism to make the best UBI. No authority or tax is needed. The cost of products is driven down by competition, and post-scarcity is achieved. Do your part today; you know yourself, so you know the best way you can help. Maybe what you do helps, maybe it does not, but with everyone trying we will get there faster.

jamie heiney : The examples, Sonic, ping-pong, Mario, and bowling, have good point systems that we could easily use without 'curiosity'. Would it really be that hard to move a machine learning setup from one of these games to the next? So where does this curiosity AI help? Has the curiosity AI been applied to games where the point system is very different from other games, or where the point system doesn't exist?

Michael Müller : Unless I misunderstood the RND paper, all those comments about TVs saving us from Skynet are wrong, as is the title of this episode. (But still great video and channel 👍) The new Random Network Distillation is just that, immune to the noisy TV problem (at least according to OpenAI), because it uses two networks of the same size (both of which are given the current state): a random one which isn't trained at all, and another one which is trained to predict the first one's output.

shankyb0y : hooked to Infinite Jest

Bartek Juszczak : General AI? So we're already approaching a situation that is out of human control? If it is rewarded by curiosity and complex outcomes, could we not at this point give it some control over its own code / programming? In theory, if it makes changes to itself that make it better, its thoughts will become more complex, so it will keep those changes and keep pushing forward, ever improving itself. Is this it?

Will : Improvements to kick the addiction to novelty (he typed after spending far too long watching YouTube):
* A boredom modifier (could just be a random reward to change activity pattern after a time period)
* Combination with other methods, such as reinforcement scores against wider goals/values like "score maximiser" or "level completer" or "leaderboard topper" or "item collector" (i.e. give it sight of points, badges and leaderboards)
* The ability to select different games (from a list) based on the above

New Boss Media : In reference to this bowling game, I saw a video where an AI agent trained on other games was put into a hockey game and it performed reasonably well .. but only when playing on the left side of the field, it definitely didn't like going "back" from right to left :D