[MUSIC PLAYING] SHEILA NIRENBERG: There's been a lot of talk in the news, and we see it in movies, too, about having robots that are smart. You know, like robots that can work with us, be home health aides, drive our cars-- things like that. But we're still not there yet.
So why is that? One of the reasons is that a lot of the field is moving in a particular direction toward building big data for robots. The idea is to build large databases, either on the robot or in the cloud, to give the robot access to more and more information, more and more facts. But is this going to take us where we need to go? Will this actually make robots smart?
So this kind of thing is happening everywhere in the field, from building up small robots like Watson and Siri to building up big ones like self-driving cars. So let's take self-driving cars as an example. So as you can imagine, building a self-driving car is a seriously hard problem. The car has to be able to know where the roads are, stay in the lanes, not collide with anything, recognize street signs.
So the way the car companies approach this, and Google is probably the best-known example, is by gathering a huge amount of data. So they chose a town-- Mountain View, California, is their town-- and they mapped it. And they did this in enormous detail. So not just the roads, but also the lanes in the road, the street signs, the sidewalks. Even the heights of the sidewalks, down to the inch.
So what happens is the car does very well in Mountain View, California, but take it someplace else, like, say, New York or Detroit, and it starts to falter. Since it's working by drawing information from a database rather than thinking on the fly, it gets stuck. It can't handle new places or weather or construction changes. If its database doesn't have the information that it needs, it's in a bind.
OK, so one way to fix the problem would be, of course, to just keep gathering more data. Map more and more places, do it at different times of day, take into account construction changes, weather, to try to cover all the bases. But maybe there's another way, and that is to make the software that's operating the robot smart. So not just knowledgeable, but also smart.
So what I'm going to tell you about today is an approach for doing that. And I just want to say it's just the tip of the iceberg, and we've only really been doing it in the field of vision so far, but I want to show you how it works and its capacity to change the way we think about building smart robots.
OK, so before we do that, I just wanted to go through what it is we want it to be able to do. If we're making a self-driving car, or really any robot with autonomous mobility, we want it to be able to navigate and maneuver through its environment, even if it's never been there before. This is something that we as humans can easily do.
So just as an example, let's say you fly to another city-- a city you've never been to. You get off the plane, and you start walking through the airport. So this means you have to maneuver through a crowd, weave your way around people and obstacles.
But this is no big deal. You can easily do it. You don't need a map of the airport or any prior knowledge of the things in it. You just figure out which way to go, which way to turn, on the fly. So along the same lines, you can drive in that city even though you've never been there before. And you can do it whether it's sunny or rainy, day or night. Your brain just figures out how to move through space, how to move through traffic, while it's doing it.
And the last one is if, along the way, you happen to see somebody that you've encountered before, you'll recognize him whether you want to or not. Your brain will just go, hey, that's Joe, even though you've never seen Joe in this city before, in this lighting, wearing that outfit. Anyway, you get the idea. Our brains are very amazing things, even at this simple level.
So what's our approach to making robots smart, where they can do things like this? Our approach is to use something called neural codes, something we have that robots don't. And it's a very useful thing.
So let me show you what I mean by neural code, and let's start with vision first. OK, so when you look at something like this image here, it goes into your eye and it gets converted into a code. We call it a neural code because the neural circuitry in your eye does the encoding. OK, so images go in, pass through the neural circuitry, get converted into a code-- I mean, they actually come out as a code-- and then the code gets sent to the brain.
Same thing happens when you hear something. When you hear something, sounds go into your ear, get converted into a code, and the code gets sent to the brain. Same thing happens with touch.
So in all these cases, information from the outside world passes through this intermediary step before it goes to the brain. So what does this stuff do? What it does is it simplifies the information so it's easier for the brain to use. Basically, the world out there is very complicated. It contains an enormous amount of information, an enormous amount of data-- too much, really, for the brain to handle. So we evolved this intermediary step to help.
What it does is it reduces the dimensionality of the information. That is, it throws away the parts you don't need, holds onto the parts you do need, and then sends this streamlined version on to the brain. This streamlined version is very useful. It's what allows our brain to solve problems in fast and effective ways-- smart ways.
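To make that front-end idea concrete, here is a minimal sketch in Python of a retina-like encoder. It is only an illustration under simple assumptions (coarse patch averaging with a crude center-surround contrast step as the dimensionality reducer), not the actual neural code described in the talk:

```python
# Minimal sketch of a retina-like "encoder front end" (illustration only,
# not the neural code from the talk). It turns a high-dimensional image
# into a much smaller code by keeping coarse contrast features and
# discarding raw pixel detail.
import numpy as np

def encode_image(image, patch=8):
    """Reduce a grayscale image to a coarse grid of contrast values."""
    h, w = image.shape
    h, w = h - h % patch, w - w % patch            # crop to a multiple of the patch size
    img = image[:h, :w].astype(float)
    # Average each patch (a crude "center") ...
    centers = img.reshape(h // patch, patch, w // patch, patch).mean(axis=(1, 3))
    # ... and subtract the global mean (a crude "surround").
    code = centers - img.mean()
    return code.ravel()

frame = np.random.rand(480, 640)                   # stand-in for one camera frame
print(frame.size, "->", encode_image(frame).size)  # 307200 -> 4800
```

The downstream system then works with a few thousand numbers per frame instead of hundreds of thousands of raw pixels.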
OK, so I was thinking about this, and it sort of raises the question, well, what if we had access to these codes, these neural codes? What if we could put them into robots? Would it make their brain solve problems in faster, smarter ways, too? What I mean is would it allow them to solve problems the way we do?
OK, so I cracked one of these codes, the front-end code for vision. And I can tell you more about that later if you're interested. And I put it into some robots, and I set out to test the idea.
So let me show you. So we started with the problem of navigation-- the problem that I was mentioning before of navigating and maneuvering through your environment. And for the first test, we used virtual reality.
So we started with a virtual robot in a virtual environment, and we used a rural one with trees, rocks, wood crates, et cetera. Something that looks like this. And the question we asked was, can we train a robot to navigate through this environment? So picture it going into this screen, weaving its way around trees and rocks, trying to find its way through.
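As a rough picture of how the comparison is set up, the sketch below (a hypothetical structure, not the actual training code from the talk, and reusing the encode_image function from the earlier sketch) gives both robots the same tiny steering policy. The only difference is whether camera frames reach that policy raw or through the encoder front end:

```python
# Hypothetical setup: identical policy learner for both robots; only the
# input representation differs. The linear "policy" and its zero weights
# are placeholders; training is omitted.
import numpy as np

def steering_policy(features, weights):
    """Tiny stand-in policy: a linear readout producing a steering angle."""
    return float(np.tanh(features @ weights))

def make_robot(front_end, feature_dim):
    weights = np.zeros(feature_dim)                # would be learned during training
    def act(frame):
        return steering_policy(front_end(frame), weights)
    return act

def raw_pixels(frame):
    """Standard robot: the policy sees the raw pixels, flattened."""
    return frame.ravel()

standard_robot = make_robot(raw_pixels, 480 * 640)
coded_robot = make_robot(encode_image, 60 * 80)    # encoder from the earlier sketch

frame = np.random.rand(480, 640)                   # one simulated camera frame
print(standard_robot(frame), coded_robot(frame))   # each returns a steering command
```

The point of this structure is that any difference in behavior between the two robots comes from the input representation, not from the learner.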
So what happens? So first, let me show you what happens with a standard robot just so we have a baseline. So this is a top-down view, looking at many runs. Each white line is the trajectory of a different run. And you can see that the standard robot can actually do this. It gets it wrong occasionally, but for the most part, it's not too bad.
OK, so what about our robot, the one that uses the neural code? It can do it, too. So at first glance, this doesn't seem so interesting. They can both do it. But what is interesting is what happens behind the scenes-- how they do it. The one that uses the neural code solved the problem in a fundamentally different way.
What it learned generalized. That is, the features that it pulled out from this environment weren't just useful for this environment, but for other environments, too. So let me show you.
OK, so these are just three examples. So the top one is just a different rural environment. And so same basic things are in it-- trees, rocks. It's just laid out in a different way. And you can see that the standard robot is starting to have trouble. What it learned in that first environment is not turning out to be so useful here, so it doesn't know what to do exactly, and it starts to veer out.
But the robot using the neural code is still fine. What it learned in that first environment is turning out to generalize to this environment.
The next one is the suburbs, and it has very different objects in it. It's got cars, sidewalks, white picket fences. And so now the standard robot is really confused, and its trajectories are disorganized, and there are lots of collisions.
But the robot that uses the neural code is still fine. What it learned in that first environment generalized to this one, too.
The last one is a playground with a slightly apocalyptic-looking sky. We're not that good at drawing in virtual reality. But you can see it's got a tire obstacle course, and those black things are tires.
And you can see now that the standard robot is pretty much lost. It's supposed to be avoiding hitting the tires, and instead, it's going right at them, like head-on collisions. But the robot that used the neural code is still fine.
So we were pretty excited, you know, that the robot that used the neural code seemed to be solving the problem kind of like we do. That is, it was pulling out features from that first environment that were generally useful-- that would allow it to solve problems in any environment.
So this is kind of like what happens to us. As kids, we learn implicitly from running around and interacting. We learn the general rules of how to maneuver around things, and these rules stay with us for life. So it seemed like the robot that was using the neural code was pulling out those same kinds of rules.
OK, so we were excited but cautious, so we decided to push it further and make the task harder. So what we did was we moved the position of the sun, creating different times of day. This creates all sorts of shadows, which are really a big problem right now for machine vision, but not a problem for us as humans. So if the robot were solving the problem like we do, it should be able to handle these environments.
And it does. It actually sails through. So the robot that was using the neural code performs close to 100% in all these environments and in all these lighting conditions.
But the real test is what happens in the real world? What happens when we take the robot out of virtual reality and put it into the real world? So we put exactly the same software into a mobile robot-- a real, physical robot-- and we set it free in our lab. So this is the robot, and that's the lab with lab stuff-- wastepaper baskets, chairs, sweatshirts, and general electronics.
And remember, it's been trained only on that virtual reality rural environment. It's never seen a lab. It's never seen anything beyond trees and rocks.
What happens? The standard robot gets stuck, crashes into chairs, but the robot that uses the neural code steers clear of them just fine, and it does it over and over. It seems to fundamentally understand its environment and how to move through it wherever it goes.
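To illustrate the "exactly the same software" point, here is a hedged sketch of what moving from the simulator to a physical robot could look like. It assumes an OpenCV camera feed at 640x480 and reuses the coded_robot pipeline from the sketch above; neither is the actual system from the talk. Only the source of the frames changes:

```python
# Hypothetical sim-to-real sketch: swap the virtual renderer for a real
# camera; the encoder and policy stay untouched. Assumes OpenCV and a
# 640x480 camera, and reuses coded_robot defined earlier.
import cv2

cap = cv2.VideoCapture(0)                          # the lab camera replaces the renderer
for _ in range(1000):                              # run for a bounded number of frames
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) / 255.0
    steering = coded_robot(gray)                   # same encoder + policy as in VR
    # send `steering` to the motor controller here (hardware-specific, omitted)
cap.release()
```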
So now what I want to do is show you some quick video clips just because seeing it in video is worth a thousand words, and you can get a feel for what it does. OK, so the first one-- this is just showing you what these virtual reality environments look like. And this is just, like, one of our test cases.
So it sees the tree and it realizes that it's going to hit it, so it starts to go around it. Notice that it's not, like, in any way affected by the shadows. And you can see that it's, like, a pretty complicated environment. If you're trying to ride your bike through this, you would just get off your bike first and look, actually, but it's not allowed to stop.
So it sees an open spot while it's thinking, and it looks around to make sure that that was a good idea. And it's like, OK, that was a good idea. We'll go there. OK, and I have millions of these if you want to see more later.
OK, this is just another one, just showing how different it is as it's heading toward that crate that looks like it's made of Popsicle sticks, and goes over the shadows. Gets itself into a tight spot and gets itself right out of it.
This is the robot in the lab. The top one is showing the point of view from the robot, and the bottom shows the robot as it's thinking. So it's about 2 feet high, so the point of view is kind of low. Remember, there are no sensors here. There's nothing fancy. It's just the camera and the thinking.
It sees that leg on the chair, and it will move out of the way. And then some guy comes along and it sees him. Our robot's name is actually Amelia, so sometimes I say "she" instead of "it." But don't be confused-- it's an actual robot and not a person.
So it tried to go through that guy's legs because unlike a teenage girl, it thinks it's actually thinner than it is. And so we spent a lot of time in the lab jumping in front of it and seeing whether she'll maneuver around us, and she does. She just doesn't hit us.
Now it's in a car. So it generalized not just into our lab, but into an actual car at high speeds. But it's the neural code that's driving it. The neural code allows us to get into a car and handle high speeds, and it allows the robot to be able to do it, too.
And again, lots of examples. I'm not going to go through them, but I just wanted to show you a screenshot of each of these. So it can handle bad weather, it can handle sun in its eyes. It can handle all sorts of complicated shadows.
One key fact about it is because it's the neural code of the eye, it's not just for navigation, but it's for everything visual that we do. So we've started to expand it to be able to do face recognition and pedestrian detection, object detection, motion detection. It could be used in Las Vegas to see whether somebody is cheating, whether you could get a sense of a tell, or for kids with Asperger's to be able to recognize emotions.
So here, we trained it to only recognize pedestrians in danger so it doesn't brake all the time. And so as you get closer, it starts to pull up all the pedestrians. And then here's watching TV. So this is the show "The Wire," and we just trained it to recognize these four characters. So when it sees that one on the left, McNulty, it'll give you a green square, and the other guy will be blue, et cetera. And we only trained it on the large faces.
But the key thing about it is that it trains so quickly. It learns almost as fast as we do, because think about when you're watching a TV show: you see characters you've never seen before, then it cuts to another shot, and when it comes back, you can recognize those characters. And the robot can do the same. We really have only started doing this.
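Here is a minimal sketch of that fast, few-example learning idea: store the codes of a handful of labeled face crops and label new crops by their nearest stored code, so recognizing a new character needs no lengthy retraining. Everything in it is a stand-in (random arrays instead of real face crops, placeholder character names apart from McNulty, and a generic patch-average encoder):

```python
# Hypothetical few-example recognizer: nearest neighbor over encoded face
# crops. The crops, encoder, and most names are placeholders.
import numpy as np

rng = np.random.default_rng(0)

def encode_face(crop):
    """Stand-in encoder: coarse 8x8 patch averages of a 64x64 face crop."""
    return crop.reshape(8, 8, 8, 8).mean(axis=(1, 3)).ravel()

# "Training": a few labeled crops per character, encoded and stored.
characters = ["McNulty", "character_2", "character_3", "character_4"]
gallery = {name: [encode_face(rng.random((64, 64))) for _ in range(3)]
           for name in characters}

def recognize(crop):
    """Label a new crop by its nearest stored code."""
    code = encode_face(crop)
    return min(gallery,
               key=lambda name: min(np.linalg.norm(code - g) for g in gallery[name]))

print(recognize(rng.random((64, 64))))             # prints one of the four names
```

Adding a new character is just a matter of adding a few more encoded crops to the gallery.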
OK, so our goal was to try to make smart robots so we can have robot companions, so we can have self-driving cars, somebody to watch TV with. And so people know how to make robots knowledgeable, but smart is a different thing. Smart is being able to go into a completely new situation and figure things out on the fly. And I think with the neural codes, it opens the door to really being able to do this. And I've only just shown you the tip of the iceberg. Thank you.
[APPLAUSE]
Sheila Nirenberg, professor of neuroscience at Weill Cornell Medical College and founder of two startup companies – Bionic Sight LLC and Nirenberg Neuroscience LLC – explains the science behind a new kind of smart robot that she’s creating, drawing on the basic science of visual processing.