Trucks with Intuition
Perceptive Automata partners with Volvo Trucks to demonstrate AI technology that can determine when pedestrians will cross the road.
The automotive and commercial vehicle industries are working on answers to a tremendous number of questions no one bothered to ask before. Volvo, for example, wants to know if vehicles can have human intuition.
It’s not as absurd a concept as it might seem, if you limit the definition of intuition to the specific question of whether a pedestrian is about to cross the street in front of an autonomous vehicle (AV). A new project by Volvo Trucks and Perceptive Automata, an artificial intelligence (AI) start-up based in Boston and Silicon Valley, is using the fact that people don’t really hide this information to build better AVs. The challenge is getting machines to recognize the all-too-human clues.
Perceptive Automata trained its AI using around a hundred hours of video footage taken by Dependable Highway Express (DHE), another partner in the test project, from two Volvo VNR 300 regional-haul trucks that drove on public roads. Actual humans then told the system what they thought the people in the video were about to do, which gave the AI clues about how to extract from the videos the subtle cues and signals that most AVs do not take into account: eye contact, body language, arm movements and even the type of person. A police officer standing in the road directing traffic, for example, is less likely to move in front of the AV than an average person in the same spot, said Sid Misra, Perceptive Automata’s CEO and co-founder.
“You can hack your way to this problem,” Misra said. “We’re trying to solve it in a very generalized way.”
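The labeling step described above can be pictured as collapsing several human judgments about one pedestrian into a single training target. What follows is a minimal sketch under stated assumptions, not Perceptive Automata’s actual pipeline: the function name `soft_label`, the 0-to-1 rating scale and the sample ratings are all hypothetical.

```python
from statistics import mean, pstdev


def soft_label(ratings):
    """Collapse human ratings (0.0 = will not cross, 1.0 = will cross)
    into a target score plus a measure of how much the raters disagreed.

    The rating scale is an assumption made for illustration only.
    """
    if not ratings:
        raise ValueError("need at least one rating")
    if any(r < 0.0 or r > 1.0 for r in ratings):
        raise ValueError("ratings must be between 0.0 and 1.0")
    return {"target": mean(ratings), "disagreement": pstdev(ratings)}


# Hypothetical ratings from five people who watched the same clip.
label = soft_label([1.0, 1.0, 0.5, 1.0, 0.5])
print(label)
```

Keeping the disagreement alongside the average is one possible design choice: it would let a model treat an ambiguous pedestrian differently from one whose intent every viewer read the same way.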
Deciphering complex cues
Other AV companies use pose estimation, Misra said, which breaks the body into skeletal sections that an AI then learns to analyze to predict actions. “We don’t think that’s a great approach because it misses a lot of information that’s apparent but not in the pose,” he said. “The way they’re carrying objects or pushing a cart or stroller. The eye contact part is not always captured. And all of these things matter.”
Scientists tend to want to simplify a problem in order to solve it, said James Gowers, Perceptive Automata’s vice president of strategy and business development. This has led other AV companies to build stick models of humans that can offer some sort of predictive analysis about when a pedestrian wants to cross the road.
“The problem is, in simplification, you lose what you need: the complex information that humans use,” he said. “It’s beyond just arm and leg movement. It’s subtle things. Are they leaning back? What way are they facing? Are they on their back heel? There are hundreds of criteria like that. Traditional approaches that focus on simplification don’t work because it’s so much more complex.”
Aside from pose estimation, AVs can also use context information to predict behavior, something Misra said was necessary for an AV. But not all context-based predictions are the same. When Misra rode in an Aptiv AV in Las Vegas, he learned that the company’s vehicles treat pedestrians differently depending on whether they are in a crosswalk.
“When they’re outside, the car would think of them as objects and when they’re in the crosswalk, it would think of them as pedestrians that they need to think of as pedestrians,” he said. “You can do that in a place like the Vegas Strip, where it’s safe. But you can’t always do that in Boston or in New York City or in downtown San Francisco.”
This is why an eventual full-fledged AV from Perceptive Automata might operate differently based on where the vehicle is, said Peter Valle, the company’s vice president of engineering.
“The road configuration in Southern California is very different, pedestrian behavior is very different,” he said. “The expectation of a driver in Boston is very different than the expectation of a driver in LA. The roads are much wider, people aren’t as likely to scamper across the street unexpectedly, but there are other things that need to be accounted for.”
Misra said that an AI that learns about people in different locations will be better than one that doesn’t because it adds together two pieces of human behavior: a common piece and a culturally contingent piece.
“Let’s say we get data from Southern California,” he said. “It improves the Boston model as well because it captures some of the common piece we missed there. We know there are actual benefits for including data from different geographies.”
Any reliable AV can identify pedestrians or bicyclists or other objects it needs to be aware of. In demonstration screens that companies use to show the public what the AV “sees,” these are often outlined with boxes that track the object’s movements. When Perceptive Automata shows off its system, the boxes are eliminated in favor of little “thought bubbles” that float above a pedestrian’s head and indicate two items: intention and awareness.
Intention is the system’s guess about whether the person (or, in the future, the bicyclist or other car) is about to move into the AV’s lane of traffic. Awareness is a separate indicator that estimates whether the person knows that the AV is present.
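One way to picture the two indicators is as a pair of scores attached to each tracked person. The sketch below is illustrative only and assumes details the company has not described here: the 0-to-1 scale, the names `PedestrianEstimate`, `needs_extra_caution` and `thought_bubble`, and the 0.5 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PedestrianEstimate:
    # Both scores are assumed to run from 0.0 to 1.0 for this sketch.
    intention: float  # how likely the person is to move into the vehicle's lane
    awareness: float  # how likely the person is to have noticed the vehicle

    def __post_init__(self):
        for name in ("intention", "awareness"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")


def needs_extra_caution(est, threshold=0.5):
    """Hypothetical rule: flag a person who seems likely to cross
    but does not seem to have noticed the vehicle."""
    return est.intention >= threshold and est.awareness < threshold


def thought_bubble(est):
    """Format the two indicators as a short on-screen caption."""
    return f"intention {est.intention:.0%} | awareness {est.awareness:.0%}"


distracted = PedestrianEstimate(intention=0.8, awareness=0.2)
print(thought_bubble(distracted), needs_extra_caution(distracted))
```

Keeping the two scores separate, rather than merging them into one risk number, mirrors the distinction drawn above: a person who intends to cross and has seen the vehicle is a different situation from one who intends to cross and has not.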
Room to improve
Currently, Perceptive Automata’s AI relies only on cameras because they are cheap, ubiquitous and rich in information, Misra said. Its engineers are investigating using lidar and thermal cameras for different conditions. Thermal cameras could be especially useful because of what they can reveal in terms of body language.
“For us, it’s early days,” Misra said. “We’re investigating these sensors because ultimately the full self-driving solution, whenever that goes at scale, will have a sensor fusion profile.”
Perceptive Automata’s technology is open enough to use information from all of those sensors or from V2I signals (from smart traffic lights, for example), but the technology is being built to not need any external help, Misra said, adding that the company’s software has plenty of room to grow.
“Compute-wise, our approach is very lightweight,” he said. “There’s a lot of optimization that we’ve still left on the table that we would use if we’re going to a really constrained environment, like getting it on the [Toyota] Camry for 2022, or something like that.”
Volvo eyes ‘human intent’ tech for varied AVs
Perceptive Automata is only one of the partners that Volvo Trucks is working with as it develops self-driving technology. From platooning semi-trucks to fully autonomous mining vehicles to the Vera cab-less tractor, Volvo is approaching autonomy from a number of different angles. Aside from the autonomous vehicle (AV) tech itself, the work Volvo is doing with Perceptive Automata and DHE is about getting to know what its trucking customers actually want in a self-driving vehicle, said Aravind Kailas, research and innovation manager for Volvo Group.
Kailas said that Volvo realized it was absolutely critical to have the perception technology as a component of AVs, but exactly how it might reach the market down the road is uncertain at this point. Volvo wants to bring Perceptive Automata’s technology into some of its commercial pilots, but “from there, where it goes is anybody’s guess just because we’re going to obviously be looking at some very practical complex problems that we want this technology to solve,” he said.
“As the word is spreading internally through the Volvo Group [about Perceptive Automata], people are realizing that it’s not just trucks but even autonomous buses, for example, or construction equipment operating in urban environments,” he said. “There are a lot of regions where our products will be in areas which will have pedestrians, which will have cyclists, which will have people walking pets and so on. I think what has come out of this project is this change of mindset about human intent to automated vehicles.”