SAE WCX 2022: Vehicles Are the Gateway to Data, Data, and More Data

In the right hands, meaningful data can lead to safer vehicles, concluded an SAE expert panel.

According to SAE WCX 2022 panelist Kevin Gay, head of safety for autonomous mobility and delivery at Uber, “You can have too much data in terms of nominal use cases.” (Uber)

The mobility ecosystem is bursting with data from passenger vehicles, buses, ride-share vehicles, delivery vans, and other transportation modes that use driver-assist and connected systems. “These much smarter systems are becoming more intelligent in producing data,” Rajeev Chhajer, principal research engineer for Connected Technologies at Honda Development & Manufacturing America, noted during the April 5 “Data - Navigating Complex and Conflicting Forces” panel discussion at WCX 2022.

While data mined from vehicles is considered a feedstock for creating safer mobility, what matters most is getting the right data into the right hands at the right time. “When you look at the autonomous research that’s been going on for years, petabytes of information have been captured from test fleets,” said Dean Phillips, worldwide technical leader for automotive at Amazon Web Services (AWS).

For Dr. Stephen Ridella, Director of the Office of Defects Investigation at the National Highway Traffic Safety Administration, data can help pinpoint causes of vehicular crashes. As one example, NHTSA began requesting accident-related data in mid-2021 from approximately 100 companies involved with advanced driver assist systems (ADAS) via production vehicle applications, ride sharing and road testing. “We want the information in a timely fashion, so we can analyze it and understand what might be happening with the vehicles,” Ridella said.

Panelists discuss the importance of vehicle data at the SAE WCX Leadership Summit program on April 5. From left: Panel moderator Jack Weast, Honda’s Rajeev Chhajer, Uber’s Kevin Gay, AWS’ Dean Phillips, and NHTSA’s Dr. Stephen Ridella. (Adam Isovitsch/SAE)

ADAS-equipped vehicles produce enormous amounts of data, not all of which is needed or wanted. “I want to make sure that I have the right information. I don’t want data that we can’t use,” Ridella said. Honda’s Chhajer agreed that having a mountain of data doesn’t always bring about meaningful insights. “As an organization, we have to be very intentional about the use cases,” he said. Fellow panelist Kevin Gay, head of safety for autonomous mobility and delivery at Uber, added: “You can have too much data in terms of nominal use cases.”

From a cost perspective, information isn’t free. “By no means is data collection or data management an inexpensive endeavor,” said Honda’s Chhajer. In his opinion, research teams seeking data should consider the data’s expected value versus actual costs, including the cost of processing that data. Said Uber’s Gay, “I certainly think it’s more efficient if there’s an overarching framework for how data is shared.”

The AWS Data Exchange cloud service could be a model of data sharing for OEMs and NHTSA, Phillips offered. For instance, a data-exchange subscriber could download data sets and, depending on how the data is shared, have the information refreshed at a set interval, such as monthly, daily or hourly.

“This kind of capability would be useful if OEMs create data that needs to be reported [for regulatory purposes]. Or if data is in the hands of NHTSA and others want or need to subscribe to that, you can create an easy model to do the sharing,” Phillips said. Money can be part of the exchange. “If I build an application in my account that leverages data from the data exchange service, I pay for it,” he observed.

Data means different things to different users. “We need to have frameworks for how we define our data because that directly results in the decisions we make and the actions we take,” Honda’s Chhajer asserted. For instance, certain data sets could be monetized. Other data sets could be used for government regulation and compliance purposes, while select data sets could be tagged for R&D purposes.

The overarching principle is for users to know why data is being collected, how it is being used, and how it is being stored or discarded. “It’s really important to think about how data will be used,” Chhajer said.