Mobileye’s Self-Driving Secret? 200PB of Data
January 10, 2022 | Intel
Mobileye is sitting on a virtual treasure trove of driving data – some 200 petabytes’ worth. When combined with Mobileye’s state-of-the-art computer vision technology and extremely capable natural language understanding (NLU) models, the dataset can be queried to return thousands of results within seconds, even for incidents that fall into the “long tail” of rare conditions and scenarios. This helps the autonomous vehicle (AV) and its computer vision system handle edge cases and thereby achieve the very high mean time between failures (MTBF) targeted for self-driving vehicles.
“Data and the infrastructure in place to harness it is the hidden complexity of autonomous driving. Mobileye has spent 25 years collecting and analyzing what we believe to be the industry’s leading database of real-world and simulated driving experience, setting Mobileye apart by enabling highly capable AV solutions that meet the high bar for mean time between failure,” said Prof. Amnon Shashua, Mobileye president and chief executive officer.
Mobileye’s database – believed to be the world’s largest automotive dataset – comprises more than 200 petabytes of driving footage, equivalent to 16 million 1-minute driving clips from 25 years of real-world driving. Those 200 petabytes are split between Amazon Web Services (AWS) and on-premises systems. The sheer size of the dataset makes Mobileye one of AWS’s largest customers globally by volume stored.
Large-scale data labeling is at the heart of building the powerful computer vision engines needed for autonomous driving. Mobileye’s rich and relevant dataset is annotated both automatically and manually, the latter by a team of more than 2,500 specialized annotators. The compute engine relies on a peak of 500,000 CPU cores in the AWS cloud to crunch 50 million datasets monthly – the equivalent of 100 petabytes processed every month, corresponding to 500,000 hours of driving.
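To put those figures in perspective, the short Python sketch below works out the rough averages they imply. It is a back-of-envelope illustration only: it assumes decimal units (1 PB = 10^15 bytes), treats each quoted number as exact, and uses a hypothetical helper, `average_gb`, that is not part of any Mobileye tooling.

```python
# Back-of-envelope averages implied by the figures quoted in the article.
# Assumption: decimal units (1 PB = 10**15 bytes, 1 GB = 10**9 bytes).
PB = 10**15
GB = 10**9


def average_gb(total_pb: float, count: float) -> float:
    """Average size in GB when total_pb petabytes are spread over count items."""
    return total_pb * PB / count / GB


# 200 PB archive described as 16 million 1-minute clips.
gb_per_clip = average_gb(200, 16_000_000)

# 100 PB processed per month, described as 500,000 hours of driving.
gb_per_hour = average_gb(100, 500_000)

# 100 PB processed per month across 50 million datasets.
gb_per_dataset = average_gb(100, 50_000_000)

print(f"{gb_per_clip:.1f} GB per 1-minute clip")
print(f"{gb_per_hour:.1f} GB per hour of driving processed")
print(f"{gb_per_dataset:.1f} GB per dataset processed")
```

Under these assumptions the quoted totals work out to roughly 12.5 GB per 1-minute clip, 200 GB per hour of driving processed and 2 GB per dataset. These are implied averages, not figures reported by Mobileye.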
Data is only valuable if you can make sense of it and put it to use. This requires deep comprehension of natural language along with state-of-the-art computer vision, Mobileye’s long-standing strength.
Every AV player faces the “long tail” problem, in which a self-driving vehicle encounters something it has not seen or experienced before. The long tail spans large datasets, but many players do not have the tools to make sense of it effectively. Mobileye’s state-of-the-art computer vision technology, combined with extremely capable NLU models, enables Mobileye to query the dataset and return thousands of long-tail results within seconds. Mobileye can then use these results to train its computer vision system and make it even more capable. This approach dramatically accelerates the development cycle.
Mobileye’s team uses an in-house search engine database with millions of images, video clips and scenarios. They include anything from “tractor covered in snow” to “traffic light in low sun,” all collected by Mobileye and feeding its algorithms. (See sample images).
With access to the industry’s highest-quality data and the talent required to put it to use, Mobileye’s driving policy can make sound, informed decisions deterministically, an approach that removes the uncertainty of artificial intelligence-based decisions and yields a statistically high mean time between failure rate. At the same time, the dataset hastens the development cycle to bring the lifesaving promise of AV technology to reality more quickly.