Object Recognition for Robots
July 27, 2015 | MIT
Estimated reading time: 3 minutes
John Leonard’s group in the MIT Department of Mechanical Engineering specializes in SLAM, or simultaneous localization and mapping, the technique whereby mobile autonomous robots map their environments and determine their locations.
Last week, at the Robotics Science and Systems conference, members of Leonard’s group presented a new paper demonstrating how SLAM can be used to improve object-recognition systems, which will be a vital component of future robots that have to manipulate the objects around them in arbitrary ways.
The system uses SLAM information to augment existing object-recognition algorithms. Its performance should thus continue to improve as computer-vision researchers develop better recognition software, and roboticists develop better SLAM software.
“Considering object recognition as a black box, and considering SLAM as a black box, how do you integrate them in a nice manner?” asks Sudeep Pillai, a graduate student in computer science and engineering and first author on the new paper. “How do you incorporate probabilities from each viewpoint over time? That’s really what we wanted to achieve.”
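The kind of fusion Pillai describes can be sketched in a few lines. The example below is a minimal illustration, not the paper's actual method: it treats each viewpoint's classifier output as independent evidence (a naive-Bayes assumption) and combines the per-class probabilities in log space. All function and variable names are hypothetical.

```python
import numpy as np

def fuse_view_probabilities(per_view_probs):
    """Fuse per-viewpoint class distributions for one object hypothesis.

    Treats each view's classifier output as independent evidence and
    combines it in log space to avoid numerical underflow.

    per_view_probs: (num_views, num_classes) array; each row sums to 1.
    Returns a fused (num_classes,) distribution.
    """
    log_probs = np.log(np.clip(per_view_probs, 1e-9, 1.0))
    fused = log_probs.sum(axis=0)   # accumulate evidence across views
    fused -= fused.max()            # stabilize before exponentiating
    fused = np.exp(fused)
    return fused / fused.sum()      # renormalize to a distribution

# Three views of the same object: each is ambiguous on its own,
# but the fused estimate clearly favors class 0 (~[0.78, 0.22]).
views = np.array([[0.6, 0.4], [0.5, 0.5], [0.7, 0.3]])
print(fuse_view_probabilities(views))
```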
Despite working with existing SLAM and object-recognition algorithms, however, and despite using only the output of an ordinary video camera, the system’s performance is already comparable to that of special-purpose robotic object-recognition systems that factor in depth measurements as well as visual information.
And of course, because the system can fuse information captured from different camera angles, it fares much better than object-recognition systems trying to identify objects in still images.
Drawing boundaries
Before hazarding a guess about which objects an image contains, Pillai says, newer object-recognition systems first try to identify the boundaries between objects. On the basis of a preliminary analysis of color transitions, they’ll divide an image into rectangular regions that probably contain objects of some sort. Then they’ll run a recognition algorithm on just the pixels inside each rectangle.
To get a good result, a classical object-recognition system may have to redraw those rectangles thousands of times. From some perspectives, for instance, two objects standing next to each other might look like one, particularly if they’re similarly colored. The system would have to test the hypothesis that lumps them together, as well as hypotheses that treat them as separate.
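As a rough illustration of that propose-then-classify pipeline, the sketch below loops a single-image classifier over candidate rectangles. Here `propose_regions` and `classify_crop` are hypothetical stand-ins for a real proposal method (such as the color-based analysis described above) and a trained recognizer.

```python
def recognize_objects(image, propose_regions, classify_crop, threshold=0.5):
    """Run a recognizer on each candidate rectangle; keep confident hits.

    image: array indexed as image[y, x]. propose_regions yields
    (x, y, w, h) boxes; classify_crop returns a (label, score) pair.
    """
    detections = []
    for (x, y, w, h) in propose_regions(image):   # often thousands of boxes
        crop = image[y:y + h, x:x + w]
        label, score = classify_crop(crop)        # classify only these pixels
        if score >= threshold:
            detections.append(((x, y, w, h), label, score))
    return detections
```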
Because a SLAM map is three-dimensional, however, it does a better job of distinguishing objects that are near each other than single-perspective analysis can. The system devised by Pillai and Leonard, a professor of mechanical and ocean engineering, uses the SLAM map to guide the segmentation of images captured by its camera before feeding them to the object-recognition algorithm. It thus wastes less time on spurious hypotheses.
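One plausible way to use the 3-D map this way, sketched below under assumed conventions rather than as the authors' implementation, is to project each tentative 3-D object cluster into the current camera view and take its 2-D bounding box as a region proposal, replacing thousands of color-based guesses with a handful of geometry-backed regions.

```python
import numpy as np

def slam_guided_proposals(clusters_3d, K, R, t, image_shape):
    """Project 3-D point clusters from a SLAM map into the camera view.

    clusters_3d: list of (N_i, 3) arrays of world-frame points, one per
        tentative object. K: 3x3 intrinsics. R, t: world-to-camera pose.
    Returns one (x, y, w, h) bounding box per visible cluster.
    """
    h, w = image_shape
    boxes = []
    for pts in clusters_3d:
        cam = R @ pts.T + t.reshape(3, 1)   # world -> camera frame
        cam = cam[:, cam[2] > 0]            # keep points in front of camera
        if cam.shape[1] == 0:
            continue
        proj = K @ cam                      # perspective projection
        u = np.clip(proj[0] / proj[2], 0, w - 1)
        v = np.clip(proj[1] / proj[2], 0, h - 1)
        boxes.append((int(u.min()), int(v.min()),
                      int(u.max() - u.min()), int(v.max() - v.min())))
    return boxes
```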
More important, the SLAM data let the system correlate the segmentation of images captured from different perspectives. Analyzing image segments that likely depict the same objects from different angles improves the system’s performance.
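A simple, hypothetical way to correlate segments across views is to key them on shared SLAM landmarks: if two segments from different images observe the same mapped 3-D points, they likely depict the same object. A real system would merge these links transitively and weigh them probabilistically; the sketch below only collects the raw pairings.

```python
from collections import defaultdict

def associate_segments(segments_by_view):
    """Group image segments that observe the same SLAM landmarks.

    segments_by_view: list (one entry per view) of lists of
        (segment_id, set_of_landmark_ids) pairs.
    Returns {landmark_id: [(view_idx, segment_id), ...]} for every
    landmark seen in more than one segment.
    """
    groups = defaultdict(list)
    for view_idx, segments in enumerate(segments_by_view):
        for seg_id, landmarks in segments:
            for lm in landmarks:
                groups[lm].append((view_idx, seg_id))
    # Any landmark seen by multiple segments links those segments.
    return {lm: segs for lm, segs in groups.items() if len(segs) > 1}
```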
Picture perfect
Using machine learning, other researchers have built object-recognition systems that act directly on detailed 3-D SLAM maps built from data captured by cameras, such as the Microsoft Kinect, that also make depth measurements. But unlike those systems, Pillai and Leonard’s system can exploit the vast body of research on object recognizers trained on single-perspective images captured by standard cameras.
Moreover, the performance of Pillai and Leonard’s system is already comparable to that of the systems that use depth information. And it’s much more reliable outdoors, where depth sensors like the Kinect’s, which depend on infrared light, are virtually useless.
Pillai and Leonard’s new paper describes how SLAM can help improve object detection, but in ongoing work, Pillai is investigating whether object detection can similarly aid SLAM. One of the central challenges in SLAM is what roboticists call “loop closure.” As a robot builds a map of its environment, it may find itself somewhere it’s already been — entering a room, say, from a different door. The robot needs to be able to recognize previously visited locations, so that it can fuse mapping data acquired from different perspectives.
Object recognition could help with that problem. If a robot enters a room to find a conference table with a laptop, a coffee mug, and a notebook at one end of it, it could infer that it’s the same conference room where it previously identified a laptop, a coffee mug, and a notebook in close proximity.
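A toy version of that idea, purely illustrative with made-up labels and thresholds, scores previously visited places by how many of their recorded object labels reappear in the current view:

```python
def match_place_by_objects(current_objects, visited_places, min_overlap=3):
    """Propose loop-closure candidates from co-occurring object labels.

    current_objects: set of labels seen now, e.g. {"laptop", "mug"}.
    visited_places: dict mapping place_id -> set of labels seen there.
    Returns (place_id, overlap) pairs, best match first.
    """
    candidates = []
    for place_id, labels in visited_places.items():
        overlap = len(current_objects & labels)
        if overlap >= min_overlap:
            candidates.append((place_id, overlap))
    return sorted(candidates, key=lambda c: -c[1])

places = {"room_a": {"laptop", "mug", "notebook", "table"},
          "hall": {"plant", "bench"}}
print(match_place_by_objects({"laptop", "mug", "notebook"}, places))
# -> [('room_a', 3)]
```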
“The ability to detect objects is extremely important for robots that should perform useful tasks in everyday environments,” says Dieter Fox, a professor of computer science and engineering at the University of Washington. “This work shows very promising results on how a robot can combine information observed from multiple viewpoints to achieve efficient and robust detection of objects.”