MIT Robot Combines Vision and Touch to Learn the Game of Jenga

February 1, 2019 | MIT


In the basement of MIT’s Building 3, a robot is carefully contemplating its next move. It gently pokes at a tower of blocks, looking for the best block to extract without toppling the tower, in a solitary, slow-moving, yet surprisingly agile game of Jenga.

The robot, developed by MIT engineers, is equipped with a soft-pronged gripper, a force-sensing wrist cuff, and an external camera, all of which it uses to see and feel the tower and its individual blocks.

As the robot carefully pushes against a block, a computer takes in visual and tactile feedback from its camera and cuff, and compares these measurements to moves that the robot previously made. It also considers the outcomes of those moves, specifically whether a block, in a certain configuration and pushed with a certain amount of force, was successfully extracted. In real time, the robot then “learns” whether to keep pushing or move to a new block, in order to keep the tower from falling.

Details of the Jenga-playing robot are published today in the journal Science Robotics. Alberto Rodriguez, the Walter Henry Gale Career Development Assistant Professor in the Department of Mechanical Engineering at MIT, says the robot demonstrates something that’s been tricky to attain in previous systems: the ability to quickly learn the best way to carry out a task, not just from visual cues, as it is commonly studied today, but also from tactile, physical interactions.

“Unlike in more purely cognitive tasks or games such as chess or Go, playing the game of Jenga also requires mastery of physical skills such as probing, pushing, pulling, placing, and aligning pieces. It requires interactive perception and manipulation, where you have to go and touch the tower to learn how and when to move blocks,” Rodriguez says. “This is very difficult to simulate, so the robot has to learn in the real world, by interacting with the real Jenga tower. The key challenge is to learn from a relatively small number of experiments by exploiting common sense about objects and physics.”

He says the tactile learning system the researchers have developed can be used in applications beyond Jenga, especially in tasks that need careful physical interaction, including separating recyclable objects from landfill trash and assembling consumer products.

“In a cellphone assembly line, in almost every single step, the feeling of a snap-fit, or a threaded screw, is coming from force and touch rather than vision,” Rodriguez says. “Learning models for those actions is prime real-estate for this kind of technology.”

The paper’s lead author is MIT graduate student Nima Fazeli. The team also includes Miquel Oller, Jiajun Wu, Zheng Wu, and Joshua Tenenbaum, professor of brain and cognitive sciences at MIT.

Push and Pull

In the game of Jenga — Swahili for “build” — 54 rectangular blocks are stacked in 18 layers of three blocks each, with the blocks in each layer oriented perpendicular to the blocks below. The aim of the game is to carefully extract a block and place it at the top of the tower, thus building a new level, without toppling the entire structure.
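For readers who think in code, the tower layout described above can be sketched as a small data structure. This is an illustrative model only, not the researchers' software: the names build_tower and move_block, the "x"/"y" axis labels, and the slot representation are all assumptions made for this example.

```python
def build_tower(layers=18, per_layer=3):
    """Return the layers bottom-up; each layer stores its axis and filled slots."""
    return [
        # Alternate the axis so each layer is perpendicular to the one below.
        {"axis": "x" if i % 2 == 0 else "y", "slots": [True] * per_layer}
        for i in range(layers)
    ]


def count_blocks(tower):
    """Total number of blocks currently in the tower."""
    return sum(sum(layer["slots"]) for layer in tower)


def move_block(tower, layer, slot):
    """Take one block out and place it on top, starting a new layer if needed."""
    if not tower[layer]["slots"][slot]:
        raise ValueError("no block in that slot")
    tower[layer]["slots"][slot] = False
    top = tower[-1]
    if all(top["slots"]):
        # The top layer is full, so begin a new one, perpendicular to it.
        next_axis = "y" if top["axis"] == "x" else "x"
        top = {"axis": next_axis, "slots": [False] * len(top["slots"])}
        tower.append(top)
    # Fill the first free slot of the top layer.
    top["slots"][top["slots"].index(False)] = True


tower = build_tower()
move_block(tower, layer=0, slot=1)  # pull the middle block of the bottom layer
```

The sketch tracks only which slots are occupied. It says nothing about whether a move is physically safe, which is the part the robot has to learn by touch.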

To program a robot to play Jenga, traditional machine-learning schemes might require capturing everything that could possibly happen between a block, the robot, and the tower — an expensive computational task requiring data from thousands if not tens of thousands of block-extraction attempts.

Instead, Rodriguez and his colleagues looked for a more data-efficient way for a robot to learn to play Jenga, inspired by human cognition and the way we ourselves might approach the game.

The team customized an industry-standard ABB IRB 120 robotic arm, then set up a Jenga tower within the robot’s reach, and began a training period in which the robot first chose a random block and a location on the block against which to push. It then exerted a small amount of force in an attempt to push the block out of the tower.

For each block attempt, a computer recorded the associated visual and force measurements, and labeled whether each attempt was a success.

Rather than carry out tens of thousands of such attempts (which would involve reconstructing the tower almost as many times), the robot trained on only about 300, with attempts of similar measurements and outcomes grouped in clusters representing certain block behaviors. For instance, one cluster of data might represent attempts on a block that was hard to move, another a block that moved easily, and a third a block that toppled the tower when moved. For each data cluster, the robot developed a simple model to predict a block’s behavior given its current visual and tactile measurements.

Fazeli says this clustering technique dramatically increases the efficiency with which the robot can learn to play the game, and is inspired by the natural way in which humans cluster similar behavior: “The robot builds clusters and then learns models for each of these clusters, instead of learning a model that captures absolutely everything that could happen.”
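The cluster-then-model idea can be illustrated with a toy sketch. The article does not say which clustering algorithm or per-cluster model the team used, so plain k-means and a per-cluster success rate stand in here purely for brevity. The Attempt fields, their units, the sample numbers, and the 0.5 threshold are all hypothetical.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Attempt:
    force: float         # pushing force felt at the wrist (hypothetical units)
    displacement: float  # block travel seen by the camera (hypothetical units)
    success: bool        # was the block extracted without toppling the tower?


def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def kmeans(points, k, iters=20):
    """Plain k-means, seeded with the first k points so results are repeatable."""
    centroids = list(points[:k])
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(c) / len(c) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def fit_cluster_models(attempts, k):
    """Cluster the attempts, then fit one simple model (a success rate) per cluster."""
    points = [(a.force, a.displacement) for a in attempts]
    centroids = kmeans(points, k)
    outcomes = [[] for _ in range(k)]
    for a, p in zip(attempts, points):
        outcomes[nearest(p, centroids)].append(a.success)
    rates = [sum(o) / len(o) if o else 0.0 for o in outcomes]
    return centroids, rates


def decide(centroids, rates, force, displacement, threshold=0.5):
    """Match a live measurement to a cluster and act on that cluster's model."""
    rate = rates[nearest((force, displacement), centroids)]
    return "keep pushing" if rate >= threshold else "move to a new block"


# Made-up training data: loose blocks slide easily, stuck blocks resist.
attempts = [
    Attempt(0.20, 9.0, True), Attempt(2.5, 0.5, False),
    Attempt(0.30, 8.5, True), Attempt(2.8, 0.4, False),
    Attempt(0.25, 9.5, True), Attempt(2.6, 0.6, False),
]
centroids, rates = fit_cluster_models(attempts, k=2)
print(decide(centroids, rates, force=0.22, displacement=8.8))
print(decide(centroids, rates, force=2.7, displacement=0.5))
```

With only six invented data points the two clusters separate trivially. The point of the sketch is the structure: a handful of small models, each responsible for one kind of block behavior, rather than one model that must cover every possible interaction.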
