Machines That Learn Language More Like Kids Do
November 1, 2018 | MIT
Estimated reading time: 6 minutes
The expression with the most closely matching representations for objects, humans, and actions becomes the most likely meaning of the caption. The expression, initially, may refer to many different objects and actions in the video, but the set of possible meanings serves as a training signal that helps the parser continuously winnow down possibilities. “By assuming that all of the sentences must follow the same rules, that they all come from the same language, and seeing many captioned videos, you can narrow down the meanings further,” Barbu says.
In short, the parser learns through passive observation: To determine if a caption is true of a video, the parser by necessity must identify the highest probability meaning of the caption. “The only way to figure out if the sentence is true of a video [is] to go through this intermediate step of, ‘What does the sentence mean?’ Otherwise, you have no idea how to connect the two,” Barbu explains. “We don’t give the system the meaning for the sentence. We say, ‘There’s a sentence and a video. The sentence has to be true of the video. Figure out some intermediate representation that makes it true of the video.’”
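The selection step described above can be illustrated with a toy sketch. This is not the researchers' system: the caption, the candidate meanings, the detector scores, and the names `video_scores`, `candidates`, `score`, and `best_meaning` are all hypothetical, and the real parser learns its lexicon and grammar rather than reading them from a hand-written table. The sketch only shows the idea of scoring each candidate meaning against what is observed in the video and keeping the most probable one.

```python
from math import prod

# Hypothetical detector confidences for one video (values are made up).
# Each key is a predicate applied to placeholder participants x and y.
video_scores = {
    ("person", "x"): 0.9,
    ("apple", "y"): 0.8,
    ("book", "y"): 0.1,
    ("pick_up", "x", "y"): 0.7,
    ("put_down", "x", "y"): 0.2,
}

# Hypothetical candidate meanings for an illustrative caption such as
# "The person picks up the apple". Each meaning is a list of predicates
# that must all hold in the video for the caption to be true of it.
candidates = {
    "pick_up(person, apple)": [("person", "x"), ("apple", "y"), ("pick_up", "x", "y")],
    "put_down(person, apple)": [("person", "x"), ("apple", "y"), ("put_down", "x", "y")],
    "pick_up(person, book)": [("person", "x"), ("book", "y"), ("pick_up", "x", "y")],
}


def score(meaning, scores):
    """Multiply the confidences of every predicate in the meaning.

    Predicates the detectors never saw count as 0.0, so the whole
    meaning is ruled out for this video.
    """
    return prod(scores.get(predicate, 0.0) for predicate in meaning)


def best_meaning(candidates, scores):
    """Return the name of the candidate meaning with the highest score."""
    return max(candidates, key=lambda name: score(candidates[name], scores))


if __name__ == "__main__":
    for name, meaning in candidates.items():
        print(f"{name}: {score(meaning, video_scores):.3f}")
    print("most likely meaning:", best_meaning(candidates, video_scores))
```

In this toy setup, the meaning whose objects and action best match the video wins. Repeating the comparison over many captioned videos is what would let a learner rule out meanings that fit one video but not the others.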
The training produces a syntactic and semantic grammar for the words it’s learned. Given a new sentence, the parser no longer requires videos, but leverages its grammar and lexicon to determine sentence structure and meaning.
Ultimately, this process is learning “as if you’re a kid,” Barbu says. “You see [the] world around you and hear people speaking to learn meaning. One day, I can give you a sentence and ask what it means and, even without a visual, you know the meaning.”
“This research is exactly the right direction for natural language processing,” says Stefanie Tellex, a professor of computer science at Brown University who focuses on helping robots use natural language to communicate with humans. “To interpret grounded language, we need semantic representations, but it is not practicable to make it available at training time. Instead, this work captures representations of compositional structure using context from captioned videos. This is the paper I have been waiting for!”
In future work, the researchers are interested in modeling interactions, not just passive observations. “Children interact with the environment as they’re learning. Our idea is to have a model that would also use perception to learn,” Ross says.
This work was supported, in part, by the CBMM, the National Science Foundation, a Ford Foundation Graduate Research Fellowship, the Toyota Research Institute, and the MIT-IBM Brain-Inspired Multimedia Comprehension project.