Explained: Neural Networks

April 17, 2017 | MIT

Estimated reading time: 7 minutes

In the past 10 years, the best-performing artificial-intelligence systems — such as the speech recognizers on smartphones or Google’s latest automatic translator — have resulted from a technique called “deep learning.”

Deep learning is in fact a new name for an approach to artificial intelligence called neural networks, which have been going in and out of fashion for more than 70 years. Neural networks were first proposed in 1944 by Warren McCullough and Walter Pitts, two University of Chicago researchers who moved to MIT in 1952 as founding members of what’s sometimes called the first cognitive science department.

Neural nets were a major area of research in both neuroscience and computer science until 1969, when, according to computer science lore, they were killed off by the MIT mathematicians Marvin Minsky and Seymour Papert, who a year later would become co-directors of the new MIT Artificial Intelligence Laboratory.

The technique then enjoyed a resurgence in the 1980s, fell into eclipse again in the first decade of the new century, and has returned like gangbusters in the second, fueled largely by the increased processing power of graphics chips.

“There’s this idea that ideas in science are a bit like epidemics of viruses,” says Tomaso Poggio, the Eugene McDermott Professor of Brain and Cognitive Sciences at MIT, an investigator at MIT’s McGovern Institute for Brain Research, and director of MIT’s Center for Brains, Minds, and Machines. “There are apparently five or six basic strains of flu viruses, and apparently each one comes back with a period of around 25 years. People get infected, and they develop an immune response, and so they don’t get infected for the next 25 years. And then there is a new generation that is ready to be infected by the same strain of virus. In science, people fall in love with an idea, get excited about it, hammer it to death, and then get immunized — they get tired of it. So ideas should have the same kind of periodicity!”

Weighty matters

Neural nets are a means of doing machine learning, in which a computer learns to perform some task by analyzing training examples. Usually, the examples have been hand-labeled in advance. An object recognition system, for instance, might be fed thousands of labeled images of cars, houses, coffee cups, and so on, and it would find visual patterns in the images that consistently correlate with particular labels.

Modeled loosely on the human brain, a neural net consists of thousands or even millions of simple processing nodes that are densely interconnected. Most of today’s neural nets are organized into layers of nodes, and they’re “feed-forward,” meaning that data moves through them in only one direction. An individual node might be connected to several nodes in the layer beneath it, from which it receives data, and several nodes in the layer above it, to which it sends data.

To each of its incoming connections, a node will assign a number known as a “weight.” When the network is active, the node receives a different data item — a different number — over each of its connections and multiplies it by the associated weight. It then adds the resulting products together, yielding a single number. If that number is below a threshold value, the node passes no data to the next layer. If the number exceeds the threshold value, the node “fires,” which in today’s neural nets generally means sending the number — the sum of the weighted inputs — along all its outgoing connections.

When a neural net is being trained, all of its weights and thresholds are initially set to random values. Training data is fed to the bottom layer — the input layer — and it passes through the succeeding layers, getting multiplied and added together in complex ways, until it finally arrives, radically transformed, at the output layer. During training, the weights and thresholds are continually adjusted until training data with the same labels consistently yield similar outputs.

Minds and machines

The neural nets described by McCullough and Pitts in 1944 had thresholds and weights, but they weren’t arranged into layers, and the researchers didn’t specify any training mechanism. What McCullough and Pitts showed was that a neural net could, in principle, compute any function that a digital computer could. The result was more neuroscience than computer science: The point was to suggest that the human brain could be thought of as a computing device.

Neural nets continue to be a valuable tool for neuroscientific research. For instance, particular network layouts or rules for adjusting weights and thresholds have reproduced observed features of human neuroanatomy and cognition, an indication that they capture something about how the brain processes information.

The first trainable neural network, the Perceptron, was demonstrated by the Cornell University psychologist Frank Rosenblatt in 1957. The Perceptron’s design was much like that of the modern neural net, except that it had only one layer with adjustable weights and thresholds, sandwiched between input and output layers.

Perceptrons were an active area of research in both psychology and the fledgling discipline of computer science until 1959, when Minsky and Papert published a book titled “Perceptrons,” which demonstrated that executing certain fairly common computations on Perceptrons would be impractically time consuming.

“Of course, all of these limitations kind of disappear if you take machinery that is a little more complicated — like, two layers,” Poggio says. But at the time, the book had a chilling effect on neural-net research.

“You have to put these things in historical context,” Poggio says. “They were arguing for programming — for languages like Lisp. Not many years before, people were still using analog computers. It was not clear at all at the time that programming was the way to go. I think they went a little bit overboard, but as usual, it’s not black and white. If you think of this as this competition between analog computing and digital computing, they fought for what at the time was the right thing.”

Page 1 of 2

Share on:

Testimonial

"We’re proud to call I-Connect007 a trusted partner. Their innovative approach and industry insight made our podcast collaboration a success by connecting us with the right audience and delivering real results."

Julia McCaffrey - NCAB Group

Suggested Items

Soaring Inference AI Demand Triggers Severe Nearline HDD Shortages; QLC SSD Shipments Poised for Breakout in 2026

09/16/2025 | TrendForce
TrendForce’s latest investigations reveal that the massive data volumes generated by AI are straining the global infrastructure of data center storage.

Advanced Packaging-to-Board-Level Integration: Needs and Challenges

09/15/2025 | Devan Iyer and Matt Kelly, Global Electronics Association
HPC data center markets now demand components with the highest processing and communication rates (low latencies and high bandwidth, often both simultaneously) and highest capacities with extreme requirements for advanced packaging solutions at both the component level and system level. Insatiable demands have been projected for heterogeneous compute, memory, storage, and data communications. Interconnect has become one of the most important pillars of compute for these systems.

Procense Raises $1.5M in Seed Funding to Accelerate AI-Powered Manufacturing

09/11/2025 | BUSINESS WIRE
Procense, a San Francisco-based industrial automation startup developing cutting-edge AI and remote sensing technologies for process manufacturers has raised $1.5 million in a seed funding round led by Kevin Mahaffey, Business Insider’s #1 seed investor of 2025 and HighSage Ventures, a Boston-based family office that primarily invests in public and private companies in the global software, internet, consumer, and financial technology sectors.

Zuken Announces E3.series 2026 Release for Accelerated Electrical Design and Enhanced Engineering Productivity  

09/10/2025 | Zuken
Zuken reveals details of the upcoming 2026 release of E3.series, which will introduce powerful new features aimed at streamlining electrical and fluid design, enhancing multi-disciplinary collaboration, and boosting engineering productivity.

AI Infrastructure Boosts Global Semiconductor Revenue Growth to 17.6% in 2025

09/09/2025 | IDC
According to the Worldwide Semiconduct o r Technology and Supply Chain Intelligence service from International Data Corporation (IDC), worldwide semiconductor revenue is expected to reach $800 billion in 2025, growing 17.6% year-over-year from $680 billion in 2024. This follows a strong rebound in 2024, when revenue grew by 22.4% year-over-year.

News Highlights

More News

Featured Books

Article Highlights

More Articles

Latest Columns

See all of our columnists

Media Kit - Choose Your Primary Marketing Focus: