If I drop a pen, you know that it won’t hover in midair but will fall to the floor. Similarly, if the pen encounters a desk on its way down, you know it won’t travel through the surface but will instead land on top.
These fundamental properties of physical objects seem intuitive to us. Infants as young as three months know that a ball no longer in sight still exists and that the ball can’t teleport from behind the couch to the top of the refrigerator.
Despite mastering complex games, such as chess and poker, artificial intelligence systems have yet to demonstrate the “commonsense” knowledge that an infant is either born with or picks up seemingly without effort in their first few months.
“It’s so striking that as much as AI technologies have advanced, we still don’t have AI systems with anything like human common sense,” says Joshua Tenenbaum, a professor of cognitive sciences at the Massachusetts Institute of Technology, who has done research in this area. “If we were ever to get to that point, then understanding how it works, how it arises in humans” will be valuable.
A study published on July 11 in the journal Nature Human Behaviour by a team at DeepMind, a subsidiary of Google’s parent company Alphabet, takes a step toward advancing how such commonsense knowledge might be incorporated into machines, and toward understanding how it develops in humans. The scientists came up with an “intuitive physics” model by building into an AI system the same inherent knowledge that developmental psychologists think a baby is born with. They also created a means of testing the model that is akin to the methods used to assess cognition in human infants.
Normally, the deep-learning systems that have become ubiquitous in AI research go through training to identify patterns of pixels in a scene. By doing so, they can recognize a face or a ball, but they cannot predict what will happen to those objects when placed in a dynamic scene where they move and bump into each other. To tackle the trickier challenge presented by intuitive physics, the researchers developed a model called PLATO (Physics Learning through Auto-encoding and Tracking Objects) to focus on whole objects instead of individual pixels. They then trained PLATO on about 300,000 videos so that it could learn how an object behaves: a ball falling, bouncing against another object or rolling behind a barrier only to reappear on the other side.
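To make that object-centric pipeline concrete, here is a minimal sketch in Python of the two stages the paragraph describes: a perception step that turns a frame into per-object latent codes, and a dynamics model fitted from examples that predicts each object’s next state. This is not DeepMind’s actual code; the function names, the linear dynamics and the toy training data are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Perception stage (stand-in): map a frame to one latent code per object.
# In PLATO this is a learned auto-encoder over segmented, tracked objects;
# here we fabricate codes so the sketch runs end to end.
def encode_objects(frame, num_objects=3, code_size=8):
    return rng.normal(size=(num_objects, code_size))

# Dynamics stage (stand-in): toy "physics" in which an object's next state
# is a fixed linear function of its current state.
true_dynamics = rng.normal(size=(8, 8))

# "Training": fit the dynamics from (state, next-state) pairs, a stand-in
# for the roughly 300,000 videos PLATO learned from.
states = rng.normal(size=(500, 8))
next_states = states @ true_dynamics
learned_dynamics, *_ = np.linalg.lstsq(states, next_states, rcond=None)

# Prediction: roll the learned model forward on the objects in a new frame.
objects_now = encode_objects(frame=None)
objects_next = objects_now @ learned_dynamics
print("prediction error per object:",
      np.sum((objects_next - objects_now @ true_dynamics) ** 2, axis=1))
```

The design choice the sketch illustrates is the one the researchers emphasize: the dynamics model operates on object states rather than raw pixels, so "what happens next" is learned per object instead of per pixel.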
The goal was to have PLATO understand what violates the laws of intuitive physics based on five fundamental concepts: object permanence (an object still exists even if it’s not in view), solidity (objects are physically solid and cannot pass through one another), continuity (objects move along continuous paths and can’t disappear and reappear in an unexpectedly distant place), unchangeableness (an object’s properties remain the same) and directional inertia (a moving object changes direction only in ways consistent with inertia). PLATO, like an infant, exhibited “surprise” when, say, it watched one object pass through another without ricocheting on impact. It distinguished physically possible from impossible scenes significantly better than a traditional AI system that was trained on the same videos but had not been imbued with an inherent knowledge of objects.
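The “surprise” readout described above can be sketched simply: treat the model’s prediction error on each frame as its surprise, then compare the totals for a physically possible clip and a matched impossible one. The toy one-dimensional scene and squared-error metric below are assumptions for illustration, not the study’s stimuli or exact measure.

```python
import numpy as np

def surprise(predicted, observed):
    # Per-frame surprise: squared error between the model's predicted
    # object states and the states actually observed.
    return np.sum((predicted - observed) ** 2, axis=1)

# Toy 1-D scene: the model expects an object to fall one unit per frame.
frames = np.arange(10, dtype=float).reshape(-1, 1)
predicted = -frames                     # expected height at each frame

possible = predicted.copy()             # the object falls as physics dictates
impossible = predicted.copy()
impossible[5:] = impossible[4]          # the object halts in midair

print("total surprise, possible clip:  ", surprise(predicted, possible).sum())
print("total surprise, impossible clip:", surprise(predicted, impossible).sum())
# The impossible clip yields the higher total surprise, the signature the
# researchers looked for and the analogue of infants looking longer at
# physics-violating events.
```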
“Psychologists think that people use objects to understand the physical world, so maybe if we build a system like that, we’re going to maximize our likelihood of [an AI model] actually understanding the physical world,” said Luis Piloto, a research scientist at DeepMind who led the study, during a press conference.
Previous efforts to teach AI intuitive physics by building varying degrees of innate or acquired physical knowledge into a system have achieved mixed success. The new study instead tried to acquire intuitive physics the way developmental psychologists think an infant does: starting from an inborn awareness of what an object is and then learning the physical rules that govern objects’ behavior by watching them move about the world.
“What’s exciting and unique about this paper is that they did it very closely based on what is known in cognitive psychology and developmental science,” says Susan Hespos, a psychology professor at Northwestern University, who co-wrote a News & Views article accompanying the paper but was not involved with the research. “We are born with innate knowledge, but it’s not like it’s perfect when we're born with it.... And then, through experience and the environment, babies—just like this computer model—elaborate that knowledge.”
The DeepMind researchers emphasize that, at this stage, their work is not ready to advance robotics, self-driving cars or other trending AI applications. The model they developed will need substantially more training on objects involved in real-world scenarios before it can be incorporated into AI systems. As the model grows in sophistication, it might also inform developmental psychology research about how infants learn to understand the world. Whether commonsense knowledge is learned or innate has been debated by developmental psychologists for nearly 100 years, dating back to Swiss psychologist Jean Piaget’s work on the stages of cognitive development.
“There’s a fruitful collaboration that can happen between artificial intelligence that takes ideas from developmental science and incorporates it into their modeling,” Hespos says. “I think it can be a mutually beneficial relationship for both sides of the equation.”