
Teaching AI Physics: From Large Models to Real-World Actions

By DAN LIU | Aug 12, 2025

Chongqing - At the recently concluded 3rd China International Supply Chain Expo, NVIDIA founder and CEO Jensen Huang sat down with Alibaba Cloud founder Wang Jian to discuss the future of artificial intelligence. Huang made a bold prediction: while Generative AI — now at the peak of its hype — only emerged seven years ago, the next wave will be Physical AI — intelligent systems that integrate their capabilities into the real, physical world, from robotics to autonomous machines.

In Huang’s view, AI has evolved through three distinct stages: Perception AI (understanding images, text, and sound), Generative AI (creating text, photos, and audio), and Agentic AI (capable of autonomous decision-making). Physical AI, with its ability to interact with and understand the real environment, is the fourth great wave — and perhaps the ultimate form.

To explore what this means in practice, Bridging News recently spoke with Professor Zhan Zhenfei of the School of Mechanical, Electrical, and Vehicle Engineering at Chongqing Jiaotong University, a postdoctoral supervisor and former data scientist and R&D engineer at Ford North America. As an industry insider riding this Physical AI wave, Zhan offers a front-line perspective on how AI can truly grasp the rules of the physical world.

Professor Zhan Zhenfei of the School of Mechanical, Electrical, and Vehicle Engineering at Chongqing Jiaotong University is a postdoctoral supervisor, former data scientist, and R&D engineer at Ford North America. 

Core Difference Between Traditional AI and Physical AI

Fundamentally, traditional AI — especially mainstream large language models (LLMs) such as ChatGPT — relies on statistical correlations. Its core mechanism is simple: given an input, predict the most probable next word or character based on patterns learned from massive datasets. In other words, it’s not “thinking” in the sense of understanding meaning, but computing: In this context, what is the most likely next token?
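
That next-token mechanism can be illustrated with a toy bigram model, a deliberately minimal stand-in for an LLM; the corpus and counts below are invented purely for illustration:

```python
from collections import Counter, defaultdict

# Tiny toy corpus standing in for the "massive datasets" an LLM trains on.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each other word — a bigram model,
# the simplest possible form of next-token prediction.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def most_likely_next(word):
    """Return the statistically most probable next word, as an LLM
    (at vastly greater scale) does for the next token."""
    return following[word].most_common(1)[0][0]

print(most_likely_next("the"))  # "cat" — it followed "the" most often
```

The model "knows" nothing about cats or mats; it only knows which word most frequently followed "the" in its data, which is exactly the correlation-not-causation point Zhan makes below.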

As Zhan explains, such models don’t truly operate on causality or logic. For instance, “smoking” and “lung cancer” are highly correlated in the data, but the model doesn’t know that smoking causes lung cancer — it only knows the two terms frequently appear together. Since its output is driven by statistical probability rather than fact-checking, the model may confidently fabricate research or events that never existed — a phenomenon known as AI hallucination.

To truly participate in human life, AI must understand real-world physical laws — how gravity affects motion, how friction changes mechanical efficiency, and how material properties determine structural strength.

Zhan believes Physical AI changes the situation where traditional AI can “see” the world but not “understand” it. By embedding human-verified physical rules — like classical physics — directly into models, using equations as the backbone and data as a supplement, Physical AI can reason beyond the limits of probabilistic pattern-matching. This means AI can both observe phenomena and grasp their causes; it can operate accurately even in unfamiliar environments because physical laws hold true across scenarios.
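
The "equations as the backbone" idea can be sketched with a classical-physics example: braking distance from Newtonian mechanics. The speeds and friction coefficients below are illustrative assumptions, not values from any real system:

```python
G = 9.81  # gravitational acceleration, m/s^2

def stopping_distance(speed_ms, mu):
    """Physics backbone: braking distance d = v^2 / (2 * mu * g).
    Because the law holds in every scenario, the estimate generalizes
    to road conditions (mu values) never seen in any training set."""
    return speed_ms ** 2 / (2 * mu * G)

# Same equation, unfamiliar conditions: dry asphalt vs. a wet road.
dry = stopping_distance(20.0, mu=0.7)  # ~29.1 m
wet = stopping_distance(20.0, mu=0.4)  # ~51.0 m
```

A purely data-driven model would have to see wet-road examples to predict the longer stopping distance; the equation delivers it for free, with data needed only to refine parameters such as the friction coefficient.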

Physical AI: Giving AI Human-Like Senses and Judgment

If AI is the “brain,” it still needs “eyes,” “ears,” and “hands” to function. In AI systems, these sensory inputs come from multimodal sensors. For example, diagnosing a car problem might require “seeing,” “hearing,” and “feeling” — just as a mechanic would.

In vision, traditional AI often struggles with ambiguous inputs. A conventional system using cameras and radar might be unable to tell whether a shiny patch ahead is a puddle or an obstacle. Physical AI, however, can combine camera and radar data with weather records, onboard force sensors, friction coefficients, and physics models to estimate the puddle’s depth and slip risk.
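
A toy sketch of such multimodal fusion might look like the following; every weight, threshold, and sensor value here is a made-up illustration, not an actual production algorithm:

```python
def slip_risk(camera_wet_prob, radar_reflectivity, rain_mm_per_h, base_mu):
    """Hypothetical fusion rule: combine camera, radar, and weather
    evidence into an estimated friction coefficient and a slip-risk
    score. The weights are illustrative assumptions only."""
    # Weighted evidence of a wet surface from three independent sources.
    wetness = (0.5 * camera_wet_prob
               + 0.3 * min(rain_mm_per_h / 10.0, 1.0)
               + 0.2 * radar_reflectivity)
    # Physics-informed adjustment: water can roughly halve the dry
    # friction coefficient of asphalt.
    mu_est = base_mu * (1.0 - 0.5 * wetness)
    # More wetness and less grip both raise the slip risk.
    risk = wetness * (1.0 - mu_est)
    return mu_est, risk

# Heavy rain, camera fairly sure the patch is wet, strong radar return:
mu, risk = slip_risk(0.9, 0.8, 12.0, base_mu=0.7)
```

The point is not the particular formula but the structure: no single sensor settles the puddle-versus-obstacle question, yet a physics-grounded combination of all of them yields an actionable estimate.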

Zhan offers a relatable example: driving in heavy rain. Experienced drivers can sense when a surface might be slippery, but autonomous vehicles’ “eyes” — lidar, radar, cameras — often fail in poor weather. According to the U.S. National Highway Traffic Safety Administration (NHTSA), roughly 30% of serious accidents involving advanced driver-assistance systems are caused by bad weather confusing sensors. Unlike humans, these systems can’t squint, slow down, or adapt instinctively — they rigidly report raw data, which often leads to skids or collisions.

By contrast, a Physical AI-equipped car is like having a 30-year veteran driving instructor onboard. When sudden rain hits, a truck tips over ahead, or a ball rolls into the street, the AI “knows” the solutions: how rain changes road friction, when tires lose grip, and how the vehicle tilts. It can instantly decide, “A hard brake will cause an 8° skid; a light brake with a 30 cm swerve is safest,” then execute the maneuver with millisecond precision — just like an expert driver acting on instinct, but with numerical exactness.

The Heavy Lift Ahead: Computing Power and Complexity

Despite its promise, Physical AI faces steep challenges. Zhan sees the biggest hurdles as its enormous computational demands and system complexity: it must simulate nonlinear physical processes, fuse multimodal sensor data, and complete the perceive–decide–act loop within milliseconds, pushing computational requirements far beyond those of conventional models.
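
The perceive–decide–act loop Zhan describes can be sketched as a deadline-bound control cycle; the sensor values, decision rule, and 10 ms budget below are all hypothetical placeholders:

```python
import time

def perceive():
    """Stand-in for multimodal sensor fusion (values are invented)."""
    return {"mu": 0.4, "speed_ms": 20.0}

def decide(state):
    """Stand-in for physics-based reasoning over the perceived state."""
    return "brake_lightly" if state["mu"] < 0.5 else "maintain"

def act(command):
    """Stand-in for sending the command to the actuators."""
    return command

def control_cycle(budget_ms=10.0):
    """One perceive–decide–act iteration under a hard deadline.
    A real vehicle repeats this loop many times per second, and
    missing the budget is a safety failure, not a slowdown."""
    start = time.perf_counter()
    command = act(decide(perceive()))
    elapsed_ms = (time.perf_counter() - start) * 1000
    return command, elapsed_ms <= budget_ms
```

The computational burden comes from what the real `perceive` and `decide` must do inside that budget — fusing sensors and simulating nonlinear physics — which is trivially fast in this skeleton but enormously expensive in practice.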

By analogy: traditional autonomous driving is like a “test-prep” driver who memorizes templates for traffic lights, pedestrians, and cones. When encountering a new scene, it just looks for the closest match — barely taxing the processor. Physical AI is like an “experiment-on-the-spot” veteran who instantly factors in road friction, wind speed, tire pressure, and vehicle weight to predict outcomes in milliseconds. Slight delays can send the car skidding across lane lines; if done quickly and precisely, passengers will simply feel a smooth, confident turn.

This leap from “language understanding” to “physical understanding” is only just beginning. Much like how cooking skills from a book pale in comparison to time spent in a real kitchen, AI must leave the comfort of datasets and enter the messy, variable, dynamic real world to truly understand its laws.

Why do models that run flawlessly in simulations “crash” so often in reality? Is transferring skills from autonomous driving to other embodied intelligence fields viable? And in this new race, where exactly does the U.S.–China gap lie?

We’ll explore these questions in the next installment: From Large Models to Small Actions: How AI Learns the Laws of the Physical World (Part 2).

