SBS News

How Do Robots Learn 'Good Behavior'? The Evolution of Physical AI Training

Hong Yeongjae

Published: Jun 28, 2026 9:46 PM

Robots in the world of physical AI must understand human speech and move on their own to adapt to their surroundings.

A crucial part of this process is teaching them what constitutes good behavior.

Consider a robot picking up a box.

A robot that fails to reach out and lift the box is a bad example, while one that keeps its balance and lifts the box in one go is a good example.

To train a robot, anywhere from tens of thousands to hundreds of millions of these graded examples must be created, depending on the difficulty of the task. The problem is that humans have to grade each one as good or bad.

[Yoo Chang-dong / Professor, School of Electrical Engineering, KAIST: We have to inform the robot that we prefer the successful cases. Depending on the complexity of the task, having humans label and evaluate each one manually requires an enormous amount of time and cost.]

A U.S. robotics data organization estimated that creating more than 500 training datasets could cost up to 270 million won once equipment and labor are included.

Creating a single video dataset for a simple object-picking task costs around 20,000 won, while complex tasks that use both hands can cost nearly 50,000 won each.
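To give a sense of scale, the per-item figures above can be multiplied out. The short Python sketch below is only back-of-the-envelope arithmetic: it assumes cost grows linearly with the number of items, and the count of 10,000 items is a hypothetical chosen for illustration, not a figure from the report.

```python
def total_cost_won(num_items: int, cost_per_item_won: int) -> int:
    """Total cost in won, assuming every item costs the same (an assumption)."""
    return num_items * cost_per_item_won

# Hypothetical collection size; the per-item costs are the ones quoted above.
NUM_ITEMS = 10_000
simple_task_total = total_cost_won(NUM_ITEMS, 20_000)   # simple object picking
two_hand_task_total = total_cost_won(NUM_ITEMS, 50_000)  # two-handed tasks

print(f"Simple task:   {simple_task_total:,} won")
print(f"Two-hand task: {two_hand_task_total:,} won")
```

Under these assumptions, 10,000 items would run to hundreds of millions of won before any equipment cost is counted.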

Securing training data for robots has become another bottleneck in the AI era, just as significant as electricity and computing resources.

To ease this bottleneck, a research team at KAIST has developed a method in which the AI does the grading itself.

By looking at only a small number of human-evaluated training datasets, the AI learns the evaluation criteria on its own and then grades thousands or tens of thousands of videos by itself.

[Yoo Chang-dong / Professor, School of Electrical Engineering, KAIST: In an experiment involving opening a drawer, we can generate about 10,000 data points using just 10 pieces of data evaluated by humans.]
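The report does not describe how the KAIST method works internally. As a rough illustration of the general idea only (learn a grading rule from a handful of human-graded examples, then apply it to a large ungraded pool), here is a minimal Python sketch. Everything in it is an assumption made for the example: the two features per episode, the synthetic data, and the nearest-centroid rule are hypothetical stand-ins, not the research team's technique.

```python
import random

def centroid(points):
    """Mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit_grader(labeled):
    """Learn one centroid per grade from (features, grade) pairs."""
    good = [f for f, grade in labeled if grade == "good"]
    bad = [f for f, grade in labeled if grade == "bad"]
    return {"good": centroid(good), "bad": centroid(bad)}

def grade_episode(grader, features):
    """Assign the grade whose centroid is closest to the episode."""
    return min(grader, key=lambda g: squared_distance(grader[g], features))

def make_episode(rng, success):
    # Hypothetical features: (how far the drawer opened, how jerky the motion was).
    if success:
        return [rng.gauss(0.9, 0.05), rng.gauss(0.2, 0.05)]
    return [rng.gauss(0.2, 0.05), rng.gauss(0.7, 0.05)]

rng = random.Random(0)

# 10 episodes graded by a human: 5 good, 5 bad.
human_graded = [(make_episode(rng, True), "good") for _ in range(5)]
human_graded += [(make_episode(rng, False), "bad") for _ in range(5)]
grader = fit_grader(human_graded)

# 10,000 ungraded episodes; the true outcome is kept only to check the grader.
truth = [rng.random() < 0.5 for _ in range(10_000)]
pool = [make_episode(rng, s) for s in truth]
auto_grades = [grade_episode(grader, f) for f in pool]

agreement = sum((g == "good") == s for g, s in zip(auto_grades, truth)) / len(pool)
print(f"Auto-graded {len(auto_grades)} episodes; agreement with truth: {agreement:.3f}")
```

The synthetic good and bad episodes here are deliberately easy to tell apart; real robot videos would need far richer features and a far more capable model.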

A humanoid holding a tennis racket chases and returns a ball hit by a human.

Training data is even harder to secure for sports movements, which require quick reactions and precise motions.

Researchers at Tsinghua University in China tried starting from imperfect data rather than collecting perfect training data.

After learning from five hours of video footage showing basic movements such as forehands, backhands, and footwork, the AI was able to engage in basic rallies. The research team explained that this approach reduces the burden of data collection.

Training in simulations and virtual spaces is also becoming increasingly important.

Boston Dynamics' Atlas lifts a refrigerator and moves it to a table nearby.

The robot succeeded in the real world after undergoing millions of failures in a virtual space, practicing movements that would be too dangerous or costly to repeat in reality.

[Shane Rozen-Levy / Engineer, Boston Dynamics: We provided reference scenes to mimic the refrigerator-moving task and conducted simulations over millions of hours.]

For robots to appear in workplaces and factories, they must learn more complex behaviors efficiently.

To meet that challenge, physical AI training methods continue to evolve.

Reported by Kim Hak-mo | Video by Choi Hye-young | Graphics by Park Tae-young

※ Please note: This article was translated by AI and may contain errors.