Experts warn flashy humanoid robot stunt videos should be viewed with suspicion

Critics say the real benchmark in the robotics race is not staging impressive scenes but repeatability and general-purpose use. [Photo: Shutterstock]

Flashy humanoid robot demonstration videos keep appearing one after another, but critics say such footage alone should not be used to judge how a robot performs real work.

According to a June 4 (local time) report by technology outlet Ars Technica, figures in the robotics industry and academia see a large gap between the robot videos that go viral online and real-world performance.

What matters is not a demo that succeeds once or twice, but whether a robot can reliably repeat the same task in varied environments. Companies highlight acrobatic moves or household chores, but whether that level of performance can be reproduced at actual industrial sites or in homes has to be verified separately.

Jonathan Hurst, co-founder of Agility Robotics and a robotics researcher at Oregon State University, noted that human-shaped robots easily create inflated expectations. When people see a robot that appears to be dancing, he said, they tend to assume it can do other things a person can do, and added, "That is not true." He said some startups also exploit that perception to raise funding.

Sergey Levine, a computer scientist at the University of California, Berkeley and co-founder of Physical Intelligence, likewise said the truly hard problem for robots is "general-purpose capability." Even if a robot can pour a glass of wine, whether it can do so with any bottle, any glass and in any environment is a completely different question, he said. "It is much harder than doing a backflip in a one-off stage demo," he said.

For that reason, critics argue that quantitative, large-scale validation in real environments matters more than attention-grabbing videos when evaluating robot performance. Levine said there is always a gap between what someone can show in a demo and a robot's actual capabilities.

When watching a demo video, viewers should first check whether the robot is operating autonomously. Deepam Patel, a doctoral researcher in computer science at Purdue University and a research assistant at the U.S. Army Combat Capabilities Development Command's Army Research Laboratory, said many demos still rely on remote control. That applies to research papers and company videos alike, he said: "If it does not clearly state it is fully autonomous, you should watch it with very great suspicion."

In the same vein, another important criterion is whether the demo takes place in a new, unfamiliar setting or in an environment the robot has already been trained on. A robot that completes a task in an environment it is encountering for the first time makes a stronger case for general-purpose autonomy; a robot that repeats a task in a familiar space calls for a more cautious reading.

Playback speed is another variable. Patel pointed out that robots usually move very slowly, in part for safety reasons. Some companies disclose that their demo videos are played at 2x or 4x speed; in those cases the robot actually took two or four times as long as the footage suggests, which may also be far longer than a person would need. Motions that look agile on screen therefore do not necessarily reflect real working speed.

The purpose of a demo video and the degree of transparency also vary widely. Some videos have a strong performance aspect aimed at spreading on social media, while others are closer to promotional material aimed at winning customers or investors. Other videos, by contrast, show the robot's training process and trial and error, revealing limitations.

Ultimately, the humanoid robot videos that draw attention online show only part of the full picture. Even if a video looks sophisticated and its source appears trustworthy, it is difficult to draw conclusions about a robot's real capabilities from that alone. The metrics the industry should focus on are autonomy, general-purpose capability, speed, repeatable performance in real environments, and the scale of validation supporting them, not the polish of viral videos.

Jinju Hong hongjj@d-today.co.kr
