How the technology behind ChatGPT powers this bot

This robot looks a bit like something out of Star Wars, if perhaps not as elegant. But the Digit robot is not science fiction; it is real.

Ilya Radosavovich and colleagues from the University of California, Berkeley, used a transformer model to teach Agility Robotics' biped, which is about 1.60 meters tall and weighs about 45 kilograms, to walk stably on a wide range of surfaces. In principle, the model works in the same way as the large language models that chatbots like ChatGPT rely on. And much like large language models, the program developed new skills that it had not been trained on: for example, it learned to walk backwards. Radosavovich and colleagues describe the technical details in a paper on the arXiv preprint server.

Walking for beginners

Humanoid robots, that is, human-like robots that are similar in size to adults, have been around since the 1970s. However, it soon became clear that it would be extremely difficult to make such machines walk like humans. In theory, the problem is not that hard to solve, but only as long as the robot moves slowly across a perfectly flat floor. There have been significant technical advances since then: Boston Dynamics' Atlas robot, for example, can complete an obstacle course. But this usually works only in one special case, and the robot must be extensively tuned for that purpose.

For several years now, different groups and companies have been looking for ways in which a robot can learn optimal movements from a sufficient number of examples. Radosavovich and his team wanted to control the robot using a neural network with a transformer architecture, the same architecture used in large language models. However, this was possible only via a detour.

Very little data

“However, unlike language, we do not have an easily accessible data set about human walking to learn from,” says Radosavovich. “In other words, we're starting from scratch, a clean slate.”

So the researchers trained their transformer model through trial and error, using reinforcement learning. “The robot initially performs random movement sequences. Every time the robot stumbles upon a desired behavior, such as balancing or taking a step, we give it a reward and encourage it to become more likely to perform that behavior. On the other hand, if the robot exhibits an undesirable behavior, such as falling, it receives a negative reward, a punishment, to deter it from doing so in the future,” Radosavovich wrote. “Over the course of many trials, this process converges to form a neural network capable of operating the robot.” The researchers describe the results of this training in a recent article in the journal Science Robotics.
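
The reward-and-punishment loop described in the quote can be illustrated with a toy example. The following Python sketch is not the researchers' training code: it assumes just three made-up behaviors ("balance", "step", "fall") with invented reward values, and it uses a simple gradient-bandit update in place of the much larger reinforcement learning setup needed for a real robot. The names `REWARDS`, `softmax` and `train` are hypothetical.

```python
import math
import random

# Hypothetical reward table: the article names balancing and stepping as
# rewarded behaviors and falling as a punished one; the numbers are made up.
REWARDS = {"balance": 1.0, "step": 1.0, "fall": -1.0}


def softmax(prefs):
    """Turn preference scores into action probabilities."""
    top = max(prefs.values())
    exps = {a: math.exp(p - top) for a, p in prefs.items()}
    total = sum(exps.values())
    return {a: e / total for a, e in exps.items()}


def train(trials=2000, learning_rate=0.1, seed=0):
    """Run a gradient-bandit loop and return the final action probabilities."""
    rng = random.Random(seed)
    prefs = {action: 0.0 for action in REWARDS}  # start from a clean slate
    for _ in range(trials):
        probs = softmax(prefs)
        actions = list(probs)
        # Trial: pick a behavior at random, weighted by current preferences.
        chosen = rng.choices(actions, weights=[probs[a] for a in actions])[0]
        reward = REWARDS[chosen]
        for action in actions:
            # Rewarded choices become more likely, punished ones less likely.
            indicator = 1.0 if action == chosen else 0.0
            prefs[action] += learning_rate * reward * (indicator - probs[action])
    return softmax(prefs)


final_probs = train()
print(final_probs)
```

All three behaviors start out equally likely. Over many trials the probability of "fall" shrinks, which mirrors, in a very reduced form, how repeated rewards and punishments shape the behavior of the network.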

After this initial training, Digit was able to walk reliably over various terrains without falling and to cope with external disturbances, even when carrying various loads or being pushed. In their latest work, the researchers then used this software to generate training data in a simulator. Combined with video recordings of people walking and sensor data from the robot, this gave the researchers enough data to train a new transformer model that now learns to walk independently.

Word by word

A transformer model trained in this way works much like a language model: given a series of movements, it predicts the next plausible movement in the form of an action code, which is then executed, and so on. This worked not only in simulation but also in various experiments in the city. The robot also developed new behaviors: for example, it was able to walk backwards without stumbling, even though it had not been explicitly trained to do so. “The result shows a promising way to learn additional and complex skills,” the researchers wrote. Next, they want to train the robot to overcome obstacles and teach it complex grasping movements with multi-fingered hands.
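
The predict-execute-repeat cycle is the same autoregressive loop that a language model uses to produce text one token at a time. The Python sketch below shows only the loop itself. It assumes that action codes are small integers and replaces the transformer with a trivial stand-in function; `rollout`, `toy_gait_model` and the `context` parameter are hypothetical names, not taken from the paper.

```python
from typing import Callable, List, Sequence

Action = int  # hypothetical discrete "action code"


def rollout(predict_next: Callable[[Sequence[Action]], Action],
            history: Sequence[Action],
            steps: int,
            context: int = 4) -> List[Action]:
    """Autoregressively extend a movement history, one action at a time."""
    trajectory = list(history)
    for _ in range(steps):
        window = trajectory[-context:]      # the model only sees recent moves
        next_action = predict_next(window)  # predict the next action code
        trajectory.append(next_action)      # "execute" it and feed it back in
    return trajectory


def toy_gait_model(window: Sequence[Action]) -> Action:
    """Stand-in for the transformer: cycles through four gait phases."""
    return (window[-1] + 1) % 4


print(rollout(toy_gait_model, history=[0, 1], steps=6))
# prints [0, 1, 2, 3, 0, 1, 2, 3]
```

In the real system the stand-in function would be the trained transformer, and the history would presumably contain sensor readings as well as past actions. The structure of the loop, in which each prediction becomes part of the input for the next one, stays the same.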

Victor Booth
