Teaching Robots to Perform Tasks Like Humans

By Greg Hardesty | March 14, 2023

USC computer science study addresses key challenge of teaching machines to perform tasks in realistic environments

Can language models reason in a real-world setting? USC researchers explored this question in a recent paper published at AAAI. Photo/iStock.

Your coffee has gone cold. 

You pick up your cup, place it in the microwave, and zap it. 

Easy-peasy for humans. 

For a robot, however, the task is not easy – even if it has been “taught” by language models (LMs) where the coffee, the cup and the microwave are.

The key problem? 

The inability of machines to apply common sense and to reason about and plan tasks in a realistic environment.

“Bill” Yuchen Lin, who earned his Ph.D. in computer science in 2022 under the mentorship of Assistant Professor Xiang Ren, is co-first author of a new research paper that examines this still open-ended problem. 

Any definitive answer is three to five years away, predicts Lin, who describes his paper as a pilot study in connecting LMs with realistic environments for grounded planning of embodied tasks. It’s part of his overall research goal to teach machines to think, talk and act as humans do. 

“There are a lot of challenges to building a useful robot that can complete household tasks fully independently,” says Lin, who presented his paper, “On Grounded Planning for Embodied Tasks with Language Models,” at the 37th AAAI Conference on Artificial Intelligence in Washington, D.C., Feb. 7-14, 2023.

“For example, it needs physical hardware that can hold a cup of water and open the microwave,” says Lin. “And it needs a high-precision tracking system – say, cameras or radar sensors that record the coordinates of objects and room layout changes.”

Lin’s study, which essentially addresses a language-generation problem, is more about the minds of robots – a critical step in determining if, relatively soon, robots will be able to plan and perform tasks in real life. 

Step by step 

But first, researchers need to bridge the gap between natural language processing and robotics by designing and testing agents that can translate language instructions into sequences of actions, such as “move the teapot from the stove to the shelf.” A few prior studies have examined the planning ability of language models, but most of them were not grounded in realistic environments. Lin’s study addresses this. 

“Since LMs cannot interact with the world like humans using sight or touch, it is hard for them to perceive a realistic environment,” says Lin. “This research tried to find out whether LMs can create executable plans for embodied tasks.”

He and his fellow researchers gave an LM a task to complete: moving a teapot from a stove to a shelf. They fed the LM a table – think Google spreadsheet – in which each row represented an object (e.g., a teapot, a stove, a shelf) along with its coordinates, size and color.

Because LMs are designed to process text as a linear sequence of words, the table was flattened into text row by row, with each row rendered as a sentence and the whole table as a paragraph.
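To make the setup concrete, here is a minimal sketch of how such a table of objects might be flattened into text an LM can read. The field names, coordinate values and sentence template below are illustrative assumptions, not the exact format used in the paper.

```python
# Illustrative sketch: flattening an environment table into text for an LM.
# The object fields and the sentence template are assumptions for illustration.

objects = [
    {"name": "teapot", "x": 0.4, "y": 1.2, "z": 0.9, "size": "small", "color": "white"},
    {"name": "stove",  "x": 0.4, "y": 1.2, "z": 0.8, "size": "large", "color": "black"},
    {"name": "shelf",  "x": 2.0, "y": 0.3, "z": 1.5, "size": "medium", "color": "brown"},
]

def linearize(objects):
    """Turn each table row into a sentence and join the rows into one paragraph."""
    sentences = []
    for obj in objects:
        sentences.append(
            f"The {obj['color']} {obj['name']} is {obj['size']} and located at "
            f"({obj['x']}, {obj['y']}, {obj['z']})."
        )
    return " ".join(sentences)

prompt = linearize(objects) + " Task: move the teapot from the stove to the shelf."
print(prompt)
```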

Lin and his colleagues then trained the LM to create subgoals step by step, based on the previous steps, instead of using the common training method of completing all steps at once. 

“This is a relatively simple change in the LM training, but it largely improves the performance,” Lin says of the iterative process, in which each new step builds on the steps generated before it (much as the latest version of a piece of software improves on previous versions).

“It’s easier for the LM to learn the relationship between two consecutive steps in a plan, thus improving its planning ability,” he adds. “This is like our human thinking process. When coming up with a plan, we think through each step carefully by repeatedly looking back at the steps we have already taken.”
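As a rough illustration of the idea, the sketch below generates a plan one subgoal at a time, feeding the previously generated steps back into the prompt before producing the next one. The prompt format and the toy stand-in for the language model are assumptions made for illustration; this is not the paper’s actual training setup.

```python
# Illustrative sketch of step-by-step subgoal generation. `toy_generate` stands
# in for a real language-model call; its canned outputs and the prompt format
# are assumptions for illustration only.

CANNED_STEPS = [
    "Move to the stove.",
    "Grasp the teapot.",
    "Carry the teapot to the shelf.",
    "Place the teapot on the shelf.",
    "done",
]
_canned = iter(CANNED_STEPS)

def toy_generate(prompt: str) -> str:
    """Stand-in for an LM call: returns the next canned subgoal."""
    return next(_canned, "done")

def plan_step_by_step(environment_text: str, task: str, max_steps: int = 10):
    """Generate a plan one subgoal at a time, conditioning each new step
    on the environment, the task and the steps generated so far."""
    steps = []
    while len(steps) < max_steps:
        prompt = (
            environment_text
            + f" Task: {task}."
            + " Steps so far: " + " ".join(steps)
            + " Next step:"
        )
        next_step = toy_generate(prompt)
        if next_step.strip().lower() == "done":
            break
        steps.append(next_step)
    return steps

plan = plan_step_by_step("The white teapot is on the black stove.",
                         "move the teapot from the stove to the shelf")
print(plan)
```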

One challenge of such an approach is that the computational cost can become large when there are many steps to generate. However, Lin says, the approach points to a promising direction for future research.

Significant implications 

The ability of robots to perform everyday tasks more independently could have significant implications for various industries and individual households, Lin says. 

One is increased efficiency in manufacturing. Robots that can plan their own tasks could streamline production, with less human intervention needed, and reduce the overall time required to make a given product.

For elderly individuals, or people with disabilities who struggle with everyday tasks, having a robot that can independently plan and perform these tasks could greatly improve their quality of life, Lin says. 

And in certain hazardous industries, such as construction or mining, robots that can plan their own tasks could help to reduce the risk of accidents for human workers, he adds. 

“It’s important to note that the widespread adoption of robots that can plan everyday tasks would likely require significant advancements in artificial intelligence, robotics, and human-robot interaction,” says Lin, who in February 2023 joined the Mosaic team at the Allen Institute for AI (AI2), a non-profit focused on AI research and engineering intended to benefit the common good.

“However,” he adds, “the potential benefits of such technology are significant, and as technology continues to advance, it’s likely that we will see increased use of robots in a variety of industries and settings.” 
