ADELPHI, Md. -- Army researchers developed ground-breaking technology that will enhance how Soldiers and robots communicate and carry out tasks in tactical environments.
The research sets out to develop a natural language understanding, or NLU, pipeline for robots that can be easily ported to any computational system or agent and that incrementally tames the variation seen in natural language, said Army researcher Dr. Claire Bonial of the U.S. Army Combat Capabilities Development Command, known as DEVCOM, Army Research Laboratory.
This means that regardless of how a Soldier chooses to express himself or herself to the robot, the underlying intent of that language is understood and can be acted on, given both the current conversational and environmental or situational context.
To do this, the NLU pipeline first automatically parses the input language into Abstract Meaning Representation, or AMR, which captures the basic meaning of the content of the language, Bonial said. It then converts and augments the AMR into Dialogue-AMR, which captures additional elements of meaning needed for two-way human-robot dialogue in particular, such as what the person is trying to do with the utterance in the conversational context: for example, give a command, ask a question or state a fact about the environment.
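The two stages described above can be pictured with a small Python sketch. This is only an illustration under heavy assumptions, not the researchers' implementation: the class names, the verb list and the rule table are all hypothetical, and the first stage is a crude stand-in for a trained AMR parser. It shows only the shape of the idea, which is to extract the content first and then wrap it with a label for what the speaker is trying to do.

```python
from dataclasses import dataclass

# Hypothetical verb list used only by this toy example.
IMPERATIVE_VERBS = {"move", "turn", "go", "stop", "take"}

# Hypothetical hand-written rules mapping sentence mode to a speech act.
SPEECH_ACT_RULES = {
    "imperative": "command",
    "interrogative": "question",
    "declarative": "assertion",
}


@dataclass
class ToyAMR:
    """Stand-in for an AMR graph: a head word, the remaining words, and a mode."""
    head_word: str
    rest: str
    mode: str


@dataclass
class ToyDialogueAMR:
    """Stand-in for Dialogue-AMR: a speech act wrapped around the content."""
    speech_act: str
    content: ToyAMR


def parse_to_amr(utterance: str) -> ToyAMR:
    """Stage 1 (assumed): a real system would call a trained AMR parser here."""
    text = utterance.strip()
    words = text.rstrip("?.!").lower().split()
    if not words:
        raise ValueError("empty utterance")
    if text.endswith("?"):
        mode = "interrogative"
    elif words[0] in IMPERATIVE_VERBS:
        mode = "imperative"
    else:
        mode = "declarative"
    return ToyAMR(head_word=words[0], rest=" ".join(words[1:]), mode=mode)


def to_dialogue_amr(amr: ToyAMR) -> ToyDialogueAMR:
    """Stage 2 (assumed): augment the content with the speaker's intent."""
    return ToyDialogueAMR(speech_act=SPEECH_ACT_RULES[amr.mode], content=amr)


def understand(utterance: str) -> ToyDialogueAMR:
    """Run both stages in order: utterance -> toy AMR -> toy Dialogue-AMR."""
    return to_dialogue_amr(parse_to_amr(utterance))


if __name__ == "__main__":
    for line in ["Move forward three feet.",
                 "Is there a door ahead?",
                 "There is a door on the left."]:
        print(line, "->", understand(line).speech_act)
```

Keeping the two stages as separate functions mirrors the description above: the content parser could be swapped out without touching the rules that assign the conversational intent.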
This research was presented at the 14th International Conference on Computational Semantics, or IWCS 2021, where it received the Outstanding Paper Award.
The award citation noted that “The authors are not afraid of using old-fashioned hand-written rules when they do the job, something which is lacking in much current work in NLP,” and that “Anyone who wants to work on human-robot dialogue will want to see this first go at parsing in this new domain.”
“This award was incredibly gratifying for several reasons,” Bonial said. “First, this paper represents research efforts that were planted and have been growing since I was a doctoral student. I was part of the first group of researchers establishing what has become one of the most widely used semantic representations in natural language processing, AMR.”
Bonial started work with this group in 2010, and has been actively involved in refining and extending the representation ever since. Thus, this paper represents a body of research spanning over a decade for Bonial.
This includes work to represent language expressed through semi-idiomatic constructions, for example “the higher you fly, the harder you’ll fall!” Most recently, it includes extending AMR so that it can better capture two-way dialogue, specifically task-oriented, situated dialogue between people and robots, in an augmented version of the representation called Dialogue-AMR, she said.
Efforts to develop Dialogue-AMR started in 2018 with dialogue expert Dr. David Traum as part of the University Affiliated Research Center previously established between DEVCOM ARL and the Institute for Creative Technologies at the University of Southern California.
Dialogue-AMR draws upon ARL and ICT’s Bot Language research collaboration focused on robot dialogue systems, which was initiated by one of the IWCS paper’s co-authors, Dr. Clare Voss, in 2012.
Second, Bonial said, this paper reports on a critical step in the research trajectory: the researchers’ experiments to robustly and critically evaluate Dialogue-AMR as a computational semantic representation, as well as the NLU pipeline used to automatically obtain the correct Dialogue-AMR representation when given unconstrained natural language input.
These experiments were also done in collaboration with Traum of ICT, as well as an ARL student intern and recent graduate of Georgetown University, Mitchell Abrams. Abrams was recently selected as a Department of Defense Science, Mathematics, and Research for Transformation scholar, and will return to the lab after completing his fully funded doctoral program.
This evaluation is significant because the researchers evaluate in two problem domains: first, the domain of human-robot dialogue for collaborative search and navigation tasks, and second, the domain of human-human dialogue in the virtual Minecraft gaming environment, where participants collaboratively build block structures.
The Minecraft dialogue data that made this comparative evaluation possible was obtained from Dr. Martha Palmer of the University of Colorado, Boulder, and Dr. Julia Hockenmaier of the University of Illinois at Urbana-Champaign.
Evaluating in these two domains gets at a key question regarding the utility of such a representation and the NLU pipeline in general.
Bonial asked how efficiently, and with what accuracy, the pipeline and representation can be applied in one problem domain and then transferred to and refined for a new one. In other words, how feasible is this approach for communicating with a robot when that robot needs to tackle new problems and environments?
Further, their evaluation demonstrates that with just a small amount of additional training data (several hours’ worth of annotation work, amounting to about 200 additional training instances for the machine learning elements of the pipeline), the NLU pipeline obtains promising performance in the second domain, human-human dialogue in Minecraft, she said.
The researchers noted that while the performance is somewhat lower than on the original human-robot dialogue domain, it is comparable to or better than the performance of other automatic semantic parsers for other semantic representations and language domains.
“These promising results demonstrate that this NLU pipeline leverages a valid approach for communicating with robots in collaborative tasks across multiple domains,” Bonial said. “I am very excited for the promise of this research to provide transformational overmatch in our computational systems’ ability to understand the underlying meaning of Soldiers’ instructions, despite differences in their dialects, accents and other types of noise sure to arise in Army-relevant applications.”
As a next step, the research team is connecting the output semantic representation with a system that grounds the pieces of the representation to both the entities in the environment and the executable behaviors of the robot, in joint work with Dr. Thomas Howard of the University of Rochester.
“We are optimistic that the deeper semantic representation will provide the structure needed for superior grounding of the language in both the conversational and physical environment such that robots can communicate and act more as teammates to Soldiers, as opposed to tools,” Bonial said.