Human-Robot Dialogue Annotation for Multi-Modal Common Ground

Computing Research Repository (CoRR), 2024

DEVCOM Army Research Laboratory

Abstract
In this paper, we describe the development of symbolic representations annotated on human-robot dialogue data to make dimensions of meaning accessible to autonomous systems participating in collaborative, natural language dialogue, and to enable common ground with human partners. A particular challenge for establishing common ground arises in remote dialogue (occurring in disaster relief or search-and-rescue tasks), where a human and robot are engaged in a joint navigation and exploration task of an unfamiliar environment, but where the robot cannot immediately share high quality visual information due to limited communication constraints. Engaging in a dialogue provides an effective way to communicate, while on-demand or lower-quality visual information can be supplemented for establishing common ground. Within this paradigm, we capture propositional semantics and the illocutionary force of a single utterance within the dialogue through our Dialogue-AMR annotation, an augmentation of Abstract Meaning Representation. We then capture patterns in how different utterances within and across speaker floors relate to one another in our development of a multi-floor Dialogue Structure annotation schema. Finally, we begin to annotate and analyze the ways in which the visual modalities provide contextual information to the dialogue for overcoming disparities in the collaborators' understanding of the environment. We conclude by discussing the use-cases, architectures, and systems we have implemented from our annotations that enable physical robots to autonomously engage with humans in bi-directional dialogue and navigation.
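As an illustration of the kind of structure Dialogue-AMR adds on top of Abstract Meaning Representation, the sketch below wraps the propositional content of a simple instruction in a speech-act frame carrying illocutionary force. The frame and role names here (`command-SA`, the `:ARG` assignments) are illustrative assumptions for exposition, not the paper's exact schema, and the PENMAN-style linearizer is a minimal stand-in for real AMR tooling.

```python
# Illustrative sketch (NOT the authors' exact schema): a Dialogue-AMR-style
# graph wraps the content AMR of "move" in a speech-act frame recording
# illocutionary force -- here, a command from a human to a robot.

def to_penman(node, indent=0):
    """Linearize a (variable, concept, roles) triple into PENMAN-like notation."""
    var, concept, roles = node
    pad = " " * indent
    parts = [f"({var} / {concept}"]
    for role, value in roles:
        if isinstance(value, tuple):
            # Nested node: recurse with deeper indentation.
            parts.append(f"\n{pad}    {role} {to_penman(value, indent + 4)}")
        else:
            parts.append(f"\n{pad}    {role} {value}")
    return "".join(parts) + ")"

# Content AMR for the instruction itself.
content = ("m", "move-01", [(":ARG0", ("r", "robot", []))])

# Hypothetical speech-act wrapper adding illocutionary force and participants.
dialogue_amr = ("c", "command-SA", [
    (":ARG0", ("h", "human", [])),   # speaker issuing the command
    (":ARG1", content),              # propositional content
    (":ARG2", ("r2", "robot", [])),  # addressee expected to act
])

print(to_penman(dialogue_amr))
```

The key design point the sketch reflects is that the same content AMR can be re-wrapped in different speech-act frames (command, question, assertion), so the dialogue system can separate *what* was said from *what the speaker was doing* in saying it.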
Key words
Situated dialogue, Semantics, Multi-floor dialogue, Multi-modal dialogue
Chat Paper

Key points: This paper proposes annotating symbolic representations on human-robot dialogue data so that autonomous systems participating in collaborative, natural-language dialogue can access dimensions of meaning and establish common ground with human partners, particularly in remote dialogue scenarios such as disaster relief or search-and-rescue tasks.

Methods: The study captures the propositional semantics and illocutionary force of individual utterances through Dialogue-AMR annotation, and captures how utterances within and across speaker floors relate to one another through a multi-floor dialogue structure annotation schema.

Experiments: The work annotates and analyzes how visual modalities provide contextual information to the dialogue to overcome disparities in the collaborators' understanding of the environment; no specific dataset name is given. The insights obtained from the annotations are applied in use cases, architectures, and systems that enable physical robots to engage autonomously with humans in bi-directional dialogue and navigation.