Deep Graph Random Process for Relational-Thinking-Based Speech Recognition

Part of the Proceedings of the International Conference on Machine Learning (ICML 2020) pre-proceedings




Huang Hengguan, Fuzhao Xue, Hao Wang, Ye Wang


<p>Both relational thinking and relational reasoning lie at the core of human intelligence. While relational reasoning has inspired many advances in artificial intelligence, relational thinking remains relatively unexplored in machine learning. It is characterized by an initial reliance on innumerable unconscious percepts pertaining to relations between new sensory signals and prior knowledge; through the coupling of these percepts, a recognizable concept or object subsequently emerges. Such mental processes are difficult to model in real-world problems such as conversational automatic speech recognition (ASR), because the percepts (e.g., unconscious mental impressions formed while hearing sounds) are assumed to be innumerable and are not directly observable. Yet the dialogue history of a conversation may still reflect these underlying processes, allowing them to be modeled indirectly. We present a framework that models a percept as a weak relation between the current utterance and its history, with the probability that such a relation exists assumed to be close to zero, reflecting the unconsciousness of the percept. Given an utterance and its history, our method can generate an infinite number of probabilistic graphs representing percepts and analytically combine them into a new graph representing strong relations among utterances. This combined graph can then be transformed into a task-specific form that provides an informative representation for acoustic modeling. Our approach successfully infers relations among utterances without using any relational data during training. Experimental evaluations on ASR tasks including CHiME-2, SWB-30k, and CHiME-5 demonstrate the effectiveness and benefits of our method.</p>
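The idea of combining many near-zero-probability percept graphs into one graph with strong relations can be illustrated with a toy simulation. The sketch below is our own illustration, not the paper's actual algorithm: it assumes each percept graph contains a given edge independently with a vanishing probability lam/T, so that summing the edge indicators over T percept graphs gives a Binomial(T, lam/T) count, which approaches a Poisson(lam) distribution as T grows. All names (`combined_edge_count`, `lam`, `T`) are hypothetical.

```python
import math
import random

def combined_edge_count(T: int, lam: float, rng: random.Random) -> int:
    """Number of percept graphs (out of T) in which a given edge appears.

    Each percept graph is weak: its edge exists with probability
    eps = lam / T, which is close to zero for large T.
    """
    eps = lam / T
    return sum(1 for _ in range(T) if rng.random() < eps)

def poisson_pmf(k: int, lam: float) -> float:
    """Poisson(lam) probability of observing k edges in the combined graph."""
    return math.exp(-lam) * lam**k / math.factorial(k)

rng = random.Random(0)
T, lam, trials = 1_000, 2.0, 2_000

# Combine T weak percept graphs many times and look at the resulting
# edge counts: individually negligible relations aggregate into a
# non-trivial (approximately Poisson-distributed) strong relation.
counts = [combined_edge_count(T, lam, rng) for _ in range(trials)]
empirical_mean = sum(counts) / trials  # should be close to lam
```

In this toy setting the combined graph's edge strength is summarized by the Poisson rate lam, mirroring how the abstract describes infinitely many weak relations being combined analytically rather than enumerated one by one.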