- Topic Modeling
- Natural Language Processing Techniques
- Speech and dialogue systems
- Recommender Systems and Techniques
- Visual Attention and Saliency Detection
- Text and Document Classification Technologies
- Visual perception and processing mechanisms
- Multimodal Machine Learning Applications
- Sentiment Analysis and Opinion Mining
- Machine Learning and Data Classification
- Expert finding and Q&A systems
- Information Retrieval and Search Behavior
- Machine Learning in Healthcare
- Mobile Crowdsensing and Crowdsourcing
- Speech Recognition and Synthesis
- Web Data Mining and Analysis
- Domain Adaptation and Few-Shot Learning
- Gaze Tracking and Assistive Technology
- Text Readability and Simplification
- Advanced Image and Video Retrieval Techniques
- Semantic Web and Ontologies
- AI in Service Interactions
- Multi-Agent Systems and Negotiation
- Human Mobility and Location-Based Analysis
- Image and Signal Denoising Methods
Amazon (United States)
2020-2024
Oregon State University
2022
University of Virginia
2022
Amazon (Germany)
2019-2022
LinkedIn (United States)
2022
Harbin Institute of Technology
2021
Carnegie Mellon University
2009-2010
Fudan University
2006-2009
Salient areas in natural scenes are generally regarded as areas on which the human eye will typically focus, and finding these areas is a key step in object detection. In computer vision, many models have been proposed to simulate the behavior of eyes, such as SaliencyToolBox (STB), Neuromorphic Vision Toolkit (NVT), and others, but they demand high computational cost, and computing useful results mostly relies on their choice of parameters. Although some region-based approaches were proposed to reduce the complexity of feature maps, they were still not...
Salient areas in natural scenes are generally regarded as the candidates of attention focus in human eyes, which is a key stage in object detection. In computer vision, many models have been proposed to simulate the behavior of eyes, such as SaliencyToolBox (STB), Neuromorphic Vision Toolkit (NVT), etc., but they demand high computational cost and their remarkable results mostly rely on the choice of parameters. Recently a simple and fast approach based on the Fourier transform, called spectral residual (SR), was proposed, which used SR...
Knowledge distillation is typically conducted by training a small model (the student) to mimic a large and cumbersome model (the teacher). The idea is to compress the knowledge from the teacher by using its output probabilities as soft-labels to optimize the student. However, when the teacher is considerably large, there is no guarantee that the internal knowledge of the teacher will be transferred into the student; even if the student closely matches the soft-labels, its internal representations may be different. This mismatch can undermine the generalization capabilities originally intended to be transferred from the teacher to the student. In...
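The soft-label objective described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names `softmax` and `soft_label_loss` and the `temperature` parameter are assumptions introduced here, and the loss only compares output distributions, which is exactly why it gives no guarantee about the student's internal representations.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities; temperature is an assumed smoothing knob."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_label_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy of the student's distribution against the teacher's soft labels."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))


# Hypothetical logits for a three-class problem.
teacher = [2.0, 1.0, 0.0]
matching_student = [2.0, 1.0, 0.0]
mismatched_student = [0.0, 1.0, 2.0]

# The loss is lowest when the student reproduces the teacher's output distribution.
print(soft_label_loss(matching_student, teacher))
print(soft_label_loss(mismatched_student, teacher))
```

Two students with very different hidden layers can reach the same value of this loss, since only their outputs enter the computation.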
Dingcheng Li, Zheng Chen, Eunah Cho, Jie Hao, Xiaohu Liu, Fan Xing, Chenlei Guo, Yang Liu. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2022.
Today, most of the large-scale conversational AI agents such as Alexa, Siri, or Google Assistant are built using manually annotated data to train the different components of the system, including Automatic Speech Recognition (ASR), Natural Language Understanding (NLU) and Entity Resolution (ER). Typically, the accuracy of the machine learning models in these components is improved by manually transcribing and annotating data. As the scope of these systems increases to cover more scenarios and domains, manual annotation to improve the accuracy of these components becomes prohibitively costly and time...
We present a methodology for the automatic identification and delineation of germ-layer components in H&E stained images of teratomas derived from human and nonhuman primate embryonic stem cells. A knowledge and understanding of the biology of these cells may lead to advances in tissue regeneration and repair, the treatment of genetic and developmental syndromes, and drug testing and discovery. As a teratoma is a chaotic organization of tissues from the three primary germ layers, often with multiple tissues, each having complex and unpredictable positions,...
Query Rewriting (QR) plays a critical role in large-scale dialogue systems for reducing frictions. When there is an entity error, it imposes extra challenges for a dialogue system to produce satisfactory responses. In this work, we propose KG-ECO: Knowledge Graph enhanced Entity COrrection for query rewriting, an entity correction system with corrupt entity span detection and entity retrieval/re-ranking functionalities. To boost the model performance, we incorporate a Knowledge Graph (KG) to provide entity structural information (neighboring entities encoded by graph...
Query rewrite (QR) is an emerging component in conversational AI systems, reducing user defects. User defects are caused by various reasons, such as errors in the spoken dialogue system, users' slips of the tongue, or their abridged language. Many of the user defects stem from personalized factors, such as the user's speech pattern, dialect, or preferences. In this work, we propose a search-based QR framework, which focuses on automatic reduction of user defects. We build an index for each user, which encompasses diverse affinity layers to reflect personal...
Jie Hao, Yang Liu, Xing Fan, Saurabh Gupta, Saleh Soltan, Rakesh Chada, Pradeep Natarajan, Chenlei Guo, Gokhan Tur. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track. 2022.
In this paper, we develop a robust signal space separation (rSSS) algorithm for real-time magnetoencephalography (MEG) data processing. rSSS is based on the spatial signal space separation (SSS) method, and it applies regression to automatically detect and remove bad MEG channels so that the results of SSS are not distorted. We extend the existing algorithm via three important new contributions: 1) a low-rank solver that efficiently performs matrix operations; 2) a subspace iteration scheme that selects bad MEG channels using low-order spherical harmonic functions; 3)...
Query rewriting (QR) is an increasingly important component in voice assistant systems to reduce customer friction caused by errors in a spoken language understanding pipeline. These errors originate from various sources such as Automatic Speech Recognition (ASR) and Natural Language Understanding (NLU) modules. In this work, we construct a user interaction graph from their queries using data mined by a Markov Chain Model [1], and introduce a self-supervised pre-training process for learning query embeddings by leveraging...
For voice assistants like Alexa, Google Assistant, and Siri, correctly interpreting users’ intentions is of utmost importance. However, users sometimes experience friction with these assistants, caused by errors from different system components or user errors such as slips of the tongue. Users tend to rephrase their queries until they get a satisfactory response. Rephrase detection is used to identify rephrases and has long been treated as a task with pairwise input, which does not fully utilize the contextual information...
Zhongkai Sun, Yingxue Zhou, Jie Hao, Xing Fan, Yanbin Lu, Chengyuan Ma, Wei Shen, Chenlei Guo. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track. 2023.
Spoken language understanding (SLU) systems in conversational AI agents often experience errors in the form of misrecognitions by automatic speech recognition (ASR) or semantic gaps in natural language understanding (NLU). These errors easily translate to user frustrations, particularly so in recurrent events, e.g. regularly toggling an appliance, calling a frequent contact, etc. In this work, we propose a query rewriting approach by leveraging users' historically successful interactions as memory. We present a neural retrieval model and...
In this paper, an attention selection model with visual memory and online learning is proposed, which has three parts: Sensory Mapping (SM), Cognitive Mapping (CM) and Motor Mapping (MM). CM is the novelty of our model, and it incorporates visual memory and online learning. In order to mimic visual memory, we put forward an Amnesic Incremental Hierarchical Discriminant Regression (AIHDR) Tree with an amnesic function to guide the deletion of redundant information in the tree. Experimental results show that the AIHDR tree has better performance in retrieval speed and accuracy than IHDR/HDR...
Subword tokenization is a commonly used input pre-processing step in most recent NLP models. However, it limits the models’ ability to leverage end-to-end task learning. Its frequency-based vocabulary creation compromises low-resource languages, leading models to produce suboptimal representations. Additionally, the dependency on a fixed vocabulary limits the subword models’ adaptability across languages and domains. In this work, we propose a vocabulary-free neural tokenizer by distilling segmentation information from...
Niranjan Uma Naresh, Ziyan Jiang, Ankit Ankit, Sungjin Lee, Jie Hao, Xing Fan, Chenlei Guo. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track. 2022.
Text Style Transfer (TST) aims to alter the underlying style of the source text to another specific style while keeping the same content. Due to the scarcity of high-quality parallel training data, unsupervised learning has become a trending direction for TST tasks. In this paper, we propose a novel VAE based Text Style Transfer with pivOt Words Enhancement leaRning (VT-STOWER) method, which utilizes a Variational AutoEncoder (VAE) and external style embeddings to learn semantics and style distribution jointly. Additionally, we introduce pivot words learning, which is...
Query rewriting (QR) is an increasingly important technique to reduce customer friction caused by errors in a spoken language understanding pipeline, where the errors originate from various sources such as speech recognition errors or entity resolution errors. In this work, we first propose a neural-retrieval based approach for query rewriting. Then, inspired by the wide success of pre-trained contextual embeddings, and also as a way to compensate for insufficient QR training data, we propose a language-modeling (LM) approach to pre-train...