- Topic Modeling
- Natural Language Processing Techniques
- Multimodal Machine Learning Applications
- Text Readability and Simplification
- Anomaly Detection Techniques and Applications
- Video Surveillance and Tracking Methods
- Advanced Text Analysis Techniques
- Text and Document Classification Technologies
- Human Pose and Action Recognition
- Speech and Dialogue Systems
- Domain Adaptation and Few-Shot Learning
- Image Enhancement Techniques
- Network Security and Intrusion Detection
- Speech Recognition and Synthesis
- Artificial Immune Systems Applications
- Physical Education and Training Studies
- Educational Technology and Pedagogy
- Image Retrieval and Classification Techniques
- Sports Analytics and Performance
- Sports and Physical Education Research
- American Sports and Literature
- Sentiment Analysis and Opinion Mining
- Sports, Gender, and Society
- Advanced Image and Video Retrieval Techniques
- Advanced Vision and Imaging
Xi'an Jiaotong University
2021-2023
Badan Penelitian dan Pengembangan Kesehatan
2023
Toyota Technological Institute at Chicago
2018-2022
Meta (United States)
2021
University of Washington
2021
Meta (Israel)
2021
Northwest A&F University
2019-2020
Peking University
2016-2018
Chongqing University of Arts and Sciences
2009
We present the Visually Grounded Neural Syntax Learner (VG-NSL), an approach for learning syntactic representations and structures without any explicit supervision. The model learns by looking at natural images and reading paired captions. VG-NSL generates constituency parse trees of texts, recursively composes representations for constituents, and matches them with images. We define the concreteness of constituents by their matching scores with images, and use it to guide the parsing of text. Experiments on the MSCOCO data set show that our approach outperforms...
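The concreteness-guided parsing idea can be sketched as a greedy bottom-up composition: repeatedly merge the adjacent pair of constituents with the highest score. This is a minimal illustration, not the authors' implementation; the scoring function and all names here are hypothetical.

```python
def greedy_parse(tokens, score):
    """Greedily build a binary tree: at each step, merge the adjacent
    pair of constituents whose score (e.g., a concreteness/image-matching
    score) is highest."""
    nodes = list(tokens)
    while len(nodes) > 1:
        # index of the best adjacent pair
        i = max(range(len(nodes) - 1), key=lambda i: score(nodes[i], nodes[i + 1]))
        nodes[i:i + 2] = [(nodes[i], nodes[i + 1])]  # merge into one constituent
    return nodes[0]

def toy_score(left, right):
    """Hypothetical scores standing in for learned concreteness scores."""
    table = {("red", "apple"): 0.9, ("a", ("red", "apple")): 0.8}
    return table.get((left, right), 0.1)
```

With the toy scores above, `greedy_parse(["a", "red", "apple"], toy_score)` first merges "red apple" and then attaches the determiner, mirroring how high-concreteness phrases are composed first.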
Neural networks with tree-based sentence encoders have shown better results on many downstream tasks. Most existing work adopts syntactic parsing trees as the explicit structure prior. To study the effectiveness of different tree structures, we replace the parsing trees with trivial trees (i.e., binary balanced tree, left-branching tree, and right-branching tree) in the encoders. Though trivial trees contain no syntactic information, those encoders get competitive or even better results on all ten tasks we investigated. This surprising result indicates that explicit syntax guidance may not be the main...
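The three trivial tree structures named above are purely positional and can be constructed without any parser. A minimal sketch (function names are my own, not from the paper):

```python
def left_branching(tokens):
    """Left-branching binary tree: ((((w1 w2) w3) w4) ...)."""
    tree = tokens[0]
    for tok in tokens[1:]:
        tree = (tree, tok)
    return tree

def right_branching(tokens):
    """Right-branching binary tree: (w1 (w2 (w3 (w4 ...))))."""
    tree = tokens[-1]
    for tok in reversed(tokens[:-1]):
        tree = (tok, tree)
    return tree

def balanced(tokens):
    """Binary balanced tree: recursively split each span in half."""
    if len(tokens) == 1:
        return tokens[0]
    mid = len(tokens) // 2
    return (balanced(tokens[:mid]), balanced(tokens[mid:]))
```

A tree-based encoder can then compose constituent representations bottom-up over any of these trees, which is how the paper swaps the structure prior while keeping the encoder fixed.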
Many natural language processing (NLP) tasks involve reasoning with textual spans, including question answering, entity recognition, and coreference resolution. While extensive research has focused on functional architectures for representing words and sentences, there is less work on representing arbitrary spans of text within sentences. In this paper, we conduct a comprehensive empirical evaluation of six span representation methods using eight pretrained representation models across a range of tasks, including two that we introduce. We find that,...
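Typical span representation methods pool the token vectors inside the span. A sketch of three common choices (mean pooling, endpoint concatenation, element-wise max) over plain Python lists; the specific six methods the paper evaluates are not reproduced here, and these names are illustrative:

```python
def span_mean(vecs, i, j):
    """Average of token vectors in the half-open span [i, j)."""
    span = vecs[i:j]
    dim = len(span[0])
    return [sum(v[d] for v in span) / len(span) for d in range(dim)]

def span_endpoint(vecs, i, j):
    """Concatenation of the first and last token vectors of the span."""
    return list(vecs[i]) + list(vecs[j - 1])

def span_max(vecs, i, j):
    """Element-wise max over token vectors in the span."""
    span = vecs[i:j]
    return [max(v[d] for v in span) for d in range(len(span[0]))]
```

In practice `vecs` would be the hidden states of a pretrained model; here any list of equal-length vectors works.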
We study a family of data augmentation methods, substructure substitution (SUB2), that generalizes prior methods. SUB2 generates new examples by substituting substructures (e.g., subtrees or subsequences) with others having the same label. This idea can be applied to many structured NLP tasks such as part-of-speech tagging and parsing. For more general tasks (e.g., text classification) which do not have explicitly annotated substructures, we present variations of SUB2 based on spans of parse trees, introducing...
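The same-label substitution idea can be sketched for the simplest structured case, part-of-speech tagging: pick a token in one example and replace it with a token carrying the same tag from another example. This is a minimal single-token illustration (names are hypothetical), not the paper's full subtree-substitution machinery.

```python
import random

def sub2_pos(example, pool, rng):
    """SUB2-style augmentation sketch for POS tagging: replace one
    (word, tag) token with a word that carries the same tag elsewhere
    in the training pool, so the label structure is preserved."""
    idx = rng.randrange(len(example))
    _, tag = example[idx]
    candidates = [(w, t) for ex in pool for (w, t) in ex if t == tag]
    if not candidates:
        return list(example)  # no same-label substructure available
    word, _ = rng.choice(candidates)
    augmented = list(example)
    augmented[idx] = (word, tag)
    return augmented
```

Because the substituted substructure shares the original's label, the augmented example keeps a valid tag sequence by construction.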
Data augmentation is an important component in the robustness evaluation of natural language processing (NLP) models and in enhancing the diversity of the data they are trained on. In this paper, we present NL-Augmenter, a new participatory Python-based framework which supports the creation of both transformations (modifications to the data) and filters (data splits according to specific features). We describe an initial set of 117 transformations and 23 filters for a variety of tasks. We demonstrate the efficacy of NL-Augmenter by using several of its transformations to analyze popular...
Data augmentation is an important method for evaluating the robustness of, and enhancing the diversity of training data for, natural language processing (NLP) models. In this paper, we present NL-Augmenter, a new participatory Python-based natural language (NL) augmentation framework which supports the creation of transformations (modifications to the data) and filters (data splits according to specific features). We describe an initial set of 117 transformations and 23 filters for a variety of NL tasks, annotated with noisy descriptive tags. The transformations incorporate noise, intentional and accidental human...
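The transformation/filter distinction can be illustrated with a toy sketch. This is not the real NL-Augmenter API — the class and method names below are hypothetical — it only shows the concept: a transformation rewrites an example, while a filter selects a data split by a surface feature.

```python
class Transformation:
    """Hypothetical minimal interface: rewrite an input into one or
    more perturbed variants."""
    def generate(self, text):
        raise NotImplementedError

class LowercaseTransformation(Transformation):
    """Toy transformation: lowercase the whole input."""
    def generate(self, text):
        return [text.lower()]

class LengthFilter:
    """Toy filter: keep only examples with at most max_words words,
    yielding a data split defined by a surface feature."""
    def __init__(self, max_words):
        self.max_words = max_words

    def apply(self, text):
        return len(text.split()) <= self.max_words
```

A robustness evaluation would run every example of a test set through a transformation and compare model accuracy before and after, or report accuracy only on the subset a filter selects.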
Haoyue Shi, Luke Zettlemoyer, Sida I. Wang. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021.
We study the problem of grounding distributional representations of texts in the visual domain, namely visual-semantic embeddings (VSE for short). Beginning with an insightful adversarial attack on VSE embeddings, we show the limitations of current frameworks and image-text datasets (e.g., MS-COCO) both quantitatively and qualitatively. The large gap between the number of possible constitutions of real-world semantics and the size of parallel data, to a large extent, restricts the model from establishing links between textual and visual concepts. We alleviate this by augmenting...
We analyze several recent unsupervised constituency parsing models, which are tuned with respect to the parsing F1 score on the Wall Street Journal (WSJ) development set (1,700 sentences). We introduce strong baselines for them, by training an existing supervised parsing model (Kitaev and Klein, 2018) on the same labeled examples they access. When training on 1,700 examples, or even when using only 50 examples for training and 5 for development, such a few-shot parsing approach can outperform all the unsupervised parsing methods by a significant margin. Few-shot parsing can be further improved by simple data...
Cardiovascular disease (CVD) is a highly significant contributor to loss of quality and quantity of life all over the world. Early detection and prediction are very important for patients' treatment and doctors' diagnoses, and can help reduce mortality. In this paper, we focus on the practical problem of a Chinese hospital dealing with cardiovascular data to make early risk predictions. To better understand prescription advice in Chinese, a basic natural language processing method was used for synonym recognition...
Weakly-supervised Video Anomaly Detection (W-VAD) aims to detect abnormal events in videos given only video-level labels for training. Recent methods relying on multiple instance learning (MIL) and self-training achieve good performance, but they tend to focus on easy abnormal patterns while ignoring hard ones, e.g., an unusual driving trajectory or over-speeding driving. How to detect such hard anomalies is a critical and largely...
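The MIL setup mentioned above is commonly trained with a ranking objective: the highest-scoring segment of an abnormal video (positive bag) should score above the highest-scoring segment of a normal video (negative bag) by a margin. A sketch of that widely used formulation (not necessarily the exact loss of this paper):

```python
def mil_ranking_loss(abnormal_scores, normal_scores, margin=1.0):
    """Hinge ranking loss between bags of segment-level anomaly scores.

    abnormal_scores: scores of segments from a video labeled abnormal.
    normal_scores:   scores of segments from a video labeled normal.
    Only the top-scoring segment in each bag matters, since the anomaly's
    location within the abnormal video is unknown."""
    return max(0.0, margin - max(abnormal_scores) + max(normal_scores))
```

When the abnormal bag's top score clears the normal bag's top score by the margin, the loss is zero; otherwise the gradient pushes the two apart.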
We explore deep clustering of multilingual text representations for unsupervised model interpretation and induction of syntax. As these representations are high-dimensional, out-of-the-box methods like K-means do not work well. Thus, our approach jointly transforms the representations into a lower-dimensional cluster-friendly space and clusters them. We consider two notions of syntax in this work: Part of Speech Induction (POSI) and Constituency Labelling (CoLab). Interestingly, we find that Multilingual BERT (mBERT) contains surprising...
We present Grammar-Based Grounded Lexicon Learning (G2L2), a lexicalist approach toward learning a compositional and grounded meaning representation of language from grounded data, such as paired images and texts. At the core of G2L2 is a collection of lexicon entries, which map each word to a tuple of a syntactic type and a neuro-symbolic semantic program. For example, the word shiny has the syntactic type of adjective; its neuro-symbolic semantic program has the symbolic form λx. filter(x, SHINY), where the concept SHINY is associated with a neural network embedding, which will be used...
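The λx. filter(x, SHINY) program above can be sketched as follows: the symbolic part is a filter over a set of objects, and the neural part is a learned scoring of each object's embedding against the concept embedding. This is a toy illustration with a hypothetical scoring choice (dot product through a sigmoid), not G2L2's actual executor.

```python
import math

def execute_filter(objects, concept_embedding):
    """Sketch of executing λx. filter(x, CONCEPT): keep objects whose
    embedding matches the concept embedding above a 0.5 threshold."""
    def score(obj):
        dot = sum(a * b for a, b in zip(obj["embedding"], concept_embedding))
        return 1.0 / (1.0 + math.exp(-dot))  # sigmoid of similarity
    return [obj for obj in objects if score(obj) > 0.5]
```

Because the concept embedding is a learnable parameter, the same symbolic program stays fixed while its grounding improves during training.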
Previous research has shown that learning multiple representations for polysemous words can improve the performance of word embeddings on many tasks. However, this leads to another problem: several vectors of a word may actually point to the same meaning, namely pseudo multi-sense. In this paper, we introduce the concept of pseudo multi-sense, and then propose an algorithm to detect such cases. Taking the detected pseudo multi-sense cases into consideration, we try to refine existing embeddings to eliminate their influence. Moreover, we apply our algorithm to previously released multi-sense word embeddings and test...
Unsupervisedly learned representations of polysemous words generate a large number of pseudo multi-senses, since unsupervised methods are overly sensitive to contextual variations. In this paper, we address pseudo multi-sense detection for word embeddings by dimensionality reduction of sense pairs. We propose a novel principal component analysis method, termed Ex-RPCA, designed to detect both pseudo and real multi-senses. With Ex-RPCA, we empirically show that pseudo multi-senses are generated systematically by such methods. Moreover, the result can be improved by a simple linear transformation...
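A crude stand-in for the detection step — not Ex-RPCA itself — is to flag pairs of sense vectors of the same word that are nearly parallel: if two "senses" point in almost the same direction, they likely encode the same meaning. A sketch under that simplifying assumption:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def pseudo_multisense_pairs(sense_vectors, threshold=0.9):
    """Flag pairs of sense vectors (of one word) whose cosine similarity
    exceeds the threshold — a heuristic proxy for pseudo multi-sense."""
    flagged = []
    for i in range(len(sense_vectors)):
        for j in range(i + 1, len(sense_vectors)):
            if cosine(sense_vectors[i], sense_vectors[j]) > threshold:
                flagged.append((i, j))
    return flagged
```

Ex-RPCA goes further by analyzing sense pairs in a reduced space rather than thresholding raw similarities, but the input/output contract is the same: sense vectors in, suspect pairs out.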
Deep similarity tracking via two-stream or multiple-stream network architectures has drawn great attention due to its strong capability of extracting discriminative features with balanced accuracy and speed. However, these networks need careful data pairing processing and are usually difficult to update for online visual tracking. In this paper, we propose a simple and effective feature extractor for Single-Stream Deep Similarity learning Tracking, denoted SSDST. Different from the popular architecture,...