- Topic Modeling
- Natural Language Processing Techniques
- Speech and dialogue systems
- Multimodal Machine Learning Applications
- Advanced Text Analysis Techniques
- Web Data Mining and Analysis
- Semantic Web and Ontologies
- Data Management and Algorithms
- Mental Health via Writing
- Advanced Database Systems and Queries
- Speech Recognition and Synthesis
- Recommender Systems and Techniques
- Biomedical Text Mining and Ontologies
- Algorithms and Data Compression
- Advanced Graph Neural Networks
- Mental Health Research Topics
- Sentiment Analysis and Opinion Mining
- Image Retrieval and Classification Techniques
- Advanced Image and Video Retrieval Techniques
- Vehicle Routing Optimization Methods
- Domain Adaptation and Few-Shot Learning
- Video Analysis and Summarization
- Digital Mental Health Interventions
- Logic, Reasoning, and Knowledge
- Data Quality and Management
Shanghai Jiao Tong University
2014-2023
The University of Texas at Arlington
2023
Singapore University of Technology and Design
2020
Creighton University
2020
National University of Singapore
2002-2015
Princeton University
2007-2008
Microsoft (United States)
2006
University of Ottawa
1993
Knowledge is indispensable to understanding. The ongoing information explosion highlights the need to enable machines to better understand electronic text in human language. Much work has been devoted to creating universal ontologies or taxonomies for this purpose. However, none of the existing ones has the needed depth and breadth. In this paper, we present a universal, probabilistic taxonomy that is more comprehensive than any existing ones. It contains 2.7 million concepts harnessed automatically from a corpus of 1.68 billion web pages....
This paper studies the problem of automatic detection of false rumors on Sina Weibo, the popular Chinese microblogging social network. Traditional feature-based approaches extract features from the rumor message, its author, as well as the statistics of its responses to form a flat feature vector. This ignores the propagation structure of the messages and has not achieved very good results. We propose a graph-kernel based hybrid SVM classifier which captures the high-order propagation patterns in addition to semantic features such as topics and sentiments. The new...
Answering complex questions that involve multiple entities and relations using a standard knowledge base is an open and challenging task. Most existing KBQA approaches focus on simpler questions and do not work very well on complex questions because they were not able to simultaneously represent the question and the corresponding complex query structure. In this work, we encode such complex query structure into a uniform vector representation, and thus successfully capture the interactions between individual semantic components within a complex question. This approach consistently...
In this paper, we present our multi-channel neural architecture for recognizing emerging named entities in social media messages, which we applied in the Novel and Emerging Named Entity Recognition shared task at the EMNLP 2017 Workshop on Noisy User-generated Text (W-NUT). We propose a novel approach, which incorporates comprehensive word representations with multi-channel information and Conditional Random Fields (CRF) into a traditional Bidirectional Long Short-Term Memory (BiLSTM) neural network without using any additional...
We propose a framework to automatically generate descriptive comments for source code blocks. While this problem has been studied by many researchers previously, their methods are mostly based on fixed templates and achieve poor results. Our framework does not rely on any template, but makes use of a new recursive neural network called CodeRNN to extract features from the source code and embed them into one vector. When this vector representation is input to a new recurrent neural network (Code-GRU), the overall framework generates text descriptions of the code with accuracy...
Empowering chatbots in the field of mental health is receiving an increasing amount of attention, while there is still a lack of exploration in developing and evaluating chatbots in psychiatric outpatient scenarios. In this work, we focus on exploring the potential of ChatGPT in powering chatbots for psychiatrist and patient simulation. We collaborate with psychiatrists to identify objectives and iteratively develop the dialogue system to closely align with real-world scenarios. In the evaluation experiments, we recruit real psychiatrists and patients to engage in diagnostic conversations with the chatbots,...
Many location-based services, such as FourSquare, Yelp, TripAdvisor, Google Places, etc., allow users to compose reviews or tips on points of interest (POIs), each having geographical coordinates. These services have accumulated a large amount of geo-tagged review data, which allows deep analysis of user preferences in POIs. This paper studies two types of user preferences in POIs: topical-region preference and category-aware topical-aspect preference. We propose a unified probabilistic model to capture these two preferences simultaneously....
Convolutional neural networks (CNNs) have met great success in abstractive summarization, but they cannot effectively generate summaries of desired lengths. Because generated summaries are used in different scenarios which may have space or length constraints, the ability to control the summary length in abstractive summarization is an important problem. In this paper, we propose an approach to constrain the summary length by extending a convolutional sequence-to-sequence model. The results show that this approach generates high-quality summaries with user-defined length, and outperforms...
One of the ultimate goals of e-commerce platforms is to satisfy various shopping needs for their customers. Much effort is devoted to creating taxonomies or ontologies in e-commerce towards this goal. However, user needs in e-commerce are still not well defined, and none of the existing ontologies has enough depth and breadth for universal user needs understanding. The semantic gap in-between prevents the shopping experience from being more intelligent. In this paper, we propose to construct a large-scale E-commerce Cognitive Concept Net named "AliCoCo", which is practiced in Alibaba, the largest...
N6‑methyladenosine (m6A) RNA modification regulates multiple biological functions. Methyltransferase like 3 (METTL3), one of the major N6‑methyltransferases, is highly expressed in gastric cancer, but its potential role in the disease is unclear. The current study knocked out METTL3 (METTL3‑KO) in human gastric cancer AGS cells using CRISPR/Cas9. METTL3‑KO AGS cells exhibited decreased m6A methylation levels. A significant inhibition of cell proliferation was observed in METTL3‑KO AGS cells. Silencing METTL3 altered the expression profile of many effector...
An ad hoc data source is any semistructured data source for which useful data analysis and transformation tools are not readily available. Such data must be queried, transformed and displayed by systems administrators, computational biologists, financial analysts and hosts of others on a regular basis. In this paper, we demonstrate that it is possible to generate a suite of useful data processing tools, including a semi-structured query engine, several format converters, a statistical analyzer and data visualization routines directly from the ad hoc data itself,...
Computing semantic similarity between two terms is essential for a variety of text analytics and understanding applications. However, existing approaches are more suitable for semantic similarity between words rather than the more general multi-word expressions (MWEs), and they do not scale very well. Therefore, we propose a lightweight and effective approach for semantic similarity using a large-scale semantic network automatically acquired from billions of web documents. Given two terms, we map them into the concept space, and compare their similarity there. Furthermore, we introduce a clustering approach to...
This paper targets a novel but practical recommendation problem named exact-K recommendation. It is different from traditional top-K recommendation, as it focuses more on (constrained) combinatorial optimization which will optimize to recommend a whole set of K items called a card, rather than ranking optimization which assumes that "better" items should be put into top positions. Thus we take the first step to give a formal problem definition, and innovatively reduce it to Maximum Clique Optimization based on a graph. To tackle this specific...
Previous length-controllable summarization models mostly control lengths at the decoding stage, whereas the encoding or the selection of information from the source document is not sensitive to the designed length. They also tend to generate summaries as long as those in the training data. In this paper, we propose a length-aware attention mechanism (LAAM) to adapt the encoding of the source based on the desired length. Our approach works by training LAAM on a summary length balanced dataset built from the original training data, and then fine-tuning as usual. Results show that this approach is effective...
News recommendation for anonymous readers is a useful but challenging task for many news portals, where interactions between readers and articles are limited within a temporary login session. Previous works tend to formulate session-based recommendation as a next item prediction task, while they neglect the implicit feedback from user behaviors, which indicates what users really like or dislike. Hence, we propose a comprehensive framework to model user behaviors through positive feedback (i.e., the articles they spend more time on) and negative feedback (i.e., the articles they choose to skip...
Quantifying image complexity at the entity level is straightforward, but the assessment of semantic complexity has been largely overlooked. In fact, there are differences in semantic complexity across images. Images with richer semantics can tell vivid and engaging stories and offer a wide range of application scenarios. For example, the Cookie Theft picture is such a kind of image and is widely used to assess human language and cognitive abilities due to its higher semantic complexity. Additionally, semantically rich images can benefit the development of vision models, as images with limited...
Slot filling is a critical task in natural language understanding (NLU) for dialog systems. State-of-the-art approaches treat it as a sequence labeling problem and adopt such models as BiLSTM-CRF. While these models work relatively well on standard benchmark datasets, they face challenges in the context of E-commerce where the slot labels are more informative and carry richer expressions. In this work, inspired by the unique structure of the E-commerce knowledge base, we propose a novel multi-task model with cascade and residual connections,...
In this paper, we propose a novel configurable framework to automatically generate distractive choices for open-domain cloze-style multiple-choice questions. The framework incorporates a general-purpose knowledge base to effectively create a small distractor candidate set, and a feature-rich learning-to-rank model to select distractors that are both plausible and reliable. Experimental results on a new dataset across four domains show that our framework yields distractors outperforming previous methods by both automatic and human evaluation. The dataset can also be...
This paper presents an adaptive genetic algorithm (GA) to solve the vehicle routing problem with time windows (VRPTW) to near optimal solutions. The algorithm employs a unique decoding scheme with integer strings. It also automatically adapts the crossover probability and the mutation rate to the changing population dynamics. The adaptive control maintains population diversity at user-defined levels, and therefore prevents premature convergence in search. Comparison between this algorithm and a normal fixed parameter GA clearly demonstrates the advantage of population diversity control. Our...
Biomedical researchers often search through massive catalogues of literature to look for potential relationships between genes and diseases. Given the rapid growth of biomedical literature, automatic relation extraction, a crucial technology in biomedical literature mining, has shown great potential to support research on gene-related diseases. Existing work in this field has produced datasets that are limited both in scale and accuracy. In this study, we propose a reliable and efficient framework that takes large biomedical literature repositories as inputs, identifies credible relationships between diseases...
Cross-cultural differences and similarities are common in cross-lingual natural language understanding, especially for research in social media. For instance, people of distinct cultures often hold different opinions on a single named entity. Also, understanding slang terms across languages requires knowledge of cross-cultural similarities. In this paper, we study the problem of computing such cross-cultural differences and similarities. We present a lightweight yet effective approach, and evaluate it on two novel tasks: 1) mining cross-cultural differences of named entities and 2) finding...