- Advanced Image and Video Retrieval Techniques
- Recommender Systems and Techniques
- Image Retrieval and Classification Techniques
- Topic Modeling
- Data Management and Algorithms
- Advanced Graph Neural Networks
- Multimodal Machine Learning Applications
- Complex Network Analysis Techniques
- Spam and Phishing Detection
- Opinion Dynamics and Social Influence
- Text and Document Classification Technologies
- Sentiment Analysis and Opinion Mining
- Music and Audio Processing
- Domain Adaptation and Few-Shot Learning
- Machine Learning and Algorithms
- Caching and Content Delivery
- Expert Finding and Q&A Systems
- Graph Theory and Algorithms
- Natural Language Processing Techniques
- Face and Expression Recognition
- Metaheuristic Optimization Algorithms Research
- Consumer Market Behavior and Pricing
- Remote-Sensing Image Classification
- Data Quality and Management
- Hate Speech and Cyberbullying Detection
Baidu (China)
2016-2021
Cognitive Research (United States)
2019-2021
Bellevue Hospital Center
2019
University of California, Santa Barbara
2014-2016
Zhejiang University
2010-2013
Acoustic-based music recommender systems have received increasing interest in recent years. Due to the semantic gap between low-level acoustic features and high-level concepts, many researchers have explored collaborative filtering techniques in such systems. Traditional recommendation methods only focus on user rating information. However, there are various kinds of social media information, including different types of objects and relations among these objects, in music communities such as Last.fm and Pandora. This information...
Nowadays many people are members of multiple online social networks simultaneously, such as Facebook, Twitter and some other instant messaging circles. But these networks are usually isolated from each other. Mapping common users across them will benefit applications. Methods based on username comparison perform well for some users; however, they cannot work in the following situations: (a) users choose different usernames in different networks; (b) a single username corresponds to different individuals. In this paper, we propose to utilize structures...
Millions of users share their opinions on Twitter, making it a valuable platform for tracking and analyzing public sentiment. Such analysis can provide critical information for decision making in various domains. Therefore, it has attracted attention from both academia and industry. Previous research mainly focused on modeling public sentiment. In this work, we move one step further to interpret sentiment variations. We observed that emerging topics (named foreground topics) within the variation periods are highly related to the genuine...
Approximate nearest neighbor (ANN) searching is a fundamental problem in computer science with numerous applications in, e.g., machine learning and data mining. Recent studies show that graph-based ANN methods often outperform other types of algorithms. For typical graph-based methods, the search algorithm is executed iteratively, and the execution dependency prohibits GPU adaptations. In this paper, we present a novel framework that decouples searching on the graph into 3 stages, in order to parallelize the performance-crucial distance computation....
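To make the "executed iteratively" remark concrete, here is a minimal CPU sketch of best-first search on a proximity graph, assuming Euclidean distance and a prebuilt adjacency list. It illustrates the generic graph-based ANN baseline only, not the three-stage GPU framework from the abstract; the names `greedy_graph_search`, `neighbors`, `entry`, and `ef` are hypothetical.

```python
import heapq
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def greedy_graph_search(points, neighbors, query, entry, k=1, ef=4):
    """Best-first search over a proximity graph (illustrative sketch).

    points:    list of vectors
    neighbors: dict mapping node id -> list of adjacent node ids (assumed prebuilt)
    entry:     node id where the search starts
    ef:        size of the result pool kept during the search (assumed parameter)
    """
    visited = {entry}
    d0 = euclidean(points[entry], query)
    candidates = [(d0, entry)]  # min-heap: closest unexpanded node first
    results = [(-d0, entry)]    # max-heap (negated distances): current best ef nodes
    while candidates:
        dist, node = heapq.heappop(candidates)
        # Stop once the closest candidate is worse than the worst kept result.
        if len(results) >= ef and dist > -results[0][0]:
            break
        # Each expansion depends on the previous pop: this is the sequential
        # dependency that makes a naive GPU port difficult.
        for nb in neighbors[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d = euclidean(points[nb], query)
            if len(results) < ef or d < -results[0][0]:
                heapq.heappush(candidates, (d, nb))
                heapq.heappush(results, (-d, nb))
                if len(results) > ef:
                    heapq.heappop(results)
    ranked = sorted((-neg_d, node) for neg_d, node in results)
    return [node for _, node in ranked[:k]]
```

The distance evaluations inside the inner loop are independent of each other for a given node, which is the part a parallel design would naturally batch.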
Nowadays, a lot of people possess accounts on multiple online social networks, e.g., Facebook and Twitter. These networks overlap, but the correspondences between their users are not explicitly given. Mapping common users across these networks will be beneficial for applications such as cross-network recommendation. In recent years, mapping algorithms have been proposed that exploit social and/or profile relations from different networks. However, there is still a lack of a unified framework that can well exploit...
Recently, plenty of neural network based recommendation models have demonstrated their strength in modeling complicated relationships between heterogeneous objects (i.e., users and items). However, the applications of these finely trained models are limited to an off-line manner or a re-ranking procedure (on a pre-filtered small subset of items), due to time-consuming computations. Fast item ranking under learned measures is still largely an open question.
Named Entity Disambiguation is the task of disambiguating named entity mentions in natural language text and linking them to their corresponding entries in a reference knowledge base (e.g., Wikipedia). Such disambiguation can help add semantics to plain text and distinguish homonymous entities. Previous research has tackled this problem by making use of two types of context-aware features derived from the knowledge base, namely, context similarity and semantic relatedness. Both heavily rely on cross-document hyperlinks within the knowledge base:...
Automatic synonym recognition is of great importance for entity-centric text mining and interpretation. Due to the high variability of language use in real life, manual construction of semantic resources to cover all synonyms is prohibitively expensive and may also result in limited coverage. Although there are public knowledge bases, they only have limited coverage for languages other than English. In this paper, we focus on the medical domain and propose an automatic way to accelerate the process of synonymy resource development for Chinese,...
Collaborative networks are composed of experts who cooperate with each other to complete specific tasks, such as resolving problems reported by customers. A task is posted and subsequently routed in the network from one expert to another until it is resolved. When an expert cannot solve a task, his routing decision (i.e., where to transfer the task) is critical since it can significantly affect the completion time of the task. In this work, we attempt to deduce the cognitive process of routing, and model decision making as a generative process in which a decision is made based on...
Given a health-related question (such as "I have a bad stomach ache. What should I do?"), a medical self-diagnosis Android inquires further information from the user, diagnoses the disease, and ultimately recommends the best solutions. One practical challenge in building such an Android is to ask the correct questions and obtain the most relevant information, in order to correctly pinpoint the likely causes of health conditions. In this paper, we tackle this challenge, named "relevant symptom generation": given a limited set of patient-described...
Shulong Tan, Zhixin Zhou, Zhaozhuo Xu, Ping Li. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.
Efficient inner product search on embedding vectors is often the vital stage for online ranking services, such as recommendation and information retrieval. Recommendation algorithms, e.g., matrix factorization, typically produce latent vectors to represent users or items. The services are conducted by retrieving the most relevant item vectors given a user vector, where relevance is defined by the inner product. Therefore, developing efficient recommender systems requires solving the so-called maximum inner product search (MIPS) problem. In the past decade,...
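As a reference point for the MIPS problem described above, the following minimal sketch performs exact top-k retrieval by linear scan. It is the brute-force baseline that efficient methods aim to beat, not the approach of the paper; the function names and the toy vectors in the usage note are illustrative assumptions.

```python
import heapq


def inner_product(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def brute_force_mips(user_vec, item_vecs, k=1):
    """Exact maximum inner product search by scanning every item.

    Returns the indices of the k items with the largest inner product
    with user_vec, best first. Cost is O(n * d) per query.
    """
    scored = (
        (inner_product(user_vec, item), idx)
        for idx, item in enumerate(item_vecs)
    )
    return [idx for _, idx in heapq.nlargest(k, scored)]
```

For example, with a user vector `(1.0, 1.0)` and items `[(1.0, 0.0), (0.0, 3.0), (0.5, 0.5), (2.0, 2.0)]`, the inner products are 1, 3, 1 and 4, so the top-2 result is items 3 and 1. Unlike nearest neighbor search under a metric, items with a large norm can win even when they point in a less similar direction.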
There are various kinds of social media information, including different types of objects and relations among these objects, in music communities such as Last.fm and Pandora. This information is valuable for recommendation. However, there are two main challenges to exploiting this rich information: (a) there are many types of objects and relations in these communities, which makes it difficult to develop a unified framework taking into account all relations; (b) in some communities, relations are much more sophisticated than pairwise relations, and thus cannot be simply modeled by a graph....
Bag of features (BoF) representation has attracted an increasing amount of attention in large scale image processing systems. BoF treats images as loose collections of local invariant descriptors extracted from them. The visual codebook is generally constructed by using an unsupervised algorithm such as K-means to quantize the descriptors into clusters. Images are then represented by frequency histograms of the codewords contained in them. To build a compact and discriminative codebook, codeword selection has become an indispensable tool....
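The quantization and histogram steps of a BoF representation can be sketched as follows. This assumes the codebook is already available (for instance as K-means centroids) and does not cover the codeword selection the abstract goes on to discuss; all names are hypothetical.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(descriptor, codebook):
    """Index of the codeword closest to a local descriptor."""
    return min(
        range(len(codebook)),
        key=lambda i: squared_distance(descriptor, codebook[i]),
    )


def bof_histogram(descriptors, codebook):
    """Normalized frequency histogram of codewords for one image.

    descriptors: local descriptors extracted from the image
    codebook:    list of codeword vectors (assumed precomputed, e.g. by K-means)
    """
    counts = Counter(quantize(d, codebook) for d in descriptors)
    total = len(descriptors)
    return [counts[i] / total if total else 0.0 for i in range(len(codebook))]
```

Each image becomes a fixed-length vector whose size equals the codebook size, which is why a compact codebook matters for large scale systems.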
Collaborative networks are common in real life, where domain experts work together to solve tasks issued by customers. How to model the proficiency of experts is critical for us to understand and optimize collaborative networks. Traditional expertise models, such as topic model based methods, cannot capture two aspects of human expertise simultaneously: Specialization (what area is an expert good at?) and Proficiency Level (to what degree?). In this paper, we propose new models to overcome this problem. We embed all historical task data in a...
Neural network based ranking has been widely adopted owing to its powerful capacity in modeling complex relationships (e.g., users and items, questions and answers). Online neural ranking, i.e., the so-called fast ranking, is considered a challenging task because the measures are in general non-convex and asymmetric. Traditional approximate near neighbor (ANN) search, which typically focuses on metric measures, is not applicable to these measures. To tackle this challenge, in this paper, we propose to construct BipartitE Graph...
User information sharing is an important behavior in online social networks. Understanding such behavior could help various applications such as user modeling, cascade analysis, viral marketing, etc. In this paper, we aim to understand the strategies users employ to make a retweet decision. We are interested in investigating whether these strategies contain significant information about users and can be used to further characterize them. We propose a flexible model that captures a number of signals affecting a user's decision. Our empirical results...
Sample optimization, which involves sample augmentation and refinement, is an essential but often neglected component in modern display advertising platforms. Due to the massive number of ad candidates, an industrial ad service usually leverages a multi-layer funnel-shaped structure involving at least two stages: candidate generation and re-ranking. In the candidate generation step, an offline neural network matching model is trained on past click/conversion data to obtain a user feature vector and an ad feature vector. However, there is a covariate...
Suspended accounts are high-risk accounts that violate the rules of a social network. These accounts contain spam, offensive and explicit language, among others, and are incredibly variable in terms of textual content. In this work, we perform a detailed linguistic and statistical analysis of the textual information of suspended accounts and show how insights from our study significantly improve a deep-learning-based detection framework. Moreover, we investigate the utility of advanced topic modeling for the automatic creation of word lists that can discriminate regular...