- Topic Modeling
- Advanced Image and Video Retrieval Techniques
- Natural Language Processing Techniques
- Image Retrieval and Classification Techniques
- Multimodal Machine Learning Applications
- Recommender Systems and Techniques
- Data Management and Algorithms
- Text and Document Classification Technologies
- Face and Expression Recognition
- Video Surveillance and Tracking Methods
- Generative Adversarial Networks and Image Synthesis
- Metaheuristic Optimization Algorithms Research
- Advanced Graph Neural Networks
- Data Mining Algorithms and Applications
- Advanced Text Analysis Techniques
- Advanced Clustering Algorithms Research
- Sentiment Analysis and Opinion Mining
- Human Mobility and Location-Based Analysis
- Domain Adaptation and Few-Shot Learning
- Neural Networks and Applications
- Advanced Computational Techniques and Applications
- Advanced Vision and Imaging
- Anomaly Detection Techniques and Applications
- Plant Water Relations and Carbon Dynamics
- Rough Sets and Fuzzy Logic
Sun Yat-sen University
2016-2025
China Tourism Academy
2023-2025
China Guangzhou Analysis and Testing Center
2018-2024
University of Nottingham Malaysia Campus
2023
Guangzhou Experimental Station
2021
Microsoft Research Asia (China)
2020
Guangdong Food and Drug Vocational College
2020
Beijing Institute of Big Data Research
2018-2019
Northeast Agricultural University
2017
Anqing Normal University
2014-2017
Multi-view clustering, which seeks a partition of the data in multiple views that often provide complementary information to each other, has received considerable attention in recent years. In real-life clustering problems, each view may have considerable noise. However, existing methods blindly combine information from multiple views with possibly considerable noise, which degrades their performance. In this paper, we propose a novel Markov chain method for Robust Multi-view Spectral Clustering (RMSC). Our method has a flavor of low-rank...
Pre-trained models for programming languages have recently demonstrated great success on code intelligence. To support both code-related understanding and generation tasks, recent works attempt to pre-train unified encoder-decoder models. However, such a framework is sub-optimal for auto-regressive tasks, especially code completion, which requires a decoder-only manner for efficient inference. In this paper, we present UniXcoder, a unified cross-modal pre-trained model for programming language. The model utilizes mask attention matrices with...
Image content analysis is an important surround perception modality of intelligent vehicles. In order to efficiently recognize the on-road environment based on images from a large-scale scene database, relevant image retrieval becomes one of the fundamental problems. To improve the efficiency of calculating similarities between images, hashing techniques have received increasing attention. For most existing hash methods, suboptimal binary codes are generated, as the hand-crafted feature representation is not...
Virtual try-on systems under arbitrary human poses have significant application potential, yet also raise extensive challenges, such as self-occlusions, heavy misalignment among different poses, and complex clothes textures. Existing virtual try-on methods can only transfer clothes given a fixed pose and still show unsatisfactory performance, often failing to preserve person identity or texture details, with limited pose diversity. This paper makes the first attempt towards a multi-pose guided virtual try-on system, which...
Fact checking is a challenging task because verifying the truthfulness of a claim requires reasoning about multiple pieces of retrievable evidence. In this work, we present a method suitable for reasoning about the semantic-level structure of evidence. Unlike most previous works, which typically represent evidence sentences with either string concatenation or fusing the features of isolated sentences, our approach operates on rich semantic structures obtained by semantic role labeling. We propose two mechanisms to exploit the structure while leveraging advances...
Recently, a few systems for automatically solving math word problems have reported promising results. However, the datasets used for evaluation have limitations in both scale and diversity. In this paper, we build a large-scale dataset which is more than 9 times the size of previous ones and contains many more problem types. Problems are semi-automatically obtained from community question-answering (CQA) web pages. A ranking SVM model is trained to extract answers from the answer text provided by CQA users, which significantly...
Beyond current image-based virtual try-on systems that have attracted increasing attention, we move a step forward to developing a video virtual try-on system that precisely transfers clothes onto the person and generates visually realistic videos conditioned on arbitrary poses. Besides the challenges in image-based virtual try-on (e.g., clothes fidelity, image synthesis), video virtual try-on further requires spatiotemporal consistency. Directly adopting existing image-based approaches often fails to generate coherent video with natural textures. In this work, we propose Flow-navigated Warping...
This paper presents a novel template-based method to solve math word problems. The method learns the mappings between concept phrases in problems and their expressions from training data. For each equation template, we automatically construct a rich template sketch by aggregating information from various problems with the same template. Our approach is implemented in a two-stage system. It first retrieves a few relevant equation system templates and aligns numbers with those templates for candidate generation. It then does fine-grained inference to obtain the final...
Interactive fashion image manipulation, which enables users to edit images with sketches and color strokes, is an interesting research problem with great application value. Existing works often treat it as a general inpainting task and do not fully leverage the semantic structural information in fashion images. Moreover, they directly utilize conventional convolution and normalization layers to restore the incomplete image, which tends to wash away the sketch and color information. In this paper, we propose a novel Fashion Editing Generative...
Deep hashing is an appealing approach for large-scale image retrieval. Most existing supervised deep hashing methods learn hash functions using pairwise or triple similarities in randomly sampled mini-batches. They suffer from low training efficiency, insufficient coverage of the data distribution, and pair imbalance problems. Recently, central similarity quantization (CSQ) attacks the above problems by using "hash centers" as a global metric, which encourages the hash codes of similar images to be close to their common center...
Retrieval plays an important role in knowledge-based visual question answering (KB-VQA), which relies on external knowledge to answer questions related to an image. However, not all information in the knowledge is beneficial to retrieval, e.g., information that is only semantically similar to the query but not useful for answering. To improve the effectiveness and efficiency of retrieval, in this paper, we propose an efficient multimodal selection method to filter out irrelevant knowledge and increase retriever performance in KB-VQA. First, to exclude most irrelevant information from the large knowledge, it uses a...
Learning a user's preference from check-in data is important for POI recommendation. Yet, a user usually has visited only some POIs, while most POIs are unvisited (i.e., negative samples). To leverage these "no-behavior" POIs, a typical approach is pairwise ranking, which constructs ranking pairs for the user and POIs. Although this approach is generally effective, the negative samples in ranking pairs are obtained randomly, which may fail to leverage "critical" negative samples in model training. On the other hand, previous studies also utilized geographical features to improve recommendation...
Destination prediction is very important in location-based services, such as recommendation of targeted advertising locations. Most current approaches predict the destination according to the existing trip, based on history trajectories. However, no work has considered the difference between the effects of passing-by locations and destinations in trajectories, which seriously impacts the accuracy of predicted results, since the destination can indicate the purpose of traveling. Meanwhile, the temporal information of trajectories plays an important role. On one hand,...
Sentence similarity modeling lies at the core of many natural language processing applications, and thus has received much attention. Owing to the success of word embeddings, popular neural network methods have recently achieved sentence embedding. Most of them focused on learning semantic information and modeling it as a continuous vector, yet the syntactic information of sentences has not been fully exploited. On the other hand, prior works have shown the benefits of structured trees that include syntactic information, while few methods in this branch utilized...
Wanjun Zhong, Duyu Tang, Zhangyin Feng, Nan Duan, Ming Zhou, Ming Gong, Linjun Shou, Daxin Jiang, Jiahai Wang, Jian Yin. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020.
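In this paper, we study how to learn a semantic parser of state-of-the-art accuracy with less supervised training data. We conduct our study on WikiSQL, the largest hand-annotated semantic parsing dataset to date. First, we demonstrate that question generation is an effective method that empowers us to learn a state-of-the-art neural network based semantic parser with thirty percent of the supervised training data. Second, we show that applying question generation to the full supervised training data further improves the state-of-the-art model. In addition, we observe that there is a logarithmic relationship between the accuracy of the semantic parser and the amount of training data.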
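We propose a novel end-to-end deep architecture for face landmark detection, based on a deep convolutional and deconvolutional network followed by carefully designed recurrent network structures. The pipeline of this architecture consists of three parts. Through the first part, we encode an input face image to resolution-preserved feature maps via a deep network with stacked convolutional and deconvolutional layers. Then, in the second part, we estimate the initial coordinates of the facial key points by an additional convolutional layer on top of these feature maps. In the last part, using the feature maps and the initial key points as input, we refine the coordinates of the facial key points by a recurrent network that consists of multiple long short-term...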
Social emotion classification aims to predict the aggregation of emotional responses embedded in online comments contributed by various users. Such a task is inherently challenging because extracting relevant semantics from free texts is a classical research problem. Moreover, online comments are typically characterized by a sparse feature space, which makes the corresponding emotion classification task very difficult. On the other hand, though deep neural networks have been shown to be effective for speech recognition and image analysis tasks, their...