- Topic Modeling
- Natural Language Processing Techniques
- Biomedical Text Mining and Ontologies
- Multimodal Machine Learning Applications
- Speech and dialogue systems
- Domain Adaptation and Few-Shot Learning
- Traditional Chinese Medicine Studies
- Advanced Graph Neural Networks
- Advanced Text Analysis Techniques
- Text and Document Classification Technologies
- Neural Networks and Applications
- Image Retrieval and Classification Techniques
- Text Readability and Simplification
- Expert finding and Q&A systems
- Advanced Neural Network Applications
- Advanced Image and Video Retrieval Techniques
- Explainable Artificial Intelligence (XAI)
- Advanced Vision and Imaging
- 3D Shape Modeling and Analysis
- Video Coding and Compression Technologies
- Data Quality and Management
- Recommender Systems and Techniques
- Hand Gesture Recognition Systems
- Advanced Data Compression Techniques
- Complex Network Analysis Techniques
Zhejiang University
2013-2024
Zhejiang University of Science and Technology
2015-2024
Chongqing University of Posts and Telecommunications
2024
Academy of Military Medical Sciences
2023-2024
Shenzhen Research Institute of Big Data
2024
Chinese University of Hong Kong, Shenzhen
2024
Westlake University
2024
Beijing University of Technology
2023
Beijing Municipal Ecology and Environment Bureau
2023
Alibaba Group (United States)
2021
Knowledge of a disease includes information on various aspects of the disease, such as signs and symptoms, diagnosis, and treatment. This knowledge is critical for many health-related and biomedical tasks, including consumer health question answering, medical language inference, and disease name recognition. While pre-trained language models like BERT have shown success in capturing syntactic, semantic, and world knowledge from text, we find they can be further complemented by specific knowledge of diagnoses, treatments, and other disease aspects. Hence, we integrate...
Traditional Chinese Medicine (TCM) has been developed for several thousand years and plays a significant role in health care for Chinese people. This paper studies the problem of classifying TCM clinical records into 5 main disease categories in TCM. We explored a number of state-of-the-art deep learning models and found that the recent Bidirectional Encoder Representations from Transformers (BERT) can achieve better results than other methods. We further utilized an unlabeled corpus to fine-tune the BERT language model...
In a modern e-commerce recommender system, it is important to understand the relationships among products. Accurately recognizing product relationships, such as complements or substitutes, is an essential task for generating better recommendation results, as well as for improving explainability in recommendation. Products and their associated relationships naturally form a graph, yet existing efforts do not fully exploit the graph's topological structure. They usually only consider information from directly connected products. In fact,...
As an interdisciplinary course, Machine Vision combines AI and digital image processing methods. This paper develops a comprehensive machine vision experiment on forest wildfire detection that organically integrates image processing, machine learning, and deep learning technologies. Although the research has made great progress, many experiments are not suitable for students to operate. Also, detection with high accuracy is still a big challenge. In this paper, we divide the task of wildfire detection into two modules, which are classification and region...
In order to facilitate natural language understanding, the key is to engage commonsense or background knowledge. However, how to engage such knowledge effectively in question answering systems is still under exploration in both academia and industry. In this paper, we propose a novel question-answering method that integrates multiple knowledge sources, i.e. ConceptNet, Wikipedia, and the Cambridge Dictionary, to boost performance. More concretely, we first introduce a graph-based iterative retrieval module, which iteratively retrieves...
Visual detection of Micro Air Vehicles (MAVs) has attracted increasing attention in recent years due to its important application in various tasks. The existing methods for MAV detection assume that the training set and the testing set have the same distribution. As a result, when deployed in new domains, the detectors would suffer significant performance degradation due to domain discrepancy. In this paper, we study the problem of cross-domain MAV detection. The contributions of this paper are threefold. 1) We propose a Multi-MAV-Multi-Domain (M3D) dataset...
During the last decade, community-based question answering (CQA) sites have accumulated a vast amount of questions and their crowdsourced answers. How to efficiently identify quality answers that are relevant to a given question has become an active line of research in CQA. The major challenge of CQA is the accurate selection of high-quality answers w.r.t. given questions. Previous approaches tend to model the semantic matching between an individual pair of one question and its corresponding answer (how fitting an answer is to the posted question). However, these works ignore...
Zehao Lin, Xinjing Huang, Feng Ji, Haiqing Chen, Yin Zhang. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.
Text classification is one of the fundamental tasks in text mining. In the medical domain, there have been a number of studies on modern medicine clinical notes written in English. However, very limited research has been conducted on clinical notes written in Chinese, especially traditional Chinese medicine (TCM) records. The goal of this study was to investigate features and machine learning algorithms for TCM clinical record classification. We collected 7,037 records from famous TCM doctors as our dataset and investigated the effects of different types of features and algorithms. Additionally,...
Qianglong Chen, Feng Ji, Xiangji Zeng, Feng-Lin Li, Ji Zhang, Haiqing Chen, Yin Zhang. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021.
Although pre-trained language models (PLMs) have achieved state-of-the-art performance on various natural language processing (NLP) tasks, they are shown to be lacking in knowledge when dealing with knowledge-driven tasks. Despite the many efforts made for injecting knowledge into PLMs, this problem remains open. To address the challenge, we propose DictBERT, a novel approach that enhances PLMs with dictionary knowledge, which is easier to acquire than a knowledge graph (KG). During pre-training, we present two pre-training tasks to inject dictionary knowledge into PLMs via contrastive...
In this paper, we present a deep learning based disease named entity recognition architecture. First, the word-level embedding, character-level embedding, and lexicon feature embedding are concatenated as input. Then multiple convolutional layers are stacked over the input to extract useful features automatically. Finally, a multiple-label strategy, introduced here for the first time, is applied to the output layer to capture the correlation information between neighboring labels. Experimental results on both the NCBI and CDR corpora show that ML-CNN...
Nowadays, vast amounts of multimedia data can be obtained across different collections (or domains). This poses significant challenges for the utilization of cross-collection data, for example, the summarization of similarities and differences across domains (e.g., CNN and NYT), as well as finding visually similar images across visual domains (e.g., photos, paintings, and hand-drawn sketches). In this paper, a supervised cross-collection Latent Dirichlet Allocation (scLDA) approach is proposed to utilize data across collections. As a natural extension...
Parameter regularization or allocation methods are effective in overcoming catastrophic forgetting in lifelong learning. However, they solve all tasks in a sequence uniformly and ignore the differences in the learning difficulty of different tasks. So parameter regularization methods face significant forgetting when the new task is very different from learned tasks, and parameter allocation methods face unnecessary parameter overhead when learning simple tasks. In this paper, we propose Parameter Allocation & Regularization (PAR), which adaptively selects an appropriate strategy for each task based on its learning difficulty. A task is easy for a model that...
We present a new benchmark dataset called PARADE for paraphrase identification that requires specialized domain knowledge. PARADE contains paraphrases that overlap very little at the lexical and syntactic level but are semantically equivalent based on computer science domain knowledge, as well as non-paraphrases that overlap greatly at the lexical and syntactic level but are not semantically equivalent based on this domain knowledge. Experiments show that both state-of-the-art neural models and non-expert human annotators have poor performance on PARADE. For example, BERT after fine-tuning achieves an F1 score of 0.709, which is...