- Topic Modeling
- Natural Language Processing Techniques
- Speech Recognition and Synthesis
- Web Data Mining and Analysis
- Advanced Text Analysis Techniques
- Text and Document Classification Technologies
- Domain Adaptation and Few-Shot Learning
- Multimodal Machine Learning Applications
- Sentiment Analysis and Opinion Mining
- Artificial Intelligence in Games
- Privacy-Preserving Technologies in Data
- Cryptography and Data Security
- Seismology and Earthquake Studies
- COVID-19 diagnosis using AI
- Guidance and Control Systems
- Reinforcement Learning in Robotics
- Stochastic Gradient Optimization Techniques
National University of Defense Technology
2023-2025
Hefei University of Technology
2025
Nanchang University
2019
Recently, character-word lattice structures have achieved promising results for Chinese named entity recognition (NER), reducing word segmentation errors and adding boundary information to character sequences. However, constructing the lattice structure is complex and time-consuming, so these lattice-based models usually suffer from low inference speed. Moreover, the quality of the lexicon affects the accuracy of the NER model. Noise words can potentially confuse NER, and limited coverage can cause these models to degenerate into...
Multimodal named entity recognition (MNER) is an emerging field that aims to automatically detect entities and classify their categories, utilizing input text and auxiliary resources such as images. While previous studies have leveraged object detectors to preprocess images and fuse textual semantics with the corresponding image features, these methods often overlook the potential finer-grained information within each modality and may exacerbate error propagation due to predetection. To address these issues, we propose...
Multimodal Entity Linking (MEL) aims at linking ambiguous mentions with multimodal information to entities in a Knowledge Graph (KG) such as Wikipedia, which plays a key role in many applications. However, existing methods suffer from shortcomings, including modality impurity, such as noise in the raw image and textual representation, which puts obstacles in the way of MEL. We formulate MEL as a neural text matching problem where each piece of multimodal information (text and image) is treated as a query, and the model learns the mapping from each query to the relevant candidate entities. This paper...
Distantly supervised relation extraction is a method for automatically annotating large corpora by classifying a bag of sentences that share the same two entities and the relation. Recent works achieve sound performance by adopting contrastive learning to efficiently obtain instance representations under the multi-instance framework. Though these methods weaken the impact of noisy labels, they ignore the long-tail distribution problem in distantly supervised sets and fail to capture the mutual information of different parts. We are...
Zero-shot relation extraction (ZSRE), which aims at predicting classes that lack annotations or never appeared during training, is becoming more significant in current information systems. Previous works focus on projecting sentences and their corresponding descriptions into an intermediate semantic space and searching for the nearest match for unseen classes. Though these methods can achieve sound performance, they only obtain inferior representations via a trivial distance metric and neglect the interaction...
Multimodal entity linking (MEL) aims to utilize multimodal information (usually textual and visual information) to link ambiguous mentions to unambiguous entities in a knowledge base. Current methods face three main issues: (1) treating the entire image as input may introduce redundant information; (2) entity-related information, such as attributes in images, is insufficiently utilized; (3) there is semantic inconsistency between the entity in the knowledge base and its representation. To this end, we propose DWE+ for multimodal entity linking. DWE+ could capture finer...
Multimodal large language models (MLLMs) have shown remarkable progress in high-level semantic tasks such as visual question answering, image captioning, and emotion recognition. However, despite these advancements, there remains a lack of standardized benchmarks for evaluating MLLM performance in multi-object sentiment analysis, a key task in semantic understanding. To address this gap, we introduce MOSABench, a novel evaluation dataset designed specifically for multi-object sentiment analysis. MOSABench includes approximately 1,000 images...