- Topic Modeling
- Natural Language Processing Techniques
- Advanced Text Analysis Techniques
- Data Quality and Management
- Advanced Graph Neural Networks
- Multimodal Machine Learning Applications
- Text and Document Classification Technologies
- Image Retrieval and Classification Techniques
- Expert finding and Q&A systems
- Misinformation and Its Impacts
- Speech and dialogue systems
- Privacy-Preserving Technologies in Data
- Vehicle License Plate Recognition
- Food Supply Chain Traceability
- Cognitive Computing and Networks
- Algorithms and Data Compression
- Image and Video Stabilization
- Blockchain Technology Applications and Security
- Stock Market Forecasting Methods
- Machine Learning in Bioinformatics
- Sentiment Analysis and Opinion Mining
- Complex Network Analysis Techniques
- Music and Audio Processing
- Domain Adaptation and Few-Shot Learning
- Imbalanced Data Classification Techniques
North China University of Technology
2009-2024
China National Institute of Standardization
2023-2024
Shanghai Jiao Tong University
2007-2009
Existing event detection models that rely on dependency parsing have performed well. However, for long sentences with many words, the parsing results are complex, because each word corresponds to a directed edge with a label. Not all of these edges provide guidance for the model, and the accuracy of parsing tools decreases as sentence length increases, resulting in error propagation. To solve these problems, we developed an event detection model that uses a self-constructed graph convolution network. First,...
In recent years, with the advancement of natural language processing techniques and the release of models like ChatGPT, how to understand questions has become a hot topic. In handling complex logical reasoning with pre-trained models, performance still has room for improvement. Inspired by DAGN, we propose an improved model, DaGATN (Discourse-apperceptive Graph Attention Networks). By constructing a discourse information graph to learn clues in the text, we decompose the context, question, and answer into elementary units...
As there is an increasing trend of people in China consuming on debt, financial organizations deal with a large number of loan applications. If customers cannot repay the loans on time, the financial organizations have to cover the loss. It is therefore important to predict correctly whether a customer will repay on time. Typical machine learning methods can be employed to exploit customers' information and give valuable judgements. We investigated the function of the Deep Neural Network (DNN) in this work, as it achieves a high success rate in the fields of image recognition,...
The similarity of words extracted from a rich text relation network is the main way to calculate semantic similarity. The complex relational information and content in the Wikipedia website, Community Question Answering, and social networks provide an abundant corpus for similarity calculation. However, most typical research has focused on only a single relationship. In this paper, we propose a calculation model that integrates multiple kinds of relational information and maps the relationships into the same space by learning a representing matrix to improve...
Sememes are the smallest semantic units of human languages, the composition of which can represent the meaning of words. Sememes have been successfully applied to many downstream applications in the natural language processing (NLP) field. Annotating a word's sememes depends on experts and is both time-consuming and labor-intensive, limiting the large-scale application of sememes. Researchers have proposed some sememe prediction methods to automatically predict sememes for words. However, existing methods focus on the information of the word itself, ignoring...
The goal of multimodal named entity recognition (MNER) is to detect entity spans in given image–text pairs and classify them into the corresponding entity types. Despite the success of existing works that leverage cross-modal attention mechanisms to integrate textual and visual representations, we observe three key issues. Firstly, models are prone to misguidance when fusing unrelated text and images. Secondly, most features are not enhanced or filtered. Finally, due to the independent encoding strategies employed for text and images, a...
With the development of the Internet and e-commerce, product reviews have become an important source for collecting user opinions and improving product quality. The result of feature extraction is the basis of text sentiment analysis, which directly affects the accuracy of data mining results. In feature extraction, the mutual information method is applied to feature selection for its low time complexity. However, it does not consider the difference in the frequency of terms, nor does it consider the distribution of items within the same category, and because the candidate feature matrix is too large, practical...
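To make the mutual information criterion mentioned in the abstract above concrete, here is a minimal sketch of the textbook document-level pointwise mutual information score, log2(P(t, c) / (P(t) P(c))). It is not the paper's improved method. The function names, the set-of-tokens review format, and the toy data are assumptions made for illustration.

```python
import math

def pointwise_mutual_information(docs, labels, term, category):
    """Textbook PMI between a term and a category, from document counts.

    docs is a list of token sets and labels the matching category names
    (both formats are assumptions for this sketch).
    """
    n = len(docs)
    n_term = sum(1 for doc in docs if term in doc)
    n_cat = sum(1 for label in labels if label == category)
    n_both = sum(
        1 for doc, label in zip(docs, labels)
        if term in doc and label == category
    )
    if n_both == 0:
        # The term never co-occurs with the category.
        return float("-inf")
    return math.log2(n_both * n / (n_term * n_cat))

def top_terms(docs, labels, category, k):
    """Return the k terms with the highest PMI for the category."""
    vocab = set().union(*docs)
    scored = [
        (pointwise_mutual_information(docs, labels, term, category), term)
        for term in vocab
    ]
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:k]]

# Hypothetical toy reviews, already tokenized.
reviews = [
    {"good", "battery"},
    {"good", "screen"},
    {"bad", "battery"},
    {"bad", "screen"},
]
labels = ["pos", "pos", "neg", "neg"]

print(top_terms(reviews, labels, "pos", 1))
```

Because each review is reduced to a set of tokens, a term that occurs once and a term that occurs ten times in the same review count the same, which is the frequency blindness the abstract points out.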
For search engines, erroneous query input is a common phenomenon. This paper uses the web log as the training set for query error checking. Through an n-gram language model trained on the web log, queries are analyzed and checked. Some features, including query words and their number, are introduced into the model. At the same time, a data smoothing algorithm is used to solve the data sparseness problem. It improves the overall accuracy of the model. The experimental results show that it is effective.
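The query-checking idea in the abstract above can be sketched as follows. The abstract names neither the n-gram order nor the smoothing algorithm, so this sketch assumes a bigram model with add-one (Laplace) smoothing. The function names and the toy query log are hypothetical.

```python
import math
from collections import Counter

def train_bigram(queries):
    """Count bigram contexts and bigrams over a list of logged queries."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for query in queries:
        tokens = ["<s>"] + query.split() + ["</s>"]
        contexts.update(tokens[:-1])          # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
        vocab.update(tokens[1:])              # every token that can be predicted
    # Reserve one extra slot for words never seen in the log.
    return contexts, bigrams, len(vocab) + 1

def avg_log_prob(query, contexts, bigrams, vocab_size):
    """Average add-one-smoothed bigram log-probability of a query."""
    tokens = ["<s>"] + query.split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        # Add-one smoothing keeps unseen bigrams from getting zero probability.
        prob = (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
        total += math.log(prob)
    return total / (len(tokens) - 1)

# Hypothetical query log used as the training set.
query_log = [
    "cheap flights to paris",
    "cheap hotels in paris",
    "flights to london",
]
model = train_bigram(query_log)

score_good = avg_log_prob("cheap flights to paris", *model)
score_bad = avg_log_prob("cheap flihgts to paris", *model)
print(score_good > score_bad)
```

A query whose average log-probability falls well below that of typical logged queries can be flagged as a likely input error; where to set that threshold is left open here and would need tuning on real log data.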
Previous work has demonstrated that end-to-end neural sequence models work well for document-level event role filler extraction. However, the neural network model suffers from the problem of not being able to utilize global information, resulting in incomplete extraction of event arguments. This is because the inputs to the BiLSTM are all single-word vectors with no input of contextual information. This phenomenon is particularly pronounced at the document level. To address this problem, we propose key-value memory networks to enhance and...
For Multiword Expression (MWE) recognition, Multiple Sequence Alignment (MSA) is proposed, motivated by gene recognition, because a textual sequence is similar to a gene sequence in pattern analysis. This MSA technique, combined with error-driven rules, improves efficiency beyond traditional methods. It provides a guarantee for MWE recall. It uses a dynamic programming method to prevent candidates from combinatorial explosion, and extracts a global solution instead of redundant sub-patterns. Consequently, it has...
As the most comprehensive and structured encyclopedic knowledge base currently known, Wikipedia provides people with a great deal of semantic knowledge. To fully mine the semantic similarity of keywords in Wikipedia, in this paper we propose a method based on a random walk model over words. Firstly, we construct a link graph from Wikipedia's link information, and then use the random walk to obtain the relevance of keywords. We use T-truncated and ε-truncated pruning strategies for improving algorithm performance. The experimental results show Spearman series with...
With the rapid development of the Weibo network, community discovery has become an emerging research hotspot. It is found that community discovery in networks helps operators understand the network model structure and user characteristics and provide personalized services for users. At present, most research on community mining focuses only on the connection edges of nodes while ignoring the content generated by users, resulting in a lower accuracy rate of the algorithms in practical applications. This paper comprehensively considers node connections and content, and proposes...
Event detection tasks enable the quick detection of events from texts and provide powerful support for downstream natural language processing tasks. Most such methods can only detect a fixed set of predefined event classes. Extending them to a new class without losing the ability to detect old classes requires costly retraining of the model from scratch. Incremental learning can effectively solve this problem, but it requires abundant data. In practice, however, the lack of high-quality labeled data makes it difficult to obtain enough data for training. To address the above...
Existing data augmentation methods attempt to utilize more raw samples or incorporate external knowledge to enhance the model, with the assumption that an explicit pool for retrieval must be accessible in both the training and testing stages. We argue generated from distribution of beyond itself can provide informative relax strong original stage. To address this issue, we propose a novel framework that introduces the diffusion model for the first time. The Diffusion Model aims to generate diversity by directly inheriting...
This paper addresses the issue of high resource consumption in named entity recognition (NER) under large models by utilizing meta-learning and structured distillation to generate lightweight models. Knowledge distillation from models commonly used in NER tasks poses challenges because of the exponentially large output space. Previous work treated it as a prediction task for distillation, but did not consider the feedback of the student model to optimize itself. Therefore, this paper proposes Meta-Structured Distillation (MSD). Specifically,...