- Topic Modeling
- Domain Adaptation and Few-Shot Learning
- Advanced Graph Neural Networks
- Natural Language Processing Techniques
- Speech Recognition and Synthesis
- Privacy-Preserving Technologies in Data
- Adversarial Robustness in Machine Learning
- Advanced Image and Video Retrieval Techniques
- EFL/ESL Teaching and Learning
- Text and Document Classification Technologies
- Internet Traffic Analysis and Secure E-voting
- Bioinformatics and Genomic Networks
- Music and Audio Processing
- Advanced Algorithms and Applications
- Biomedical Text Mining and Ontologies
- Data Quality and Management
- Advanced Sensor and Control Systems
- Evaluation and Performance Assessment
- Linguistics, Language Diversity, and Identity
- Multimodal Machine Learning Applications
- Impact of Light on Environment and Health
- Recommender Systems and Techniques
- Embedded Systems and FPGA Design
- Speech and Dialogue Systems
- Language, Discourse, Communication Strategies
University of Minnesota
2024
Peking University
2014-2024
Northwest Normal University
2024
Qingdao University
2022-2023
University of Essex
2023
Didi Chuxing (China)
2021
Nanchang Hangkong University
2014
The visual modality has recently attracted extensive attention in the fields of knowledge graphs and multimedia because much real-world knowledge is multi-modal in nature. However, it is currently unclear to what extent the visual modality can improve performance on knowledge graph tasks over unimodal models, and treating structural and visual features equally may encode too much irrelevant information from images. In this paper, we probe the utility of auxiliary visual context from a knowledge graph representation learning perspective by designing a Relation Sensitive Multi-modal Embedding model,...
Building an effective speech recognition system typically requires large amounts of transcribed data, which is expensive to collect. To overcome this problem, many unsupervised pretraining methods have been proposed. Among these methods, Masked Predictive Coding (MPC) achieved significant improvements on various datasets with a BERT-like reconstruction loss and a transformer backbone. However, many aspects of MPC have yet to be fully investigated. In this paper, we conduct a further study of MPC and focus on three...
We study the problem of multimodal embedding-based entity alignment (EA) between different knowledge graphs. Recent works have attempted to incorporate images (visual context) to address EA in a multimodal view. While the benefits of visual information have been observed, its negative impacts are non-negligible, as injecting images without constraints brings much noise. It also remains unknown under what circumstances or to what extent visual context is truly helpful to the task. In this work, we propose to learn representations from graph...
Knowledge graph (KG) representation learning, which aims to encode entities and relations into low-dimensional spaces, has been widely used in KG completion and link prediction. Although existing KG representation learning models have shown promising performance, the theoretical mechanism behind them is much less well understood. It is challenging to accurately portray the internal connections between models and to build a competitive model systematically. To overcome this problem, a unified framework, called GrpKG, is proposed in this paper from a generic groupoid...
Background: Infectious diseases persistently pose global threats, and it is imperative to accelerate the professionalization of the public health workforce. This study aimed to develop and validate an infectious disease control competency scale (IDCCS) for public health professionals to fill a theoretical gap and elevate practical capabilities by informing professionals’ development goals. Methods: The initial item pool was generated through a literature review and categorized into three dimensions (knowledge, skills,...
Prosody is a kind of cue that is critical to human speech perception and comprehension, so it is plausible to integrate prosodic information into machine speech recognition. However, as a result of its supra-segmental nature, it is hard to integrate prosodic information with conventional acoustic features. Recently, RNNLMs have been shown to be state-of-the-art language models in many tasks. We thus attempt to integrate prosodic information into RNNLMs to improve speech recognition performance based on a rescoring strategy. Firstly, three word-level prosodic features are extracted from speech and then passed to RNNLMs separately. Therefore...
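The rescoring strategy mentioned above can be illustrated with a minimal n-best rescoring sketch. This is not the authors' system: the abstract does not say whether n-best lists or lattices are rescored, and the names `rescore_nbest`, `toy_lm_score`, and `lm_weight` are hypothetical. The toy scorer merely stands in for a language model such as an RNNLM.

```python
def toy_lm_score(words, known=frozenset({"recognize", "speech"})):
    """Hypothetical stand-in for a language model: returns a log-score,
    -1.0 per known word and -5.0 per unknown word."""
    return sum(-1.0 if w in known else -5.0 for w in words)


def rescore_nbest(hypotheses, lm_score, lm_weight=0.5):
    """Pick the best hypothesis after adding a weighted LM score.

    hypotheses: list of (words, acoustic_log_score) pairs from a first pass.
    lm_score:   callable mapping a word list to a log-score.
    lm_weight:  interpolation weight for the LM score (an assumed value).
    """
    rescored = [
        (words, acoustic + lm_weight * lm_score(words))
        for words, acoustic in hypotheses
    ]
    # The hypothesis with the highest combined score wins.
    return max(rescored, key=lambda pair: pair[1])[0]


nbest = [
    (["wreck", "a", "nice", "beach"], -10.0),  # best acoustic score
    (["recognize", "speech"], -10.5),
]
best = rescore_nbest(nbest, toy_lm_score)
first_pass_best = rescore_nbest(nbest, toy_lm_score, lm_weight=0.0)
```

With the language model weight set to zero the first-pass ranking is unchanged, so the sketch makes it easy to see how much the second-pass score alters the decision.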
This paper explores the challenges posed by nominal adjectives (NAs) in natural language processing (NLP) tasks, particularly part-of-speech (POS) tagging. We propose treating NAs as a distinct POS tag, "JN," and investigate its impact on POS tagging, BIO chunking, and coreference resolution. Our study shows that reclassifying NAs can improve the accuracy of syntactic analysis and structural understanding in NLP. We present experimental results using Hidden Markov Models (HMMs), Maximum Entropy (MaxEnt) models,...
Federated learning (FL) is an efficient, scalable, and privacy-preserving technology in which clients collaborate on machine learning or deep learning model training. However, malicious clients can send poisoned updates to the central server without being identified, which makes FL vulnerable to backdoor attacks. In this work, we propose a novel defence approach, FLSec, to mitigate backdoor attacks caused by adversarial local updates. FLSec utilizes an original measurement, GradScore, computed from the loss gradient norm of the final layer of models...
Federated learning (FL) is an efficient and privacy-preserving technology which can be applied to 6G networks. However, FL is known to be vulnerable to model poisoning attacks, which hamper the accuracy of the aggregated model by sending malicious updates during the training process. While existing algorithms such as Byzantine-robust FL have been proposed to defend against targeted attacks, which misclassify samples with preset triggers, there are very few works on defending against untargeted attacks. In this work, we first present a unified...
Knowledge graph (KG) embedding aims to encode entities and relations into low-dimensional vector spaces, which, in turn, can support various machine learning models on KG-related tasks with good performance. However, existing methods for knowledge graph embedding fail to consider the influence of the embedding space, which makes them still unsatisfactory in practical applications. In this study, we try to improve the expressiveness of the embedding space from the perspective of the metric. Specifically, we first point out the implications of the Minkowski metric used in KG embedding and then make a...
Contrastive learning, a self-supervised learning method, has become one of the main techniques for visual representation learning. It builds contrastive views through data augmentation, maximizing the mutual information between views with the same semantic information. Currently, the methods used for data augmentation are random cropping, resizing, rotating, and recoloring images. However, due to the diversity and randomness of data augmentation, it is difficult to guarantee pairs of positive examples during augmentation. This limits the efficiency...
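The idea of pulling together two views of the same sample can be sketched with an InfoNCE-style loss, a common objective that lower-bounds the mutual information between views. The abstract does not name its loss, so the choice of InfoNCE, the cosine similarity, the function names, and the temperature value are all assumptions made for illustration. Embeddings are assumed to be non-zero vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(views_a, views_b, temperature=0.5):
    """Average InfoNCE loss over a batch.

    views_a[i] and views_b[i] are embeddings of two augmented views of
    sample i (the positive pair); views_b[j] for j != i act as negatives.
    """
    n = len(views_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(views_a[i], views_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Cross-entropy with the positive pair as the target class.
        total += log_denominator - logits[i]
    return total / n


anchors = [[1.0, 0.0], [0.0, 1.0]]
good_views = [[1.0, 0.0], [0.0, 1.0]]  # positives match their anchors
bad_views = [[0.0, 1.0], [1.0, 0.0]]   # positives are mismatched
loss_good = info_nce(anchors, good_views)
loss_bad = info_nce(anchors, bad_views)
```

A mismatched positive pair, such as one produced by an augmentation that destroys the shared semantics, yields a larger loss than a well-matched pair, which is the failure mode the abstract points to.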
This paper describes a query-based composition algorithm that can integrate an ARPA format language model in the unified WFST framework, which avoids the memory and time cost of converting language models to WFST and optimizing the language models. The proposed algorithm is applied to an on-the-fly one-pass decoder and a rescoring decoder. Both modified decoders require less memory during decoding on language models of different scales. Moreover, the proposed approach has nearly the same speed as the standard one, even when used to rescore lattices. Because of these advantages, large-scale language models can be applied to improve the performance of large...
The author investigates the impact of second generation wavelet transform preprocessing of spectrometer data on the accuracy of a Partial Least Squares (PLS) quantitative prediction model for navel orange sugar content and acidity. The paper collects spectral data from one hundred navel oranges using visible/near-infrared diffuse reflectance detection technology and establishes the PLS model using sixty oranges as the model-building samples. It then contrasts the changes in the model when the spectral data are processed by the second generation wavelet transform. Finally, the conclusion is that the processing can improve...