- Topic Modeling
- Natural Language Processing Techniques
- Multimodal Machine Learning Applications
- Time Series Analysis and Forecasting
- Domain Adaptation and Few-Shot Learning
- Text and Document Classification Technologies
- Neural Networks and Applications
- Stock Market Forecasting Methods
- Advanced Text Analysis Techniques
- Web Data Mining and Analysis
- Data Quality and Management
- Advanced Computational Techniques and Applications
- Handwritten Text Recognition Techniques
- Antenna Design and Optimization
- Advanced Graph Neural Networks
- Advanced Adaptive Filtering Techniques
- Industrial Vision Systems and Defect Detection
- Speech and Audio Processing
- Indoor and Outdoor Localization Technologies
- Music and Audio Processing
- Speech Recognition and Synthesis
- Recommender Systems and Techniques
- Virtual Reality Applications and Impacts
- Advanced Decision-Making Techniques
- Advanced Wireless Communication Techniques
Stevens Institute of Technology
2024
North China University of Technology
2019-2024
China National Institute of Standardization
2023-2024
Louisiana State University
2024
Chongqing University
2024
Hanyang University
2024
Huawei Technologies (China)
2017-2024
Anhui Polytechnic University
2023
China Electronic Information Industry Development
2023
Shanghai University of Engineering Science
2022
Scene text retrieval aims to localize and search all text instances from an image gallery that are the same as or similar to a given query text. Such a task is usually realized by matching a query text to the recognized words outputted by an end-to-end scene text spotter. In this paper, we address this problem by directly learning a cross-modal similarity between a query text and each text instance from natural images. Specifically, we establish an end-to-end trainable network, jointly optimizing the procedures of scene text detection and cross-modal similarity learning. In this way, scene text retrieval can be simply performed by ranking the detected...
Deep reinforcement learning (DRL) has gained immense success in many applications, including gaming AI, robotics, and system scheduling. Distributed algorithms and architectures have been widely proposed (e.g., the actor-learner architecture) to accelerate DRL training with large-scale server-based clusters. However, training on-policy algorithms with the actor-learner architecture unavoidably induces resource wasting due to synchronization between learners and actors, thus resulting in significant extra billing. As a promising alternative,...
Since its introduction, the transformer has shifted the development trajectory away from traditional models (e.g., RNN, MLP) in time series forecasting, which is attributed to its ability to capture global dependencies within temporal tokens. Follow-up studies have largely involved altering the tokenization and self-attention modules to better adapt Transformers for addressing special challenges like non-stationarity, channel-wise dependency, and variable correlation in time series. However, we found that the expressive...
Many important data mining problems can be modeled as learning a (bidirectional) multidimensional mapping between two domains. Based on generative adversarial networks (GANs), particularly conditional ones, cross-domain joint distribution matching is an increasingly popular class of methods for addressing such problems. Though significant advances have been achieved, there are still main disadvantages of the existing models, i.e., the requirement of a large amount of paired training samples and the notorious...
Chinese spelling correction (CSC) constitutes a pivotal and enduring goal in natural language processing, serving as a foundational element for various language-related tasks by detecting and rectifying spelling errors in textual content. Numerous methods leverage multimodal information, including character, character sound, and character shape, to establish connections between incorrect and correct characters. Research indicates that the majority of spelling errors stem from pinyin similarity, with shape similarity accounting for half of the errors....
Fifth-generation (5G) wireless communication is useful for positioning due to its large bandwidth and low cost. However, the presence of obstacles that block the line-of-sight (LOS) path between devices would affect localization accuracy severely. In this paper, we propose an online learning approach to mitigate the ranging error directly in non-line-of-sight (NLOS) channels. The distribution of the NLOS ranging error is learned from the received raw signals, where a network with a neural processes regressor (NPR) is utilized to learn...
The extant event detection models, which rely on dependency parsing, have exhibited commendable efficacy. However, for some long sentences with more words, the results of dependency parsing are more complex, because each word corresponds to a directed edge with a label. These edges do not all provide guidance for the event detection model, and the accuracy of dependency parsing tools decreases with the increase in sentence length, resulting in error propagation. To solve these problems, we developed an event detection model that uses a self-constructed graph convolution network. First,...
With the rapid development of big data, artificial intelligence, and Internet technologies, human–human contact and human–machine interaction have led to an explosion of voice data. Rapidly identifying the speaker’s identity and retrieving and managing their speech data among the massive amount of voice data have become major challenges for intelligent speech applications in the field of information security. This research proposes a vocal recognition technique based on adversarial training for speaker audio and video data, as well as for speaker identification when oriented...
Fuzzy Kohonen clustering networks (FKCN) are well known for clustering analysis (unsupervised learning and self-organizing). The FKCN algorithm is a set of iterative procedures that suffer from some major problems; for example, its convergence rate is slow on large datasets. To overcome these defects, an efficient fuzzy Kohonen clustering network is proposed in this paper, which can significantly reduce the computation time required to partition a dataset into the desired clusters. By introducing threshold values...
Deep neural networks, including transformers and convolutional neural networks, have significantly improved multivariate time series classification (MTSC). However, these methods often rely on supervised learning, which does not fully account for the sparsity and locality of patterns in time series data (e.g., disease-related anomalous points in ECG). To address this challenge, we formally reformulate MTSC as a weakly supervised problem, introducing a novel multiple-instance learning (MIL) framework for better localization of patterns of interest and modeling...
The Metaverse is a mixed reality environment that combines the virtual and real worlds, a concept originating in the 1980s. Today, the Metaverse has become a hot topic, attracting a large influx of funding and talent. It is considered the next generation of the internet, with wide-ranging applications in areas such as gaming, social media, healthcare, education, tourism, and retail. In the future, the Metaverse is expected to become a significant economic industry, completely changing people's production and lifestyle, much like the internet did 20 years ago.
Argument Component Boundary Detection (ACBD) is an important sub-task in argumentation mining; it aims at identifying the word sequences that constitute argument components, and is usually considered the first sub-task in the argumentation mining pipeline. Existing ACBD methods heavily depend on task-specific knowledge and require considerable human effort on feature engineering. To tackle these problems, in this work, we formulate ACBD as a sequence labeling problem and propose a variety of Recurrent Neural Network (RNN) based methods, which do...
Clinical time series are known for irregular, highly sporadic, and strongly complex structures, and are consequently difficult to model with traditional state-space models. In this paper, we investigate the potential of applying variational recurrent neural networks (VRNNs) to forecasting clinical time series extracted from the electronic health records (EHRs) of patients. Variational recurrent neural networks combine recurrent neural networks (RNNs) and variational inference (VI), and are state-of-the-art methods for modeling highly variable sequential data such as text, speech, and multimedia signals in a...
Spelling error detection serves as a crucial preprocessing step in many natural language processing applications. Unlike English, where every single word is directly typed by keyboard, we have to use an input method to input Chinese characters. The pinyin input method is the most widely used. By intuition, pinyin should be helpful in detecting spelling errors. However, when detecting spelling errors, most of the current methods ignore pinyin information and adopt a pipeline framework that leads to error propagation. In this article, we propose a fusion lattice-LSTM model...
Feature selection, in which the most informative variables are selected for model generation, is an important step in pattern recognition. Here, one often tries to optimize multiple criteria such as the discriminating power of the descriptor, the performance of the model, and the cardinality of a feature subset. In this paper, we propose a fuzzy criterion for multi-objective unsupervised feature selection by applying a hybridized filter-wrapper approach (FC-MOFS). These formulations allow an efficient way to pick features from a pool and to avoid misunderstanding...
Compared with the traditional few-shot task, few-shot none-of-the-above (NOTA) relation classification focuses on the realistic scenario of few-shot learning, in which a test instance might not belong to any of the target categories. This undoubtedly increases the task’s difficulty because, given only a few support samples, the model cannot represent the distribution of NOTA categories in the embedding space. The model needs to make full use of the syntactic information and word meaning information learned in the pre-training stage to distinguish the NOTA category from the support sample categories in the embedding space. However,...
In a social network, similar users are assumed to prefer similar items, so searching for the similar users of a target user plays an important role in most collaborative filtering methods. Existing methods use the ratings of items to search for similar users. Nowadays, abundant social information is produced by the Internet, such as user profiles, social relationships, behaviors, interests, and so on. Using only ratings is not sufficient to recommend wanted items. In this paper, we propose a new method based on information fusion. Our method first uses information fusion to search for similar users and then updates the ratings of items for recommendation. Experiments show that...
The similarity of words extracted from a rich text relation network is the main way to calculate semantic similarity. Complex relational information and text content in Wikipedia, Community Question Answering, and social networks provide an abundant corpus for semantic similarity calculation. However, most typical research has focused on only a single relationship. In this paper, we propose a semantic similarity calculation model which integrates multiple relational information and maps the multiple relationships to the same semantic space through learning a representing matrix to improve...
Previous work has demonstrated that end-to-end neural sequence models work well for document-level event role filler extraction. However, the end-to-end neural network model suffers from the problem of not being able to utilize global information, resulting in incomplete extraction of event arguments. This is because the inputs to the BiLSTM are all single-word vectors with no input of contextual information. This phenomenon is particularly pronounced at the document level. To address this problem, we propose key-value memory networks to enhance and...
Open-domain event extraction is a fundamental task that aims to extract non-predefined types of events from news clusters. Some researchers have noticed that its performance can be enhanced by improving dependency relationships. Recently, graph convolutional networks (GCNs) have been widely used to integrate syntactic information into neural networks. However, they usually introduce noise and deteriorate the generalization. To tackle this issue, we propose using Bi-LSTM to obtain semantic...
With the rapid development of big data, artificial intelligence, and Internet technologies, human-human contact and human-machine interaction have produced an explosive growth of voice data. Rapidly identifying the speaker's identity and retrieving and managing his or her speech data in the massive amount of voice data has become a major challenge for intelligent speech applications in the field of information security. This research proposes a vocal recognition technique based on adversarial training for speaker audio and video, and identification...