- Advanced Image Fusion Techniques
- Topic Modeling
- Image and Signal Denoising Methods
- Generative Adversarial Networks and Image Synthesis
- Natural Language Processing Techniques
- Advanced Image Processing Techniques
- Insurance, Mortality, Demography, Risk Management
- Advanced Image and Video Retrieval Techniques
- Image Processing Techniques and Applications
- Statistical Methods and Inference
- Video Surveillance and Tracking Methods
- Structural Health Monitoring Techniques
- Insurance and Financial Risk Management
- Financial Risk and Volatility Modeling
- Underwater Acoustics Research
- Sparse and Compressive Sensing Techniques
- Text and Document Classification Technologies
- Multimodal Machine Learning Applications
- Advanced Neural Network Applications
- Machine Learning and ELM
- Statistical and numerical algorithms
- Hydrology and Drought Analysis
- Medical Imaging Techniques and Applications
- Remote-Sensing Image Classification
- Advanced Control and Stabilization in Aerospace Systems
Nanjing University of Posts and Telecommunications
2021-2023
Hangzhou Dianzi University
2019
Peking University
2013-2017
Zhejiang University
2013-2017
Although the semi-supervised variational autoencoder (SemiVAE) works in image classification tasks, it fails in text classification if a vanilla LSTM is used as its decoder. From a reinforcement learning perspective, it is verified that the decoder's capability to distinguish between different categorical labels is essential. Therefore, the Semi-supervised Sequential Variational Autoencoder (SSVAE) is proposed, which increases this capability by feeding the label into the decoder RNN at each time-step. Two specific structures are investigated and...
Large language models (LLMs) have demonstrated remarkable performance on a variety of natural language tasks based on just a few examples of natural language instructions, reducing the need for extensive feature engineering. However, most powerful LLMs are closed-source or limited in their capability for languages other than English. In this technical report, we present Baichuan 2, a series of large-scale multilingual language models containing 7 billion and 13 billion parameters, trained from scratch on 2.6 trillion tokens. Baichuan 2 matches or outperforms other open-source...
In this paper, a unified optimization model for medical image fusion based on tensor decomposition and the non-subsampled shearlet transform (NSST) is proposed. The NSST method is used to fuse the high-frequency (HF) and low-frequency (LF) parts of two source images to obtain a mixed-frequency fused image. In general, we integrate this information from the perspective of tensor decomposition (TD) fusion. Due to structural differences between the representations, potential information loss may occur in the fused images. To address this issue, we introduce joint static and dynamic guidance...
In recent years, the rise of Large Language Models (LLMs) has spurred a growing demand for plug-and-play AI systems. Among the various AI techniques, prompt engineering stands out as particularly significant. However, users often face challenges in writing prompts due to the steep learning curve and significant time investment, and existing automatic prompt engineering (APE) models can be difficult to use. To address this issue, we propose PAS, an LLM-based plug-and-play APE system. PAS utilizes LLMs trained on high-quality, automatically...
Machine reading comprehension has been intensively studied in recent years, and neural network-based models have shown dominant performance. In this paper, we present the Sogou Machine Reading Comprehension (SMRC) toolkit, which can be used for the fast and efficient development of modern machine comprehension models, including both published models and original prototypes. To achieve this goal, the toolkit provides dataset readers, a flexible preprocessing pipeline, necessary neural network components, and built-in models, which make the whole process of data...
In the field of deep neural networks, several generative methods have been proposed to address challenges from generative and discriminative tasks, e.g., natural language processing and image caption generation. In this paper, a conditional recurrent variational autoencoder is proposed for multi-digit image synthesis. This model is capable of generating multi-digit images given number sequences while retaining the generalisation ability to recover different types of background. Our method is evaluated on the SVHN dataset, and experimental results show it succeeds...
Armored equipment plays a crucial role in the ground battlefield. The fast and accurate detection of enemy armored targets is significant for taking the initiative in the battlefield. Compared with general object detection and vehicle detection, armored target detection in a battlefield environment is more challenging due to the long observation distance and the complicated environment. In this paper, an accurate and robust automatic detection method is proposed to detect armored targets in the battlefield environment. Firstly, inspired by the Feature Pyramid Network (FPN), we propose a top-down aggregation (TDA) network which enhances shallow feature maps...
Conditional text generation is a fundamental task in natural language generation. Traditional conditional generative models build probability distributions over the given labels. However, categorical label information is usually very abstract, e.g., sentiment, and it is difficult to disentangle from the content. Therefore, instead of generating text by modeling the probability distribution, we propose a novel method, TextDream, which generates text through searching in the semantic space. Specifically, in this method, a random seed initially new...
An algorithm for track matching of excessive-emission vehicles based on network topology and weights (NTWMA) is proposed in this paper to resolve the trajectory matching problems of such vehicles. The topological structure is initially constructed using breadth-first traversal. Then, by using road network topology constraints to construct a set of adjacent candidate sections for matching, the sum of the distance, direction, and relative weight of each candidate section is considered as the solving condition. Finally, the optimal section sequence is calculated by the Dijkstra algorithm,...
Existing super-resolution (SR) models primarily focus on restoring local texture details, often neglecting the global semantic information within the scene. This oversight can lead to the omission of crucial semantic details or the introduction of inaccurate textures during the recovery process. In our work, we introduce the Cognitive Super-Resolution (CoSeR) framework, empowering SR models with the capacity to comprehend low-resolution images. We achieve this by marrying image appearance and language understanding to generate a cognitive...
For image super-resolution (SR), bridging the gap between performance on synthetic datasets and real-world degradation scenarios remains a challenge. This work introduces a novel "Low-Res Leads the Way" (LWay) training framework, merging Supervised Pre-training with Self-supervised Learning to enhance the adaptability of SR models to real-world images. Our approach utilizes a low-resolution (LR) reconstruction network to extract degradation embeddings from LR images, merging them with super-resolved outputs for LR reconstruction. Leveraging unseen...
The salient multimodal capabilities and interactive experience of GPT-4o highlight its critical role in practical applications, yet it lacks a high-performing open-source counterpart. In this paper, we introduce Baichuan-Omni, the first open-source 7B Multimodal Large Language Model (MLLM) adept at concurrently processing and analyzing the modalities of image, video, audio, and text, while delivering an advanced multimodal interactive experience and strong performance. We propose an effective multimodal training schema starting with a 7B model and proceeding through two stages...
Large Language Models (LLMs) have exhibited significant potential in performing diverse tasks, including the ability to call functions or use external tools to enhance their performance. While current research on function calling by LLMs primarily focuses on single-turn interactions, this paper addresses the overlooked necessity for LLMs to engage in multi-turn function calling--critical for handling compositional, real-world queries that require planning with functions but not only use functions. To facilitate this, we introduce an...
Generalization has long been a central challenge in real-world image restoration. While recent diffusion-based restoration methods, which leverage generative priors from text-to-image models, have made progress in recovering more realistic details, they still encounter "generative capability deactivation" when applied to out-of-distribution data. To address this, we propose using text as an auxiliary invariant representation to reactivate the generative capabilities of these models. We begin by identifying...
In this paper, the problem of whether the left tail and the right tail of a distribution share the same extreme value index (EVI) is addressed, and we propose two different test statistics. The first one is based on the joint asymptotic normality of the Hill estimators for the EVIs of both tails. We can therefore construct a quotient-type test statistic, which is χ²(1) distributed after some standardization. The second test statistic proposed in this paper is inspired by the two-sample empirical likelihood methodology, and we prove its non parametric version of Wilk’s...
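The Hill estimators that the first test builds on can be sketched as follows. This is a minimal illustration in Python with NumPy, not the paper's test procedure: the function names `hill_estimator` and `tail_evis` and the choice of `k` are assumptions made here, and neither the quotient-type statistic nor the empirical likelihood statistic is reproduced.

```python
import numpy as np


def hill_estimator(sample, k):
    """Hill estimate of the extreme value index from the k largest observations.

    `sample` must contain positive values in its upper tail; `k` is the number
    of upper order statistics used (a tuning choice, assumed here).
    """
    x = np.sort(np.asarray(sample, dtype=float))
    if not 0 < k < x.size:
        raise ValueError("k must satisfy 0 < k < len(sample)")
    threshold = x[-k - 1]  # the (k+1)-th largest observation
    if threshold <= 0:
        raise ValueError("the threshold order statistic must be positive")
    # Mean log-excess of the k largest observations over the threshold.
    return float(np.mean(np.log(x[-k:]) - np.log(threshold)))


def tail_evis(sample, k):
    """Return (left, right) Hill estimates for the two tails of a sample.

    The left tail is handled by negating the negative observations so that
    the same upper-tail estimator can be reused.
    """
    x = np.asarray(sample, dtype=float)
    right = hill_estimator(x[x > 0], k)
    left = hill_estimator(-x[x < 0], k)
    return left, right
```

Calling `tail_evis(sample, k)` on a sample with heavy tails on both sides returns the two estimates whose equality is the hypothesis in question; how they are standardized into a test statistic is left to the paper.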
In order to accurately predict the trajectory of mobile pollution sources such as motor vehicles in real time, a trajectory prediction method based on hybrid genetic particle swarm optimization and an optimized extreme learning machine (HGPSO-OELM) is proposed in this paper. The Optimized Extreme Learning Machine (OELM) avoids the disadvantage of the traditional extreme learning machine (ELM), which has poor generalization performance for small data sets. However, due to the random assignment of input weights, hidden layer node biases, and parameter groups, the accuracy...
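The random assignment of input weights and hidden layer biases mentioned above is the defining trait of a basic ELM, which can be sketched as follows. This is a plain ELM regressor in Python with NumPy, not the HGPSO-OELM method of the paper: the class name `ELMRegressor`, the tanh activation, the small ridge term, and all default values are assumptions chosen for illustration.

```python
import numpy as np


class ELMRegressor:
    """Basic extreme learning machine: random hidden layer, least-squares output."""

    def __init__(self, n_hidden=50, ridge=1e-6, seed=0):
        self.n_hidden = n_hidden
        self.ridge = ridge  # small regularizer for numerical stability (assumed)
        self.seed = seed

    def _hidden(self, X):
        # Hidden layer output with fixed, randomly drawn weights and biases.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        rng = np.random.default_rng(self.seed)
        # Input weights and biases are drawn once and never trained.
        self.W = rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Only the output weights are learned, via regularized least squares.
        A = H.T @ H + self.ridge * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ y)
        return self

    def predict(self, X):
        return self._hidden(np.asarray(X, dtype=float)) @ self.beta
```

Because `W` and `b` are never trained, the result depends on the random draw; that dependence is what an optimizer such as the one described in the abstract would try to reduce.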