- Human Pose and Action Recognition
- Image Enhancement Techniques
- Anomaly Detection Techniques and Applications
- Image and Signal Denoising Methods
- Video Surveillance and Tracking Methods
- Time Series Analysis and Forecasting
- Advanced Image Processing Techniques
- Stock Market Forecasting Methods
- Underwater Acoustics Research
- Multimodal Machine Learning Applications
- Advanced Vision and Imaging
- Underwater Vehicles and Communication Systems
- Tea Polyphenols and Effects
- Gait Recognition and Analysis
- Microbial Metabolism and Applications
- Fermentation and Sensory Analysis
- Machine Learning in Materials Science
- Forecasting Techniques and Applications
- Computational Drug Discovery Methods
- Advanced Neural Network Applications
- Data Visualization and Analytics
- Biomedical Text Mining and Ontologies
Institute of Software
2022-2023
University of Chinese Academy of Sciences
2022-2023
Chinese Academy of Sciences
2023
In long-term time series forecasting, most Transformer-based methods adopt the standard point-wise attention mechanism, which not only has high complexity but also cannot explicitly capture predictive dependencies from contexts, since the corresponding key and value are transformed from the same point. This paper proposes a model called Preformer. Preformer introduces a novel, efficient Multi-Scale Segment-Correlation mechanism that divides the time series into segments and utilizes segment-wise correlation-based attention to replace...
Although artificial intelligence (AI) has made significant progress in understanding molecules in a wide range of fields, existing models generally acquire a single cognitive ability from a single molecular modality. Since the hierarchy of molecular knowledge is profound, even humans learn from different modalities, including both intuitive diagrams and professional texts, to assist their understanding. Inspired by this, we propose a multimodal foundation model which is pretrained on molecular graphs and their semantically related textual data (crawled...
Recognizing and segmenting actions from long videos is a challenging problem. Most existing methods focus on designing temporal convolutional models. However, these models are limited in their flexibility and their ability to model long-term dependencies. Transformers have recently been used in various tasks, but the lack of inductive bias and the inefficiency of handling video sequences limit their application to action segmentation. In this paper, we present a pure Transformer-based model without convolutions for action segmentation, called...
Underwater video enhancement (UVE) aims to improve the visibility and frame quality of underwater videos, which has significant implications for marine research and exploration. However, existing methods primarily focus on developing image enhancement algorithms that enhance each frame independently. There is a lack of supervised datasets and models specifically tailored to UVE tasks. To fill this gap, we construct the Synthetic Underwater Video Enhancement (SUVE) dataset, comprising 840 diverse underwater-style videos paired with...
Due to the selective absorption and scattering of light by diverse aquatic media, underwater images usually suffer from various visual degradations. Existing underwater image enhancement (UIE) approaches that combine physical imaging models with neural networks often fail to accurately estimate imaging model parameters such as depth and veiling light, resulting in poor performance in certain scenarios. To address this issue, we propose a physical model-guided framework for jointly training a Deep Degradation Model (DDM) with any...
Underwater image enhancement (UIE) aims to generate clear images from low-quality underwater images. Due to the unavailability of clear reference images, researchers often synthesize them to construct paired datasets for training deep models. However, these synthesized images may sometimes lack quality, adversely affecting training outcomes. To address this issue, we propose UIE with Diffusion Prior (UIEDP), a novel framework treating UIE as a posterior distribution sampling process of clear images conditioned on degraded underwater inputs....
Transformer-based methods have shown great potential in long-term time series forecasting. However, most of these methods adopt the standard point-wise self-attention mechanism, which not only becomes intractable for long-term forecasting, since its complexity increases quadratically with the length of the time series, but also cannot explicitly capture predictive dependencies from contexts, since the corresponding key and value are transformed from the same point. This paper proposes a model called Preformer. Preformer introduces a novel...
Video action segmentation under timestamp supervision has recently received much attention due to lower annotation costs. Most existing methods generate pseudo-labels for all frames in each video to train the model. However, these methods suffer from incorrect pseudo-labels, especially in the semantically unclear transition region between two consecutive actions, which we call ambiguous intervals. To address this issue, we propose a novel framework from the perspective of clustering, which includes the following parts. First,...
Action classification has made great progress, but segmenting and recognizing actions from long untrimmed videos remains a challenging problem. Most state-of-the-art methods focus on designing temporal convolution-based models, but the inflexibility of temporal convolutions and the difficulties in modeling long-term dependencies restrict the potential of these models. Transformer-based models with adaptable sequence modeling capabilities have recently been used in various tasks. However, the lack of inductive bias and the inefficiency of handling...
Action classification has made great progress, but segmenting and recognizing actions from long videos remains a challenging problem. Recently, Transformer-based models with strong sequence modeling ability have succeeded in many sequence modeling tasks. However, the lack of inductive bias and the difficulty of handling long video sequences limit the application of Transformers to the action segmentation task. In order to explore the potential of Transformers in this task, we replace some specific linear layers in the vanilla Transformer with dilated temporal convolution,...