Shaohui Lin

ORCID: 0000-0003-0284-9940
Research Areas
  • Advanced Neural Network Applications
  • Domain Adaptation and Few-Shot Learning
  • Human Pose and Action Recognition
  • Image and Signal Denoising Methods
  • Advanced Image Processing Techniques
  • Multimodal Machine Learning Applications
  • Advanced Vision and Imaging
  • Video Surveillance and Tracking Methods
  • Neural Networks and Applications
  • Anomaly Detection Techniques and Applications
  • Advanced Image and Video Retrieval Techniques
  • Adversarial Robustness in Machine Learning
  • Image Processing Techniques and Applications
  • Image Enhancement Techniques
  • Composite Material Mechanics
  • Medical Image Segmentation Techniques
  • Advanced Data Compression Techniques
  • Remote Sensing and LiDAR Applications
  • Gait Recognition and Analysis
  • Power Systems Fault Detection
  • Metabolomics and Mass Spectrometry Studies
  • Nutritional Studies and Diet
  • Radiomics and Machine Learning in Medical Imaging
  • Blind Source Separation Techniques
  • 3D Shape Modeling and Analysis

East China Normal University
2021-2025

Chongqing University of Science and Technology
2025

Shanghai Jiao Tong University
2022-2024

Shanghai Mental Health Center
2022-2024

Ministry of Education of the People's Republic of China
2023-2024

Yangtze University
2024

Shanghai Ninth People's Hospital
2022-2024

National University of Singapore
2019-2020

Xiamen University
2016-2019

Fuzhou University
2017

Single image dehazing is a challenging ill-posed problem due to the severe information degeneration. However, existing deep-learning-based methods only adopt clear images as positive samples to guide the training of the dehazing network, while negative information is unexploited. Moreover, most of them focus on strengthening the dehazing network with an increase in depth and width, leading to a significant requirement of computation and memory. In this paper, we propose a novel contrastive regularization (CR) built upon contrastive learning to exploit both the information of hazy images and clear images as negative and positive samples, respectively. CR...

10.1109/cvpr46437.2021.01041 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01
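The contrastive regularization described above pulls a restored image toward its clear counterpart while pushing it away from the hazy input in a shared feature space. A minimal numerical sketch of that idea (the function name, weighting, and L1 distances are illustrative assumptions, not the paper's code):

```python
import numpy as np

def contrastive_regularization(anchor_feats, positive_feats, negative_feats, weights=None):
    """Sketch of a contrastive regularization term: per feature layer, the
    ratio of the anchor's L1 distance to the clear (positive) image over its
    distance to the hazy (negative) input. Smaller is better."""
    n = len(anchor_feats)
    if weights is None:
        weights = [1.0 / n] * n          # equal weight per feature layer
    eps = 1e-8
    loss = 0.0
    for w, a, p, neg in zip(weights, anchor_feats, positive_feats, negative_feats):
        d_pos = np.abs(a - p).mean()     # pull toward the clear image
        d_neg = np.abs(a - neg).mean()   # push away from the hazy input
        loss += w * d_pos / (d_neg + eps)
    return loss
```

The loss reaches zero exactly when the restored features coincide with the clear image's features, and grows as they drift toward the hazy input.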

Structured pruning of filters or neurons has received increased focus for compressing convolutional neural networks. Most existing methods rely on multi-stage optimizations in a layer-wise manner for iterative pruning and retraining, which may not be optimal and may be computation intensive. Besides, these methods are designed for pruning a specific structure, such as filter or block structures, without jointly pruning heterogeneous structures. In this paper, we propose an effective structured pruning approach that jointly prunes filters as well as other structures in an end-to-end manner. To...

10.1109/cvpr.2019.00290 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Accelerating convolutional neural networks has recently received ever-increasing research focus. Among the various approaches proposed in the literature, filter pruning has been regarded as a promising solution, which is due to its advantage of significant speedup and memory reduction of both the network model and intermediate feature maps. To this end, most approaches tend to prune filters in a layer-wise, fixed manner, which is incapable of dynamically recovering previously removed filters, as well as jointly optimizing the pruned network across layers. In this paper,...

10.24963/ijcai.2018/336 article EN 2018-07-01
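The abstract above contrasts layer-wise fixed pruning with a global, dynamic scheme. A toy sketch of the global part (hypothetical helper, not the paper's algorithm): filters from all layers are ranked jointly by L2 norm, and the mask is recomputed from the current weights on each call, so a filter zeroed in one step can recover later if its weights grow back:

```python
import numpy as np

def global_filter_masks(layers, prune_ratio):
    """Rank filters of all layers jointly by L2 norm and mask out the weakest
    fraction. `layers` is a list of arrays shaped (out_filters, ...)."""
    norms = []  # (layer_index, filter_index, importance)
    for li, w in enumerate(layers):
        flat = w.reshape(w.shape[0], -1)
        for fi, imp in enumerate(np.linalg.norm(flat, axis=1)):
            norms.append((li, fi, imp))
    norms.sort(key=lambda t: t[2])                 # weakest filters first
    n_prune = int(len(norms) * prune_ratio)
    masks = [np.ones(w.shape[0], dtype=bool) for w in layers]
    for li, fi, _ in norms[:n_prune]:
        masks[li][fi] = False
    return masks
```

Because ranking is global, a layer full of weak filters loses more of them than a layer of strong ones, instead of every layer losing the same fixed fraction.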

The success of convolutional neural networks (CNNs) in computer vision applications has been accompanied by a significant increase in computation and memory costs, which prohibits their usage in resource-limited environments, such as mobile systems or embedded devices. To this end, research on CNN compression has recently become emerging. In this paper, we propose a novel filter pruning scheme, termed structured sparsity regularization (SSR), to simultaneously speed up the computation and reduce the memory overhead of CNNs, which can be well...

10.1109/tnnls.2019.2906563 article EN IEEE Transactions on Neural Networks and Learning Systems 2019-05-21
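Structured-sparsity regularizers of the kind named above typically penalize whole filters so that entire filters are driven to zero together. A minimal sketch assuming a group-Lasso (L2,1) penalty with its proximal shrinkage step; the SSR paper's exact regularizers differ:

```python
import numpy as np

def group_lasso_penalty(weight, lam=1e-3):
    """L2,1 penalty over whole filters: lam * sum of per-filter L2 norms."""
    flat = weight.reshape(weight.shape[0], -1)   # one row per filter
    return lam * np.linalg.norm(flat, axis=1).sum()

def prox_group_lasso(weight, step_lam):
    """Proximal step: shrink each filter's norm by step_lam; filters whose
    norm falls below step_lam are zeroed entirely (structured sparsity)."""
    flat = weight.reshape(weight.shape[0], -1)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    scale = np.maximum(1.0 - step_lam / np.maximum(norms, 1e-12), 0.0)
    return (flat * scale).reshape(weight.shape)
```

Applied inside a training loop, the proximal step removes weak filters while only slightly shrinking strong ones.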

Convolutional neural networks (CNNs) have achieved remarkable success in various computer vision tasks, and are extremely powerful in dealing with massive training data by using tens of millions of parameters. However, CNNs often incur significant memory and computation consumption, which prohibits their usage in resource-limited environments such as mobile or embedded devices. To address the above issues, existing approaches typically focus on either accelerating the convolutional layers or compressing...

10.1109/tpami.2018.2873305 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2018-10-01
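One standard route to the compression discussed above is low-rank decomposition: a dense weight matrix is replaced by two thin factors, cutting parameters from out·in to rank·(out+in). A generic truncated-SVD sketch, not the paper's specific scheme:

```python
import numpy as np

def low_rank_factorize(W, rank):
    """Approximate W (out x in) by thin factors A (out x rank) and
    B (rank x in), so one dense layer becomes two smaller ones."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # absorb singular values into the left factor
    B = Vt[:rank]
    return A, B
```

If W is (approximately) low rank, the factorization is (near-)exact while storing far fewer parameters; otherwise the truncation error is the sum of the discarded singular values' energy.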

Compressing convolutional neural networks (CNNs) has received ever-increasing research focus. However, most existing CNN compression methods do not interpret their inherent structures to distinguish the implicit redundancy. In this paper, we investigate the problem of CNN compression from a novel interpretable perspective. The relationship between the input feature maps and 2D kernels is revealed in a theoretical framework, based on which a kernel sparsity and entropy (KSE) indicator is proposed to quantitate the feature-map importance...

10.1109/cvpr.2019.00291 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01
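The KSE indicator scores each input feature map from the 2D kernels that consume it, combining a sparsity (magnitude) term with an entropy (diversity) term. A deliberately simplified stand-in to illustrate the idea; the paper's exact formula (density-based kernel entropy) differs:

```python
import numpy as np

def kse_indicator(kernels, alpha=1.0):
    """Simplified kernel-sparsity-and-entropy style score for one input
    channel: total kernel L1 magnitude, discounted by the entropy of the
    kernel-norm distribution. Higher means the channel matters more."""
    norms = np.array([np.abs(k).sum() for k in kernels])   # L1 per 2D kernel
    sparsity = norms.sum()
    p = norms / max(norms.sum(), 1e-12)                    # normalized weights
    entropy = -np.sum(p * np.log(np.maximum(p, 1e-12)))
    return np.sqrt(sparsity / (1.0 + alpha * entropy))
```

A channel whose kernels are all near zero scores near zero and is a natural removal candidate; channels feeding large, concentrated kernels score highest.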

The Information Bottleneck (IB) provides an information-theoretic principle for representation learning, by retaining all information relevant for predicting the label while minimizing the redundancy. Though IB has been applied to a wide range of applications, its optimization remains a challenging problem which heavily relies on the accurate estimation of mutual information. In this paper, we present a new strategy, Variational Self-Distillation (VSD), which provides a scalable, flexible and analytic solution to essentially fitting the mutual information but...

10.1109/cvpr46437.2021.00157 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Channel pruning and tensor decomposition have received extensive attention in convolutional neural network compression. However, these two techniques are traditionally deployed in an isolated manner, leading to a significant accuracy drop when pursuing high compression rates. In this paper, we propose a Collaborative Compression (CC) scheme, which joints channel pruning and tensor decomposition to compress CNN models by simultaneously learning the model sparsity and low-rankness. Specifically, we first investigate the compression sensitivity of each...

10.1109/cvpr46437.2021.00637 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01
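The sensitivity analysis the abstract mentions can be illustrated on a toy network: compress one layer at a time (here by rank truncation) and measure how far the network output moves; layers that barely react are safer compression targets. Hypothetical helper names, not the CC algorithm itself:

```python
import numpy as np

def layer_sensitivity(weights, x, rank):
    """For each layer of a tiny ReLU MLP, temporarily replace its weight by a
    rank-truncated copy and return the output deviation it causes."""
    def forward(ws, inp):
        h = inp
        for w in ws:
            h = np.maximum(w @ h, 0.0)
        return h

    base = forward(weights, x)
    sens = []
    for i, w in enumerate(weights):
        U, S, Vt = np.linalg.svd(w, full_matrices=False)
        w_low = (U[:, :rank] * S[:rank]) @ Vt[:rank]   # rank-truncated weight
        trial = list(weights)
        trial[i] = w_low
        sens.append(float(np.linalg.norm(forward(trial, x) - base)))
    return sens
```

A layer that is already low rank shows (numerically) zero sensitivity, while a full-rank layer shows a large deviation when truncated to the same rank.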

Video anomaly detection aims to automatically identify unusual objects or behaviours by learning from normal videos. Previous methods tend to use simplistic reconstruction or prediction constraints, which leads to the insufficiency of learned representations for normal data. As such, we propose a novel bi-directional architecture with three consistency constraints to comprehensively regularize the prediction task at the pixel-wise, cross-modal, and temporal-sequence levels. First, predictive consistency is proposed to consider the symmetry property...

10.1609/aaai.v36i1.19898 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2022-06-28
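Detectors in this family ultimately score each frame by how poorly the model reconstructs or predicts it. A minimal stand-in for that scoring stage (names and min-max normalization are assumptions; the paper's scoring is richer):

```python
import numpy as np

def anomaly_scores(frames, recon_frames):
    """Per-frame anomaly score as min-max normalized reconstruction error:
    frames the model reconstructs poorly score close to 1."""
    errs = np.array([np.mean((f - r) ** 2) for f, r in zip(frames, recon_frames)])
    lo, hi = errs.min(), errs.max()
    return (errs - lo) / max(hi - lo, 1e-12)
```

Thresholding the normalized score then flags anomalous frames, since the model was trained only on normal videos and reconstructs them well.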

U-Nets have achieved tremendous success in medical image segmentation. Nevertheless, they may suffer from limitations in global (long-range) contextual interactions and edge-detail preservation. In contrast, the Transformer module has an excellent ability to capture long-range dependencies by leveraging the self-attention mechanism in the encoder. Although the Transformer module was born to model long-range dependency on the extracted feature maps, it still suffers from high computational and spatial complexities in processing high-resolution 3D feature maps. This motivates us...

10.1109/tmi.2023.3264433 article EN IEEE Transactions on Medical Imaging 2023-04-05

Cross-modality fusion of complementary information from different modalities effectively improves object detection performance, making it more useful and robust for a wider range of applications. Existing fusion strategies combine different types of images or merge different backbone features through elaborated neural network modules. However, these methods neglect that modality disparities affect cross-modality fusion performance, as images from different modalities with different camera focal lengths, placements, and angles are hardly fused. In this paper, we investigate this problem by...

10.48550/arxiv.2404.09146 preprint EN arXiv (Cornell University) 2024-04-14

10.1016/j.ijepes.2012.12.005 article EN International Journal of Electrical Power & Energy Systems 2013-02-26

Convolutional neural networks (CNNs) are highly successful for super-resolution (SR) but often require sophisticated architectures with heavy memory cost and computational overhead, which significantly restricts their practical deployment on resource-limited devices. In this paper, we propose a novel contrastive self-distillation (CSD) framework to simultaneously compress and accelerate various off-the-shelf SR models. In particular, a channel-splitting super-resolution network can first be constructed from the target teacher...

10.24963/ijcai.2021/155 article EN 2021-08-01
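The channel-splitting construction above can be pictured as carving a student out of the teacher's own weights: keep only the first fraction of channels in every layer, so the student shares parameters with (is a view of) the teacher. A toy sketch over plain weight matrices (the function and ratio are illustrative assumptions):

```python
import numpy as np

def split_student(teacher_layers, keep_ratio=0.5):
    """Build a 'student' by slicing the first keep_ratio of channels from
    every teacher weight matrix (shape (out, in)). The first layer keeps its
    full input size and the last keeps its full output size, so the student
    accepts and produces the same shapes as the teacher."""
    student = []
    n = len(teacher_layers)
    for i, w in enumerate(teacher_layers):
        out_keep = w.shape[0] if i == n - 1 else max(1, int(w.shape[0] * keep_ratio))
        in_keep = w.shape[1] if i == 0 else max(1, int(w.shape[1] * keep_ratio))
        student.append(w[:out_keep, :in_keep])   # a view into the teacher
    return student
```

Because slicing returns views, training the student also updates the corresponding teacher weights, which is what makes self-distillation between the two cheap.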

Continual learning aims to enable a model to incrementally learn knowledge from sequentially arrived data. Previous works adopt the conventional classification architecture, which consists of a feature extractor and a classifier. The feature extractor is shared across tasks or classes, but one specific group of weights in the classifier corresponding to each new class should be expanded. Consequently, the parameters of a continual learner gradually increase. Moreover, as the classifier contains all historical classes, a certain size of memory is usually required to store...

10.1109/cvpr52729.2023.00356 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01
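The conventional architecture the abstract describes, where the classifier grows one weight group per new class over a shared feature extractor, can be sketched in a few lines (the class name and initialization are assumptions for illustration):

```python
import numpy as np

class ExpandableClassifier:
    """Classifier head whose weight matrix grows by one row per new class,
    while the feature extractor (not modeled here) stays shared across tasks."""

    def __init__(self, feat_dim):
        self.feat_dim = feat_dim
        self.W = np.zeros((0, feat_dim))   # starts with no classes

    def add_class(self, init=None):
        row = np.zeros((1, self.feat_dim)) if init is None else init.reshape(1, -1)
        self.W = np.vstack([self.W, row])  # one new weight group per class

    def predict(self, feats):
        return int(np.argmax(self.W @ feats))
```

This makes concrete why parameters grow with the class count, the issue the paper sets out to address.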

Structured pruning of filters or neurons has received increased focus for compressing convolutional neural networks. Most existing methods rely on multi-stage optimizations in a layer-wise manner for iterative pruning and retraining, which may not be optimal and may be computation intensive. Besides, these methods are designed for pruning a specific structure, such as filter or block structures, without jointly pruning heterogeneous structures. In this paper, we propose an effective structured pruning approach that jointly prunes filters as well as other structures in an end-to-end manner. To...

10.48550/arxiv.1903.09291 preprint EN other-oa arXiv (Cornell University) 2019-01-01

The surge of interest towards Multi-modal Large Language Models (MLLMs), e.g., GPT-4V(ision) from OpenAI, has marked a significant trend in both academia and industry. They endow Large Language Models (LLMs) with powerful capabilities in visual understanding, enabling them to tackle diverse multi-modal tasks. Very recently, Google released Gemini, its newest and most capable MLLM, built from the ground up for multi-modality. In light of its superior reasoning capabilities, can Gemini challenge GPT-4V's leading position in multi-modal learning?...

10.48550/arxiv.2312.12436 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Despite weakly supervised object detection (WSOD) being a promising step toward evading strong instance-level annotations, its capability is confined to closed-set categories within a single training dataset. In this paper, we propose a novel weakly supervised open-vocabulary object detection framework, namely WSOVOD, to extend traditional WSOD to detect novel concepts and utilize diverse datasets with only image-level annotations. To achieve this, we explore three vital strategies, including dataset-level feature adaptation, salient...

10.1609/aaai.v38i4.28127 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24