Jianbing Shen

ORCID: 0000-0003-2656-3082
Research Areas
  • Advanced Neural Network Applications
  • Domain Adaptation and Few-Shot Learning
  • Video Surveillance and Tracking Methods
  • Human Pose and Action Recognition
  • Visual Attention and Saliency Detection
  • Advanced Image and Video Retrieval Techniques
  • Multimodal Machine Learning Applications
  • Retinal Imaging and Analysis
  • 3D Shape Modeling and Analysis
  • Anomaly Detection Techniques and Applications
  • Adversarial Robustness in Machine Learning
  • Face Recognition and Analysis
  • Advanced Vision and Imaging
  • Remote Sensing and LiDAR Applications
  • Image Enhancement Techniques
  • Radiomics and Machine Learning in Medical Imaging
  • Image and Video Quality Assessment
  • Generative Adversarial Networks and Image Synthesis
  • COVID-19 Diagnosis Using AI
  • Advanced Image Processing Techniques
  • Olfactory and Sensory Function Studies
  • 3D Surveying and Cultural Heritage
  • Autonomous Vehicle Technology and Safety
  • Brain Tumor Detection and Classification
  • Imbalanced Data Classification Techniques

Affiliations

Beijing Institute of Technology
2011-2024

University of Macau
2021-2024

City University of Macau
2021-2024

Inception Institute of Artificial Intelligence
2019-2021

ETH Zurich
2020-2021

Beijing Academy of Artificial Intelligence
2020

Zhejiang University of Technology
2002-2004

Publications

Coronavirus Disease 2019 (COVID-19) spread globally in early 2020, causing the world to face an existential health crisis. Automated detection of lung infections from computed tomography (CT) images offers great potential to augment traditional healthcare strategies for tackling COVID-19. However, segmenting infected regions from CT slices faces several challenges, including high variation in infection characteristics and low intensity contrast between infections and normal tissues. Further, collecting a large amount...

10.1109/tmi.2020.2996645 article EN IEEE Transactions on Medical Imaging 2020-05-22

As an essential problem in computer vision, salient object detection (SOD) has attracted an increasing amount of research attention over the years. Recent advances in SOD are predominantly led by deep learning-based solutions (named deep SOD). To enable an in-depth understanding of deep SOD, in this paper we provide a comprehensive survey covering various aspects, ranging from algorithm taxonomy to unsolved issues. In particular, we first review deep SOD algorithms from different perspectives, including network architecture, level...

10.1109/tpami.2021.3051099 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2021-01-13

We present a comprehensive study on a new task named camouflaged object detection (COD), which aims to identify objects that are "seamlessly" embedded in their surroundings. The high intrinsic similarities between the target and the background make COD far more challenging than the traditional object detection task. To address this issue, we elaborately collect a novel dataset, called COD10K, which comprises 10,000 images covering various natural scenes, over 78 object categories. All images are densely annotated with category, bounding-box,...

10.1109/cvpr42600.2020.00285 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

This paper presents a new method for detecting salient objects in images using convolutional neural networks (CNNs). The proposed network, named PAGE-Net, offers two key contributions. The first is the exploitation of an essential pyramid attention structure for salient object detection. It enables the network to concentrate more on salient regions while considering multi-scale saliency information. Such a stacked design provides a powerful tool to efficiently improve the representation ability of the corresponding layer with an enlarged...

10.1109/cvpr.2019.00154 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

This work proposes a novel attentive graph neural network (AGNN) for zero-shot video object segmentation (ZVOS). The suggested AGNN recasts this task as a process of iterative information fusion over video graphs. Specifically, AGNN builds a fully connected graph to efficiently represent frames as nodes, and relations between arbitrary frame pairs as edges. The underlying pair-wise relations are described by a differentiable attention mechanism. Through parametric message passing, AGNN is able to capture and mine much richer and higher-order relations between frames,...

10.1109/iccv.2019.00933 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01
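The core mechanism described above, attentive message passing over a fully connected frame graph, can be sketched in a few lines. This is an illustrative toy version, not the authors' implementation: the node features, weight matrix, and residual fusion step are all simplifying assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attentive_message_passing(nodes, W):
    """nodes: (N, D) frame embeddings; W: (D, D) attention weights.
    Edge score e_ij = nodes_i @ W @ nodes_j^T (differentiable attention);
    each node then aggregates all others weighted by softmaxed scores."""
    scores = nodes @ W @ nodes.T            # (N, N) pair-wise relations
    attn = softmax(scores, axis=1)          # row-normalized edge weights
    messages = attn @ nodes                 # weighted sum of neighbor features
    return nodes + messages                 # residual fusion update

rng = np.random.default_rng(0)
nodes = rng.standard_normal((4, 8))         # 4 frames, 8-dim features
W = rng.standard_normal((8, 8)) * 0.1
updated = attentive_message_passing(nodes, W)
print(updated.shape)                        # (4, 8)
```

Iterating this update several times is what lets information from every frame reach every other frame, which is the "iterative information fusion" the abstract refers to.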

Magnetic resonance imaging (MRI) is a widely used neuroimaging technique that can provide images of different contrasts (i.e., modalities). Fusing this multi-modal data has proven particularly effective for boosting model performance in many tasks. However, due to poor data quality and frequent patient dropout, collecting all modalities for every patient remains a challenge. Medical image synthesis has been proposed as an effective solution, where any missing modalities are synthesized from the existing ones. In this paper, we propose...

10.1109/tmi.2020.2975344 article EN IEEE Transactions on Medical Imaging 2020-02-20

Predicting where people look in static scenes, a.k.a. visual saliency, has received significant research interest recently. However, relatively less effort has been spent on understanding and modeling visual attention over dynamic scenes. This work makes three contributions to video saliency research. First, we introduce a new benchmark, called DHF1K (Dynamic Human Fixation 1K), for predicting fixations during dynamic scene free-viewing, which is a long-time need in this field. DHF1K consists of 1K high-quality...

10.1109/tpami.2019.2924417 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2019-06-25

This paper conducts a systematic study on the role of visual attention in Unsupervised Video Object Segmentation (UVOS) tasks. By elaborately annotating three popular video segmentation datasets (DAVIS, Youtube-Objects and SegTrack V2) with dynamic eye-tracking data in the UVOS setting, for the first time, we quantitatively verified the high consistency of visual attention behavior among human observers, and found strong correlation between human attention and explicit primary object judgements during dynamic, task-driven viewing. Such novel...

10.1109/cvpr.2019.00318 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Matching person images between the daytime visible modality and the night-time infrared modality (VI-ReID) is a challenging cross-modality pedestrian retrieval problem. Existing methods usually learn multi-modality features in the raw image space, ignoring the image-level discrepancy. Some methods apply the GAN technique to generate cross-modality images, but this destroys local structure and introduces unavoidable noise. In this paper, we propose a Homogeneous Augmented Tri-Modal (HAT) learning method for VI-ReID, where an auxiliary grayscale...

10.1109/tifs.2020.3001665 article EN IEEE Transactions on Information Forensics and Security 2020-06-11
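The grayscale auxiliary modality mentioned above can be generated homogeneously from visible RGB images. A minimal sketch of that idea, assuming standard ITU-R BT.601 luma coefficients and channel replication so the result fits the same three-channel backbone (details are assumptions, not the paper's exact pipeline):

```python
import numpy as np

def to_grayscale_modality(rgb_batch):
    """rgb_batch: (N, H, W, 3) float array -> (N, H, W, 3) grayscale.
    Luminance is a weighted sum of R, G, B (BT.601 weights), then the
    single channel is replicated three times."""
    luma = rgb_batch @ np.array([0.299, 0.587, 0.114])  # (N, H, W)
    return np.repeat(luma[..., None], 3, axis=-1)       # replicate channel

batch = np.random.default_rng(1).random((2, 4, 4, 3))
gray = to_grayscale_modality(batch)
# All three channels are identical in the auxiliary modality.
assert np.allclose(gray[..., 0], gray[..., 2])
```

Because the transform is deterministic and image-level, it adds a bridging modality without the structural damage and noise that GAN-generated images can introduce.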

Previous research in visual saliency has focused on two major types of models, namely fixation prediction and salient object detection. The relationship between the two, however, has been less explored. In this work, we propose to employ the former model type to identify salient objects. We build a novel Attentive Saliency Network (ASNet, available at https://github.com/wenguanwang/ASNet) that learns to detect salient objects from fixations. The fixation map, derived at the upper network layers, mimics human attention mechanisms...

10.1109/tpami.2019.2905607 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2019-03-18

In this paper, we present a novel end-to-end learning neural network, i.e., MATNet, for zero-shot video object segmentation (ZVOS). Motivated by human visual attention behavior, MATNet leverages motion cues as a bottom-up signal to guide the perception of object appearance. To achieve this, an asymmetric attention block, named Motion-Attentive Transition (MAT), is proposed within a two-stream encoder network to firstly identify moving regions and then attend to appearance to capture the full extent of objects. Putting MATs in...

10.1109/tip.2020.3013162 article EN IEEE Transactions on Image Processing 2020-01-01
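The motion-to-appearance gating idea described above can be caricatured in a few lines. This toy sketch is an assumption about the general mechanism (motion features producing a soft spatial attention that modulates the appearance stream), not MATNet's actual block:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def motion_attentive_transition(appearance, motion):
    """appearance, motion: (C, H, W) feature maps from the two streams.
    Motion features are pooled over channels into a (H, W) map, squashed
    to [0, 1], and used to gate the appearance features spatially."""
    attn = sigmoid(motion.mean(axis=0))      # (H, W) soft motion saliency
    return appearance * attn[None]           # attend appearance to motion

rng = np.random.default_rng(5)
app = rng.standard_normal((4, 6, 6))
mot = rng.standard_normal((4, 6, 6))
fused = motion_attentive_transition(app, mot)
assert fused.shape == app.shape
```

The asymmetry matters: motion modulates appearance, not the reverse, mirroring the bottom-up role the abstract assigns to motion cues.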

Visible thermal person re-identification (VT-ReID) is a challenging cross-modality pedestrian retrieval problem due to the large intra-class variations and modality discrepancy across different cameras. Existing VT-ReID methods mainly focus on learning sharable feature representations by handling the modality discrepancy at the feature level. However, the difference at the classifier level has received much less attention, resulting in limited discriminability. In this paper, we propose a novel modality-aware collaborative...

10.1109/tip.2020.2998275 article EN IEEE Transactions on Image Processing 2020-01-01

Visual tracking addresses the problem of localizing an arbitrary target in video according to an annotated bounding box. In this article, we present a novel tracking method by introducing an attention mechanism into the Siamese network to increase its matching discrimination. We propose a new way to compute attention weights to improve the matching performance of a sub-Siamese network [Attention Net (A-Net)], which locates attentive parts for solving the searching problem. In addition, features in higher layers can preserve more semantic information while those in lower...

10.1109/tcyb.2019.2936503 article EN IEEE Transactions on Cybernetics 2019-09-12
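A minimal sketch of attention-weighted Siamese matching, under assumed shapes and a simple channel-attention scheme (this is not the paper's A-Net, just the general pattern of re-weighting a template feature before cross-correlation with the search region):

```python
import numpy as np

def channel_attention(feat):
    """feat: (C, H, W). Global-average-pooled logits -> softmax weights."""
    pooled = feat.mean(axis=(1, 2))                     # (C,)
    e = np.exp(pooled - pooled.max())
    return e / e.sum()

def attentive_correlation(template, search):
    """template: (C, h, w); search: (C, H, W); returns a response map
    via sliding-window cross-correlation of the attentive template."""
    w = channel_attention(template)                     # (C,)
    t = template * w[:, None, None]                     # attentive template
    C, h, wd = t.shape
    _, H, W = search.shape
    resp = np.zeros((H - h + 1, W - wd + 1))
    for i in range(resp.shape[0]):
        for j in range(resp.shape[1]):
            resp[i, j] = np.sum(t * search[:, i:i+h, j:j+wd])
    return resp

rng = np.random.default_rng(2)
template = rng.standard_normal((8, 3, 3))
search = rng.standard_normal((8, 7, 7))
response = attentive_correlation(template, search)
print(response.shape)                                   # (5, 5)
```

The peak of the response map gives the predicted target location; the attention step is what sharpens the matching discrimination the abstract describes.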

Learning unbiased models on imbalanced datasets is a significant challenge. Rare classes tend to get a concentrated representation in the classification space, which hampers the generalization of learned boundaries to new test examples. In this paper, we demonstrate that Bayesian uncertainty estimates directly correlate with the rarity of classes and the difficulty level of individual samples. Subsequently, we present a novel framework for uncertainty-based class imbalance learning that follows two key insights: First, classification boundaries should be extended...

10.1109/cvpr.2019.00019 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01
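One common way to obtain the Bayesian uncertainty estimates the abstract relies on is Monte Carlo dropout: run several stochastic forward passes and read off the variance of the predictions. The toy linear "network" below is purely illustrative, assumed for the sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
W = rng.standard_normal((16, 4))            # toy weights: 16-d in, 4 classes

def mc_dropout_predict(x, T=100, p=0.5):
    """x: (16,) input. Returns (mean, variance) of softmax outputs over
    T dropout samples; the variance serves as a per-sample uncertainty
    score (higher for harder / rarer-looking inputs)."""
    outs = []
    for _ in range(T):
        mask = rng.random(16) < p           # Bernoulli keep mask
        logits = (x * mask / p) @ W         # inverted-dropout scaling
        e = np.exp(logits - logits.max())
        outs.append(e / e.sum())
    outs = np.array(outs)
    return outs.mean(axis=0), outs.var(axis=0)

mean, var = mc_dropout_predict(rng.standard_normal(16))
assert np.isclose(mean.sum(), 1.0, atol=1e-6)  # mean is a distribution
```

Ranking samples by this variance is what allows a training scheme to treat rare, high-uncertainty examples differently from abundant, confidently classified ones.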

Convolutional Neural Networks have achieved significant success across multiple computer vision tasks. However, they are vulnerable to carefully crafted, human-imperceptible adversarial noise patterns which constrain their deployment in critical security-sensitive systems. This paper proposes a computationally efficient image enhancement approach that provides a strong defense mechanism to effectively mitigate the effect of such adversarial perturbations. We show that deep image restoration networks learn a mapping...

10.1109/tip.2019.2940533 article EN IEEE Transactions on Image Processing 2019-09-19

Cross-modality person re-identification is a challenging task due to the large cross-modality discrepancy and intra-modality variations. Currently, most existing methods focus on learning modality-specific or modality-shareable features by using the identity supervision or modality label. Different from existing methods, this paper presents a novel Modality Confusion Learning Network (MCLNet). Its basic idea is to confuse two modalities, ensuring that the optimization is explicitly concentrated on the modality-irrelevant...

10.1109/iccv48922.2021.01609 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

In this article, we model a set of pixelwise object segmentation tasks - automatic video segmentation (AVS), image co-segmentation (ICS) and few-shot semantic segmentation (FSS) - in a unified view of segmenting objects from relational visual data. To this end, we propose an attentive graph neural network (AGNN) that addresses these tasks in a holistic fashion, by formulating them as a process of iterative information fusion over data graphs. It builds a fully-connected graph to efficiently represent visual data as nodes and relations between data instances as edges. The underlying...

10.1109/tpami.2021.3115815 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2021-09-28

In recent years, Siamese network based trackers have significantly advanced the state-of-the-art in real-time tracking. Despite their success, these trackers tend to suffer from high memory costs, which restrict their applicability to mobile devices with tight memory budgets. To address this issue, we propose a distilled Siamese tracking framework to learn small, fast and accurate trackers (students), which capture critical knowledge from large trackers (teachers) by a teacher-students distillation model. This model is intuitively inspired by the one teacher versus...

10.1109/tpami.2021.3127492 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2021-11-11
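The teacher-to-student knowledge transfer described above is typically driven by a distillation loss. A standard formulation (Hinton-style temperature-softened KL divergence, assumed here rather than the paper's exact objective) looks like this:

```python
import numpy as np

def softmax(z, T=1.0):
    e = np.exp((z - z.max()) / T)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """KL(teacher || student) on temperature-softened responses,
    scaled by T^2 as is conventional in knowledge distillation."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return T * T * np.sum(p_t * (np.log(p_t) - np.log(p_s)))

t = np.array([3.0, 1.0, 0.2])                # teacher's response logits
loss_same = distillation_loss(t, t)          # identical responses
loss_diff = distillation_loss(np.array([0.1, 2.5, 1.0]), t)
assert loss_same < 1e-9 < loss_diff          # zero only at a perfect match
```

Minimizing this term over tracking responses pushes the small student to reproduce the large teacher's behavior, which is what lets the compact tracker retain accuracy.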

10.18653/v1/2024.findings-acl.940 article EN Findings of the Association for Computational Linguistics: ACL 2024 2024-01-01

This paper proposes a novel residual attentive learning network architecture for predicting dynamic eye-fixation maps. The proposed model emphasizes two essential issues, i.e., effective spatiotemporal feature integration and multi-scale saliency learning. For the first problem, appearance and motion streams are tightly coupled via dense cross connections, which integrate appearance information with multi-layer, comprehensive motion features in a unified way. Beyond traditional two-stream models that learn the streams separately, such a design...

10.1109/tip.2019.2936112 article EN IEEE Transactions on Image Processing 2019-08-23

Existing LiDAR-based 3D object detectors usually focus on single-frame detection, while ignoring the spatiotemporal information in consecutive point cloud frames. In this paper, we propose an end-to-end online 3D video object detector that operates on point cloud sequences. The proposed model comprises a spatial feature encoding component and a spatiotemporal feature aggregation component. In the former component, a novel Pillar Message Passing Network (PMPNet) is proposed to encode each discrete point cloud frame. It adaptively collects information for a pillar node from its...

10.1109/cvpr42600.2020.01151 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Deep embedding learning plays a key role in learning discriminative feature representations, where visually similar samples are pulled closer and dissimilar samples are pushed away in the low-dimensional embedding space. This paper studies the unsupervised embedding learning problem of learning such a representation without using any category labels. This task faces two primary challenges: mining reliable positive supervision from highly similar fine-grained classes, and generalizing to unseen testing categories. To approximate the positive concentration and negative separation properties...

10.1109/tpami.2020.3013379 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2020-08-03
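A common way to realize the concentration/separation properties described above without labels is an instance-level contrastive objective: each sample's augmented view is its only positive, and all other samples act as negatives. The specific loss below is an assumed standard form, not necessarily the paper's:

```python
import numpy as np

def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def instance_contrastive_loss(anchors, positives, tau=0.1):
    """anchors, positives: (N, D) paired views of the same N instances.
    Cosine similarities on the L2-normalized sphere form the logits;
    the matched pair sits on the diagonal of the (N, N) score matrix."""
    a, p = normalize(anchors), normalize(positives)
    sim = a @ p.T / tau                      # (N, N) similarity logits
    sim = sim - sim.max(axis=1, keepdims=True)
    logprob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(logprob))        # positives on the diagonal

rng = np.random.default_rng(4)
x = rng.standard_normal((8, 16))
aligned = instance_contrastive_loss(x, x + 0.01 * rng.standard_normal((8, 16)))
shuffled = instance_contrastive_loss(x, rng.standard_normal((8, 16)))
assert aligned < shuffled                    # matched views score lower loss
```

Minimizing this loss concentrates each instance near its augmented view (positive concentration) while spreading unrelated instances apart (negative separation), with no category labels involved.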