NFDI4DS | UHH-SEMS - Publication Details

Jian Sun

ORCID: 0000-0002-6178-4166

About

Contact & Profiles

A5100785015

Research Areas

Advanced Neural Network Applications
Domain Adaptation and Few-Shot Learning
Advanced Image and Video Retrieval Techniques
Video Surveillance and Tracking Methods
Multimodal Machine Learning Applications
Advanced Vision and Imaging
Robotics and Sensor-Based Localization
Advanced Image Processing Techniques
Human Pose and Action Recognition
Adversarial Robustness in Machine Learning
Anomaly Detection Techniques and Applications
Image Processing Techniques and Applications
Visual Attention and Saliency Detection
Image Enhancement Techniques
COVID-19 diagnosis using AI
Advanced Computational Techniques and Applications
Image and Object Detection Techniques
Remote-Sensing Image Classification
Machine Learning and Data Classification
Advanced Algorithms and Applications
Generative Adversarial Networks and Image Synthesis
Ocular Diseases and Behçet’s Syndrome
Glaucoma and retinal disorders
Face and Expression Recognition
Retinal Imaging and Analysis

Beijing Institute of Technology
2008-2025

Southeast University
2025

Chongqing University of Technology
2022-2025

National Institute of Allergy and Infectious Diseases
2023-2024

National Institutes of Health
2020-2024

Shenyang Ligong University
2020-2024

Vi Technology (United States)
2019-2023

Megvii (China)
2017-2023

Jilin Electric Power Research Institute (China)
2022

Tongji University
2022

Deep Residual Learning for Image Recognition

OPENALEX - Publications

Kaiming He Xiangyu Zhang Shaoqing Ren Jian Sun

Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers - 8× deeper than VGG nets [40] but still having lower...

10.1109/cvpr.2016.90 article EN 2016-06-01
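
The residual reformulation named in the abstract above can be illustrated in a few lines. This is a minimal sketch in plain Python, not the authors' implementation: `residual_block`, `zero_residual`, and `toy_residual` are hypothetical names, and the toy functions stand in for a stack of learned layers. It assumes the common formulation in which the block outputs F(x) + x, so the layers only need to learn the residual F.

```python
from typing import Callable, List

Vector = List[float]


def residual_block(x: Vector, residual_fn: Callable[[Vector], Vector]) -> Vector:
    """Return F(x) + x, where F is the residual function and x is added back unchanged."""
    fx = residual_fn(x)
    if len(fx) != len(x):
        raise ValueError("residual_fn must preserve the input dimension")
    return [f + xi for f, xi in zip(fx, x)]


def zero_residual(x: Vector) -> Vector:
    # Hypothetical stand-in for learned layers whose output is all zeros.
    return [0.0 for _ in x]


def toy_residual(x: Vector) -> Vector:
    # Hypothetical stand-in for learned layers: scale by 0.5, then apply ReLU.
    return [max(0.0, 0.5 * xi) for xi in x]
```

When the residual function outputs zeros, the block reduces to the identity mapping, which is one intuition for why such blocks might be easier to optimize at large depth.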

ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices

Xiangyu Zhang Xinyu Zhou Mengxiao Lin Jian Sun

We introduce an extremely computation-efficient CNN architecture named ShuffleNet, which is designed specially for mobile devices with very limited computing power (e.g., 10-150 MFLOPs). The new architecture utilizes two new operations, pointwise group convolution and channel shuffle, to greatly reduce computation cost while maintaining accuracy. Experiments on ImageNet classification and MS COCO object detection demonstrate the superior performance of ShuffleNet over other structures, e.g. lower top-1 error...

10.1109/cvpr.2018.00716 preprint EN 2018-06-01
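
The channel shuffle operation named in the abstract above is usually described as: split the channels into groups, view them as a (groups, channels per group) grid, transpose, and flatten. The sketch below shows that index permutation on a plain list of channel labels. It is an illustration under that assumption, not code from the paper; `channel_shuffle` is a hypothetical helper, and a real network would apply the same permutation along the channel axis of a tensor.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def channel_shuffle(channels: Sequence[T], groups: int) -> List[T]:
    """Interleave channels across groups (reshape, transpose, flatten)."""
    n = len(channels)
    if groups <= 0 or n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    # Original layout: group g holds channels[g * per_group : (g + 1) * per_group].
    # Shuffled layout: take the i-th channel of every group in turn.
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]
```

For six channels in two groups, `channel_shuffle([0, 1, 2, 3, 4, 5], 2)` returns `[0, 3, 1, 4, 2, 5]`, so each following group convolution sees channels from both of the original groups.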

YOLOX: Exceeding YOLO Series in 2021

Zheng Ge Songtao Liu Feng Wang Zeming Li Jian Sun

In this report, we present some experienced improvements to YOLO series, forming a new high-performance detector -- YOLOX. We switch the YOLO detector to an anchor-free manner and conduct other advanced detection techniques, i.e., a decoupled head and the leading label assignment strategy SimOTA to achieve state-of-the-art results across a large scale range of models: For YOLO-Nano with only 0.91M parameters and 1.08G FLOPs, we get 25.3% AP on COCO, surpassing NanoDet by 1.8% AP; for YOLOv3, one of the most widely used detectors in...

10.48550/arxiv.2107.08430 preprint EN other-oa arXiv (Cornell University) 2021-01-01

RepVGG: Making VGG-style ConvNets Great Again

Xiaohan Ding Xiangyu Zhang Ningning Ma Jungong Han Guiguang Ding and 1 more

We present a simple but powerful architecture of convolutional neural network, which has a VGG-like inference-time body composed of nothing but a stack of 3 × 3 convolution and ReLU, while the training-time model has a multi-branch topology. Such decoupling of the training-time and inference-time architecture is realized by a structural re-parameterization technique so that the model is named RepVGG. On ImageNet, RepVGG reaches over 80% top-1 accuracy, which is the first time for a plain model, to the best of our knowledge. On NVIDIA 1080Ti GPU, RepVGG models run 83% faster than ResNet-50 or 101% faster than ResNet-101...

10.1109/cvpr46437.2021.01352 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

You Only Look One-level Feature

Qiang Chen Ying‐Ming Wang Tong Yang Xiangyu Zhang Jian Cheng and 1 more

This paper revisits feature pyramids networks (FPN) for one-stage detectors and points out that the success of FPN is due to its divide-and-conquer solution to the optimization problem in object detection rather than multi-scale feature fusion. From the perspective of optimization, we introduce an alternative way to address the problem instead of adopting the complex feature pyramids - utilizing only one-level feature for detection. Based on the simple and efficient solution, we present You Only Look One-level Feature (YOLOF). In our method, two key components, Dilated...

10.1109/cvpr46437.2021.01284 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation

Hanchao Li Pengfei Xiong Haoqiang Fan Jian Sun

This paper introduces an extremely efficient CNN architecture named DFANet for semantic segmentation under resource constraints. Our proposed network starts from a single lightweight backbone and aggregates discriminative features through sub-network and sub-stage cascade respectively. Based on the multi-scale feature propagation, DFANet substantially reduces the number of parameters, but still obtains sufficient receptive field and enhances the model learning ability, which strikes a balance between the speed...

10.1109/cvpr.2019.00975 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Repulsion Loss: Detecting Pedestrians in a Crowd

Xinlong Wang Tete Xiao Yuning Jiang Shuai Shao Jian Sun and 1 more

Detecting individual pedestrians in a crowd remains a challenging problem since the pedestrians often gather together and occlude each other in real-world scenarios. In this paper, we first explore how a state-of-the-art pedestrian detector is harmed by crowd occlusion via experimentation, providing insights into the crowd occlusion problem. Then, we propose a novel bounding box regression loss specifically designed for crowd scenes, termed repulsion loss. This loss is driven by two motivations: the attraction by target, and the repulsion by other surrounding objects. The repulsion term prevents...

10.1109/cvpr.2018.00811 preprint EN 2018-06-01

MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning

Zechun Liu Haoyuan Mu Xiangyu Zhang Zichao Guo Xin Yang and 2 more

In this paper, we propose a novel meta learning approach for automatic channel pruning of very deep neural networks. We first train a PruningNet, a kind of meta network, which is able to generate weight parameters for any pruned structure given the target network. We use a simple stochastic structure sampling method for training the PruningNet. Then, we apply an evolutionary procedure to search for good-performing pruned networks. The search is highly efficient because the weights are directly generated by the trained PruningNet and we do not need any finetuning at search time. With...

10.1109/iccv.2019.00339 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

CrowdHuman: A Benchmark for Detecting Human in a Crowd

Shuai Shao Zijian Zhao Boxun Li Tete Xiao Gang Yu and 2 more

Human detection has witnessed impressive progress in recent years. However, the occlusion issue of detecting human in highly crowded environments is far from solved. To make matters worse, crowd scenarios are still under-represented in current human detection benchmarks. In this paper, we introduce a new dataset, called CrowdHuman, to better evaluate detectors in crowd scenarios. The CrowdHuman dataset is large, rich-annotated and contains high diversity. There are a total of $470K$ human instances from the train and validation subsets, and $~22.6$ persons...

10.48550/arxiv.1805.00123 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Deep Learning with Low Precision by Half-Wave Gaussian Quantization

Zhaowei Cai Xiaodong He Jian Sun Nuno Vasconcelos

The problem of quantizing the activations of a deep neural network is considered. An examination of the popular binary quantization approach shows that this consists of approximating a classical non-linearity, the hyperbolic tangent, by two functions: a piecewise constant sign function, which is used in feedforward network computations, and a piecewise linear hard tanh function, used in the backpropagation step during network learning. The problem of approximating the widely used ReLU non-linearity is then considered. A half-wave Gaussian quantizer (HWGQ) is proposed for forward approximation and shown to have efficient...

10.1109/cvpr.2017.574 article EN 2017-07-01

Objects365: A Large-Scale, High-Quality Dataset for Object Detection

Shuai Shao Zeming Li Tianyuan Zhang Chao Peng Gang Yu and 3 more

In this paper, we introduce a new large-scale object detection dataset, Objects365, which has 365 object categories over 600K training images. More than 10 million, high-quality bounding boxes are manually labeled through a three-step, carefully designed annotation pipeline. It is the largest object detection dataset (with full annotation) so far and establishes a more challenging benchmark for the community. Objects365 can serve as a better feature learning dataset for localization-sensitive tasks like object detection and semantic segmentation. The...

10.1109/iccv.2019.00852 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices

Xiangyu Zhang Xinyu Zhou Mengxiao Lin Jian Sun

10.48550/arxiv.1707.01083 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Meta-SR: A Magnification-Arbitrary Network for Super-Resolution

Xuecai Hu Haoyuan Mu Xiangyu Zhang Zilei Wang Tieniu Tan and 1 more

Recent research on super-resolution has achieved great success due to the development of deep convolutional neural networks (DCNNs). However, super-resolution of arbitrary scale factor has been ignored for a long time. Most previous researchers regard super-resolution of different scale factors as independent tasks. They train a specific model for each scale factor, which is inefficient in computing, and prior work only takes the super-resolution of several integer scale factors into consideration. In this work, we propose a novel method called Meta-SR to firstly solve super-resolution of arbitrary scale factor (including non-integer...

10.1109/cvpr.2019.00167 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification

Guan’an Wang Shuo Yang Huanyu Liu Zhicheng Wang Yang Yang and 4 more

Occluded person re-identification (ReID) aims to match occluded person images to holistic ones across dis-joint cameras. In this paper, we propose a novel framework by learning high-order relation and topology information for discriminative features and robust alignment. At first, we use a CNN backbone to learn feature maps and a key-points estimation model to extract semantic local features. Even so, occluded images still suffer from occlusion and outliers. Then, we view the extracted local features of an image as nodes of a graph and propose an adaptive direction graph convolutional...

10.1109/cvpr42600.2020.00648 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Bundling features for large scale partial-duplicate web image search

Zhong Wu Qifa Ke Michael Isard Jian Sun

In state-of-the-art image retrieval systems, an image is represented by a bag of visual words obtained by quantizing high-dimensional local image descriptors, and scalable schemes inspired by text retrieval are then applied for large scale image indexing and retrieval. Bag-of-words representations, however: 1) reduce the discriminative power of image features due to feature quantization; and 2) ignore geometric relationships among visual words. Exploiting such geometric constraints, by estimating a 2D affine transformation between a query image and each candidate image, has...

10.1109/cvpr.2009.5206566 article EN 2009 IEEE Conference on Computer Vision and Pattern Recognition 2009-06-01

Perceive Where to Focus: Learning Visibility-Aware Part-Level Features for Partial Person Re-Identification

Yifan Sun Qin Xu Yali Li Chi Zhang Yikang Li and 2 more

This paper considers a realistic problem in person re-identification (re-ID) task, i.e., partial re-ID. Under partial re-ID scenario, the images may contain a partial observation of a pedestrian. If we directly compare a partial pedestrian image with a holistic one, the extreme spatial misalignment significantly compromises the discriminative ability of the learned representation. We propose a Visibility-aware Part Model (VPM) for partial re-ID, which learns to perceive the visibility of regions through self-supervision. The visibility awareness allows VPM to extract...

10.1109/cvpr.2019.00048 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

OTA: Optimal Transport Assignment for Object Detection

Zheng Ge Songtao Liu Zeming Li Osamu Yoshie Jian Sun

Recent advances in label assignment in object detection mainly seek to independently define positive/negative training samples for each ground-truth (gt) object. In this paper, we innovatively revisit the label assignment from a global perspective and propose to formulate the assigning procedure as an Optimal Transport (OT) problem – a well-studied topic in Optimization Theory. Concretely, we define the unit transportation cost between each demander (anchor) and supplier (gt) pair as the weighted summation of their classification and regression losses. After...

10.1109/cvpr46437.2021.00037 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Anchor DETR: Query Design for Transformer-Based Detector

Ying‐Ming Wang Xiangyu Zhang Tong Yang Jian Sun

In this paper, we propose a novel query design for the transformer-based object detection. In previous transformer-based detectors, the object queries are a set of learned embeddings. However, each learned embedding does not have an explicit physical meaning and we cannot explain where it will focus on. It is difficult to optimize as the prediction slot of each object query does not have a specific mode. In other words, each object query will not focus on a specific region. To solve these problems, in our query design, object queries are based on anchor points, which are widely used in CNN-based detectors. So each object query focuses on the objects near the anchor point. Moreover, our query design can...

10.1609/aaai.v36i3.20158 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2022-06-28
