NFDI4DS | UHH-SEMS - Publication Details

Cascade R-CNN: Delving Into High Quality Object Detection

OPENALEX - Publications

Zhaowei Cai Nuno Vasconcelos

In object detection, an intersection over union (IoU) threshold is required to define positives and negatives. An object detector trained with a low IoU threshold, e.g. 0.5, usually produces noisy detections. However, detection performance tends to degrade with increasing IoU thresholds. Two main factors are responsible for this: 1) overfitting during training, due to exponentially vanishing positive samples, and 2) inference-time mismatch between the IoUs for which the detector is optimal and those of the input hypotheses. A...

10.1109/cvpr.2018.00644 preprint EN 2018-06-01
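
The role of the IoU threshold described in the abstract above can be illustrated with a minimal sketch. The box format `(x1, y1, x2, y2)`, the function names, and the sample boxes are assumptions made for illustration only; this is not the authors' implementation. It shows how the number of positive hypotheses shrinks as the threshold rises.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_positives(hypotheses, ground_truth, threshold):
    """Count hypotheses whose IoU with the ground truth meets the threshold."""
    return sum(1 for h in hypotheses if iou(h, ground_truth) >= threshold)


# Hypothetical data: one ground-truth box and four progressively shifted hypotheses.
ground_truth = (0, 0, 10, 10)
hypotheses = [(0, 0, 10, 10), (1, 0, 11, 10), (3, 0, 13, 10), (5, 0, 15, 10)]

for threshold in (0.5, 0.7, 0.9):
    print(threshold, count_positives(hypotheses, ground_truth, threshold))
```

With these made-up boxes, fewer hypotheses qualify as positives at each higher threshold, which is the effect the abstract links to overfitting.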

Cascade R-CNN: High Quality Object Detection and Instance Segmentation

OPENALEX - Publications

Zhaowei Cai Nuno Vasconcelos

In object detection, the intersection over union (IoU) threshold is frequently used to define positives/negatives. The threshold used to train a detector defines its quality. While the commonly used threshold of 0.5 leads to noisy (low-quality) detections, detection performance frequently degrades for larger thresholds. This paradox of high-quality detection has two causes: 1) overfitting, due to vanishing positive samples for large thresholds, and 2) inference-time quality mismatch between detector and test hypotheses. A multi-stage object detection architecture, the Cascade R-CNN, composed...

10.1109/tpami.2019.2956516 article EN publisher-specific-oa IEEE Transactions on Pattern Analysis and Machine Intelligence 2019-11-28

UA-DETRAC: A new benchmark and protocol for multi-object detection and tracking

OPENALEX - Publications

Longyin Wen Dawei Du Zhaowei Cai Zhen Lei Ming-Ching Chang and 4 more

10.1016/j.cviu.2020.102907 article EN publisher-specific-oa Computer Vision and Image Understanding 2020-01-27

Deep Learning with Low Precision by Half-Wave Gaussian Quantization

OPENALEX - Publications

Zhaowei Cai Xiaodong He Jian Sun Nuno Vasconcelos

The problem of quantizing the activations of a deep neural network is considered. An examination of the popular binary quantization approach shows that this consists of approximating a classical non-linearity, the hyperbolic tangent, by two functions: a piecewise constant sign function, which is used in feedforward network computations, and a piecewise linear hard tanh function, used in the backpropagation step during network learning. The problem of approximating the widely used ReLU non-linearity is then considered. A half-wave Gaussian quantizer (HWGQ) is proposed for forward approximation and shown to have efficient...

10.1109/cvpr.2017.574 article EN 2017-07-01
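
The binary quantization scheme summarized in the abstract above (sign function in the forward pass, hard tanh in the backward pass) can be sketched as follows. This is a scalar illustration under stated assumptions: the function names are hypothetical, `sign(0)` is taken as +1 by convention, and the HWGQ quantizer itself is not shown because its details are truncated in the abstract.

```python
def sign_forward(x):
    """Forward pass: piecewise constant sign function (assumes sign(0) = +1)."""
    return 1.0 if x >= 0 else -1.0


def hard_tanh(x):
    """Piecewise linear hard tanh: clips x to the interval [-1, 1]."""
    return max(-1.0, min(1.0, x))


def hard_tanh_grad(x):
    """Derivative of hard tanh: 1 inside [-1, 1], 0 outside."""
    return 1.0 if -1.0 <= x <= 1.0 else 0.0


def backward(x, upstream_grad):
    """Backward pass: use the hard tanh derivative in place of the sign derivative."""
    return upstream_grad * hard_tanh_grad(x)


# Hypothetical pre-activation values and an upstream gradient of 2.0.
for x in (0.3, -0.7, 2.5):
    print(x, sign_forward(x), backward(x, 2.0))
```

The sign function has zero gradient almost everywhere, so the backward pass substitutes the hard tanh derivative to keep gradients flowing for inputs in [-1, 1].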

Learning Complexity-Aware Cascades for Deep Pedestrian Detection

OPENALEX - Publications

Zhaowei Cai Mohammad Saberian Nuno Vasconcelos

The design of complexity-aware cascaded detectors, combining features of very different complexities, is considered. A new cascade design procedure is introduced, by formulating cascade learning as the Lagrangian optimization of a risk that accounts for both accuracy and complexity. A boosting algorithm, denoted as complexity aware cascade training (CompACT), is then derived to solve this optimization. CompACT cascades are shown to seek an optimal trade-off between accuracy and complexity by pushing features of higher complexity to the later cascade stages, where only a few difficult candidate...

10.1109/iccv.2015.384 preprint EN 2015-12-01
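
The Lagrangian formulation mentioned in the abstract above can be written schematically as below. The symbols are assumptions chosen for illustration, not taken from this listing: $F$ is the cascade, $R_E$ the classification (accuracy) risk, $R_C$ the complexity risk, and $\eta \ge 0$ the Lagrange multiplier that sets the trade-off.

```latex
\mathcal{L}[F] = R_E[F] + \eta \, R_C[F],
\qquad
F^{*} = \arg\min_{F} \mathcal{L}[F]
```

Under this reading, a larger $\eta$ penalizes complexity more heavily, and $\eta = 0$ reduces the objective to the accuracy risk alone.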

Towards Universal Object Detection by Domain Attention

OPENALEX - Publications

Xudong Wang Zhaowei Cai Dashan Gao Nuno Vasconcelos

Despite increasing efforts on universal representations for visual recognition, few have addressed object detection. In this paper, we develop an effective and efficient universal object detection system that is capable of working on various image domains, from human faces and traffic signs to medical CT images. Unlike multi-domain models, this universal model does not require prior knowledge of the domain of interest. This is achieved by the introduction of a new family of adaptation layers, based on the principles of squeeze and excitation, and a new domain-attention...

10.1109/cvpr.2019.00746 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

UA-DETRAC: A New Benchmark and Protocol for Multi-Object Detection and Tracking

OPENALEX - Publications

Longyin Wen Dawei Du Zhaowei Cai Zhen Lei Ming-Ching Chang and 4 more

In recent years, numerous effective multi-object tracking (MOT) methods have been developed because of the wide range of applications. Existing performance evaluations of MOT methods usually separate the object tracking step from the object detection step by using the same fixed object detection results for comparisons. In this work, we perform a comprehensive quantitative study on the effects of object detection accuracy on the overall MOT performance, using the new large-scale University at Albany DETection and tRACking (UA-DETRAC) benchmark dataset. The UA-DETRAC benchmark dataset consists of 100 challenging video...

10.48550/arxiv.1511.04136 preprint EN other-oa arXiv (Cornell University) 2015-01-01

Cascade R-CNN: Delving into High Quality Object Detection

OPENALEX - Publications

Zhaowei Cai Nuno Vasconcelos

In object detection, an intersection over union (IoU) threshold is required to define positives and negatives. An object detector trained with a low IoU threshold, e.g. 0.5, usually produces noisy detections. However, detection performance tends to degrade with increasing IoU thresholds. Two main factors are responsible for this: 1) overfitting during training, due to exponentially vanishing positive samples, and 2) inference-time mismatch between the IoUs for which the detector is optimal and those of the input hypotheses. A...

10.48550/arxiv.1712.00726 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Rethinking Differentiable Search for Mixed-Precision Neural Networks

OPENALEX - Publications

Zhaowei Cai Nuno Vasconcelos

Low-precision networks, with weights and activations quantized to low bit-width, are widely used to accelerate inference on edge devices. However, current solutions are uniform, using identical bit-width for all filters. This fails to account for the different sensitivities of different filters and is suboptimal. Mixed-precision networks address this problem, by tuning the bit-width to individual filter requirements. In this work, the problem of optimal mixed-precision network search (MPS) is considered. To circumvent its difficulties of discrete search space...

10.1109/cvpr42600.2020.00242 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Exponential Moving Average Normalization for Self-supervised and Semi-supervised Learning

OPENALEX - Publications

Zhaowei Cai Avinash Ravichandran Subhransu Maji Charless C. Fowlkes Zhuowen Tu and 1 more

We present a plug-in replacement for batch normalization (BN) called exponential moving average normalization (EMAN), which improves the performance of existing student-teacher based self- and semi-supervised learning techniques. Unlike the standard BN, where the statistics are computed within each batch, EMAN, used in the teacher, updates its statistics by exponential moving average from the BN statistics of the student. This design reduces the intrinsic cross-sample dependency of BN and enhances the generalization of the teacher. EMAN improves strong baselines for self-supervised learning by 4-6/1-2 points and semi-supervised learning by about 7/2...

10.1109/cvpr46437.2021.00026 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01
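
The statistics update described in the EMAN abstract above can be sketched for a single scalar feature. The function names, the momentum value, and the sample numbers are assumptions for illustration; the learnable affine parameters of a normalization layer are omitted, and this is not the authors' code.

```python
def ema_update(teacher_stat, student_stat, momentum):
    """Exponential moving average of a teacher statistic toward the student's."""
    return momentum * teacher_stat + (1.0 - momentum) * student_stat


def normalize(x, mean, var, eps=1e-5):
    """Normalize x with fixed statistics instead of per-batch statistics."""
    return (x - mean) / (var + eps) ** 0.5


# Hypothetical running statistics of the teacher and BN statistics of the student.
teacher_mean, teacher_var = 0.0, 1.0
student_mean, student_var = 2.0, 4.0
momentum = 0.9  # assumed value, chosen only for this example

teacher_mean = ema_update(teacher_mean, student_mean, momentum)
teacher_var = ema_update(teacher_var, student_var, momentum)

# The teacher normalizes each sample with its own EMA statistics,
# so its output for one sample does not depend on the rest of the batch.
print(teacher_mean, teacher_var, normalize(1.0, teacher_mean, teacher_var))
```

Because the teacher never computes statistics from its current batch, the cross-sample dependency mentioned in the abstract does not arise on the teacher side.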

Robust Deformable and Occluded Object Tracking With Dynamic Graph

OPENALEX - Publications

Zhaowei Cai Longyin Wen Zhen Lei Nuno Vasconcelos Stan Z. Li

While some efforts have been made to handle deformation and occlusion in visual tracking, they are still great challenges. In this paper, a dynamic graph-based tracker (DGT) is proposed to address these two challenges in a unified framework. In the dynamic target graph, nodes are the target local parts encoding appearance information, and edges are the interactions between nodes encoding inner geometric structure information. This graph representation provides much more information for tracking in the presence of deformation and occlusion. Target tracking is then formulated as tracking this dynamic undirected...

10.1109/tip.2014.2364919 article EN IEEE Transactions on Image Processing 2014-10-23

Deep Learning with Low Precision by Half-wave Gaussian Quantization

OPENALEX - Publications

Zhaowei Cai Xiaodong He Jian Sun Nuno Vasconcelos

The problem of quantizing the activations of a deep neural network is considered. An examination of the popular binary quantization approach shows that this consists of approximating a classical non-linearity, the hyperbolic tangent, by two functions: a piecewise constant sign function, which is used in feedforward network computations, and a piecewise linear hard tanh function, used in the backpropagation step during network learning. The problem of approximating the ReLU non-linearity, widely used in the recent deep learning literature, is then considered. A half-wave Gaussian quantizer (HWGQ) is proposed for forward approximation and shown to have...

10.48550/arxiv.1702.00953 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Learning Complexity-Aware Cascades for Deep Pedestrian Detection

OPENALEX - Publications

Zhaowei Cai Mohammad Saberian Nuno Vasconcelos

The design of complexity-aware cascaded detectors, combining features of very different complexities, is considered. A new cascade design procedure is introduced, by formulating cascade learning as the Lagrangian optimization of a risk that accounts for both accuracy and complexity. A boosting algorithm, denoted as complexity aware cascade training (CompACT), is then derived to solve this optimization. CompACT cascades are shown to seek an optimal trade-off between accuracy and complexity by pushing features of higher complexity to the later cascade stages, where only a few difficult candidate...

10.48550/arxiv.1507.05348 preprint EN other-oa arXiv (Cornell University) 2015-01-01

Cascade R-CNN: High Quality Object Detection and Instance Segmentation

OPENALEX - Publications

Zhaowei Cai Nuno Vasconcelos

In object detection, the intersection over union (IoU) threshold is frequently used to define positives/negatives. The threshold used to train a detector defines its quality. While the commonly used threshold of 0.5 leads to noisy (low-quality) detections, detection performance frequently degrades for larger thresholds. This paradox of high-quality detection has two causes: 1) overfitting, due to vanishing positive samples for large thresholds, and 2) inference-time quality mismatch between detector and test hypotheses. A multi-stage object detection architecture, the Cascade R-CNN,...

10.48550/arxiv.1906.09756 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Robust Online Learned Spatio-Temporal Context Model for Visual Tracking

OPENALEX - Publications

Longyin Wen Zhaowei Cai Zhen Lei Yi Dong Stan Z. Li

Visual tracking is an important but challenging problem in the computer vision field. In the real world, the appearances of the target and its surroundings change continuously over space and time, which provides effective information to track the target robustly. However, enough attention has not been paid to the spatio-temporal appearance information in previous works. In this paper, a robust spatio-temporal context model based tracker is presented to complete the tracking task in unconstrained environments. The tracker is constructed with temporal and spatial appearance context models. The temporal appearance context model captures the historical...

10.1109/tip.2013.2293430 article EN IEEE Transactions on Image Processing 2014-01-15

Learning Complexity-Aware Cascades for Pedestrian Detection

OPENALEX - Publications

Zhaowei Cai Mohammad Saberian Nuno Vasconcelos

The problem of pedestrian detection is considered. The design of complexity-aware cascaded pedestrian detectors, combining features of very different complexities, is investigated. A new cascade design procedure is introduced, by formulating cascade learning as the Lagrangian optimization of a risk that accounts for both accuracy and complexity. A boosting algorithm, denoted as complexity aware cascade training (CompACT), is then derived to solve this optimization. CompACT cascades are shown to seek an optimal trade-off between accuracy and complexity by pushing features of higher complexity to the later...

10.1109/tpami.2019.2910514 article EN publisher-specific-oa IEEE Transactions on Pattern Analysis and Machine Intelligence 2020-04-22

Fatigue driving detection based on improved YOLOv8n-Pose

OPENALEX - Publications

Zhaowei Cai Shanling Lin Jianpu Lin Lu Shi Zhixian Lin and 1 more

10.37188/cjlcd.2024-0192 article EN Chinese Journal of Liquid Crystals and Displays 2025-01-01

Towards Universal Object Detection by Domain Attention

OPENALEX - Publications

Xudong Wang Zhaowei Cai Dashan Gao Nuno Vasconcelos

Despite increasing efforts on universal representations for visual recognition, few have addressed object detection. In this paper, we develop an effective and efficient universal object detection system that is capable of working on various image domains, from human faces and traffic signs to medical CT images. Unlike multi-domain models, this universal model does not require prior knowledge of the domain of interest. This is achieved by the introduction of a new family of adaptation layers, based on the principles of squeeze and excitation, and a new domain-attention...

10.48550/arxiv.1904.04402 preprint EN other-oa arXiv (Cornell University) 2019-01-01