Zhaowei Cai

ORCID: 0000-0003-2023-7761
Research Areas
  • Advanced Neural Network Applications
  • Video Surveillance and Tracking Methods
  • Advanced Image and Video Retrieval Techniques
  • Domain Adaptation and Few-Shot Learning
  • Multimodal Machine Learning Applications
  • Anomaly Detection Techniques and Applications
  • Human Pose and Action Recognition
  • Advanced Image Fusion Techniques
  • Adversarial Robustness in Machine Learning
  • Image Enhancement Techniques
  • Machine Learning and ELM
  • Face Recognition and Analysis
  • Visual Attention and Saliency Detection
  • Image and Video Quality Assessment
  • Remote-Sensing Image Classification
  • Color Science and Applications
  • Handwritten Text Recognition Techniques
  • Industrial Vision Systems and Defect Detection
  • Fire Detection and Safety Systems
  • Face and Expression Recognition
  • Advanced Sensor and Control Systems
  • Image Processing Techniques and Applications
  • Medical Imaging Techniques and Applications
  • Image and Signal Denoising Methods
  • Advanced Vision and Imaging

Wuhan University of Technology
2024

Amazon (United States)
2021-2022

Seattle University
2022

Amazon (Germany)
2021

UC San Diego Health System
2017-2020

University of California, San Diego
2014-2020

Universidad Católica Santo Domingo
2019

Chinese Academy of Sciences
2012-2015

Institute of Automation
2012-2015

Shandong Institute of Automation
2013-2014

In object detection, an intersection over union (IoU) threshold is required to define positives and negatives. An object detector trained with a low IoU threshold, e.g. 0.5, usually produces noisy detections. However, detection performance tends to degrade with increasing IoU thresholds. Two main factors are responsible for this: 1) overfitting during training, due to exponentially vanishing positive samples, and 2) inference-time mismatch between the IoUs for which the detector is optimal and those of the input hypotheses. A...

10.1109/cvpr.2018.00644 preprint EN 2018-06-01
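
To illustrate the role of the IoU threshold described in the abstract above, the following is a minimal sketch, not the authors' implementation. The box format `(x1, y1, x2, y2)` and the helper names `iou` and `label_proposals` are assumptions made for this example. It shows how raising the threshold shrinks the set of proposals counted as positives.

```python
from typing import List, Sequence, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_proposals(
    proposals: Sequence[Box], gt_boxes: Sequence[Box], threshold: float = 0.5
) -> List[bool]:
    """Mark a proposal positive if its best IoU with any ground truth meets the threshold."""
    return [
        max((iou(p, g) for g in gt_boxes), default=0.0) >= threshold
        for p in proposals
    ]


# Hypothetical data: one ground-truth box and three proposals of decreasing overlap.
gt = [(0.0, 0.0, 10.0, 10.0)]
proposals = [(0.0, 0.0, 10.0, 10.0), (0.0, 0.0, 10.0, 6.0), (5.0, 5.0, 15.0, 15.0)]

low = label_proposals(proposals, gt, threshold=0.5)
high = label_proposals(proposals, gt, threshold=0.7)
```

With these hypothetical boxes, two proposals count as positives at a threshold of 0.5 and only one at 0.7, a small-scale picture of the vanishing positive samples the abstract mentions.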

In object detection, the intersection over union (IoU) threshold is frequently used to define positives/negatives. The threshold used to train a detector defines its quality. While the commonly used threshold of 0.5 leads to noisy (low-quality) detections, detection performance degrades for larger thresholds. This paradox of high-quality detection has two causes: 1) overfitting, due to vanishing positive samples for large thresholds, and 2) inference-time quality mismatch between detector and test hypotheses. A multi-stage architecture, the Cascade R-CNN, composed...

10.1109/tpami.2019.2956516 article EN publisher-specific-oa IEEE Transactions on Pattern Analysis and Machine Intelligence 2019-11-28

The problem of quantizing the activations of a deep neural network is considered. An examination of the popular binary quantization approach shows that this consists of approximating a classical non-linearity, the hyperbolic tangent, by two functions: a piecewise constant sign function, which is used in feedforward computations, and a piecewise linear hard tanh function, used in the backpropagation step during learning. The problem of approximating the widely used ReLU non-linearity is then considered. A half-wave Gaussian quantizer (HWGQ) is proposed for forward approximation and shown to have efficient...

10.1109/cvpr.2017.574 article EN 2017-07-01

The design of complexity-aware cascaded detectors, combining features of very different complexities, is considered. A new cascade design procedure is introduced, by formulating cascade learning as the Lagrangian optimization of a risk that accounts for both accuracy and complexity. A boosting algorithm, denoted complexity aware cascade training (CompACT), is then derived to solve this optimization. CompACT cascades are shown to seek an optimal trade-off between accuracy and complexity by pushing features of higher complexity to the later cascade stages, where only a few difficult candidate...

10.1109/iccv.2015.384 preprint EN 2015-12-01

Despite increasing efforts on universal representations for visual recognition, few have addressed object detection. In this paper, we develop an effective and efficient universal object detection system that is capable of working on various image domains, from human faces and traffic signs to medical CT images. Unlike multi-domain models, this universal model does not require prior knowledge of the domain of interest. This is achieved by the introduction of a new family of adaptation layers, based on the principles of squeeze and excitation, and a new domain-attention...

10.1109/cvpr.2019.00746 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

In recent years, numerous effective multi-object tracking (MOT) methods have been developed because of the wide range of applications. Existing performance evaluations of MOT methods usually separate the object tracking step from the object detection step by using the same fixed object detection results for comparisons. In this work, we perform a comprehensive quantitative study on the effects of object detection accuracy on the overall MOT performance, using the new large-scale University at Albany DETection and tRACking (UA-DETRAC) benchmark dataset. The UA-DETRAC benchmark dataset consists of 100 challenging video...

10.48550/arxiv.1511.04136 preprint EN other-oa arXiv (Cornell University) 2015-01-01

In object detection, an intersection over union (IoU) threshold is required to define positives and negatives. An object detector trained with a low IoU threshold, e.g. 0.5, usually produces noisy detections. However, detection performance tends to degrade with increasing IoU thresholds. Two main factors are responsible for this: 1) overfitting during training, due to exponentially vanishing positive samples, and 2) inference-time mismatch between the IoUs for which the detector is optimal and those of the input hypotheses. A...

10.48550/arxiv.1712.00726 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Low-precision networks, with weights and activations quantized to low bit-width, are widely used to accelerate inference on edge devices. However, current solutions are uniform, using identical bit-width for all filters. This fails to account for the different sensitivities of different filters and is suboptimal. Mixed-precision networks address this problem, by tuning the bit-width to individual filter requirements. In this work, the problem of optimal mixed-precision network search (MPS) is considered. To circumvent its difficulties of discrete search space...

10.1109/cvpr42600.2020.00242 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

We present a plug-in replacement for batch normalization (BN) called exponential moving average normalization (EMAN), which improves the performance of existing student-teacher based self- and semi-supervised learning techniques. Unlike the standard BN, where the statistics are computed within each batch, EMAN, used in the teacher, updates its statistics by exponential moving average from the BN statistics of the student. This design reduces the intrinsic cross-sample dependency of BN and enhances the generalization of the teacher. EMAN improves strong baselines for self-supervised learning by 4-6/1-2 points and semi-supervised learning by about 7/2...

10.1109/cvpr46437.2021.00026 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01
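
The statistics update described in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's code: the function names `ema_update` and `normalize`, the momentum value, and the per-channel list representation are all hypothetical. The idea shown is that the teacher does not compute statistics from its own batch and instead tracks the student's BN statistics by exponential moving average.

```python
import math
from typing import List


def ema_update(teacher: List[float], student: List[float], momentum: float = 0.99) -> List[float]:
    """Move teacher statistics toward the student's by exponential moving average.

    The momentum value is an assumption for this sketch.
    """
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


def normalize(x: List[float], mean: List[float], var: List[float], eps: float = 1e-5) -> List[float]:
    """Normalize per-channel values with fixed statistics (no batch dependence)."""
    return [(xi - m) / math.sqrt(v + eps) for xi, m, v in zip(x, mean, var)]


# Hypothetical per-channel statistics for a two-channel layer.
teacher_mean, teacher_var = [0.0, 0.0], [1.0, 1.0]
student_mean, student_var = [1.0, -2.0], [4.0, 0.25]

# One training step: the teacher's statistics follow the student's.
teacher_mean = ema_update(teacher_mean, student_mean, momentum=0.9)
teacher_var = ema_update(teacher_var, student_var, momentum=0.9)

# The teacher normalizes each sample with these statistics, independent of the batch.
out = normalize([0.5, 0.5], teacher_mean, teacher_var)
```

Because `normalize` uses stored statistics rather than batch statistics, the teacher's output for one sample does not depend on the other samples in the batch, which is the cross-sample dependency the abstract refers to.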

While some efforts have been made to handle deformation and occlusion in visual tracking, they are still great challenges. In this paper, a dynamic graph-based tracker (DGT) is proposed to address these two challenges in a unified framework. In the dynamic target graph, nodes are the target local parts encoding appearance information, and edges are the interactions between nodes encoding inner geometric structure information. This graph representation provides much more information for tracking in the presence of deformation and occlusion. Target tracking is then formulated as undirected...

10.1109/tip.2014.2364919 article EN IEEE Transactions on Image Processing 2014-10-23

The problem of quantizing the activations of a deep neural network is considered. An examination of the popular binary quantization approach shows that this consists of approximating a classical non-linearity, the hyperbolic tangent, by two functions: a piecewise constant sign function, which is used in feedforward computations, and a piecewise linear hard tanh function, used in the backpropagation step during learning. The problem of approximating the ReLU non-linearity, widely used in the recent deep learning literature, is then considered. A half-wave Gaussian quantizer (HWGQ) is proposed for forward approximation and shown to have...

10.48550/arxiv.1702.00953 preprint EN other-oa arXiv (Cornell University) 2017-01-01

The design of complexity-aware cascaded detectors, combining features of very different complexities, is considered. A new cascade design procedure is introduced, by formulating cascade learning as the Lagrangian optimization of a risk that accounts for both accuracy and complexity. A boosting algorithm, denoted complexity aware cascade training (CompACT), is then derived to solve this optimization. CompACT cascades are shown to seek an optimal trade-off between accuracy and complexity by pushing features of higher complexity to the later cascade stages, where only a few difficult candidate...

10.48550/arxiv.1507.05348 preprint EN other-oa arXiv (Cornell University) 2015-01-01

In object detection, the intersection over union (IoU) threshold is frequently used to define positives/negatives. The threshold used to train a detector defines its quality. While the commonly used threshold of 0.5 leads to noisy (low-quality) detections, detection performance degrades for larger thresholds. This paradox of high-quality detection has two causes: 1) overfitting, due to vanishing positive samples for large thresholds, and 2) inference-time quality mismatch between detector and test hypotheses. A multi-stage architecture, the Cascade R-CNN,...

10.48550/arxiv.1906.09756 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Visual tracking is an important but challenging problem in the computer vision field. In the real world, the appearances of the target and its surroundings change continuously over space and time, which provides effective information to track the target robustly. However, not enough attention has been paid to spatio-temporal appearance information in previous works. In this paper, a robust spatio-temporal context model based tracker is presented to complete the tracking task in unconstrained environments. The tracker is constructed with temporal and spatial appearance context models. The temporal appearance context model captures the historical...

10.1109/tip.2013.2293430 article EN IEEE Transactions on Image Processing 2014-01-15

The problem of pedestrian detection is considered. The design of complexity-aware cascaded pedestrian detectors, combining features of very different complexities, is investigated. A new cascade design procedure is introduced, by formulating cascade learning as the Lagrangian optimization of a risk that accounts for both accuracy and complexity. A boosting algorithm, denoted complexity aware cascade training (CompACT), is then derived to solve this optimization. CompACT cascades are shown to seek an optimal trade-off between accuracy and complexity by pushing features of higher complexity to the later...

10.1109/tpami.2019.2910514 article EN publisher-specific-oa IEEE Transactions on Pattern Analysis and Machine Intelligence 2020-04-22

Despite increasing efforts on universal representations for visual recognition, few have addressed object detection. In this paper, we develop an effective and efficient universal object detection system that is capable of working on various image domains, from human faces and traffic signs to medical CT images. Unlike multi-domain models, this universal model does not require prior knowledge of the domain of interest. This is achieved by the introduction of a new family of adaptation layers, based on the principles of squeeze and excitation, and a new domain-attention...

10.48550/arxiv.1904.04402 preprint EN other-oa arXiv (Cornell University) 2019-01-01