Naiyu Gao

ORCID: 0000-0003-2033-9821
Research Areas
  • Advanced Image and Video Retrieval Techniques
  • Advanced Neural Network Applications
  • Domain Adaptation and Few-Shot Learning
  • Advanced Vision and Imaging
  • Image Processing Techniques and Applications
  • Video Surveillance and Tracking Methods
  • Video Analysis and Summarization
  • 3D Shape Modeling and Analysis
  • Optical measurement and interference techniques
  • Image Retrieval and Classification Techniques
  • Animal Disease Management and Epidemiology
  • Multimodal Machine Learning Applications
  • Text and Document Classification Technologies
  • Visual Attention and Saliency Detection
  • Computer Graphics and Visualization Techniques
  • Anomaly Detection Techniques and Applications
  • Industrial Vision Systems and Defect Detection
  • Human Motion and Animation
  • T-cell and Retrovirus Studies
  • Viral Infections and Immunology Research
  • Human Pose and Action Recognition

Horizon Robotics (China)
2024

Chinese Academy of Sciences
2019-2022

University of Chinese Academy of Sciences
2019-2022

Institute of Automation
2021-2022

Beijing Academy of Artificial Intelligence
2020

Shandong Institute of Automation
2020

Recently, proposal-free instance segmentation has received increasing attention due to its concise and efficient pipeline. Generally, proposal-free methods generate instance-agnostic semantic labels and instance-aware features to group pixels into different object instances. However, previous methods mostly employ separate modules for these two sub-tasks and require multiple passes for inference. We argue that treating these two sub-tasks separately is suboptimal. In fact, employing multiple separate modules significantly reduces the potential for application. The mutual...

10.1109/iccv.2019.00073 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

Proposal-free instance segmentation methods mainly generate instance-agnostic semantic labels and instance-aware features to group pixels into different object instances. However, previous methods mostly employ separate modules for these two sub-tasks and require multiple passes for inference. In addition to the lack of efficiency, previous methods also failed to perform as well as proposal-based approaches. To this end, this work proposes a single-shot proposal-free instance segmentation method that requires only one single pass for prediction. Our method is based on...

10.1109/tcsvt.2020.2985420 article EN IEEE Transactions on Circuits and Systems for Video Technology 2020-04-03

10.1109/cvpr52733.2024.01915 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

Video object detection has been an important yet challenging topic in computer vision. Traditional methods mainly focus on designing the image-level or box-level feature propagation strategies to exploit temporal information. This paper argues that with a more effective and efficient feature propagation framework, video object detectors can gain improvement in terms of both accuracy and speed. For this purpose, this paper studies object-level feature propagation, and proposes an object query propagation (QueryProp) framework for high-performance video object detection. The proposed...

10.1609/aaai.v36i1.19965 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2022-06-28

Video object detection is a challenging task because of the presence of appearance deterioration in certain video frames. One typical solution is to aggregate neighboring features to enhance per-frame features. However, such a method ignores the temporal relations between the aggregated frames, which are critical for improving video recognition accuracy. To handle the appearance deterioration problem, this paper proposes a temporal context enhanced network (TCENet) to exploit temporal context information by temporal aggregation for video object detection. To handle the displacement of the objects in videos, a novel DeformAlign...

10.1609/aaai.v34i07.6727 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

Although various methods have been proposed for pedestrian attribute recognition, most studies follow the same feature learning mechanism, i.e., learning a shared image feature to classify multiple attributes. However, this mechanism leads to low-confidence predictions and non-robustness of the model in the inference stage. In this paper, we investigate why this is the case. We mathematically discover that the central cause is that the optimal shared feature cannot maintain high similarities with multiple classifiers simultaneously in the context of minimizing classification loss....

10.1609/aaai.v36i1.19991 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2022-06-28

Panoptic segmentation (PS) is a complex scene understanding task that requires providing high-quality segmentation for both thing objects and stuff regions. Previous methods handle these two classes with semantic and instance segmentation modules separately, following with heuristic fusion or additional modules to resolve the conflicts between the two outputs. This work simplifies this pipeline of PS by consistently modeling the two classes with a novel PS framework, which extends a detection model with an extra module to predict category- and instance-aware pixel embedding...

10.1109/tip.2021.3090522 article EN IEEE Transactions on Image Processing 2021-01-01

This paper presents a unified framework for depth-aware panoptic segmentation (DPS), which aims to reconstruct a 3D scene with instance-level semantics from one single image. Prior works address this problem by simply adding a dense depth regression head to panoptic segmentation (PS) networks, resulting in two independent task branches. This neglects the mutually-beneficial relations between these two tasks, thus failing to exploit handy instance-level semantic cues to boost depth accuracy while also producing sub-optimal depth maps. To overcome these limitations, we...

10.1109/cvpr52688.2022.00168 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Recently, proposal-free instance segmentation has received increasing attention due to its concise and efficient pipeline. Generally, proposal-free methods generate instance-agnostic semantic labels and instance-aware features to group pixels into different object instances. However, previous methods mostly employ separate modules for these two sub-tasks and require multiple passes for inference. We argue that treating these two sub-tasks separately is suboptimal. In fact, employing multiple separate modules significantly reduces the potential for application. The mutual...

10.48550/arxiv.1909.01616 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Video instance segmentation (VIS) aims at segmenting and tracking objects in videos. Prior methods typically generate frame-level or clip-level object instances first and then associate them by either additional tracking heads or complex instance matching algorithms. This explicit instance association approach increases system complexity and fails to fully exploit temporal cues in videos. In this paper, we design a simple, fast yet effective query-based framework for online VIS. Relying on an instance query and proposal propagation mechanism with...

10.48550/arxiv.2301.01882 preprint EN other-oa arXiv (Cornell University) 2023-01-01

3D Semantic Scene Completion (SSC) has emerged as a nascent and pivotal undertaking in autonomous driving, aiming to predict voxel occupancy within volumetric scenes. However, prevailing methodologies primarily focus on voxel-wise feature aggregation, while neglecting instance semantics and scene context. In this paper, we present a novel paradigm termed Symphonies (Scene-from-Insts), that delves into the integration of instance queries to orchestrate 2D-to-3D reconstruction and 3D scene modeling. Leveraging our...

10.48550/arxiv.2306.15670 preprint EN cc-by-nc-sa arXiv (Cornell University) 2023-01-01

Although various methods have been proposed for multi-label classification, most approaches still follow the feature learning mechanism of single-label (multi-class) classification, namely, learning a shared image feature to classify multiple labels. However, we find this One-shared-Feature-for-Multiple-Labels (OFML) mechanism is not conducive to learning discriminative label features and makes the model non-robust. For the first time, we mathematically prove that the inferiority of the OFML mechanism is that the optimal learned image feature cannot maintain high similarities with multiple classifiers...

10.48550/arxiv.2212.01461 preprint EN other-oa arXiv (Cornell University) 2022-01-01

10.1016/s1201-9712(11)60144-6 article EN publisher-specific-oa International Journal of Infectious Diseases 2011-07-01

This paper presents a unified framework for depth-aware panoptic segmentation (DPS), which aims to reconstruct a 3D scene with instance-level semantics from one single image. Prior works address this problem by simply adding a dense depth regression head to panoptic segmentation (PS) networks, resulting in two independent task branches. This neglects the mutually-beneficial relations between these two tasks, thus failing to exploit handy instance-level semantic cues to boost depth accuracy while also producing sub-optimal depth maps. To overcome these limitations, we...

10.48550/arxiv.2206.00468 preprint EN other-oa arXiv (Cornell University) 2022-01-01