- Advanced Image and Video Retrieval Techniques
- Advanced Neural Network Applications
- Domain Adaptation and Few-Shot Learning
- Advanced Vision and Imaging
- Image Processing Techniques and Applications
- Video Surveillance and Tracking Methods
- Video Analysis and Summarization
- 3D Shape Modeling and Analysis
- Optical measurement and interference techniques
- Image Retrieval and Classification Techniques
- Animal Disease Management and Epidemiology
- Multimodal Machine Learning Applications
- Text and Document Classification Technologies
- Visual Attention and Saliency Detection
- Computer Graphics and Visualization Techniques
- Anomaly Detection Techniques and Applications
- Industrial Vision Systems and Defect Detection
- Human Motion and Animation
- T-cell and Retrovirus Studies
- Viral Infections and Immunology Research
- Human Pose and Action Recognition
Horizon Robotics (China)
2024
Chinese Academy of Sciences
2019-2022
University of Chinese Academy of Sciences
2019-2022
Institute of Automation
2021-2022
Beijing Academy of Artificial Intelligence
2020
Shandong Institute of Automation
2020
Recently, proposal-free instance segmentation has received increasing attention due to its concise and efficient pipeline. Generally, these methods generate instance-agnostic semantic labels and instance-aware features to group pixels into different object instances. However, previous methods mostly employ separate modules for these two sub-tasks and require multiple passes for inference. We argue that treating the two sub-tasks separately is suboptimal. In fact, employing separate modules significantly reduces the potential for application. The mutual...
Proposal-free instance segmentation methods mainly generate instance-agnostic semantic labels and instance-aware features to group pixels into different object instances. However, previous methods mostly employ separate modules for these two sub-tasks and require multiple passes for inference. In addition to the lack of efficiency, they also failed to perform as well as proposal-based approaches. To this end, this work proposes a single-shot proposal-free method that requires only one single pass for prediction. Our method is based on...
Video object detection has been an important yet challenging topic in computer vision. Traditional methods mainly focus on designing image-level or box-level feature propagation strategies to exploit temporal information. This paper argues that with a more effective and efficient framework, video detectors can gain improvement in terms of both accuracy and speed. For this purpose, it studies object-level propagation and proposes a query propagation (QueryProp) framework for high-performance detection. The proposed...
Video object detection is a challenging task because of the presence of appearance deterioration in certain video frames. One typical solution is to aggregate neighboring features to enhance per-frame features. However, such a method ignores the temporal relations between the aggregated frames, which are critical for improving recognition accuracy. To handle this problem, this paper proposes a temporal context enhanced network (TCENet) to exploit temporal context information by temporal aggregation for video object detection. To handle the displacement of objects in videos, a novel DeformAlign...
Although various methods have been proposed for pedestrian attribute recognition, most studies follow the same feature learning mechanism, i.e., learning a shared image feature to classify multiple attributes. However, this mechanism leads to low-confidence predictions and non-robustness of the model in the inference stage. In this paper, we investigate why this is the case. We mathematically discover that the central cause is that the optimal shared feature cannot maintain high similarities with the classifiers simultaneously in the context of minimizing classification loss....
Panoptic segmentation (PS) is a complex scene understanding task that requires providing high-quality segmentation for both thing objects and stuff regions. Previous methods handle these two classes with semantic and instance segmentation modules separately, followed by heuristic fusion or additional modules to resolve the conflicts between the two outputs. This work simplifies this pipeline of PS by consistently modeling the two classes with a novel framework, which extends a detection model with an extra module to predict category- and instance-aware pixel embedding...
This paper presents a unified framework for depth-aware panoptic segmentation (DPS), which aims to reconstruct a 3D scene with instance-level semantics from one single image. Prior works address this problem by simply adding a dense depth regression head to panoptic segmentation (PS) networks, resulting in two independent task branches. This neglects the mutually-beneficial relations between these two tasks, thus failing to exploit handy semantic cues to boost depth accuracy while also producing sub-optimal depth maps. To overcome these limitations, we...
Video instance segmentation (VIS) aims at segmenting and tracking objects in videos. Prior methods typically generate frame-level or clip-level object instances first and then associate them by either additional heads or complex matching algorithms. This explicit association approach increases system complexity and fails to fully exploit temporal cues in videos. In this paper, we design a simple, fast yet effective query-based framework for online VIS. Relying on an instance query and proposal propagation mechanism with...
3D Semantic Scene Completion (SSC) has emerged as a nascent and pivotal undertaking in autonomous driving, aiming to predict voxel occupancy within volumetric scenes. However, prevailing methodologies primarily focus on voxel-wise feature aggregation, while neglecting instance semantics and scene context. In this paper, we present a novel paradigm termed Symphonies (Scene-from-Insts), that delves into the integration of instance queries to orchestrate 2D-to-3D reconstruction and 3D scene modeling. Leveraging our...
Although various methods have been proposed for multi-label classification, most approaches still follow the feature learning mechanism of single-label (multi-class) classification, namely, learning a shared image feature to classify multiple labels. However, we find this One-shared-Feature-for-Multiple-Labels (OFML) mechanism is not conducive to learning discriminative label features and makes the model non-robust. For the first time, we mathematically prove that the inferiority of the OFML mechanism is that the optimal learned image feature cannot maintain high similarities with the classifiers...