NFDI4DS | UHH-SEMS - Publication Details

Knowledge Distillation via the Target-aware Transformer

OPENALEX - Publications

Sihao Lin Hongwei Xie Bing Wang Kaicheng Yu Xiaojun Chang and 2 more

Knowledge distillation becomes a de facto standard to improve the performance of small neural networks. Most previous works propose to regress the representational features from the teacher to the student in a one-to-one spatial matching fashion. However, people tend to overlook the fact that, due to the architecture differences, the semantic information on the same location usually varies. This greatly undermines the underlying assumption of the one-to-one approach. To this end, we propose a novel one-to-all knowledge distillation approach. Specifically, we allow each pixel feature to be...

10.1109/cvpr52688.2022.01064 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Exploring Inter-Channel Correlation for Diversity-preserved Knowledge Distillation

Li Liu Qingle Huang Sihao Lin Hongwei Xie Bing Wang and 2 more

Knowledge Distillation has shown very promising ability in transferring learned representation from the larger model (teacher) to the smaller one (student). Despite many efforts, prior methods ignore the important role of retaining inter-channel correlation of features, leading to the lack of capturing the intrinsic distribution of the feature space and sufficient diversity properties of features in the teacher network. To solve the issue, we propose the novel Inter-Channel Correlation for Knowledge Distillation (ICKD), with which the diversity and homology of the feature space of the student network can...

10.1109/iccv48922.2021.00816 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

BossNAS Family: Block-wisely Self-supervised Neural Architecture Search

Changlin Li Sihao Lin Tao Tang Guangrun Wang Mingjie Li and 2 more

Recent advances in hand-crafted neural architectures for visual recognition underscore the pressing need to explore architecture designs comprising diverse building blocks. Concurrently, neural architecture search (NAS) methods have gained traction as a means to alleviate human efforts. Nevertheless, the question of whether NAS methods can efficiently and effectively manage diversified search spaces featuring disparate candidates, such as Convolutional Neural Networks (CNNs) and transformers, remains an open question. In this work, we...

10.1109/tpami.2025.3529517 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2025-01-01

Context Matters: Distilling Knowledge Graph for Enhanced Object Detection

Aijia Yang Sihao Lin Chung‐Hsing Yeh Minglei Shu Yi Yang and 1 more

The human visual system is capable of not only recognizing individual objects but also comprehending the contextual relationship between them in real-world scenarios, making it highly advantageous for object detection. However, in practical applications, such contextual information is often not available. Previous attempts to compensate for this by utilizing cross-modal data such as language and statistics to obtain contextual priors have been deemed sub-optimal due to a semantic gap. To overcome this challenge, we present a seamless...

10.1109/tmm.2023.3266897 article EN IEEE Transactions on Multimedia 2023-04-13

MLP Can Be a Good Transformer Learner

Sihao Lin Pumeng Lyu Dongrui Liu Tao Tang Xiaodan Liang and 2 more

10.1109/cvpr52733.2024.01843 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

Semi-Supervised Human Detection via Region Proposal Networks Aided by Verification

Si Wu Wenhao Wu Shiyao Lei Sihao Lin Rui Li and 2 more

In this paper, we explore how to leverage readily available unlabeled data to improve semi-supervised human detection performance. For this purpose, we specifically modify the region proposal network (RPN) for learning on a partially labeled dataset. Based on commonly observed false positive types, a verification module is developed to assess foreground objects in the candidate regions to provide an important cue for filtering the RPN's proposals. The remaining proposals with high confidence scores are then used as pseudo...

10.1109/tip.2019.2944306 article EN IEEE Transactions on Image Processing 2019-10-03

Semi-Supervised Pedestrian Instance Synthesis and Detection With Mutual Reinforcement

Si Wu Sihao Lin Wenhao Wu Mohamed Azzam Hau-San Wong

We propose a GAN-based scene-specific instance synthesis and classification model for semi-supervised pedestrian detection. Instead of collecting unreliable detections from unlabeled data, we adopt a class-conditional GAN for synthesizing pedestrian instances to alleviate the problem of insufficient labeled data. With the help of a base detector, we integrate pedestrian instance synthesis and detection by including a post-refinement classifier (PRC) into a minimax game. A generator and the PRC can mutually reinforce each other by synthesizing high-fidelity pedestrian instances and providing more accurate...

10.1109/iccv.2019.00516 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

Unreliable-to-Reliable Instance Translation for Semi-Supervised Pedestrian Detection

Sihao Lin Wenhao Wu Si Wu Yong Xu Hau-San Wong

Generating realistic pedestrian instances in a semi-supervised setting is promising but challenging due to the limited labeled data. We propose an unreliable-to-reliable instance translation model (Un2Reliab) conditioned on unreliable instances which poorly align with pedestrians. Un2Reliab mainly consists of an encoder-decoder-like generative network and a discriminative network, which are jointly trained in a minimax game. We adopt a regularization to ensure that the synthesized instances are semantically similar to the corresponding ground truth....

10.1109/tmm.2021.3058546 article EN IEEE Transactions on Multimedia 2021-02-12

FULLER: Unified Multi-modality Multi-task 3D Perception via Multi-level Gradient Calibration

Zhijian Huang Sihao Lin Guiyu Liu Mukun Luo Chaoqiang Ye and 3 more

Multi-modality fusion and multi-task learning are becoming trendy in the 3D autonomous driving scenario, considering robust prediction and computation budget. However, naively extending the existing framework to the domain of multi-modality multi-task learning remains ineffective and even poisonous due to the notorious modality bias and task conflict. Previous works manually coordinate the learning framework with empirical knowledge, which may lead to sub-optima. To mitigate the issue, we propose a novel yet simple multi-level gradient calibration learning framework across tasks...

10.1109/iccv51070.2023.00324 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

Self-Supervised Multi-Frame Neural Scene Flow

Dongrui Liu Daqi Liu Xue-Qian Li Sihao Lin Hongwei Xie and 3 more

Neural Scene Flow Prior (NSFP) and Fast Neural Scene Flow (FNSF) have shown remarkable adaptability in the context of large out-of-distribution autonomous driving. Despite their success, the underlying reasons for their astonishing generalization capabilities remain unclear. Our research addresses this gap by examining NSFP through the lens of uniform stability, revealing that its performance is inversely proportional to the number of input point clouds. This finding sheds light on NSFP's effectiveness in handling large-scale point cloud...

10.48550/arxiv.2403.16116 preprint EN arXiv (Cornell University) 2024-03-24

MLP Can Be A Good Transformer Learner

Sihao Lin Pumeng Lyu Dongrui Liu Tao Tang Xiaodan Liang and 2 more

The self-attention mechanism is the key of the Transformer but is often criticized for its computation demands. Previous token pruning works motivate their methods from the view of computation redundancy but still need to load the full network and require the same memory costs. This paper introduces a novel strategy that simplifies vision transformers and reduces computational load through the selective removal of non-essential attention layers, guided by entropy considerations. We identify that regarding the attention layer in bottom blocks, their subsequent MLP layers, i.e. two...

10.48550/arxiv.2404.05657 preprint EN arXiv (Cornell University) 2024-04-08

Exploring Inter-Channel Correlation for Diversity-preserved Knowledge Distillation

Li Liu Qingle Huang Sihao Lin Hongwei Xie Bing Wang and 2 more

Knowledge Distillation has shown very promising ability in transferring learned representation from the larger model (teacher) to the smaller one (student). Despite many efforts, prior methods ignore the important role of retaining inter-channel correlation of features, leading to the lack of capturing the intrinsic distribution of the feature space and sufficient diversity properties of features in the teacher network. To solve the issue, we propose the novel Inter-Channel Correlation for Knowledge Distillation (ICKD), with which the diversity and homology...

10.48550/arxiv.2202.03680 preprint EN cc-by-nc-nd arXiv (Cornell University) 2022-01-01

FULLER: Unified Multi-modality Multi-task 3D Perception via Multi-level Gradient Calibration

Zhijian Huang Sihao Lin Guiyu Liu Mukun Luo Chaoqiang Ye and 3 more

Multi-modality fusion and multi-task learning are becoming trendy in the 3D autonomous driving scenario, considering robust prediction and computation budget. However, naively extending the existing framework to the domain of multi-modality multi-task learning remains ineffective and even poisonous due to the notorious modality bias and task conflict. Previous works manually coordinate the learning framework with empirical knowledge, which may lead to sub-optima. To mitigate the issue, we propose a novel yet simple multi-level gradient calibration learning framework across tasks...

10.48550/arxiv.2307.16617 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Feature Scan Context aided Lidar-IMU Simultaneously Localization and Mapping

Yan Wen Lijin Han Ying Li Sihao Lin Shida Nie and 1 more

Precise simultaneous localization and mapping is necessary for self-driving cars. In this paper, we present a SLAM system fusing lidar and IMU data. Considering that the pose initial value is a key problem for point cloud ICP alignment, we propose a method using the Extended Kalman Filter to combine the yaw obtained by feature scan context and the preintegrated estimation value, aiming to improve the pose initial value of the vehicle. In addition, we adopt loop closure, which is beneficial to the whole system to reduce accumulative errors. Sufficient experiments are...

10.1109/cvci59596.2023.10397389 article EN 2023 7th CAA International Conference on Vehicular Control and Intelligence (CVCI) 2023-10-27

Knowledge Distillation via the Target-aware Transformer

Sihao Lin Hongwei Xie Bing Wang Kaicheng Yu Xiaojun Chang and 2 more

Knowledge distillation becomes a de facto standard to improve the performance of small neural networks. Most previous works propose to regress the representational features from the teacher to the student in a one-to-one spatial matching fashion. However, people tend to overlook the fact that, due to the architecture differences, the semantic information on the same location usually varies. This greatly undermines the underlying assumption of the one-to-one approach. To this end, we propose a novel one-to-all knowledge distillation approach. Specifically, we allow each pixel feature to be...

10.48550/arxiv.2205.10793 preprint EN other-oa arXiv (Cornell University) 2022-01-01