Sihao Lin

ORCID: 0009-0004-3235-373X
Research Areas
  • Advanced Neural Network Applications
  • Neural Networks and Applications
  • Video Surveillance and Tracking Methods
  • Domain Adaptation and Few-Shot Learning
  • Human Pose and Action Recognition
  • Robotics and Sensor-Based Localization
  • Anomaly Detection Techniques and Applications
  • Advanced Optical Sensing Technologies
  • Advanced Memory and Neural Computing
  • Industrial Vision Systems and Defect Detection
  • Remote Sensing and LiDAR Applications
  • Machine Learning in Materials Science
  • Advanced Image and Video Retrieval Techniques
  • Model Reduction and Neural Networks
  • Machine Learning and ELM
  • Adversarial Robustness in Machine Learning
  • COVID-19 diagnosis using AI
  • Ferroelectric and Negative Capacitance Devices
  • Multimodal Machine Learning Applications
  • Generative Adversarial Networks and Image Synthesis

MIT University
2022-2025

RMIT University
2022-2025

Beijing Institute of Technology
2023

South China University of Technology
2019-2021

Australian Regenerative Medicine Institute
2021

Monash University
2021

Knowledge distillation has become a de facto standard for improving the performance of small neural networks. Most previous works propose to regress the representational features from the teacher to the student in a one-to-one spatial matching fashion. However, people tend to overlook the fact that, due to architecture differences, the semantic information at the same spatial location usually varies. This greatly undermines the underlying assumption of the one-to-one distillation approach. To this end, we propose a novel one-to-all spatial matching knowledge distillation approach. Specifically, we allow each pixel of the teacher feature to be...

10.1109/cvpr52688.2022.01064 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Knowledge distillation has shown very promising ability in transferring learned representation from the larger model (teacher) to the smaller one (student). Despite many efforts, prior methods ignore the important role of retaining inter-channel correlation of features, leading to a failure to capture the intrinsic distribution of the feature space and the sufficient diversity properties of features in the teacher network. To solve this issue, we propose the novel Inter-Channel Correlation for Knowledge Distillation (ICKD), with which the diversity and homology of the feature space of the student network can...

10.1109/iccv48922.2021.00816 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

Recent advances in hand-crafted neural architectures for visual recognition underscore the pressing need to explore architecture designs comprising diverse building blocks. Concurrently, neural architecture search (NAS) methods have gained traction as a means to alleviate human efforts. Nevertheless, whether NAS methods can efficiently and effectively manage diversified search spaces featuring disparate candidates, such as Convolutional Neural Networks (CNNs) and transformers, remains an open question. In this work, we...

10.1109/tpami.2025.3529517 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2025-01-01

The human visual system is capable of not only recognizing individual objects but also comprehending the contextual relationship between them in real-world scenarios, making it highly advantageous for object detection. However, in practical applications, such contextual information is often not available. Previous attempts to compensate for this by utilizing cross-modal data such as language and statistics to obtain contextual priors have been deemed sub-optimal due to a semantic gap. To overcome this challenge, we present a seamless...

10.1109/tmm.2023.3266897 article EN IEEE Transactions on Multimedia 2023-04-13

10.1109/cvpr52733.2024.01843 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

In this paper, we explore how to leverage readily available unlabeled data to improve semi-supervised human detection performance. For this purpose, we specifically modify the region proposal network (RPN) for learning on a partially labeled dataset. Based on commonly observed false positive types, a verification module is developed to assess foreground objects in the candidate regions and to provide an important cue for filtering the RPN's proposals. The remaining proposals with high confidence scores are then used as pseudo...

10.1109/tip.2019.2944306 article EN IEEE Transactions on Image Processing 2019-10-03

We propose a GAN-based scene-specific instance synthesis and classification model for semi-supervised pedestrian detection. Instead of collecting unreliable detections from unlabeled data, we adopt a class-conditional GAN for synthesizing pedestrian instances to alleviate the problem of insufficient labeled data. With the help of a base detector, we integrate pedestrian instance synthesis and detection by including a post-refinement classifier (PRC) into a minimax game. A generator and the PRC can mutually reinforce each other by synthesizing high-fidelity pedestrian instances and providing more accurate...

10.1109/iccv.2019.00516 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

Generating realistic pedestrian instances in a semi-supervised setting is promising but challenging due to the limited labeled data. We propose an unreliable-to-reliable instance translation model (Un2Reliab) conditioned on unreliable instances, which poorly align with pedestrians. Un2Reliab mainly consists of an encoder-decoder-like generative network and a discriminative network, which are jointly trained in a minimax game. We adopt regularization to ensure that the synthesized instances are semantically similar to the corresponding ground truth....

10.1109/tmm.2021.3058546 article EN IEEE Transactions on Multimedia 2021-02-12

Multi-modality fusion and multi-task learning are becoming trendy in the 3D autonomous driving scenario, considering robust prediction and computation budget. However, naively extending the existing framework to the domain of multi-modality multi-task learning remains ineffective and even poisonous due to the notorious modality bias and task conflict. Previous works manually coordinate the learning framework with empirical knowledge, which may lead to sub-optima. To mitigate the issue, we propose a novel yet simple multi-level gradient calibration learning framework across tasks...

10.1109/iccv51070.2023.00324 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

Neural Scene Flow Prior (NSFP) and Fast Neural Scene Flow (FNSF) have shown remarkable adaptability in the context of large out-of-distribution autonomous driving. Despite their success, the underlying reasons for their astonishing generalization capabilities remain unclear. Our research addresses this gap by examining the generalization capabilities of NSFP through the lens of uniform stability, revealing that its performance is inversely proportional to the number of input point clouds. This finding sheds light on NSFP's effectiveness in handling large-scale point cloud...

10.48550/arxiv.2403.16116 preprint EN arXiv (Cornell University) 2024-03-24

The self-attention mechanism is the key component of the Transformer but is often criticized for its computation demands. Previous token pruning works motivate their methods from the view of computation redundancy but still need to load the full network and require the same memory costs. This paper introduces a novel strategy that simplifies vision transformers and reduces computational load through the selective removal of non-essential attention layers, guided by entropy considerations. We identify that, regarding the attention layer in bottom blocks, their subsequent MLP layers, i.e. two...

10.48550/arxiv.2404.05657 preprint EN arXiv (Cornell University) 2024-04-08

Knowledge distillation has shown very promising ability in transferring learned representation from the larger model (teacher) to the smaller one (student). Despite many efforts, prior methods ignore the important role of retaining inter-channel correlation of features, leading to a failure to capture the intrinsic distribution of the feature space and the sufficient diversity properties of features in the teacher network. To solve this issue, we propose the novel Inter-Channel Correlation for Knowledge Distillation (ICKD), with which the diversity and homology...

10.48550/arxiv.2202.03680 preprint EN cc-by-nc-nd arXiv (Cornell University) 2022-01-01

Multi-modality fusion and multi-task learning are becoming trendy in the 3D autonomous driving scenario, considering robust prediction and computation budget. However, naively extending the existing framework to the domain of multi-modality multi-task learning remains ineffective and even poisonous due to the notorious modality bias and task conflict. Previous works manually coordinate the learning framework with empirical knowledge, which may lead to sub-optima. To mitigate the issue, we propose a novel yet simple multi-level gradient calibration learning framework across tasks...

10.48550/arxiv.2307.16617 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Precise simultaneous localization and mapping (SLAM) is necessary for self-driving cars. In this paper, we present a SLAM system fusing lidar and IMU data. Considering that the pose initial value is a key problem for point cloud ICP alignment, we propose a method using the Extended Kalman Filter to combine the yaw obtained by feature scan context with the preintegrated IMU estimation value, aiming to improve the pose initial value of the vehicle. In addition, we adopt loop closure, which is beneficial for the whole system to reduce accumulative errors. Sufficient experiments are...

10.1109/cvci59596.2023.10397389 article EN 2023 7th CAA International Conference on Vehicular Control and Intelligence (CVCI) 2023-10-27

Knowledge distillation has become a de facto standard for improving the performance of small neural networks. Most previous works propose to regress the representational features from the teacher to the student in a one-to-one spatial matching fashion. However, people tend to overlook the fact that, due to architecture differences, the semantic information at the same spatial location usually varies. This greatly undermines the underlying assumption of the one-to-one distillation approach. To this end, we propose a novel one-to-all spatial matching knowledge distillation approach. Specifically, we allow each pixel of the teacher feature to be...

10.48550/arxiv.2205.10793 preprint EN other-oa arXiv (Cornell University) 2022-01-01