Xun Xu

ORCID: 0000-0002-5220-2240
Research Areas
  • Domain Adaptation and Few-Shot Learning
  • Video Surveillance and Tracking Methods
  • Advanced Vision and Imaging
  • Anomaly Detection Techniques and Applications
  • Advanced Neural Network Applications
  • 3D Shape Modeling and Analysis
  • Human Pose and Action Recognition
  • 3D Surveying and Cultural Heritage
  • Machine Learning and Algorithms
  • Remote Sensing and LiDAR Applications
  • Remote-Sensing Image Classification
  • Adversarial Robustness in Machine Learning
  • Robot Manipulation and Learning
  • Industrial Vision Systems and Defect Detection
  • Machine Learning and Data Classification
  • Reinforcement Learning in Robotics
  • Advanced Image and Video Retrieval Techniques
  • Image Processing Techniques and Applications
  • Advanced Image Processing Techniques
  • Robotics and Sensor-Based Localization
  • Image and Video Quality Assessment
  • Image and Object Detection Techniques
  • Robotic Path Planning Algorithms
  • COVID-19 diagnosis using AI
  • Retinal Imaging and Analysis

Agency for Science, Technology and Research
2019-2025

South China University of Technology
2009-2025

Institute for Infocomm Research
2019-2024

Southwest Jiaotong University
2023-2024

University of the Witwatersrand
2023

A*STAR Graduate Academy
2021-2023

Shanghai First People's Hospital
2021

National University of Singapore
2018-2020

Point cloud analysis has received much attention recently, and segmentation is one of the most important tasks. The success of existing approaches is attributed to deep network design and a large amount of labelled training data, where the latter is assumed to be always available. However, obtaining 3D point labels is often very costly in practice. In this work, we propose a weakly supervised approach which requires only a tiny fraction of labelled points at the training stage. This is made possible by learning gradient approximation and exploitation...

10.1109/cvpr42600.2020.01372 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

10.1109/tgrs.2025.3529031 article EN IEEE Transactions on Geoscience and Remote Sensing 2025-01-01

The ability to understand the ways to interact with objects from visual cues, a.k.a. affordance, is essential to vision-guided robotic research. This involves categorizing, segmenting and reasoning about affordance. Relevant studies in the 2D and 2.5D image domains have been made previously; however, a truly functional understanding of object affordance requires learning and prediction in the 3D physical domain, which is still absent in the community. In this work, we present the AffordanceNet dataset, a benchmark of 23k shapes from 23 semantic...

10.1109/cvpr46437.2021.00182 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

State-of-the-art methods for semantic segmentation are based on deep neural networks that are known to be data-hungry. Region-based active learning (AL) has been shown to be a promising method for reducing data annotation costs. A key design choice for region-based AL is whether to use regularly-shaped regions (e.g., rectangles) or irregularly-shaped regions (e.g., superpixels). In this work, we address this question under a realistic, click-based measurement of annotation costs. In particular, we revisit the use of superpixels and demonstrate that an inappropriate cost...

10.1109/cvpr46437.2021.01084 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Object detection in remote sensing images relies on a large amount of labeled data for training. However, the increasing number of new categories and class imbalance make exhaustive annotation impractical. Few-shot object detection (FSOD) addresses this issue by leveraging meta-learning on seen base classes and fine-tuning on novel classes with limited samples. Nonetheless, substantial scale and orientation variations of objects pose significant challenges to existing few-shot methods. To overcome these challenges, we propose...

10.1109/tgrs.2023.3332652 article EN IEEE Transactions on Geoscience and Remote Sensing 2023-01-01

Addressing the annotation challenge in 3D point cloud segmentation has inspired research into weakly supervised learning. Existing approaches mainly focus on exploiting manifold and pseudo-labeling to make use of large unlabeled data points. A fundamental challenge here lies in the intra-class variations of local geometric structure, resulting in subclasses within a semantic class. In this work, we leverage this intuition and opt for maintaining an individual classifier for each subclass. Technically, we design a multi-prototype...

10.1109/tcsvt.2023.3281151 article EN IEEE Transactions on Circuits and Systems for Video Technology 2023-05-29

Test-Time Adaptation (TTA) aims to adapt a source domain model to testing data at the inference stage, with success demonstrated in adapting to unseen corruptions. However, these attempts may fail under more challenging real-world scenarios. Existing works mainly consider test-time adaptation under a non-i.i.d. stream and continual shift. In this work, we first complement the existing TTA protocol with a globally class-imbalanced set. We demonstrate that combining all settings together poses new challenges to existing methods. We argue...

10.1609/aaai.v38i13.29435 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24

Many real-world video sequences cannot be conveniently categorized as general or degenerate; in such cases, imposing a false dichotomy in using the fundamental matrix or homography model for motion segmentation on video sequences would lead to difficulty. Even when we are confronted with a general scene-motion, the fundamental matrix approach still suffers from several defects, which we discuss in this paper. The full potential of the fundamental matrix approach could only be realized if we judiciously harness information from the simpler homography model. From these considerations, we propose a multi-model...

10.1109/tpami.2019.2929146 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2019-07-17

Deploying models on target domain data subject to distribution shift requires adaptation. Test-time training (TTT) emerges as a solution to this adaptation under a realistic scenario where access to the full source data is not available and instant inference on the target domain is required. Despite many efforts into TTT, there is confusion over the experimental settings, thus leading to unfair comparisons. In this work, we first revisit TTT assumptions and categorize TTT protocols by two key factors, i.e. whether testing data is sequentially streamed and the model...

10.1109/tpami.2024.3370963 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2024-02-28

Existing approaches towards anomaly detection (AD) often rely on a substantial amount of anomaly-free data to train representation and density models. However, large anomaly-free datasets may not always be available before the inference stage, in which case an anomaly detection model must be trained with only a handful of normal samples, a.k.a. few-shot anomaly detection (FSAD). In this paper, we propose a novel methodology to address the challenge of FSAD which incorporates two important techniques. Firstly, we employ a model pre-trained on a source dataset to initialize model weights....

10.1109/tip.2024.3374048 article EN IEEE Transactions on Image Processing 2024-01-01

10.1109/igarss53475.2024.10642376 article EN IGARSS 2024 - 2024 IEEE International Geoscience and Remote Sensing Symposium 2024-07-07

10.1109/cvpr52733.2024.02207 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

Many real-world sequences cannot be conveniently categorized as general or degenerate; in such cases, imposing a false dichotomy in using the fundamental matrix or homography model for motion segmentation would lead to difficulty. Even when we are confronted with a general scene-motion, the fundamental matrix approach still suffers from several defects, which we discuss in this paper. The full potential of the fundamental matrix approach could only be realized if we judiciously harness information from the simpler homography model. From these considerations, we propose a multi-view spectral...

10.1109/cvpr.2018.00302 article EN 2018-06-01

Recent work on curvilinear structure segmentation has mostly focused on backbone network design and loss engineering. The challenge of collecting labelled data, an expensive and labor-intensive process, has been overlooked. While labelled data is expensive to obtain, unlabelled data is often readily available. In this work, we propose SemiCurv, a semi-supervised learning (SSL) framework for curvilinear structure segmentation that is able to utilize such unlabelled data to reduce the labelling burden. Our framework addresses two key challenges in formulating the problem in a semi-supervised manner. First, to fully exploit the power...

10.1109/tip.2022.3189823 article EN IEEE Transactions on Image Processing 2022-01-01

Point cloud analysis has received much attention recently, and segmentation is one of the most important tasks. The success of existing approaches is attributed to deep network design and a large amount of labelled training data, where the latter is assumed to be always available. However, obtaining 3D point labels is often very costly in practice. In this work, we propose a weakly supervised approach which requires only a tiny fraction of labelled points at the training stage. This is made possible by learning gradient approximation and exploitation...

10.48550/arxiv.2004.04091 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Deep learning models are the state-of-the-art methods for semantic point cloud segmentation, the success of which relies on the availability of large-scale annotated datasets. However, it can be extremely time-consuming and prohibitively expensive to compile such datasets. In this work, we propose an active learning approach to maximize model performance given limited annotation budgets. We investigate the appropriate sample granularity for selection under a realistic annotation cost measurement (clicks), and demonstrate that super-point based...

10.48550/arxiv.2101.06931 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Subspace clustering has been extensively studied from the hypothesis-and-test, algebraic, and spectral clustering-based perspectives. Most approaches assume that only a single type/class of subspace is present. Generalizations to multiple types are non-trivial, plagued by challenges such as the choice of the number of models, sampling imbalance and parameter tuning. In many real-world problems, data may not lie perfectly on a linear subspace, and hand-designed models may not fit these situations. In this work, we formulate the multi-type...

10.1109/tcsvt.2021.3069094 article EN IEEE Transactions on Circuits and Systems for Video Technology 2021-03-29

3D object detection has recently received much attention due to its great potential in autonomous vehicles (AV). The success of deep learning based detectors relies on the availability of large-scale annotated datasets, which are time-consuming and expensive to compile, especially for bounding box annotation. In this work, we investigate diversity-based active learning (AL) as a solution to alleviate the annotation burden. Given a limited budget, only the most informative frames and objects are automatically selected for human...

10.48550/arxiv.2205.07708 preprint EN cc-by arXiv (Cornell University) 2022-01-01

Stereo-LiDAR fusion is often used for autonomous systems such as self-driving cars, as the two modalities are complementary to each other. Existing stereo-LiDAR methods mostly operate at the feature level or outcome level, without considering the uncertainty of depth estimation in each modality. To this end, we propose a holistic and contextual evidential network (HCENet) for depth estimation, which considers both intra-modality and inter-modality uncertainties from stereo matching and LiDAR point cloud completion. We design a dual...

10.1109/tiv.2024.3398210 article EN IEEE Transactions on Intelligent Vehicles 2024-01-01

State-of-the-art methods for semantic segmentation are based on deep neural networks trained on large-scale labeled datasets. Acquiring such datasets would incur large annotation costs, especially for dense pixel-level prediction tasks like segmentation. We consider region-based active learning as a strategy to reduce annotation costs while maintaining high performance. In this setting, batches of informative image regions instead of entire images are selected for labeling. Importantly, we propose that enforcing local...

10.1109/tip.2021.3120041 article EN IEEE Transactions on Image Processing 2021-01-01

Background clutter poses challenges to defocus blur detection. Existing approaches often produce artifact predictions in background areas with clutter and relatively low-confidence predictions in boundary areas. In this work, we tackle the above issues from two perspectives. Firstly, inspired by the recent success of the self-attention mechanism, we introduce channel-wise and spatial-wise attention modules to attentively aggregate features at different channels and spatial locations to obtain more discriminative features. Secondly,...

10.1109/tip.2022.3171424 article EN IEEE Transactions on Image Processing 2022-01-01