- Video Surveillance and Tracking Methods
- Human Pose and Action Recognition
- Face recognition and analysis
- Indoor and Outdoor Localization Technologies
- Domain Adaptation and Few-Shot Learning
- Advanced Vision and Imaging
- Gaze Tracking and Assistive Technology
- Optical measurement and interference techniques
- Speech and Audio Processing
- Robotics and Sensor-Based Localization
- Advanced Image and Video Retrieval Techniques
- Privacy-Preserving Technologies in Data
- Gait Recognition and Analysis
- Remote Sensing and LiDAR Applications
- Wireless Networks and Protocols
- Textile materials and evaluations
- Network Security and Intrusion Detection
- Glaucoma and retinal disorders
- Image Processing Techniques and Applications
- Multimodal Machine Learning Applications
- Medical Image Segmentation Techniques
- Video Analysis and Summarization
- Advanced Neural Network Applications
- Anomaly Detection Techniques and Applications
- Mobile Crowdsensing and Crowdsourcing
University of California, San Diego
2022-2024
University of Electronic Science and Technology of China
2018-2022
Shanghai University of Electric Power
2022
International University of the Caribbean
2022
Yunnan Normal University
2022
Donghua University
2022
University of Illinois Urbana-Champaign
2018-2021
Xihua University
2020
Qingdao Eighth People's Hospital
2019
University of California, Los Angeles
2016
Domain adaptation in person re-identification (re-ID) has always been a challenging task. In this work, we explore how to harness the natural similar characteristics existing in the samples from the target domain for learning to conduct person re-ID in an unsupervised manner. Concretely, we propose a Self-similarity Grouping (SSG) approach, which exploits the potential similarity (from the global body to local parts) of unlabeled samples to build multiple clusters from different views automatically. These independent clusters are then assigned with...
Despite the remarkable progress in person re-identification (Re-ID), such approaches still suffer from failure cases where the discriminative body parts are missing. To mitigate this type of failure, we propose a simple yet effective Horizontal Pyramid Matching (HPM) approach to fully exploit various partial information of a given person, so that correct person candidates can be identified even if some key parts are missing. With HPM, we make the following contributions to produce more robust feature representations for the Re-ID task: 1)...
In this work, we propose a novel Spatial-Temporal Attention (STA) approach to tackle the large-scale person re-identification task in videos. Different from most existing methods, which simply compute representations of video clips using frame-level aggregation (e.g. average pooling), the proposed STA adopts a more effective way of producing a robust clip-level feature representation. Concretely, our STA fully exploits the discriminative parts of one target person in both the spatial and temporal dimensions, which results...
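The contrast drawn above, between plain frame-level averaging and an attention-weighted clip representation, can be illustrated with a minimal sketch. This is not the STA architecture, whose details are not given in the excerpt: the function names and the per-frame `frame_scores` input are hypothetical, and the sketch only weights frames along the temporal dimension.

```python
import math

def average_pool(frame_feats):
    """Clip-level feature as the plain mean of the frame-level features."""
    n = len(frame_feats)
    dim = len(frame_feats[0])
    return [sum(f[d] for f in frame_feats) / n for d in range(dim)]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(frame_feats, frame_scores):
    """Clip-level feature as a softmax-weighted sum of frame features.

    frame_scores is a hypothetical per-frame relevance score; in a real
    model it would be predicted by a learned attention module.
    """
    weights = softmax(frame_scores)
    dim = len(frame_feats[0])
    return [sum(w * f[d] for w, f in zip(weights, frame_feats))
            for d in range(dim)]

if __name__ == "__main__":
    feats = [[1.0, 0.0], [3.0, 2.0]]
    print(average_pool(feats))                  # every frame counts equally
    print(attention_pool(feats, [0.0, 4.0]))    # second frame dominates
```

With equal scores the two functions agree; as one frame's score grows, the clip feature moves toward that frame, which is the behaviour that lets blurred or occluded frames be down-weighted.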
Localization is of key importance to a variety of applications. Most previous approaches require the objects to carry electronic devices, while on many occasions device-free localization approaches are needed. This paper proposes a device-free localization method based on WiFi Channel State Information (CSI) and Deep Neural Networks (DNN). In an area covered with WiFi, human movements may cause observable variations in the WiFi signals. By analyzing the CSI fingerprint patterns and modelling the dependency between CSI fingerprints and locations through deep neural...
Video instance segmentation is a complex task in which we need to detect, segment, and track each object for any given video. Previous approaches only utilize single-frame features for the detection, segmentation, and tracking of objects, and they suffer in the video scenario due to several distinct challenges such as motion blur and drastic appearance change. To eliminate the ambiguities introduced by only using single-frame features, we propose a novel comprehensive feature aggregation approach (CompFeat) to refine features at both frame-level...
6D object pose estimation is one of the fundamental problems in computer vision and robotics research. While a lot of recent efforts have been made on generalizing pose estimation to novel instances within the same category, namely category-level 6D pose estimation, it is still restricted to constrained environments given the limited number of annotated data. In this paper, we collect Wild6D, a new unlabeled RGBD object video dataset with diverse backgrounds. We utilize this data to generalize category-level 6D object pose estimation in the wild with semi-supervised learning. We propose a new model, called...
Tracking segmentation masks of multiple instances has been intensively studied, but still faces two fundamental challenges: 1) the requirement of large-scale, frame-wise annotation, and 2) the complexity of two-stage approaches. To resolve these challenges, we introduce a novel semi-supervised framework by learning instance tracking networks with only a labeled image dataset and unlabeled video sequences. With an instance contrastive objective, we learn an embedding to discriminate each instance from the others. We show that even when...
Despite the remarkable recent progress, person re-identification (Re-ID) approaches are still suffering from failure cases where the discriminative body parts are missing. To mitigate such cases, we propose a simple yet effective Horizontal Pyramid Matching (HPM) approach to fully exploit various partial information of a given person, so that correct person candidates can be identified even if some key parts are missing. Within HPM, we make the following contributions to produce a more robust feature representation for the Re-ID task: 1) we learn...
While 6D object pose estimation has wide applications across computer vision and robotics, it remains far from being solved due to the lack of annotations. The problem becomes even more challenging when moving to category-level 6D pose, which requires generalization to unseen instances. Current approaches are restricted by leveraging annotations from simulation or collected from humans. In this paper, we overcome this barrier by introducing a self-supervised learning approach trained directly on large-scale real-world...
3D-2D medical image matching is a crucial task in image-guided surgery, radiation therapy and minimally invasive surgery. The task relies on identifying the correspondence between a 2D reference image and the 2D projection of a 3D target image. In this paper, we propose a novel framework for 3D-2D matching between CT and X-ray images, tailored for vertebra images. The main idea is to learn a vertebra detector by means of a deep neural network. The detected vertebra is represented by a bounding box in the projection. Next, the bounding box annotated by the doctor is matched to the corresponding bounding box in the projection. We evaluate our proposed method...
People flow counting is the task of counting the number of people passing through a passage or gate. Conventional vision-based approaches require line-of-sight (LoS) and raise privacy concerns, while most radio-based approaches require dedicated equipment and incur high cost. In this article, we propose to exploit commodity WiFi to count continuous people flows in a device-free way, requiring only one pair of WiFi transmitter and receiver. Leveraging the Doppler effect induced by human passing, the proposed method, named WiFlowCount, first constructs the spectrogram of Doppler shifts...
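For intuition about the Doppler effect mentioned above, the following sketch evaluates the textbook relation between the rate of change of a reflected path's length and the resulting frequency shift, f_D = -(1/λ) · dℓ/dt. It is not the WiFlowCount pipeline, which is not described in the excerpt; the function name and the 5 GHz default carrier are assumptions chosen for illustration.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0

def doppler_shift_hz(path_rate_m_s, carrier_hz=5.0e9):
    """Doppler shift of a reflected signal.

    path_rate_m_s: rate of change of the reflected path length in m/s
                   (negative when the path is getting shorter).
    carrier_hz:    carrier frequency; 5 GHz is an assumed example value.
    """
    wavelength_m = SPEED_OF_LIGHT_M_S / carrier_hz
    return -path_rate_m_s / wavelength_m

if __name__ == "__main__":
    # A person whose motion shortens the reflected path by 1 m every second
    # produces a positive shift; lengthening the path flips the sign.
    print(round(doppler_shift_hz(-1.0), 2))
    print(round(doppler_shift_hz(+1.0), 2))
```

The sign of the shift distinguishes a shortening path from a lengthening one, which is why a spectrogram of Doppler shifts carries information about the direction of movement as well as its speed.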
To address the problem that indoor positioning is easily affected by the environment and that there is a fixed bias in the positioning results, a new adaptive UKF is proposed to achieve high-precision positioning with UWB. This method considers the error characteristics of UWB in the indoor environment. Through the construction of a state compensation function, together with the UKF algorithm, and using the previous estimation value as the reference, it calculates the compensation amount at the current time. It is able to perform real-time compensation for any position of the tag in the area and thereby improve the positioning accuracy. The experimental results show...
Video instance segmentation is a complex task in which we need to detect, segment, and track each object for any given video. Previous approaches only utilize single-frame features for the detection, segmentation, and tracking of objects, and they suffer in the video scenario due to several distinct challenges such as motion blur and drastic appearance change. To eliminate the ambiguities introduced by only using single-frame features, we propose a novel comprehensive feature aggregation approach (CompFeat) to refine features at both frame-level...
Federated learning is a distributed machine learning framework that enables distributed nodes with computation and storage capabilities to train a global model while keeping distributed-stored data locally. This process can promote the efficiency of modeling while preserving data privacy. Therefore, federated learning can be widely applied in distributed conjoint analysis scenarios, such as smart plant protection systems, in which networked IoT devices are used to monitor critical production data to improve crop production. However, the data collected by different...
To estimate the eye gaze accurately, a high-resolution and good quality image of the eye region is essential. Therefore, one or multiple narrow-angle cameras with long lenses are commonly used. In the case of poor resolution or absence of focusing, gaze tracking would be inaccurate. This paper looks into the possibility of using a low-resolution wide-angle camera to track the eye gaze. In order to combat the limitation of the low-resolution eye-region image, which might only be defined by a few pixels, a precise corneal reflection vector estimation based on a local reconstruction method is proposed. By...
Federated Learning is a distributed machine learning framework that aims to train a global shared model while keeping the data local, and previous research has empirically demonstrated the ideal performance of federated learning methods. However, recent research has identified the challenge of statistical heterogeneity caused by non-independent and identically distributed (non-IID) data, which leads to a significant decline in the performance of federated learning because of the model divergence caused by non-IID data. This statistical heterogeneity dramatically restricts the application of federated learning and has become one of the critical challenges in federated learning. In this...
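As background for the setting described above, the sketch below shows the standard federated averaging (FedAvg) aggregation step, in which a server combines locally trained parameters weighted by each client's sample count. This is the common baseline, not the method proposed in the truncated abstract; the function name and the flat-list parameter layout are assumptions made for illustration.

```python
def federated_average(client_params, client_sizes):
    """Aggregate client parameters into a global model (FedAvg step).

    client_params: one flat list of floats per client, all the same length.
    client_sizes:  number of local training samples held by each client.
    Only parameters are exchanged; raw data never leaves the clients.
    """
    if len(client_params) != len(client_sizes):
        raise ValueError("one sample count is required per client")
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(p[d] * n for p, n in zip(client_params, client_sizes)) / total
        for d in range(dim)
    ]

if __name__ == "__main__":
    # Two clients whose local models disagree, e.g. because their data
    # follow different distributions (the non-IID case).
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

When client data are non-IID, the local parameter vectors can drift far apart, so this weighted mean may sit in a region that fits no client well; that is the divergence problem the abstract refers to.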
Sound source localization is the problem of estimating the positions of one or several sound sources. In terms of binaural audio, localization is a paramount perceptual characteristic which can be assessed subjectively and objectively. For objective evaluation of localization, typical methods exploit monaural cues to predict the directions of sound sources. Since multiple sound sources are often perceived simultaneously in daily scenes, an objective model to detect temporally overlapping sound sources is required. In this paper, we propose a binaural multiple sound source localization network (BMSSLnet) model, which predicts framewise azimuths...
Domain adaptation in person re-identification (re-ID) has always been a challenging task. In this work, we explore how to harness the natural similar characteristics existing in the samples from the target domain for learning to conduct person re-ID in an unsupervised manner. Concretely, we propose a Self-similarity Grouping (SSG) approach, which exploits the potential similarity (from the global body to local parts) of unlabeled samples to automatically build multiple clusters from different views. These independent clusters are then assigned with...
Novel view synthesis from a sparse set of input images is a challenging problem of great practical interest, especially when camera poses are absent or inaccurate. Direct optimization of camera poses and usage of estimated depths in neural radiance field algorithms usually do not produce good results because of the coupling between poses and depths, and inaccuracies in monocular depth estimation. In this paper, we leverage the recent 3D Gaussian splatting method to develop a novel construct-and-optimize method for sparse view synthesis without camera poses. Specifically,...