- Advanced Image and Video Retrieval Techniques
- Image Retrieval and Classification Techniques
- Visual Attention and Saliency Detection
- Human Pose and Action Recognition
- Multimodal Machine Learning Applications
- Domain Adaptation and Few-Shot Learning
- Advanced Neural Network Applications
- Video Surveillance and Tracking Methods
- Advanced Vision and Imaging
- Functional Brain Connectivity Studies
- Image Processing and 3D Reconstruction
- EEG and Brain-Computer Interfaces
- Face and Expression Recognition
- 3D Shape Modeling and Analysis
- Robotics and Sensor-Based Localization
- Image and Video Quality Assessment
- Remote Sensing and Land Use
- Advanced Neuroimaging Techniques and Applications
- Robotic Path Planning Algorithms
- Advanced Algorithms and Applications
- Anomaly Detection Techniques and Applications
- Text and Document Classification Technologies
- Remote-Sensing Image Classification
- Topic Modeling
- Inertial Sensor and Navigation
Beihang University
2015-2025
Shanghai Customs College
2025
The University of Sydney
2022-2025
University of Electronic Science and Technology of China
2005-2024
Australian National University
2020-2024
Yan'an University
2024
Naval University of Engineering
2009-2024
Changchun University of Technology
2024
Beijing Institute of Fashion Technology
2023-2024
Beijing Tian Tan Hospital
2021-2024
This paper presents a novel unsupervised domain adaptation method for cross-domain visual recognition. We propose a unified framework that reduces the shift between domains both statistically and geometrically, referred to as Joint Geometrical and Statistical Alignment (JGSA). Specifically, we learn two coupled projections that project the source and target domain data into low-dimensional subspaces where the geometrical shift and distribution shift are reduced simultaneously. The objective function can be solved efficiently in a closed...
This paper proposes an importance weighted adversarial nets-based method for unsupervised domain adaptation, specifically for partial domain adaptation, where the target domain has fewer classes than the source domain. Previous domain adaptation methods generally assume identical label spaces, such that reducing the distribution divergence leads to feasible knowledge transfer. However, this assumption is no longer valid in a more realistic scenario that requires adaptation from a larger and more diverse source domain to a smaller target domain with fewer classes. This paper extends adversarial nets-based domain adaptation with a novel method to identify...
In this paper, we propose the first framework (UCNet) to employ uncertainty for RGB-D saliency detection by learning from the data labeling process. Existing RGB-D saliency detection methods treat the saliency detection task as a point estimation problem, and produce a single saliency map following a deterministic learning pipeline. Inspired by the saliency data labeling process, we propose a probabilistic RGB-D saliency detection network via conditional variational autoencoders to model human annotation uncertainty and generate multiple saliency maps for each input image by sampling in the latent space. With the proposed saliency consensus process, we are able to generate an accurate saliency map based on these...
This paper proposes a new method, i.e., weighted hierarchical depth motion maps (WHDMM) + three-channel deep convolutional neural networks (3ConvNets), for human action recognition from depth maps on small training datasets. Three strategies are developed to leverage the capability of ConvNets in mining discriminative features for recognition. First, different viewpoints are mimicked by rotating the 3-D points of the captured depth maps. This not only synthesizes more data, but also makes the trained ConvNets view-tolerant. Second, WHDMMs at...
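The viewpoint-mimicking step mentioned in the abstract above (rotating the 3-D points recovered from depth maps) can be illustrated with a minimal sketch. The rotation axis, the choice of the centroid as the pivot, the angle set, and the function names `rotate_about_vertical_axis` and `synthesize_viewpoints` are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np


def rotate_about_vertical_axis(points, angle_deg):
    """Rotate an (N, 3) point cloud about the y-axis through its centroid.

    Assumption: y is the vertical axis and the centroid is the pivot.
    """
    theta = np.deg2rad(angle_deg)
    c, s = np.cos(theta), np.sin(theta)
    rotation = np.array([[c, 0.0, s],
                         [0.0, 1.0, 0.0],
                         [-s, 0.0, c]])
    centroid = points.mean(axis=0)
    # Centre the cloud, rotate every point, then move it back.
    return (points - centroid) @ rotation.T + centroid


def synthesize_viewpoints(points, angles_deg):
    """Return one rotated copy of the cloud per requested angle."""
    return [rotate_about_vertical_axis(points, a) for a in angles_deg]


# Hypothetical data: a random cloud standing in for points from a depth map.
rng = np.random.default_rng(0)
cloud = rng.uniform(-1.0, 1.0, size=(100, 3))
views = synthesize_viewpoints(cloud, angles_deg=(-30, 0, 30))
print(len(views), views[0].shape)
```

Each rotated cloud would then have to be re-projected to a depth map before any motion-map computation; that step is omitted here.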
Learning-based optical flow estimation has been dominated by the pipeline of cost volume with convolutions for flow regression, which is inherently limited to local correlations and thus is hard to address the long-standing challenge of large displacements. To alleviate this, the state-of-the-art framework RAFT gradually improves its prediction quality by using a large number of iterative refinements, achieving remarkable performance but introducing linearly increasing inference time. To enable both high accuracy and efficiency, we...
The success of current deep saliency detection methods heavily depends on the availability of large-scale supervision in the form of per-pixel labeling. Such supervision, while labor-intensive and not always possible, tends to hinder the generalization ability of the learned models. By contrast, traditional handcrafted features based unsupervised saliency detection methods, even though they have been surpassed by the deep supervised methods, are generally dataset-independent and could be applied in the wild. This raises a natural question that "Is it possible...
Although demonstrating great success, previous multi-view unsupervised feature selection (MV-UFS) methods often construct a view-specific similarity graph and characterize the local structure of data within each single view. In such a way, the cross-view information could be ignored. In addition, they usually assume that different views are projected from a latent feature space while the diversity of different views cannot be fully captured. In this work, we present a MV-UFS model via cross-view local structure preserved diversity and consensus learning, referred to as CvLP-DCL...
Functional connectomes (FCs) have been recently shown to be powerful in characterizing brain conditions. However, many previous studies assumed temporal stationarity of FCs, while their temporal dynamics are rarely explored. Here, based on the structural connectomes constructed from diffusion tensor imaging data, FCs are derived from resting‐state fMRI (R‐fMRI) data and are then temporally divided into quasi‐stable segments via a sliding time window approach. After integrating and pooling over a large number of those FC segments from 44...
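The sliding time window idea mentioned in the abstract above can be sketched as follows. This only shows the generic windowing and per-window correlation step; the window length, step size, use of Pearson correlation, and the function name `sliding_window_fc` are assumptions for illustration, and the paper's criterion for quasi-stable segments is not reproduced.

```python
import numpy as np


def sliding_window_fc(time_series, window, step):
    """Compute one correlation matrix per sliding window.

    time_series: (T, R) array, T time points for R brain regions.
    Returns an array of shape (n_windows, R, R).
    """
    n_timepoints = time_series.shape[0]
    starts = range(0, n_timepoints - window + 1, step)
    # np.corrcoef expects variables in rows, so each window is transposed.
    return np.stack([np.corrcoef(time_series[s:s + window].T) for s in starts])


# Hypothetical data: 120 time points for 5 regions of random noise.
rng = np.random.default_rng(0)
signals = rng.standard_normal((120, 5))
fcs = sliding_window_fc(signals, window=30, step=10)
print(fcs.shape)  # (10, 5, 5)
```

Shorter windows track faster changes but give noisier correlation estimates, so the window length is a trade-off that has to be tuned for the data at hand.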
Hand gesture recognition is very significant for human-computer interaction. In this work, we present a novel real-time method for hand gesture recognition. In our framework, the hand region is extracted from the background with the background subtraction method. Then, the palm and fingers are segmented so as to detect and recognize the fingers. Finally, a rule classifier is applied to predict the labels of hand gestures. The experiments on the data set of 1300 images show that our method performs well and is highly efficient. Moreover, our method shows better performance than a state-of-the-art method on another data set.
In this paper, we propose to adopt ConvNets to recognize human actions from depth maps on relatively small datasets based on Depth Motion Maps (DMMs). In particular, three strategies are developed to effectively leverage the capability of ConvNets in mining discriminative features for recognition. Firstly, different viewpoints are mimicked by rotating virtual cameras around the subject represented by the 3D points of the captured depth maps. This not only synthesizes more data from the captured ones, but also makes the trained ConvNets view-tolerant. Secondly, DMMs...
Existing RGB-D saliency detection models do not explicitly encourage RGB and depth to achieve effective multi-modal learning. In this paper, we introduce a novel multi-stage cascaded learning framework via mutual information minimization to explicitly model the multi-modal information between RGB image and depth data. Specifically, we first map the feature of each mode to a lower dimensional feature vector, and adopt mutual information minimization as a regularizer to reduce the redundancy between appearance features from RGB and geometric features from depth. We then perform multi-stage cascaded learning to impose the mutual information minimization constraint at every stage of the network....
Significant performance improvement has been achieved for fully-supervised video salient object detection with the pixel-wise labeled training datasets, which are time-consuming and expensive to obtain. To relieve the burden of data annotation, we present the first weakly supervised video salient object detection model based on relabeled "fixation guided scribble annotations". Specifically, an "Appearance-motion fusion module" and a bidirectional ConvLSTM based framework are proposed to achieve effective multi-modal learning and long-term temporal...
Domain adaptation aims to leverage a label-rich domain (the source domain) to help model learning in a label-scarce domain (the target domain). Most domain adaptation methods require the co-existence of source and target domain samples to reduce the distribution mismatch. However, access to the source domain samples may not always be feasible in real-world applications due to different problems (e.g., storage, transmission, and privacy issues). In this work, we deal with the data-free...
Observing that the 3D captioning task and the 3D grounding task contain both shared and complementary information in nature, in this work, we propose a unified framework to jointly solve these two distinct but closely related tasks in a synergistic fashion, which consists of both shared task-agnostic modules and lightweight task-specific modules. On one hand, the shared task-agnostic modules aim to learn precise locations of objects, fine-grained attribute features to characterize different objects, and complex relations between objects, which benefit both captioning and visual grounding. On the other hand, by casting each as the proxy...
Pancreatic cancer is a highly aggressive malignant tumor that is becoming increasingly common in recent years. Despite advances in intensive treatment modalities including surgery, radiotherapy, biological therapy, and targeted therapy, the overall survival rate has not significantly improved in patients with pancreatic cancer. This may be attributed to the insidious onset, unknown pathophysiology, and poor prognosis of the disease. It is therefore essential to identify and develop more effective and safer treatments for pancreatic cancer. Tumor...
Recently, 3D printing as an effective technology has been highlighted in the biomedical field. Previously, a porous hydroxyapatite (HA) scaffold with biocompatibility and osteoconductivity has been developed by this method. However, its osteoinductivity is limited. The main purpose of this study was to improve it by the introduction of recombinant human bone morphogenetic protein-2 (rhBMP-2). This scaffold was developed by coating rhBMP-2-delivery microspheres with collagen. These synthesized scaffolds were characterized by Scanning Electron...
Apple orchards in modern fruiting wall architectures (e.g., Vertical and V-trellis) help to attain high fruit yield and quality. These systems are also key to developing simpler tree canopies, which improves the productivity of manual operations while creating opportunities for automated field operations such as robotic harvesting and/or pruning. Training trees in these systems is carried out manually, which is becoming challenging due to the increasing labor cost and uncertainty in labor availability. With reduced speed robustness sensing...
Along with the arrival of the multimedia era, multimedia data has replaced textual data to transfer information in various fields. As an important form of multimedia data, images have been widely utilized by many applications, such as face recognition and image classification. Therefore, how to accurately annotate each image from a large set of images is of vital importance but challenging. To perform these tasks well, it is crucial to extract suitable features to characterize the visual contents and learn an appropriate distance metric to measure similarities between...
Emotion plays a significant role in perceiving external events or situations in daily life. Due to ease of use and relative accuracy, Electroencephalography (EEG)-based emotion recognition has become a hot topic in the affective computing field. However, scalp EEG is a mixed signal and cannot directly indicate the exact information about the active cortex sources of different emotions. In this paper, we analyze the differences of active source regions and frequency bands for pairs of emotions based on sources reconstructed using sLORETA, 26...