Dong Xu

ORCID: 0000-0003-2775-9730
Research Areas
  • Advanced Image and Video Retrieval Techniques
  • Human Pose and Action Recognition
  • Face and Expression Recognition
  • Domain Adaptation and Few-Shot Learning
  • Image Retrieval and Classification Techniques
  • Multimodal Machine Learning Applications
  • Advanced Vision and Imaging
  • Advanced Image Processing Techniques
  • Video Analysis and Summarization
  • Advanced Neural Network Applications
  • Video Surveillance and Tracking Methods
  • 3D Shape Modeling and Analysis
  • Anomaly Detection Techniques and Applications
  • Visual Attention and Saliency Detection
  • Remote-Sensing Image Classification
  • Gait Recognition and Analysis
  • Face Recognition and Analysis
  • Video Coding and Compression Technologies
  • Sparse and Compressive Sensing Techniques
  • Advanced Data Compression Techniques
  • Text and Document Classification Technologies
  • Computer Graphics and Visualization Techniques
  • 3D Surveying and Cultural Heritage
  • Complex Network Analysis Techniques
  • Machine Learning and ELM

Beihang University
2007-2025

Shandong Institute of Business and Technology
2025

Renmin University of China
2025

The University of Sydney
2015-2024

University of Hong Kong
2022-2024

Chinese University of Hong Kong
2006-2024

University of Newcastle Australia
2024

Tencent (China)
2024

Qingdao University
2017-2022

Shanghai University of Engineering Science
2021

A large family of algorithms, supervised or unsupervised, stemming from statistics or geometry theory, has been designed to provide different solutions to the problem of dimensionality reduction. Despite the different motivations of these algorithms, we present in this paper a general formulation known as graph embedding to unify them within a common framework. In graph embedding, each algorithm can be considered as the direct graph embedding or its linear/kernel/tensor extension of a specific intrinsic graph that describes certain desired statistical or geometric...

10.1109/tpami.2007.250598 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2006-12-01

Subspace clustering is an effective method that has been successfully applied to many applications. Here, we propose a novel subspace clustering model for multi-view data using a latent representation, termed Latent Multi-View Subspace Clustering (LMSC). Unlike most existing single-view subspace clustering methods, which directly reconstruct data points using original features, our method explores underlying complementary information from multiple views and simultaneously seeks the underlying latent representation. Using the complementarity of multiple views, the latent representation depicts data more...

10.1109/tpami.2018.2877660 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2018-10-24

Object detection, including objectness detection (OD), salient object detection (SOD), and category-specific object detection (COD), is one of the most fundamental yet challenging problems in the computer vision community. Over the last several decades, great efforts have been made by researchers to tackle this problem, due to its broad range of applications for other computer vision tasks such as activity or event recognition, content-based image retrieval and scene understanding, etc. While numerous methods have been presented in recent years, a comprehensive...

10.1109/msp.2017.2749125 article EN IEEE Signal Processing Magazine 2018-01-01

Conventional video compression approaches use the predictive coding architecture and encode the corresponding motion information and residual information. In this paper, taking advantage of both the classical architecture in the conventional video compression method and the powerful non-linear representation ability of neural networks, we propose the first end-to-end video compression deep model that jointly optimizes all the components for video compression. Specifically, learning based optical flow estimation is utilized to obtain the motion information and reconstruct the current frames. Then we employ two...

10.1109/cvpr.2019.01126 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

In this paper, we propose a new unsupervised domain adaptation approach called Collaborative and Adversarial Network (CAN) through domain-collaborative and domain-adversarial training of neural networks. We add several domain classifiers on multiple CNN feature extraction blocks, in which each domain classifier is connected to the hidden representations from one block and one loss function is defined based...

10.1109/cvpr.2018.00400 article EN 2018-06-01

Skeleton-based human action recognition has attracted a lot of research attention during the past few years. Recent works attempted to utilize recurrent neural networks to model the temporal dependencies between the 3D positional configurations of human body joints for better analysis of human activities in the skeletal data. The proposed work extends this idea to the spatial domain as well as the temporal domain, to better analyze the hidden sources of action-related information within the skeleton sequences in both of these domains simultaneously. Based on the pictorial structure...

10.1109/tpami.2017.2771306 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2017-11-08

In this paper, we study the heterogeneous domain adaptation (HDA) problem, in which the data from the source domain and the target domain are represented by heterogeneous features with different dimensions. By introducing two different projection matrices, we first transform the data from the two domains into a common subspace such that the similarity between samples across different domains can be measured. We then propose a new feature mapping function for each domain, which augments the transformed samples with their original features and zeros. Existing supervised learning methods (e.g., SVM and SVR) can be readily employed...

10.1109/tpami.2013.167 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2013-08-29

We present a new framework for multimedia content analysis and retrieval which consists of two independent algorithms. First, we propose a new semi-supervised algorithm called ranking with Local Regression and Global Alignment (LRGA) to learn a robust Laplacian matrix for data ranking. In LRGA, for each data point, a local linear regression model is used to predict the ranking scores of its neighboring points. A unified objective function is then proposed to globally align the local models from all the data points so that an optimal ranking score can be assigned...

10.1109/tpami.2011.170 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2011-08-25

The performance of object detection has recently been significantly improved due to the powerful features learnt through convolutional neural networks (CNNs). Despite the remarkable success, there are still several major challenges in object detection, including object rotation, within-class diversity, and between-class similarity, which generally degenerate object detection performance. To address these issues, we build on the existing state-of-the-art object detection systems and propose a simple but effective method to train rotation-invariant and Fisher...

10.1109/tip.2018.2867198 article EN IEEE Transactions on Image Processing 2018-08-24

In this paper, we propose a new image clustering algorithm, referred to as clustering using local discriminant models and global integration (LDMGI). To deal with the data points sampled from a nonlinear manifold, for each data point, we construct a local clique comprising this data point and its neighboring data points. Inspired by the Fisher criterion, we use a local discriminant model for each local clique to evaluate the clustering performance of samples within the local clique. To obtain the clustering result, we further propose a unified objective function to globally integrate the local models of all the local cliques. With the unified objective function, spectral relaxation and spectral rotation are used...

10.1109/tip.2010.2049235 article EN IEEE Transactions on Image Processing 2010-04-30

A large family of algorithms for dimensionality reduction end with solving a Trace Ratio problem in the form of $\arg\max_W \mathrm{Tr}(W^T S_P W)/\mathrm{Tr}(W^T S_I W)$, which is generally transformed into...

10.1109/cvpr.2007.382983 article EN 2007 IEEE Conference on Computer Vision and Pattern Recognition 2007-06-01

There is a growing interest in subspace learning techniques for face recognition; however, the excessive dimension of the data space often brings the algorithms into the curse of dimensionality dilemma. In this paper, we present a novel approach to solve the supervised dimensionality reduction problem by encoding an image object as a general tensor of second or even higher order. First, we propose a discriminant tensor criterion, whereby multiple...

10.1109/tip.2006.884929 article EN IEEE Transactions on Image Processing 2006-12-19

We propose a multiple source domain adaptation method, referred to as Domain Adaptation Machine (DAM), to learn a robust decision function (referred to as target classifier) for label prediction of patterns from the target domain by leveraging a set of pre-computed classifiers (referred to as auxiliary/source classifiers) independently learned with the labeled patterns from multiple source domains. We introduce a new data-dependent regularizer based on the smoothness assumption into Least-Squares SVM (LS-SVM), which enforces that the target classifier shares similar decision values with the auxiliary...

10.1145/1553374.1553411 article EN 2009-06-14

Spectral clustering (SC) methods have been successfully applied to many real-world applications. The success of these SC methods is largely based on the manifold assumption, namely, that two nearby data points in the high-density region of a low-dimensional data manifold have the same cluster label. However, such an assumption might not always hold on high-dimensional data. When the data do not exhibit a clear low-dimensional manifold structure (e.g., high-dimensional and sparse data), the clustering performance of SC will be degraded and become even worse than K-means clustering. In this paper, motivated by...

10.1109/tnn.2011.2162000 article EN IEEE Transactions on Neural Networks 2011-10-05

Cross-domain learning methods have shown promising results by leveraging labeled patterns from auxiliary domains to learn a robust classifier for the target domain, which has a limited number of labeled samples. To cope with the tremendous change of feature distribution between different domains in video concept detection, we propose a new cross-domain kernel learning method. Our method, referred to as Domain Transfer SVM (DTSVM), simultaneously learns a kernel function and a robust SVM classifier by minimizing both the structural risk functional of SVM and the distribution mismatch of labeled and unlabeled...

10.1109/cvpr.2009.5206747 article EN 2009 IEEE Conference on Computer Vision and Pattern Recognition 2009-06-01

Although current salient object detection (SOD) works have achieved significant progress, they are limited when it comes to the integrity of the predicted salient regions. We define the concept of integrity at both a micro and macro level. Specifically, at the micro level, the model should highlight all parts that belong to a certain salient object. Meanwhile, at the macro level, the model needs to discover all salient objects in a given image. To facilitate integrity learning for SOD, we design a novel Integrity Cognition Network (ICON), which explores three important components for learning strong integrity features. 1) Unlike...

10.1109/tpami.2022.3179526 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2022-01-01

Learning based video compression has attracted increasing attention in the past few years. The previous hybrid coding approaches rely on pixel space operations to reduce spatial and temporal redundancy, which may suffer from inaccurate motion estimation or less effective motion compensation. In this work, we propose a feature-space video coding network (FVC) by performing all major operations (i.e., motion estimation, motion compression, motion compensation and residual compression) in the feature space. Specifically, in the proposed deformable compensation module, we first apply...

10.1109/cvpr46437.2021.00155 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Traditional video compression approaches build upon the hybrid coding framework with motion-compensated prediction and residual transform coding. In this paper, we propose the first end-to-end deep video compression framework to take advantage of both the classical compression architecture and the powerful non-linear representation ability of neural networks. Our framework employs pixel-wise motion information, which is learned from an optical flow network and further compressed by an auto-encoder network to save bits. The other compression components are also implemented by the well-designed...

10.1109/tpami.2020.2988453 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2020-04-20

In this paper, we propose a two-stage deep learning framework called VoxelContext-Net for both static and dynamic point cloud compression. Taking advantage of both octree based methods and voxel based schemes, our approach employs the voxel context to compress the octree structured data. Specifically, we first extract the local voxel representation that encodes the spatial neighbouring context information for each node in the constructed octree. Then, in the entropy coding stage, we propose a voxel context based deep entropy model to compress the symbols of non-leaf nodes in a lossless way. Furthermore, for dynamic point cloud compression, we additionally...

10.1109/cvpr46437.2021.00598 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

In the last decades, a large family of algorithms, supervised or unsupervised, stemming from statistics or geometry theory, has been proposed to provide different solutions to the problem of dimensionality reduction. In this paper, beyond the different motivations of these algorithms, we propose a general framework, graph embedding along with its linearization and kernelization, which in theory reveals the underlying objective shared by most previous algorithms. It presents a unified perspective to understand these algorithms; that is, each...

10.1109/cvpr.2005.170 article EN 2005-07-27

Dimensionality reduction algorithms, which aim to select a small set of efficient and discriminant features, have attracted great attention for human gait recognition and content-based image retrieval (CBIR). In this paper, we present extensions of our recently proposed marginal Fisher analysis (MFA) to address these problems. For human gait recognition, we first present a direct application of MFA, then inspired by recent advances in matrix and tensor-based dimensionality reduction algorithms, we present matrix-based MFA for directly handling 2-D input in the form...

10.1109/tip.2007.906769 article EN IEEE Transactions on Image Processing 2007-10-15

We first propose a new spatio-temporal context distribution feature of interest points for human action recognition. Each action video is expressed as a set of relative XYT coordinates between pairwise interest points in a local region. We learn a global GMM (referred to as Universal Background Model, UBM) using the relative coordinate features from all the training videos, and then represent each video as the normalized parameters of a video-specific GMM adapted from the global GMM. In order to capture the spatio-temporal relationships at different levels, multiple GMMs are utilized to describe...

10.1109/cvpr.2011.5995624 article EN 2011-06-01