- Face and Expression Recognition
- Advanced Image and Video Retrieval Techniques
- Face recognition and analysis
- Video Surveillance and Tracking Methods
- Text and Document Classification Technologies
- Advanced Computing and Algorithms
- Biometric Identification and Security
- Domain Adaptation and Few-Shot Learning
- Advanced Graph Neural Networks
- Anomaly Detection Techniques and Applications
- Advanced Neural Network Applications
- Tensor decomposition and applications
- Software Engineering Research
- Software Reliability and Analysis Research
- Complex Network Analysis Techniques
- Advanced Clustering Algorithms Research
- Generative Adversarial Networks and Image Synthesis
- Natural Language Processing Techniques
- Software Engineering Techniques and Practices
- Human Pose and Action Recognition
- Remote-Sensing Image Classification
- Machine Learning and Data Classification
- Sparse and Compressive Sensing Techniques
- Topic Modeling
- Adversarial Robustness in Machine Learning
Xidian University
2019-2024
Harbin University of Science and Technology
2024
University of Jinan
2023
Anhui Jianzhu University
2022
Seattle University
2022
Amazon (United States)
2020-2022
Institute of Computing Technology
2021-2022
Hunan University of Technology
2022
Amazon (Germany)
2020-2021
Hefei University of Technology
2020-2021
We propose a new method to detect deepfake images using the cue of source feature inconsistency within the forged images. It is based on the hypothesis that images' distinct source features can be preserved and extracted after going through state-of-the-art deepfake generation processes. We introduce a novel representation learning approach, called pair-wise self-consistency learning (PCL), for training ConvNets to extract these source features and detect deepfake images. It is accompanied by a new image synthesis approach, called inconsistency image generator (I2G), to provide richly annotated training data for PCL. Experimental results...
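The idea of checking whether local features within one image agree with each other can be sketched as follows. This is only an illustration, not the PCL implementation: the patch features are hand-written vectors rather than ConvNet outputs, and the use of cosine similarity and the helper names `consistency_matrix` and `inconsistency_score` are assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def consistency_matrix(patch_features):
    """Pairwise similarity between every pair of patch features of one image."""
    n = len(patch_features)
    return [[cosine(patch_features[i], patch_features[j]) for j in range(n)]
            for i in range(n)]


def inconsistency_score(patch_features):
    """Hypothetical score: 1 minus the lowest similarity between two different patches.

    A high score means at least two patches disagree strongly, which is the
    kind of cue the abstract describes. Returns 0.0 for fewer than two patches.
    """
    n = len(patch_features)
    if n < 2:
        return 0.0
    matrix = consistency_matrix(patch_features)
    lowest = min(matrix[i][j] for i in range(n) for j in range(n) if i != j)
    return 1.0 - lowest


# Toy data (assumed): the third patch of `forged` points in a different direction.
pristine = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.05]]
forged = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
print(inconsistency_score(pristine) < inconsistency_score(forged))
```

In a real detector the features would come from a trained network and the decision rule would be learned rather than fixed.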
Low-rank representation based on tensor-Singular Value Decomposition (t-SVD) has achieved impressive results for multi-view subspace clustering, but it does not deal well with the noise and illumination changes embedded in multi-view data. The major reason is that all the singular values have the same contribution in the tensor-nuclear norm based on t-SVD, which does not make sense in the presence of noise and illumination change. To improve the robustness and clustering performance, we study the weighted tensor-nuclear norm based on t-SVD and develop an efficient algorithm to optimize the weighted tensor-nuclear norm minimization (WTNNM)...
Despite the impressive clustering performance and efficiency in characterizing both the relationship between the data and the cluster structure, most existing graph-based multi-view clustering methods still have the following drawbacks. They suffer from an expensive time burden due to both the construction of graphs and the eigen-decomposition of the Laplacian matrix. Moreover, none of them simultaneously considers the inter-view similarity and the intra-view similarity. In this article, we propose a variance-based de-correlation anchor selection strategy for bipartite...
We propose an online tracking algorithm that performs object detection and data association under a common framework, capable of linking objects after a long time span. This is realized by preserving a large spatio-temporal memory to store the identity embeddings of the tracked objects, and by adaptively referencing and aggregating useful information from the memory as needed. Our model, called MeMOT, consists of three main modules that are all Transformer-based: 1) Hypothesis Generation that produces object proposals in the current video frame; 2)...
Despite the promising results, tensor robust principal component analysis (TRPCA), which aims to recover the underlying low-rank structure of clean tensor data corrupted with noise/outliers by shrinking all singular values equally, cannot well preserve the salient content of an image. The major reason is that, in real applications, there is a difference in information between the singular values of an image, and the larger singular values are generally associated with some salient parts of the image. Thus, the singular values should be treated differently. Inspired by this observation, we investigate whether...
Despite much research progress in image semantic segmentation, it remains challenging under adverse environmental conditions caused by the imaging limitations of the visible spectrum, while thermal infrared cameras have several advantages over visible-spectrum cameras, such as operating in total darkness, being insensitive to illumination variations, being robust to shadow effects, and having a strong ability to penetrate haze and smog. These advantages make the segmentation of objects possible in both day and night. In this article, we propose a novel network architecture, called...
Despite the promising preliminary results, existing graph convolutional network (GCN) based multi-view learning methods directly use the graph structure as the view descriptor, which may inhibit the ability of multi-view learning for multimedia data. The major reason is that, in real multimedia applications, the graph structure may contain outliers. Moreover, they fail to take advantage of the information embedded in the inaccurate clustering labels obtained from their proposed methods, resulting in inferior clustering results. These observations motivate us to study whether there is a better...
Despite the promising preliminary results, tensor-singular value decomposition (t-SVD)-based multiview subspace clustering is incapable of dealing with real problems, such as noise and illumination changes. The major reason is that the tensor-nuclear norm minimization (TNNM) used in t-SVD regularizes each singular value equally, which does not make sense in matrix completion and coefficient matrix learning. In this case, the singular values represent different perspectives and should be treated differently. To well exploit the significant difference...
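The contrast between shrinking every singular value equally and treating them differently can be illustrated on a plain list of singular values. This is a minimal sketch and not the algorithm from the paper: the inverse-magnitude weighting rule, the threshold `tau`, and all function names are assumptions chosen for the example. In a t-SVD setting, such shrinkage would typically be applied to the singular values of each frontal slice after a Fourier transform along the third mode, which is omitted here.

```python
def soft_threshold(singular_values, tau):
    """Equal shrinkage: every singular value is reduced by the same amount tau."""
    return [max(s - tau, 0.0) for s in singular_values]


def weighted_soft_threshold(singular_values, tau, weights):
    """Weighted shrinkage: singular value i is reduced by tau * weights[i]."""
    if len(weights) != len(singular_values):
        raise ValueError("weights and singular_values must have the same length")
    return [max(s - tau * w, 0.0) for s, w in zip(singular_values, weights)]


def inverse_magnitude_weights(singular_values, eps=1e-6):
    """Assumed weighting rule: larger singular values receive smaller weights."""
    return [1.0 / (s + eps) for s in singular_values]


sigma = [10.0, 2.0, 0.5]
equal = soft_threshold(sigma, tau=1.0)
weighted = weighted_soft_threshold(sigma, 1.0, inverse_magnitude_weights(sigma))
print(equal)
print(weighted)
```

With these toy values the largest singular value is barely changed under the weighted rule, while the smallest one is still suppressed.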
Incomplete multiview clustering is a challenging problem in the domain of unsupervised learning. However, the existing incomplete multiview clustering methods only consider the similarity structure of intraview while neglecting the similarity structure of interview. Thus, they cannot take advantage of both the complementary information and the spatial structure embedded in the similarity matrices of different views. To this end, we complete the incomplete graph with missing data by referring to tensor completion and present a novel and effective model to handle the incomplete multiview clustering task. To be specific, we consider the similarity of the interview graphs via the Schatten p-norm-based...
The existing deep multiview clustering (MVC) methods are mainly based on autoencoder networks, which seek common latent variables to reconstruct the original input of each view individually. However, due to the view-specific reconstruction loss, it is challenging to extract consistent latent representations over multiple views for clustering. To address this challenge, we propose adversarial MVC (AMvC) networks in this article. The proposed AMvC generates each view's samples conditioning on the fused latent representations among different views to encourage a...
Multi-view spectral clustering has become appealing due to its good performance in capturing the correlations among all views. However, on the one hand, many existing methods usually require quadratic or cubic complexity for graph construction or eigenvalue decomposition of the Laplacian matrix; on the other hand, they are inefficient and impose an unbearable burden when applied to large-scale data sets, which can be easily obtained in the era of big data. Moreover, the existing methods cannot encode the complementary information between adjacency matrices, i.e....
Multi-modal clustering (MMC) aims to explore complementary information from diverse modalities to improve clustering performance. This article studies challenging problems in MMC methods based on deep neural networks. On the one hand, most existing methods lack a unified objective to simultaneously learn the inter- and intra-modality consistency, resulting in a limited representation learning capacity. On the other hand, most existing processes are modeled for a finite sample set and cannot handle out-of-sample data. To handle the above two challenges, we propose...
We propose a way to learn visual features that are compatible with previously computed ones even when they have different dimensions and are learned via different neural network architectures and loss functions. Compatible means that, if such features are used to compare images, then "new" features can be compared directly to "old" features, so they can be used interchangeably. This enables visual search systems to bypass computing new features for all previously seen images when updating the embedding models, a process known as backfilling. Backward compatibility is critical to quickly...
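What skipping backfilling means in practice can be sketched with a toy search index. The sketch assumes that a backward-compatible new model already exists and, for simplicity, that old and new embeddings share one dimension and are compared with cosine similarity; the gallery contents and the `search` helper are hypothetical and not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def search(query_embedding, gallery, top_k=1):
    """Rank gallery items by similarity to the query embedding.

    `gallery` maps an image id to an embedding computed by the OLD model.
    The query embedding comes from the NEW model. No gallery embedding is
    recomputed, which is the point of backward compatibility.
    """
    ranked = sorted(gallery.items(),
                    key=lambda item: cosine(query_embedding, item[1]),
                    reverse=True)
    return [image_id for image_id, _ in ranked[:top_k]]


# Old-model embeddings that were indexed before the model update (toy values).
old_gallery = {"cat_001": [1.0, 0.0], "dog_001": [0.0, 1.0]}
# New-model embedding of a query image (toy values).
new_query = [0.9, 0.2]
print(search(new_query, old_gallery))
```

If the new model were not compatible, the similarity scores between `new_query` and the old gallery entries would be meaningless and every gallery image would need to be re-embedded.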
We present Long Short-term TRansformer (LSTR), a temporal modeling algorithm for online action detection, which employs a long- and short-term memory mechanism to model prolonged sequence data. It consists of an LSTR encoder that dynamically leverages coarse-scale historical information from an extended temporal window (e.g., 2048 frames spanning up to 8 minutes), together with an LSTR decoder that focuses on a short time window (e.g., 32 frames spanning 8 seconds) to model the fine-scale characteristics of the data. Compared to prior work, LSTR provides an effective and efficient method...
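The split between an extended window and a short recent window can be sketched as two first-in-first-out buffers. This only illustrates the bookkeeping of the two windows; the Transformer encoder and decoder are not modeled, and the class name `LongShortMemory` and its methods are assumptions made for the example. The default sizes follow the example values in the abstract.

```python
from collections import deque


class LongShortMemory:
    """Two FIFO buffers: recent frames stay in `short`, older ones move to `long`."""

    def __init__(self, long_size=2048, short_size=32):
        self.long = deque(maxlen=long_size)
        self.short = deque(maxlen=short_size)

    def push(self, frame_feature):
        """Add the newest frame feature and age the oldest short-term one."""
        if len(self.short) == self.short.maxlen:
            # The oldest short-term entry is about to be evicted, so copy it
            # into long-term memory first. The long deque drops its own
            # oldest entry automatically once it is full.
            self.long.append(self.short[0])
        self.short.append(frame_feature)

    def windows(self):
        """Return (long-term window, short-term window) as lists, oldest first."""
        return list(self.long), list(self.short)


# Small sizes so the behaviour is easy to follow.
memory = LongShortMemory(long_size=4, short_size=2)
for frame_index in range(8):
    memory.push(frame_index)
print(memory.windows())
```

In this sketch the two windows never overlap: a frame leaves the short window at the same moment it enters the long one.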
Palmprint recognition and palm vein recognition are two emerging biometrics technologies. In the past decades, many traditional methods have been proposed for palmprint recognition and palm vein recognition, and have achieved impressive results. However, research on deep learning-based palmprint recognition and palm vein recognition is still very preliminary. In this paper, in order to investigate the problem of deep learning-based 2D and 3D palmprint recognition and palm vein recognition in depth, we conduct a performance evaluation of seventeen representative and classic convolutional neural networks (CNNs) on one database, five databases...
Graph-based multimedia data clustering has attracted much attention due to its impressive clustering performance for arbitrarily shaped multimedia data. However, existing graph-based clustering methods need post-processing to get labels for multimedia data, with high computational complexity. Moreover, they are sub-optimal for label learning due to the fact that they exploit the complementary information embedded in data of different types pixel by pixel. To handle these problems, we present a novel model with good interpretability for clustering. To be specific, our model decomposes the anchor...
Online tracking of multiple objects in videos requires a strong capacity for modeling and matching object appearances. Previous methods for learning appearance embedding mostly rely on instance-level matching without considering the temporal continuity provided by videos. We design a new instance-to-track matching objective to learn appearance embedding that compares a candidate detection to the embedding of the tracks persisted in the tracker. It enables us to learn not only from videos labeled with complete tracks, but also from unlabeled or partially labeled videos. We implement this learning objective in a unified form...
We propose a hierarchical graph neural network (GNN) model that learns how to cluster a set of images into an unknown number of identities using a training set of images annotated with labels belonging to a disjoint set of identities. Our hierarchical GNN uses a novel approach to merge connected components predicted at each level of the hierarchy to form a new graph at the next level. Unlike fully unsupervised hierarchical clustering, the choice of grouping and complexity criteria stems naturally from supervision in the training set. The resulting method, Hi-LANDER, achieves an average of 49%...
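One level of merging connected components into the nodes of the next level can be sketched with a union-find structure. This is an illustration under stated assumptions, not the Hi-LANDER implementation: the edges are given by hand instead of being predicted by a GNN, and representing each merged group by the mean of its members' features is a choice made for the example.

```python
def connected_components(num_nodes, edges):
    """Label each node with the index of its connected component (union-find)."""
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller index as root so labels follow node order.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    roots = sorted({find(i) for i in range(num_nodes)})
    index = {root: k for k, root in enumerate(roots)}
    return [index[find(i)] for i in range(num_nodes)]


def merge_level(features, edges):
    """Collapse each connected component into one super-node for the next level.

    Returns (labels, super_node_features). Super-node features are the mean of
    the member features, which is an assumption made for this sketch.
    """
    if not features:
        return [], []
    labels = connected_components(len(features), edges)
    num_groups = max(labels) + 1
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_groups)]
    counts = [0] * num_groups
    for feature, group in zip(features, labels):
        counts[group] += 1
        for d in range(dim):
            sums[group][d] += feature[d]
    super_nodes = [[value / count for value in row]
                   for row, count in zip(sums, counts)]
    return labels, super_nodes


# Five toy image features; the two edges stand in for predicted links.
features = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [12.0, 10.0], [50.0, 50.0]]
labels, super_nodes = merge_level(features, [(0, 1), (2, 3)])
print(labels)
print(super_nodes)
```

Applying `merge_level` repeatedly, with new edges predicted among the super-nodes each time, gives the hierarchy the abstract describes; the stopping criterion is left out of this sketch.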
Attributed graph clustering, which learns node representation from node attributes and the topological graph for clustering, is a fundamental and challenging task for multimedia network-structured data analysis. Recently, graph contrastive learning (GCL)-based methods have obtained impressive clustering performance on this task. Nevertheless, there still remain some limitations to be solved: 1) most existing methods fail to consider the self-consistency between latent representations and cluster structures; 2) most existing methods require a post-processing operation to get...