- Human Pose and Action Recognition
- Video Surveillance and Tracking Methods
- Advanced Neural Network Applications
- Domain Adaptation and Few-Shot Learning
- Advanced Image and Video Retrieval Techniques
- Multimodal Machine Learning Applications
- Gait Recognition and Analysis
- Anomaly Detection Techniques and Applications
- Hydrocarbon exploration and reservoir analysis
- Face recognition and analysis
- Adversarial Robustness in Machine Learning
- Archaeological Research and Protection
- Image Processing and 3D Reconstruction
- Atmospheric and Environmental Gas Dynamics
- Environmental Toxicology and Ecotoxicology
- Video Analysis and Summarization
- Visual Attention and Saliency Detection
- Viral Infections and Vectors
- Handwritten Text Recognition Techniques
- Generative Adversarial Networks and Image Synthesis
- Advanced Image Processing Techniques
- Advanced X-ray and CT Imaging
- Heavy metals in environment
- Methane Hydrates and Related Phenomena
- Advanced Computing and Algorithms
Kingsoft (China)
2017-2024
University of South China
2022-2024
University of Chinese Academy of Sciences
2023-2024
Nanjing Medical University
2023
Institute of Geology and Geophysics
2022-2023
Chinese Academy of Sciences
2022-2023
Jiangsu Cancer Hospital
2023
Peking University
2012-2017
Cloud Computing Center
2017
Linzi District People's Hospital
2011
Feature extraction and matching are two crucial components in person Re-Identification (ReID). The large pose deformations and the complex view variations exhibited by the captured person images significantly increase the difficulty of learning and matching features from person images. To overcome these difficulties, in this work we propose a Pose-driven Deep Convolutional (PDC) model to learn improved feature extraction and matching models end to end. Our deep architecture explicitly leverages human part cues to alleviate the pose variations and learn robust feature representations from both the global...
To get more accurate saliency maps, recent methods mainly focus on aggregating multi-level features from a fully convolutional network (FCN) and introducing edge information as auxiliary supervision. Though remarkable progress has been achieved, we observe that the closer a pixel is to the edge, the more difficult it is to predict, because edge pixels have a very imbalanced distribution. To address this problem, we propose a label decoupling framework (LDF), which consists of a label decoupling (LD) procedure and a feature interaction network (FIN). The LD...
In this paper, we present a large-scale dataset and establish a baseline for prohibited item discovery in Security Inspection X-ray images. Our dataset, named SIXray, consists of 1,059,231 X-ray images, in which 6 classes of 8,929 prohibited items are manually annotated. It raises a brand new challenge of overlapping image data, while sharing the same properties as existing datasets, including complex yet meaningless contexts and class imbalance. We propose an approach named class-balanced hierarchical refinement (CHR) to...
In unsupervised domain adaptation, rich domain-specific characteristics make it very challenging to learn domain-invariant representations. However, existing solutions attempt to minimize the domain discrepancy directly, which is difficult to achieve in practice. Some methods alleviate the difficulty by explicitly modeling the domain-invariant and domain-specific parts of the representations, but the adverse influence of the explicit construction lies in the residual domain-specific characteristics in the constructed domain-invariant representations. In this paper, we equip adversarial domain adaptation with Gradually Vanishing Bridge (GVB)...
We propose a novel Multi-Task Learning with Low Rank Attribute Embedding (MTL-LORAE) framework for person re-identification. Re-identifications from multiple cameras are regarded as related tasks to exploit shared information and improve re-identification accuracy. Both low-level features and semantic/data-driven attributes are utilized. Since attributes are generally correlated, we introduce a low rank attribute embedding into the MTL formulation to embed the original binary attributes into a continuous attribute space, where incorrect and incomplete...
Optimizing a deep neural network is a fundamental task in computer vision, yet direct training methods often suffer from over-fitting. Teacher-student optimization aims at providing complementary cues from a model trained previously, but these approaches are often considerably slow due to the pipeline of training a few generations in sequence, i.e., the time complexity is increased by several times. This paper presents snapshot distillation (SD), the first framework which enables teacher-student optimization in one generation. The idea of SD is very...
Learning visual features from unlabeled image data is an important yet challenging task, which is often achieved by training a model on some annotation-free information. We consider spatial contexts, for which we solve so-called jigsaw puzzles, i.e., each image is cut into grids and then disordered, and the goal is to recover the correct configuration. Existing approaches formulated it as a classification task by defining a fixed mapping from a small subset of configurations to a class set, but these approaches ignore the underlying relationship between...
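The jigsaw setup described in the abstract above (cut an image into a grid, disorder the tiles, recover the configuration) can be illustrated with a minimal Python sketch. It shows only how a puzzle and its ground-truth permutation might be built and inverted, not the paper's learning method; the function names `make_jigsaw` and `reassemble`, the 3x3 grid, and the toy array are assumptions made for illustration.

```python
import numpy as np


def make_jigsaw(image, grid, rng):
    """Cut an image into grid x grid tiles and shuffle them.

    Returns (shuffled, perm), where shuffled[i] is the tile that was
    originally at row-major position perm[i]. Assumes the image height
    and width are divisible by `grid`.
    """
    h, w = image.shape[:2]
    th, tw = h // grid, w // grid
    tiles = [
        image[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
        for r in range(grid)
        for c in range(grid)
    ]
    perm = rng.permutation(grid * grid)
    shuffled = [tiles[p] for p in perm]
    return shuffled, perm


def reassemble(shuffled, perm, grid):
    """Undo the shuffle, given the permutation a model would have to predict."""
    ordered = [None] * len(shuffled)
    for i, p in enumerate(perm):
        ordered[p] = shuffled[i]
    rows = [
        np.concatenate(ordered[r * grid:(r + 1) * grid], axis=1)
        for r in range(grid)
    ]
    return np.concatenate(rows, axis=0)


rng = np.random.default_rng(0)
image = np.arange(6 * 6 * 3).reshape(6, 6, 3)  # toy 6x6 RGB-like array
shuffled, perm = make_jigsaw(image, 3, rng)
restored = reassemble(shuffled, perm, 3)
```

In a self-supervised setting, `shuffled` would be the network input and `perm` the target that costs no manual annotation.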
We propose Multi-Task Learning with Low Rank Attribute Embedding (MTL-LORAE) to address the problem of person re-identification across multiple cameras. Re-identifications on different cameras are considered as related tasks, which allows the shared information among different tasks to be explored to improve the re-identification accuracy. The MTL-LORAE framework integrates low-level features with mid-level attributes as the descriptions for persons. To improve the accuracy of such description, we introduce the low-rank attribute embedding, which maps the original binary attributes into a...
Language bias is a critical issue in Visual Question Answering (VQA), where models often exploit dataset biases for the final decision without considering the image information. As a result, they suffer from a performance drop on out-of-distribution data and inadequate visual explanation. Based on experimental analysis of existing robust VQA methods, we stress that the language bias in VQA comes from two aspects, i.e., distribution bias and shortcut bias. We further propose a new de-bias framework, Greedy Gradient Ensemble (GGE), which...
According to existing studies, human body edge and pose are two beneficial factors for human parsing. The effectiveness of each of the high-level features (edge and pose) is confirmed through the concatenation of their features with the parsing features. Driven by these insights, this paper studies how human semantic boundaries and keypoint locations can jointly improve human parsing. Compared with the existing practice of feature concatenation, we find that uncovering the correlation among the three factors is a superior way of leveraging the pivotal contextual cues provided by edges and poses. To capture such...
Future activity anticipation is a challenging problem in egocentric vision. As a standard future activity anticipation paradigm, recursive sequence prediction suffers from the accumulation of errors. To address this problem, we propose a simple and effective Self-Regulated Learning framework, which aims to regulate the intermediate representation consecutively to produce a representation that (a) emphasizes the novel information in the frame of the current time-stamp in contrast to previously observed content, and (b) reflects its correlation with previously observed frames. The former...
Facial age estimation from a face image is an important yet very challenging task in computer vision, since humans of different races and/or genders exhibit quite different patterns in their facial aging processes. To deal with the influence of race and gender, previous methods perform age estimation within each population separately. In practice, however, it is often difficult to collect and label sufficient data for each population. Therefore, it would be helpful to exploit an existing large labeled dataset of one (source) population to improve the age estimation performance...
Human parsing and pose estimation are crucial for the understanding of human behaviors. Since these tasks are closely related, employing one unified model to perform the two tasks simultaneously allows them to benefit from each other. However, since human parsing is a pixel-wise classification process while pose estimation is usually a regression task, it is non-trivial to extract discriminative features for both tasks while modeling their correlation in a joint learning fashion. Recent studies have shown that Neural Architecture Search (NAS) has the ability to allocate...
Neural networks often make predictions relying on spurious correlations in the datasets rather than the intrinsic properties of the task of interest, and face sharp degradation on out-of-distribution (OOD) test data. Existing de-bias learning frameworks try to capture specific dataset bias through annotations, but they fail to handle complicated OOD scenarios. Others implicitly identify the dataset bias through specially designed low-capability biased models or losses, but they degrade when the training and testing data are from the same distribution. In this...
The mechanisms underlying the involvement of long non-coding RNAs (lncRNAs) in the metastasis of small cell lung cancer (SCLC) remain largely unknown. Here, we identified that lncRNA ITPR1-AS1 was upregulated in SCLC and lymph node tissues and positively correlated with malignant features. ITPR1-AS1 overexpression was an independent risk factor for the overall survival of patients with SCLC. Our data confirmed that ITPR1-AS1 induces metastasis both in vitro and in vivo. Mechanistically, ITPR1-AS1 acts as a scaffold to enhance the interaction between SRC-associated in mitosis 68 kDa...
Weakly supervised instance segmentation (WSIS) with only image-level labels has recently drawn much attention. To date, bottom-up WSIS methods refine discriminative cues from classifiers with sophisticated multi-stage training procedures, which also suffer from inconsistent object boundaries. Top-down WSIS methods are formulated as a cascade detection-to-segmentation pipeline, in which the quality of segmentation learning heavily depends on pseudo masks generated from detectors. In this paper, we propose a unified parallel...
The development of uranium mines has been necessary to obtain abundant and scarce resources, but it also brings inevitable radioactive contamination to the surrounding soil, rivers and lakes. This paper explores the sensitivity of Cypridopsis vidua to the element uranium and the heavy elements cadmium and copper with single and combined acute toxicity experiments and model predictions. The results showed that the degree of toxic effects was lowest for uranium. The compound toxicity of U-Cd and U-Cu was higher than that of the weakest component and lower than that of the strongest component, whereas Cd-Cu...
Panoptic segmentation aims to partition an image into object instances and semantic content for thing and stuff categories, respectively. To date, learning weakly supervised panoptic segmentation (WSPS) with only image-level labels remains unexplored. In this paper, we propose an efficient jointly thing-and-stuff mining (JTSM) framework for WSPS. To this end, we design a novel mask of interest pooling (MoIPool) to extract fixed-size pixel-accurate feature maps of arbitrary-shape segmentations. MoIPool enables a panoptic mining branch to leverage multiple...