- Video Surveillance and Tracking Methods
- Human Pose and Action Recognition
- Anomaly Detection Techniques and Applications
- Cryospheric studies and observations
- Arctic and Antarctic ice dynamics
- Advanced Neural Network Applications
- Multimodal Machine Learning Applications
- Autonomous Vehicle Technology and Safety
- Climate change and permafrost
- Advanced Neuroimaging Techniques and Applications
- Video Analysis and Summarization
- Cancer, Lipids, and Metabolism
- Generative Adversarial Networks and Image Synthesis
- Digital Media Forensic Detection
- Functional Brain Connectivity Studies
- Extracellular vesicles in disease
- Acute Ischemic Stroke Management
- MRI in cancer diagnosis
- Osteoarthritis Treatment and Mechanisms
- Ferroptosis and cancer prognosis
- Proteins in Food Systems
- Advanced MRI Techniques and Applications
- Fetal and Pediatric Neurological Disorders
- Image Enhancement Techniques
- Drilling and Well Engineering
Peking University, 2016-2025
Center for Life Sciences, 2024-2025
Yangzhou University, 2025
The First Affiliated Hospital, Sun Yat-sen University, 2024
Sun Yat-sen University, 2021-2024
South China Normal University, 2020-2024
Chongqing Institute of Geology and Mineral Resources, 2024
Ministry of Natural Resources, 2024
Shanghai Jiao Tong University, 2014-2023
Tongren Hospital, 2019-2023
We propose a new method to detect deepfake images using the cue of source feature inconsistency within forged images. It is based on the hypothesis that images' distinct source features can be preserved and extracted after going through state-of-the-art deepfake generation processes. We introduce a novel representation learning approach, called pair-wise self-consistency learning (PCL), for training ConvNets to extract these source features and detect deepfake images. It is accompanied by a new image synthesis approach, called inconsistency image generator (I2G), to provide richly annotated training data for PCL. Experimental results...
Most work on temporal action detection is formulated as an offline problem, in which the start and end times of actions are determined after the entire video is fully observed. However, important real-time applications including surveillance and driver assistance systems require identifying actions as soon as each video frame arrives, based only on current and historical observations. In this paper, we propose a novel framework, Temporal Recurrent Network (TRN), to model greater temporal context of a video frame by simultaneously performing online...
We propose an online tracking algorithm that performs object detection and data association under a common framework, capable of linking objects after a long time span. This is realized by preserving a large spatio-temporal memory to store the identity embeddings of the tracked objects, and by adaptively referencing and aggregating useful information from the memory as needed. Our model, called MeMOT, consists of three main modules that are all Transformer-based: 1) Hypothesis Generation that produces object proposals in the current video frame; 2)...
We propose to predict the future trajectories of observed agents (e.g., pedestrians or vehicles) by estimating and using their goals at multiple time scales. We argue that the goal of a moving agent may change over time, and that modeling goals continuously provides more accurate and detailed information for future trajectory estimation. To this end, we present a recurrent network for trajectory prediction, called Stepwise Goal-Driven Network (SGNet). Unlike prior work that models only a single, long-term goal, SGNet estimates and uses goals at multiple temporal scales. In...
Recognizing abnormal events such as traffic violations and accidents in natural driving scenes is essential for successful autonomous driving and advanced driver assistance systems. However, most work on video anomaly detection suffers from two crucial drawbacks. First, it assumes that cameras are fixed and videos have static backgrounds, which is reasonable for surveillance applications but not for vehicle-mounted cameras. Second, it poses the problem as one-class classification, relying on arduously hand-labeled training datasets...
Predicting the future location of vehicles is essential for safety-critical applications such as advanced driver assistance systems (ADAS) and autonomous driving. This paper introduces a novel approach to simultaneously predict both the location and scale of target vehicles in the first-person (egocentric) view of an ego-vehicle. We present a multi-stream recurrent neural network (RNN) encoder-decoder model that separately captures both object location and scale and pixel-level observations for future vehicle localization. We show that incorporating dense optical flow...
We propose TubeR: a simple solution for spatio-temporal video action detection. Different from existing methods that depend on either an offline actor detector or hand-designed actor-positional hypotheses like proposals or anchors, we propose to directly detect an action tubelet in a video by simultaneously performing action localization and recognition from a single representation. TubeR learns a set of tubelet-queries and utilizes a tubelet-attention module to model the dynamic spatio-temporal nature of a video clip, which effectively reinforces the model capacity compared to using...
Video anomaly detection (VAD) has been extensively studied for static cameras but is much more challenging in egocentric driving videos where the scenes are extremely dynamic. This paper proposes an unsupervised method for traffic VAD based on future object localization. The idea is to predict future locations of traffic participants over a short horizon, and then monitor the accuracy and consistency of these predictions as evidence of an anomaly. Inconsistent predictions tend to indicate that an anomaly has occurred or is about to occur. To evaluate our method, we...
Background: The cumulative effect of body mass index (BMI) on brain health remains ill-defined. The effects of overweight across different age groups need clarification. We analyzed the effect of cumulative BMI on neuroimaging features of brain health in adults of different ages. Methods: This study was based on a multicenter, community-based cohort study. We modeled the trajectories of BMI over 16 years to evaluate cumulative exposure. Multimodality neuroimaging data were collected once for volumetric measurements of the brain macrostructure, white matter hyperintensity (WMH), and brain microstructure. We used...
We present Long Short-term TRansformer (LSTR), a temporal modeling algorithm for online action detection, which employs a long- and short-term memory mechanism to model prolonged sequence data. It consists of an LSTR encoder that dynamically leverages coarse-scale historical information from an extended temporal window (e.g., 2048 frames spanning up to 8 minutes), together with an LSTR decoder that focuses on a short time window (e.g., 32 frames spanning 8 seconds) to model the fine-scale characteristics of the data. Compared to prior work, LSTR provides an effective and efficient method...
Passive visual systems typically fail to recognize objects in the amodal setting where they are heavily occluded. In contrast, humans and other embodied agents have the ability to move in the environment and actively control the viewing angle to better understand object shapes and semantics. In this work, we introduce the task of Embodied Amodal Recognition (EAR): an agent is instantiated in a 3D environment close to an occluded target object, and is free to move in the environment to perform object classification, amodal object localization, and amodal object segmentation. To address this problem, we develop a new model called...
We propose StartNet to address Online Detection of Action Start (ODAS), where action starts and their associated categories are detected in untrimmed, streaming videos. Previous methods aim to localize action starts by learning feature representations that can directly separate the start point from its preceding background. This is challenging due to the subtle appearance difference near the action starts and the lack of training data. Instead, StartNet decomposes ODAS into two stages: action classification (using ClsNet) and start point localization (using LocNet). ClsNet focuses on...
Partially biobased thermoplastic vulcanizates (TPVs) with shape memory properties were prepared by in situ dynamic vulcanization of Eucommia ulmoides gum (EUG) and polyolefin elastomer (POE). The cross-linked EUG phase dispersed in the POE matrix acted as a framework structure throughout the POE phase, which limited irreversible deformation and provided sufficient resilience force to drive the molecular chains to recover their initial shape. The results demonstrated that the EUG/POE TPVs exhibited excellent shape memory properties with high shape fixity...
Cerebellar dysfunction may substantially contribute to the clinical symptoms of Parkinson’s disease (PD). The role of cerebellar subregions in tremors and gait disturbances in PD remains unknown. This study investigated alterations in cerebellar subregion volumes and functional connectivity (FC), as well as FC between the dentate nucleus (DN) and the ventral lateral posterior (VLp) thalamus, which are potentially involved in different PD motor subtypes. We conducted morphometric and resting-state functional connectivity analyses in various cerebellar subregions in 22 tremor-dominant (TD)-PD...
We consider scenarios in which we wish to perform joint scene understanding, object tracking, activity recognition, and other tasks while multiple people are wearing body-worn cameras and a third-person static camera also captures the scene. To do this, we need to establish person-level correspondences across first- and third-person videos, which is challenging because the camera wearer is not visible from his/her own egocentric video, preventing the use of direct feature matching. In this paper, we propose a new semi-Siamese Convolutional...
To identify the association between the functional and structural changes of the default mode network (DMN) underlying the cognitive impairment in late-onset depression (LOD), 32 LOD patients and 39 normal controls were recruited and underwent resting-state fMRI, DTI scans, and cognitive assessments. Seed-based correlation analysis was conducted to explore the functional connectivity (FC) of the DMN. Deterministic tractography between FC-impaired regions was performed to examine the structural connectivity (SC). Partial correlation analyses were employed to evaluate the cognitive association of those altered FC and SC. Compared...
Video anomaly detection (VAD) has been extensively studied. However, research on egocentric traffic videos with dynamic scenes lacks large-scale benchmark datasets as well as effective evaluation metrics. This paper proposes a when-where-what pipeline to detect, localize, and recognize anomalous events from egocentric videos. We introduce a new dataset called Detection of Traffic Anomaly (DoTA) containing 4,677 videos with temporal, spatial, and categorical annotations. A new spatial-temporal area under curve (STAUC)...
Online tracking of multiple objects in videos requires strong capacity for modeling and matching object appearances. Previous methods for learning appearance embedding mostly rely on instance-level matching without considering the temporal continuity provided by videos. We design a new instance-to-track matching objective to learn appearance embedding that compares a candidate detection to the embedding of the tracks persisted in the tracker. It enables us to learn not only from videos labeled with complete tracks, but also from unlabeled or partially labeled videos. We implement this learning objective in a unified form...
Multi-modality medical imaging study, especially brain MRI, greatly facilitates research on subclinical brain disease. However, there is still a lack of such studies with a wider age span of participants. The Multi-modality MEdical imaging sTudy bAsed on KaiLuan Study (META-KLS) was designed to address this issue with a large sample size population. We aim to enrol at least 1000 subjects in META-KLS. All participants without contraindications will undergo multi-modality medical imaging, including brain MRI, retinal fundus photograph, optical coherence...