Mingze Xu

ORCID: 0000-0002-0671-4122
Research Areas
  • Video Surveillance and Tracking Methods
  • Human Pose and Action Recognition
  • Anomaly Detection Techniques and Applications
  • Cryospheric studies and observations
  • Arctic and Antarctic ice dynamics
  • Advanced Neural Network Applications
  • Multimodal Machine Learning Applications
  • Autonomous Vehicle Technology and Safety
  • Climate change and permafrost
  • Advanced Neuroimaging Techniques and Applications
  • Video Analysis and Summarization
  • Cancer, Lipids, and Metabolism
  • Generative Adversarial Networks and Image Synthesis
  • Digital Media Forensic Detection
  • Functional Brain Connectivity Studies
  • Extracellular vesicles in disease
  • Acute Ischemic Stroke Management
  • MRI in cancer diagnosis
  • Osteoarthritis Treatment and Mechanisms
  • Ferroptosis and cancer prognosis
  • Proteins in Food Systems
  • Advanced MRI Techniques and Applications
  • Fetal and Pediatric Neurological Disorders
  • Image Enhancement Techniques
  • Drilling and Well Engineering

Peking University
2016-2025

Center for Life Sciences
2024-2025

Yangzhou University
2025

The First Affiliated Hospital, Sun Yat-sen University
2024

Sun Yat-sen University
2021-2024

South China Normal University
2020-2024

Chongqing Institute of Geology and Mineral Resources
2024

Ministry of Natural Resources
2024

Shanghai Jiao Tong University
2014-2023

Tongren Hospital
2019-2023

We propose a new method to detect deepfake images using the cue of the source feature inconsistency within the forged images. It is based on the hypothesis that images' distinct source features can be preserved and extracted after going through state-of-the-art deepfake generation processes. We introduce a novel representation learning approach, called pair-wise self-consistency learning (PCL), for training ConvNets to extract these source features and detect deepfake images. It is accompanied by a new image synthesis approach, called inconsistency image generator (I2G), to provide richly annotated training data for PCL. Experimental results...

10.1109/iccv48922.2021.01475 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

Most work on temporal action detection is formulated as an offline problem, in which the start and end times of actions are determined after the entire video is fully observed. However, important real-time applications including surveillance and driver assistance systems require identifying actions as soon as each video frame arrives, based only on current and historical observations. In this paper, we propose a novel framework, Temporal Recurrent Network (TRN), to model greater temporal context of a video frame by simultaneously performing online...

10.1109/iccv.2019.00563 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

We propose an online tracking algorithm that performs the object detection and data association under a common framework, capable of linking objects after a long time span. This is realized by preserving a large spatio-temporal memory to store the identity embeddings of the tracked objects, and by adaptively referencing and aggregating useful information from the memory as needed. Our model, called MeMOT, consists of three main modules that are all Transformer-based: 1) Hypothesis Generation that produces object proposals in the current video frame; 2)...

10.1109/cvpr52688.2022.00792 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

We propose to predict the future trajectories of observed agents (e.g., pedestrians or vehicles) by estimating and using their goals at multiple time scales. We argue that the goal of a moving agent may change over time, and that modeling goals continuously provides more accurate and detailed information for future trajectory estimation. To this end, we present a recurrent network for trajectory prediction, called Stepwise Goal-Driven Network (SGNet). Unlike prior work that models only a single, long-term goal, SGNet estimates and uses goals at multiple temporal scales. In...

10.1109/lra.2022.3145090 article EN IEEE Robotics and Automation Letters 2022-01-25

Recognizing abnormal events such as traffic violations and accidents in natural driving scenes is essential for successful autonomous driving and advanced driver assistance systems. However, most work on video anomaly detection suffers from two crucial drawbacks. First, they assume cameras are fixed and videos have static backgrounds, which is reasonable for surveillance applications but not for vehicle-mounted cameras. Second, they pose the problem as one-class classification, relying on arduously hand-labeled training datasets...

10.1109/iros40897.2019.8967556 article EN 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2019-11-01

Predicting the future location of vehicles is essential for safety-critical applications such as advanced driver assistance systems (ADAS) and autonomous driving. This paper introduces a novel approach to simultaneously predict both the location and scale of target vehicles in the first-person (egocentric) view of an ego-vehicle. We present a multi-stream recurrent neural network (RNN) encoder-decoder model that separately captures both object location and scale and pixel-level observations for future vehicle localization. We show that incorporating dense optical flow...

10.1109/icra.2019.8794474 article EN 2019 International Conference on Robotics and Automation (ICRA) 2019-05-01

We propose TubeR: a simple solution for spatio-temporal video action detection. Different from existing methods that depend on either an offline actor detector or hand-designed actor-positional hypotheses like proposals or anchors, we propose to directly detect an action tubelet in a video by simultaneously performing action localization and recognition from a single representation. TubeR learns a set of tubelet-queries and utilizes a tubelet-attention module to model the dynamic spatio-temporal nature of a video clip, which effectively reinforces the model capacity compared to using...

10.1109/cvpr52688.2022.01323 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Video anomaly detection (VAD) has been extensively studied for static cameras but is much more challenging in egocentric driving videos where the scenes are extremely dynamic. This paper proposes an unsupervised method for traffic VAD based on future object localization. The idea is to predict future locations of traffic participants over a short horizon, and then monitor the accuracy and consistency of these predictions as evidence of an anomaly. Inconsistent predictions tend to indicate an anomaly has occurred or is about to occur. To evaluate our method, we...

10.1109/tpami.2022.3150763 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2022-02-15
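
The monitoring idea in the traffic VAD abstract above can be illustrated with a small sketch. It is not the authors' implementation: the function names, the box format `(x1, y1, x2, y2)`, and the two scoring rules (one minus IoU for accuracy, coordinate spread for consistency) are assumptions chosen only to make the idea concrete.

```python
from statistics import pstdev

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def accuracy_score(predicted, observed):
    """Assumed accuracy cue: 0.0 when the earlier prediction matches
    the box actually observed, 1.0 when they do not overlap at all."""
    return 1.0 - iou(predicted, observed)

def consistency_score(predictions):
    """Assumed consistency cue: spread of several predictions of the
    same frame that were made at different earlier time steps.
    Returns the population std of each coordinate, averaged over 4."""
    return sum(pstdev(coord) for coord in zip(*predictions)) / 4.0

# Three earlier predictions that agree, then two that disagree.
stable = [(0, 0, 10, 10)] * 3
drifting = [(0, 0, 10, 10), (4, 0, 14, 10)]
stable_spread = consistency_score(stable)
drifting_spread = consistency_score(drifting)
```

A higher value of either score would be read as stronger evidence of an anomaly; how the two cues are combined or thresholded is left open here because the abstract does not say.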

Background: The cumulative effect of body mass index (BMI) on brain health remains ill-defined. The effects of overweight across different age groups need clarification. We analyzed the effect of cumulative BMI on neuroimaging features of brain health in adults of different ages. Methods: This study was based on a multicenter, community-based cohort study. We modeled the trajectories of BMI over 16 years to evaluate cumulative exposure. Multimodality neuroimaging data were collected once for volumetric measurements of the brain macrostructure, white matter hyperintensity (WMH), and brain microstructure. We used...

10.34133/hds.0087 article EN cc-by Health Data Science 2024-01-01

We present Long Short-term TRansformer (LSTR), a temporal modeling algorithm for online action detection, which employs a long- and short-term memory mechanism to model prolonged sequence data. It consists of an LSTR encoder that dynamically leverages coarse-scale historical information from an extended temporal window (e.g., 2048 frames spanning up to 8 minutes), together with an LSTR decoder that focuses on a short time window (e.g., 32 frames spanning 8 seconds) to model the fine-scale characteristics of the data. Compared to prior work, LSTR provides an effective and efficient method...

10.48550/arxiv.2107.03377 preprint EN cc-by arXiv (Cornell University) 2021-01-01
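
The long- and short-term memory split described in the LSTR abstract above can be pictured as two bounded frame buffers. The sketch below is only an illustration of that bookkeeping under stated assumptions: the class name `StreamingMemory`, its `push` method, and the rule that the oldest short-term frame moves into the long-term buffer are hypothetical, and the Transformer encoder and decoder themselves are not shown. The default sizes reuse the example figures from the abstract (2048 and 32); both sizes are assumed to be positive.

```python
from collections import deque

class StreamingMemory:
    """Hypothetical two-level frame memory; not the authors' code."""

    def __init__(self, long_size=2048, short_size=32):
        # deque(maxlen=...) drops its oldest item automatically when full.
        self.long = deque(maxlen=long_size)
        self.short = deque(maxlen=short_size)

    def push(self, frame_feature):
        # Once the short window is full, its oldest entry is copied into
        # the long-term memory just before the append below evicts it.
        if len(self.short) == self.short.maxlen:
            self.long.append(self.short[0])
        self.short.append(frame_feature)

# Tiny sizes so the behaviour is easy to inspect; integers stand in
# for per-frame feature vectors.
mem = StreamingMemory(long_size=8, short_size=4)
for t in range(40):
    mem.push(t)
```

After the loop, `mem.short` holds the four most recent frames and `mem.long` holds the eight frames immediately before them; everything older has been discarded.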

Passive visual systems typically fail to recognize objects in the amodal setting where they are heavily occluded. In contrast, humans and other embodied agents have the ability to move in the environment and actively control the viewing angle to better understand object shapes and semantics. In this work, we introduce the task of Embodied Amodal Recognition (EAR): an agent is instantiated in a 3D environment close to an occluded target object, and is free to move in the environment to perform object classification, amodal object localization, and amodal object segmentation. To address this problem, we develop a new model called...

10.1109/iccv.2019.00213 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

We propose StartNet to address Online Detection of Action Start (ODAS) where action starts and their associated categories are detected in untrimmed, streaming videos. Previous methods aim to localize action starts by learning feature representations that can directly separate the start point from its preceding background. It is challenging due to the subtle appearance difference near the action starts and the lack of training data. Instead, StartNet decomposes ODAS into two stages: action classification (using ClsNet) and start point localization (using LocNet). ClsNet focuses on...

10.1109/iccv.2019.00564 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

Partially biobased thermoplastic vulcanizates (TPVs) with shape memory properties were prepared by in situ dynamic vulcanization of eucommia ulmoides gum (EUG) and polyolefin elastomer (POE). The cross-linked EUG phase dispersed in the POE matrix acted as a framework structure throughout the POE phase, which limited irreversible deformation and provided sufficient resilience force to drive the molecular chains to recover their initial shape. The results demonstrated that the EUG/POE TPVs exhibited excellent shape memory properties with high shape fixity...

10.1021/acs.iecr.8b04710 article EN Industrial & Engineering Chemistry Research 2019-04-08

Cerebellar dysfunction may substantially contribute to the clinical symptoms of Parkinson’s disease (PD). The role of cerebellar subregions in tremors and gait disturbances in PD remains unknown. To investigate alterations in cerebellar subregion volumes and functional connectivity (FC), as well as FC between the dentate nucleus (DN) and the ventral lateral posterior nucleus (VLp) of the thalamus, which are potentially involved in different PD motor subtypes, we conducted morphometric and resting-state functional connectivity analyses in various cerebellar subregions in 22 tremor-dominant (TD)-PD...

10.1007/s00702-023-02606-9 article EN cc-by Journal of Neural Transmission 2023-03-01

We consider scenarios in which we wish to perform joint scene understanding, object tracking, activity recognition, and other tasks in scenarios in which multiple people are wearing body-worn cameras while a third-person static camera also captures the scene. To do this, we need to establish person-level correspondences across first- and third-person videos, which is challenging because the camera wearer is not visible from his/her own egocentric video, preventing the use of direct feature matching. In this paper, we propose a new semi-Siamese Convolutional...

10.1109/cvpr.2017.503 preprint EN 2017-07-01

To identify the association between the functional and structural changes of the default mode network (DMN) underlying the cognitive impairment in late-onset depression (LOD), 32 LOD patients and 39 normal controls were recruited and underwent resting-state fMRI, DTI scans, and cognitive assessments. Seed-based correlation analysis was conducted to explore the functional connectivity (FC) of the DMN. Deterministic tractography between FC-impaired regions was performed to examine the structural connectivity (SC). Partial correlation analyses were employed to evaluate the cognitive association of those altered FC and SC. Compared...

10.1038/srep37617 article EN cc-by Scientific Reports 2016-11-25

Video anomaly detection (VAD) has been extensively studied. However, research on egocentric traffic videos with dynamic scenes lacks large-scale benchmark datasets as well as effective evaluation metrics. This paper proposes a when-where-what pipeline to detect, localize, and recognize anomalous events from egocentric videos. We introduce a new dataset called Detection of Traffic Anomaly (DoTA) containing 4,677 videos with temporal, spatial, and categorical annotations. A new spatial-temporal area under curve (STAUC)...

10.48550/arxiv.2004.03044 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Online tracking of multiple objects in videos requires strong capacity of modeling and matching object appearances. Previous methods for learning appearance embedding mostly rely on instance-level matching without considering the temporal continuity provided by videos. We design a new instance-to-track matching objective to learn appearance embedding that compares a candidate detection to the embedding of the tracks persisted in the tracker. It enables us to learn not only from videos labeled with complete tracks, but also from unlabeled or partially labeled videos. We implement this learning objective in a unified form...

10.48550/arxiv.2107.02396 preprint EN cc-by arXiv (Cornell University) 2021-01-01

Multi-modality medical imaging study, especially brain MRI, greatly facilitates the research on subclinical brain disease. However, there is still a lack of such studies with a wider age span of participants. The Multi-modality MEdical imaging sTudy bAsed on KaiLuan Study (META-KLS) was designed to address this issue with a large sample size population. We aim to enrol at least 1000 subjects in META-KLS. All the participants without contraindications will perform multi-modality medical imaging, including brain MRI, retinal fundus photograph, optical coherence...

10.1136/bmjopen-2022-067283 article EN cc-by-nc BMJ Open 2023-02-01