Wen Liu

ORCID: 0000-0002-3867-1825
Research Areas
  • Anomaly Detection Techniques and Applications
  • Human Pose and Action Recognition
  • Advanced Vision and Imaging
  • Computer Graphics and Visualization Techniques
  • Generative Adversarial Networks and Image Synthesis
  • Video Surveillance and Tracking Methods
  • Advanced Computational Techniques and Applications
  • Speech and Audio Processing
  • 3D Shape Modeling and Analysis
  • Advanced Image and Video Retrieval Techniques
  • Network Security and Intrusion Detection
  • Seismology and Earthquake Studies
  • Multimodal Machine Learning Applications
  • Earthquake Detection and Analysis
  • Power Systems and Technologies
  • Retinal Imaging and Analysis
  • Advanced Algorithms and Applications
  • Artificial Immune Systems Applications
  • Advanced Decision-Making Techniques
  • Earthquake and Tectonic Studies
  • Neural Networks and Applications
  • Image and Object Detection Techniques
  • Educational Technology and Pedagogy
  • Human Motion and Animation
  • Remote Sensing and Land Use

University of Science and Technology of China
2021-2024

State Grid Corporation of China (China)
2023-2024

Chiba University
2013-2024

Lamar University
2018-2024

Tencent (China)
2023-2024

ShanghaiTech University
2017-2024

Wuhan Institute of Technology
2024

Liaoning University
2024

East China University of Science and Technology
2023

University of Science and Technology Beijing
2023

Anomaly detection in videos refers to the identification of events that do not conform to expected behavior. However, almost all existing methods tackle the problem by minimizing the reconstruction errors of training data, which cannot guarantee a larger reconstruction error for an abnormal event. In this paper, we propose to tackle the anomaly detection problem within a video prediction framework. To the best of our knowledge, this is the first work that leverages the difference between a predicted future frame and its ground truth to detect an abnormal event. To predict a future frame with higher quality for normal...

10.1109/cvpr.2018.00684 article EN 2018-06-01
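The prediction-based idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes PSNR as the measure of prediction quality and min-max normalisation per video, neither of which the truncated abstract specifies, and the function names `psnr` and `regularity_scores` are hypothetical. Frames are represented as flat lists of pixel intensities in [0, 1] to keep the sketch dependency-free.

```python
import math


def psnr(predicted, actual, max_value=1.0):
    """Peak signal-to-noise ratio between a predicted frame and its ground truth.

    Both frames are flat lists of pixel intensities of equal length.
    A low PSNR means the frame was predicted poorly.
    """
    if len(predicted) != len(actual) or not actual:
        raise ValueError("frames must be non-empty and the same size")
    mse = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    # Floor the error so a perfect prediction gives a large finite PSNR.
    mse = max(mse, 1e-12)
    return 10.0 * math.log10(max_value ** 2 / mse)


def regularity_scores(psnr_values):
    """Min-max normalise per-frame PSNR values to [0, 1].

    Assumption: scores are normalised within one video; a lower score
    marks a frame as more likely to be abnormal.
    """
    lo, hi = min(psnr_values), max(psnr_values)
    if hi == lo:
        return [1.0 for _ in psnr_values]
    return [(v - lo) / (hi - lo) for v in psnr_values]


# Toy usage: one well-predicted frame and one badly predicted frame.
ground_truth = [0.5, 0.5, 0.5, 0.5]
frame_psnr = [
    psnr([0.5, 0.5, 0.5, 0.6], ground_truth),  # close prediction
    psnr([0.0, 1.0, 0.5, 0.5], ground_truth),  # poor prediction
]
scores = regularity_scores(frame_psnr)
```

A frame would then be flagged as abnormal when its score falls below a chosen threshold; the threshold itself is a tuning decision outside this sketch.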

Motivated by the capability of sparse coding based anomaly detection, we propose a Temporally-coherent Sparse Coding (TSC), where we enforce that similar neighbouring frames be encoded with similar reconstruction coefficients. Then we map TSC to a special type of stacked Recurrent Neural Network (sRNN). By taking advantage of sRNN in learning all parameters simultaneously, the nontrivial hyper-parameter selection for TSC can be avoided; meanwhile, with a shallow sRNN, the reconstruction coefficients can be inferred within a forward pass, which reduces the computational...

10.1109/iccv.2017.45 article EN 2017-10-01

This paper tackles anomaly detection in videos, which is an extremely challenging task because anomaly is unbounded. We approach this task by leveraging a Convolutional Neural Network (CNN or ConvNet) for appearance encoding of each frame, and a Convolutional Long Short Term Memory (ConvLSTM) for memorizing all past frames, which corresponds to the motion information. Then we integrate ConvNet and ConvLSTM with an Auto-Encoder, referred to as ConvLSTM-AE, to learn the regularity of ordinary moments. Compared with 3D Auto-Encoder based anomaly detection, our main...

10.1109/icme.2017.8019325 article EN 2017 IEEE International Conference on Multimedia and Expo (ICME) 2017-07-01

We tackle human motion imitation, appearance transfer, and novel view synthesis within a unified framework, which means that the model, once trained, can be used to handle all these tasks. The existing task-specific methods mainly use 2D keypoints (pose) to estimate the human body structure. However, they only express position information, with no ability to characterize the personalized shape of the individual person or to model limb rotations. In this paper, we propose to use a 3D body mesh recovery module to disentangle the pose and shape,...

10.1109/iccv.2019.00600 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

This paper presents an anomaly detection method that is based on a sparse coding inspired Deep Neural Network (DNN). Specifically, in light of the success of sparse coding based anomaly detection, we propose a Temporally-coherent Sparse Coding (TSC), where a temporally-coherent term is used to preserve the similarity between two similar frames. The optimization of the sparse coefficients in TSC with the Sequential Iterative Soft-Thresholding Algorithm (SIATA) is equivalent to a special stacked Recurrent Neural Network (sRNN) architecture. Further, to reduce the computational cost...

10.1109/tpami.2019.2944377 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2019-09-28
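The soft-thresholding iteration named in the abstract above can be illustrated with the standard iterative soft-thresholding algorithm (ISTA) for plain sparse coding, which minimises 0.5 * ||x - D a||^2 + lam * ||a||_1. This is a hedged sketch only: it omits the temporally-coherent term and the sRNN mapping described in the paper, and the names `soft_threshold` and `ista` are hypothetical. Unrolling the loop shows why a fixed number of iterations resembles a stacked recurrent network with shared weights.

```python
def soft_threshold(value, threshold):
    """Shrink a scalar toward zero by `threshold` (proximal step of the L1 norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def ista(dictionary, signal, lam, step, iterations=100):
    """Plain ISTA for sparse coding.

    dictionary: list of m rows, each with k entries (the matrix D).
    signal:     list of m values (the vector x).
    step:       gradient step size; should not exceed 1 / L, where L is the
                largest eigenvalue of D^T D.
    Returns the list of k sparse coefficients.
    """
    m, k = len(dictionary), len(dictionary[0])
    coeffs = [0.0] * k
    for _ in range(iterations):
        # Residual D a - x.
        residual = [
            sum(dictionary[i][j] * coeffs[j] for j in range(k)) - signal[i]
            for i in range(m)
        ]
        # Gradient of the quadratic term: D^T (D a - x).
        gradient = [
            sum(dictionary[i][j] * residual[i] for i in range(m))
            for j in range(k)
        ]
        # Gradient step followed by soft-thresholding.
        coeffs = [
            soft_threshold(coeffs[j] - step * gradient[j], step * lam)
            for j in range(k)
        ]
    return coeffs


# Toy usage with an identity dictionary: the small entry is driven to zero.
identity = [[1.0, 0.0], [0.0, 1.0]]
sparse_code = ista(identity, [1.0, 0.05], lam=0.1, step=1.0, iterations=20)
```

With an identity dictionary the solution is simply the soft-thresholded signal, which makes the toy case easy to check by hand.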

Abnormal event detection in surveillance video is an essential but challenging task, and many methods have been proposed to deal with this problem. The previous methods either only consider the appearance information or directly integrate the results of appearance and motion information without explicitly considering their endogenous consistency semantics. Inspired by the rule that humans identify abnormal frames from multi-modality signals, we propose the Appearance-Motion Memory Consistency Network (AMMC-Net). Our method first makes full...

10.1609/aaai.v35i2.16177 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2021-05-18

We study a challenging task, conditional human motion generation, which produces plausible human motion sequences according to various conditional inputs, such as action classes or textual descriptors. Since human motions are highly diverse and have a quite different distribution from the conditional modalities, such as textual descriptors in natural languages, it is hard to learn a probabilistic mapping from the desired conditional modality to the human motion sequences. Besides, the raw motion data from a motion capture system might be redundant and contain noise; directly modeling the joint distribution over the raw motion sequences and conditional modalities would...

10.1109/cvpr52729.2023.01726 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

Classical semi-supervised video anomaly detection assumes that only normal data are available in the training set because of the rare and unbounded nature of anomalies. It is obvious, however, that these infrequently observed abnormal events can actually help with the detection of identical or similar abnormal events, a line of thinking that motivates us to study open-set supervised anomaly detection, where only a few types of abnormal events and many normal events are available. Under the assumption that normal events can be well predicted, we propose a Margin Learning Embedded Prediction (MLEP) framework. There are three features in MLEP-...

10.24963/ijcai.2019/419 article EN 2019-07-28

Anomaly detection in videos refers to the identification of events that do not conform to expected behavior. However, almost all existing methods cast this problem as the minimization of reconstruction errors of training data including only normal events, which may lead to self-reconstruction and cannot guarantee a larger reconstruction error for an abnormal event. In this paper, we propose to formulate the video anomaly detection problem within a regime of video prediction. We advocate that not all video prediction networks are suitable for video anomaly detection. Then, we introduce two...

10.1109/tpami.2021.3129349 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2021-11-19

Though the advancement of pre-trained large language models unfolds, the exploration of building a unified model for language and other multi-modal data, such as motion, remains challenging and untouched so far. Fortunately, human motion displays a semantic coupling akin to human language, often perceived as a form of body language. By fusing language data with large-scale motion models, motion-language pre-training that can enhance the performance of motion-related tasks becomes feasible. Driven by this insight, we propose MotionGPT, a unified,...

10.48550/arxiv.2306.14795 preprint EN cc-by arXiv (Cornell University) 2023-01-01

The rapid development of open-source large language models (LLMs) has been truly remarkable. However, the scaling law described in previous literature presents varying conclusions, which casts a dark cloud over scaling LLMs. We delve into the study of scaling laws and present our distinctive findings that facilitate scaling in two commonly used open-source configurations, 7B and 67B. Guided by the scaling laws, we introduce DeepSeek LLM, a project dedicated to advancing open-source language models with a long-term perspective. To support the pre-training phase, we have developed...

10.48550/arxiv.2401.02954 preprint EN other-oa arXiv (Cornell University) 2024-01-01

10.1109/cvpr52733.2024.00407 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

We present DeepSeek-VL, an open-source Vision-Language (VL) Model designed for real-world vision and language understanding applications. Our approach is structured around three key dimensions: We strive to ensure our data is diverse, scalable, and extensively covers real-world scenarios including web screenshots, PDFs, OCR, charts, and knowledge-based content, aiming for a comprehensive representation of practical contexts. Further, we create a use case taxonomy from real user scenarios and construct an instruction tuning dataset...

10.48550/arxiv.2403.05525 preprint EN arXiv (Cornell University) 2024-03-08

Diabetic Retinopathy (DR) is a non-negligible eye disease among patients with Diabetes Mellitus, and an automatic retinal image analysis algorithm for DR screening is in high demand. The resolution of retinal images is very high: small pathological tissues can be detected only in high-resolution images, and a large local receptive field is required to identify late-stage disease. However, directly training a neural network with a deep architecture is both time-consuming and computationally expensive, and difficult because of the gradient vanishing/exploding problem,...

10.1109/embc.2018.8512828 article EN 2018-07-01

Recently, deep learning has been used for hyperspectral image classification (HSIC) due to its powerful feature learning and classification ability. In this letter, a novel deep learning-based framework based on DeepLab is proposed for HSIC. Inspired by the excellent performance of DeepLab in semantic segmentation, the proposed framework applies DeepLab to excavate spatial features of the hyperspectral image (HSI) pixel by pixel. It breaks through the limitation of patch-wise feature learning in most existing deep learning methods used in HSIC. More importantly, it can extract features at multiple scales and effectively avoid the reduction of spatial resolution. Furthermore,...

10.1109/lgrs.2018.2871507 article EN IEEE Geoscience and Remote Sensing Letters 2018-10-05

Phase carried by two orthogonal polarizations can be manipulated independently by controlling both the geometric size and the orientation of the dielectric nanopost. With this characteristic, we demonstrate a novel multifunctional metasurface, which converts part of the incident linearly polarized light into its cross-polarization and encodes the phase of the two polarizations independently. A beam splitter and a bifocal metalens were realized in a single-layer metasurface with this approach. We fabricated the metalens and demonstrated that the focal spots are separated transversely...

10.3788/col202119.053601 article EN Chinese Optics Letters 2021-01-01

This work focuses on image anomaly detection by leveraging only normal images in the training phase. Most previous methods tackle anomaly detection by reconstructing the input images with an autoencoder (AE)-based model, and an underlying assumption is that the reconstruction errors for normal images are small, and those for abnormal images are large. However, these AE-based methods sometimes even reconstruct the anomalies well; consequently, they are less sensitive to anomalies. To conquer this issue, we propose to leverage the structure-texture correspondence. Specifically, we observe...

10.1109/tnnls.2021.3101403 article EN IEEE Transactions on Neural Networks and Learning Systems 2021-08-13

Co-speech gesture generation is to synthesize a gesture sequence that not only looks real but also matches the input speech audio. Our method generates the movements of a complete upper body, including arms, hands, and the head. Although recent data-driven methods achieve great success, challenges still exist, such as limited variety, poor fidelity, and a lack of objective metrics. Motivated by the fact that the speech cannot fully determine the gesture, we design a method that learns a set of gesture template vectors to model the latent conditions, which relieve...

10.1109/iccv48922.2021.01089 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

The Tohoku earthquake of 11 March 2011 caused very large tsunamis and widespread devastation. Various high-resolution satellites captured details of the affected areas, and the images were utilized in emergency response. In this study, pre- and post-event TerraSAR-X intensity images were used to identify tsunami-flooded areas and damaged buildings. Since a water surface generally shows very little backscatter, flooded areas could be extracted by the difference in backscattering coefficients between the pre- and post-event images. Impacted buildings were detected by calculating...

10.1193/1.4000120 article EN Earthquake Spectra 2013-03-01

We tackle human image synthesis, including human motion imitation, appearance transfer, and novel view synthesis, within a unified framework. It means that the model, once trained, can be used to handle all these tasks. The existing task-specific methods mainly use 2D keypoints (pose) to estimate the human body structure. However, they only express position information, with no ability to characterize the personalized shape of the person or to model limb rotations. In this paper, we propose to use a 3D body mesh recovery module to disentangle the pose...

10.1109/tpami.2021.3078270 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2021-05-08

We present DreamCraft3D, a hierarchical 3D content generation method that produces high-fidelity and coherent 3D objects. We tackle the problem by leveraging a 2D reference image to guide the stages of geometry sculpting and texture boosting. A central focus of this work is to address the consistency issue that existing works encounter. To sculpt geometries that render coherently, we perform score distillation sampling via a view-dependent diffusion model. This 3D prior, alongside several training strategies, prioritizes the geometry consistency but...

10.48550/arxiv.2310.16818 preprint EN other-oa arXiv (Cornell University) 2023-01-01