Jun Ling

ORCID: 0000-0001-7260-7141
Research Areas
  • Face recognition and analysis
  • Generative Adversarial Networks and Image Synthesis
  • Speech and Audio Processing
  • Advanced Image Processing Techniques
  • Video Surveillance and Tracking Methods
  • Advanced Vision and Imaging
  • Underwater Acoustics Research
  • Fault Detection and Control Systems
  • Network Security and Intrusion Detection
  • Cooperative Communication and Network Coding
  • Direction-of-Arrival Estimation Techniques
  • Error Correcting Code Techniques
  • Additive Manufacturing Materials and Processes
  • Machine Fault Diagnosis Techniques
  • Human Pose and Action Recognition
  • Video Analysis and Summarization
  • Advanced Computational Techniques and Applications
  • Digital Media Forensic Detection
  • 3D Shape Modeling and Analysis
  • Maritime Navigation and Safety
  • Advanced Neural Network Applications
  • Advanced Wireless Network Optimization
  • Advanced Data Compression Techniques
  • Infrared Target Detection Methodologies
  • Radar Systems and Signal Processing

South China University of Technology
2025

Numerical Method (China)
2025

Shanghai Jiao Tong University
2020-2024

Shanghai Maritime University
2021-2022

Microsoft Research Asia (China)
2022

Guangzhou University
2022

Jiangsu University
2021

Tianjin University
2014

MathWorks (United States)
2014

University of Florida
2009-2011

Image composition plays a common but important role in photo editing. To acquire photo-realistic composite images, one must adjust the appearance and visual style of the foreground to be compatible with the background. Existing deep learning methods for harmonizing composite images directly learn an image mapping network from the composite to the real one, without explicit exploration on visual style consistency between the background and the foreground images. To ensure the visual style consistency between the foreground and the background, in this paper, we treat image harmonization as a style transfer problem. In particular, we propose a simple...

10.1109/cvpr46437.2021.00924 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Coastal surveillance video helps officials to obtain on-site visual information on maritime traffic situations, which benefits building up the maritime transportation detection infrastructure. The previous ship detection methods focused on detecting distant small ships in maritime videos, with less attention paid to the task of ship detection from coastal surveillance video. To address this challenge, a novel framework is proposed to detect ships in images of three typical traffic situations in three consecutive steps. First, the Canny detector is introduced to determine the potential ship edges...

10.1017/s0373463321000540 article EN Journal of Navigation 2021-07-09

We introduce a missing data recovery methodology based on a weighted least squares iterative adaptive approach (IAA). The proposed method is referred to as the missing-data IAA (MIAA) and it can be used for uniform or non-uniform sampling as well as for arbitrary data missing patterns. MIAA uses the IAA spectrum estimates to retrieve the missing data, based on a spectral least squares criterion similar to that used by IAA. Numerical examples are presented to show the effectiveness of MIAA for missing data recovery. MIAA is also shown to outperform an existing competitive approach, and this at a much lower computational cost.

10.1109/icassp.2009.4960347 article EN IEEE International Conference on Acoustics Speech and Signal Processing 2009-04-01

In complex and dynamic environments, traditional motion detection techniques that rely on visual feature extraction face significant challenges when detecting and tracking small-sized moving objects. These difficulties primarily stem from the limited feature information inherent in small objects and the substantial interference caused by irrelevant information in complex backgrounds. Inspired by the intricate mechanisms for motion detection in insect brains, some bio-inspired systems have been designed to identify small moving objects in dynamic natural backgrounds. While these insect-inspired systems can...

10.3390/app15031649 article EN cc-by Applied Sciences 2025-02-06

In this paper, a deep learning fault detection and prediction framework combining principal component analysis (PCA) and Informer is proposed to solve the problem of online monitoring of nuclear power valves, which is hard to implement. More specifically, PCA plays the role of dimensionality reduction and feature extraction. It maps data with multi-dimensional space to low-dimensional space and extracts the main features. At the same time, the T-square and Q statistic thresholds are also provided to realize abnormal status monitoring. Meanwhile,...

10.3390/machines10040240 article EN cc-by Machines 2022-03-29

While previous methods for speech-driven talking face generation have shown significant advances in improving the visual and lip-sync quality of the synthesized videos, they have paid less attention to lip motion jitters, which can substantially undermine the perceived quality of talking face videos. What causes motion jitters, and how to mitigate the problem? In this article, we conduct systematic analyses to investigate the motion jittering problem based on a state-of-the-art pipeline that utilizes 3D face representations to bridge the input audio and output video,...

10.1109/jstsp.2023.3333552 article EN IEEE Journal of Selected Topics in Signal Processing 2023-11-01

360° panoramas are extensively utilized as environmental light sources in computer graphics. However, capturing a 360° × 180° panorama poses challenges due to the necessity of specialized and costly equipment, and additional human resources. Prior studies develop various learning-based generative methods to synthesize panoramas from a single Narrow Field-of-View (NFoV) image, but they are limited in alterable input patterns, generation quality, and controllability. To address these issues, we propose a novel pipeline called...

10.1145/3581783.3612508 preprint EN 2023-10-26

Face reenactment aims to generate an animation of a source face using the poses and expressions from a target face. Although recent methods have made remarkable progress by exploiting generative adversarial networks, they are limited in generating high-fidelity and identity-preserving results due to the inappropriate driving information and insufficiently effective animating strategies. In this work, we propose a novel face reenactment framework that achieves both high-fidelity generation and identity preservation. Instead of sparse...

10.1145/3571857 article EN ACM Transactions on Multimedia Computing Communications and Applications 2022-11-23

Video conferences introduce a new scenario for video transmission, which focuses on keeping the fidelity of faces even in a low bandwidth network environment. In this work, we propose VSBNet, one of the frameworks to utilize face landmarks in video compression. Our method utilizes adversarial learning to reconstruct the origin frames from the landmarks. To recover more details and keep the consistency of identity, we introduce the concept of visual sensitivity to separate the face contour and fast-moving parts, such as the eyes and mouth. Experimental results...

10.1109/icmew53276.2021.9455985 article EN 2021-06-21

Talking-head video editing aims to efficiently insert, delete, and substitute the word of a pre-recorded video through a text transcript editor. The key challenge for this task is obtaining an editing model that generates new talking-head video clips which simultaneously have accurate lip synchronization and motion smoothness. Previous approaches, including 3DMM-based (3D Morphable Model) methods and NeRF-based (Neural Radiance Field) methods, are sub-optimal in that they either require minutes of source videos and days of training...

10.1145/3581783.3611765 article EN 2023-10-26

Talking face generation aims at generating photorealistic video portraits of a target person driven by input audio. According to the nature of audio to lip motions mapping, the same speech content may have different appearances even for the same person at different occasions. Such one-to-many mapping problem brings ambiguity during training and thus causes inferior visual results. Although this one-to-many mapping could be alleviated in part by a two-stage framework (i.e., an audio-to-expression model followed by a neural-rendering model), it is still...

10.1109/tpami.2024.3409380 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2024-01-01

We propose a novel robust and efficient Speech-to-Animation (S2A) approach for synchronized facial animation generation in human-computer interaction. Compared with conventional approaches, the proposed approach utilizes phonetic posteriorgrams (PPGs) of spoken phonemes as input to ensure the cross-language and cross-speaker ability, and introduces corresponding prosody features (i.e. pitch and energy) to further enhance the expression of generated animation. Mixture-of-experts (MOE)-based Transformer is employed to better model...

10.1109/icassp43922.2022.9747495 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

As the latest video coding standard, versatile video coding (VVC) has shown its ability in retaining pixel quality. To excavate more compression potential for video conference scenarios under ultra-low bitrate, this paper proposes a bitrate-adjustable hybrid compression scheme for face video. This hybrid compression scheme combines the pixel-level precise recovery capability of traditional coding with the generation capability of deep learning based on abridged information, where Pixel-wise Bi-Prediction, Low-Bitrate-FOM and Lossless Keypoint Encoder collaborate to achieve PSNR...

10.1109/icme52920.2022.9859867 article EN 2022 IEEE International Conference on Multimedia and Expo (ICME) 2022-07-18

Active sonar systems involve the transmission and reception of one or more probing sequences, which provide a basis for the extraction of the target information in a region of interest. The probing sequences at the transmitter and the signal processing at the receiver play crucial roles in the overall system performance. In this paper, CAN (cyclic algorithm-new) is employed to synthesize probing sequences with good aperiodic autocorrelation properties. The performance of the CAN sequences will be compared with those of pseudo random noise and random phase sequences. Two adaptive receiver designs, namely...

10.1121/1.3575604 article EN The Journal of the Acoustical Society of America 2011-06-01

Multistatic active sonar systems involve the transmission and reception of multiple probing sequences. Since simultaneously transmitted sequences act as interferences to one another, adaptive receiver filters are needed for interference suppression and target range-Doppler imaging. Two adaptive receiver designs, namely, the iterative adaptive approach (IAA) and the sparse learning via iterative minimization (SLIM) method, are considered for range-Doppler imaging in multistatic active sonar. The so-obtained range-Doppler images allow us to further estimate the target parameters. Specifically, we use...

10.1109/joe.2013.2249851 article EN IEEE Journal of Oceanic Engineering 2014-01-31

Building an efficient and reliable small target motion detection visual system is challenging for artificial intelligence robotics because a small target only occupies a few pixels and hardly displays visual features in images. Biological visual systems that have evolved over millions of years could be ideal templates for designing artificial visual systems. Insects benefit from a class of specialized neurons, called small target motion detectors (STMDs), which endow them with an excellent ability to detect small moving targets against a cluttered dynamic environment. Some...

10.3389/fnbot.2022.984430 article EN cc-by Frontiers in Neurorobotics 2022-09-20

Optimal control theory and reinforcement learning are gradually being used in the field of industrial control. In this article, a new optimal tracking control scheme for 160 MW boiler‐turbine systems is proposed based on an online policy iteration integral reinforcement learning (IRL) method. Firstly, a self‐learning state tracking control with a cost function is developed to deal with the optimal tracking problems of the nonlinear system. Then, based on the modified cost function, a policy iteration‐based IRL method is introduced to obtain the optimal control law. Finally, the monotonicity and convergence are analyzed detailed...

10.1002/oca.2792 article EN Optimal Control Applications and Methods 2021-10-12

This paper presents an automata-based algorithm for answering the provenance-aware regular path queries (RPQs) over RDF graphs on the Semantic Web. The provenance-aware RPQs can explain why pairs of nodes in the classical semantics appear in the result of an RPQ. We implement a parallel version of the automata-based algorithm using the Pregel framework Giraph to efficiently evaluate provenance-aware RPQs on large RDF graphs. The experimental results show that our algorithms are effective and efficient to answer provenance-aware RPQs on large real-world RDF graphs.

10.1145/2567948.2577284 article EN 2014-04-07

Recent studies have shown remarkable success in synthesizing realistic talking faces by exploiting generative adversarial networks. However, existing methods are mostly target specific that cannot generate images of previously unseen people, and they suffer from artifacts such as blurriness and mismatching of facial details. In this paper, we tackle these problems by proposing a target-agnostic framework. We introduce a geometry-aware feature transformation module to achieve shape transfer while...

10.1109/icip40778.2020.9190699 article EN 2020 IEEE International Conference on Image Processing (ICIP) 2020-09-30

Current talking face generation methods mainly focus on speech-lip synchronization. However, insufficient investigation on the facial talking style leads to a lifeless and monotonous avatar. Most previous works fail to imitate expressive styles from arbitrary video prompts and ensure the authenticity of the generated video. This paper proposes an unsupervised variational style transfer model (VAST) to vivify the neutral photo-realistic avatars. Our model consists of three key components: a style encoder that extracts facial style representations from the given...

10.1109/iccvw60793.2023.00320 article EN 2023-10-02