NFDI4DS | UHH-SEMS - Publication Details

Anni Tang

ORCID: 0000-0002-9772-3293

About

Contact & Profiles

A5006595285

Research Areas

Face recognition and analysis
Generative Adversarial Networks and Image Synthesis
Video Coding and Compression Technologies
Advanced Data Compression Techniques
Advanced Image Processing Techniques
Speech and Audio Processing
Computer Graphics and Visualization Techniques
Image Enhancement Techniques
Advanced Vision and Imaging
Corruption and Economic Development
Regional Economic and Spatial Analysis
3D Shape Modeling and Analysis
Rural development and sustainability
Taxation and Compliance Studies
Migration and Labor Dynamics
Regional Economics and Spatial Analysis
Face and Expression Recognition
Migration, Health and Trauma
Urbanization and City Planning
Underwater Acoustics Research
Land Use and Ecosystem Services
Digital Media Forensic Detection
Music and Audio Processing
Simulation and Modeling Applications
Interpreting and Communication in Healthcare

Shanghai Jiao Tong University
2021-2024

Xi'an University of Architecture and Technology
2022-2024

Anti-corruption Efforts, Corruption Perception and Public Evaluation of Local Governments in China

OPENALEX - Publications

Zhou Zhou Yong Zhou Meng Yuan Anni Tang

China has taken significant steps to combat corruption since the 18th National Congress of the Chinese Communist Party (CCP). However, whether and how anti-corruption efforts influence the public's evaluation of local government performance remains understudied. Using multiple data sources, including panel survey data from the China Family Panel Studies from 2010 to 2018, this research examines how anti-corruption efforts improve evaluations by reducing public perception of existing corruption. Additional analysis reveals that anti-corruption efforts reduce perceived...

10.1017/s030574102400167x article EN The China Quarterly 2025-01-24

High-Fidelity Face Reenactment Via Identity-Matched Correspondence Learning

Han Xue Jun Ling Anni Tang Li Song Rong Xie and 1 more

Face reenactment aims to generate an animation of a source face using the poses and expressions from a target face. Although recent methods have made remarkable progress by exploiting generative adversarial networks, they are limited in generating high-fidelity and identity-preserving results due to inappropriate driving information and insufficiently effective animating strategies. In this work, we propose a novel framework that achieves both high-fidelity generation and identity preservation. Instead of sparse...

10.1145/3571857 article EN ACM Transactions on Multimedia Computing Communications and Applications 2022-11-23

A Generative Compression Framework For Low Bandwidth Video Conference

Dahu Feng Yan Huang Yiwei Zhang Jun Ling Anni Tang and 1 more

Video conferences introduce a new scenario for video transmission, which focuses on keeping the fidelity of faces even in a low-bandwidth network environment. In this work, we propose VSBNet, one of the frameworks to utilize face landmarks for compression. Our method utilizes adversarial learning to reconstruct the original frames from the landmarks. To recover more details and keep the consistency of identity, we propose the concept of visual sensitivity to separate the contour and fast-moving parts, such as the eyes and mouth. Experimental results...

10.1109/icmew53276.2021.9455985 article EN 2021-06-21

Memories are One-to-Many Mapping Alleviators in Talking Face Generation

Anni Tang Tianyu He Xu Tan Jun Ling Runnan Li and 3 more

Talking face generation aims at generating photorealistic video portraits of a target person driven by input audio. According to the nature of the audio-to-lip-motion mapping, the same speech content may have different visual appearances even for the same person on different occasions. Such a one-to-many mapping problem brings ambiguity during training and thus causes inferior visual results. Although this one-to-many mapping could be alleviated in part by a two-stage framework (i.e., an audio-to-expression model followed by a neural-rendering model), it is still...

10.1109/tpami.2024.3409380 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2024-01-01

Efficient Dynamic-NeRF Based Volumetric Video Coding with Rate Distortion Optimization

Zhiyu Zhang Lu Guo Huanxiong Liang Anni Tang Qiang Hu and 1 more

10.1109/icme57554.2024.10687892 article EN 2024 IEEE International Conference on Multimedia and Expo (ICME) 2024-07-15

Generative Compression for Face Video: A Hybrid Scheme

Anni Tang Yan Huang Jun Ling Zhiyu Zhang Yiwei Zhang and 2 more

As the latest video coding standard, versatile video coding (VVC) has shown its ability in retaining pixel quality. To excavate more compression potential for video conference scenarios under ultra-low bitrate, this paper proposes a bitrate-adjustable hybrid compression scheme for face video. This hybrid scheme combines the pixel-level precise recovery capability of traditional coding with the generation capability of deep learning based on abridged information, where Pixel-wise Bi-Prediction, Low-Bitrate-FOM and Lossless Keypoint Encoder collaborate to achieve PSNR...

10.1109/icme52920.2022.9859867 article EN 2022 IEEE International Conference on Multimedia and Expo (ICME) 2022-07-18

Exploring the High-Quality County-Level Development and Governance Response for Farming–Pastoral Ecotone in China: A Case Study of Kulun

Zhe Cheng Anni Tang Jianming Cai Tao Song

As a special territory type, the farming–pastoral ecotone is facing challenges surrounding path creation and high-quality sustainable development. Counties are not only an important spatial unit to promote development, but also part of the modernization of the national governance system. County-level development is a critical driving force and breakthrough in the farming–pastoral ecotone. First, this study systematically reviews progress. Then, it adopts the “Driving Forces-Pressure-State-Impact-Responses” (DPSIR) model...

10.3390/agriculture12122042 article EN cc-by Agriculture 2022-11-29

Memories are One-to-Many Mapping Alleviators in Talking Face Generation

Anni Tang Tianyu He Xu Tan Jun Ling Li Song

Talking face generation aims at generating photo-realistic video portraits of a target person driven by input audio. Due to its nature of one-to-many mapping from the input audio to the output video (e.g., one speech content may have multiple feasible visual appearances), learning a deterministic mapping like previous works brings ambiguity during training, and thus causes inferior visual results. Although this one-to-many mapping could be alleviated in part by a two-stage framework (i.e., an audio-to-expression model followed by a neural-rendering model), it...

10.48550/arxiv.2212.05005 preprint EN cc-by-nc-nd arXiv (Cornell University) 2022-01-01

Efficient Dynamic-NeRF Based Volumetric Video Coding with Rate Distortion Optimization

Zhiyu Zhang Lu Guo Huanxiong Liang Anni Tang Qiang Hu and 1 more

Volumetric videos, benefiting from immersive 3D realism and interactivity, hold vast potential for various applications, while the tremendous data volume poses significant challenges for compression. Recently, NeRF has demonstrated remarkable potential in volumetric video compression thanks to its simple representation and powerful 3D modeling capabilities, where a notable work is ReRF. However, ReRF separates the modeling from the compression process, resulting in suboptimal compression efficiency. In contrast, in this paper, we propose a volumetric video compression method based on dynamic...

10.48550/arxiv.2402.01380 preprint EN arXiv (Cornell University) 2024-02-02

Compositional 3D-aware Video Generation with LLM Director

Hanxin Zhu Tianyu He Anni Tang Junliang Guo Zhibo Chen and 1 more

Significant progress has been made in text-to-video generation through the use of powerful generative models and large-scale internet data. However, substantial challenges remain in precisely controlling individual concepts within the generated video, such as the motion and appearance of specific characters and the movement of viewpoints. In this work, we propose a novel paradigm that generates each concept in 3D representation separately and then composes them with priors from Large Language Models (LLM) and 2D diffusion models...

10.48550/arxiv.2409.00558 preprint EN arXiv (Cornell University) 2024-08-31

SingAvatar: High-fidelity Audio-driven Singing Avatar Synthesis

Wentao Ma Anni Tang Jun Ling Han Xue Huiheng Liao and 2 more

10.1109/icme57554.2024.10687925 article EN 2024 IEEE International Conference on Multimedia and Expo (ICME) 2024-07-15

ViCoFace: Learning Disentangled Latent Motion Representations for Visual-Consistent Face Reenactment

Jun Ling Han Xue Anni Tang Rong Xie Li Song

Unsupervised face reenactment aims to animate a source image to imitate the motions of a target image while retaining the source portrait’s attributes like facial geometry, identity, hair texture, and background. While prior methods can extract the motion from the target image via compact representations (e.g., key-points or latent bases [50]), they are not robust in predicting motion representations that are disentangled with portrait attributes, thus failing to preserve them in cross-subject reenactment. In this work, we propose an effective and cost-efficient approach...

10.1145/3698769 article EN ACM Transactions on Multimedia Computing Communications and Applications 2024-10-04

Rate-aware Compression for NeRF-based Volumetric Video

Z.M. Zhang Guo Lu Huanxiong Liang Zhi‐Lin Cheng Anni Tang and 1 more

10.1145/3664647.3680970 article EN 2024-10-26

Rate-aware Compression for NeRF-based Volumetric Video

Z.M. Zhang Guo Lu Huanxiong Liang Zhi‐Lin Cheng Anni Tang and 1 more

Neural radiance fields (NeRF) have advanced the development of 3D volumetric video technology, but the large data volumes they involve pose significant challenges for storage and transmission. To address these problems, existing solutions typically compress NeRF representations after the training stage, leading to a separation between representation training and compression. In this paper, we try to directly learn a compact NeRF representation in the training stage based on the proposed rate-aware compression framework. Specifically, for volumetric video, we use...

10.48550/arxiv.2411.05322 preprint EN arXiv (Cornell University) 2024-11-07

Spatial Pattern, Classification, and Influencing Factors of High-Quality County Development in China

Zhe Cheng Min Wang Yun Li Qiong Wang Anni Tang

10.1061/jupddm.upeng-5121 article EN Journal of Urban Planning and Development 2024-11-26

Dense 3D Coordinate Code Prior Guidance for High-Fidelity Face Swapping and Face Reenactment

Anni Tang Han Xue Jun Ling Rong Xie Sang Li

In face synthesis tasks, commonly used 2D representations (e.g., landmarks, segmentation maps, etc.) are usually sparse and discontinuous. To combat these shortcomings, we utilize a dense and continuous representation, named Projected Normalized Coordinate Code (PNCC), as the guidance and develop a PNCC-Spatio-Normalization (PSN) method to achieve face synthesis regarding arbitrary head poses and expressions. Based on PSN, we provide an effective framework for the face reenactment and face swapping tasks. To ensure harmonious and seamless swapping,...

10.1109/fg52635.2021.9667065 article EN 2021 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021) 2021-12-15

High-Fidelity Free-View Talking Head Synthesis for Low-Bandwidth Video Conference

Zhiyu Zhang Anni Tang Chen Zhu Lu Guo Rong Xie and 1 more

As video conferencing becomes an indispensable part of people's daily life, how to achieve a high-fidelity calling experience under low bandwidth has been a popular and challenging issue. Deep generative models have great potential in low-bandwidth facial video compression due to their excellent generation capability based on abridged information. Nevertheless, existing deep generation-based methods tend to handle motion information in pure 2D or pseudo-3D space, causing distortion when large head poses are...

10.1109/vcip59821.2023.10402733 article EN 2023 IEEE International Conference on Visual Communications and Image Processing (VCIP) 2023-12-04

Generative Compression for Face Video: A Hybrid Scheme

Anni Tang Yan Huang Jun Ling Zhiyu Zhang Yiwei Zhang and 2 more

As the latest video coding standard, versatile video coding (VVC) has shown its ability in retaining pixel quality. To excavate more compression potential for video conference scenarios under ultra-low bitrate, this paper proposes a bitrate-adjustable hybrid compression scheme for face video. This hybrid scheme combines the pixel-level precise recovery capability of traditional coding with the generation capability of deep learning based on abridged information, where Pixel-wise Bi-Prediction, Low-Bitrate-FOM and Lossless Keypoint Encoder collaborate to achieve PSNR...

10.48550/arxiv.2204.10055 preprint EN other-oa arXiv (Cornell University) 2022-01-01
