NFDI4DS | UHH-SEMS - Publication Details

Qi Chen

ORCID: 0009-0000-7982-9329

About

Contact & Profiles

A5100340138

Research Areas

Speech and Audio Processing
Caching and Content Delivery
Music and Audio Processing
Peer-to-Peer Network Technologies
Advanced MIMO Systems Optimization
Image Processing Techniques and Applications
Generative Adversarial Networks and Image Synthesis
IoT and Edge/Fog Computing
Face recognition and analysis
Blind Source Separation Techniques
Adversarial Robustness in Machine Learning
Advanced Adaptive Filtering Techniques
Recommender Systems and Techniques
Video Analysis and Summarization
Advanced Research in Science and Engineering
Speech Recognition and Synthesis
Energy Harvesting in Wireless Networks
Image Processing and 3D Reconstruction
Age of Information Optimization
Cooperative Communication and Network Coding
Digital Media Forensic Detection
Advanced Computational Techniques and Applications

Shanghai University of Political Science and Law
2023-2024

PLA Information Engineering University
2020

Privacy-Preserving Resource Management for Distributed Collaborative Edge Caching Systems

OPENALEX - Publications

Qi Chen Yitu Wang Wei Wang Takayuki Nakachi Zhaoyang Zhang

10.1109/jiot.2024.3437452 article EN IEEE Internet of Things Journal 2024-08-02

Content-Caching-Oriented Popularity Forecast and User Clustering

Yitu Wang Qi Chen Wei Wang Takayuki Nakachi Guangchen Zhang and 1 more

10.1109/jiot.2024.3446591 article EN IEEE Internet of Things Journal 2024-08-20

Experimental research of precise terrain occlusion prediction algorithm based on image sequences of the Chang’E-4 lunar rover

Youqing Ma Peng Song Bo Wen Yang Jia Zhenrong Shen and 6 more

10.1360/n092018-00386 article EN Scientia Sinica Technologica 2019-09-24

DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder

Chenpeng Du Qi Chen Tianyu He Xu Tan Xie Chen and 3 more

While recent research has made significant progress in speech-driven talking face generation, the quality of the generated video still lags behind that of real recordings. One reason for this is the use of handcrafted intermediate representations like facial landmarks and 3DMM coefficients, which are designed based on human knowledge and are insufficient to precisely describe facial movements. Additionally, these methods require an external pretrained model for extracting these representations, whose performance sets an upper bound...

10.48550/arxiv.2303.17550 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Learning-efficient Transmission Scheduling for Distributed Knowledge-aware Edge Learning

Qi Chen Zhilian Zhang Wei Wang Zhaoyang Zhang

Edge learning is a promising enabler to leverage distributed local data for powering artificial intelligence at the network edge. Moreover, incorporating external domain knowledge into purely data-driven models can further enhance performance. In this paper, by taking the benefits of both edge learning and knowledge fusion, we propose a novel knowledge-aware edge learning framework, in which edge devices individually train models with the assistance of local knowledge bases and a global knowledge base at the edge server. Due to limited cache capability, each device holds only a small-scale knowledge base, which restricts...

10.1109/wcnc55385.2023.10119099 article EN 2023 IEEE Wireless Communications and Networking Conference (WCNC) 2023-03-01

Speaker Diarization Based on Improved Loss Functions

Zhefei Yuan Lianhai Zhang Xukui Yang Qi Chen

Speaker embeddings need to be compact within each class and well separated between classes, while the traditional cross-entropy loss with softmax only guarantees separability, resulting in dispersion of the learned features and poor generalization in the measurement space. Therefore, from the perspective of enhancing the discrimination of speaker embeddings, we improve the speaker embedding model of the system, which effectively improves the segmentation and clustering performance of the system. On the one hand, we introduce AM-Softmax...

10.1145/3436369.3437440 article EN 2020-10-30

V2C: Visual Voice Cloning

Qi Chen Yuanqing Li Yuankai Qi Jiaqiu Zhou Mingkui Tan and 1 more

Existing Voice Cloning (VC) tasks aim to convert a paragraph of text to speech with a desired voice specified by a reference audio. This has significantly boosted the development of artificial speech applications. However, there also exist many scenarios that cannot be well reflected by these VC tasks, such as movie dubbing, which requires the speech to carry emotions consistent with the movie plots. To fill this gap, in this work we propose a new task named Visual Voice Cloning (V2C), which seeks to convert a paragraph of text to speech with both a desired voice specified by a reference audio and a desired emotion specified by a reference video. To facilitate research in this field, we construct a dataset,...

10.48550/arxiv.2111.12890 preprint EN other-oa arXiv (Cornell University) 2021-01-01
