NFDI4DS | UHH-SEMS - Publication Details

Kunpeng Song

ORCID: 0009-0009-2439-4263

About

Contact & Profiles

A5091015171

Research Areas

Generative Adversarial Networks and Image Synthesis
Computer Graphics and Visualization Techniques
Advanced Vision and Imaging
Video Analysis and Summarization
Advanced Data Compression Techniques
Multimodal Machine Learning Applications
Advanced Image Processing Techniques
Cancer-related molecular mechanisms research
Natural Language Processing Techniques
Advanced Image and Video Retrieval Techniques
Image Retrieval and Classification Techniques
Whipple's Disease and Interleukins
Wireless Communication Security Techniques
Molecular Communication and Nanonetworks
IoT and Edge/Fog Computing
Human Pose and Action Recognition
Antenna Design and Analysis
Advanced Wireless Communication Technologies
Digital Media Forensic Detection
Domain Adaptation and Few-Shot Learning
Age of Information Optimization
Handwritten Text Recognition Techniques
Musicology and Musical Analysis
Aesthetic Perception and Analysis
Advanced Steganography and Watermarking Techniques

China Academy of Space Technology
2025

Peking University
2023

Rutgers Sexual and Reproductive Health and Rights
2021-2022

Rutgers, The State University of New Jersey
2021

Chinese Academy of Sciences
2017

BestConfig

OPENALEX - Publications

Yuqing Zhu Jianxun Liu Mengying Guo Yungang Bao Wenlong Ma and 3 more

An ever increasing number of configuration parameters are provided to system users. But many users have used one configuration setting across different workloads, leaving untapped the performance potential of systems. A good configuration setting can greatly improve the performance of a deployed system under certain workloads. But with tens or hundreds of parameters, it becomes a highly costly task to decide which configuration setting leads to the best performance. While such a task requires strong expertise in both the system and the application, users commonly lack such expertise. To help users tap the performance potential of systems, we present BestConfig,...

10.1145/3127479.3128605 preprint EN 2017-09-24

TIME: Text and Image Mutual-Translation Adversarial Networks

OPENALEX - Publications

Bingchen Liu Kunpeng Song Yizhe Zhu Gerard de Melo Ahmed Elgammal

Focusing on text-to-image (T2I) generation, we propose Text and Image Mutual-Translation Adversarial Networks (TIME), a lightweight but effective model that jointly learns a T2I generator G and an image captioning discriminator D under the Generative Adversarial Network framework. While previous methods tackle the T2I problem as a uni-directional task and use pre-trained language models to enforce image-text consistency, TIME requires neither extra modules nor pre-training. We show that the performance of G can be boosted...

10.1609/aaai.v35i3.16305 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2021-05-18

Deep‐Learning‐Based Pilot Optimization for Near‐Field Channel Estimation in Ultra‐Massive MIMO

OPENALEX - Publications

Zexian Chen Kunpeng Song Zhengwei Qu Yanjiao Zhang Yongjia Shang

As sixth‐generation (6G) communication technology evolves, the increase in frequency and number of antennas has made traditional far‐field channel estimation methods less effective. This paper proposes a deep neural network (DNN)‐based method to optimize pilot signals for near‐field channel estimation in ultra‐massive multiple‐input multiple‐output (MIMO) systems. By optimizing the pilot signals, the method can accurately estimate the distance and angle of scatterers, addressing the challenges of sparse techniques. Simulation results...

10.1049/ell2.70227 article EN cc-by-nc-nd Electronics Letters 2025-01-01

Self-Supervised Sketch-to-Image Synthesis

OPENALEX - Publications

Bingchen Liu Yizhe Zhu Kunpeng Song Ahmed Elgammal

Imagining a colored realistic image from an arbitrarily drawn sketch is one of the human capabilities that we are eager for machines to mimic. Unlike previous methods that either require sketch-image pairs or utilize low-quantity detected edges as sketches, we study the exemplar-based sketch-to-image (s2i) synthesis task in a self-supervised learning manner, eliminating the necessity of paired data. To this end, we first propose an unsupervised method to efficiently synthesize line-sketches for general RGB-only datasets. With...

10.1609/aaai.v35i3.16304 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2021-05-18

Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis

OPENALEX - Publications

Bingchen Liu Yizhe Zhu Kunpeng Song Ahmed Elgammal

Training Generative Adversarial Networks (GAN) on high-fidelity images usually requires large-scale GPU-clusters and a vast number of training images. In this paper, we study the few-shot image synthesis task for GAN with minimum computing cost. We propose a light-weight GAN structure that gains superior quality on 1024*1024 resolution. Notably, the model converges from scratch with just a few hours of training on a single RTX-2080 GPU, and has a consistent performance, even with less than 100 training samples. Two technique designs constitute our...

10.48550/arxiv.2101.04775 preprint EN other-oa arXiv (Cornell University) 2021-01-01

MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation

OPENALEX - Publications

Kunpeng Song Yizhe Zhu Bingchen Liu Qing Yan Ahmed Elgammal and 1 more

In this paper, we present MoMA: an open-vocabulary, training-free personalized image model that boasts flexible zero-shot capabilities. As foundational text-to-image models rapidly evolve, the demand for robust image-to-image translation grows. Addressing this need, MoMA specializes in subject-driven personalized image generation. Utilizing an open-source, Multimodal Large Language Model (MLLM), we train MoMA to serve a dual role as both a feature extractor and a generator. This approach effectively synergizes reference image and text...

10.48550/arxiv.2404.05674 preprint EN arXiv (Cornell University) 2024-04-08

A single-stage automatic license plate recognition network with Balanced-IoU loss

OPENALEX - Publications

Sen Liu Yufei Xie Longbin Wu Kunpeng Song Kaiwen Gong and 1 more

In this paper, we propose a license plate recognition model, which can detect and recognize the license plate in a single forward pass. The features of the input image are extracted by our 15-layer convolutional neural network. In the detection branch, we use a loss function with better nonlinearity to fit the detection process of the license plate. To capture the location with less information loss, we add Intersection over Ground-truth (IoG) into Intersection over Union (IoU) to get the Balanced-IoU (BIoU) loss. The combination of these two loss functions makes the model give a better predictive result. We introduce an...

10.1088/1742-6596/2504/1/012039 article EN Journal of Physics Conference Series 2023-05-01

DirectorLLM for Human-Centric Video Generation

OPENALEX - Publications

Kunpeng Song Tingbo Hou Zhisong He Haoyu Ma Jialiang Wang and 10 more

In this paper, we introduce DirectorLLM, a novel video generation model that employs a large language model (LLM) to orchestrate human poses within videos. As foundational text-to-video models rapidly evolve, the demand for high-quality human motion and interaction grows. To address this need and enhance the authenticity of human motions, we extend the LLM from a text generator to a video director and human motion simulator. Utilizing open-source resources from Llama 3, we train the DirectorLLM to generate detailed instructional signals, such as human poses, to guide video generation. This...

10.48550/arxiv.2412.14484 preprint EN arXiv (Cornell University) 2024-12-18

Sketch-to-Art

OPENALEX - Publications

Ahmed Elgammal Bingchen Liu Kunpeng Song

Sketch-to-Art is an AI tool that allows creatives to sketch an idea and get fully rendered images, stylized the way they want, in real time. Users can define a style by choosing a reference image or a group of images, or by selecting an artist or an art movement.

10.1145/3407662.3407757 article EN 2020-08-14

Sketch-to-Art: Synthesizing Stylized Art Images From Sketches

OPENALEX - Publications

Bingchen Liu Kunpeng Song Ahmed Elgammal

We propose a new approach for synthesizing fully detailed art-stylized images from sketches. Given a sketch, with no semantic tagging, and a reference image of a specific style, the model can synthesize meaningful details with colors and textures. The model consists of three modules designed explicitly for better artistic style capturing and generation. Based on a GAN framework, a dual-masked mechanism is introduced to enforce the content constraints (from the sketch), and a feature-map transformation technique is developed to strengthen...

10.48550/arxiv.2002.12888 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Self-Supervised Sketch-to-Image Synthesis

OPENALEX - Publications

Bingchen Liu Yizhe Zhu Kunpeng Song Ahmed Elgammal

Imagining a colored realistic image from an arbitrarily drawn sketch is one of the human capabilities that we are eager for machines to mimic. Unlike previous methods that either require sketch-image pairs or utilize low-quantity detected edges as sketches, we study the exemplar-based sketch-to-image (s2i) synthesis task in a self-supervised learning manner, eliminating the necessity of paired data. To this end, we first propose an unsupervised method to efficiently synthesize line-sketches for general RGB-only datasets. With...

10.48550/arxiv.2012.09290 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Using IRS to Improve the Secrecy Rate of Millimeter Wave Communication System

OPENALEX - Publications

Kunpeng Song Fangshu Ma Zexian Chen Sen Liu Yong Shang and 1 more

With the development of 6G, millimeter wave communication has received extensive attention. Due to the characteristics of wireless transmission, information secrecy transmission is facing significant challenges. This paper uses physical layer security (PLS) to explore secrecy transmission. Specifically, we use an Intelligent Reflecting Surface (IRS) to control the propagation environment and improve the secrecy rate of millimeter wave communication. The active beamforming matrix of the base station and the passive beamforming of the IRS are optimized to achieve the maximum secrecy rate. We deduce...

10.1109/vtc2023-spring57618.2023.10200911 article EN 2023 IEEE 97th Vehicular Technology Conference (VTC2023-Spring) 2023-06-01

From Voice to Visualization ― Visual Analysis of Voice Data of Shandong Tax Hotline Based on NLP

OPENALEX - Publications

Kunpeng Song Qinggang Meng Rui-peng JIANG Bo Ni Xiao-xiao QU and 1 more

This paper presents a method through which we can realize the procedure of visual analysis voice data. We first transform amounts audio files gained by tax hotline 12366 to text using Baidu speech recognition service, and divide these into words phrases via 'Chinese Segmentation'. propose NLP algorithm that select keywords from 'Word2Vec' contradiction-handling method. These serve as indications classification model, is service requirement. Then, some results are visualized in form...

10.12783/dtcse/aiie2017/18217 article EN DEStech Transactions on Computer Science and Engineering 2018-02-12

TIME: Text and Image Mutual-Translation Adversarial Networks

OPENALEX - Publications

Bingchen Liu Kunpeng Song Yizhe Zhu Gerard de Melo Ahmed Elgammal

10.48550/arxiv.2005.13192 preprint EN other-oa arXiv (Cornell University) 2020-01-01
