Menghan Xia

ORCID: 0000-0001-9664-4967
Research Areas
  • Advanced Vision and Imaging
  • Image Enhancement Techniques
  • Generative Adversarial Networks and Image Synthesis
  • Advanced Image and Video Retrieval Techniques
  • Advanced Image Processing Techniques
  • Video Analysis and Summarization
  • Human Motion and Animation
  • Computer Graphics and Visualization Techniques
  • Robotics and Sensor-Based Localization
  • Advanced Image Fusion Techniques
  • Color Science and Applications
  • Image and Signal Denoising Methods
  • Image Processing Techniques and Applications
  • Visual Attention and Saliency Detection
  • Face Recognition and Analysis
  • Music and Audio Processing
  • Advanced Neural Network Applications
  • Color Perception and Design
  • Video Coding and Compression Technologies
  • Speech and Audio Processing
  • Industrial Vision Systems and Defect Detection
  • Optical Measurement and Interference Techniques
  • Human Pose and Action Recognition
  • Multimodal Machine Learning Applications
  • Infrared Target Detection Methodologies

Tencent (China)
2022-2025

Wuhan Institute of Technology
2023-2025

Kuaishou (China)
2025

Dalian University of Technology
2024

Anqing Normal University
2023

Chinese University of Hong Kong
2018-2022

Renmin University of China
2022

Wuhan University
2015-2019

Central South University
2019

State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing
2017-2019

In this paper, we propose a unified framework to generate a pleasant and high-quality street-view panorama by stitching multiple panoramic images captured from the cameras mounted on a mobile platform. Our proposed method comprises four major steps: image warping, color correction, optimal seam line detection, and blending. Since the input images are captured without a precisely common projection center, from scenes with depth differences of different extents, such images cannot be perfectly aligned in geometry. Therefore, an efficient...

10.3390/s17010001 article EN cc-by Sensors 2016-12-22
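
As a toy illustration of the four-stage pipeline named above, the sketch below wires up simplified stand-ins: a single gain factor for color correction, a greedy seam instead of an optimal seam line, and a hard cut in place of blending. None of these are the paper's actual algorithms.

```python
# Simplified stand-ins for the stitching stages: color correction,
# seam finding in the overlap, and compositing of two warped images.
import numpy as np

def gain_correct(src, ref, mask):
    """Scale src so its mean matches ref inside the overlap mask."""
    g = ref[mask].mean() / max(src[mask].mean(), 1e-6)
    return np.clip(src * g, 0, 255)

def vertical_seam(cost):
    """Greedy low-cost vertical seam through the overlap cost map."""
    h, w = cost.shape
    seam = np.zeros(h, dtype=int)
    seam[0] = int(cost[0].argmin())
    for y in range(1, h):
        x = seam[y - 1]
        lo, hi = max(0, x - 1), min(w, x + 2)
        seam[y] = lo + int(cost[y, lo:hi].argmin())
    return seam

def blend_pair(left, right, overlap_x0):
    """Color-correct the right image, cut along a low-cost seam, composite."""
    h, w = left.shape[:2]
    mask = np.zeros((h, w), dtype=bool)
    mask[:, overlap_x0:] = True
    right = gain_correct(right, left, mask)
    cost = np.abs(left - right).mean(axis=2)[:, overlap_x0:]
    seam = vertical_seam(cost) + overlap_x0
    out = left.copy()
    for y in range(h):
        out[y, seam[y]:] = right[y, seam[y]:]
    return out

a = np.full((4, 8, 3), 100.0)   # toy "warped" inputs that overlap
b = np.full((4, 8, 3), 120.0)   # on the right half of the canvas
print(blend_pair(a, b, overlap_x0=4).shape)
```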

It is a classical task to automatically extract road networks from very high-resolution (VHR) images in remote sensing. This paper presents a novel method for extracting road networks from VHR remotely sensed images in complex urban scenes. Inspired by image segmentation, edge detection, and object skeleton extraction, we develop a multitask convolutional neural network (CNN), called RoadNet, to simultaneously predict road surfaces, edges, and centerlines, which is the first work of its kind in this field. The RoadNet solves seven important issues in this...

10.1109/tgrs.2018.2870871 article EN IEEE Transactions on Geoscience and Remote Sensing 2018-10-24
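
The multitask idea can be sketched as a shared encoder feeding three pixel-wise prediction heads for surfaces, edges, and centerlines, trained with a joint loss. Layer sizes here are illustrative assumptions, not RoadNet's published architecture.

```python
# One shared encoder, three per-pixel heads, one joint loss.
import torch
import torch.nn as nn

class MultiTaskRoadNet(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
        )
        # one 1x1 head per task, each producing a per-pixel logit map
        self.surface = nn.Conv2d(ch, 1, 1)
        self.edge = nn.Conv2d(ch, 1, 1)
        self.centerline = nn.Conv2d(ch, 1, 1)

    def forward(self, x):
        f = self.encoder(x)
        return self.surface(f), self.edge(f), self.centerline(f)

net = MultiTaskRoadNet()
img = torch.randn(1, 3, 128, 128)
s, e, c = net(img)
# joint loss: sum of per-task binary cross-entropies (toy targets here)
target = torch.zeros_like(s)
loss = sum(nn.functional.binary_cross_entropy_with_logits(o, target)
           for o in (s, e, c))
print(s.shape, loss.item())
```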

Speech-driven 3D facial animation has been widely studied, yet there is still a gap to achieving realism and vividness due to the highly ill-posed nature and scarcity of audio-visual data. Existing works typically formulate the cross-modal mapping as a regression task, which suffers from the regression-to-mean problem, leading to over-smoothed facial motions. In this paper, we propose to cast speech-driven facial animation as a code query task in a finite proxy space of a learned codebook, which effectively promotes the vividness of the generated motions by reducing...

10.1109/cvpr52729.2023.01229 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01
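
A minimal sketch of the code-query idea: per-frame speech features are replaced by their nearest entries in a learned motion codebook, so generation stays inside a finite proxy space. Dimensions and names are illustrative assumptions.

```python
# Vector-quantized "code query": snap features to nearest codebook entries.
import torch

def quantize(features, codebook):
    """Replace each feature vector with its nearest codebook entry."""
    # features: (T, D) per-frame motion features; codebook: (K, D)
    d = torch.cdist(features, codebook)      # (T, K) pairwise distances
    idx = d.argmin(dim=1)                    # nearest code per frame
    return codebook[idx], idx

codebook = torch.randn(256, 64)              # K=256 learned motion codes
speech_feats = torch.randn(100, 64)          # T=100 frames from an encoder
motion, codes = quantize(speech_feats, codebook)
print(motion.shape, codes[:5])
```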

10.1109/cvpr52733.2024.00698 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

Creating a vivid video from the event or scenario in our imagination is a truly fascinating experience. Recent advancements in text-to-video synthesis have unveiled the potential to achieve this with prompts only. While text is convenient for conveying the overall scene context, it may be insufficient for controlling the synthesis precisely. In this paper, we explore customized video generation by utilizing text as context description and motion structure (e.g., frame-wise depth) as concrete guidance. Our method, dubbed Make-Your-Video, involves...

10.1109/tvcg.2024.3365804 article EN IEEE Transactions on Visualization and Computer Graphics 2024-01-01

In the image fusion task, the crucial goal is to generate high-quality images that highlight key objects while making the enhanced scenes easier to understand. To complete this task and provide powerful interpretability as well as strong generalization ability, producing pleasing results that also serve downstream vision tasks (such as detection and segmentation), we present a novel interpretable decomposition scheme and develop a target-aware Taylor expansion approximation...

10.1109/tcsvt.2024.3524794 article EN IEEE Transactions on Circuits and Systems for Video Technology 2025-01-01

We present VideoReTalking, a new system to edit the faces of a real-world talking head video according to input audio, producing a high-quality and lip-synced output video even with a different emotion. Our system disentangles this objective into three sequential tasks: (1) face video generation with a canonical expression; (2) audio-driven lip-sync; and (3) face enhancement for improving photo-realism. Given a talking-head video, we first modify the expression of each frame to the same template expression using an expression editing network, resulting in a video with the canonical expression. This...

10.1145/3550469.3555399 article EN 2022-11-29
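
The three-stage decomposition can be shown as a simple data-flow sketch, with hypothetical callables standing in for the expression-editing, lip-sync, and enhancement networks.

```python
# Data flow only: each stage is a stand-in callable, not the real network.
def video_retalking_pipeline(frames, audio, expression_edit, lip_sync, enhance):
    canonical = [expression_edit(f) for f in frames]   # stage 1: template expression
    synced = lip_sync(canonical, audio)                # stage 2: audio-driven lip-sync
    return [enhance(f) for f in synced]                # stage 3: photo-realism enhancement

# toy usage with identity functions standing in for the three networks
frames = ["frame0", "frame1"]
out = video_retalking_pipeline(frames, audio=None,
                               expression_edit=lambda f: f,
                               lip_sync=lambda fs, a: fs,
                               enhance=lambda f: f)
print(out)
```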

Image inpainting aims to fill the missing holes of an input image. It is hard to solve this task efficiently when facing high-resolution images due to two reasons: (1) a large receptive field needs to be handled for high-resolution image inpainting; (2) the general encoder-decoder network synthesizes many background pixels synchronously in matrix form. In this paper, we try to break these limitations for the first time, thanks to the recent development of continuous implicit representation. In detail, we down-sample and encode the degraded image to produce spatial-adaptive...

10.1609/aaai.v37i2.25263 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2023-06-26
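
A sketch of decoding with a continuous implicit representation, in the spirit of the approach described: a small MLP predicts RGB at arbitrary continuous coordinates from interpolated low-resolution features, so high-resolution pixels can be queried point by point instead of synthesized as one fixed matrix. Shapes and layers are assumptions.

```python
# Query RGB at continuous coordinates from a low-res feature map.
import torch
import torch.nn as nn
import torch.nn.functional as F

mlp = nn.Sequential(nn.Linear(32 + 2, 64), nn.ReLU(), nn.Linear(64, 3))

def query_rgb(feat, coords):
    # feat: (1, 32, h, w) encoded low-res image; coords: (N, 2) in [-1, 1]
    grid = coords.view(1, 1, -1, 2)                       # sampling positions
    z = F.grid_sample(feat, grid, align_corners=False)    # (1, 32, 1, N)
    z = z.squeeze(0).squeeze(1).t()                       # (N, 32) per-point features
    return mlp(torch.cat([z, coords], dim=1))             # (N, 3) RGB predictions

feat = torch.randn(1, 32, 16, 16)          # encoded low-res degraded image
coords = torch.rand(4096, 2) * 2 - 1       # continuous high-res query points
print(query_rgb(feat, coords).shape)
```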

Video generation has increasingly gained interest in both academia and industry. Although commercial tools can generate plausible videos, there is a limited number of open-source models available for researchers and engineers. In this work, we introduce two diffusion models for high-quality video generation, namely a text-to-video (T2V) model and an image-to-video (I2V) model. The T2V model synthesizes a video based on a given text input, while the I2V model incorporates an additional image input. Our proposed T2V model can generate realistic and cinematic-quality videos...

10.48550/arxiv.2310.19512 preprint EN cc-by arXiv (Cornell University) 2023-01-01
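
A generic sketch of how a conditioned diffusion model turns noise into a video latent; the update rule is deliberately simplified and the denoiser is a stand-in, not VideoCrafter's network. An I2V variant would pass an extra image embedding alongside the text embedding.

```python
# Iterative denoising from pure noise under text conditioning (toy update).
import torch

def sample_video(denoiser, text_emb, steps=50, shape=(1, 4, 16, 32, 32)):
    x = torch.randn(shape)                       # start from pure noise
    for t in reversed(range(steps)):
        eps = denoiser(x, t, text_emb)           # predicted noise at step t
        x = x - eps / steps                      # simplified update rule
    return x

fake_denoiser = lambda x, t, c: 0.1 * x          # stand-in network
latent = sample_video(fake_denoiser, text_emb=torch.randn(1, 77, 768))
print(latent.shape)                              # (batch, ch, frames, h, w)
```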

Talent is an important strategic resource for regional economic development. Against the background of “the talent war” that has broken out between various cities in recent years, this study empirically verified the influence of talent policy on urban innovation in 277 prefecture-level cities in China from 2010 to 2019 using a multi-period difference-in-differences model. The results indicated that the policy shock caused by the talent war positively influenced urban innovation, causing, for instance, a dramatic increase in the number of patents and inventions. Among the subsidy...

10.3390/land11091485 article EN cc-by Land 2022-09-05
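
The multi-period difference-in-differences setup can be sketched as a two-way fixed-effects regression on a simulated city-year panel; the data and variable names below are illustrative, not the study's.

```python
# Two-way fixed effects DiD on a toy city-year panel with staggered adoption.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for c in range(30):                              # 30 toy cities
    adopt = rng.integers(2012, 2022)             # staggered (some never adopt in-panel)
    for y in range(2010, 2020):
        treated = int(y >= adopt)
        rows.append({"city": c, "year": y, "treated": treated,
                     "patents": 1.0 + 0.5 * treated + rng.normal()})
df = pd.DataFrame(rows)

# city and year dummies absorb level differences; 'treated' is the effect
model = smf.ols("patents ~ treated + C(city) + C(year)", data=df).fit()
print(model.params["treated"])                   # estimated policy effect
```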

Fisheye image rectification and estimation of intrinsic parameters for real scenes have been addressed in the literature by using line information on distorted images. In this paper, we propose an easily implemented fisheye rectification algorithm with line constraints on the undistorted perspective plane. A novel Multi-Label Energy Optimization (MLEO) method is adopted to merge short circular arcs sharing the same or approximately the same circle and to select long arcs for camera rectification. Furthermore, an efficient method estimates the camera parameters by automatically selecting three...

10.1109/cvpr.2015.7299041 article EN 2015-06-01

Colorization is multimodal by nature and challenges existing frameworks to achieve colorful and structurally consistent results. Even the sophisticated autoregressive model struggles to maintain long-distance color consistency due to the fragility of sequential dependence. To overcome this challenge, we propose a novel colorization framework that disentangles multimodality and structure through global color anchors, so that both aspects can be learned effectively. Our key insight is that several carefully located anchors...

10.1145/3550454.3555432 article EN ACM Transactions on Graphics 2022-11-30
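
A sketch of the anchor insight: predict colors only at a few anchor locations, then propagate them to all pixels by feature affinity, avoiding long-range sequential dependence. All components are simplified stand-ins for illustration.

```python
# Propagate colors from sparse anchors to every pixel via feature affinity.
import torch

def propagate_from_anchors(feats, anchor_idx, anchor_colors, tau=0.1):
    # feats: (N, D) per-pixel features; anchor_idx: (K,); anchor_colors: (K, 3)
    sim = feats @ feats[anchor_idx].t()          # (N, K) affinity to anchors
    w = torch.softmax(sim / tau, dim=1)          # soft assignment per pixel
    return w @ anchor_colors                     # (N, 3) propagated colors

feats = torch.randn(1024, 64)                    # toy per-pixel features
anchor_idx = torch.randint(0, 1024, (8,))        # 8 sparsely located anchors
anchor_colors = torch.rand(8, 3)                 # colors predicted at anchors
print(propagate_from_anchors(feats, anchor_idx, anchor_colors).shape)
```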

10.1016/j.isprsjprs.2017.11.012 article EN ISPRS Journal of Photogrammetry and Remote Sensing 2017-11-24

Once a color image is converted to grayscale, it is a common belief that the original color cannot be fully restored, even with state-of-the-art colorization methods. In this paper, we propose an innovative method to synthesize an invertible grayscale: a grayscale image that can restore its original color. The key idea here is to encode the original color information into the synthesized grayscale in a way that users cannot recognize any anomalies. We learn and embed the color-encoding scheme via a convolutional neural network (CNN), which consists of an encoding network to convert a color image to grayscale and a decoding network to invert the grayscale back to color. Then...

10.1145/3272127.3275080 article EN ACM Transactions on Graphics 2018-11-28
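
The encode-then-decode scheme can be sketched as two small CNNs tied by a reconstruction loss plus a term keeping the synthesized grayscale close to plain luminance; layer sizes are illustrative assumptions, not the paper's network.

```python
# Encoder: color -> 1-channel grayscale; decoder: grayscale -> color.
import torch
import torch.nn as nn

encode = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                       nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid())
decode = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                       nn.Conv2d(16, 3, 3, padding=1), nn.Sigmoid())

color = torch.rand(1, 3, 64, 64)
# plain luminance reference the grayscale should stay visually close to
luma = (color * torch.tensor([0.299, 0.587, 0.114]).view(1, 3, 1, 1)).sum(1, keepdim=True)

gray = encode(color)                        # grayscale with hidden color cues
restored = decode(gray)                     # invert back to color
loss = nn.functional.mse_loss(restored, color) + \
       nn.functional.mse_loss(gray, luma)   # reconstruction + luminance term
print(gray.shape, restored.shape, loss.item())
```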

This paper presents the idea of mono-nizing binocular videos and a framework to effectively realize it. Mono-nize means that we purposely convert a binocular video into a regular monocular video with the stereo information implicitly encoded in a visual but nearly-imperceptible form. Hence, we can impartially distribute and show the mononized video as an ordinary monocular video. Unlike ordinary monocular videos, we can restore from it the original binocular video and show it on a stereoscopic display. To start, we formulate an encoding-and-decoding framework...

10.1145/3414685.3417764 article EN ACM Transactions on Graphics 2020-11-27

Accurate story visualization requires several necessary elements, such as identity consistency across frames, alignment between plain text and visual content, and a reasonable layout of objects in images. Most previous works endeavor to meet these requirements by fitting a text-to-image (T2I) model on a set of videos in the same style and with the same characters, e.g., the FlintstonesSV dataset. However, the learned T2I models typically struggle to adapt to new characters, scenes, and styles, and often lack the flexibility to revise the synthesized images. This paper...

10.1145/3610548.3618184 article EN cc-by 2023-12-10

We introduce ToonCrafter, a novel approach that transcends traditional correspondence-based cartoon video interpolation, paving the way for generative cartoon interpolation. Traditional methods, which implicitly assume linear motion and the absence of complicated phenomena like dis-occlusion, often struggle with the exaggerated non-linear and large motions with occlusion commonly found in cartoons, resulting in implausible or even failed interpolation results. To overcome these limitations, we explore the potential of adapting...

10.1145/3687761 article EN other-oa ACM Transactions on Graphics 2024-11-19

Remote photoplethysmography (rPPG) aims to measure non-contact physiological signals from facial videos and has shown great potential in many applications. Most existing methods directly extract video-based rPPG features by designing neural networks for heart rate estimation. Although they can achieve acceptable results, the recovery of the rPPG signal faces intractable challenges when interference from real-world scenarios appears in the video. Specifically, facial videos are inevitably affected...

10.1109/jbhi.2025.3540134 article EN IEEE Journal of Biomedical and Health Informatics 2025-01-01

Color consistency correction is a challenging problem in image stitching because it involves several factors, including tone, contrast, and fidelity, in presenting a natural appearance. In this paper, we propose an effective color correction method that optimizes the color consistency across images while guaranteeing the imaging quality of each individual image. Our method first applies well-directed alteration detection algorithms to find coherent-content regions in the inter-image overlaps, from which reliable color correspondences are extracted. Then,...

10.1109/iccvw.2017.351 article EN 2017-10-01
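
As a toy version of optimizing consistency across images, the sketch below solves per-image gains in the least-squares sense so that matched overlap regions agree; the paper's correction model is far richer than a single gain per image.

```python
# Solve global per-image gains so corresponding overlap regions match.
import numpy as np

# each tuple: (image i, image j, mean intensity in i, mean intensity in j)
pairs = [(0, 1, 120.0, 100.0), (1, 2, 110.0, 130.0), (0, 2, 125.0, 140.0)]
n = 3
A, b = [], []
for i, j, mi, mj in pairs:
    row = np.zeros(n)
    row[i], row[j] = mi, -mj          # want g_i * mi == g_j * mj
    A.append(row); b.append(0.0)
A.append(np.ones(n)); b.append(n)     # anchor: gains average to 1
g, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
print(g)                              # per-image correction gains
```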

Manga inpainting fills in the disoccluded pixels left by the removal of dialogue balloons or “sound effect” text. This process has long been needed by the industry for language localization and conversion to animated manga. It is mostly done manually, as existing methods (mostly for natural image inpainting) cannot produce satisfying results. Manga inpainting is trickier than natural image inpainting because of its highly abstract illustration using structural lines and screentone patterns, which confuses semantic interpretation and visual content synthesis. In this...

10.1145/3450626.3459822 article EN ACM Transactions on Graphics 2021-07-19