NFDI4DS | UHH-SEMS - Publication Details

Heng Wang

ORCID: 0009-0009-5473-5751

About

Contact & Profiles

OpenAlex ID: A5100453993

Research Areas

Advanced Vision and Imaging
Image Processing Techniques and Applications
Cell Image Analysis Techniques
Multimodal Machine Learning Applications
Speech and Audio Processing
Advanced Image and Video Retrieval Techniques
Generative Adversarial Networks and Image Synthesis
Video Surveillance and Tracking Methods
Human Pose and Action Recognition
Advanced Image Processing Techniques
Advanced Neural Network Applications
Image and Signal Denoising Methods
Video Analysis and Summarization
Domain Adaptation and Few-Shot Learning
Medical Image Segmentation Techniques
Music and Audio Processing
Computer Graphics and Visualization Techniques
Advanced Data Compression Techniques
Digital Imaging for Blood Diseases
AI in cancer detection
Image Enhancement Techniques
Retinal Imaging and Analysis
Advanced Fluorescence Microscopy Techniques
Hand Gesture Recognition Systems
Face recognition and analysis

The University of Sydney
2018-2025

East China Jiaotong University
2024

Jingchu University of Technology
2020-2024

University of Electronic Science and Technology of China
2024

Shanghai Electric (China)
2024

Henan University of Technology
2024

Ningxia University
2023

Tianjin University
2023

Texas Instruments (United States)
2023

Inner Mongolia University
2022

Face Aging Effect Simulation Using Hidden Factor Analysis Joint Sparse Representation

OPENALEX - Publications

Hongyu Yang Di Huang Yunhong Wang Heng Wang Yuanyan Tang

Face aging simulation has received rising investigations nowadays, whereas it still remains a challenge to generate convincing and natural age-progressed face images. In this paper, we present a novel approach to such an issue by using hidden factor analysis joint sparse representation. In contrast to the majority of tasks in the literature that handle facial texture integrally, the proposed approach separately models person-specific properties that tend to be stable over a relatively long period and age-specific clues that change gradually...

10.1109/tip.2016.2547587 article EN IEEE Transactions on Image Processing 2016-03-28

Segmenting Neuronal Structure in 3D Optical Microscope Images via Knowledge Distillation with Teacher-Student Network

Heng Wang Donghao Zhang Yang Song Siqi Liu Yue Wang and 3 more

Three-dimensional (3D) volumetric neural image segmentation is crucial to reconstructing accurate neuron structures. However, due to the structural complexity of neurons and the diverse imaging qualities of microscopes, it is challenging to achieve both accuracy and efficiency. In this paper, we propose a teacher-student learning framework for fast neuron segmentation. The inference is performed using a lightweight student network, which benefits from knowledge distillation from a teacher network with higher capacity. Evaluated on...

10.1109/isbi.2019.8759326 article EN 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI) 2019-04-01

Devon: Deformable Volume Network for Learning Optical Flow

Yao Lu Jack Valmadre Heng Wang Juho Kannala Mehrtash Harandi and 1 more

State-of-the-art neural network models estimate large displacement optical flow in multi-resolution and use warping to propagate the estimation between two resolutions. Despite their impressive results, it is known that there are two problems with the approach. First, the multi-resolution estimation of optical flow fails in situations where small objects move fast. Second, warping creates artifacts when occlusion or dis-occlusion happens. In this paper, we propose a new neural network module, Deformable Cost Volume, which alleviates the two problems. Based on this module, we designed the Deformable Volume...

10.1109/wacv45572.2020.9093590 article EN 2020-03-01

Spatiality-guided Transformer for 3D Dense Captioning on Point Clouds

Heng Wang Chaoyi Zhang Jianhui Yu Weidong Cai

Dense captioning in 3D point clouds is an emerging vision-and-language task involving object-level 3D scene understanding. Apart from coarse semantic class prediction and bounding box regression as in traditional 3D object detection, 3D dense captioning aims at producing a further and finer instance-level label of natural language description on visual appearance and spatial relations for each scene object of interest. To detect and describe objects in a scene, following the spirit of neural machine translation, we propose a transformer-based...

10.24963/ijcai.2022/194 article EN Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence 2022-07-01

Temporal Perceiving Video-Language Pre-training

Fan Ma Xiaojie Jin Heng Wang Jingjia Huang Linchao Zhu and 2 more

Video-Language Pre-training models have recently significantly improved various multi-modal downstream tasks. Previous dominant works mainly adopt contrastive learning to achieve global feature alignment across modalities. However, the local associations between videos and texts are not modeled, restricting the pre-training models' generality, especially for tasks requiring the temporal video boundary for certain query texts. This work introduces a novel text-video localization pre-text task to enable...

10.48550/arxiv.2301.07463 preprint EN cc-by arXiv (Cornell University) 2023-01-01

PAniC-3D: Stylized Single-view 3D Reconstruction from Portraits of Anime Characters

Shuhong Chen Kevin Zhang Yichun Shi Heng Wang Yiheng Zhu and 5 more

We propose PAniC-3D, a system to reconstruct stylized 3D character heads directly from illustrated (p)ortraits of (ani)me (c)haracters. Our anime-style domain poses unique challenges to single-view reconstruction; compared to natural images of human heads, character portrait illustrations have hair and accessories with more complex and diverse geometry, and are shaded with non-photorealistic contour lines. In addition, there is a lack of both 3D model and portrait illustration data suitable to train and evaluate this ambiguous stylized reconstruction task....

10.1109/cvpr52729.2023.02018 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

Enhancing long-distance speech signals using PDMA speech enhancement system

Tao Zhang Rongze Xia Heng Wang MD Sazzad Hossen Yanzhang Geng

10.1016/j.apacoust.2025.110736 article EN Applied Acoustics 2025-04-14

Learning a Deep Vector Quantization Network for Image Compression

Xiaotong Lu Heng Wang Weisheng Dong Fangfang Wu Zhonglong Zheng and 1 more

Deep convolutional neural network (DCNN) based image codecs, consisting of encoder, quantizer and decoder, have achieved promising image compression results. The major challenge in learning these DCNN models lies in the joint optimization of the encoder, quantizer and decoder, as well as the adaptivity to the input images. In this paper, we proposed a DCNN architecture for image compression, where the encoder, quantizer and decoder are jointly learned. Specifically, a fully convolutional vector quantization network (VQNet) has been proposed to quantize the feature vectors of the image representation, where the representative vectors of the VQNet are jointly optimized with...

10.1109/access.2019.2934731 article EN cc-by IEEE Access 2019-01-01

Analysis of Time Characteristics during the Breakdown Process of Argon Gas Switches

Zhaoxiang Wang Guisheng Jiang Yijun Zheng Hongliang Ma Yu Liu and 6 more

This study establishes a two-dimensional fluid theoretical model for a two-electrode spark gap switch, linking the behavior of microscopic particles with the macroscopic discharge phase through multiscale dynamic coupling. It further investigates the temporal characteristics of the switch's conductive process and streamer evolution from the perspective of microscopic particles. Using the finite element analysis method, the effects of factors such as gas pressure, operating voltage, electrode spacing, and electrode curvature on the time characteristics of key...

10.1088/1361-6463/adc69b article EN Journal of Physics D Applied Physics 2025-03-28

Dance any Beat: Blending Beats with Visuals in Dance Video Generation

Xuanchen Wang Heng Wang Dongnan Liu Weidong Cai

10.1109/wacv61041.2025.00502 article EN 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2025-02-26

Multiscale Kernels for Enhanced U-Shaped Network to Improve 3D Neuron Tracing

Heng Wang Donghao Zhang Yang Song Siqi Liu Heng Huang and 3 more

Digital neuron morphology reconstruction from three-dimensional (3D) volumetric optical microscope images is an important procedure to rebuild the connections and structures of neural circuits. Even though many approaches have been proposed to achieve precise neuron tracing, it is still a challenging task, especially when the images are polluted by noise or have discontinuity in their structures. In this paper, we propose a new framework to overcome these issues by performing segmentation prior to tracing. Our framework adopts a novel 3D...

10.1109/cvprw.2019.00144 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2019-06-01

Direct observation of nodeless superconductivity and phonon modes in electron-doped copper oxide Sr$_{1-x}$Nd$_x$CuO$_2$

Jia-Qi Fan Xue-Qing Yu Fang-Jun Cheng Heng Wang Rui-Feng Wang and 6 more

The microscopic understanding of high-temperature superconductivity in cuprates has been hindered by the apparent complexity of the crystal structures of these materials. We used scanning tunneling microscopy and spectroscopy to study an electron-doped copper oxide compound Sr$_{1-x}$Nd$_x$CuO$_2$ that has only bare cations separating the CuO$_2$ planes and thus the simplest infinite-layer structure among all cuprate superconductors. Tunneling conductance spectra of the major CuO$_2$ planes in the superconducting state revealed direct evidence...

10.1093/nsr/nwab225 article EN cc-by National Science Review 2021-12-13

A speech separation algorithm based on the comb-filter effect

Tao Zhang Heng Wang Yanzhang Geng Xin Zhao Lingguo Kong

10.1016/j.apacoust.2022.109197 article EN Applied Acoustics 2023-01-05

GPT-4V(ision) as a Generalist Evaluator for Vision-Language Tasks

Xinlu Zhang Yujie Lu Weizhi Wang Yan An Jun Yan and 5 more

Automatically evaluating vision-language tasks is challenging, especially when it comes to reflecting human judgments due to limitations in accounting for fine-grained details. Although GPT-4V has shown promising results in various multi-modal tasks, leveraging GPT-4V as a generalist evaluator for these tasks has not yet been systematically explored. We comprehensively validate GPT-4V's capabilities for evaluation purposes, addressing tasks ranging from foundational image-to-text and text-to-image synthesis to high-level...

10.48550/arxiv.2311.01361 preprint EN other-oa arXiv (Cornell University) 2023-01-01

PointNeuron: 3D Neuron Reconstruction via Geometry and Topology Learning of Point Clouds

Runkai Zhao Heng Wang Chaoyi Zhang Weidong Cai

Digital neuron reconstruction from 3D microscopy images is an essential technique for investigating brain connectomics and neuron morphology. Existing reconstruction frameworks use convolution-based segmentation networks to partition the neuron from noisy backgrounds before applying the tracing algorithm. The tracing results are sensitive to the raw image quality and segmentation accuracy. In this paper, we propose a novel framework for 3D neuron reconstruction. Our key idea is to use the geometric representation power of the point cloud to better explore the intrinsic structural information...

10.1109/wacv56688.2023.00574 article EN 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023-01-01

Why Is Prompt Tuning for Vision-Language Models Robust to Noisy Labels?

Cheng‐En Wu Tian Yu Haichao Yu Heng Wang Pedro Morgado and 2 more

Vision-language models such as CLIP [27] learn a generic text-image embedding from large-scale training data. A vision-language model can be adapted to a new classification task through few-shot prompt tuning. We find that such a prompt tuning process is highly robust to label noises. This intrigues us to study the key reasons contributing to the robustness of the prompt tuning paradigm. We conducted extensive experiments to explore this property and find the key factors are: 1) the fixed classname tokens provide a strong regularization to the optimization of the model,...

10.1109/iccv51070.2023.01420 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

3D Conditional Adversarial Learning for Synthesizing Microscopic Neuron Image Using Skeleton-to-Neuron Translation

Zihao Tang Donghao Zhang Yang Song Heng Wang Dongnan Liu and 4 more

The automatic reconstruction of single neuron cells from microscopic images is essential to establishing the research on neuron morphology. However, the performance of reconstruction algorithms is constrained by both the quantity and quality of annotated 3D images, since annotating large-scale 3D neuron models is highly labour intensive. We propose a framework for synthesizing microscopy-realistic 3D neuron images from simulated neuron skeletons using conditional Generative Adversarial Networks (cGAN). We build the generator network with multi-resolution sub-modules to improve the output...

10.1109/isbi45749.2020.9098345 article EN 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI) 2020-04-01

Vista-LLaMA: Reliable Video Narrator via Equal Distance to Visual Tokens

Fan Ma Xiaojie Jin Heng Wang Yuchen Xian Jiashi Feng and 1 more

Recent advances in large video-language models have displayed promising outcomes in video comprehension. Current approaches straightforwardly convert video into language tokens and employ large language models for multi-modal tasks. However, this method often leads to the generation of irrelevant content, commonly known as "hallucination", as the length of the text increases and the impact of the video diminishes. To address this problem, we propose Vista-LLaMA, a novel framework that maintains a consistent distance between all visual tokens and any language tokens, irrespective...

10.48550/arxiv.2312.08870 preprint EN cc-by arXiv (Cornell University) 2023-01-01

CSFNet: A novel crowd counting network for occlusion and scale variation

Liyan Xiong Zhida Li Xiaohui Huang Heng Wang Peng Huang

The goal of crowd-counting techniques is to estimate the number of people in an image or video in real time and accurately. In recent years, with the development of deep learning, the accuracy of the crowd-counting task has been improving. However, this task still faces great challenges in crowded scenarios with large individual size variations. To cope with this situation, this paper proposes a new type of crowd-counting network: Context-Scaled Fusion Network. The details include (1) the design of the Multi-Scale Receptive Field Module (MRFF Module), which employs multiple...

10.21203/rs.3.rs-3875418/v1 preprint EN cc-by Research Square (Research Square) 2024-01-22

Deep-learning-based method for concealed object detection in terahertz (THz) images

Zihao Ge Yuan Zhang Xuyang Wu Zhiyuan Jia Heng Wang and 1 more

Terahertz (THz) technology has become a new trend in various fields due to its high penetration and harmlessness towards the human body and objects. The detection of concealed objects based on THz images is of great significance for ensuring public safety. However, the poor quality of the original THz images leads to insufficient accuracy of target detection. Therefore, it is necessary to preprocess the images before performing detection. In this work, in order to investigate the impact of different pre-processing methods on object detection using THz images, we adopt two...

10.1117/12.3021687 article EN 2024-03-18

HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing

Mude Hui Siwei Yang Bingchen Zhao Yichun Shi Heng Wang and 3 more

This study introduces HQ-Edit, a high-quality instruction-based image editing dataset with around 200,000 edits. Unlike prior approaches relying on attribute guidance or human feedback for building datasets, we devise a scalable data collection pipeline leveraging advanced foundation models, namely GPT-4V and DALL-E 3. To ensure its high quality, diverse examples are first collected online, expanded, and then used to create high-quality diptychs featuring input and output images with detailed text prompts, followed by...

10.48550/arxiv.2404.09990 preprint EN arXiv (Cornell University) 2024-04-15

Advancements in 3D Lane Detection Using LiDAR Point Clouds: From Data Collection to Model Development

Runkai Zhao Yuwen Heng Heng Wang Yuanda Gao Shilei Liu and 3 more

10.1109/icra57147.2024.10610087 article EN 2024-05-13

Memory and Time Efficient 3D Neuron Morphology Tracing in Large-Scale Images

Heng Wang Donghao Zhang Yang Song Siqi Liu Rong Gao and 2 more

3D reconstruction of neuronal morphology is crucial to solving neuron-related problems in neuroscience, as it is a key technique for investigating the connectivity and functionality of the neuron system. Many methods have been proposed to improve the accuracy of digital neuron reconstruction. However, the large amount of computer memory and computation time they require to process large-scale images has posed a new challenge for us. To solve this problem, we introduce a novel Memory (and Time) Efficient Image Tracing (MEIT) framework. Evaluated...

10.1109/dicta.2018.8615765 article EN 2018-12-01

Hand Gesture Target Model Updating and Result Forecasting Algorithm based on Mean Shift

Xiao Zou Heng Wang Qiuyu Zhang

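This paper proposes a gesture target model updating and result forecasting algorithm based on Mean Shift, to solve the problem that changes in the gesture target degrade tracking during the tracking process. First, background difference and skin color detection methods are used to detect the gesture and obtain the target model; Mean Shift is then used to track the gesture and update the model; finally, a Kalman filter is used to predict the tracking results. The experimental results show that this algorithm reduces the influence of the surrounding environment on the tracking process and achieves a better tracking result.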

10.4304/jmm.8.1.1-7 article EN Journal of Multimedia 2013-02-01
