NFDI4DS | UHH-SEMS - Publication Details

Kun Zhou

ORCID: 0000-0001-9592-6575

Contact & Profiles

OpenAlex author ID: A5100722046

Research Areas

Advanced Vision and Imaging
Advanced Image Processing Techniques
Image Processing Techniques and Applications
Image and Signal Denoising Methods
Computer Graphics and Visualization Techniques
Image Enhancement Techniques
Advanced Image and Video Retrieval Techniques
Generative Adversarial Networks and Image Synthesis
Image Retrieval and Classification Techniques
Multimodal Machine Learning Applications
Topic Modeling
Optical measurement and interference techniques
Recommender Systems and Techniques
Higher Education and Teaching Methods
Visual Attention and Saliency Detection
3D Surveying and Cultural Heritage
Salmonella and Campylobacter epidemiology
Diabetic Foot Ulcer Assessment and Management
Semantic Web and Ontologies
Speech and dialogue systems
Human Pose and Action Recognition
Machine Learning and Algorithms
Video Surveillance and Tracking Methods
Automated Road and Building Extraction
Photoacoustic and Ultrasonic Imaging

University of Maryland, College Park
2024

Xidian University
2023

University of Hong Kong
2022-2023

Renmin University of China
2023

Beijing Institute of Big Data Research
2023

SMART Reading
2022

Hong Kong University of Science and Technology
2022

Shenzhen University
2022

Wuhan University
2022

Nanyang Technological University
2022

MAT: Mask-Aware Transformer for Large Hole Image Inpainting

OPENALEX - Publications

Wenbo Li Zhe Lin Kun Zhou Lu Qi Yi Wang and 1 more

Recent studies have shown the importance of modeling long-range interactions in the inpainting problem. To achieve this goal, existing approaches exploit either standalone attention techniques or transformers, but usually under a low resolution in consideration of computational cost. In this paper, we present a novel transformer-based model for large hole inpainting, which unifies the merits of transformers and convolutions to efficiently process high-resolution images. We carefully design each component of our...

10.1109/cvpr52688.2022.01049 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

HEMlets Pose: Learning Part-Centric Heatmap Triplets for Accurate 3D Human Pose Estimation

Kun Zhou Xiaoguang Han Nianjuan Jiang Kui Jia Jiangbo Lu

Estimating 3D human pose from a single image is a challenging task. This work attempts to address the uncertainty of lifting the detected 2D joints to the 3D space by introducing an intermediate state - Part-Centric Heatmap Triplets (HEMlets), which shortens the gap between the 2D observation and the 3D interpretation. The HEMlets utilize three joint-heatmaps to represent the relative depth information of the end-joints for each skeletal body part. In our approach, a Convolutional Network (ConvNet) is first trained to predict HEMlets from the input image,...

10.1109/iccv.2019.00243 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

LAPAR: Linearly-Assembled Pixel-Adaptive Regression Network for Single Image Super-Resolution and Beyond

Wenbo Li Kun Zhou Lu Qi Nianjuan Jiang Jiangbo Lu and 1 more

Single image super-resolution (SISR) deals with a fundamental problem of upsampling a low-resolution (LR) image to its high-resolution (HR) version. The last few years have witnessed impressive progress propelled by deep learning methods. However, one critical challenge faced by existing methods is to strike a sweet spot of deep model complexity and resulting SISR quality. This paper addresses this pain point by proposing a linearly-assembled pixel-adaptive regression network (LAPAR), which casts the direct LR to HR mapping...

10.48550/arxiv.2105.10422 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Best-Buddy GANs for Highly Detailed Image Super-resolution

Wenbo Li Kun Zhou Lu Qi Liying Lu Jiangbo Lu

We consider the single image super-resolution (SISR) problem, where a high-resolution (HR) image is generated based on a low-resolution (LR) input. Recently, generative adversarial networks (GANs) have become popular to hallucinate details. Most methods along this line rely on a predefined single-LR-single-HR mapping, which is not flexible enough for the ill-posed SISR task. Also, GAN-generated fake details may often undermine the realism of the whole image. We address these issues by proposing best-buddy GANs (Beby-GAN)...

10.1609/aaai.v36i2.20030 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2022-06-28

Revisiting Temporal Alignment for Video Restoration

Kun Zhou Wenbo Li Liying Lu Xiaoguang Han Jiangbo Lu

Long-range temporal alignment is critical yet challenging for video restoration tasks. Recently, some works attempt to divide the long-range alignment into several sub-alignments and handle them progressively. Although this operation is helpful in modeling distant correspondences, error accumulation is inevitable due to the propagation mechanism. In this work, we present a novel, generic iterative alignment module which employs a gradual refinement scheme for sub-alignments, yielding more accurate motion compensation. To further...

10.1109/cvpr52688.2022.00596 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Exploring Motion Ambiguity and Alignment for High-Quality Video Frame Interpolation

Kun Zhou Wenbo Li Xiaoguang Han Jiangbo Lu

For video frame interpolation (VFI), existing deep-learning-based approaches strongly rely on the ground-truth (GT) intermediate frames, which sometimes ignore the non-unique nature of motion judging from the given adjacent frames. As a result, these methods tend to produce averaged solutions that are not clear enough. To alleviate this issue, we propose to relax the requirement of reconstructing an intermediate frame as close to the GT as possible. Towards this end, we develop a texture consistency loss (TCL) upon the assumption that the interpolated content...

10.1109/cvpr52729.2023.02123 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

NeRFLiX: High-Quality Neural View Synthesis by Learning a Degradation-Driven Inter-viewpoint MiXer

Kun Zhou Wenbo Li Yi Wang Tao Hu Nianjuan Jiang and 2 more

Neural radiance fields (NeRF) show great success in novel view synthesis. However, in real-world scenes, recovering high-quality details from the source images is still challenging for existing NeRF-based approaches, due to the potential imperfect calibration information and scene representation inaccuracy. Even with high-quality training frames, the synthetic novel views produced by NeRF models still suffer from notable rendering artifacts, such as noise, blur, etc. To improve the synthesis quality of NeRF-based approaches, we propose NeRFLiX, a...

10.1109/cvpr52729.2023.01190 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

Personalized prediction for recurrence of cystitis glandularis: insights from SHAP and machine learning models

Yuyang Yuan Fu‐Chun Zheng Jiming Yao Kun Zhou Jiaqing Yang and 6 more

Cystitis glandularis (CG) is a rare urological condition characterized by glandular metaplasia of the bladder mucosa. Recurrence following transurethral resection (TUR) is a significant clinical challenge. Traditional predictive models often fail to capture the complexity of the data, resulting in insufficient accuracy. In contrast, machine learning (ML) has demonstrated substantial potential in medical prediction by identifying and analyzing complex patterns that are undetectable by conventional methods. This study...

10.21037/tau-2024-665 article EN Translational Andrology and Urology 2025-03-01

Laplacian optimal design for image retrieval

Xiaofei He Wanli Min Deng Cai Kun Zhou

Relevance feedback is a powerful technique to enhance Content-Based Image Retrieval (CBIR) performance. It solicits the user's relevance judgments on the retrieved images returned by the CBIR systems. The user's labeling is then used to learn a classifier to distinguish between relevant and irrelevant images. However, the top returned images may not be the most informative ones. The challenge is thus to determine which unlabeled images would be the most informative (i.e., improve the classifier the most) if they were labeled and used as training samples. In this paper, we propose a novel active...

10.1145/1277741.1277764 article EN 2007-07-23

A geodesic-preserving method for image warping

Dongping Li Kaiming He Jian Sun Kun Zhou

The manipulation of panoramic/wide-angle images is usually achieved via image warping. Though various techniques have been developed for preserving shapes and straight lines for warping, these are not sufficient for panoramic/wide-angle images. The image projections will turn the straight lines into curved "geodesic lines", and it is fundamentally impossible to keep all these lines straight. In this work, we propose a geodesic-preserving method for content-aware image warping. An energy term is introduced to preserve the geodesic appearance of lines, and can be used with shape-preserving terms. Our...

10.1109/cvpr.2015.7298617 article EN 2015-06-01

Context-Aware and Attention-Driven Weighted Fusion Traffic Sign Detection Network

Guibao Wang Kun Zhou Lizhe Wang Lanmei Wang

The detection and recognition of traffic signs in complex environments has received extensive attention, and the correct detection of small targets and occluded targets are two key issues. This paper proposes a context-aware and attention-driven weighted fusion network for traffic sign detection. Specifically, we design a context module that not only enhances the diversity of global features, but also reduces the sensitivity of convolution to objects. In addition, a feature pyramid is designed to efficiently fuse deep semantic information and shallow...

10.1109/access.2023.3264214 article EN cc-by-nc-nd IEEE Access 2023-01-01

Image Inpainting via Iteratively Decoupled Probabilistic Modeling

Wenbo Li Xin Yu Kun Zhou Yibing Song Zhe Lin and 1 more

Generative adversarial networks (GANs) have made great success in image inpainting yet still have difficulties tackling large missing regions. In contrast, iterative probabilistic algorithms, such as autoregressive and denoising diffusion models, have to be deployed with massive computing resources for decent effect. To achieve high-quality results with low computational cost, we present a novel pixel spread model (PSM) that iteratively employs decoupled probabilistic modeling, combining the optimization efficiency of...

10.48550/arxiv.2212.02963 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Message from the Best Paper Award Committee

Ming C. Lin Baoquan Chen Ying He Wenping Wang Kun Zhou and 1 more

the Best Paper Award Committee to select the Best Paper. After careful deliberation, the following paper was chosen with unanimous consensus as the winner, on the basis of its intellectual merit and potential impact: Visual attention network [1]. Two other papers were awarded an

10.1007/s41095-024-0435-z article EN cc-by Computational Visual Media 2024-05-14

Exploring the Design Space of Visual Context Representation in Video MLLMs

Yifan Du Yuqi Huo Kun Zhou Zijia Zhao Haoyu Lu and 5 more

Video Multimodal Large Language Models (MLLMs) have shown remarkable capability of understanding the video semantics on various downstream tasks. Despite the advancements, there is still a lack of systematic research on visual context representation, which refers to the scheme to select frames from a video and further select the tokens from a frame. In this paper, we explore the design space for visual context representation, and aim to improve the performance of video MLLMs by finding more effective representation schemes. Firstly, we formulate the task of visual context representation as a constrained optimization problem,...

10.48550/arxiv.2410.13694 preprint EN arXiv (Cornell University) 2024-10-17

Point spread function (PSF) encoding EPI versus BLADE DWI in brain tumor diagnosis

Wen Zhong Yuan Lian Zhimin Huang Mangsuo Zhao Kun Zhou and 6 more

Motivation: High-resolution DWI plays a crucial role in brain tumor diagnosis. Previous studies have introduced two high-resolution distortion-free DWI techniques: PSF and BLADE. However, no one has yet compared the two in brain tumor MR imaging. Goal(s): To compare the image quality of PSF and BLADE DWI. Approach: In this study, scan parameters were adjusted to achieve optimized image quality for PSF and BLADE DWI. Subsequently, scans were performed on patients, and the final image quality was compared. Results: With scanning times being similar, exhibits superior SNR while its...

10.58530/2024/3499 article EN Proceedings on CD-ROM - International Society for Magnetic Resonance in Medicine. Scientific Meeting and Exhibition/Proceedings of the International Society for Magnetic Resonance in Medicine, Scientific Meeting and Exhibition 2024-11-26

Enhancing Visual Reasoning with Autonomous Imagination in Multimodal Large Language Models

Jingming Liu Yumeng Li Bo Xiao Yu‐Cin Jian Zheng Qin and 3 more

There have been recent efforts to extend the Chain-of-Thought (CoT) paradigm to Multimodal Large Language Models (MLLMs) by finding visual clues in the input scene, advancing the visual reasoning ability of MLLMs. However, current approaches are specially designed for tasks where clue finding plays a major role in the whole reasoning process, leading to difficulty in handling complex visual scenes where clue finding does not actually simplify the whole reasoning task. To deal with this challenge, we propose a new visual reasoning paradigm enabling MLLMs to autonomously modify the input scene to new ones based on its reasoning status, such...

10.48550/arxiv.2411.18142 preprint EN arXiv (Cornell University) 2024-11-27

Vector solid textures

Lvdi Wang Kun Zhou Yizhou Yu Baining Guo

In this paper, we introduce a compact random-access vector representation for solid textures made of intermixed regions with relatively smooth internal color variations. It is feature-preserving and resolution-independent. In this representation, a texture volume is divided into multiple regions. Region boundaries are implicitly defined using a signed distance function. Color variations within the regions are represented using compactly supported radial basis functions (RBFs). With a spatial indexing structure, such RBFs...

10.1145/1833349.1778823 article EN 2010-07-15

Visually-augmented pretrained language models for NLP tasks without images

Hangyu Guo Kun Zhou Wayne Xin Zhao Qinyu Zhang Ji-Rong Wen

Although pre-trained language models (PLMs) have shown impressive performance by text-only self-supervised training, they are found to lack visual semantics or commonsense. Existing solutions often rely on explicit images for visual knowledge augmentation (requiring time-consuming retrieval or generation), and they also conduct the augmentation for the whole input text, without considering whether it is actually needed in specific inputs or tasks. To address these issues, we propose a novel Visually-Augmented fine-tuning...

10.18653/v1/2023.acl-long.833 article EN cc-by 2023-01-01

MAT: Mask-Aware Transformer for Large Hole Image Inpainting

Wenbo Li Zhe Lin Kun Zhou Lu Qi Yi Wang and 1 more

10.48550/arxiv.2203.15270 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Best-Buddy GANs for Highly Detailed Image Super-Resolution

Wenbo Li Kun Zhou Lu Qi Liying Lu Nianjuan Jiang and 2 more

We consider the single image super-resolution (SISR) problem, where a high-resolution (HR) image is generated based on a low-resolution (LR) input. Recently, generative adversarial networks (GANs) have become popular to hallucinate details. Most methods along this line rely on a predefined single-LR-single-HR mapping, which is not flexible enough for the SISR task. Also, GAN-generated fake details may often undermine the realism of the whole image. We address these issues by proposing best-buddy GANs (Beby-GAN) for rich-detail...

10.48550/arxiv.2103.15295 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Segmentation Rectification for Video Cutout via One-Class Structured Learning

Junyan Wang Sai-Kit Yeung Jue Wang Kun Zhou

Recent works on interactive video object cutout mainly focus on designing dynamic foreground-background (FB) classifiers for segmentation propagation. However, the research on optimally removing errors from the FB classification is sparse, and the errors often accumulate rapidly, causing significant errors in the propagated frames. In this work, we take the initial steps to addressing this problem, and we call this new task segmentation rectification. Our key observation is that the possibly asymmetrically distributed false positive and false negative...

10.48550/arxiv.1602.04906 preprint EN other-oa arXiv (Cornell University) 2016-01-01
