NFDI4DS | UHH-SEMS - Publication Details

Dezhi Peng

ORCID: 0000-0002-3263-3449

About

Contact & Profiles

A5012042767

Research Areas

Handwritten Text Recognition Techniques
Natural Language Processing Techniques
Image Processing and 3D Reconstruction
Advanced Neural Network Applications
Multimodal Machine Learning Applications
Digital Media Forensic Detection
Advanced Image and Video Retrieval Techniques
Image Retrieval and Classification Techniques
Text and Document Classification Technologies
Mathematics, Computing, and Information Processing
Vehicle License Plate Recognition
Face recognition and analysis
Music and Audio Processing
Hand Gesture Recognition Systems
Topic Modeling
Speech Recognition and Synthesis
Livestock and Poultry Management
Image Processing Techniques and Applications
Generative Adversarial Networks and Image Synthesis
Genetic and phenotypic traits in livestock
Plant Virus Research Studies
Video Surveillance and Tracking Methods
Advanced Steganography and Watermarking Techniques
Advanced Image Processing Techniques
Genetic diversity and population structure

South China University of Technology
2018-2025

Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou)
2025

Sun Yat-sen University
2019-2025

China Agricultural University
2011-2024

Ministry of Agriculture and Rural Affairs
2023

University of Minnesota
2013

FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning

OPENALEX - Publications

Zhenhua Yang Dezhi Peng Yuxin Kong Yuyi Zhang Cong Yao and 1 more

Automatic font generation is an imitation task, which aims to create a font library that mimics the style of reference images while preserving the content from source images. Although existing methods have achieved satisfactory performance, they still struggle with complex characters and large style variations. To address these issues, we propose FontDiffuser, a diffusion-based image-to-image one-shot font generation method, which innovatively models the font imitation task as a noise-to-denoise paradigm. In our method, we introduce a Multi-scale Content...

10.1609/aaai.v38i7.28482 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24

SPTS: Single-Point Text Spotting

Dezhi Peng Xinyu Wang Yuliang Liu Jiaxin Zhang Mingxin Huang and 7 more

Existing scene text spotting (i.e., end-to-end text detection and recognition) methods rely on costly bounding box annotations (e.g., text-line, word-level, or character-level boxes). For the first time, we demonstrate that training scene text spotting models can be achieved with an extremely low-cost annotation of a single point for each instance. We propose a method that tackles scene text spotting as a sequence prediction task. Given an image as input, we formulate the desired recognition results as discrete tokens and use an auto-regressive Transformer to predict...

10.1145/3503161.3547942 article EN Proceedings of the 30th ACM International Conference on Multimedia 2022-10-10

Detecting Heads using Feature Refine Net and Cascaded Multi-scale Architecture

Dezhi Peng Zikai Sun Zirong Chen Zirui Cai Lele Xie and 1 more

This paper presents a method that can accurately detect heads, especially small heads, in indoor scenes. To achieve this, we propose a novel method, Feature Refine Net (FRN), and a cascaded multi-scale architecture. FRN exploits the hierarchical features created by deep convolutional neural networks. The proposed channel weighting method enables FRN to make use of features alternatively and effectively. To improve the performance of small head detection, we propose a cascaded multi-scale architecture which has two detectors. One called global detector is responsible for...

10.1109/icpr.2018.8545068 article EN 2018 24th International Conference on Pattern Recognition (ICPR) 2018-08-01

Recognition of Handwritten Chinese Text by Segmentation: A Segment-Annotation-Free Approach

Dezhi Peng Lianwen Jin Weihong Ma Canyu Xie Hesuo Zhang and 2 more

Online and offline handwritten Chinese text recognition (HCTR) has been studied for decades. Early methods adopted oversegmentation-based strategies but suffered from low speed, insufficient accuracy, and the high cost of character segmentation annotations. Recently, segmentation-free methods based on connectionist temporal classification (CTC) and the attention mechanism have dominated the field of HCTR. However, people actually read text character by character, especially for ideograms such as Chinese. This raises the question: are...

10.1109/tmm.2022.3146771 article EN IEEE Transactions on Multimedia 2022-01-27

SPTS v2: Single-Point Scene Text Spotting

Yuliang Liu Jiaxin Zhang Dezhi Peng Mingxin Huang Xinyu Wang and 6 more

End-to-end scene text spotting has made significant progress due to its intrinsic synergy between text detection and recognition. Previous methods commonly regard manual annotations such as horizontal rectangles, rotated rectangles, quadrangles, and polygons as a prerequisite, which are much more expensive than using a single point. Our new framework, SPTS v2, allows us to train high-performing text-spotting models using single-point annotation. SPTS v2 reserves the advantage of the auto-regressive Transformer with an Instance...

10.1109/tpami.2023.3312285 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2023-09-05

Revisiting Scene Text Recognition: A Data Perspective

Qing Jiang Jiapeng Wang Dezhi Peng Chongyu Liu Lianwen Jin

This paper aims to re-assess scene text recognition (STR) from a data-oriented perspective. We begin by revisiting the six commonly used benchmarks in STR and observe a trend of performance saturation, whereby only 2.91% of the benchmark images cannot be accurately recognized by an ensemble of 13 representative models. While these results are impressive and suggest that STR could be considered solved, we argue that this is primarily due to the less challenging nature of the common benchmarks, thus concealing the underlying issues...

10.1109/iccv51070.2023.01878 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

On the Hidden Mystery of OCR in Large Multimodal Models

Yuliang Liu Zhang Li Hongliang Li Wenwen Yu Mingxin Huang and 6 more

Large models have recently played a dominant role in natural language processing and multimodal vision-language learning. However, their effectiveness in text-related visual tasks remains relatively unexplored. In this paper, we conducted a comprehensive evaluation of Large Multimodal Models, such as GPT4V and Gemini, in various text-related visual tasks including Text Recognition, Scene Text-Centric Visual Question Answering (VQA), Document-Oriented VQA, Key Information Extraction (KIE), and Handwritten Mathematical Expression...

10.48550/arxiv.2305.07895 preprint EN cc-by-sa arXiv (Cornell University) 2023-01-01

ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer

Mingxin Huang Jiaxin Zhang Dezhi Peng Hao Lu Can Huang and 3 more

In recent years, end-to-end scene text spotting approaches have been evolving toward the Transformer-based framework. While previous studies have shown the crucial importance of the intrinsic synergy between text detection and recognition, recent advances in Transformer-based methods usually adopt an implicit synergy strategy with a shared query, which cannot fully realize the potential of these two interactive tasks. In this paper, we argue that explicit synergy considering the distinct characteristics of text detection and recognition can significantly improve the performance of text spotting. To this end,...

10.1109/iccv51070.2023.01786 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

Maize splicing-mediated mRNA surveillance impeded by sugarcane mosaic virus-coded pathogenic protein NIa-Pro

Kaitong Du Dezhi Peng Jiqiu Wu Yabing Zhu Tong Jiang and 7 more

The eukaryotic mRNA surveillance pathway, a pivotal guardian of mRNA fidelity, stands at the nexus of diverse biological processes, including antiviral immunity. Despite the recognized function of splicing factors on mRNA fate, the intricate interplay shaping the surveillance pathway remains elusive. We illustrate that the conserved splicing factor U2 snRNP auxiliary factor large subunit B (U2AF65B) modulates the mRNA surveillance complex, contributing to transcriptomic homeostasis in maize. The functionality of the surveillance pathway requires ZmU2AF65B-mediated normal splicing of upstream frameshift 3 (ZmUPF3)...

10.1126/sciadv.adn3010 article EN cc-by-nc Science Advances 2024-08-23

Towards Robust Tampered Text Detection in Document Image: New Dataset and New Solution

Chenfan Qu Chongyu Liu Yuliang Liu Xinhong Chen Dezhi Peng and 2 more

Recently, tampered text detection in document images has attracted increasing attention due to its essential role in information security. However, detecting visually consistent tampered text in photographed document images is still a main challenge. In this paper, we propose a novel framework to capture more fine-grained clues in complex scenarios for tampered text detection, termed Document Tampering Detector (DTD), which consists of a Frequency Perception Head (FPH) to compensate for the deficiencies caused by inconspicuous visual features,...

10.1109/cvpr52729.2023.00575 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

Apaf-1 is an evolutionarily conserved DNA sensor that switches the cell fate between apoptosis and inflammation

Jie Ruan Xuxia Wei Sen Li Zijian Ye Linyi Hu and 8 more

Apoptotic protease activating factor 1 (Apaf-1) was traditionally defined as a scaffold protein in mammalian cells for assembling a caspase activation platform known as the ‘apoptosome’ after its binding to cytochrome c. Although Apaf-1 structurally resembles animal NOD-like receptor (NLR) and plant resistance (R) proteins, whether it is directly involved in innate immunity is still largely unknown. Here, we found that Apaf-1-like molecules from lancelets, fruit flies, mice, and humans have...

10.1038/s41421-024-00750-4 article EN cc-by Cell Discovery 2025-01-21

A large-scale dataset for Chinese historical document recognition and analysis

Yongxin Shi Dezhi Peng Y Zhang Jiahuan Cao Lianwen Jin

The development of Chinese civilization has produced a vast collection of historical documents. Recognizing and analyzing these documents holds significant value for research on ancient culture. Recently, researchers have tried to utilize deep-learning techniques to automate recognition and analysis. However, existing document datasets, which are heavily relied upon by deep-learning models, suffer from limited data scale, insufficient character categories, and a lack of book-level annotation. To fill this gap, we introduce...

10.1038/s41597-025-04495-x article EN cc-by-nc-nd Scientific Data 2025-01-29

Beyond Token Compression: A Training-Free Reduction Framework for Efficient Visual Processing in MLLMs

Hongliang Li Jiaxin Zhang Wenhui Liao Dezhi Peng Kai Ding and 1 more

Multimodal Large Language Models (MLLMs) are typically based on decoder-only or cross-attention architectures. While decoder-only MLLMs outperform their cross-attention counterparts, they require significantly higher computational resources due to extensive self-attention and FFN operations on visual tokens. This raises the question: can we eliminate these expensive operations while maintaining performance? To this end, we present a novel analysis framework to investigate the necessity of these costly operations in decoder-only MLLMs. Our framework introduces two key innovations:...

10.48550/arxiv.2501.19036 preprint EN arXiv (Cornell University) 2025-01-31

Towards Accurate Readings of Water Meters by Eliminating Transition Error: New Dataset and Effective Solution

Jiaxin Zhang Deming Jia Chongyu Liu Dezhi Peng Bangdong Chen and 2 more

10.1109/tim.2025.3547081 article EN IEEE Transactions on Instrumentation and Measurement 2025-01-01

Predicting the Original Appearance of Damaged Historical Documents

Zhenhua Yang Dezhi Peng Yongxin Shi Y Zhang Chongyu Liu and 1 more

Historical documents encompass a wealth of cultural treasures but suffer from severe damage over time, including missing characters, paper damage, and ink erosion. However, existing document processing methods primarily focus on binarization, enhancement, etc., neglecting the repair of these damages. To this end, we present a new task, termed Historical Document Repair (HDR), which aims to predict the original appearance of damaged historical documents. To fill the gap in this field, we propose a large-scale dataset HDR28K...

10.1609/aaai.v39i9.33016 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2025-04-11

SLOGAN: Handwriting Style Synthesis for Arbitrary-Length and Out-of-Vocabulary Text

Canjie Luo Yuanzhi Zhu Lianwen Jin Zhe Li Dezhi Peng

Large amounts of labeled data are urgently required for the training of robust text recognizers. However, collecting handwriting data of diverse styles, along with an immense lexicon, is considerably expensive. Although data synthesis is a promising way to relieve data hunger, two key issues of handwriting synthesis, namely, style representation and content embedding, remain unsolved. To this end, we propose a novel method that can synthesize parameterized and controllable handwriting Styles for arbitrary-Length and Out-of-vocabulary text based on G...

10.1109/tnnls.2022.3151477 article EN IEEE Transactions on Neural Networks and Learning Systems 2022-02-28

HierCode: A lightweight hierarchical codebook for zero-shot Chinese text recognition

Y Zhang Yuanzhi Zhu Dezhi Peng Peirong Zhang Zhenhua Yang and 3 more

10.1016/j.patcog.2024.110963 article EN Pattern Recognition 2024-08-31

Widespread introgression in Chinese indigenous chicken breeds from commercial broiler

Zhang Chun-yuan Deng Lin Yuzhe Wang Dezhi Peng Huifang Li and 6 more

Chinese indigenous chickens (CICs) constitute world-renowned genetic resources due to their excellent traits, including early puberty, good meat quality and strong resistance to disease. Unfortunately, the introduction of a large number of commercial chickens in the past two decades has had an adverse effect on CICs. Using the chicken 60 K single nucleotide polymorphism chip, we assessed the genetic diversity and population structure of 1,187 chickens, representing eight breeds, hybrid ancestral populations additional red jungle...

10.1111/eva.12742 article EN cc-by Evolutionary Applications 2018-11-30

DDX23, an Evolutionary Conserved dsRNA Sensor, Participates in Innate Antiviral Responses by Pairing With TRIF or MAVS

Jie Ruan Yange Cao Tao Ling Peiyi Li Shengpeng Wu and 6 more

DExD/H-box helicases play essential roles in RNA metabolism, and emerging data suggest that they have additional functions in antiviral immunity across species. However, little is known about this evolutionarily conserved family in antiviral responses in lower species. Here, by isolation of poly(I:C)-binding proteins in amphioxus, an extant basal chordate, we found DHX9, DHX15 and DDX23 to be responsible for cytoplasmic dsRNA detection in amphioxus. Since DDX23 has not been characterized in mammals, we performed further poly(I:C) pull down...

10.3389/fimmu.2019.02202 article EN cc-by Frontiers in Immunology 2019-09-18

A Fast and Accurate Fully Convolutional Network for End-to-End Handwritten Chinese Text Segmentation and Recognition

Dezhi Peng Lianwen Jin Yaqiang Wu Zhepeng Wang Mingxiang Cai

Handwritten Chinese Text Recognition (HCTR) is a challenging problem due to its high complexity. Previous methods based on over-segmentation, hidden Markov model (HMM) or long short-term memory recurrent neural network (LSTM-RNN) have achieved great success in recognition results. However, all of them, including over-segmentation based methods, are incapable of accurately segmenting single characters. To solve this problem, we propose a fast and accurate fully convolutional network for end-to-end handwritten text....

10.1109/icdar.2019.00014 article EN 2019-09-01

SideNet: Learning representations from interactive side information for zero-shot Chinese character recognition

Ziyan Li Yuhao Huang Dezhi Peng Mengchao He Lianwen Jin

10.1016/j.patcog.2023.110208 article EN Pattern Recognition 2023-12-15

SwinTextSpotter v2: Towards Better Synergy for Scene Text Spotting

Mingxin Huang Dezhi Peng Hongliang Li Zhenghao Peng Chongyu Liu and 4 more

End-to-end scene text spotting, which aims to read the text in natural images, has garnered significant attention in recent years. However, state-of-the-art methods usually incorporate detection and recognition simply by sharing the backbone, which does not directly take advantage of the feature interaction between the two tasks. In this paper, we propose a new end-to-end scene text spotting framework termed SwinTextSpotter v2, which seeks to find a better synergy between text detection and recognition. Specifically, we enhance the relationship between the two tasks using novel...

10.48550/arxiv.2401.07641 preprint EN other-oa arXiv (Cornell University) 2024-01-01

ViTEraser: Harnessing the Power of Vision Transformers for Scene Text Removal with SegMIM Pretraining

Dezhi Peng Chongyu Liu Yuliang Liu Lianwen Jin

Scene text removal (STR) aims at replacing text strokes in natural scenes with visually coherent backgrounds. Recent STR approaches rely on iterative refinements or explicit text masks, resulting in high complexity and sensitivity to the accuracy of text localization. Moreover, most existing STR methods adopt convolutional architectures while the potential of vision Transformers (ViTs) remains largely unexplored. In this paper, we propose a simple-yet-effective ViT-based text eraser, dubbed ViTEraser. Following a concise...

10.1609/aaai.v38i5.28245 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24

DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks

Jiaxin Zhang Dezhi Peng Chongyu Liu Peirong Zhang Lianwen Jin

10.1109/cvpr52733.2024.01482 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

Towards Modern Image Manipulation Localization: A Large-Scale Dataset and Novel Methods

Chenfan Qu Yiwu Zhong Chongyu Liu Guitao Xu Dezhi Peng and 2 more

10.1109/cvpr52733.2024.01025 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16