Dezhi Peng

ORCID: 0000-0002-3263-3449
Research Areas
  • Handwritten Text Recognition Techniques
  • Natural Language Processing Techniques
  • Image Processing and 3D Reconstruction
  • Advanced Neural Network Applications
  • Multimodal Machine Learning Applications
  • Digital Media Forensic Detection
  • Advanced Image and Video Retrieval Techniques
  • Image Retrieval and Classification Techniques
  • Text and Document Classification Technologies
  • Mathematics, Computing, and Information Processing
  • Vehicle License Plate Recognition
  • Face recognition and analysis
  • Music and Audio Processing
  • Hand Gesture Recognition Systems
  • Topic Modeling
  • Speech Recognition and Synthesis
  • Livestock and Poultry Management
  • Image Processing Techniques and Applications
  • Generative Adversarial Networks and Image Synthesis
  • Genetic and phenotypic traits in livestock
  • Plant Virus Research Studies
  • Video Surveillance and Tracking Methods
  • Advanced Steganography and Watermarking Techniques
  • Advanced Image Processing Techniques
  • Genetic diversity and population structure

South China University of Technology
2018-2025

Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou)
2025

Sun Yat-sen University
2019-2025

China Agricultural University
2011-2024

Ministry of Agriculture and Rural Affairs
2023

University of Minnesota
2013

Automatic font generation is an imitation task, which aims to create a font library that mimics the style of reference images while preserving the content from source images. Although existing methods have achieved satisfactory performance, they still struggle with complex characters and large style variations. To address these issues, we propose FontDiffuser, a diffusion-based image-to-image one-shot font generation method, which innovatively models the task as a noise-to-denoise paradigm. In our method, we introduce a Multi-scale Content...

10.1609/aaai.v38i7.28482 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24
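As a rough illustration of the noise-to-denoise formulation described above, the sketch below shows a standard DDPM-style training step for a conditional denoiser; the `denoiser` module, its conditioning arguments, and the noise schedule are assumptions for illustration, not FontDiffuser's actual implementation.

```python
# Minimal sketch (not the paper's code): a DDPM-style training step in which a
# conditional network learns to denoise the target glyph given a source content
# image and a reference style image. `denoiser` is a hypothetical module.
import torch
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

def training_step(denoiser, target, content_img, style_img):
    """One noise-prediction step of the noise-to-denoise formulation."""
    b = target.size(0)
    t = torch.randint(0, T, (b,), device=target.device)
    noise = torch.randn_like(target)
    a_bar = alphas_bar.to(target.device)[t].view(b, 1, 1, 1)
    # Forward diffusion: corrupt the target glyph with Gaussian noise.
    x_t = a_bar.sqrt() * target + (1.0 - a_bar).sqrt() * noise
    # The denoiser predicts the noise, conditioned on content and style images.
    pred = denoiser(x_t, t, content=content_img, style=style_img)
    return F.mse_loss(pred, noise)
```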

Existing scene text spotting (i.e., end-to-end detection and recognition) methods rely on costly bounding box annotations (e.g., text-line, word-level, or character-level boxes). For the first time, we demonstrate that training scene text spotting models can be achieved with an extremely low-cost annotation of a single point for each instance. We propose a method that tackles scene text spotting as a sequence prediction task. Given an image as input, we formulate the desired detection and recognition results as a sequence of discrete tokens and use an auto-regressive Transformer to predict...

10.1145/3503161.3547942 article EN Proceedings of the 30th ACM International Conference on Multimedia 2022-10-10
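To make the sequence-prediction formulation concrete, here is a minimal sketch of how a single-point annotation plus a transcription could be serialized into discrete tokens for an auto-regressive decoder; the bin count, character set, and token layout are hypothetical, not the exact vocabulary used in the paper.

```python
# Minimal sketch of a single-point sequence formulation: each text instance
# becomes [x_bin, y_bin, char_1, ..., char_N, <eos>], so an auto-regressive
# decoder can emit location and transcription as one token stream.
NUM_BINS = 1000            # coordinates quantized into discrete bins
CHARSET = "abcdefghijklmnopqrstuvwxyz0123456789"
CHAR_OFFSET = NUM_BINS
EOS = CHAR_OFFSET + len(CHARSET)

def instance_to_tokens(x, y, text, img_w, img_h):
    x_bin = min(int(x / img_w * NUM_BINS), NUM_BINS - 1)
    y_bin = min(int(y / img_h * NUM_BINS), NUM_BINS - 1)
    char_tokens = [CHAR_OFFSET + CHARSET.index(c) for c in text.lower() if c in CHARSET]
    return [x_bin, y_bin] + char_tokens + [EOS]

print(instance_to_tokens(320, 240, "exit", 640, 480))  # [500, 500, 1004, 1023, 1008, 1019, 1036]
```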

This paper presents a method that can accurately detect heads, especially small ones, in indoor scenes. To achieve this, we propose a novel method, Feature Refine Net (FRN), and a cascaded multi-scale architecture. FRN exploits the hierarchical features created by deep convolutional neural networks. The proposed channel weighting enables the network to make use of these features alternatively and effectively. To improve the performance of small head detection, the cascaded multi-scale architecture has two detectors. One, called the global detector, is responsible for...

10.1109/icpr.2018.8545068 article EN 2018 24th International Conference on Pattern Recognition (ICPR) 2018-08-01

Online and offline handwritten Chinese text recognition (HCTR) has been studied for decades. Early methods adopted oversegmentation-based strategies but suffered from low speed, insufficient accuracy, and a high cost of character segmentation annotations. Recently, segmentation-free methods based on connectionist temporal classification (CTC) and the attention mechanism have dominated the field of HCTR. However, people actually read text character by character, especially ideograms such as Chinese. This raises the question: are...

10.1109/tmm.2022.3146771 article EN IEEE Transactions on Multimedia 2022-01-27
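For readers unfamiliar with the segmentation-free CTC setup mentioned above, the following minimal PyTorch snippet shows how a CTC loss is applied to frame-level predictions without character segmentation; the vocabulary size and sequence lengths are placeholder values, not the paper's configuration.

```python
# Minimal sketch of segmentation-free recognition with CTC on dummy data.
import torch
import torch.nn as nn

vocab_size = 7357 + 1                # illustrative charset size plus CTC blank (index 0)
frames, batch, label_len = 120, 2, 20

log_probs = torch.randn(frames, batch, vocab_size).log_softmax(dim=-1)
targets = torch.randint(1, vocab_size, (batch, label_len))      # character indices, no blanks
input_lengths = torch.full((batch,), frames, dtype=torch.long)
target_lengths = torch.full((batch,), label_len, dtype=torch.long)

ctc = nn.CTCLoss(blank=0, zero_infinity=True)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
print(loss.item())
```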

End-to-end scene text spotting has made significant progress due to the intrinsic synergy between detection and recognition. Previous methods commonly regard manual annotations such as horizontal rectangles, rotated quadrangles, and polygons as a prerequisite, which are much more expensive than using a single point. Our new framework, SPTS v2, allows us to train high-performing text-spotting models with a single-point annotation. SPTS v2 reserves the advantage of the auto-regressive Transformer with an Instance...

10.1109/tpami.2023.3312285 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2023-09-05

This paper aims to re-assess scene text recognition (STR) from a data-oriented perspective. We begin by revisiting the six commonly used benchmarks in STR and observe a trend of performance saturation, whereby only 2.91% of benchmark images cannot be accurately recognized by an ensemble of 13 representative models. While these results are impressive and suggest that STR could be considered solved, we argue that this is primarily due to the less challenging nature of the common benchmarks, thus concealing underlying issues...

10.1109/iccv51070.2023.01878 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01
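The saturation figure quoted above can be reproduced in spirit with a small union-accuracy computation: an image counts as solved if any model in the ensemble reads it correctly. The snippet below is a toy illustration with dummy data, not the paper's evaluation code.

```python
# An image remains "unsolved" only if every model in the ensemble misreads it.
def unsolved_ratio(predictions, ground_truth):
    """predictions: dict model_name -> list of strings, aligned with ground_truth."""
    unsolved = 0
    for i, gt in enumerate(ground_truth):
        if not any(preds[i] == gt for preds in predictions.values()):
            unsolved += 1
    return unsolved / len(ground_truth)

gt = ["exit", "hotel", "25km"]
preds = {"model_a": ["exit", "hotel", "26km"], "model_b": ["exlt", "hotel", "26km"]}
print(f"{unsolved_ratio(preds, gt):.2%}")  # 33.33% of these toy samples remain unsolved
```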

Large models have recently played a dominant role in natural language processing and multimodal vision-language learning. However, their effectiveness in text-related visual tasks remains relatively unexplored. In this paper, we conducted a comprehensive evaluation of Large Multimodal Models, such as GPT4V and Gemini, on various tasks including Text Recognition, Scene Text-Centric Visual Question Answering (VQA), Document-Oriented VQA, Key Information Extraction (KIE), and Handwritten Mathematical Expression...

10.48550/arxiv.2305.07895 preprint EN cc-by-sa arXiv (Cornell University) 2023-01-01

In recent years, end-to-end scene text spotting approaches have been evolving toward the Transformer-based framework. While previous studies have shown the crucial importance of the intrinsic synergy between detection and recognition, recent advances usually adopt an implicit synergy strategy with a shared query, which cannot fully realize the potential of these two interactive tasks. In this paper, we argue that explicit synergy, considering the distinct characteristics of detection and recognition, can significantly improve the performance of text spotting. To this end,...

10.1109/iccv51070.2023.01786 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

The eukaryotic mRNA surveillance pathway, a pivotal guardian of mRNA fidelity, stands at the nexus of diverse biological processes, including antiviral immunity. Despite the recognized function of splicing factors in mRNA fate, the intricate interplay shaping this pathway remains elusive. We illustrate that the conserved splicing factor U2 snRNP auxiliary factor large subunit B (U2AF65B) modulates the mRNA surveillance complex, contributing to transcriptomic homeostasis in maize. This functionality requires ZmU2AF65B-mediated normal splicing of upstream frameshift 3 (ZmUPF3)...

10.1126/sciadv.adn3010 article EN cc-by-nc Science Advances 2024-08-23

Recently, tampered text detection in document images has attracted increasing attention due to its essential role in information security. However, detecting visually consistent tampered text in photographed document images is still a main challenge. In this paper, we propose a novel framework to capture more fine-grained clues in complex scenarios for tampered text detection, termed Document Tampering Detector (DTD), which consists of a Frequency Perception Head (FPH) to compensate for the deficiencies caused by inconspicuous visual features,...

10.1109/cvpr52729.2023.00575 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

Apoptotic protease activating factor 1 (Apaf-1) was traditionally defined as a scaffold protein in mammalian cells for assembling a caspase activation platform known as the 'apoptosome' after its binding to cytochrome c. Although Apaf-1 structurally resembles animal NOD-like receptor (NLR) and plant resistance (R) proteins, whether it is directly involved in innate immunity is still largely unknown. Here, we found that Apaf-1-like molecules from lancelets, fruit flies, mice, and humans have...

10.1038/s41421-024-00750-4 article EN cc-by Cell Discovery 2025-01-21

The development of Chinese civilization has produced a vast collection of historical documents. Recognizing and analyzing these documents holds significant value for research on ancient culture. Recently, researchers have tried to utilize deep-learning techniques to automate their recognition and analysis. However, existing historical document datasets, which are heavily relied upon by these models, suffer from limited data scale, insufficient character categories, and a lack of book-level annotation. To fill this gap, we introduce...

10.1038/s41597-025-04495-x article EN cc-by-nc-nd Scientific Data 2025-01-29

Multimodal Large Language Models (MLLMs) are typically based on decoder-only or cross-attention architectures. While decoder-only MLLMs outperform their cross-attention counterparts, they require significantly higher computational resources due to extensive self-attention and FFN operations on visual tokens. This raises the question: can we eliminate these expensive operations while maintaining performance? To this end, we present a novel analysis framework to investigate the necessity of these costly operations in MLLMs. Our framework introduces two key innovations:...

10.48550/arxiv.2501.19036 preprint EN arXiv (Cornell University) 2025-01-31
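As one way to picture the kind of ablation such an analysis framework might perform, the sketch below shows a decoder block that can zero out the FFN update at visual-token positions; this is an assumption-laden illustration, not the authors' framework.

```python
# Minimal sketch: a Transformer block where the FFN update can be skipped for
# visual tokens, to probe how much that computation actually matters.
import torch
import torch.nn as nn

class SelectiveBlock(nn.Module):
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        self.norm1, self.norm2 = nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, x, visual_mask, skip_visual_ffn=False):
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]
        ffn_out = self.ffn(self.norm2(x))
        if skip_visual_ffn:
            # Zero the FFN update at visual-token positions.
            ffn_out = ffn_out.masked_fill(visual_mask.unsqueeze(-1), 0.0)
        return x + ffn_out

x = torch.randn(1, 10, 256)
visual_mask = torch.tensor([[True] * 6 + [False] * 4])  # first 6 tokens are visual
print(SelectiveBlock()(x, visual_mask, skip_visual_ffn=True).shape)
```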

Historical documents encompass a wealth of cultural treasures but suffer from severe damages, including missing characters, paper damage, and ink erosion over time. However, existing document processing methods primarily focus on binarization, enhancement, etc., neglecting the repair of these damages. To this end, we present a new task, termed Historical Document Repair (HDR), which aims to predict the original appearance of damaged historical documents. To fill the gap in this field, we propose a large-scale dataset HDR28K...

10.1609/aaai.v39i9.33016 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2025-04-11

Large amounts of labeled data are urgently required for the training of robust text recognizers. However, collecting handwriting of diverse styles, along with an immense lexicon, is considerably expensive. Although data synthesis is a promising way to relieve data hunger, two key issues of handwriting synthesis, namely, style representation and content embedding, remain unsolved. To this end, we propose a novel method that can synthesize parameterized and controllable handwriting Styles for arbitrary-Length and Out-of-vocabulary text based on a G...

10.1109/tnnls.2022.3151477 article EN IEEE Transactions on Neural Networks and Learning Systems 2022-02-28
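The adversarial training loop underlying GAN-based handwriting synthesis can be summarized by the generic conditional-GAN step below; the generator/discriminator interfaces and conditioning inputs (`content_emb`, `style_vec`) are hypothetical stand-ins rather than the paper's architecture.

```python
# Generic conditional-GAN training step: the generator synthesizes handwriting
# from a content embedding and a style vector; the discriminator judges realism.
import torch
import torch.nn.functional as F

def gan_step(generator, discriminator, real_img, content_emb, style_vec, opt_g, opt_d):
    fake_img = generator(content_emb, style_vec)

    # Discriminator update: push real logits toward 1 and fake logits toward 0.
    opt_d.zero_grad()
    real_logits = discriminator(real_img)
    fake_logits = discriminator(fake_img.detach())
    d_loss = (F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
              + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits)))
    d_loss.backward()
    opt_d.step()

    # Generator update: make the discriminator label the fake as real.
    opt_g.zero_grad()
    adv_logits = discriminator(fake_img)
    g_loss = F.binary_cross_entropy_with_logits(adv_logits, torch.ones_like(adv_logits))
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```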

Chinese indigenous chickens (CICs) constitute world-renowned genetic resources due to their excellent traits, including early puberty, good meat quality, and strong disease resistance. Unfortunately, the introduction of a large number of commercial chickens in the past two decades has had an adverse effect on CICs. Using the chicken 60 K single nucleotide polymorphism chip, we assessed the genetic diversity and population structure of 1,187 chickens, representing eight CIC breeds, hybrid and ancestral populations, and an additional red jungle...

10.1111/eva.12742 article EN cc-by Evolutionary Applications 2018-11-30

DExD/H-box helicases play essential roles in RNA metabolism, and emerging data suggest that they have additional functions in antiviral immunity across species. However, little is known about this evolutionarily conserved family in the antiviral responses of lower species. Here, by isolation of poly(I:C)-binding proteins in amphioxus, an extant basal chordate, we found DHX9, DHX15, and DDX23 to be responsible for cytoplasmic dsRNA detection in amphioxus. Since these helicases have not been similarly characterized in mammals, we performed further poly(I:C) pull-down...

10.3389/fimmu.2019.02202 article EN cc-by Frontiers in Immunology 2019-09-18

Handwritten Chinese Text Recognition (HCTR) is a challenging problem due to its high complexity. Previous methods based on over-segmentation, hidden Markov models (HMM), or long short-term memory recurrent neural networks (LSTM-RNN) have achieved great success in recognition results. However, all of them, including over-segmentation methods, are incompetent at accurately segmenting single characters. To solve this problem, we propose a fast and fully convolutional method for end-to-end handwritten text...

10.1109/icdar.2019.00014 article EN 2019-09-01

End-to-end scene text spotting, which aims to read text in natural images, has garnered significant attention in recent years. However, state-of-the-art methods usually incorporate detection and recognition simply by sharing a backbone, which does not directly take advantage of the feature interaction between the two tasks. In this paper, we propose a new end-to-end scene text spotting framework termed SwinTextSpotter v2, which seeks to find a better synergy between detection and recognition. Specifically, we enhance the relationship between the two tasks using a novel...

10.48550/arxiv.2401.07641 preprint EN other-oa arXiv (Cornell University) 2024-01-01

Scene text removal (STR) aims at replacing text strokes in natural scenes with visually coherent backgrounds. Recent STR approaches rely on iterative refinements or explicit text masks, resulting in high complexity and sensitivity to the accuracy of text localization. Moreover, most existing methods adopt convolutional architectures while the potential of vision Transformers (ViTs) remains largely unexplored. In this paper, we propose a simple-yet-effective ViT-based text eraser, dubbed ViTEraser. Following a concise...

10.1609/aaai.v38i5.28245 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24
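To give a flavor of what a ViT-based eraser looks like architecturally, here is a tiny, generic encoder-decoder in the ViT style that maps an image to a same-size output; it is a self-contained stand-in for illustration, not ViTEraser itself.

```python
# Generic ViT-style encoder-decoder: patches are encoded with Transformer layers
# and decoded back to an image that should contain background only.
import torch
import torch.nn as nn

class TinyViTEraser(nn.Module):
    def __init__(self, img=64, patch=8, dim=192, depth=4):
        super().__init__()
        self.patch = patch
        self.n = (img // patch) ** 2
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.pos = nn.Parameter(torch.zeros(1, self.n, dim))
        layer = nn.TransformerEncoderLayer(dim, nhead=6, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.to_pixels = nn.Linear(dim, 3 * patch * patch)

    def forward(self, x):
        b, _, h, w = x.shape
        tokens = self.embed(x).flatten(2).transpose(1, 2) + self.pos
        tokens = self.encoder(tokens)
        patches = self.to_pixels(tokens)                       # (B, N, 3*p*p)
        out = patches.transpose(1, 2)                          # (B, 3*p*p, N)
        return nn.functional.fold(out, (h, w), self.patch, stride=self.patch)

print(TinyViTEraser()(torch.randn(1, 3, 64, 64)).shape)  # torch.Size([1, 3, 64, 64])
```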

10.1109/cvpr52733.2024.01482 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

10.1109/cvpr52733.2024.01025 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16