- Handwritten Text Recognition Techniques
- Advanced Image and Video Retrieval Techniques
- Advanced Neural Network Applications
- Anomaly Detection Techniques and Applications
- Generative Adversarial Networks and Image Synthesis
- Image Retrieval and Classification Techniques
- Digital Media Forensic Detection
- Natural Language Processing Techniques
- Image Processing and 3D Reconstruction
- Video Analysis and Summarization
- Human Pose and Action Recognition
- Vehicle License Plate Recognition
- Multimodal Machine Learning Applications
- Adversarial Robustness in Machine Learning
- Remote-Sensing Image Classification
- Advanced Vision and Imaging
- Model Reduction and Neural Networks
- Machine Fault Diagnosis Techniques
- Automated Road and Building Extraction
- Domain Adaptation and Few-Shot Learning
- Advanced Image Processing Techniques
- Computer Graphics and Visualization Techniques
- Video Surveillance and Tracking Methods
- Image Processing Techniques and Applications
University of Electronic Science and Technology of China
2019-2024
To achieve high coverage of target boxes, a normal strategy of conventional one-stage anchor-based detectors is to utilize multiple priors at each spatial position, especially in scene text detection tasks. In this work, we present a simple and intuitive method for multi-oriented text detection where each location of the feature maps associates with only one reference box. The idea is inspired by the two-stage R-CNN framework, which can estimate the location of objects of any shape by using learned proposals. The aim of our method is to integrate this mechanism into...
Recently, scene text recognition methods based on deep learning have sprung up in the computer vision area. The existing methods have achieved great performance, but the recognition of irregular text is still challenging due to various shapes and distorted patterns. Consider that when reading words in the real world, we normally do not rectify the text in our mind but adjust our focus and visual fields. Similarly, by utilizing deformable convolutional layers whose geometric structures are adjustable, we present an enhanced recognition network without rectification steps...
Previous feature alignment methods in unsupervised domain adaptation (UDA) mostly align only global features without considering the mismatch between class-wise features. In this work, we propose a new coarse-to-fine feature alignment method using contrastive learning, called CFContra. It draws class-wise features closer than coarse feature alignment or class-wise feature alignment only, and therefore improves the model's performance to a great extent. We build it upon one of the most effective UDA methods, entropy minimization, to further improve performance. In particular, to prevent excessive memory...
Image synthesis is a critical task in various computer vision technologies, and many methods have tried to translate semantic images into realistic ones for controllable synthesis. With increasing image resolution, the networks are becoming larger, and related applications are restricted. To alleviate this problem, we propose a lightweight mutable network. The network is based on generative adversarial networks. We introduce a feature pyramid architecture in the generator to reduce the number of hidden nodes. We also design a scheme where...
Image synthesis is a critical technique in the image processing field. Recently, generative adversarial networks (GANs) have played a significant role in image synthesis tasks. However, the issue of mode collapse remains a major challenge for GANs, which limits their potential applications. We propose a method to address this problem. Our approach focuses on minimizing the divergence between the distributions of real and generated features, thereby reducing the learning pressure on the discriminator. An advantage of our method is that it does not require prior...
Mode collapse is a significant unsolved issue of generative adversarial networks (GANs). In this work, we examine the causes of mode collapse from a novel perspective. Due to nonuniform sampling in the training process, some subdistributions may be missed when sampling data. As a result, even when the generated distribution differs from the real one, the GAN objective can still achieve its minimum. To address this issue, we propose a global distribution fitting (GDF) method with a penalty term to confine the generated data distribution. When the generated distribution differs from the real one, GDF will make the objective harder to reach the minimal value,...
At present, object detection performance can meet the requirements of some routine tasks. However, the detection performance for small-sized objects is far from satisfactory. Therefore, we propose a feature layer attention module and a nonlinear positioning loss penalty based on object size to improve small object detection performance. The module introduces an attention mechanism to enhance the model's detection of small objects. Through the fusion scheme proposed in this paper, we solve the problem of insufficient features to a certain extent and reduce the difficulty of model...
The interpretability of Convolutional Neural Networks (CNNs) is an important topic in the field of computer vision. In recent years, works in this field have generally adopted a mature model to reveal the internal mechanism of CNNs, helping to understand CNNs thoroughly. In this paper, we argue that the working mechanism of CNNs can be revealed through a totally different interpretation, by comparing communication systems and CNNs. This paper obtains the corresponding relationship between the modules of the two and verifies its rationality with experiments....
High Resolution Remote Sensing Images (HRRSIs) are usually larger than natural images. Because of the limitation of GPU memory, it is not possible to train semantic segmentation models on HRRSIs directly. Commonly used methodologies perform training and prediction on cropped sub-images. Thus they fail to model the potential dependencies between pixels beyond the sub-images. To solve this problem, we first propose an extra context attention to capture global information from receptive fields discriminative...
In the image-to-image translation field, most researchers tend to achieve the overall translation of images without paying much attention to the texture details of the images. However, it is also of great importance to have enhanced and more realistic textures for synthesized images, which could bring better impressions. Therefore, in this work, we propose a method based on CycleGAN in which the texture details of the output images are highly improved. The presented generator involves dilated convolutions, which are conducive to processing image details. Furthermore, an improved...