Xinyu Wang

ORCID: 0000-0002-0151-9133
Research Areas
  • Generative Adversarial Networks and Image Synthesis
  • Image and Signal Denoising Methods
  • Image Enhancement Techniques
  • Digital Media Forensic Detection
  • Advanced Neural Network Applications
  • Advanced Image and Video Retrieval Techniques
  • Optical measurement and interference techniques
  • Handwritten Text Recognition Techniques
  • Advanced Image Processing Techniques
  • AI in cancer detection
  • Hand Gesture Recognition Systems
  • Natural Language Processing Techniques
  • Human Pose and Action Recognition
  • Underwater Vehicles and Communication Systems
  • Multimodal Machine Learning Applications
  • Image Retrieval and Classification Techniques
  • Advanced Vision and Imaging
  • Speech Recognition and Synthesis
  • Image Processing Techniques and Applications
  • Adversarial Robustness in Machine Learning
  • Advanced Technologies in Various Fields
  • Advanced Graph Neural Networks
  • Brain Tumor Detection and Classification
  • Artificial Intelligence in Healthcare
  • Gait Recognition and Analysis

North China University of Water Resources and Electric Power
2023-2025

University of Macau
2024

Baidu (China)
2024

Civil Aviation University of China
2023-2024

Chongqing University of Posts and Telecommunications
2022-2023

Donghua University
2023

Harbin Huade University
2023

University of Southern California
2022

Beijing Sport University
2022

Peking University
2020

Underwater object detection is an important computer vision task that has been widely used in marine life identification and tracking. However, problems such as low contrast, occlusion, unbalanced lighting, and small, dense objects bring a series of challenges to underwater detection. Considering these challenges, several methods have been proposed to extract features more efficiently. The attention mechanism has proven powerful for feature extraction. The attention ignores internal...

10.1117/12.3057861 article EN 2025-02-13

A novel method for detecting CNN-generated images, called Attentive PixelHop (or A-PixelHop), is proposed in this work. It has three advantages: 1) low computational complexity and a small model size, 2) high detection performance against a wide range of generative models, and 3) mathematical transparency. A-PixelHop is designed under the assumption that it is difficult to synthesize high-quality, high-frequency components in local regions. It contains four building modules: selecting edge/texture blocks...

10.1109/icassp43922.2022.9747901 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

The homophily principle, i.e., that nodes with the same labels or similar attributes are more likely to be connected, has been commonly believed to be the main reason for the superiority of Graph Neural Networks (GNNs) over traditional Neural Networks (NNs) on graph-structured data, especially on node-level tasks. However, recent work has identified a non-trivial set of datasets where GNNs' performance compared to NNs' is not satisfactory. Heterophily, i.e., low homophily, has been considered the cause of this empirical observation. People have begun to revisit...

10.48550/arxiv.2407.09618 preprint EN arXiv (Cornell University) 2024-07-12
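
The homophily notion in the preprint above can be made concrete with the edge homophily ratio, the fraction of edges that join two nodes carrying the same label. The following is a minimal sketch under stated assumptions: the function name, the toy edge list, and the labels are hypothetical illustrations, not data or code from the preprint.

```python
def edge_homophily(edges, labels):
    """Return the fraction of edges whose two endpoints share a label.

    edges  : list of (u, v) node-id pairs (hypothetical toy input)
    labels : dict mapping node id -> class label
    """
    if not edges:
        raise ValueError("edge list must not be empty")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Toy 4-node cycle: two edges join same-label nodes, two do not.
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
toy_labels = {0: "a", 1: "a", 2: "b", 3: "b"}
ratio = edge_homophily(toy_edges, toy_labels)
print(ratio)
```

A ratio near 1 indicates a homophilous graph, while a ratio near 0 indicates a heterophilous one.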

Medical multi-modal retrieval aims to provide doctors with similar medical images from different modalities, which can greatly promote the efficiency and accuracy of clinical diagnosis. However, most existing methods hardly support multi-modal images, i.e., cases where the number of modalities is greater than 2, and just convert retrieval into classification or clustering. This does little to bridge the gap between visual information and semantic information across image modalities. To solve the problem, a Supervised Contrast Learning method based on Multiple Pseudo-...

10.1145/3637441 article EN ACM Transactions on Multimedia Computing Communications and Applications 2023-12-13

Deep learning-based algorithms for enhancing underwater images have demonstrated outstanding performance in recent years. However, numerous limitations exist in applying existing methods to various environments, viewed as water types. The lack of generalization ability and visual consistency remains a significant challenge, as these methods neglect the domain gaps between scenes and the differences in the multi-frequency nature of information. A critical desideratum of image enhancement is to establish a strong correspondence between...

10.1117/1.jei.33.5.053035 article EN Journal of Electronic Imaging 2024-10-10

10.1561/116.00000005 article EN cc-by-nc APSIPA Transactions on Signal and Information Processing 2022-01-01

In this paper, we present a robust method of sketch recognition for course of action (COA) diagrams. User input is through free-hand sketching. COA symbols are recognized incrementally, and the informal sketching is replaced with formal vector graphs or images of symbols. Multi-stroke sketching is divided into temporally continuous uni-stroke shapes. First, the method removes the over-crossed end part. Then, it extracts invariant geometric features of the convex hull, the largest-area inscribed and smallest-area enclosing polygons, perimeter...

10.1109/icma.2010.5588659 article EN 2010-08-01

Cross-modal hashing is an efficient method to embed high-dimensional heterogeneous modal feature descriptors into a low-dimensional, consistency-preserving Hamming space. Most existing cross-modal hashing methods have been able to bridge the modality gap, but there are still two challenges resulting in limited retrieval accuracy: (1) ignoring the continuous similarity of samples on the manifold; (2) a lack of discriminability of hash codes with the same semantics. To cope with these problems, we propose Deep...

10.1038/s41598-023-29320-6 article EN cc-by Scientific Reports 2023-02-09
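
Retrieval in a Hamming space, as described in the article above, amounts to ranking stored binary codes by the number of bit positions in which they differ from the query code. The following is a minimal sketch under stated assumptions: the function names and the 4-bit codes are hypothetical, and the sketch shows only the ranking step, not the article's learning method.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two integer-encoded codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: list) -> list:
    """Return database indices ordered from nearest to farthest code."""
    return sorted(range(len(database)),
                  key=lambda i: hamming_distance(query, database[i]))


# Hypothetical 4-bit hash codes standing in for items of another modality.
toy_query = 0b1011
toy_database = [0b1011, 0b0000, 0b1001, 0b0100]
ranking = rank_by_hamming(toy_query, toy_database)
print(ranking)
```

Because the comparison reduces to an XOR and a bit count, ranking stays cheap even for large collections, which is the usual motivation for hashing-based retrieval.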

A novel method for detecting CNN-generated images, called Attentive PixelHop (or A-PixelHop), is proposed in this work. It has three advantages: 1) low computational complexity and a small model size, 2) high detection performance against a wide range of generative models, and 3) mathematical transparency. A-PixelHop is designed under the assumption that it is difficult to synthesize high-quality, high-frequency components in local regions. It contains four building modules: selecting edge/texture blocks...

10.48550/arxiv.2111.04012 preprint EN cc-by arXiv (Cornell University) 2021-01-01

Alzheimer’s disease (AD) is an irreversible neurological disorder, so early medical diagnosis is extremely important. Magnetic resonance imaging (MRI) is one of the main methods used clinically to detect and diagnose AD. However, most existing computer-aided diagnostic methods only use MRI slices for model architecture design. They ignore the informational differences between all slices. In addition, physicians often use multimodal data, such as images and clinical information, to diagnose patients. The approach helps make...

10.14569/ijacsa.2023.01401108 article EN International Journal of Advanced Computer Science and Applications 2023-01-01

CycleGAN has been a benchmark in the style transfer field, and various extensions with wide applications and excellent performance have been introduced in recent years. However, discussion of its architecture exploration, which could enable us to further understand the concept of the generative model, is scarce. In this paper, several architectures referenced from classical convolutional neural networks are implemented into the generator and discriminator of the CycleGAN model, including AlexNet, DenseNet, GoogLeNet, ResNet....

10.54254/2755-2721/50/20241144 article EN cc-by Applied and Computational Engineering 2024-03-22

Extremely low-light text images are common in natural scenes, making scene text detection and recognition challenging. One solution is to enhance these images using image enhancement methods before text extraction. However, previous methods often do not try to particularly address the significance of low-level features, which are crucial for optimal performance on downstream tasks. Further research is also hindered by the lack of extremely low-light text datasets. To address these limitations, we propose a novel encoder-decoder framework with an edge-aware...

10.48550/arxiv.2404.14135 preprint EN arXiv (Cornell University) 2024-04-22

10.1109/icarcv63323.2024.10821605 article EN 2022 17th International Conference on Control, Automation, Robotics and Vision (ICARCV) 2024-12-12

A statistical attention localization (SAL) method is proposed to facilitate the object classification task in this work. SAL consists of three steps: 1) preliminary window selection via decision statistics, 2) map refinement, and 3) rectangular region finalization. SAL computes soft-decision scores of local squared windows and uses them to identify salient regions in Step 1. To accommodate various sizes and shapes, SAL refines the result to obtain a map of more flexible shape in Step 2. Finally, SAL yields a rectangular region using the refined map and bounding box...

10.48550/arxiv.2208.01823 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Automatically learned aesthetic assessment for images can provide auxiliary value to the fields of art and design. In this study, we explore aesthetics assessment methods on the visual aesthetics of automobile headlamp images, and compare the feasibility of the calculation method with artificial degree evaluation. We take the Image Aesthetics Assessment Network using Graph Attention (AAGN) for headlamps dataset calibration. To enable testing of the effectiveness of AAGN, we apply entropy weight, as well as approaches based on deep Convolutional...

10.21203/rs.3.rs-2704318/v1 preprint EN cc-by Research Square (Research Square) 2023-03-30

The distribution of underwater images is diverse due to the varied scattering and absorption of light in different water types. However, most existing enhancement methods typically focus on single-frequency information and a single water type, thereby limiting their applicability across scenes. The key challenge is to achieve consistency between the learned features of different water types while preserving multi-frequency information. Thus, we propose a domain-guided image network (DGMF), which guides the decoder to learn...

10.2139/ssrn.4557550 preprint EN 2023-01-01

When the digital speckle correlation method takes images in some working conditions, the left and right images appear weakly correlated due to the extreme inclination of the camera, which leads to difficulty in matching. In order to solve the above problem, a stereo matching algorithm based on epipolar correction is proposed in this paper, and an estimation of the deformational iterative initial values of the first-order shape function process is given through theoretical derivation. The Newton-Raphson (NR) method can be directly used for fine matching of the original...

10.21203/rs.3.rs-3671043/v1 preprint EN cc-by Research Square (Research Square) 2023-11-30

End-to-end scene text spotting has made significant progress due to its intrinsic synergy between text detection and recognition. Previous methods commonly regard manual annotations such as horizontal rectangles, rotated quadrangles, and polygons as a prerequisite, which are much more expensive than using a single point. Our new framework, SPTS v2, allows us to train high-performing text-spotting models using single-point annotation. SPTS v2 reserves the advantage of the auto-regressive Transformer with an Instance...

10.48550/arxiv.2301.01635 preprint EN cc-by-nc-sa arXiv (Cornell University) 2023-01-01