Xin Yuan

ORCID: 0000-0003-3140-3243
Research Areas
  • Advanced Image and Video Retrieval Techniques
  • Video Surveillance and Tracking Methods
  • Advanced Neural Network Applications
  • Image Enhancement Techniques
  • Domain Adaptation and Few-Shot Learning
  • Face and Expression Recognition
  • Multimodal Machine Learning Applications
  • Advanced Image Processing Techniques
  • Image and Signal Denoising Methods
  • Human Pose and Action Recognition
  • Face recognition and analysis
  • Robotics and Sensor-Based Localization
  • Image Retrieval and Classification Techniques
  • Gaze Tracking and Assistive Technology
  • Advanced Vision and Imaging
  • Neural Networks and Applications
  • Handwritten Text Recognition Techniques
  • Gait Recognition and Analysis
  • Color Science and Applications
  • Radiomics and Machine Learning in Medical Imaging
  • Generative Adversarial Networks and Image Synthesis
  • Biometric Identification and Security
  • Visual Attention and Saliency Detection
  • Image Processing Techniques and Applications
  • Remote-Sensing Image Classification

Wuhan University of Science and Technology
2020-2025

China Ocean Shipping (China)
2025

University of Chicago
2020-2024

Tsinghua University
2017-2024

China Geological Survey
2024

China Electronics Technology Group Corporation
2024

Hefei University of Technology
2022-2023

University of Electronic Science and Technology of China
2019-2022

Waseda University
2017-2021

Tencent (China)
2019-2021

Video streaming is crucial for AI applications that gather videos from sources to servers for inference by deep neural nets (DNNs). Unlike traditional video streaming that optimizes visual quality, this new type of video streaming permits aggressive compression/pruning of pixels not relevant to achieving high DNN inference accuracy. However, much of this potential is left unrealized, because current video streaming protocols are driven by the video source (camera) where the compute is rather limited. We advocate that the video streaming protocol should be driven by real-time feedback from the server-side DNN. Our insight...

10.1145/3387514.3405887 article EN 2020-07-30

We develop an approach to learning visual representations that embraces multimodal data, driven by a combination of intra- and inter-modal similarity preservation objectives. Unlike existing visual pre-training methods, which solve a proxy prediction task in a single domain, our method exploits intrinsic data properties within each modality and semantic information from cross-modal correlation simultaneously, hence improving the quality of the learned visual representations. By including multimodal training in a unified framework with...

10.1109/cvpr46437.2021.00692 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Cross-modality face recognition is an emerging topic due to the widespread usage of different sensors in day-to-day life applications. The development of face recognition systems relies greatly on existing databases for evaluation and for obtaining training examples for data-hungry machine learning algorithms. However, currently, there is no publicly available face database that includes more than two modalities for the same subject. In this work, we introduce the Tufts Face Database, which includes images acquired in various modalities: photograph images,...

10.1109/tpami.2018.2884458 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2018-11-30

Current visible-infrared cross-modality person re-identification research has only focused on exploring the bi-modality mutual retrieval paradigm, and we propose a new and more practical mix-modality retrieval paradigm. Existing Visible-Infrared person re-identification (VI-ReID) methods have achieved some results in the bi-modality mutual retrieval paradigm by learning the correspondence between the visible and infrared modalities. However, significant performance degradation occurs due to the modality confusion problem when these methods are applied to the new mix-modality paradigm. Therefore, this paper proposes...

10.1145/3715142 article EN ACM Transactions on Multimedia Computing Communications and Applications 2025-01-28

This paper presents a discrepancy minimizing model to address the discrete optimization problem in hashing learning. The discrete optimization introduced by the binary constraint is an NP-hard mixed integer programming problem. It is usually addressed by relaxing the binary variables into continuous variables to adapt to the gradient-based learning of hashing functions, especially the training of deep neural networks. To deal with the objective discrepancy caused by relaxation, we transform the original binary optimization into a differentiable optimization problem over hash functions through series expansion. This transformation decouples...

10.1109/cvpr.2018.00715 article EN 2018-06-01

Designing an effective loss function plays an important role in visual analysis. Most existing loss function designs rely on hand-crafted heuristics that require domain experts to explore the large design space, which is usually sub-optimal and time-consuming. In this paper, we propose AutoML for Loss Function Search (AM-LFS), which leverages REINFORCE to search loss functions during the training process. The key contribution of this work is the design of a search space that can guarantee generalization and transferability on different vision tasks by including a bunch...

10.1109/iccv.2019.00850 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

In this paper, we address the challenging unconstrained set-based face recognition problem where each subject face is instantiated by a set of media (images and videos) instead of a single image. Naively aggregating information from all the media within a set would suffer from the large intra-set variance caused by heterogeneous factors (e.g., varying media modalities, poses and illumination) and fail to learn discriminative face representations. A novel Multi-Prototype Network (MP-Net) model is thus proposed to learn multiple prototype face representations...

10.24963/ijcai.2019/611 article EN 2019-07-28

Person re-identification (Re-ID) aims to retrieve all images of a specific person captured by non-overlapping cameras and scenarios. Regardless of the significant success achieved by daytime person Re-ID methods, they perform poorly due to the degraded imaging quality under low-light conditions. Therefore, some works attempt to synthesize low-light images to explore the challenges in the nighttime, which omits the fact that synthetic images may not realistically reflect the challenges of person Re-ID at night. Moreover, other works follow the "enhancement-then-match" manner, but it is...

10.3390/s25030862 article EN cc-by Sensors 2025-01-31

10.1109/icassp49660.2025.10888758 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

Diffusion models (DMs) have recently been introduced in image deblurring and exhibited promising performance, particularly in terms of details reconstruction. However, the diffusion model requires a large number of inference iterations to recover the clean image from pure Gaussian noise, which consumes massive computational resources. Moreover, the distribution synthesized by the diffusion model is often misaligned with the target results, leading to restrictions in distortion-based metrics. To address the above issues, we propose the Hierarchical...

10.48550/arxiv.2305.12966 preprint EN other-oa arXiv (Cornell University) 2023-01-01

We propose an efficient diffusion-based text-to-video super-resolution (SR) tuning approach that leverages the readily learned capacity of a pixel-level image diffusion model to capture spatial information for video generation. To accomplish this goal, we design an efficient architecture by inflating the weightings of the text-to-image SR model into our video generation framework. Additionally, we incorporate a temporal adapter to ensure temporal coherence across video frames. We investigate different tuning approaches based on our inflated architecture and report trade-offs...

10.1109/wacvw60836.2024.00059 article EN 2024-01-01

Person re-identification (re-ID) is commonly investigated as a ranking problem. However, the performance of existing re-ID models drops dramatically when they encounter extreme positive-negative class imbalance (e.g., a very small ratio of positive to negative samples) during training. To alleviate this problem, this article designs a rank-in-rank loss to optimize the distribution of feature embeddings. Specifically, we propose a Differentiable Retrieval-Sort Loss (DRSL) to optimize the re-ID model by ranking each positive sample ahead of the negative samples...

10.1145/3532866 article EN ACM Transactions on Multimedia Computing Communications and Applications 2022-04-30

The goal of re-identification (re-ID) is to find an object (e.g., person or vehicle) of interest across cameras. In re-ID, designing suitable and effective loss functions plays an essential role in learning identifiable features. Regardless of the significant success achieved by using retrieval- and verification-based loss functions, since re-ID can be formulated as a retrieval or verification task, the model performance might be degraded owing to the inconsistency between the loss functions and the evaluation metrics. Moreover, current hand-designed...

10.1109/jstsp.2023.3250989 article EN IEEE Journal of Selected Topics in Signal Processing 2023-03-01

We develop an approach to growing deep network architectures over the course of training, driven by a principled combination of accuracy and sparsity objectives. Unlike existing pruning or architecture search techniques that operate on full-sized models or supernet architectures, our method can start from a small, simple seed architecture and dynamically grow and prune both layers and filters. By combining a continuous relaxation of discrete network structure optimization with a scheme for sampling sparse subnetworks, we produce compact,...

10.48550/arxiv.2007.15353 preprint EN other-oa arXiv (Cornell University) 2020-01-01

The Journal of Biomedical Optics (JBO) is a Gold Open Access journal that publishes peer-reviewed papers on the use of novel optical systems and techniques for improved health care and biomedical research.

10.1117/1.1628244 article EN Journal of Biomedical Optics 2004-01-01

This paper studied the spatial distribution and influencing factors of heavy metals (HMs) such as Cu, Pb, Zn, Cr, Ni, Cd and As in the soil of Linzhou County in the Lhasa River basin. By collecting 504 surface soil samples and using descriptive statistics, Kriging interpolation and geoaccumulation index methods, combined with the geographic detector model, the characteristics of HMs content and its interaction with 19 environmental factors were systematically analyzed. The results showed that the HMs content in this area was generally higher than the background value...

10.1038/s41598-024-78910-5 article EN cc-by-nc-nd Scientific Reports 2024-11-21
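
The geoaccumulation index named in the abstract above is commonly computed as Igeo = log2(C / (k * B)), where C is the measured concentration, B the geochemical background value and k = 1.5 a conventional correction factor. The sketch below is a minimal illustration under that standard definition, not the authors' pipeline; the function names and sample values are hypothetical, and class boundaries vary slightly between studies.

```python
import math


def geoaccumulation_index(concentration, background, k=1.5):
    """Igeo = log2(C / (k * B)); k = 1.5 is the conventional factor (assumed here)."""
    if concentration <= 0 or background <= 0:
        raise ValueError("concentration and background must be positive")
    return math.log2(concentration / (k * background))


def igeo_class(igeo):
    """Map Igeo to the usual seven classes (0 = unpolluted ... 6 = extremely polluted)."""
    if igeo <= 0:
        return 0
    return min(6, math.ceil(igeo))


# Hypothetical sample: 30 mg/kg measured against a 10 mg/kg background.
value = geoaccumulation_index(30.0, 10.0)
print(value, igeo_class(value))
```

A value at or below zero means the sample does not exceed 1.5 times the background, which is why the background value matters so much when interpreting results of this kind.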

Designing an effective loss function plays an important role in visual analysis. Most existing loss function designs rely on hand-crafted heuristics that require domain experts to explore the large design space, which is usually sub-optimal and time-consuming. In this paper, we propose AutoML for Loss Function Search (AM-LFS), which leverages REINFORCE to search loss functions during the training process. The key contribution of this work is the design of a search space that can guarantee generalization and transferability on different vision tasks by including a bunch...

10.48550/arxiv.1905.07375 preprint EN other-oa arXiv (Cornell University) 2019-01-01

In this paper, we propose an Enhanced Bayesian Compression (EBC) method to flexibly compress deep networks via reinforcement learning. Unlike the existing Bayesian compression method, which cannot explicitly enforce quantization of weights during training, our method learns flexible codebooks in each layer for optimal network quantization. To dynamically adjust the state of the codebooks, we employ an Actor-Critic network to collaborate with the original deep network. Different from most existing network quantization methods, our EBC does not require re-training procedures after...

10.1109/cvpr.2019.00711 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Image super-resolution (SR) methods typically model degradation to improve reconstruction accuracy in complex and unknown degradation scenarios. However, extracting degradation information from low-resolution images is challenging, which limits the model performance. To boost image SR performance, one feasible approach is to introduce additional priors. Inspired by advancements in multi-modal methods and text prompt image processing, we introduce text prompts to image SR to provide degradation priors. Specifically, we first design a text-image generation pipeline to integrate text into the SR dataset through...

10.48550/arxiv.2311.14282 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Previous studies recognize pain expressions based on the entire face, for example, Prkachin and Solomon Pain Intensity (PSPI). However, a patient's face is often masked by instruments in an intensive care unit (ICU), such as a respirator and gauzes, just to name a few, which means the agent cannot measure pain intensity using PSPI directly. To tackle this problem, we explore pain recognition from the masked face. First, we conducted four levels of pain measurement experiments with four types of Swin-Transformer. Experiment results show that the accuracy...

10.1109/robio55434.2022.10011731 article EN 2022 IEEE International Conference on Robotics and Biomimetics (ROBIO) 2022-12-05
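
For context on the PSPI score mentioned in the abstract above: it is usually defined from facial action unit (AU) intensities as AU4 + max(AU6, AU7) + max(AU9, AU10) + AU43. The sketch below only illustrates that standard definition; the function name and the sample intensities are hypothetical and are not taken from the paper.

```python
def pspi_score(au4, au6, au7, au9, au10, au43):
    """Prkachin and Solomon Pain Intensity from action unit intensities.

    au4: brow lowering, au6/au7: cheek raise / lid tightening,
    au9/au10: nose wrinkle / upper lip raise (each typically 0-5),
    au43: eye closure (0 or 1).
    """
    return au4 + max(au6, au7) + max(au9, au10) + au43


# Hypothetical intensities for a single video frame.
print(pspi_score(au4=2, au6=3, au7=1, au9=0, au10=4, au43=1))
```

Because the score draws on the brows, eyes, nose and upper lip together, covering part of the face removes some of these terms, which is the difficulty the abstract describes.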

In existing remote sensing image retrieval (RSIR) datasets, the number of images among different classes varies dramatically, which leads to a severe class imbalance problem. Some studies propose to train the model with a ranking‐based metric (e.g., average precision [AP]), because AP is robust to class imbalance. However, current AP‐based methods overlook an important issue: they only optimise the samples ranking before each positive sample, which is limited by the definition of AP and is prone to local optimum. To achieve global...

10.1049/cit2.12151 article EN cc-by-nc-nd CAAI Transactions on Intelligence Technology 2023-03-28
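
The remark in the abstract above about AP only considering samples ranked before each positive follows directly from the standard definition of average precision. The sketch below is a minimal illustration of that textbook metric, not the loss proposed in the paper; the function name and the toy rankings are hypothetical.

```python
def average_precision(ranked_labels):
    """Average precision of a ranked list of 0/1 relevance labels (best rank first)."""
    hits = 0
    precision_sum = 0.0
    for rank, is_positive in enumerate(ranked_labels, start=1):
        if is_positive:
            hits += 1
            # Precision at this positive depends only on items ranked at or before it.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Appending negatives after the last positive leaves AP unchanged.
print(average_precision([1, 0, 1, 0]))
print(average_precision([1, 0, 1, 0, 0, 0]))
```

Both calls return the same value, which shows that negatives ranked below every positive contribute nothing to the metric.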