Yaxin Peng

ORCID: 0000-0002-2983-555X
Research Areas
  • Robotics and Sensor-Based Localization
  • Advanced Neural Network Applications
  • Medical Image Segmentation Techniques
  • Domain Adaptation and Few-Shot Learning
  • Multimodal Machine Learning Applications
  • Advanced Image and Video Retrieval Techniques
  • Face and Expression Recognition
  • Adversarial Robustness in Machine Learning
  • Remote-Sensing Image Classification
  • Robot Manipulation and Learning
  • Image Retrieval and Classification Techniques
  • Image and Object Detection Techniques
  • Image and Signal Denoising Methods
  • Video Surveillance and Tracking Methods
  • Model Reduction and Neural Networks
  • 3D Surveying and Cultural Heritage
  • Advanced Image Fusion Techniques
  • Matrix Theory and Algorithms
  • Advanced Steganography and Watermarking Techniques
  • Human Pose and Action Recognition
  • Sparse and Compressive Sensing Techniques
  • 3D Shape Modeling and Analysis
  • Natural Language Processing Techniques
  • Chaos-based Image/Signal Encryption
  • Machine Learning and ELM

Shanghai University
2016-2025

Huaqiao University
2023

Huazhong University of Science and Technology
1992-2021

East China Normal University
2007-2018

Nanyang Technological University
2018

Yanshan University
2015-2016

Hunan University
2004-2009

École Normale Supérieure de Lyon
2007-2008

École Normale Supérieure Paris-Saclay
2008

Shanghai Jiao Tong University
2007

In this paper, we address the semisupervised distance metric learning problem and its applications in classification and image retrieval. First, we formulate a model by considering information of inner classes and interclasses. In this model, an adaptive parameter is designed to balance the inner metrics and intermetrics using the data structure. Second, we convert the model to a minimization problem whose variable is a symmetric positive-definite matrix. Third, in implementation, we deduce an intrinsic steepest descent method, which assures that the matrix is strictly symmetric positive-definite at...

10.1109/tnnls.2017.2691005 article EN IEEE Transactions on Neural Networks and Learning Systems 2017-01-01

Humans possess a unified cognitive ability to perceive, comprehend, and interact with the physical world. Why can't large language models replicate this holistic understanding? Through a systematic analysis of existing training paradigms in vision-language-action (VLA) models, we identify two key challenges: spurious forgetting, where robot training overwrites crucial visual-text alignments, and task interference, where competing control and understanding tasks degrade performance when trained jointly. To overcome these...

10.48550/arxiv.2502.14420 preprint EN arXiv (Cornell University) 2025-02-20

Many deep learning models are vulnerable to the adversarial attack, i.e., imperceptible but intentionally-designed perturbations to the input can cause incorrect output of the networks. In this paper, using information geometry, we provide a reasonable explanation for the vulnerability of deep learning models. By considering the data space as a non-linear space with the Fisher information metric induced from a neural network, we first propose an adversarial attack algorithm termed one-step spectral attack (OSSA). The method is described by a constrained quadratic form of the Fisher information matrix,...

10.1609/aaai.v33i01.33015869 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17

The leaderless consensus of fractional-order multi-agent systems (FOMASs) by the intermittence sampled data control method is investigated in this brief, for which a distributed control protocol is presented to reduce the updating rate and working time of the controllers. Subsequently, the Laplace transform and stability theory are utilized to derive some necessary and sufficient criteria that show the relations among the fractional order, sampling period, communication width, coupling strengths, and network topology. What is more, it can be...

10.1109/tcsii.2019.2912331 article EN IEEE Transactions on Circuits & Systems II Express Briefs 2019-04-25

In the field of autonomous driving, carriers are equipped with a variety of sensors, including cameras and LiDARs. However, the camera suffers from problems of illumination and occlusion, and the LiDAR encounters motion distortion, degenerate environments, and limited ranging distance. Therefore, fusing the information from these two sensors deserves to be explored. In this paper, we propose a fusion network which robustly captures both the image and point cloud descriptors to solve the place recognition problem. Our contribution can be summarized...

10.3390/s20102870 article EN cc-by Sensors 2020-05-19

Quaternion singular value decomposition (QSVD) is a robust technique of digital watermarking that extracts high quality watermarks from watermarked images with low distortion. However, the existing QSVD-based watermarking schemes face the obstacle of "explosion of complexity" and have much room for improvement in terms of real-time performance, invisibility, and robustness. In this paper, we overcome such an obstacle by introducing a new real structure-preserving QSVD algorithm and propose a novel QSVD-based watermarking scheme with high efficiency. Secret information is transmitted...

10.1109/tip.2023.3293773 article EN IEEE Transactions on Image Processing 2023-01-01

10.3901/cjme.2015.0217.019 article EN Chinese Journal of Mechanical Engineering 2015-05-01

This paper proposes a robust dual-color watermarking scheme based on quaternion singular value decomposition (QSVD), which can embed large payloads into color images with low distortion, and obtain strong robustness to process the color image in a holistic manner. First, two notes are proposed for designing the scheme, one of which is about the three correlations found in $U$...

10.1109/access.2020.2973044 article EN cc-by IEEE Access 2020-01-01

Continual Learning enables models to learn and adapt to new tasks while retaining prior knowledge. Introducing new tasks, however, can naturally lead to feature entanglement across tasks, limiting the model's capability to distinguish between new domain data. In this work, we propose a method called Feature Realignment through Experts on hyperSpHere in Continual Learning (Fresh-CL). By leveraging predefined and fixed simplex equiangular tight frame (ETF) classifiers on a hypersphere, our model improves feature separation both intra and inter tasks. However,...

10.48550/arxiv.2501.02198 preprint EN arXiv (Cornell University) 2025-01-04

Object detection in unmanned aerial vehicle (UAV) remote sensing images poses significant challenges due to unstable image quality, small object sizes, complex backgrounds, and environmental occlusions. Small objects, in particular, occupy minimal portions of images, making their accurate detection highly difficult. Existing multi-scale feature fusion methods address these challenges to some extent by aggregating features across different resolutions. However, these methods often fail to effectively balance classification and localization...

10.48550/arxiv.2501.17983 preprint EN arXiv (Cornell University) 2025-01-29

10.4208/cmr.2024-0022 article EN Communications in Mathematical Research 2025-02-01

10.1109/icassp49660.2025.10889653 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

Multimodal Large Language Models (MLLMs) have showcased impressive skills in tasks related to visual understanding and reasoning. Yet, their widespread application faces obstacles due to the high computational demands during both the training and inference phases, restricting their use to a limited audience within the research and user communities. In this paper, we investigate the design aspects of Multimodal Small Language Models (MSLMs) and propose an efficient multimodal assistant named Mipha, which is designed to create synergy among various aspects:...

10.1609/aaai.v39i10.33194 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2025-04-11

Highly efficient multifunctional materials that exhibit strong microwave absorption and elevated heat conduction are crucial for tackling electromagnetic interference and heat accumulation in miniaturized and integrated electronic systems. Nevertheless, simple...

10.1039/d5ce00429b article EN CrystEngComm 2025-01-01