- Adversarial Robustness in Machine Learning
- Generative Adversarial Networks and Image Synthesis
- Digital Media Forensic Detection
- Multimodal Machine Learning Applications
- High-Velocity Impact and Material Behavior
- Advanced Neural Network Applications
- 3D Shape Modeling and Analysis
- Face recognition and analysis
- Anomaly Detection Techniques and Applications
- Advanced Steganography and Watermarking Techniques
- Integrated Circuits and Semiconductor Failure Analysis
- 3D Surveying and Cultural Heritage
- Advanced Control Systems Optimization
- Advanced Graph Neural Networks
- Hydrology and Watershed Management Studies
- Advanced Control Systems Design
- Flood Risk Assessment and Management
- Advanced Algorithms and Applications
- Domain Adaptation and Few-Shot Learning
- Metaheuristic Optimization Algorithms Research
- Machine Learning in Healthcare
- Advanced Image Processing Techniques
- Biometric Identification and Security
- Physical Unclonable Functions (PUFs) and Hardware Security
- Acupuncture Treatment Research Studies
Second Artillery General Hospital of Chinese People's Liberation Army
2024-2025
University of Science and Technology of China
2020-2024
Jinan University
2024
Nanjing University of Chinese Medicine
2021
Suihua University
2018
Traditional watermarking algorithms have been extensively studied. As an important type of watermarking scheme, template-based approaches maintain a very high embedding rate. In such a scheme, the message is often represented by some specially designed templates, and the embedding process is then carried out by an additive operation between the templates and the host image. To resist potential distortions, these templates need to contain special statistical features so that they can be successfully recovered at the extracting side. But in existing...
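The additive template embedding described in the abstract above can be illustrated with a minimal sketch. It assumes a classic spread-spectrum-style setup that is not taken from the paper: one pseudo-random ±1 template per message bit, a flattened grayscale host, and extraction by the sign of the correlation with each template. The function names, the `key`, and the `strength` value are all hypothetical, and pixel clipping is ignored for brevity.

```python
import random


def make_templates(n_bits, n_pixels, key):
    """One pseudo-random +/-1 template per message bit, derived from a shared key."""
    rng = random.Random(key)
    return [[rng.choice((-1.0, 1.0)) for _ in range(n_pixels)]
            for _ in range(n_bits)]


def embed(host, bits, templates, strength=4.0):
    """Additive embedding: add +template for a 1 bit and -template for a 0 bit."""
    marked = list(host)
    for bit, template in zip(bits, templates):
        sign = 1.0 if bit else -1.0
        for i, t in enumerate(template):
            marked[i] += strength * sign * t
    return marked


def extract(marked, templates):
    """Recover each bit from the sign of the correlation with its template."""
    mean = sum(marked) / len(marked)
    bits = []
    for template in templates:
        corr = sum((m - mean) * t for m, t in zip(marked, template))
        bits.append(1 if corr > 0 else 0)
    return bits


def demo():
    """Embed a 4-bit message in a random 128x128 host and extract it after mild noise."""
    rng = random.Random(1)
    host = [rng.uniform(0, 255) for _ in range(128 * 128)]  # stand-in for an image
    message = [1, 0, 1, 1]
    templates = make_templates(len(message), len(host), key=42)
    marked = embed(host, message, templates)
    noisy = [m + rng.gauss(0, 5) for m in marked]  # simulated distortion
    return message, extract(noisy, templates)
```

Calling `demo()` returns the original and the recovered bits for comparison. The correlation test works here because a zero-mean random template is nearly uncorrelated with the host, which is one simple instance of the "special statistical features" the abstract mentions.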
Recent research shows that deep neural networks are vulnerable to different types of attacks, such as adversarial attacks, data poisoning attacks and backdoor attacks. Among them, the backdoor attack is the most cunning one and can occur in almost every stage of the deep learning pipeline. Therefore, the backdoor attack has attracted a lot of interest from both academia and industry. However, most existing backdoor attack methods are either visible or fragile to some effortless pre-processing such as common data transformations. To address these limitations, we propose a robust and invisible backdoor attack called...
Adversary and invisibility are two fundamental but conflicting characteristics of adversarial perturbations. Previous adversarial attacks on 3D point cloud recognition have often been criticized for their noticeable point outliers, since they just involve an "implicit constraint" like a global distance loss in the time-consuming optimization to limit the generated noise. While a point cloud is a highly structured data format, it is hard to constrain its perturbation properly with a simple loss or metric. In this paper, we propose a novel Point-Cloud...
We present Diversity-Aware Meta Visual Prompting (DAM-VP), an efficient and effective prompting method for transferring pre-trained models to downstream tasks with a frozen backbone. A challenging issue in visual prompting is that image datasets sometimes have a large data diversity, whereas a per-dataset generic prompt can hardly handle the complex distribution shift toward the original pretraining data distribution properly. To address this issue, we propose a dataset diversity-aware prompting strategy whose initialization is realized by a Meta-prompt....
Notwithstanding the prominent performance shown in various applications, point cloud recognition models have often suffered from natural corruptions and adversarial perturbations. In this paper, we delve into boosting the general robustness of point cloud recognition models, proposing Point-Cloud Contrastive Adversarial Training (PointCAT). The main intuition of PointCAT is encouraging the target recognition model to narrow the decision gap between clean point clouds and corrupted point clouds by devising feature-level constraints rather than logit-level...
Benefiting from the development of generative adversarial networks (GAN), facial manipulation has recently achieved significant progress in both academia and industry. It inspires an increasing number of entertainment applications but also incurs severe threats to individual privacy and even political security. To mitigate such risks, many countermeasures have been proposed. However, the great majority of methods are designed in a passive manner, which is to detect whether the facial images or videos are tampered...
Recent advancements in image relighting models, driven by large-scale datasets and pre-trained diffusion models, have enabled the imposition of consistent lighting. However, video relighting still lags, primarily due to the excessive training costs and the scarcity of diverse, high-quality video relighting datasets. A simple application of image relighting models on a frame-by-frame basis leads to several issues: lighting source inconsistency and relighted appearance inconsistency, resulting in flickers in the generated videos. In this work, we propose Light-A-Video,...
Recent multimodal large language models (MLLMs) have demonstrated significant potential in open-ended conversation, generating more accurate and personalized responses. However, their abilities to memorize, recall, and reason in sustained interactions within real-world scenarios remain underexplored. This paper introduces MMRC, a Multi-Modal Real-world Conversation benchmark for evaluating six core abilities of MLLMs: information extraction, multi-turn reasoning, information update, image management, memory recall, answer...
In this paper, we investigate the adversarial robustness of vision transformers that are equipped with BERT pretraining (e.g., BEiT, MAE). A surprising observation is that MAE has significantly worse adversarial robustness than other BERT pretraining methods. This observation drives us to rethink the basic differences between these BERT pretraining methods and how these differences affect the robustness against adversarial perturbations. Our empirical analysis reveals that the adversarial robustness of BERT pretraining is highly related to the reconstruction target, i.e., predicting the raw pixels of masked image patches will degrade the adversarial robustness of the model more than predicting the semantic context, since it guides...
Deep 3D point cloud models are sensitive to adversarial attacks, which poses threats to safety-critical applications such as autonomous driving. Robust training and defend-by-denoising are typical strategies for defending against adversarial perturbations. However, they either induce massive computational overhead or rely heavily upon specified priors, limiting generalized robustness against attacks of all kinds. To remedy this, this paper introduces a novel distortion-aware defense framework that can rebuild the...
This work discusses the performance assessment and optimization of a class of control systems under load disturbance. Unlike the stochastic evaluation method, which requires relevant information to identify models, this paper summarizes and applies a deterministic evaluation method based on two indicators, namely the Idle index (II) and the Area index (AI). Applying the method to evaluate and optimize the control loop in real time, by collecting operation data of the system online, will guide and help factory operators maintain the normal operation of the system. The method studied in this paper is applied to a project...
With the deterioration of the climate, the phenomenon of rain-induced flooding has become frequent. To mitigate its impact, recent works adopt convolutional neural networks or their variants to predict floods. However, these methods directly force the model to reconstruct the raw pixels of flood images through a global constraint, overlooking the underlying information contained in terrain features and rainfall patterns. To address this, we present a novel framework for precise flood map prediction, which incorporates hierarchical...
Hallucination, posed as a pervasive challenge of multi-modal large language models (MLLMs), has significantly impeded their real-world usage that demands precise judgment. Existing methods mitigate this issue either by training with specifically designed data or by inferencing with external knowledge from other sources, incurring inevitable additional costs. In this paper, we present OPERA, a novel MLLM decoding method grounded in an Over-trust Penalty and a Retrospection-Allocation strategy, serving as a nearly free...
Despite the success of diffusion-based customization methods on visual content creation, increasing concerns have been raised about such techniques from both privacy and political perspectives. To tackle this issue, several anti-customization methods have been proposed in very recent months, predominantly grounded in adversarial attacks. Unfortunately, most of these methods adopt straightforward designs, such as end-to-end optimization with a focus on adversarially maximizing the original training loss, thereby neglecting nuanced...
In this paper, aiming at the problems of slow estimation speed and low precision of the traditional fractional-order system (FOS) parameter estimation method, an improved Archimedes optimization algorithm (IAOA) is proposed to calculate the optimal parameter value. By establishing the parameter estimation model and the cost function, the parameter estimation problem is formulated as an optimization problem. As opposed to the Archimedes optimization algorithm (AOA), IAOA introduces three improvements: leadership behavior, Levy flight behavior and a new adaptive strategy. This paper verifies the performance of IAOA by selecting 10 classic test functions....
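The abstract above does not say how the Levy flight behavior is generated. As a hedged illustration only, the sketch below uses Mantegna's algorithm, a common way to draw Levy-distributed step lengths in metaheuristics. The function names, the exponent `beta = 1.5`, the `scale` factor, and the position update in `levy_perturb` are assumptions for illustration, not the IAOA update rule.

```python
import math
import random


def mantegna_sigma(beta: float) -> float:
    """Standard deviation of the numerator Gaussian in Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)


def levy_step(beta: float = 1.5, rng=None) -> float:
    """Draw one heavy-tailed step length: u / |v|^(1/beta)."""
    rng = rng if rng is not None else random
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against an (extremely unlikely) division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def levy_perturb(position, best, scale=0.01, beta=1.5, rng=None):
    """Hypothetical update: jitter each coordinate relative to the best-known solution."""
    return [x + scale * levy_step(beta, rng) * (x - b)
            for x, b in zip(position, best)]
```

Most steps drawn this way are small, with occasional long jumps, which is the usual motivation for adding Levy flights to a search algorithm: local refinement with a chance of escaping local optima.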
We present the Modality Integration Rate (MIR), an effective, robust, and generalized metric to indicate the multi-modal pre-training quality of Large Vision Language Models (LVLMs). Large-scale pre-training plays a critical role in building capable LVLMs, while evaluating its training quality without the costly supervised fine-tuning stage is under-explored. Loss, perplexity, and in-context evaluation results are commonly used pre-training metrics for Large Language Models (LLMs), while we observed that these metrics are less indicative when aligning a well-trained LLM with...
In large vision-language models (LVLMs), images serve as inputs that carry a wealth of information. As the idiom "A picture is worth a thousand words" implies, representing a single image in current LVLMs can require hundreds or even thousands of tokens. This results in significant computational costs, which grow quadratically as the input image resolution increases, thereby severely impacting the efficiency of both training and inference. Previous approaches have attempted to reduce the number of image tokens either before or within...
Parameter estimation is important in the study of control and synchronization of fractional-order nonlinear systems (FONSs). This paper proposes an improved Sparrow Search Algorithm (ISSA) for the parameter estimation problem of FONSs. Based on the Sparrow Search Algorithm (SSA), the algorithm improves the population initialization and the position update methods of the discoverers and the warning sparrows, and a simulation experiment on the financial system and the L system is conducted to demonstrate this method. The experimental results show that the proposed ISSA is superior to SSA, Particle Swarm...