- Advanced Neural Network Applications
- Video Surveillance and Tracking Methods
- Retinal Imaging and Analysis
- Retinal Diseases and Treatments
- CCD and CMOS Imaging Sensors
- Adipose Tissue and Metabolism
- Generative Adversarial Networks and Image Synthesis
- 3D Shape Modeling and Analysis
- Advancements in Battery Materials
- Topic Modeling
- Infrared Target Detection Methodologies
- Advanced Image and Video Retrieval Techniques
- Smart Grid and Power Systems
- Advanced Welding Techniques Analysis
- Fire Detection and Safety Systems
- Autonomous Vehicle Technology and Safety
- Additive Manufacturing and 3D Printing Technologies
- Cancer, Lipids, and Metabolism
- Image Enhancement Techniques
- Adversarial Robustness in Machine Learning
- Adipokines, Inflammation, and Metabolic Diseases
- Emotion and Mood Recognition
- Aluminum Alloy Microstructure Properties
- Computer Graphics and Visualization Techniques
- Crystallization and Solubility Studies
Liaocheng People's Hospital
2024
South China Robotics Innovative Research Institute
2024
Fourth Affiliated Hospital of Harbin Medical University
2024
Google (United States)
2023-2024
NARI Group (China)
2024
Henan University
2024
Harbin Medical University
2024
University of Science and Technology of China
2023
Hanoi Open University
2023
Jilin Agricultural University
2023
Given two images depicting a person and a garment worn by another person, our goal is to generate a visualization of how the garment might look on the input person. A key challenge is to synthesize a photorealistic, detail-preserving visualization of the garment, while warping the garment to accommodate a significant body pose and shape change across the subjects. Previous methods either focus on garment detail preservation without effective pose and shape variation, or allow try-on with the desired shape and pose but lack garment details. In this paper, we propose a diffusion-based architecture that unifies two UNets...
Post-training quantization (PTQ) is a neural network compression technique that converts a full-precision model into a quantized model using lower-precision data types. Although it can help reduce the size and computational cost of deep neural networks, it can also introduce quantization noise and reduce prediction accuracy, especially in extremely low-bit settings. How to determine the appropriate quantization parameters (e.g., scaling factors and rounding of weights) remains the main open problem. Existing methods attempt to determine these parameters by minimizing the distance between features...
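As a rough illustration of the scaling factor and rounding step mentioned in the abstract above, the following minimal sketch quantizes a small weight list at two bit widths. It assumes symmetric, per-tensor, max-abs scaling with round-to-nearest, which is a common textbook baseline and not the method proposed in the paper; the function names `quantize`, `dequantize`, and `mse` and the sample weights are hypothetical.

```python
def quantize(weights, num_bits=8):
    """Symmetric uniform quantization with one per-tensor scale (assumed baseline)."""
    qmax = 2 ** (num_bits - 1) - 1            # e.g. 127 for 8 bits, 3 for 3 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Python's round() rounds to nearest, with ties going to the even integer.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map integer codes back to approximate real-valued weights."""
    return [v * scale for v in q]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical weights, chosen only for illustration.
weights = [0.31, -1.20, 0.05, 0.77, -0.42]

q8, s8 = quantize(weights, num_bits=8)
q3, s3 = quantize(weights, num_bits=3)

err8 = mse(weights, dequantize(q8, s8))
err3 = mse(weights, dequantize(q3, s3))
print(f"8-bit MSE: {err8:.2e}, 3-bit MSE: {err3:.2e}")
```

In this sketch the 3-bit reconstruction error is larger than the 8-bit error, which is the low-bit accuracy loss the abstract refers to. PTQ methods typically search for better scales or rounding decisions than this naive choice.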
Deep neural networks (DNNs) are found to be vulnerable against adversarial examples, which are carefully crafted inputs with a small magnitude of perturbation aiming to induce arbitrarily incorrect predictions. Recent studies show that adversarial examples can pose a threat to real-world security-critical applications: a "physical adversarial Stop Sign" can be synthesized such that autonomous driving cars will misrecognize it as something else (e.g., a speed limit sign). However, these image-space adversarial examples cannot easily alter 3D scans of widely equipped...
Visual object tracking is a fundamental research topic with a broad range of applications. Benefiting from the rapid development of Transformer, pure Transformer trackers have achieved great progress. However, the feature learning of these Transformer-based trackers is easily disturbed by complex backgrounds. To address the above limitations, we propose a novel foreground-background distribution modeling transformer for visual object tracking (F-BDMTrack), including a fore-background agent learning (FBAL) module and a distribution-aware attention...
Multimodal large language models (MLLMs) have garnered widespread attention due to their ability to understand multimodal input. However, their large parameter sizes and substantial computational demands severely hinder their practical deployment and application. While quantization is an effective way to reduce model size and inference latency, its application to MLLMs remains underexplored. In this paper, we propose MQuant, a post-training quantization (PTQ) framework designed to tackle the unique challenges of multimodal large language models (MLLMs). Conventional quantization often...
The advent of foundation models (FMs) is transforming the medical domain. In ophthalmology, RETFound, a retina-specific FM pre-trained sequentially on 1.4 million natural images and 1.6 million retinal images, has demonstrated high adaptability across clinical applications. Conversely, DINOv2, a general-purpose vision FM pre-trained on 142 million natural images, has shown promise in non-medical domains. However, its applicability to clinical tasks remains underexplored. To address this, we conducted head-to-head evaluations by fine-tuning RETFound and three...
Recently, the need for advanced anti-UAV techniques has been increasing due to the rising threat of unauthorized drone intrusion. Object tracking, specifically in thermal infrared (TIR) videos, offers a potential solution to this issue. However, the tracked target often suffers from dramatic scale variation, frequent disappearance, and camera movement, which severely influence tracking performance. Therefore, we propose a Unified Transformer-based Tracker, dubbed UTTracker, which contains the following four modules. Firstly,...
In this study, low-fat beef patties were prepared by replacing different proportions of fat (0, 25, 50, 75, and 100%) with ultra-high pressure-assisted cowhide gelatin. The quality characteristics, lipid oxidation, protein oxidation, and fatty acid profile during cold storage at 4 ℃ (0, 7, 14, 21, and 28 days) were evaluated. The results showed that the addition gelatin increased content reduced patties. saturated (SFA) significantly polyunsaturated (PUFA) increase in replacement ratio. Especially, 100% substitution group...
Early recognition of fruit body diseases in edible fungi can effectively improve the quality and yield of edible fungi. This study proposes a method based on an improved ShuffleNetV2 for edible fungi fruit body disease recognition. First, the ShuffleNetV2+SE model is constructed by deeply integrating the SE module with the ShuffleNetV2 network to make the network pay more attention to the target area and improve the model’s classification performance. Second, the network model is optimized and improved. To simplify the convolution operation, the 1 × 1 convolution layer after the 3 × 3 depth convolution layer is removed, and the ShuffleNetV2-Lite+SE model is established. The...
Visual tracking aims to estimate object state in a video sequence, which is challenging when facing drastic appearance changes. Most existing trackers conduct tracking with divided parts to handle appearance variations. However, these trackers commonly divide target objects into regular patches by a hand-designed splitting scheme, which is too coarse to align object parts well. Besides, a fixed part detector has difficulty partitioning targets with arbitrary categories and deformations. To address the above issues, we propose a novel adaptive part mining tracker (APMT) for...
Potassium ion batteries (PIBs) are of great interest owing to the low cost and abundance of potassium resources, while the sluggish diffusion kinetics of K⁺ in the electrode materials severely impede their practical applications. Here, self‐hybridized BiOCl0.5Br0.5 with a floral structure is assembled and used as the anode for PIBs. Based on systematic theoretical calculation and experimental analysis, the unbalanced charge distribution between Cl and Br atoms leads to an enhanced built‐in electric field...
Objective: Diabetic macular edema (DME) is the primary cause of vision loss among individuals with diabetes mellitus (DM). We developed, validated, and tested a deep-learning (DL) system for classifying DME using images from three common commercially available optical coherence tomography (OCT) devices. Research Design and Methods: We trained and validated two versions of a multi-task convolution neural network (CNN) to classify...
Post-training quantization (PTQ) is a popular method for compressing deep neural networks (DNNs) without modifying their original architecture or training procedures. Despite its effectiveness and convenience, the reliability of PTQ methods in the presence of some extreme cases such as distribution shift and data noise remains largely unexplored. This paper first investigates this problem on various commonly-used PTQ methods. We aim to answer several research questions related to the influence of calibration set...
Given the capability of mitigating the long-tail deficiencies and intricate-shaped absence prevalent in 3D object detection, occupancy prediction has become a pivotal component in autonomous driving systems. However, the processing of three-dimensional voxel-level representations inevitably introduces large overhead in both memory and computation, obstructing the deployment of to-date occupancy prediction approaches. In contrast to the trend of making the model larger and more complicated, we argue that a desirable framework should be deployment-friendly...
We present M&M VTO, a mix and match virtual try-on method that takes as input multiple garment images, a text description for garment layout, and an image of a person. An example input includes: an image of a shirt, an image of a pair of pants, "rolled sleeves, shirt tucked in", and an image of a person. The output is a visualization of how those garments (in the desired layout) would look on the given person. Key contributions of our method are: 1) a single stage diffusion based model, with no super resolution cascading, that allows mixing and matching multiple garments at 1024x512 resolution while preserving and warping intricate garment details, 2)...