NFDI4DS | UHH-SEMS - Publication Details

Long Chen

ORCID: 0000-0002-5280-4727

About

Contact & Profiles

A5084203526

Research Areas

Advanced Neural Network Applications
Medical Image Segmentation Techniques
Domain Adaptation and Few-Shot Learning
COVID-19 diagnosis using AI
Advanced Image and Video Retrieval Techniques
Brain Tumor Detection and Classification
Digital Imaging for Blood Diseases
Image and Object Detection Techniques
Image and Signal Denoising Methods
Generative Adversarial Networks and Image Synthesis
Advanced Vision and Imaging
Advanced Text Analysis Techniques
Digital Marketing and Social Media
Face and Expression Recognition
AI in cancer detection
Cell Image Analysis Techniques
Multimodal Machine Learning Applications
Neural Networks and Applications
Video Analysis and Summarization
Industrial Vision Systems and Defect Detection
Image Retrieval and Classification Techniques
Semantic Web and Ontologies
Reinforcement Learning in Robotics
Sentiment Analysis and Opinion Mining

RWTH Aachen University
2019-2024

University College London
2023

Wellcome / EPSRC Centre for Interventional and Surgical Sciences
2023

Affymax (United States)
2023

KARST: Multi-Kernel Kronecker Adaptation with Re-Scaling Transmission for Visual Classification

OPENALEX - Publications

Yuemin Zhu, Haiwen Diao, Shang Gao, Long Chen, Huchuan Lu

Fine-tuning pre-trained vision models for specific tasks is a common practice in computer vision. However, this process becomes more expensive as models grow larger. Recently, parameter-efficient fine-tuning (PEFT) methods have emerged as a popular solution to improve training efficiency and reduce storage needs by tuning additional low-rank modules within backbones. Despite their advantages, they struggle with limited representation capabilities and misalignment with intermediate features. To address these...

10.48550/arxiv.2502.06779 preprint EN arXiv (Cornell University) 2025-02-10

Towards Better Alignment: Training Diffusion Models with Reinforcement Learning Against Sparse Rewards

OPENALEX - Publications

Zijing Hu, Fengda Zhang, Long Chen, Kun Kuang, Jiahui Li, and 4 more

Diffusion models have achieved remarkable success in text-to-image generation. However, their practical applications are hindered by the misalignment between generated images and corresponding text prompts. To tackle this issue, reinforcement learning (RL) has been considered for diffusion model fine-tuning. Yet, RL's effectiveness is limited by the challenge of sparse reward, where feedback is only available at the end of the generation process. This makes it difficult to identify which actions during denoising...

10.48550/arxiv.2503.11240 preprint EN arXiv (Cornell University) 2025-03-14

SortedAP: Rethinking evaluation metrics for instance segmentation

OPENALEX - Publications

Long Chen, Yuli Wu, Johannes Stegmaier, Dorit Merhof

Designing metrics for evaluating instance segmentation revolves around comprehensively considering object detection and segmentation accuracy. However, other important properties, such as sensitivity, continuity, and equality, are overlooked in the current study. In this paper, we reveal that most existing metrics have a limited resolution of segmentation quality. They are only conditionally sensitive to changes in masks or false predictions. For certain metrics, the score can change drastically in a narrow range, which could provide misleading...

10.1109/iccvw60793.2023.00424 article EN 2023-10-02

Boundary and Relation Distillation for Semantic Segmentation

OPENALEX - Publications

Dong Zhang, Pingcheng Dong, Xinting Hu, Long Chen, Kwang‐Ting Cheng

Recently, it has been revealed that small semantic segmentation (SS) models exhibit a tendency to make errors in maintaining boundary region completeness and preserving target connectivity, despite their effective segmentation of the main object regions. To address these errors, we propose a targeted boundary and relation distillation (BRD) strategy that transfers knowledge from large teacher models to small student models. Specifically, the boundary distillation extracts explicit boundaries from the hierarchical feature maps of the backbone network, subsequently enhancing the model's...

10.48550/arxiv.2401.13174 preprint EN other-oa arXiv (Cornell University) 2024-01-01

Illumination Histogram Consistency Metric for Quantitative Assessment of Video Sequences

OPENALEX - Publications

Long Chen, Mobarakol Islam, Matthew J. Clarkson, Thomas Dowrick

Advances in deep generative models have greatly accelerated video processing tasks such as enhancement and synthesis. Learning spatio-temporal video models requires capturing the temporal dynamics of a scene, in addition to the visual appearance of individual frames. Illumination consistency, which reflects the variations of illumination in dynamic video sequences, plays a vital role in video processing. Unfortunately, to date, no well-accepted quantitative metric has been proposed for video illumination consistency evaluation. In this paper, we propose...

10.48550/arxiv.2405.09716 preprint EN arXiv (Cornell University) 2024-05-15

MIND: Multimodal Shopping Intention Distillation from Large Vision-language Models for E-commerce Purchase Understanding

OPENALEX - Publications

Baixuan Xu, Weiqi Wang, Haochen Shi, Wenxuan Ding, Huihao Jing, and 4 more

Improving user experience and providing personalized search results in E-commerce platforms rely heavily on understanding purchase intention. However, existing methods for acquiring large-scale intentions bank on distilling large language models with human annotation for verification. Such an approach tends to generate product-centric intentions, overlooks valuable visual information from product images, and incurs high costs for scalability. To address these issues, we introduce MIND, a multimodal...

10.48550/arxiv.2406.10701 preprint EN arXiv (Cornell University) 2024-06-15

LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation

OPENALEX - Publications

Fangxun Shu, Yue Liao, Le Zhuo, Chenning Xu, G. X. Zhang, and 11 more

We introduce LLaVA-MoD, a novel framework designed to enable the efficient training of small-scale Multimodal Language Models (s-MLLM) by distilling knowledge from a large-scale MLLM (l-MLLM). Our approach tackles two fundamental challenges in MLLM distillation. First, we optimize the network structure of the s-MLLM by integrating a sparse Mixture of Experts (MoE) architecture into the language model, striking a balance between computational efficiency and model expressiveness. Second, we propose a progressive transfer strategy...

10.48550/arxiv.2408.15881 preprint EN arXiv (Cornell University) 2024-08-28

MEDOE: A Multi-Expert Decoder and Output Ensemble Framework for Long-tailed Semantic Segmentation

OPENALEX - Publications

Junao Shen, Long Chen, Kun Kuang, Fei Wu, Tian Feng, and 1 more

The long-tailed distribution of semantic categories, which has often been ignored in conventional methods, causes unsatisfactory semantic segmentation performance on tail categories. In this paper, we focus on the problem of long-tailed semantic segmentation. Although some long-tailed recognition methods (e.g., re-sampling/re-weighting) have been proposed for other problems, they can compromise crucial contextual information and are thus hardly adaptable to long-tailed semantic segmentation. To address this issue, we propose MEDOE, a novel framework for long-tailed semantic segmentation via...

10.48550/arxiv.2308.08213 preprint EN other-oa arXiv (Cornell University) 2023-01-01

SortedAP: Rethinking evaluation metrics for instance segmentation

OPENALEX - Publications

Long Chen, Yuli Wu, Johannes Stegmaier, Dorit Merhof

10.48550/arxiv.2309.04887 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Semi-supervised Instance Segmentation with a Learned Shape Prior

OPENALEX - Publications

Long Chen, Wei Zhang, Yuli Wu, Martin Strauch, Dorit Merhof

To date, most instance segmentation approaches are based on supervised learning that requires a considerable amount of annotated object contours as training ground truth. Here, we propose a framework that searches for the target object based on a shape prior. The shape prior model is learned with a variational autoencoder that requires only a very limited amount of training data: In our experiments, a few dozen patches from the target dataset, as well as purely synthetic shapes, were sufficient to achieve results on par with supervised methods with full access to training data on two out of three cell segmentation datasets....

10.48550/arxiv.2309.04888 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Making Autonomous Stores Smarter (MASS): A Practical Solution to Improve Product Detection Performance Using Synthetic Dataset at Scale

OPENALEX - Publications

Nanxin Jin, Devin Shah, Juan Terven, Daniela Basurto Lozada, Zachary Bennett, and 2 more

Product detection in large retail stores requires extensive annotated real data, which is expensive and lacks adaptability when new products are introduced. This paper presents an end-to-end product detection approach using domain randomization to generate synthetic datasets for training. We propose a set of randomizations at the scene level and a method for generating large amounts of domain-randomized data. To evaluate performance on this dataset, we use a pipeline where the model is pre-trained on simulation data and fine-tuned on small...

10.1109/smartcloud58862.2023.00039 article EN 2023-09-16
