Amin Dada

ORCID: 0000-0003-4016-7799
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Advanced Neural Network Applications
  • Machine Learning in Healthcare
  • Text Readability and Simplification
  • Medical Image Segmentation Techniques
  • Medical Imaging and Analysis
  • Biomedical Text Mining and Ontologies
  • Artificial Intelligence in Healthcare and Education
  • Genetics, Bioinformatics, and Biomedical Research
  • Multimodal Machine Learning Applications
  • Video Analysis and Summarization
  • Data Quality and Management
  • Speech Recognition and Synthesis
  • Artificial Intelligence in Healthcare
  • AI in cancer detection
  • Cancer Genomics and Diagnostics
  • Ferroelectric and Negative Capacitance Devices
  • Neural Networks and Applications
  • Online Learning and Analytics
  • COVID-19 diagnosis using AI

Abstract The recent release of ChatGPT, a chatbot research project/product in natural language processing (NLP) by OpenAI, has stirred up a sensation among both the general public and medical professionals, amassing a phenomenally large user base in a short time. This is a typical example of the ‘productization’ of cutting-edge technologies, which allows people without a technical background to gain firsthand experience with artificial intelligence (AI), similar to the AI hype created by AlphaGo (DeepMind Technologies, UK) and self-driving...

10.1101/2023.03.30.23287899 preprint EN cc-by-nc medRxiv (Cold Spring Harbor Laboratory) 2023-03-30

While increasing patients' access to medical documents improves care, this benefit is limited by varying health literacy levels and complex terminology. Large language models (LLMs) offer solutions by simplifying medical information. However, evaluating LLMs for safe, patient-friendly text generation is difficult due to the lack of standardized evaluation resources. To fill this gap, we developed MeDiSumQA, a dataset created from MIMIC-IV discharge summaries through an automated pipeline combining...

10.48550/arxiv.2502.03298 preprint EN arXiv (Cornell University) 2025-02-05
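
Since the abstract is cut off, the pipeline's exact steps are not recoverable from this page. Purely as a hedged illustration of the general idea of deriving QA pairs from discharge summaries with an LLM, a single generation step might look like the sketch below; the model name, prompt, and helper function are assumptions, not the MeDiSumQA pipeline:

from transformers import pipeline

# Placeholder model; the paper's actual pipeline is not reproduced here.
generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.1-8B-Instruct",
    max_new_tokens=200,
    return_full_text=False,
)

PROMPT = (
    "From the discharge summary below, write one question a patient might "
    "ask and answer it using only information from the summary.\n\n{summary}"
)

def generate_qa_pair(summary: str) -> str:
    # Returns the model's question-answer text for one summary.
    return generator(PROMPT.format(summary=summary))[0]["generated_text"]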

Abstract Objectives Large language models (LLMs) have shown potential in biomedical applications, leading to efforts to fine-tune them on domain-specific data. However, the effectiveness of this approach remains unclear. This study aims to critically evaluate the performance of biomedically fine-tuned LLMs against their general-purpose counterparts across a range of clinical tasks. Materials and Methods We evaluated their performance on case challenges from NEJM and JAMA and on multiple clinical tasks, such as information extraction, document...

10.1093/jamia/ocaf045 article EN Journal of the American Medical Informatics Association 2025-04-07
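
As a loose sketch of the kind of head-to-head comparison such a study involves (not the paper's actual protocol, prompts, models, or metrics), one might run the same clinical extraction prompt through a general-purpose and a biomedically fine-tuned checkpoint and score both against a gold answer:

from transformers import pipeline

def token_f1(pred: str, gold: str) -> float:
    # Simple token-overlap F1 between prediction and gold answer.
    p, g = pred.lower().split(), gold.lower().split()
    common = sum(min(p.count(t), g.count(t)) for t in set(p))
    if not common:
        return 0.0
    precision, recall = common / len(p), common / len(g)
    return 2 * precision * recall / (precision + recall)

MODELS = {  # placeholder checkpoints for the two model classes
    "general": "mistralai/Mistral-7B-Instruct-v0.2",
    "biomedical": "BioMistral/BioMistral-7B",
}
prompt = "Extract the discharge diagnosis from the note:\n..."  # note elided
for label, name in MODELS.items():
    gen = pipeline("text-generation", model=name,
                   max_new_tokens=64, return_full_text=False)
    answer = gen(prompt)[0]["generated_text"]
    print(label, token_f1(answer, "acute pancreatitis"))  # hypothetical gold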

Abstract Objectives To provide physicians and researchers with an efficient way to extract information from weakly structured radiology reports using natural language processing (NLP) machine learning models. Methods We evaluated seven different German bidirectional encoder representations from transformers (BERT) models on a dataset of 857,783 unlabeled radiology reports and an annotated reading comprehension dataset in the SQuAD 2.0 format based on 1223 additional reports. Results Continued pre-training of a BERT model on medical online...

10.1007/s00330-023-09977-3 article EN cc-by European Radiology 2023-07-28
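
To make the setup concrete, here is a minimal extractive question-answering call in the SQuAD 2.0 style the abstract describes, using a public German QA checkpoint purely as a stand-in (it is not necessarily one of the seven evaluated models, and the report text is invented):

from transformers import pipeline

qa = pipeline("question-answering", model="deepset/gelectra-base-germanquad")

# Invented example report (German): "Compared to the prior exam, a new
# right-sided pleural effusion is seen. No evidence of pneumothorax."
report = (
    "Im Vergleich zur Voruntersuchung zeigt sich ein neu aufgetretener "
    "Pleuraerguss rechts. Kein Nachweis eines Pneumothorax."
)
result = qa(
    question="Welcher neue Befund zeigt sich?",
    context=report,
    handle_impossible_answer=True,  # SQuAD 2.0: questions may be unanswerable
)
print(result["answer"], result["score"])
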
Jianning Li, Antonio Pepe, Christina Gsaxner, Gijs Luijten, Yuan Jin, Narmada Ambigapathy, Enrico Nasca, Naida Solak, Gian Marco Melito, Afaque Rafique Memon, Xiaojun Chen, Jan S. Kirschke, Ezequiel de la Rosa, Patrich Ferndinand Christ, Hongwei Li, David Ellis, Michele R. Aizenberg, Sergios Gatidis, Thomas Kuestner, Nadya Shusharina, Nicholas Heller, Vincent Andrearczyk, Adrien Depeursinge, Mathieu Hatt, Anjany Sekuboyina, Maximilian Loeffler, Hans Liebl, Reuben Dorent, Tom Vercauteren, Jonathan Shapey, Aaron Kujawa, S. Cornelissen, Patrick Langenhuizen, Achraf Ben-Hamadou, Ahmed Rekik, Sergi Pujades, Edmond Boyer, Federico Bolelli, Costantino Grana, Luca Lumetti, Hamidreza Salehi, Jun Ma, Yao Zhang, Ramtin Gharleghi, Susann Beier, Arcot Sowmya, Eduardo A. Garza‐Villarreal, Thania Balducci, Diego Ángeles-Valdéz, Roberto Souza, Letícia Rittner, Richard Frayne, Yuanfeng Ji, Soumick Chatterjee, Andreas Nuernberger, João Pedrosa, Carlos Ferreira, Guilherme Aresta, A. Cunha, Aurélio Campilho, Yannick Suter, José García, Alain Lalande, Emmanuel Audenaert, Claudia Krebs, Timo van Leeuwen, Evie Vereecke, Rainer Roehrig, Frank Hoelzle, Vahid Badeli, Kathrin Krieger, Matthias Gunzer, Jianxu Chen, Amin Dada, Miriam Balzer, Jana Fragemann, Frederic Jonske, Moritz Rempe, Stanislav Malorodov, Fin Hendrik Bahnsen, Constantin Seibold, Alexander Jaus, Ana Sofia Santos, Mariana Lindo, André Ferreira, Victor Alves, Michael Kamp, Amr Abourayya, Felix Nensa, Fabian Hoerst, Alexander Brehmer, Lukas Heine, Lars Erik Podleska, Matthias A. Fink, Julius Keyl, Konstantinos Tserpes, Moon Kim, Shireen Elhabian, Hans Lamecker, Dženan Zukić

Prior to the deep learning era, shape was commonly used to describe objects. Nowadays, state-of-the-art (SOTA) algorithms in medical imaging are predominantly diverging from computer vision, where voxel grids, meshes, point clouds, and implicit surface models are used. This is seen in numerous shape-related publications at premier vision conferences as well as the growing popularity of ShapeNet (about 51,300 models) and Princeton ModelNet (127,915 models). For the medical domain, we present a large collection of anatomical shapes...

10.48550/arxiv.2308.16139 preprint EN cc-by arXiv (Cornell University) 2023-01-01
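
As a small illustration of moving between the shape representations mentioned above (mesh, point cloud, voxel grid), the sketch below uses trimesh; the file name is a placeholder, not an actual MedShapeNet asset:

import trimesh

mesh = trimesh.load("liver.stl")      # surface mesh (placeholder file)
points = mesh.sample(2048)            # (2048, 3) array: surface point cloud
voxels = mesh.voxelized(pitch=1.0)    # occupancy voxel grid at chosen pitch
print(points.shape, voxels.matrix.shape)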

Large Language Models (LLMs) have shown the potential to significantly contribute to patient care, diagnostics, and administrative processes. Emerging biomedical LLMs aim to address healthcare-specific challenges, including privacy demands and computational constraints. However, evaluation of these models has primarily been limited to non-clinical tasks, which do not reflect the complexity of practical clinical applications. Additionally, there has been no thorough comparison between biomedical and general-domain LLMs on clinical tasks. To fill this...

10.48550/arxiv.2404.04067 preprint EN arXiv (Cornell University) 2024-04-05

Recent advances in natural language processing (NLP) can be largely attributed to the advent of pre-trained models such as BERT and RoBERTa. While these models demonstrate remarkable performance on general datasets, they struggle in specialized domains such as medicine, where unique domain-specific terminologies, abbreviations, and varying document structures are common. This paper explores strategies for adapting these models to domain-specific requirements, primarily through continuous pre-training on domain-specific data. We pre-trained several German medical language models on 2.4B tokens...

10.48550/arxiv.2404.05694 preprint EN arXiv (Cornell University) 2024-04-08
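
A minimal sketch of the continuous pre-training strategy described above, assuming a public German BERT checkpoint as the starting point and a local text file as the medical corpus (both stand-ins; the paper's actual models, data, and hyperparameters are not reproduced here):

from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

name = "deepset/gbert-base"  # public German BERT, used as a stand-in
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name)

# Placeholder corpus file standing in for the domain-specific data.
corpus = load_dataset("text", data_files={"train": "german_medical.txt"})
tokenized = corpus["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gbert-medical", num_train_epochs=1),
    train_dataset=tokenized,
    # Masked-language-model objective for continued pre-training.
    data_collator=DataCollatorForLanguageModeling(tokenizer,
                                                  mlm_probability=0.15),
)
trainer.train()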

Large language models (LLMs) have shown potential in biomedical applications, leading to efforts to fine-tune them on domain-specific data. However, the effectiveness of this approach remains unclear. This study evaluates the performance of biomedically fine-tuned LLMs against their general-purpose counterparts on a variety of clinical tasks. We evaluated them on case challenges from the New England Journal of Medicine (NEJM) and the Journal of the American Medical Association (JAMA) and on several clinical tasks (e.g., information extraction, document...

10.48550/arxiv.2408.13833 preprint EN arXiv (Cornell University) 2024-08-25

Flatness of the loss curve around a model at hand has been shown to empirically correlate with its generalization ability. Optimizing for flatness was proposed as early as 1994 by Hochreiter and Schmidhuber and was followed by more recent successful sharpness-aware optimization techniques. Their widespread adoption in practice, though, is dubious because of the lack of a theoretically grounded connection between flatness and generalization, in particular in light of the reparameterization curse - certain reparameterizations of a neural network...

10.48550/arxiv.2307.02337 preprint EN other-oa arXiv (Cornell University) 2023-01-01
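
For concreteness, sharpness-aware minimization (SAM; Foret et al., 2021) is one well-known member of this family of flatness-seeking optimizers; a bare-bones PyTorch training step might look like the sketch below. This illustrates SAM in general, not the paper's own method or analysis, and loss_fn is assumed to be a closure computing the loss on the current batch.

import torch

def sam_step(model, loss_fn, optimizer, rho=0.05):
    # 1) gradient at the current weights
    loss = loss_fn(model)
    loss.backward()
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    norm = torch.norm(torch.stack([g.norm() for g in grads]))
    # 2) ascend to a nearby "sharp" point within radius rho
    eps = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                continue
            e = rho * p.grad / (norm + 1e-12)
            p.add_(e)
            eps.append((p, e))
    optimizer.zero_grad()
    # 3) gradient at the perturbed weights, then undo the perturbation
    loss_fn(model).backward()
    with torch.no_grad():
        for p, e in eps:
            p.sub_(e)
    # 4) descend using the sharpness-aware gradient
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()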

Traditionally, large language models have been either trained on general web crawls or on domain-specific data. However, recent successes of generative models have shed light on the benefits of cross-domain datasets. To examine the significance of prioritizing data diversity over quality, we present a German dataset comprising texts from five domains, along with another dataset aimed at containing high-quality data. Through training a series of models ranging between 122M and 750M parameters on both datasets, we conduct a comprehensive...

10.48550/arxiv.2310.07321 preprint EN cc-by-nc-nd arXiv (Cornell University) 2023-01-01
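
As a hedged sketch of what assembling such a cross-domain training mix can look like in practice (domain names, file paths, and sampling weights below are illustrative, not the paper's actual corpus composition):

from datasets import interleave_datasets, load_dataset

# Three illustrative domains; the dataset described above spans five.
web = load_dataset("text", data_files="web.txt", split="train")
news = load_dataset("text", data_files="news.txt", split="train")
legal = load_dataset("text", data_files="legal.txt", split="train")

# Diversity-weighted sampling across domains.
mixed = interleave_datasets([web, news, legal],
                            probabilities=[0.5, 0.3, 0.2], seed=42)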

Amin Dada, Aokun Chen, Cheng Peng, Kaleb Smith, Ahmad Idrissi-Yaghir, Constantin Seibold, Jianning Li, Lars Heiliger, Christoph Friedrich, Daniel Truhn, Jan Egger, Jiang Bian, Jens Kleesiek, Yonghui Wu. On the Impact of Cross-Domain Data on German Language Models. Findings of the Association for Computational Linguistics: EMNLP 2023.

10.18653/v1/2023.findings-emnlp.922 article EN cc-by 2023-01-01