Amin Dada

ORCID: 0000-0003-4016-7799
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Advanced Neural Network Applications
  • Machine Learning in Healthcare
  • Text Readability and Simplification
  • Medical Image Segmentation Techniques
  • Medical Imaging and Analysis
  • Biomedical Text Mining and Ontologies
  • Artificial Intelligence in Healthcare and Education
  • Genetics, Bioinformatics, and Biomedical Research
  • Multimodal Machine Learning Applications
  • Video Analysis and Summarization
  • Data Quality and Management
  • Speech Recognition and Synthesis
  • Artificial Intelligence in Healthcare
  • AI in cancer detection
  • Cancer Genomics and Diagnostics
  • Ferroelectric and Negative Capacitance Devices
  • Neural Networks and Applications
  • Online Learning and Analytics
  • COVID-19 diagnosis using AI

Abstract The recent release of ChatGPT, a chatbot research project/product in natural language processing (NLP) by OpenAI, has stirred up a sensation among both the general public and medical professionals, amassing a phenomenally large user base in a short time. This is a typical example of the ‘productization’ of cutting-edge technologies, which allows people without a technical background to gain firsthand experience with artificial intelligence (AI), similar to the AI hype created by AlphaGo (DeepMind Technologies, UK) and self-driving...

10.1101/2023.03.30.23287899 preprint EN cc-by-nc medRxiv (Cold Spring Harbor Laboratory) 2023-03-30

While increasing patients' access to medical documents improves care, this benefit is limited by varying health literacy levels and complex terminology. Large language models (LLMs) offer solutions by simplifying medical information. However, evaluating LLMs for safe, patient-friendly text generation is difficult due to the lack of standardized evaluation resources. To fill this gap, we developed MeDiSumQA, a dataset created from MIMIC-IV discharge summaries through an automated pipeline combining...

10.48550/arxiv.2502.03298 preprint EN arXiv (Cornell University) 2025-02-05
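
Since the abstract is cut off, the pipeline's exact steps are not recoverable from this page. Purely as a hedged illustration of the general idea of deriving QA pairs from discharge summaries with an LLM, a single generation step might look like the sketch below; the model name, prompt, and helper function are assumptions, not the MeDiSumQA pipeline:

from transformers import pipeline

# Placeholder model; the paper's actual pipeline is not reproduced here.
generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.1-8B-Instruct",
    max_new_tokens=200,
    return_full_text=False,
)

PROMPT = (
    "From the discharge summary below, write one question a patient might "
    "ask and answer it using only information from the summary.\n\n{summary}"
)

def generate_qa_pair(summary: str) -> str:
    # Returns the model's question-answer text for one summary.
    return generator(PROMPT.format(summary=summary))[0]["generated_text"]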

Abstract Objectives Large language models (LLMs) have shown potential in biomedical applications, leading to efforts to fine-tune them on domain-specific data. However, the effectiveness of this approach remains unclear. This study aims to critically evaluate the performance of biomedically fine-tuned LLMs against their general-purpose counterparts across a range of clinical tasks. Materials and Methods We evaluated their performance on case challenges from NEJM and JAMA and on multiple clinical tasks, such as information extraction, document...

10.1093/jamia/ocaf045 article EN Journal of the American Medical Informatics Association 2025-04-07
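
As a loose sketch of the kind of head-to-head comparison such a study involves (not the paper's actual protocol, prompts, models, or metrics), one might run the same clinical extraction prompt through a general-purpose and a biomedically fine-tuned checkpoint and score both against a gold answer:

from transformers import pipeline

def token_f1(pred: str, gold: str) -> float:
    # Simple token-overlap F1 between prediction and gold answer.
    p, g = pred.lower().split(), gold.lower().split()
    common = sum(min(p.count(t), g.count(t)) for t in set(p))
    if not common:
        return 0.0
    precision, recall = common / len(p), common / len(g)
    return 2 * precision * recall / (precision + recall)

MODELS = {  # placeholder checkpoints for the two model classes
    "general": "mistralai/Mistral-7B-Instruct-v0.2",
    "biomedical": "BioMistral/BioMistral-7B",
}
prompt = "Extract the discharge diagnosis from the note:\n..."  # note elided
for label, name in MODELS.items():
    gen = pipeline("text-generation", model=name,
                   max_new_tokens=64, return_full_text=False)
    answer = gen(prompt)[0]["generated_text"]
    print(label, token_f1(answer, "acute pancreatitis"))  # hypothetical gold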

Abstract Objectives To provide physicians and researchers with an efficient way to extract information from weakly structured radiology reports using natural language processing (NLP) machine learning models. Methods We evaluated seven different German bidirectional encoder representations from transformers (BERT) models on a dataset of 857,783 unlabeled radiology reports and an annotated reading comprehension dataset in the SQuAD 2.0 format based on 1223 additional reports. Results Continued pre-training of a BERT model on medical online...

10.1007/s00330-023-09977-3 article EN cc-by European Radiology 2023-07-28
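
To make the setup concrete, here is a minimal extractive question-answering call in the SQuAD 2.0 style the abstract describes, using a public German QA checkpoint purely as a stand-in (it is not necessarily one of the seven evaluated models, and the report text is invented):

from transformers import pipeline

qa = pipeline("question-answering", model="deepset/gelectra-base-germanquad")

# Invented example report (German): "Compared to the prior exam, a new
# right-sided pleural effusion is seen. No evidence of pneumothorax."
report = (
    "Im Vergleich zur Voruntersuchung zeigt sich ein neu aufgetretener "
    "Pleuraerguss rechts. Kein Nachweis eines Pneumothorax."
)
result = qa(
    question="Welcher neue Befund zeigt sich?",
    context=report,
    handle_impossible_answer=True,  # SQuAD 2.0: questions may be unanswerable
)
print(result["answer"], result["score"])
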
Jianning Li, Antonio Pepe, Christina Gsaxner, Gijs Luijten, Yuan Jin, Narmada Ambigapathy, Enrico Nasca, Naida Solak, Gian Marco Melito, Afaque Rafique Memon, Xiaojun Chen, Jan S. Kirschke, Ezequiel de la Rosa, Patrich Ferndinand Christ, Hongwei Li, David Ellis, Michele R. Aizenberg, Sergios Gatidis, Thomas Kuestner, Nadya Shusharina, Nicholas Heller, Vincent Andrearczyk, Adrien Depeursinge, Mathieu Hatt, Anjany Sekuboyina, Maximilian Loeffler, Hans Liebl, Reuben Dorent, Tom Vercauteren, Jonathan Shapey, Aaron Kujawa, S. Cornelissen, Patrick Langenhuizen, Achraf Ben-Hamadou, Ahmed Rekik, Sergi Pujades, Edmond Boyer, Federico Bolelli, Costantino Grana, Luca Lumetti, Hamidreza Salehi, Jun Ma, Yao Zhang, Ramtin Gharleghi, Susann Beier, Arcot Sowmya, Eduardo A. Garza‐Villarreal, Thania Balducci, Diego Ángeles-Valdéz, Roberto Souza, Letícia Rittner, Richard Frayne, Yuanfeng Ji, Soumick Chatterjee, Andreas Nuernberger, João Pedrosa, Carlos Ferreira, Guilherme Aresta, A. Cunha, Aurélio Campilho, Yannick Suter, José García, Alain Lalande, Emmanuel Audenaert, Claudia Krebs, Timo van Leeuwen, Evie Vereecke, Rainer Roehrig, Frank Hoelzle, Vahid Badeli, Kathrin Krieger, Matthias Gunzer, Jianxu Chen, Amin Dada, Miriam Balzer, Jana Fragemann, Frederic Jonske, Moritz Rempe, Stanislav Malorodov, Fin Hendrik Bahnsen, Constantin Seibold, Alexander Jaus, Ana Sofia Santos, Mariana Lindo, André Ferreira, Victor Alves, Michael Kamp, Amr Abourayya, Felix Nensa, Fabian Hoerst, Alexander Brehmer, Lukas Heine, Lars Erik Podleska, Matthias A. Fink, Julius Keyl, Konstantinos Tserpes, Moon Kim, Shireen Elhabian, Hans Lamecker, Dženan Zukić

Prior to the deep learning era, shape was commonly used to describe objects. Nowadays, state-of-the-art (SOTA) algorithms in medical imaging are predominantly diverging from computer vision, where voxel grids, meshes, point clouds, and implicit surface models are used. This is seen in numerous shape-related publications at premier vision conferences as well as the growing popularity of ShapeNet (about 51,300 models) and Princeton ModelNet (127,915 models). For the medical domain, we present a large collection of anatomical shapes...

10.48550/arxiv.2308.16139 preprint EN cc-by arXiv (Cornell University) 2023-01-01
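
As a small illustration of moving between the shape representations mentioned above (mesh, point cloud, voxel grid), the sketch below uses trimesh; the file name is a placeholder, not an actual MedShapeNet asset:

import trimesh

mesh = trimesh.load("liver.stl")      # surface mesh (placeholder file)
points = mesh.sample(2048)            # (2048, 3) array: surface point cloud
voxels = mesh.voxelized(pitch=1.0)    # occupancy voxel grid at chosen pitch
print(points.shape, voxels.matrix.shape)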

Large Language Models (LLMs) have shown the potential to significantly contribute to patient care, diagnostics, and administrative processes. Emerging biomedical LLMs aim to address healthcare-specific challenges, including privacy demands and computational constraints. However, evaluation of these models has primarily been limited to non-clinical tasks, which do not reflect the complexity of practical clinical applications. Additionally, there has been no thorough comparison between biomedical and general-domain LLMs on clinical tasks. To fill this...

10.48550/arxiv.2404.04067 preprint EN arXiv (Cornell University) 2024-04-05

Recent advances in natural language processing (NLP) can be largely attributed to the advent of pre-trained models such as BERT and RoBERTa. While these models demonstrate remarkable performance on general datasets, they struggle in specialized domains such as medicine, where unique domain-specific terminologies, abbreviations, and varying document structures are common. This paper explores strategies for adapting these models to domain-specific requirements, primarily through continuous pre-training on domain-specific data. We pre-trained several German medical language models on 2.4B tokens...

10.48550/arxiv.2404.05694 preprint EN arXiv (Cornell University) 2024-04-08
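
A minimal sketch of the continuous pre-training strategy described above, assuming a public German BERT checkpoint as the starting point and a local text file as the medical corpus (both stand-ins; the paper's actual models, data, and hyperparameters are not reproduced here):

from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

name = "deepset/gbert-base"  # public German BERT, used as a stand-in
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name)

# Placeholder corpus file standing in for the domain-specific data.
corpus = load_dataset("text", data_files={"train": "german_medical.txt"})
tokenized = corpus["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gbert-medical", num_train_epochs=1),
    train_dataset=tokenized,
    # Masked-language-model objective for continued pre-training.
    data_collator=DataCollatorForLanguageModeling(tokenizer,
                                                  mlm_probability=0.15),
)
trainer.train()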

Large language models (LLMs) have shown potential in biomedical applications, leading to efforts to fine-tune them on domain-specific data. However, the effectiveness of this approach remains unclear. This study evaluates the performance of biomedically fine-tuned LLMs against their general-purpose counterparts on a variety of clinical tasks. We evaluated them on case challenges from the New England Journal of Medicine (NEJM) and the Journal of the American Medical Association (JAMA) and on several clinical tasks (e.g., information extraction, document...

10.48550/arxiv.2408.13833 preprint EN arXiv (Cornell University) 2024-08-25

Flatness of the loss curve around a model at hand has been shown to empirically correlate with its generalization ability. Optimizing for flatness was proposed as early as 1994 by Hochreiter and Schmidhuber and was followed by more recent successful sharpness-aware optimization techniques. Their widespread adoption in practice, though, is dubious because of the lack of a theoretically grounded connection between flatness and generalization, in particular in light of the reparameterization curse - certain reparameterizations of a neural network...

10.48550/arxiv.2307.02337 preprint EN other-oa arXiv (Cornell University) 2023-01-01
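
For concreteness, sharpness-aware minimization (SAM; Foret et al., 2021) is one well-known member of this family of flatness-seeking optimizers; a bare-bones PyTorch training step might look like the sketch below. This illustrates SAM in general, not the paper's own method or analysis, and loss_fn is assumed to be a closure computing the loss on the current batch.

import torch

def sam_step(model, loss_fn, optimizer, rho=0.05):
    # 1) gradient at the current weights
    loss = loss_fn(model)
    loss.backward()
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    norm = torch.norm(torch.stack([g.norm() for g in grads]))
    # 2) ascend to a nearby "sharp" point within radius rho
    eps = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                continue
            e = rho * p.grad / (norm + 1e-12)
            p.add_(e)
            eps.append((p, e))
    optimizer.zero_grad()
    # 3) gradient at the perturbed weights, then undo the perturbation
    loss_fn(model).backward()
    with torch.no_grad():
        for p, e in eps:
            p.sub_(e)
    # 4) descend using the sharpness-aware gradient
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()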

Traditionally, large language models have been either trained on general web crawls or on domain-specific data. However, recent successes of generative models have shed light on the benefits of cross-domain datasets. To examine the significance of prioritizing data diversity over quality, we present a German dataset comprising texts from five domains, along with another dataset aimed at containing high-quality data. Through training a series of models ranging between 122M and 750M parameters on both datasets, we conduct a comprehensive...

10.48550/arxiv.2310.07321 preprint EN cc-by-nc-nd arXiv (Cornell University) 2023-01-01
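
As a hedged sketch of what assembling such a cross-domain training mix can look like in practice (domain names, file paths, and sampling weights below are illustrative, not the paper's actual corpus composition):

from datasets import interleave_datasets, load_dataset

# Three illustrative domains; the dataset described above spans five.
web = load_dataset("text", data_files="web.txt", split="train")
news = load_dataset("text", data_files="news.txt", split="train")
legal = load_dataset("text", data_files="legal.txt", split="train")

# Diversity-weighted sampling across domains.
mixed = interleave_datasets([web, news, legal],
                            probabilities=[0.5, 0.3, 0.2], seed=42)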

Amin Dada, Aokun Chen, Cheng Peng, Kaleb Smith, Ahmad Idrissi-Yaghir, Constantin Seibold, Jianning Li, Lars Heiliger, Christoph Friedrich, Daniel Truhn, Jan Egger, Jiang Bian, Jens Kleesiek, Yonghui Wu. On the Impact of Cross-Domain Data on German Language Models. Findings of the Association for Computational Linguistics: EMNLP 2023.

10.18653/v1/2023.findings-emnlp.922 article EN cc-by 2023-01-01