NFDI4DS | UHH-SEMS - Publication Details

Helena Balabin

ORCID: 0000-0002-6392-9306

About

Contact & Profiles

OpenAlex author ID: A5067393016

Research Areas

Biomedical Text Mining and Ontologies
Bioinformatics and Genomic Networks
Topic Modeling
Machine Learning in Healthcare
Neurobiology of Language and Bilingualism
Dementia and Cognitive Impairment Research
Immunodeficiency and Autoimmune Disorders
Domain Adaptation and Few-Shot Learning
Music and Audio Processing
Mental Health via Writing
Speech and Audio Processing
Artificial Intelligence in Healthcare
Speech Recognition and Synthesis
Multimodal Machine Learning Applications
Machine Learning and Algorithms
Computational Drug Discovery Methods
Ferroelectric and Negative Capacitance Devices
Artificial Intelligence in Healthcare and Education
Mental Health Research Topics

KU Leuven
2022-2025

VIB-KU Leuven Center for Brain & Disease Research
2025

Fraunhofer Institute for Algorithms and Scientific Computing
2020-2024

Centre Hospitalier de Luxembourg
2024

Luxembourg Institute of Health
2024

University of Oxford
2024

Laboratoire d'Informatique de Paris-Nord
2023

Institut des Sciences Cognitives Marc Jeannerod
2023

Hochschule Bonn-Rhein-Sieg
2021-2022

Natural language processing‐based classification of early Alzheimer's disease from connected speech

OPENALEX - Publications

Helena Balabin Bastiaan Tamm Laure Spruyt Nathalie Dusart Ines Kabouche and 7 more

Abstract. INTRODUCTION: The automated analysis of connected speech using natural language processing (NLP) emerges as a possible biomarker for Alzheimer's disease (AD). However, it remains unclear which types of connected speech are most sensitive and specific for the detection of AD. METHODS: We applied a language model to automatically transcribed connected speech from 114 Flemish‐speaking individuals to first distinguish early AD patients from amyloid negative cognitively unimpaired (CU) and then amyloid negative from amyloid positive CU individuals using five different types of connected speech. RESULTS: The language model was able to distinguish between...

10.1002/alz.14530 article EN cc-by-nc-nd Alzheimer's & Dementia 2025-01-27

The COVID-19 Ontology

Astghik Sargsyan Alpha Tom Kodamullil Shounak Baksi Johannes Darms Sumit Madan and 8 more

Abstract. Motivation: The COVID-19 pandemic has prompted an impressive, worldwide response by the academic community. In order to support text mining approaches as well as data description, linking and harmonization in the context of COVID-19, we have developed an ontology representing major novel coronavirus (SARS-CoV-2) entities. The ontology has a strong scope on chemical entities suited for drug repurposing, as this is a major target of ongoing COVID-19 therapeutic development. Results: The ontology comprises 2270 classes of concepts and 38 987 axioms (2622...

10.1093/bioinformatics/btaa1057 article EN cc-by Bioinformatics 2020-12-10

STonKGs: a sophisticated transformer trained on biomedical text and knowledge graphs

Helena Balabin Charles Tapley Hoyt Colin Birkenbihl Benjamin M. Gyori John A. Bachman and 4 more

The majority of biomedical knowledge is stored in structured databases or as unstructured text in scientific publications. This vast amount of information has led to numerous machine learning-based biological applications using either text through natural language processing (NLP) or structured data through knowledge graph embedding models. However, representations based on a single modality are inherently limited. To generate better representations of biological knowledge, we propose STonKGs, a Sophisticated Transformer trained on biomedical text and Knowledge Graphs (KGs)....

10.1093/bioinformatics/btac001 article EN cc-by Bioinformatics 2022-01-03

Deep Learning-based detection of psychiatric attributes from German mental health records

Sumit Madan Fabian Julius Zimmer Helena Balabin Sebastian Schaaf Holger Fröhlich and 5 more

Health care records provide large amounts of data with real-world and longitudinal aspects, which is advantageous for predictive analyses and improvements in personalized medicine. Text-based records are a main source of information in mental health. Therefore, application of text mining to the electronic health records – especially mental state examination – is a key approach for detection of psychiatric disease phenotypes that relate to treatment outcomes. We focused on the mental state examination (MSE) in the patients' discharge summaries as part of the records. We prepared a sample of 150...

10.1016/j.ijmedinf.2022.104724 article EN cc-by-nc-nd International Journal of Medical Informatics 2022-02-22

On the Utility of Large Language Model Embeddings for Revolutionizing Semantic Data Harmonization in Alzheimer's and Parkinson’s Disease

Yasamin Salimi Tim Adams Mehmet Can Ay Helena Balabin Marc Jacobs and 1 more

Abstract. Data Harmonization is an important yet time-consuming process. With the recent popularity of applications using Large Language Models (LLMs) due to their high capabilities in text understanding, we investigated whether LLMs could facilitate data harmonization for clinical use cases. To evaluate this, we created PASSIONATE, a novel Parkinson's disease (PD) Common Data Model (CDM) as a ground truth source for pairwise cohort harmonization using LLMs. Additionally, we extended our investigation using an existing Alzheimer’s disease (AD)...

10.21203/rs.3.rs-4108029/v1 preprint EN cc-by Research Square (Research Square) 2024-04-01

Semantic harmonization of Alzheimer’s disease datasets using AD-Mapper

Philipp Wegner Helena Balabin Mehmet Can Ay Sarah Bauermeister Lewis Killin and 3 more

Abstract. INTRODUCTION: Despite numerous past endeavors for the semantic harmonization of Alzheimer’s disease (AD) cohort studies, an automatic tool has yet to be developed. As cohort studies form the basis of data-driven analysis, harmonizing them is crucial for cross-cohort analysis. We aimed to accelerate this task by constructing an automatic harmonization tool. METHODS: We created a common data model (CDM) through cross-mapping data from 20 cohorts, three CDMs, and ontology terms, which was then used to fine-tune a BioBERT model. Finally, we...

10.1101/2023.10.26.564134 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2023-10-30

Pre-trained Speech Representations as Feature Extractors for Speech Quality Assessment in Online Conferencing Applications

Bastiaan Tamm Helena Balabin Rik Vandenberghe Hugo Van hamme

Speech quality in online conferencing applications is typically assessed through human judgements in the form of the mean opinion score (MOS) metric. Since such a labor-intensive approach is not feasible for large-scale speech quality assessments in most settings, the focus has shifted towards automated MOS prediction through end-to-end training of deep neural networks (DNN). Instead of training a network from scratch, we propose to leverage the speech representations of the pre-trained wav2vec-based XLS-R model. However, the number of parameters of such a model exceeds...

10.21437/interspeech.2022-10147 article EN Interspeech 2022 2022-09-16

Semantic Harmonization of Alzheimer’s Disease Datasets Using AD-Mapper

Philipp Wegner Helena Balabin Mehmet Can Ay Sarah Bauermeister Lewis Killin and 3 more

Background: Despite numerous past endeavors for the semantic harmonization of Alzheimer’s disease (AD) cohort studies, an automatic tool has yet to be developed. Objective: As cohort studies form the basis of data-driven analysis, harmonizing them is crucial for cross-cohort analysis. We aimed to accelerate this task by constructing an automatic harmonization tool. Methods: We created a common data model (CDM) through cross-mapping data from 20 cohorts, three CDMs, and ontology terms, which was then used to fine-tune a BioBERT model. Finally, we...

10.3233/jad-240116 article EN other-oa Journal of Alzheimer's Disease 2024-05-17

Natural language processing‐based analysis of connected speech in prodromal Alzheimer’s disease

Helena Balabin Laure Spruyt Ella Eycken Ines Kabouche Bastiaan Tamm and 4 more

Abstract. Background: Connected speech has been explored as a possible marker for Alzheimer’s disease (AD) by employing language models based on machine learning. However, most previous approaches are based on scene description tasks, and it is unclear how different types of connected speech and differences across subjects’ connected speech relate to changes in their brains. Method: We analyzed transcripts of Flemish Dutch connected speech from interviews of 74 cognitively healthy elderly adults (mean MMSE = 28.71 [25‐30], mean age = 73.15 years, 40 female) and 27...

10.1002/alz.091238 article EN cc-by Alzheimer's & Dementia 2024-12-01

How Relevant is Selective Memory Population in Lifelong Language Learning?

Vladimir Araujo Helena Balabin Julio Hurtado Álvaro Soto Marie‐Francine Moens

Lifelong language learning seeks to have models continuously learn multiple tasks in a sequential order without suffering from catastrophic forgetting. State-of-the-art approaches rely on sparse experience replay as the primary approach to prevent forgetting. Experience replay usually adopts sampling methods for the memory population; however, the effect of the chosen sampling strategy on model performance has not yet been studied. In this paper, we investigate how relevant the selective memory population is in the lifelong learning process of text classification...

10.48550/arxiv.2210.00940 preprint EN other-oa arXiv (Cornell University) 2022-01-01

How Relevant is Selective Memory Population in Lifelong Language Learning?

Vladimir Araujo Helena Balabin Julio Hurtado Álvaro Soto Marie‐Francine Moens

10.18653/v1/2022.aacl-short.20 article EN 2022-01-01

STonKGs: A Sophisticated Transformer Trained on Biomedical Text and Knowledge Graphs

Helena Balabin Charles Tapley Hoyt Colin Birkenbihl Benjamin M. Gyori John A. Bachman and 4 more

Abstract. The majority of biomedical knowledge is stored in structured databases or as unstructured text in scientific publications. This vast amount of information has led to numerous machine learning-based biological applications using either text through natural language processing (NLP) or structured data through knowledge graph embedding models (KGEMs). However, representations based on a single modality are inherently limited. To generate better representations of biological knowledge, we propose STonKGs, a Sophisticated Transformer trained on biomedical text and Knowledge...

10.1101/2021.08.17.456616 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2021-08-18
