Helena Balabin

ORCID: 0000-0002-6392-9306
Research Areas
  • Biomedical Text Mining and Ontologies
  • Bioinformatics and Genomic Networks
  • Topic Modeling
  • Machine Learning in Healthcare
  • Neurobiology of Language and Bilingualism
  • Dementia and Cognitive Impairment Research
  • Immunodeficiency and Autoimmune Disorders
  • Domain Adaptation and Few-Shot Learning
  • Music and Audio Processing
  • Mental Health via Writing
  • Speech and Audio Processing
  • Artificial Intelligence in Healthcare
  • Speech Recognition and Synthesis
  • Multimodal Machine Learning Applications
  • Machine Learning and Algorithms
  • Computational Drug Discovery Methods
  • Ferroelectric and Negative Capacitance Devices
  • Artificial Intelligence in Healthcare and Education
  • Mental Health Research Topics

KU Leuven
2022-2025

VIB-KU Leuven Center for Brain & Disease Research
2025

Fraunhofer Institute for Algorithms and Scientific Computing
2020-2024

Centre Hospitalier de Luxembourg
2024

Luxembourg Institute of Health
2024

University of Oxford
2024

Laboratoire d'Informatique de Paris-Nord
2023

Institut des Sciences Cognitives Marc Jeannerod
2023

Hochschule Bonn-Rhein-Sieg
2021-2022

Abstract INTRODUCTION The automated analysis of connected speech using natural language processing (NLP) emerges as a possible biomarker for Alzheimer's disease (AD). However, it remains unclear which types of connected speech are most sensitive and specific for the detection of AD. METHODS We applied a model to automatically transcribed connected speech from 114 Flemish‐speaking individuals to first distinguish early AD patients from amyloid negative cognitively unimpaired (CU) individuals and then amyloid negative from amyloid positive CU individuals using five different types of connected speech. RESULTS The model was able to distinguish between...

10.1002/alz.14530 article EN cc-by-nc-nd Alzheimer's & Dementia 2025-01-27

Abstract Motivation The COVID-19 pandemic has prompted an impressive, worldwide response by the academic community. In order to support text mining approaches as well as data description, linking and harmonization in the context of COVID-19, we have developed an ontology representing major novel coronavirus (SARS-CoV-2) entities. The ontology has a strong scope on chemical entities suited for drug repurposing, as this is a major target of ongoing therapeutic development. Results The ontology comprises 2270 classes of concepts and 38 987 axioms (2622...

10.1093/bioinformatics/btaa1057 article EN cc-by Bioinformatics 2020-12-10

The majority of biomedical knowledge is stored in structured databases or as unstructured text in scientific publications. This vast amount of information has led to numerous machine learning-based biological applications using either text through natural language processing (NLP) or structured data through knowledge graph embedding models. However, representations based on a single modality are inherently limited. To generate better representations of biological knowledge, we propose STonKGs, a Sophisticated Transformer trained on biomedical text and Knowledge Graphs (KGs)....

10.1093/bioinformatics/btac001 article EN cc-by Bioinformatics 2022-01-03

Health care records provide large amounts of data with real-world and longitudinal aspects, which is advantageous for predictive analyses and improvements in personalized medicine. Text-based records are a main source of information in mental health. Therefore, application of text mining to the electronic health records – especially the mental state examination – is a key approach for the detection of psychiatric disease phenotypes that relate to treatment outcomes. We focused on the mental state examination (MSE) in the patients' discharge summaries as part of the electronic health records. We prepared a sample of 150...

10.1016/j.ijmedinf.2022.104724 article EN cc-by-nc-nd International Journal of Medical Informatics 2022-02-22

Abstract Data harmonization is an important yet time-consuming process. With the recent popularity of applications using Large Language Models (LLMs) due to their high capabilities in text understanding, we investigated whether LLMs could facilitate data harmonization for clinical use cases. To evaluate this, we created PASSIONATE, a novel Parkinson's disease (PD) Common Data Model (CDM), as a ground truth source for pairwise cohort harmonization using LLMs. Additionally, we extended our investigation to an existing Alzheimer’s disease (AD)...

10.21203/rs.3.rs-4108029/v1 preprint EN cc-by Research Square (Research Square) 2024-04-01

Abstract INTRODUCTION Despite numerous past endeavors for the semantic harmonization of Alzheimer’s disease (AD) cohort studies, an automatic tool has yet to be developed. As cohort studies form the basis of data-driven analysis, harmonizing them is crucial for cross-cohort analysis. We aimed to accelerate this task by constructing an automatic tool. METHODS We created a common data model (CDM) through cross-mapping from 20 cohorts, three CDMs, and ontology terms, which was then used to fine-tune a BioBERT model. Finally, we...

10.1101/2023.10.26.564134 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2023-10-30

Speech quality in online conferencing applications is typically assessed through human judgements in the form of the mean opinion score (MOS) metric. Since such a labor-intensive approach is not feasible for large-scale speech assessments in most settings, the focus has shifted towards automated MOS prediction through end-to-end training of deep neural networks (DNN). Instead of training a network from scratch, we propose to leverage the representations of the pre-trained wav2vec-based XLS-R model. However, the number of parameters of the model exceeds...

10.21437/interspeech.2022-10147 article EN Interspeech 2022 2022-09-16

Background: Despite numerous past endeavors for the semantic harmonization of Alzheimer’s disease (AD) cohort studies, an automatic tool has yet to be developed. Objective: As cohort studies form the basis of data-driven analysis, harmonizing them is crucial for cross-cohort analysis. We aimed to accelerate this task by constructing an automatic tool. Methods: We created a common data model (CDM) through cross-mapping from 20 cohorts, three CDMs, and ontology terms, which was then used to fine-tune a BioBERT model. Finally, we...

10.3233/jad-240116 article EN other-oa Journal of Alzheimer's Disease 2024-05-17

Abstract Background Connected speech has been explored as a possible marker for Alzheimer’s disease (AD) by employing language models based on machine learning. However, most previous approaches are scene description tasks, and it is unclear how different types of connected differences across subjects’ relate to changes in their brains. Method We analyzed transcripts Flemish Dutch from interviews 74 cognitively healthy elderly adults (mean MMSE = 28.71 [25‐30], age 73.15 years, 40 female) 27...

10.1002/alz.091238 article EN cc-by Alzheimer's & Dementia 2024-12-01

Lifelong language learning seeks to have models continuously learn multiple tasks in a sequential order without suffering from catastrophic forgetting. State-of-the-art approaches rely on sparse experience replay as the primary approach to prevent forgetting. Experience replay usually adopts sampling methods for the memory population; however, the effect of the chosen strategy on model performance has not yet been studied. In this paper, we investigate how relevant the selective memory population is in the lifelong learning process of text classification...

10.48550/arxiv.2210.00940 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Abstract The majority of biomedical knowledge is stored in structured databases or as unstructured text in scientific publications. This vast amount of information has led to numerous machine learning-based biological applications using either text through natural language processing (NLP) or structured data through knowledge graph embedding models (KGEMs). However, representations based on a single modality are inherently limited. To generate better representations of biological knowledge, we propose STonKGs, a Sophisticated Transformer trained on biomedical text and Knowledge...

10.1101/2021.08.17.456616 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2021-08-18