NFDI4DS | UHH-SEMS - Publication Details

André Freitas

ORCID: 0000-0002-4430-4837

About

Contact & Profiles

A5053978668

Research Areas

Topic Modeling
Natural Language Processing Techniques
Semantic Web and Ontologies
Biomedical Text Mining and Ontologies
Explainable Artificial Intelligence (XAI)
Text Readability and Simplification
Data Quality and Management
Advanced Text Analysis Techniques
Scientific Computing and Data Management
Machine Learning in Healthcare
Advanced Graph Neural Networks
Service-Oriented Architecture and Web Services
Mathematics, Computing, and Information Processing
Data Visualization and Analytics
COVID-19 and healthcare impacts
Neural Networks and Applications
Multimodal Machine Learning Applications
Software Engineering Research
Distributed and Parallel Computing Systems
COVID-19 Clinical Research Studies
Sentiment Analysis and Opinion Mining
Business Process Modeling and Analysis
Metabolomics and Mass Spectrometry Studies
Cardiovascular Function and Risk Factors
Advanced Database Systems and Queries

University of Manchester
2018-2025

Idiap Research Institute
2021-2025

Cancer Research UK Manchester Institute
2020-2024

Manchester Academic Health Science Centre
2023

The Christie NHS Foundation Trust
2023

Iscte – Instituto Universitário de Lisboa
2023

Hong Kong Polytechnic University
2023

Bangalore University
2023

University of the Basque Country
2023

Nokia (United Kingdom)
2023

Defining the role of real-world data in cancer clinical research: The position of the European Organisation for Research and Treatment of Cancer

OPENALEX - Publications

Robbe Saesen Mieke Van Hemelrijck Jan Bogaerts Christopher M. Booth Jan J. Cornelissen and 13 more

The emergence of the precision medicine paradigm in oncology has led to increasing interest in the integration of real-world data (RWD) into cancer clinical research. As sources of real-world evidence (RWE), such data could potentially help address the uncertainties that surround the adoption of novel anticancer therapies in the clinic following their investigation in clinical trials. At present, RWE-generating studies which investigate antitumour interventions seem to primarily focus on collecting and analysing observational RWD, typically forgoing...

10.1016/j.ejca.2023.03.013 article EN cc-by-nc-nd European Journal of Cancer 2023-03-21

A survey of safety and trustworthiness of large language models through the lens of verification and validation

OPENALEX - Publications

Xiaowei Huang Wenjie Ruan Wei Huang Gaojie Jin Yi Dong and 12 more

Large language models (LLMs) have exploded a new heatwave of AI for their ability to engage end-users in human-level conversations with detailed and articulate answers across many knowledge domains. In response to their fast adoption in many industrial applications, this survey concerns their safety and trustworthiness. First, we review known vulnerabilities and limitations of the LLMs, categorising them into inherent issues, attacks, and unintended bugs. Then, we consider if and how the Verification and Validation (V&V)...

10.1007/s10462-024-10824-0 article EN cc-by Artificial Intelligence Review 2024-06-17

SemEval-2017 Task 5: Fine-Grained Sentiment Analysis on Financial Microblogs and News

OPENALEX - Publications

Keith Cortis André Freitas Tobias Daudert Manuela Huerlimann Manel Zarrouk and 2 more

Keith Cortis, André Freitas, Tobias Daudert, Manuela Huerlimann, Manel Zarrouk, Siegfried Handschuh, Brian Davis. Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017). 2017.

10.18653/v1/s17-2089 article EN cc-by Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017) 2017-01-01

WWW'18 Open Challenge

OPENALEX - Publications

Macedo Maia Siegfried Handschuh André Freitas Brian Davis Ross McDermott and 2 more

The growing maturity of Natural Language Processing (NLP) techniques and resources is dramatically changing the landscape of many application domains which are dependent on the analysis of unstructured data at scale. The finance domain, with its reliance on the interpretation of multiple unstructured and structured data sources and its demand for fast and comprehensive decision making, is already emerging as a primary ground for the experimentation of NLP, Web Mining and Information Retrieval (IR) techniques for the automatic analysis of financial news and opinions online. This challenge focuses...

10.1145/3184558.3192301 article EN 2018-01-01

Assessing the communication gap between AI models and healthcare professionals: Explainability, utility and trust in AI-driven clinical decision-making

OPENALEX - Publications

Oskar Wysocki Jessica Katharine Davies Markel Vigo Anne Armstrong Dónal Landers and 2 more

This paper contributes a pragmatic evaluation framework for explainable Machine Learning (ML) models for clinical decision support. The study revealed a more nuanced role for ML explanation models when these are pragmatically embedded in the clinical context. Despite the general positive attitude of healthcare professionals (HCPs) towards explanations as a safety and trust mechanism, for a significant set of participants there were negative effects associated with confirmation bias, accentuating model over-reliance...

10.1016/j.artint.2022.103839 article EN cc-by Artificial Intelligence 2022-12-20

SemEval-2023 Task 7: Multi-Evidence Natural Language Inference for Clinical Trial Data

OPENALEX - Publications

Maël Jullien Marco Valentino Hannah Frost Paul O’Regan Dónal Landers and 1 more

This paper describes the results of SemEval 2023 task 7 – Multi-Evidence Natural Language Inference for Clinical Trial Data (NLI4CT), consisting of 2 tasks: a Natural Language Inference (NLI) task and an evidence selection task on clinical trial data. The proposed challenges require multi-hop biomedical and numerical reasoning, which are of significant importance to the development of systems capable of large-scale interpretation and retrieval of medical evidence, to provide personalized evidence-based care. Task 1, the entailment task, received 643 submissions...

10.18653/v1/2023.semeval-1.307 article EN cc-by Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023) 2023-01-01

Querying Heterogeneous Datasets on the Linked Data Web: Challenges, Approaches, and Trends

OPENALEX - Publications

André Freitas Edward Curry João Gabriel Oliveira Seán Ó Riain

The growing number of datasets published on the Web as linked data brings both opportunities for high data availability and challenges inherent to querying data in a semantically heterogeneous and distributed environment. Approaches used for querying siloed databases fail at Web-scale because users don't have an a priori understanding of all the available datasets. This article investigates the main challenges in constructing a query and search solution for linked data and analyzes existing approaches and trends.

10.1109/mic.2011.141 article EN IEEE Internet Computing 2011-10-18

A Survey on Open Information Extraction

OPENALEX - Publications

Christina Niklaus Matthias Cetto André Freitas Siegfried Handschuh

We provide a detailed overview of the various approaches that were proposed to date to solve the task of Open Information Extraction. We present the major challenges that such systems face, show the evolution of the suggested approaches over time and depict the specific issues they address. In addition, we provide a critique of the commonly applied evaluation procedures for assessing the performance of Open IE systems and highlight some directions for future work.

10.48550/arxiv.1806.05599 preprint EN cc-by arXiv (Cornell University) 2018-01-01

Formalizing Complex Mathematical Statements with LLMs: A Study on Mathematical Definitions

OPENALEX - Publications

Lan Zhang Marco Valentino André Freitas

Thanks to their linguistic capabilities, LLMs offer an opportunity to bridge the gap between informal mathematics and formal languages through autoformalization. However, it is still unclear how well LLMs generalize to sophisticated and naturally occurring mathematical statements. To address this gap, we investigate the task of autoformalizing real-world mathematical definitions, a critical component of mathematical discourse. Specifically, we introduce two novel resources for autoformalisation, collecting definitions from Wikipedia (Def_Wiki) and arXiv...

10.48550/arxiv.2502.12065 preprint EN arXiv (Cornell University) 2025-02-17

Meta-analysis informed machine learning: Supporting cytokine storm detection during CAR-T cell Therapy

OPENALEX - Publications

Alex Bogatu Magdalena Wysocka Oskar Wysocki Holly Butterworth Manon Pillai and 5 more

Cytokine release syndrome (CRS), also known as cytokine storm, is one of the most consequential adverse effects of chimeric antigen receptor therapies that have shown otherwise promising results in cancer treatment. When emerging, CRS could be identified by the analysis of specific cytokine and chemokine profiles that tend to exhibit similarities across patients. In this paper, we exploit these similarities using machine learning algorithms and set out to pioneer a meta-review informed method for CRS identification based on peak...

10.1016/j.jbi.2023.104367 article EN cc-by Journal of Biomedical Informatics 2023-04-26

SemEval-2024 Task 2: Safe Biomedical Natural Language Inference for Clinical Trials

OPENALEX - Publications

Mael Jullien Marco Valentino André Freitas

10.18653/v1/2024.semeval-1.271 article EN Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024) 2024-01-01

Does Table Source Matter? Benchmarking and Improving Multimodal Scientific Table Understanding and Reasoning

OPENALEX - Publications

Bai Yang Yingji Zhang Cheng‐Di Dong André Freitas Chenghua Lin

Recent large language models (LLMs) have advanced table understanding capabilities but rely on converting tables into text sequences. While multimodal large language models (MLLMs) enable direct visual processing, they face limitations in handling scientific tables due to fixed input image resolutions and insufficient numerical reasoning capabilities. We present a comprehensive framework for multimodal scientific table understanding and reasoning with dynamic input image resolutions. Our framework consists of three key components: (1) MMSci-Pre, a domain-specific table structure learning dataset of 52K...

10.48550/arxiv.2501.13042 preprint EN arXiv (Cornell University) 2025-01-22

CARMA: Enhanced Compositionality in LLMs via Advanced Regularisation and Mutual Information Alignment

OPENALEX - Publications

Nura Aljaafari Danilo S. Carvalho André Freitas

Large language models (LLMs) struggle with compositional generalisation, limiting their ability to systematically combine learned components to interpret novel inputs. While architectural modifications, fine-tuning, and data augmentation improve compositionality, they often have limited adaptability, face scalability constraints, or yield diminishing returns on real data. To address this, we propose CARMA, an intervention that enhances the stability and robustness of compositional reasoning in LLMs while...

10.48550/arxiv.2502.11066 preprint EN arXiv (Cornell University) 2025-02-16

An evidence-based guidance framework for neural network system diagrams.

OPENALEX - Publications

Guy Clarke Marshall André Freitas Caroline Jay

Accurate communication of research is essential. We present the first evidence-based framework for formatting neural network architecture diagrams within scholarly publications. Neural networks are a prevalent and important machine learning component, with their application leading to significant scientific progress in many domains. Diagrams are key to their communication, appearing in almost all papers describing novel systems. However, there are currently no established, evidence-based conventions for how they should...

10.1371/journal.pone.0318800 article EN PubMed 2025-01-01

Transforming Complex Sentences into a Semantic Hierarchy

OPENALEX - Publications

Christina Niklaus Matthias Cetto André Freitas Siegfried Handschuh

We present an approach for recursively splitting and rephrasing complex English sentences into a novel semantic hierarchy of simplified sentences, with each of them presenting a more regular structure that may facilitate a wide variety of artificial intelligence tasks, such as machine translation (MT) or information extraction (IE). Using a set of hand-crafted transformation rules, input sentences are recursively transformed into a two-layered hierarchical representation in the form of core sentences and accompanying contexts that are linked via rhetorical...

10.18653/v1/p19-1333 article EN cc-by 2019-01-01

A Survey on Explainability in Machine Reading Comprehension

OPENALEX - Publications

Mokanarangan Thayaparan Marco Valentino André Freitas

This paper presents a systematic review of benchmarks and approaches for explainability in Machine Reading Comprehension (MRC). We present how the representation and inference challenges evolved and the steps which were taken to tackle these challenges. We also present the evaluation methodologies to assess the performance of explainable systems. In addition, we identify persisting open research questions and highlight critical directions for future work.

10.48550/arxiv.2010.00389 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Natural language queries over heterogeneous linked data graphs

OPENALEX - Publications

André Freitas Edward Curry

The demand to access large amounts of heterogeneous structured data is emerging as a trend for many users and applications. However, the effort involved in querying heterogeneous and distributed third-party databases can create major barriers for data consumers. At the core of this problem is the semantic gap between the way users express their information needs and the representation of the data. This work aims to provide a natural language interface and an associated semantic index to support an increased level of vocabulary independency for queries over Linked Data/Semantic Web...

10.1145/2557500.2557534 article EN 2014-02-18
