- Topic Modeling
- Natural Language Processing Techniques
- Semantic Web and Ontologies
- Biomedical Text Mining and Ontologies
- Explainable Artificial Intelligence (XAI)
- Text Readability and Simplification
- Data Quality and Management
- Advanced Text Analysis Techniques
- Scientific Computing and Data Management
- Machine Learning in Healthcare
- Advanced Graph Neural Networks
- Service-Oriented Architecture and Web Services
- Mathematics, Computing, and Information Processing
- Data Visualization and Analytics
- COVID-19 and Healthcare Impacts
- Neural Networks and Applications
- Multimodal Machine Learning Applications
- Software Engineering Research
- Distributed and Parallel Computing Systems
- COVID-19 Clinical Research Studies
- Sentiment Analysis and Opinion Mining
- Business Process Modeling and Analysis
- Metabolomics and Mass Spectrometry Studies
- Cardiovascular Function and Risk Factors
- Advanced Database Systems and Queries
University of Manchester
2018-2025
Idiap Research Institute
2021-2025
Cancer Research UK Manchester Institute
2020-2024
Manchester Academic Health Science Centre
2023
The Christie NHS Foundation Trust
2023
Iscte – Instituto Universitário de Lisboa
2023
Hong Kong Polytechnic University
2023
Bangalore University
2023
University of the Basque Country
2023
Nokia (United Kingdom)
2023
The emergence of the precision medicine paradigm in oncology has led to increasing interest in the integration of real-world data (RWD) into cancer clinical research. As sources of real-world evidence (RWE), such data could potentially help address the uncertainties that surround the adoption of novel anticancer therapies in the clinic following their investigation in clinical trials. At present, RWE-generating studies which investigate antitumour interventions seem to primarily focus on collecting and analysing observational RWD, typically forgoing...
Large language models (LLMs) have exploded a new heatwave of AI for their ability to engage end-users in human-level conversations with detailed and articulate answers across many knowledge domains. In response to their fast adoption in industrial applications, this survey concerns their safety and trustworthiness. First, we review known vulnerabilities and limitations of the LLMs, categorising them into inherent issues, attacks, and unintended bugs. Then, we consider if and how Verification and Validation (V&V)...
Keith Cortis, André Freitas, Tobias Daudert, Manuela Huerlimann, Manel Zarrouk, Siegfried Handschuh, Brian Davis. Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017). 2017.
The growing maturity of Natural Language Processing (NLP) techniques and resources is dramatically changing the landscape of many application domains which are dependent on the analysis of unstructured data at scale. The finance domain, with its reliance on the interpretation of multiple structured sources and its demand for fast and comprehensive decision making, is already emerging as a primary ground for the experimentation of NLP, Web Mining and Information Retrieval (IR) techniques for the automatic analysis of financial news and opinions online. This challenge focuses...
This paper contributes a pragmatic evaluation framework for explainable Machine Learning (ML) models for clinical decision support. The study revealed a more nuanced role for ML explanation models when these are pragmatically embedded in the clinical context. Despite the generally positive attitude of healthcare professionals (HCPs) towards explanations as a safety and trust mechanism, for a significant set of participants there were negative effects associated with confirmation bias, accentuating model over-reliance...
This paper describes the results of SemEval 2023 task 7 – Multi-Evidence Natural Language Inference for Clinical Trial Data (NLI4CT), consisting of 2 tasks: a Natural Language Inference (NLI) task, and an evidence selection task on clinical trial data. The proposed challenges require multi-hop biomedical and numerical reasoning, which are of significant importance to the development of systems capable of large-scale interpretation and retrieval of medical evidence, to provide personalized evidence-based care. Task 1, the entailment task, received 643 submissions...
The growing number of datasets published on the Web as linked data brings both opportunities for high data availability and challenges inherent to querying data in a semantically heterogeneous and distributed environment. Approaches used for querying siloed databases fail at Web-scale because users don't have an a priori understanding of all the available datasets. This article investigates the main challenges in constructing a query and search solution for linked data and analyzes existing approaches and trends.
We provide a detailed overview of the various approaches that were proposed to date to solve the task of Open Information Extraction. We present the major challenges that such systems face, show the evolution of the suggested approaches over time and depict the specific issues they address. In addition, we provide a critique of the commonly applied evaluation procedures for assessing the performance of Open IE systems and highlight some directions for future work.
Thanks to their linguistic capabilities, LLMs offer an opportunity to bridge the gap between informal mathematics and formal languages through autoformalization. However, it is still unclear how well LLMs generalize to sophisticated and naturally occurring mathematical statements. To address this gap, we investigate the task of autoformalizing real-world mathematical definitions, a critical component of mathematical discourse. Specifically, we introduce two novel resources for autoformalisation, collecting definitions from Wikipedia (Def_Wiki) and arXiv...
Cytokine release syndrome (CRS), also known as cytokine storm, is one of the most consequential adverse effects of chimeric antigen receptor therapies that have shown otherwise promising results in cancer treatment. When emerging, CRS could be identified by the analysis of specific cytokine and chemokine profiles that tend to exhibit similarities across patients. In this paper, we exploit these similarities using machine learning algorithms and set out to pioneer a meta-review informed method for the identification of CRS based on peak...
Recent large language models (LLMs) have advanced table understanding capabilities but rely on converting tables into text sequences. While multimodal large language models (MLLMs) enable direct visual processing, they face limitations in handling scientific tables due to fixed input image resolutions and insufficient numerical reasoning capabilities. We present a comprehensive framework for multimodal scientific table understanding and reasoning with dynamic input image resolutions. Our framework consists of three key components: (1) MMSci-Pre, a domain-specific table structure learning dataset of 52K...
Large language models (LLMs) struggle with compositional generalisation, limiting their ability to systematically combine learned components to interpret novel inputs. While architectural modifications, fine-tuning, and data augmentation improve compositionality, they often have limited adaptability, face scalability constraints, or yield diminishing returns on real data. To address this, we propose CARMA, an intervention that enhances the stability and robustness of compositional reasoning in LLMs while...
Accurate communication of research is essential. We present the first evidence-based framework for formatting neural network architecture diagrams within scholarly publications. Neural networks are a prevalent and important machine learning component, with their application leading to significant scientific progress in many domains. Diagrams are key to communication, appearing in almost all papers describing novel systems. However, there are currently no established, evidence-based conventions for how they should...
We present an approach for recursively splitting and rephrasing complex English sentences into a novel semantic hierarchy of simplified sentences, with each of them presenting a more regular structure that may facilitate a wide variety of artificial intelligence tasks, such as machine translation (MT) or information extraction (IE). Using a set of hand-crafted transformation rules, input sentences are transformed into a two-layered hierarchical representation in the form of core sentences and accompanying contexts that are linked via rhetorical...
This paper presents a systematic review of benchmarks and approaches for explainability in Machine Reading Comprehension (MRC). We present how the representation and inference challenges evolved and the steps which were taken to tackle these challenges. We also present the evaluation methodologies to assess the performance of explainable systems. In addition, we identify persisting open research questions and highlight critical directions for future work.
The demand to access large amounts of heterogeneous structured data is emerging as a trend for many users and applications. However, the effort involved in querying heterogeneous and distributed third-party databases can create major barriers for data consumers. At the core of this problem is the semantic gap between the way users express their information needs and the representation of the data. This work aims to provide a natural language interface and an associated semantic index to support an increased level of vocabulary independency for queries over Linked Data/Semantic Web...