Javier Alvarez-Valle

ORCID: 0000-0003-0906-4177
Research Areas
  • Radiomics and Machine Learning in Medical Imaging
  • Topic Modeling
  • Artificial Intelligence in Healthcare and Education
  • COVID-19 diagnosis using AI
  • Natural Language Processing Techniques
  • Multimodal Machine Learning Applications
  • Advanced Radiotherapy Techniques
  • AI in cancer detection
  • Medical Imaging Techniques and Applications
  • Privacy-Preserving Technologies in Data
  • Medical Image Segmentation Techniques
  • Advanced Neural Network Applications
  • Pancreatic and Hepatic Oncology Research
  • Machine Learning in Healthcare
  • Lung Cancer Diagnosis and Treatment
  • Domain Adaptation and Few-Shot Learning
  • Interpreting and Communication in Healthcare
  • Biomedical Text Mining and Ontologies
  • Esophageal Cancer Research and Treatment
  • Machine Learning and Data Classification
  • Computational and Text Analysis Methods
  • Glioma Diagnosis and Treatment
  • Generative Adversarial Networks and Image Synthesis
  • Brain Tumor Detection and Classification
  • Cell Image Analysis Techniques

Microsoft Research (United Kingdom)
2020-2025

Microsoft (United Kingdom)
2023-2025

Microsoft (United States)
2021-2022

Intel (United Kingdom)
2020

Imperfections in data annotation, known as label noise, are detrimental to the training of machine learning models and have a confounding effect on the assessment of model performance. Nevertheless, employing experts to remove label noise by fully re-annotating large datasets is infeasible in resource-constrained settings, such as healthcare. This work advocates for a data-driven approach to prioritising samples for re-annotation, which we term "active cleaning". We propose to rank instances according to estimated label correctness...

10.1038/s41467-022-28818-3 article EN cc-by Nature Communications 2022-03-04
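
The ranking idea in the abstract above can be illustrated with a minimal sketch. The truncated abstract does not state the scoring function, so the score used here (cross-entropy of the given label under a model's predicted class probabilities) is an assumption made for illustration, and the names `label_cross_entropy` and `rank_for_relabelling` are hypothetical rather than taken from the paper.

```python
import math


def label_cross_entropy(probs, label):
    """Cross-entropy of the given label under predicted class probabilities.

    A high value means the model disagrees with the label, which this
    sketch treats (as an assumption) as a sign of possible label noise.
    """
    return -math.log(max(probs[label], 1e-12))


def rank_for_relabelling(predictions, noisy_labels):
    """Return sample indices ordered from most to least suspicious."""
    scores = [label_cross_entropy(p, y) for p, y in zip(predictions, noisy_labels)]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy data: three samples, two classes (values are illustrative only).
predictions = [
    [0.9, 0.1],  # model agrees with label 0
    [0.2, 0.8],  # model strongly disagrees with label 0
    [0.6, 0.4],  # model mildly disagrees with label 1
]
noisy_labels = [0, 0, 1]

print(rank_for_relabelling(predictions, noisy_labels))
```

With a fixed re-annotation budget, an annotator would work through the returned indices from the front, so the samples the model considers least consistent with their labels are reviewed first.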

Self-supervised learning in vision-language processing (VLP) exploits semantic alignment between imaging and text modalities. Prior work in biomedical VLP has mostly relied on the alignment of single image and report pairs even though clinical notes commonly refer to prior images. This does not only introduce poor alignment between the modalities but also a missed opportunity to exploit rich self-supervision through existing temporal content in the data. In this work, we explicitly account for prior images and reports when available during both...

10.1109/cvpr52729.2023.01442 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

Recent advances in AI combine large language models (LLMs) with vision encoders that bring forward unprecedented technical capabilities to leverage for a wide range of healthcare applications. Focusing on the domain of radiology, vision-language models (VLMs) achieve good performance results for tasks such as generating radiology findings based on a patient's medical image, or answering visual questions (e.g., "Where are the nodules in this chest X-ray?"). However, the clinical utility of potential applications of these capabilities is...

10.1145/3613904.3642013 preprint EN cc-by-nd 2024-05-11

Importance: Personalized radiotherapy planning depends on high-quality delineation of target tumors and surrounding organs at risk (OARs). This process puts additional time burdens on oncologists and introduces variability among both experts and institutions. Objective: To explore clinically acceptable autocontouring solutions that can be integrated into existing workflows and used in different domains of radiotherapy. Design, Setting, and Participants: This quality improvement study used a...

10.1001/jamanetworkopen.2020.27426 article EN cc-by-nc-nd JAMA Network Open 2020-11-30

Timely detection of Barrett’s esophagus, the pre-malignant condition of esophageal adenocarcinoma, can improve patient survival rates. The Cytosponge-TFF3 test, a non-endoscopic minimally invasive procedure, has been used for diagnosing intestinal metaplasia in Barrett’s. However, it depends on a pathologist’s assessment of two slides stained with H&E and the immunohistochemical biomarker TFF3. This resource-intensive clinical workflow limits large-scale screening in the at-risk population. To...

10.1038/s41467-024-46174-2 article EN cc-by Nature Communications 2024-03-11

Qianchu Liu, Stephanie Hyland, Shruthi Bannur, Kenza Bouzid, Daniel Castro, Maria Wetscherek, Robert Tinn, Harshita Sharma, Fernando Pérez-García, Anton Schwaighofer, Pranav Rajpurkar, Sameer Khanna, Hoifung Poon, Naoto Usuyama, Anja Thieme, Aditya Nori, Matthew Lungren, Ozan Oktay, Javier Alvarez-Valle. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 2023.

10.18653/v1/2023.emnlp-main.891 article EN cc-by Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing 2023-01-01

Language-supervised pre-training has proven to be a valuable method for extracting semantically meaningful features from images, serving as a foundational element in multimodal systems within the computer vision and medical imaging domains. However, the resulting features are limited by the information contained in the text. This is particularly problematic in medical imaging, where radiologists' written findings focus on specific observations; a challenge compounded by the scarcity of paired imaging-text data due to concerns over leakage...

10.48550/arxiv.2401.10815 preprint EN other-oa arXiv (Cornell University) 2024-01-01

Radiology reporting is a complex task that requires detailed image understanding, integration of multiple inputs, including comparison with prior imaging, and precise language generation. This makes it ideal for the development and use of generative multimodal models. Here, we extend report generation to include the localisation of individual findings on the image, a task we call grounded report generation. Prior work indicates that grounding is important for clarifying image understanding and interpreting AI-generated text. Therefore, grounded reporting stands to improve the utility...

10.48550/arxiv.2406.04449 preprint EN arXiv (Cornell University) 2024-06-06

Nasogastric tubes (NGTs) are feeding tubes that are inserted through the nose into the stomach to deliver nutrition or medication. If not placed correctly, they can cause serious harm, even death to patients. Recent AI developments demonstrate the feasibility of robustly detecting NGT placement from Chest X-ray images to reduce risks of sub-optimally or critically placed NGTs being missed or delayed in their detection, but gaps remain in clinical practice integration. In this study, we present a human-centered approach to the problem and...

10.1145/3716500 article EN ACM Transactions on Computer-Human Interaction 2025-02-12

We present a radiology-specific multimodal model for the task of generating radiological reports from chest X-rays (CXRs). Our work builds on the idea that large language model(s) can be equipped with multimodal capabilities through alignment with pre-trained vision encoders. On natural images, this has been shown to allow multimodal models to gain image understanding and description capabilities. Our proposed model (MAIRA-1) leverages a CXR-specific image encoder in conjunction with a fine-tuned large language model based on Vicuna-7B, and text-based data augmentation, to produce...

10.48550/arxiv.2311.13668 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Nasogastric tubes (NGTs) are feeding tubes that are inserted through the nose into the stomach to deliver nutrition or medication. If not placed correctly, they can cause serious harm, even death to patients. Recent AI developments demonstrate the feasibility of robustly detecting NGT placement from Chest X-ray images to reduce risks of sub-optimally or critically placed NGTs being missed or delayed in their detection, but gaps remain in clinical practice integration. In this study, we present a human-centered approach to the problem and...

10.48550/arxiv.2405.05299 preprint EN arXiv (Cornell University) 2024-05-08

We present CRYPTFLOW, a system that converts TensorFlow inference code into Secure Multi-party Computation (MPC) protocols at the push of a button. To do this, we build two components. Our first component is an end-to-end compiler from TensorFlow to a variety of MPC protocols. The second component is an improved semi-honest 3-party protocol that provides significant speedups for inference. We empirically demonstrate the power of our system by showing the secure inference of real-world neural networks such as DENSENET121 for detection of lung diseases from chest X-ray images...

10.48550/arxiv.2012.05064 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Self-supervised learning in vision-language processing exploits semantic alignment between imaging and text modalities. Prior work in biomedical VLP has mostly relied on the alignment of single image and report pairs even though clinical notes commonly refer to prior images. This does not only introduce poor alignment between the modalities but also a missed opportunity to exploit rich self-supervision through existing temporal content in the data. In this work, we explicitly account for prior images and reports when available during both training...

10.48550/arxiv.2301.04558 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Timely detection of Barrett’s esophagus, the pre-malignant condition of esophageal adenocarcinoma, can improve patient survival rates. The Cytosponge-TFF3 test, a non-endoscopic minimally invasive procedure, has been used for diagnosing intestinal metaplasia in Barrett’s. However, it depends on a pathologist’s assessment of two slides stained with H&E and the immunohistochemical biomarker TFF3. This resource-intensive clinical workflow limits large-scale screening in the at-risk population. Deep learning...

10.1101/2023.08.21.23294360 preprint EN medRxiv (Cold Spring Harbor Laboratory) 2023-08-22

Label scarcity is a bottleneck for improving task performance in specialized domains. We propose a novel compositional transfer learning framework (DoT5) for zero-shot domain transfer. Without access to in-domain labels, DoT5 jointly learns domain knowledge (from masked language modelling of unlabelled in-domain free text) and task knowledge (from task training on more readily available general-domain data) in a multi-task manner. To improve the transferability of task training, we design a strategy named NLGU: we simultaneously train natural...

10.1162/tacl_a_00585 article EN cc-by Transactions of the Association for Computational Linguistics 2023-01-01

The recent success of general-domain large language models (LLMs) has significantly changed the natural language processing paradigm towards a unified foundation model across domains and applications. In this paper, we focus on assessing the performance of GPT-4, the most capable LLM so far, on text-based applications for radiology reports, comparing against state-of-the-art (SOTA) radiology-specific models. Exploring various prompting strategies, we evaluated GPT-4 on a diverse range of common radiology tasks and we found GPT-4 either outperforms...

10.48550/arxiv.2310.14573 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Accurate hippocampal segmentation tools are critical for monitoring neurodegenerative disease progression on MRI and assessing the impact of interventional treatment. Here we present the InnerEye hippocampal segmentation model and evaluate this new model against three standard hippocampal segmentation tools in an Alzheimer’s dataset. We found InnerEye performed best for Dice score, precision and Hausdorff distance. InnerEye performs consistently well across different cognitive diagnoses, while performance of other methods decreased with cognitive decline.

10.58530/2023/0808 article EN Proceedings on CD-ROM - International Society for Magnetic Resonance in Medicine. Scientific Meeting and Exhibition/Proceedings of the International Society for Magnetic Resonance in Medicine, Scientific Meeting and Exhibition 2024-08-14
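
The three metrics named in the abstract above (Dice score, precision and Hausdorff distance) can be sketched for toy segmentation masks as follows. This is a minimal illustration, not the evaluation code used in the study: masks are represented as sets of foreground pixel coordinates, the Hausdorff distance is computed over all foreground points rather than over extracted surfaces, and the function names are chosen here for illustration.

```python
import math


def dice_score(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two sets of foreground coordinates."""
    if not pred and not truth:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2 * len(pred & truth) / (len(pred) + len(truth))


def precision(pred, truth):
    """Fraction of predicted foreground points that are truly foreground."""
    if not pred:
        return 0.0
    return len(pred & truth) / len(pred)


def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    def directed(src, dst):
        # Largest distance from a point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)
    return max(directed(a, b), directed(b, a))


# Toy 2D masks: the prediction is the reference shifted by one column.
truth = {(0, 0), (0, 1), (1, 0), (1, 1)}
pred = {(0, 1), (1, 1), (0, 2), (1, 2)}

print(dice_score(pred, truth))
print(precision(pred, truth))
print(hausdorff_distance(pred, truth))
```

Dice and precision reward overlap, while the Hausdorff distance penalises the single worst boundary error, which is why segmentation comparisons often report them together.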

In this work, we present MedImageInsight, an open-source medical imaging embedding model. MedImageInsight is trained on medical images with associated text and labels across a diverse collection of domains, including X-Ray, CT, MRI, dermoscopy, OCT, fundus photography, ultrasound, histopathology, and mammography. Rigorous evaluations demonstrate MedImageInsight's ability to achieve state-of-the-art (SOTA) or human expert level performance across classification, image-image search, and fine-tuning tasks....

10.48550/arxiv.2410.06542 preprint EN arXiv (Cornell University) 2024-10-09