- Machine Learning in Healthcare
- Privacy-Preserving Technologies in Data
- COVID-19 epidemiological studies
- COVID-19 diagnosis using AI
- Gene Regulatory Network Analysis
- Artificial Intelligence in Healthcare and Education
- Topic Modeling
- Bioinformatics and Genomic Networks
- Artificial Intelligence in Healthcare
- Genomics and Chromatin Dynamics
- Biomedical Text Mining and Ontologies
- Gene expression and cancer classification
- Neural Networks and Applications
- Data-Driven Disease Surveillance
- Domain Adaptation and Few-Shot Learning
- CRISPR and Genetic Engineering
- Stochastic Gradient Optimization Techniques
- COVID-19 Pandemic Impacts
- Generative Adversarial Networks and Image Synthesis
- Advanced Memory and Neural Computing
- Lung Cancer Diagnosis and Treatment
- Reinforcement Learning in Robotics
- Neural dynamics and brain function
- Complex Network Analysis Techniques
- Radiomics and Machine Learning in Medical Imaging
National University of Singapore
2024
Mila - Quebec Artificial Intelligence Institute
2021-2024
Massachusetts Institute of Technology
2016-2023
China University of Geosciences (Beijing)
2022
Broad Institute
2016-2022
Boston Children's Hospital
2019-2022
Harvard University
2019-2022
Canadian Institute for Advanced Research
2022
Computer Algorithms for Medicine
2021
Massachusetts General Hospital
2021
The novel COVID-19 outbreak has affected more than 200 countries and territories as of March 2020. Given that patients with cancer are generally vulnerable to infections, systematic analysis of diverse cohorts of patients with cancer affected by COVID-19 is needed. We performed a multicenter study including 105 patients with cancer and 536 age-matched noncancer patients with confirmed COVID-19. Our results showed that patients with cancer had higher risks in all severe outcomes. Patients with hematologic cancer, lung cancer, or metastatic (stage IV) cancer had the highest frequency of severe events. Patients with nonmetastatic...
First identified in Wuhan, China, in December 2019, a novel coronavirus (SARS-CoV-2) has affected over 16,800,000 people worldwide as of July 29, 2020, and was declared a pandemic by the World Health Organization on March 11, 2020. Influenza studies have shown that influenza viruses survive longer on surfaces or in droplets in cold and dry air, thus increasing the likelihood of subsequent transmission. A similar hypothesis has been postulated for the transmission of COVID-19, the disease caused by SARS-CoV-2. It is important...
Intensive care data are valuable for improvement of health care, policy making, and many other purposes. Vast amounts of such data are stored in different locations, on different devices, and in different silos. Sharing data among different sources is a big challenge due to regulatory, operational, and security reasons. One potential solution is federated machine learning, which is a method that sends machine learning algorithms simultaneously to all data sources, trains models in each source, and aggregates the learned models. This strategy allows utilization of the data without moving them....
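The send, train locally, and aggregate loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how models are aggregated, so the size-weighted averaging below (in the style of federated averaging), the one-variable linear model, the toy data, and all function names are assumptions made for illustration only.

```python
from typing import List, Tuple

Weights = Tuple[float, float]          # (slope w, intercept b) of y ≈ w*x + b
Dataset = List[Tuple[float, float]]    # (x, y) pairs that never leave their site


def local_train(weights: Weights, data: Dataset,
                lr: float = 0.05, epochs: int = 20) -> Weights:
    """Gradient descent on one site's private data, starting from the shared model."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(site_weights: List[Weights], site_sizes: List[int]) -> Weights:
    """Aggregate local models by averaging, weighted by each site's sample count (assumed rule)."""
    total = sum(site_sizes)
    w = sum(wt[0] * n for wt, n in zip(site_weights, site_sizes)) / total
    b = sum(wt[1] * n for wt, n in zip(site_weights, site_sizes)) / total
    return (w, b)


def federated_fit(sites: List[Dataset], rounds: int = 50) -> Weights:
    """Each round: send the global model to every site, train locally, aggregate."""
    global_weights: Weights = (0.0, 0.0)
    for _ in range(rounds):
        local_models = [local_train(global_weights, data) for data in sites]
        global_weights = federated_average(local_models, [len(d) for d in sites])
    return global_weights


# Hypothetical toy data: three sites whose records all follow y = 2x + 1.
sites = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
    [(-1.0, -1.0), (0.5, 2.0)],
]
w, b = federated_fit(sites)
print(f"global model: y = {w:.3f} * x + {b:.3f}")
```

Only the model parameters cross site boundaries in this sketch; the `(x, y)` records are read solely inside `local_train`, which mirrors the abstract's point that the data are used without being moved.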
We present a timely and novel methodology that combines disease estimates from mechanistic models with digital traces, via interpretable machine-learning methodologies, to reliably forecast COVID-19 activity in Chinese provinces in real time. Specifically, our method is able to produce stable and accurate forecasts 2 days ahead of the current time, and uses as inputs (a) official health reports from the Chinese Center for Disease Control and Prevention (China CDC), (b) COVID-19-related internet search activity from Baidu, (c) news media...
Despite significant clinical progress in cell and gene therapies, maximizing protein expression in order to enhance potency remains a major technical challenge. Here, we develop a high-throughput strategy to design, screen, and optimize 5' UTRs that enhance protein expression from a strong human cytomegalovirus (CMV) promoter. We first identify naturally occurring 5' UTRs with high translation efficiencies and use this information with in silico genetic algorithms to generate synthetic 5' UTRs. A total of ~12,000 5' UTRs are then screened using a recombinase-mediated...
A large percentage of medical information is in unstructured text format in electronic medical record systems. Manual extraction of information from clinical notes is extremely time consuming. Natural language processing has been widely used in recent years for automatic information extraction from medical texts. However, algorithms trained on data from a single healthcare provider are not generalizable and are error-prone due to the heterogeneity and uniqueness of medical documents. We develop a two-stage federated natural language processing method that enables utilization of clinical notes from different hospitals or...
Marine debris is severely threatening marine life and causing sustained pollution to the whole ecosystem. To prevent wastes from getting into the ocean, it is helpful to clean up floating wastes in inland waters using autonomous cleaning devices like unmanned surface vehicles. The cleaning efficiency relies on a highly accurate and robust object detection system. However, the small size of the target, the strong light reflection over the water surface, and other objects on the bank-side all bring challenges to the vision-based object detection system. To promote the practical...
Background: The inherent difficulty of identifying and monitoring emerging outbreaks caused by novel pathogens can lead to their rapid spread; if left unchecked, they may become major public health threats to the planet. The ongoing coronavirus disease (COVID-19) outbreak, which has infected over 2,300,000 individuals and caused over 150,000 deaths, is an example of one of these catastrophic events. Objective: We present a timely methodology that combines estimates from mechanistic models and digital traces, via interpretable...
Electronic health record (EHR) data is collected by individual institutions and often stored across locations in silos. Getting access to these data is difficult and slow due to security, privacy, regulatory, and operational issues. We show, using ICU data from 58 different hospitals, that machine learning models to predict patient mortality can be trained efficiently without moving health data out of their silos using a distributed machine learning strategy. We propose a new method, called Federated-Autonomous Deep Learning (FADL), that trains part of the model using all...
A novel coronavirus (SARS-CoV-2) was identified in Wuhan, Hubei Province, China, in December 2019 and has caused over 240,000 cases of COVID-19 worldwide as of March 19, 2020. Previous studies have supported an epidemiological hypothesis that cold and dry environments facilitate the survival and spread of droplet-mediated viral diseases, and warm and humid environments see attenuated viral transmission (e.g., influenza). However, the role of temperature and humidity in the transmission of COVID-19 has not yet been established. Here, we examine the spatial variability of the basic...
Patients’ no-shows, scheduled but unattended medical appointments, have a direct negative impact on patients’ health, due to discontinuity of treatment and late presentation to care. They also lead to inefficient use of medical resources in hospitals and clinics. The ability to predict a likely no-show in advance could enable the design and implementation of interventions to reduce the risk of it happening, thus improving patients’ care and clinical resource allocation. In this study, we develop a new interpretable deep learning-based approach...
Background: Mixed reality (MR) has the potential to transform the delivery of medical education. With tools like HoloLens 2, educators can create immersive, interactive simulations that allow students to practice and engage with real-world scenarios in a controlled environment. Objective: We postulate that a hybrid ophthalmology curriculum that incorporates EyelearnMR (a simulation application) and traditional teaching is non-inferior to traditional teaching. We compare learning outcomes...
Question: What is the performance and reasoning ability of OpenAI o1 compared to other large language models in addressing ophthalmology-specific questions? Findings: This study evaluated five LLMs using 6,990 ophthalmological questions from MedMCQA. O1 achieved the highest accuracy (0.88) and macro-F1 score but ranked third in reasoning capabilities based on text-generation metrics. Across subtopics, o1 ranked first in "Lens" and "Glaucoma" but second to GPT-4o in "Corneal and External Diseases", "Vitreous and Retina", and "Oculoplastic...
Federated Learning (FL) has emerged as a promising approach for research on real-world medical data distributed across different organizations, as it allows analysis of distributed data while preserving patient privacy. However, one of the prominent challenges in FL is covariate shift, where data distributions differ significantly across different clinical sites, like hospitals and outpatient clinics. These differences in demographics, clinical practices, and data collection processes may lead to significant performance degradation of the shared model when...
Large Language Models (LLMs) demonstrate remarkable proficiency in generating accurate and fluent text. However, they often struggle with diversity and novelty, leading to repetitive or overly deterministic responses. These limitations stem from constraints in training data, including gaps in specific knowledge domains, outdated information, and an over-reliance on textual sources. Such shortcomings reduce their effectiveness in tasks requiring creativity, multi-perspective reasoning, exploratory thinking,...
Hair follicles and sweat glands are recognized as reservoirs of melanocyte stem cells (MSCs). Unlike differentiated melanocytes, undifferentiated MSCs do not produce melanin. They serve as a source of melanocytes for the hair follicle and contribute to the interfollicular epidermis upon wounding, exposure to ultraviolet irradiation, or in remission from vitiligo, where repigmentation often spreads outwards from the hair follicles. It is unknown whether these observations reflect the normal homoeostatic mechanism of melanocyte renewal...
Chest X-ray imaging-based abnormality localization, essential in diagnosing various diseases, faces significant clinical challenges due to complex interpretations and the growing workload of radiologists. While recent advances in deep learning offer promising solutions, there is still a critical issue of domain inconsistency in cross-domain transfer learning, which hampers the efficiency and accuracy of diagnostic processes. This study aims to address this problem and improve autonomic localization performance...