- Genetic Associations and Epidemiology
- Genetic Mapping and Diversity in Plants and Animals
- Genetic and phenotypic traits in livestock
- Gene expression and cancer classification
- Face and Expression Recognition
- Heart Rate Variability and Autonomic Control
- Sparse and Compressive Sensing Techniques
- Bioinformatics and Genomic Networks
- Non-Invasive Vital Sign Monitoring
- AI in cancer detection
- Radiomics and Machine Learning in Medical Imaging
- Control Systems in Engineering
- Artificial Intelligence in Healthcare
- Generative Adversarial Networks and Image Synthesis
- Data Mining Algorithms and Applications
- Gaussian Processes and Bayesian Inference
- Physical Activity and Health
- Music and Audio Processing
- Cardiovascular and exercise physiology
- Time Series Analysis and Forecasting
- Optical Imaging and Spectroscopy Techniques
- Molecular Biology Techniques and Applications
- Machine Fault Diagnosis Techniques
- Mental Health Research Topics
- Machine Learning in Healthcare
Brown University
2020-2024
Apple (United States)
2023
Princeton University
2015-2021
University of California, Los Angeles
2012
University of California, Berkeley
2012
University of Hull
2002
Histopathological images are used to characterize complex phenotypes such as tumor stage. Our goal is to associate features of stained tissue images with high-dimensional genomic markers. We use convolutional autoencoders and sparse canonical correlation analysis (CCA) on paired histological images and bulk gene expression to identify subsets of genes whose expression levels in a tissue sample correlate with morphological features from the corresponding image. We apply our approach, ImageCCA, to two TCGA data sets, and find gene sets associated with the structure...
For real-time monitoring of hospital patients, high-quality inference of patients' health status using all information available from clinical covariates and lab test results is essential to enable successful medical interventions and improve patient outcomes. Developing a computational framework that can learn from observational large-scale electronic health records (EHRs) and make accurate predictions is a critical step. In this work, we develop and explore a Bayesian nonparametric model based on multi-output Gaussian...
LD score regression (LDSC) is a method to estimate narrow-sense heritability from genome-wide association study (GWAS) summary statistics alone, making it a fast and popular approach. In this work, we present interaction-LD score (i-LDSC) regression: an extension of the original LDSC framework that accounts for interactions between genetic variants. By studying a wide range of generative models in simulations, and by re-analyzing 25 well-studied quantitative phenotypes from 349,468 individuals in the UK Biobank and up...
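As a rough illustration of the original LDSC idea mentioned above (not the i-LDSC extension, and not the authors' implementation), the sketch below regresses per-SNP chi-square statistics on LD scores using the standard relation E[chi2_j] = (N * h2 / M) * l_j + intercept. The function names, the simulated data, and the use of unweighted least squares are assumptions made for this example; practical LDSC software uses weighted regression and block-jackknife standard errors.

```python
import numpy as np


def ldsc_heritability(chi2, ld_scores, n_samples, n_snps):
    """Estimate narrow-sense heritability from summary statistics.

    Fits chi2_j ~ intercept + slope * l_j by unweighted least squares
    (an illustrative simplification) and converts the slope to h2 via
    slope = N * h2 / M.
    """
    design = np.column_stack([np.ones_like(ld_scores), ld_scores])
    (intercept, slope), *_ = np.linalg.lstsq(design, chi2, rcond=None)
    return slope * n_snps / n_samples, intercept


def simulate_summary_stats(h2, n_samples, n_snps, seed=0):
    """Draw toy LD scores and chi-square statistics (hypothetical data)."""
    rng = np.random.default_rng(seed)
    ld_scores = rng.uniform(1.0, 100.0, size=n_snps)
    # Expected chi-square under the LDSC model with an intercept of 1.
    expected = 1.0 + n_samples * h2 * ld_scores / n_snps
    # A scaled 1-df chi-square draw has the desired expectation.
    chi2 = expected * rng.chisquare(df=1, size=n_snps)
    return chi2, ld_scores


chi2, ld = simulate_summary_stats(h2=0.4, n_samples=50_000, n_snps=20_000)
h2_hat, intercept_hat = ldsc_heritability(chi2, ld, 50_000, 20_000)
print(f"estimated h2 = {h2_hat:.3f}, intercept = {intercept_hat:.3f}")
```

Because only summary statistics and LD scores enter the regression, no individual-level genotypes are needed, which is what makes the approach fast.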
Recent technological developments in measuring genetic variation have ushered in an era of genome-wide association studies, which have discovered many genes involved in human disease. Current methods to perform association studies collect genetic information and compare the frequency of variants in individuals with and without the disease. Standard approaches do not take into account any prior information on whether or not a given variant is likely to have an effect on the disease. We propose a novel method for computing an association statistic that takes into account prior information. Our method improves both power and resolution by 8% and 27%,...
Identifying variants, both discrete and continuous, that are associated with quantitative traits, or QTs, is the primary focus of genetics. Most current methods are limited to identifying mean effects, or associations between genotype or covariates and the mean value of a quantitative trait. It is possible, however, that a variant may affect the variance of the quantitative trait in lieu of, or in addition to, affecting the trait mean. Here, we develop a general methodology to identify variance effects on a quantitative trait using a Bayesian heteroskedastic linear regression model (BTH). We compare BTH with existing...
In this article, we present Biologically Annotated Neural Networks (BANNs), a nonlinear probabilistic framework for association mapping in genome-wide association (GWA) studies. BANNs are feedforward models with partially connected architectures that are based on biological annotations. This setup yields a fully interpretable neural network where the input layer encodes SNP-level effects, and the hidden layer models the aggregated effects among SNP-sets. We treat the weights and connections of the network as random variables with prior distributions...
Heart rate (HR) response to workout intensity reflects fitness and cardiorespiratory health. Physiological models have been developed to describe such heart rate dynamics and characterize cardiorespiratory fitness. However, these models have been limited to small studies in controlled lab environments and are challenging to apply to noisy but ubiquitous data from wearables. We propose a hybrid approach that combines a physiological model with flexible neural network components to learn a personalized, multidimensional representation of fitness. The physiological model describes the...
In the scenario of real-time monitoring of hospital patients, high-quality inference of patients' health status using all information available from clinical covariates and lab tests is essential to enable successful medical interventions and improve patient outcomes. Developing a computational framework that can learn from observational large-scale electronic health records (EHRs) and make accurate predictions is a critical step. In this work, we develop and explore a Bayesian nonparametric model based on Gaussian process (GP)...
Histological images are used to identify and characterize complex phenotypes such as tumor stage. Our goal is to associate histological image features with high-dimensional genomic markers; the limitations of incorporating histological images in genomic studies are that the relevant image features are difficult to extract in an automated way, and confounders are difficult to control in this setting. In this paper, we use convolutional autoencoders and sparse canonical correlation analysis (CCA) on histological images and gene expression levels from paired samples to find subsets of genes whose expression values in a tissue sample...
In this article, we present Biologically Annotated Neural Networks (BANNs), a nonlinear probabilistic framework for association mapping in genome-wide association (GWA) studies. BANNs are feedforward models with partially connected architectures that are based on biological annotations. This setup yields a fully interpretable neural network where the input layer encodes SNP-level effects, and the hidden layer models the aggregated effects among SNP-sets. We treat the weights and connections of the network as random variables with prior...
The work presented here forms part of a study into the application of self-learning networks to the complex field of machine condition monitoring. There are already several methods by which machines can be automatically monitored, but the development of a simplified nonintrusive "intelligent" system would be advantageous. Some work has been undertaken on time encoded speech (TES) automatic recognition using neural networks. It seemed feasible to try a similar technique to classify the acoustic emissions of a mechanical object. Initial...
We present RelCon, a novel self-supervised *Rel*ative *Con*trastive learning approach that uses a learnable distance measure in combination with a softened contrastive loss for training a motion foundation model from wearable sensors. The learnable distance measure captures motif similarity and domain-specific semantic information such as rotation invariance. The learned distance provides a measurement of semantic similarity between a pair of accelerometer time-series segments, which is used to measure the distance between the anchor and various other sampled candidate segments. The model is trained on 1...
LD score regression (LDSC) is a method to estimate narrow-sense heritability from genome-wide association study (GWAS) summary statistics alone, making it a fast and popular approach. In this work, we present interaction-LD score (i-LDSC) regression: an extension of the original LDSC framework that accounts for interactions between genetic variants. By studying a wide range of generative models in simulations, and by re-analyzing 25 well-studied quantitative phenotypes from 349,468 individuals in the UK Biobank and up...
Genome-wide association (GWA) studies have identified thousands of significant genetic associations in humans across a number of complex traits. However, the majority of these studies focus on linear additive relationships between genotypic and phenotypic variation. Epistasis, or non-additive genetic interactions, has been identified as a major driver of both complex trait architecture and evolution in multiple model organisms; yet, this same phenomenon is not considered to be a significant factor underlying human complex traits. There are two possible reasons for...
Identifying genetic variants that regulate quantitative traits, or QTLs, is the primary focus of the field of statistical genetics. Most current methods are limited to identifying mean effects, or associations between genotype and the mean value of a quantitative trait. It is possible, however, that a genetic variant may affect the variance of the quantitative trait in lieu of, or in addition to, affecting the trait mean. Here, we develop a general methodological approach to identifying covariates with variance effects on a quantitative trait using a Bayesian heteroskedastic linear regression model. We show that our test for...
Increased use of sensor signals from wearable devices as rich sources of physiological data has sparked growing interest in developing health monitoring systems to identify changes in an individual's health profile. Indeed, machine learning models for sensor signals have enabled a diverse range of healthcare related applications including early detection of abnormalities, fertility tracking, and adverse drug effect prediction. However, these models can fail to account for the dependent high-dimensional nature of the underlying sensor signals. In this...
The scalability of statistical estimators is of increasing importance in modern applications. One approach to implementing scalable algorithms is to compress data into a low dimensional latent space using dimension reduction methods. In this paper we develop an approach for dimension reduction that exploits the assumption of low rank structure in high dimensional data to gain both computational and statistical advantages. We adapt recent randomized low-rank approximation algorithms to provide an efficient solution to principal component analysis (PCA), and we use this efficient solver to improve parameter...
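To make the idea of randomized low-rank approximation for PCA more concrete, here is a minimal sketch of a generic randomized range finder with power iterations. It is a textbook-style construction chosen for illustration and should not be read as the algorithm developed in the paper; the function name, the oversampling and power-iteration defaults, and the toy data are all assumptions.

```python
import numpy as np


def randomized_pca(X, k, n_oversample=10, n_power_iter=2, seed=0):
    """Approximate the top-k principal axes of X (rows are samples).

    Projects the centered data onto a random low-dimensional subspace,
    refines that subspace with a few power iterations, and then runs an
    exact SVD on the much smaller projected matrix.
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)
    # Random sketch of the column space of the centered data.
    sketch = Xc @ rng.standard_normal((Xc.shape[1], k + n_oversample))
    for _ in range(n_power_iter):
        # Re-orthonormalize before each multiplication for stability.
        Q, _ = np.linalg.qr(sketch)
        sketch = Xc @ (Xc.T @ Q)
    Q, _ = np.linalg.qr(sketch)
    # Small (k + n_oversample) x p matrix that captures the dominant rows.
    B = Q.T @ Xc
    _, singular_values, Vt = np.linalg.svd(B, full_matrices=False)
    return Vt[:k], singular_values[:k]


# Toy data: a rank-3 signal plus a small amount of noise.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 3)) @ rng.standard_normal((3, 200))
X += 0.01 * rng.standard_normal(X.shape)

components, sing_vals = randomized_pca(X, k=3)
exact = np.linalg.svd(X - X.mean(axis=0), compute_uv=False)[:3]
print(np.allclose(sing_vals, exact, rtol=1e-2))
```

The exact SVD is only ever applied to a matrix with `k + n_oversample` rows, which is where the computational saving comes from when the data are close to low rank.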
Variational Autoencoders (VAEs) have experienced recent success as data-generating models by using simple architectures that do not require significant fine-tuning of hyperparameters. However, VAEs are known to suffer from over-regularization, which can lead to failure to escape local maxima. This phenomenon, known as posterior collapse, prevents learning a meaningful latent encoding of the data. Recent methods have mitigated this issue by deterministically moment-matching an aggregated posterior distribution to an aggregate prior....