NFDI4DS | UHH-SEMS - Publication Details

Emma Pierson

ORCID: 0000-0002-6149-5567

About

Contact & Profiles

A5067807487

Research Areas

Machine Learning in Healthcare
Single-cell and spatial transcriptomics
Artificial Intelligence in Healthcare and Education
Gene expression and cancer classification
Inflammatory Biomarkers in Disease Prognosis
COVID-19 epidemiological studies
Cell Image Analysis Techniques
Human Mobility and Location-Based Analysis
Data-Driven Disease Surveillance
Crime Patterns and Interventions
Ethics and Social Impacts of AI
Anomaly Detection Techniques and Applications
Demographic modeling and climate adaptation
Ethics in Clinical Research
Policing Practices and Perceptions
Topic Modeling
Cancer Immunotherapy and Biomarkers
Cancer, Lipids, and Metabolism
Machine Learning and Data Classification
Colorectal Cancer Screening and Detection
Mental Health Research Topics
Imbalanced Data Classification Techniques
Bioinformatics and Genomic Networks
Explainable Artificial Intelligence (XAI)
Advanced Causal Inference Techniques

Cornell University
2020-2025

University of California, Berkeley
2020-2025

Jacobs Institute
2023-2024

New York Proton Center
2024

Boston Children's Hospital
2024

Boston Medical Center
2024

Brigham and Women's Hospital
2024

Beth Israel Deaconess Medical Center
2024

Massachusetts General Hospital
2024

Harvard University
2024

Mobility network models of COVID-19 explain inequities and inform reopening

OPENALEX - Publications

Serina Chang, Emma Pierson, Pang Wei Koh, Jaline Gerardin, Beth Redbird, and 2 more

10.1038/s41586-020-2923-3 article EN other-oa Nature 2020-11-10

Algorithmic Decision Making and the Cost of Fairness

Sam Corbett‐Davies, Emma Pierson, Avi Feller, Sharad Goel, Aziz Z. Huq

Algorithms are now regularly used to decide whether defendants awaiting trial are too dangerous to be released back into the community. In some cases, black defendants are substantially more likely than white defendants to be incorrectly classified as high risk. To mitigate such disparities, several techniques have recently been proposed to achieve algorithmic fairness. Here we reformulate algorithmic fairness as constrained optimization: the objective is to maximize public safety while satisfying formal fairness constraints designed to reduce racial disparities. We...

10.1145/3097983.3098095 article EN 2017-08-04

Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning

Bo Wang, Junjie Zhu, Emma Pierson, Daniele Ramazzotti, Serafim Batzoglou

10.1038/nmeth.4207 article EN Nature Methods 2017-03-06

ZIFA: Dimensionality reduction for zero-inflated single-cell gene expression analysis

Emma Pierson, Christopher Yau

Single-cell RNA-seq data allows insight into normal cellular function and various disease states through molecular characterization of gene expression on the single-cell level. Dimensionality reduction of such high-dimensional data sets is essential for visualization and analysis, but single-cell RNA-seq data are challenging for classical dimensionality-reduction methods because of the prevalence of dropout events, which lead to zero-inflated data. Here, we develop a dimensionality-reduction method, (Z)ero (I)nflated (F)actor (A)nalysis (ZIFA),...

10.1186/s13059-015-0805-z article EN cc-by Genome biology 2015-11-02

A large-scale analysis of racial disparities in police stops across the United States

Emma Pierson, Camelia Simoiu, Jan Overgoor, Sam Corbett‐Davies, Daniel Jenson, and 6 more

10.1038/s41562-020-0858-1 article EN Nature Human Behaviour 2020-05-04

An algorithmic approach to reducing unexplained pain disparities in underserved populations

Emma Pierson, David Cutler, Jure Leskovec, Sendhil Mullainathan, Ziad Obermeyer

10.1038/s41591-020-01192-7 article EN Nature Medicine 2021-01-01

Sharing and Specificity of Co-expression Networks across 35 Human Tissues

Emma Pierson, Daphne Koller, Alexis Battle, Sara Mostafavi

To understand the regulation of tissue-specific gene expression, the GTEx Consortium generated RNA-seq expression data for more than thirty distinct human tissues. This data provides an opportunity for deriving shared and tissue-specific gene regulatory networks on the basis of co-expression between genes. However, only a small number of samples are available for the majority of the tissues, and therefore statistical inference of networks in this setting is highly underpowered. To address this problem, we infer tissue-specific gene co-expression networks for 35 tissues in the GTEx dataset using a novel algorithm, GNAT,...

10.1371/journal.pcbi.1004220 article EN cc-by PLoS Computational Biology 2015-05-13

Ethical Machine Learning in Healthcare

Irene Y. Chen, Emma Pierson, Sherri Rose, Shalmali Joshi, Kadija Ferryman, and 1 more

The use of machine learning (ML) in healthcare raises numerous ethical concerns, especially as models can amplify existing health inequities. Here, we outline ethical considerations for equitable ML in the advancement of healthcare. Specifically, we frame the ethics of ML in healthcare through the lens of social justice. We describe ongoing efforts and outline challenges in a proposed pipeline of ethical ML in health, ranging from problem selection to postdeployment considerations. We close by summarizing recommendations to address these challenges.

10.1146/annurev-biodatasci-092820-114757 article EN Annual Review of Biomedical Data Science 2021-05-06

Concept Bottleneck Models

Pang Wei Koh, Thao Nguyen, Yew Siang Tang, Stephen Mussmann, Emma Pierson, and 2 more

We seek to learn models that we can interact with using high-level concepts: if the model did not think there was a bone spur in the x-ray, would it still predict severe arthritis? State-of-the-art models today do not typically support the manipulation of concepts like "the existence of bone spurs", as they are trained end-to-end to go directly from raw input (e.g., pixels) to output (e.g., arthritis severity). We revisit the classic idea of first predicting concepts that are provided at training time, and then using these concepts to predict the label. By construction, we can intervene on...

10.48550/arxiv.2007.04612 preprint EN other-oa arXiv (Cornell University) 2020-01-01
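
The two-stage idea in the abstract above (predict concepts first, then predict the label from those concepts, and optionally overwrite a concept before the second stage) can be illustrated with a minimal sketch. This is not the authors' implementation: the concept names, the hand-set weights, and the helper functions `predict_concepts`, `predict_label`, and `predict` are all hypothetical, chosen only to show how an intervention on one concept changes the final prediction.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical, hand-set weights for illustration only (not learned, not from the paper).
CONCEPT_WEIGHTS = {
    "bone_spur": [2.0, -1.0],
    "joint_narrowing": [-0.5, 1.5],
}
LABEL_WEIGHTS = {"bone_spur": 3.0, "joint_narrowing": 2.0}
LABEL_BIAS = -2.5


def predict_concepts(x):
    """Stage 1: map raw input features to concept probabilities."""
    return {
        name: sigmoid(sum(w_i * x_i for w_i, x_i in zip(weights, x)))
        for name, weights in CONCEPT_WEIGHTS.items()
    }


def predict_label(concepts):
    """Stage 2: map concept values to the probability of the label."""
    score = LABEL_BIAS + sum(LABEL_WEIGHTS[name] * value for name, value in concepts.items())
    return sigmoid(score)


def predict(x, interventions=None):
    """Run both stages; `interventions` overwrites chosen concepts before stage 2."""
    concepts = predict_concepts(x)
    if interventions:
        concepts.update(interventions)
    return predict_label(concepts), concepts


x = [1.5, 0.2]  # made-up input features
p_original, concepts_original = predict(x)
# Ask: "if the model did not think there was a bone spur, what would it predict?"
p_intervened, concepts_intervened = predict(x, {"bone_spur": 0.0})
```

Because the label depends on the input only through the concepts, setting `bone_spur` to 0.0 is enough to answer the counterfactual question; in this toy setup it lowers the predicted probability.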

Human mobility networks reveal increased segregation in large cities

Hamed Nilforoshan, Wenli Looi, Emma Pierson, Blanca Villanueva, Nic Fishman, and 5 more

A long-standing expectation is that large, dense and cosmopolitan areas support socioeconomic mixing and exposure among diverse individuals [1–6]. Assessing this hypothesis has been difficult because previous measures of socioeconomic mixing have relied on static residential housing data rather than real-life exposures among people at work, in places of leisure and in home neighbourhoods [7,8]. Here we develop a measure of exposure segregation that captures the socioeconomic diversity of these everyday encounters. Using mobile phone mobility data to represent 1.6...

10.1038/s41586-023-06757-3 article EN cc-by Nature 2023-11-29

Implications of Race Adjustment in Lung-Function Equations

James A. Diao, Yixuan He, Rohan Khazanchi, Max Jordan Nguemeni Tiako, Jonathan Witonsky, and 14 more

Adjustment for race is discouraged in lung-function testing, but the implications of adopting race-neutral equations have not been comprehensively quantified.

10.1056/nejmsa2311809 article EN New England Journal of Medicine 2024-05-19

Projected Changes in Statin and Antihypertensive Therapy Eligibility With the AHA PREVENT Cardiovascular Risk Equations

James A. Diao, Ivy Shi, Venkatesh L. Murthy, Thomas A. Buckley, Chirag J. Patel, and 5 more

Importance: Since 2013, the American College of Cardiology (ACC) and American Heart Association (AHA) have recommended the pooled cohort equations (PCEs) for estimating the 10-year risk of atherosclerotic cardiovascular disease (ASCVD). An AHA scientific advisory group recently developed the Predicting Risk of cardiovascular disease EVENTs (PREVENT) equations, which incorporated kidney measures, removed race as an input, and improved calibration in contemporary populations. PREVENT is known to produce ASCVD risk predictions that are lower than those...

10.1001/jama.2024.12537 article EN JAMA 2024-07-29

Artificial Intelligence in Cardiovascular Care—Part 2: Applications

Sneha S. Jain, Pierre Elias, Timothy J. Poterucha, Michael Randazzo, Francisco López Jiménez, and 16 more

10.1016/j.jacc.2024.03.401 article EN Journal of the American College of Cardiology 2024-04-07

Accuracy and Equity in Clinical Risk Prediction

Emma Pierson

Has the laudable intention of ensuring patient equity caused medicine to deviate from its mandate to predict patients’ risk as accurately as possible?

10.1056/nejmp2311050 article EN New England Journal of Medicine 2024-01-06

Using Large Language Models to Promote Health Equity

Emma Pierson, Divya Shanmugam, Rajiv Movva, Jon Kleinberg, Monica Agrawal, and 10 more

10.1056/aip2400889 article EN NEJM AI 2025-01-13

Higher Absolute Lymphocyte Counts Predict Lower Mortality from Early-Stage Triple-Negative Breast Cancer

Anosheh Afghahi, Natasha Purington, Summer S. Han, Manisha Desai, Emma Pierson, and 13 more

Purpose: Tumor-infiltrating lymphocytes (TIL) in pretreatment biopsies are associated with improved survival in triple-negative breast cancer (TNBC). We investigated whether higher peripheral lymphocyte counts are associated with lower breast cancer–specific mortality (BCM) and overall mortality (OM) in TNBC. Experimental Design: Data on treatments and diagnostic tests from electronic medical records of two health care systems were linked to demographic, clinical, pathologic, and mortality data from the California Cancer Registry. Multivariable...

10.1158/1078-0432.ccr-17-1323 article EN Clinical Cancer Research 2018-03-26

Daily, weekly, seasonal and menstrual cycles in women’s mood, behaviour and vital signs

Emma Pierson, Tim Althoff, Daniel Thomas, Paula J. Adams Hillard, Jure Leskovec

10.1038/s41562-020-01046-9 article EN Nature Human Behaviour 2021-02-01

SIMLR: A Tool for Large‐Scale Genomic Analyses by Multi‐Kernel Learning

Bo Wang, Daniele Ramazzotti, De Luca, Junjie Zhu, Emma Pierson, and 1 more

SIMLR (Single-cell Interpretation via Multi-kernel LeaRning), an open-source tool that implements a novel framework to learn a sample-to-sample similarity measure from expression data observed for heterogeneous samples, is presented here. SIMLR can be effectively used to perform tasks such as dimension reduction, clustering, and visualization of heterogeneous populations of samples. SIMLR was benchmarked against state-of-the-art methods for these three tasks on several public datasets, showing it to be scalable and capable of greatly...

10.1002/pmic.201700232 article EN PROTEOMICS 2017-12-19

Denoising genome-wide histone ChIP-seq with convolutional neural networks

Pang Wei Koh, Emma Pierson, Anshul Kundaje

Chromatin immune-precipitation sequencing (ChIP-seq) experiments are commonly used to obtain genome-wide profiles of histone modifications associated with different types of functional genomic elements. However, the quality of histone ChIP-seq data is affected by many experimental parameters such as the amount of input DNA, antibody specificity, ChIP enrichment and sequencing depth. Making accurate inferences from chromatin profiling experiments that involve diverse experimental parameters is challenging. We introduce a convolutional denoising algorithm, Coda,...

10.1093/bioinformatics/btx243 article EN cc-by-nc Bioinformatics 2017-04-18

Modeling Individual Cyclic Variation in Human Behavior

Emma Pierson, Tim Althoff, Jure Leskovec

Cycles are fundamental to human health and behavior. Examples include mood cycles, circadian rhythms, and the menstrual cycle. However, modeling cycles in time series data is challenging because in most cases the cycles are not labeled or directly observed and need to be inferred from multidimensional measurements taken over time. Here, we present Cyclic Hidden Markov Models (CyHMMs) for detecting and modeling cycles in a collection of multidimensional heterogeneous time series data. In contrast to previous cycle modeling methods, CyHMMs deal with a number of challenges encountered...

10.1145/3178876.3186052 article EN 2018-01-01

WILDS: A Benchmark of in-the-Wild Distribution Shifts

Pang Wei Koh, Shiori Sagawa, Henrik Marklund, Sang Michael Xie, Marvin Zhang, and 18 more

Distribution shifts -- where the training distribution differs from the test distribution -- can substantially degrade the accuracy of machine learning (ML) systems deployed in the wild. Despite their ubiquity in real-world deployments, these distribution shifts are under-represented in the datasets widely used in the ML community today. To address this gap, we present WILDS, a curated benchmark of 10 datasets reflecting a diverse range of distribution shifts that naturally arise in real-world applications, such as shifts across hospitals for tumor identification; across camera traps for wildlife monitoring; and across time...

10.48550/arxiv.2012.07421 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Participation in the age of foundation models

Harini Suresh, Emily Tseng, Meg Young, Mary L. Gray, Emma Pierson, and 1 more

Growing interest and investment in the capabilities of foundation models has positioned such systems to impact a wide array of public services. Alongside these opportunities is the risk that these systems reify existing power imbalances and cause disproportionate harm to marginalized communities. Participatory approaches hold promise to instead lend agency and decision-making power to marginalized stakeholders. But existing approaches in participatory AI/ML are typically deeply grounded in context - how do we apply these approaches to foundation models, which are, by design, disconnected from context?...

10.1145/3630106.3658992 preprint EN cc-by 2022 ACM Conference on Fairness, Accountability, and Transparency 2024-06-03

Topics, Authors, and Institutions in Large Language Model Research: Trends from 17K arXiv Papers

Rajiv Movva, Sidhika Balachandar, Kenny Peng, Gabriel Agostini, Nikhil Garg, and 1 more

10.18653/v1/2024.naacl-long.67 article EN 2024-01-01
