- RNA modifications and cancer
- Epigenetics and DNA Methylation
- Biomedical Text Mining and Ontologies
- Cancer-related gene regulation
- Machine Learning in Healthcare
- Electronic Health Records Systems
- Medical Coding and Health Information
- Semantic Web and Ontologies
- Bioinformatics and Genomic Networks
University of California, San Francisco
2019-2023
City of Hope
2019
Abstract. Motivation: Knowledge graphs (KGs) are being adopted in industry, commerce and academia. Building a biomedical KG presents a challenge due to the complexity, size and heterogeneity of the underlying information. Results: In this work, we present the Scalable Precision Medicine Open Knowledge Engine (SPOKE), a biomedical KG connecting millions of concepts via semantically meaningful relationships. SPOKE contains 27 million nodes of 21 different types and 53 million edges of 55 types downloaded from 41 databases. The graph is built on the framework of 11...
Abstract. Epigenetic landscapes can shape physiologic and disease phenotypes. We used integrative, high-resolution multi-omics methods to delineate the methylome landscape and characterize the oncogenic drivers of esophageal squamous cell carcinoma (ESCC). We found that 98% of CpGs are hypomethylated across the ESCC genome. Hypo-methylated regions are enriched in areas with heterochromatin binding markers (H3K9me3, H3K27me3), while hyper-methylated regions are enriched in polycomb repressive complex (EZH2/SUZ12) recognizing regions. Altered...
Clinical notes are a veritable treasure trove of information on a patient's disease progression, medical history, and treatment plans, yet they are locked in secured databases accessible for research only after extensive ethics review. Removing personally identifying and protected health information (PII/PHI) from the records can reduce the need for additional Institutional Review Board (IRB) reviews. In this project, our goals were to: (1) develop a robust and scalable clinical text de-identification pipeline that is compliant...
There is a great and growing need to ascertain what exactly is the state of a patient, in terms of disease progression, actual care practices, pathology, adverse events, and much more, beyond the paucity of data available in structured medical record data. Ascertaining these harder-to-reach elements is now critical for the accurate phenotyping of complex traits, detection of outcomes, efficacy of off-label drug use, and longitudinal patient surveillance. Clinical notes often contain the most detailed and relevant digital information about...
Abstract. Epigenetic landscapes can shape physiologic and disease phenotypes. We used integrative, high-resolution multi-omics methods to characterize the oncogenic drivers of esophageal squamous cell carcinoma (ESCC). We found that 98% of CpGs are hypomethylated across the ESCC genome and that two-thirds occur in long non-coding (lnc)RNA regions. DNA methylation and epigenetic heterogeneity both coincide with chromosomal topological alterations. Gene body methylation, polycomb repressive complex occupancy, CTCF binding...
We are motivated by newly proposed methods for mining large-scale corpora of scholarly publications (e.g., the full biomedical literature), which consist of tens of millions of papers spanning decades of research. In this setting, analysts seek to discover relationships among concepts. They construct graph representations from annotated text databases and then formulate the relationship-mining problem as an all-pairs shortest paths (APSP) problem and validate connective paths against curated knowledge graphs (e.g., SPOKE)....