Aaron Gu

ORCID: 0000-0002-4918-4010
Research Areas
  • Genomics and Chromatin Dynamics
  • Gene expression and cancer classification
  • Single-cell and spatial transcriptomics
  • Cancer-related molecular mechanisms research

University of Virginia
2020-2021

Office of Public Health Genomics
2020-2021

Genomic region sets summarize functional genomics data and define locations of interest in the genome such as regulatory regions or transcription factor binding sites. The number of publicly available region sets has increased dramatically, leading to challenges in data analysis.

10.1093/bioinformatics/btab439 article EN Bioinformatics 2021-06-15

Functional genomics experiments, like ChIP-Seq or ATAC-Seq, produce results that are summarized as a region set. There is no way to objectively evaluate the effectiveness of region set similarity metrics. We present Bedshift, a tool for perturbing BED files by randomly shifting, adding, and dropping regions from a reference file. The perturbed files can be used to benchmark similarity metrics, as well as for other applications. We highlight differences in behavior between metrics, such as that the Jaccard score is most sensitive to added or dropped regions, while...
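The perturb-then-score idea in the abstract above can be illustrated with a small, self-contained sketch. This is not Bedshift's implementation or API: the function names (`perturb`, `covered_bases`, `jaccard`), their parameters, and the toy chromosome sizes are all hypothetical, and the Jaccard score here is assumed to be computed over covered base pairs, which is only one of several possible definitions.

```python
import random
from typing import Dict, List, Set, Tuple

# A region is (chrom, start, end) with a half-open interval, as in BED files.
Region = Tuple[str, int, int]


def perturb(
    regions: List[Region],
    chrom_sizes: Dict[str, int],
    rng: random.Random,
    shift_rate: float = 0.0,
    max_shift: int = 0,
    drop_rate: float = 0.0,
    add_count: int = 0,
    add_length: int = 100,
) -> List[Region]:
    """Hypothetical helper: randomly drop, shift, and add regions."""
    out: List[Region] = []
    for chrom, start, end in regions:
        if rng.random() < drop_rate:
            continue  # drop this region
        if rng.random() < shift_rate:
            length = end - start
            delta = rng.randint(-max_shift, max_shift)
            # Keep the shifted region inside the chromosome, same length.
            start = max(0, min(start + delta, chrom_sizes[chrom] - length))
            end = start + length
        out.append((chrom, start, end))
    chroms = sorted(chrom_sizes)
    for _ in range(add_count):
        # Assumes every chromosome is at least add_length long.
        chrom = rng.choice(chroms)
        start = rng.randint(0, chrom_sizes[chrom] - add_length)
        out.append((chrom, start, start + add_length))
    return sorted(out)


def covered_bases(regions: List[Region]) -> Set[Tuple[str, int]]:
    """Expand regions into a set of (chrom, position); fine for toy data only."""
    return {(chrom, pos) for chrom, start, end in regions for pos in range(start, end)}


def jaccard(a: List[Region], b: List[Region]) -> float:
    """Base-pair Jaccard: shared bases divided by bases covered by either set."""
    bases_a, bases_b = covered_bases(a), covered_bases(b)
    union = bases_a | bases_b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(bases_a & bases_b) / len(union)


if __name__ == "__main__":
    rng = random.Random(0)
    sizes = {"chr1": 10_000}  # toy chromosome size, not a real genome
    reference = [("chr1", i * 500, i * 500 + 200) for i in range(10)]
    for rate in (0.0, 0.25, 0.5):
        dropped = perturb(reference, sizes, rng, drop_rate=rate)
        print(f"drop_rate={rate}: jaccard={jaccard(reference, dropped):.3f}")
```

Because the perturbation parameters are known, the score a metric assigns to the perturbed file can be compared against how much the file was actually changed, which is the benchmarking use the abstract describes.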

10.1186/s13059-021-02440-w article EN cc-by Genome biology 2021-08-20

Functional genomics experiments, like ChIP-Seq or ATAC-Seq, produce results that are summarized as a region set. Many tools have been developed to analyze region sets, including computing similarity metrics to compare them. However, there is no way to objectively evaluate the effectiveness of region set similarity metrics. In this paper we present Bedshift, a command-line tool and Python API to generate new BED files by making random perturbations to an original BED file. Perturbed files have known similarity to the original file and are therefore useful to benchmark similarity metrics. To...

10.1101/2020.11.11.378554 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2020-11-12

10.5281/zenodo.4771246 article EN Zenodo (CERN European Organization for Nuclear Research) 2021-05-22

Motivation: Genomic region sets summarize functional genomics data and define locations of interest in the genome such as regulatory regions or transcription factor binding sites. The number of publicly available region sets has increased dramatically, leading to challenges in data analysis. Results: We propose a new method to represent genomic region sets as vectors, or embeddings, using an adapted word2vec approach. We compared our approach to two simpler methods based on interval unions or term frequency-inverse document frequency and evaluated...

10.1101/2021.05.07.443166 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2021-05-09