StatsDB: platform-agnostic storage and understanding of next generation sequencing run metrics

DOI: 10.12688/f1000research.2-248.v1 Publication Date: 2013-11-15T08:47:15Z
ABSTRACT
<ns4:p>Modern sequencing platforms generate enormous quantities of data in ever-decreasing amounts of time. Additionally, techniques such as multiplex sequencing allow one run to contain hundreds of different samples. With such data comes a significant challenge to understand its quality and to understand how the quality and yield are changing across instruments and over time. As well as the desire to understand historical data, sequencing centres often have a duty to provide clear summaries of individual run performance to collaborators or customers. We present StatsDB, an open-source software package for storage and analysis of next generation sequencing run metrics. The system has been designed for incorporation into a primary analysis pipeline, either at the programmatic level or via integration into existing user interfaces. Statistics are stored in an SQL database, and APIs provide the ability to store and access the data while abstracting the underlying database design. This abstraction allows simpler, wider querying across multiple fields than is possible by the manual steps and calculation required to dissect individual reports, e.g. “provide metrics about nucleotide bias in libraries using adaptor barcode X, across all runs on sequencer A, within the last month”. The software is supplied with modules for storage of statistics from FastQC, a commonly used tool for analysis of sequence reads, but the open nature of the database schema means it can be easily adapted to other tools. Currently at The Genome Analysis Centre (TGAC), reports are accessed through our LIMS system or through a standalone GUI tool, but the API and supplied examples make it easy to develop custom reports and to interface with other packages.</ns4:p>
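The example query quoted in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the single `run_metric` table, its column names, the `gc_percent` metric (standing in for a nucleotide-bias measure), and the helper names `build_demo_db` and `mean_metric` are hypothetical and are not StatsDB's actual schema or API. It uses Python's standard `sqlite3` module with an in-memory database to show how one call can filter by barcode, instrument, and date window at once.

```python
import sqlite3
from datetime import date


def build_demo_db():
    """Create an in-memory database with a hypothetical one-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE run_metric ("
        " instrument TEXT, barcode TEXT, run_date TEXT,"
        " metric TEXT, value REAL)"
    )
    # Made-up demo rows; dates are ISO 8601 strings, so they sort correctly as text.
    rows = [
        ("A", "X", "2013-11-10", "gc_percent", 41.0),
        ("A", "X", "2013-11-01", "gc_percent", 43.0),
        ("A", "X", "2013-09-01", "gc_percent", 60.0),  # outside the date window
        ("B", "X", "2013-11-10", "gc_percent", 50.0),  # different sequencer
        ("A", "Y", "2013-11-10", "gc_percent", 55.0),  # different barcode
    ]
    conn.executemany("INSERT INTO run_metric VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def mean_metric(conn, metric, barcode, instrument, since):
    """Return (mean value, number of rows) for one metric, restricted to a
    barcode, an instrument, and runs dated on or after `since`."""
    return conn.execute(
        "SELECT AVG(value), COUNT(*) FROM run_metric"
        " WHERE metric = ? AND barcode = ? AND instrument = ?"
        " AND run_date >= ?",
        (metric, barcode, instrument, since.isoformat()),
    ).fetchone()


if __name__ == "__main__":
    conn = build_demo_db()
    # Barcode X, sequencer A, runs since 2013-10-16 (about one month of demo data).
    print(mean_metric(conn, "gc_percent", "X", "A", date(2013, 10, 16)))
```

Hiding the SQL behind a function like `mean_metric` mirrors the abstraction the abstract describes: the caller names the fields of interest and does not need to know the table layout.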