TabDEG: Classifying differentially expressed genes from RNA-seq data based on feature extraction and deep learning framework

Identification Sample (material)
DOI: 10.1371/journal.pone.0305857 Publication Date: 2024-07-22T17:28:08Z
ABSTRACT
Traditional models for identifying differentially expressed genes (DEGs) have limitations on small-sample datasets because they require distribution assumptions to be met; otherwise, they produce high false positive/negative rates due to sample variation. In contrast, tabular data models based on deep learning (DL) frameworks do not need to consider distribution types and sample variation. However, applying DL to RNA-Seq data is still a challenge because of the lack of proper labeling and the small sample size compared with the number of genes. Data augmentation (DA) extracts data features using different methods and procedures, which can significantly increase complementary pseudo-values from limited data without significant additional cost. Based on this, we combine DA with a DL framework-based tabular data model and propose TabDEG to predict DEGs and their up-regulation/down-regulation directions from gene expression data obtained from The Cancer Genome Atlas database. Compared with five counterpart methods, TabDEG has high sensitivity and low misclassification rates. Experiments show that TabDEG is robust and effective in enhancing data features to facilitate classification of high-dimensional, small-sample datasets, and validate that TabDEG-predicted DEGs are mapped to important gene ontology terms and pathways associated with cancer.