NFDI4DS | UHH-SEMS - Publication Details

Fabio Vandin

ORCID: 0000-0003-2244-2320

Contact & Profiles

A5060279781

Research Areas

Bioinformatics and Genomic Networks
Cancer Genomics and Diagnostics
Data Mining Algorithms and Applications
Gene expression and cancer classification
Algorithms and Data Compression
Data Management and Algorithms
Rough Sets and Fuzzy Logic
Genomics and Phylogenetic Studies
Genomics and Rare Diseases
Genetic factors in colorectal cancer
Imbalanced Data Classification Techniques
Complex Network Analysis Techniques
Advanced Graph Neural Networks
Epigenetics and DNA Methylation
Genetic Associations and Epidemiology
RNA modifications and cancer
Machine Learning and Algorithms
Advanced Database Systems and Queries
Genomics and Chromatin Dynamics
Computational Drug Discovery Methods
Data Quality and Management
Data Stream Mining Techniques
Genomic variations and chromosomal abnormalities
Optimization and Search Problems
Bayesian Modeling and Causal Inference

University of Padua
2015-2024

Brown University
2010-2019

University of Southern Denmark
2013-2019

National Center for Biotechnology Information
2019

Research Network (United States)
2017

Providence College
2011-2016

John Brown University
2010-2015

National Institutes of Health
2014

Walter and Eliza Hall Institute of Medical Research
2011

Integrated genomic analyses of ovarian carcinoma

OPENALEX - Publications

Debra Bell, Andrew Berchuck, Michael J. Birrer, Jeremy Chien, D. W. Cramer, and 95 more

10.1038/nature10166 article EN Nature 2011-06-28

Genomic and Epigenomic Landscapes of Adult De Novo Acute Myeloid Leukemia

OPENALEX - Publications

T J Ley, Christopher A. Miller, Li Ding, Benjamin J. Raphael, Andrew J. Mungall, and 95 more

Many mutations that contribute to the pathogenesis of acute myeloid leukemia (AML) are undefined. The relationships between patterns of mutations and epigenetic phenotypes are not yet clear. We analyzed the genomes of 200 clinically annotated adult cases of de novo AML, using either whole-genome sequencing (50 cases) or whole-exome sequencing (150 cases), along with RNA and microRNA sequencing and DNA-methylation analysis. AML genomes have fewer mutations than most other adult cancers, with an average of only 13 mutations found in genes. Of these, an average of 5 are in genes that are recurrently mutated in AML. A total...

10.1056/nejmoa1301689 article EN New England Journal of Medicine 2013-05-02

Mutational landscape and significance across 12 major cancer types

OPENALEX - Publications

Cyriac Kandoth, Michael D. McLellan, Fabio Vandin, Kai Ye, Beifang Niu, and 14 more

The Cancer Genome Atlas (TCGA) has used the latest sequencing and analysis methods to identify somatic variants across thousands of tumours. Here we present data and analytical results for point mutations and small insertions/deletions from 3,281 tumours across 12 tumour types as part of the TCGA Pan-Cancer effort. We illustrate the distributions of mutation frequencies, types and contexts across tumour types, and establish their links to tissues of origin, environmental/carcinogen influences, and DNA repair defects. Using the integrated data sets, we identified 127...

10.1038/nature12634 article EN cc-by-nc-sa Nature 2013-10-15

The mutational landscape of lethal castration-resistant prostate cancer

OPENALEX - Publications

Catherine S. Grasso, Yi-Mi Wu, Dan R. Robinson, Xuhong Cao, Saravana M. Dhanasekaran, and 22 more

10.1038/nature11125 article EN Nature 2012-05-18

Pan-cancer network analysis identifies combinations of rare somatic mutations across pathways and protein complexes

OPENALEX - Publications

Mark D.M. Leiserson, Fabio Vandin, Hsin-Ta Wu, Jason R. Dobson, Jonathan V Eldridge, and 14 more

10.1038/ng.3168 article EN Nature Genetics 2014-12-15

De novo discovery of mutated driver pathways in cancer

OPENALEX - Publications

Fabio Vandin, Eli Upfal, Benjamin J. Raphael

Next-generation DNA sequencing technologies are enabling genome-wide measurements of somatic mutations in large numbers of cancer patients. A major challenge in the interpretation of these data is to distinguish functional “driver mutations” important for cancer development from random “passenger mutations.” A common approach for identifying driver mutations is to find genes that are mutated at significant frequency in a large cohort of cancer genomes. This approach is confounded by the observation that driver mutations target multiple cellular signaling and regulatory pathways. Thus,...

10.1101/gr.120477.111 article EN cc-by-nc Genome Research 2011-06-07

Algorithms for Detecting Significantly Mutated Pathways in Cancer

OPENALEX - Publications

Fabio Vandin, Eli Upfal, Benjamin J. Raphael

Recent genome sequencing studies have shown that the somatic mutations that drive cancer development are distributed across a large number of genes. This mutational heterogeneity complicates efforts to distinguish functional mutations from sporadic, passenger mutations. Since cancer mutations are hypothesized to target a relatively small number of cellular signaling and regulatory pathways, a common practice is to assess whether known pathways are enriched for mutated genes. We introduce an alternative approach that examines mutated genes in the context of a genome-scale gene...

10.1089/cmb.2010.0265 article EN Journal of Computational Biology 2011-03-01

CoMEt: a statistical approach to identify combinations of mutually exclusive alterations in cancer

OPENALEX - Publications

Mark D.M. Leiserson, Hsin-Ta Wu, Fabio Vandin, Benjamin J. Raphael

Cancer is a heterogeneous disease with different combinations of genetic alterations driving its development in different individuals. We introduce CoMEt, an algorithm to identify combinations of alterations that exhibit a pattern of mutual exclusivity across individuals, often observed for alterations in the same pathway. CoMEt includes an exact statistical test for mutual exclusivity and techniques to perform simultaneous analysis of multiple sets of mutually exclusive and subtype-specific alterations. We demonstrate that CoMEt outperforms existing approaches on simulated and real data. We apply CoMEt to five...

10.1186/s13059-015-0700-7 article EN cc-by Genome Biology 2015-08-07
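
The idea of testing for mutual exclusivity can be illustrated on a single pair of genes. The sketch below is a simpler pairwise stand-in and not CoMEt's own test: it assumes binary mutation vectors over the same individuals and computes a one-sided hypergeometric tail, that is, the probability of seeing at most the observed number of co-mutated individuals when the per-gene mutation counts are held fixed. The function name and input layout are hypothetical.

```python
from math import comb

def mutual_exclusivity_pvalue(mut_a, mut_b):
    """One-sided hypergeometric p-value for a pair of alterations.

    mut_a, mut_b: equal-length sequences of 0/1 flags, one per individual
    (hypothetical input layout). A small p-value means the two alterations
    co-occur less often than expected with the margins held fixed.
    """
    n = len(mut_a)
    a = sum(mut_a)                      # individuals with alteration A
    b = sum(mut_b)                      # individuals with alteration B
    k = sum(1 for x, y in zip(mut_a, mut_b) if x and y)  # both altered
    lowest = max(0, a + b - n)          # smallest feasible overlap
    favourable = sum(comb(a, i) * comb(n - a, b - i)
                     for i in range(lowest, k + 1))
    return favourable / comb(n, b)
```

For ten individuals where one alteration covers the first five and the other covers the last five, the overlap is zero and the p-value is 1/252, the probability of a perfectly disjoint arrangement under these margins.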

Discovery of Mutated Subnetworks Associated with Clinical Data in Cancer

OPENALEX - Publications

Fabio Vandin, Patrick G. Clay, Eli Upfal, Benjamin J. Raphael

10.1142/9789814366496_0006 article EN Biocomputing 2011-12-01

HIT'nDRIVE: patient-specific multidriver gene prioritization for precision oncology

OPENALEX - Publications

Raunak Shrestha, Ermin Hodzic, Thomas Sauerwald, Phuong Dao, Kendric Wang, and 6 more

Prioritizing molecular alterations that act as drivers of cancer remains a crucial bottleneck in therapeutic development. Here we introduce HIT'nDRIVE, a computational method that integrates genomic and transcriptomic data to identify a set of patient-specific, sequence-altered genes, with sufficient collective influence over dysregulated transcripts. HIT'nDRIVE aims to solve the “random walk facility location” (RWFL) problem in a gene (or protein) interaction network, which differs from the standard facility location problem by...

10.1101/gr.221218.117 article EN cc-by-nc Genome Research 2017-07-18

Computational Pan-Genomics: Status, Promises and Challenges

OPENALEX - Publications

Tobias Marschall, Manja Marz, Thomas Abeel, Louis J. Dijkstra, Bas E. Dutilh, and 54 more

Many disciplines, from human genetics and oncology to plant breeding, microbiology and virology, commonly face the challenge of analyzing rapidly increasing numbers of genomes. In the case of Homo sapiens, the number of sequenced genomes will approach hundreds of thousands in the next few years. Simply scaling up established bioinformatics pipelines will not be sufficient for leveraging the full potential of such rich genomic datasets. Instead, novel, qualitatively different computational methods and paradigms are needed. We...

10.1101/043430 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2016-03-12

Attention-Based Deep Learning Framework for Human Activity Recognition With User Adaptation

OPENALEX - Publications

Davide Buffelli, Fabio Vandin

Sensor-based human activity recognition (HAR) requires predicting the action of a person based on sensor-generated time series data. HAR has attracted major interest in the past few years, thanks to the large number of applications enabled by modern ubiquitous computing devices. While several techniques based on hand-crafted feature engineering have been proposed, the current state-of-the-art is represented by deep learning architectures that automatically obtain high level representations and that use recurrent neural...

10.1109/jsen.2021.3067690 article EN IEEE Sensors Journal 2021-03-22

De novo pathway-based biomarker identification

OPENALEX - Publications

Nicolás Alcaraz, Markus List, Richa Batra, Fabio Vandin, Henrik J. Ditzel, and 1 more

Gene expression profiles have been extensively discussed as an aid to guide therapy by predicting disease outcome for patients suffering from complex diseases, such as cancer. However, prediction models built upon single-gene (SG) features show poor stability and performance on independent datasets. Attempts to mitigate these drawbacks have led to the development of network-based approaches that integrate pathway information to produce meta-gene (MG) features. Also, MG approaches have only dealt with the two-class problem of good...

10.1093/nar/gkx642 article EN cc-by-nc Nucleic Acids Research 2017-07-13

Clustering uncertain graphs

OPENALEX - Publications

Matteo Ceccarello, Carlo Fantozzi, Andrea Pietracaprina, Geppino Pucci, Fabio Vandin

An uncertain graph 𝒢 = (V, E, p : E → (0, 1]) can be viewed as a probability space whose outcomes (referred to as possible worlds) are subgraphs of 𝒢 where any edge e ∈ E occurs with probability p(e), independently of the other edges. These graphs naturally arise in many application domains where data management systems are required to cope with uncertainty in interrelated data, such as computational biology, social network analysis, network reliability, and privacy enforcement, among others. For this reason, it is important to devise fundamental...

10.1145/3186728.3164143 article EN Proceedings of the VLDB Endowment 2017-12-01
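
The possible-world view of an uncertain graph described in the abstract above can be made concrete with a small sampling sketch. It is not the paper's clustering algorithm: it only draws possible worlds by keeping each edge independently with its probability, and uses them for a Monte Carlo estimate of how often two nodes end up connected. The function names and the dictionary layout for edge probabilities are assumptions made for illustration.

```python
import random
from collections import deque

def sample_possible_world(edge_probs, rng):
    """Keep each edge independently with its probability p(e).

    edge_probs: dict mapping an undirected edge (u, v) to p(e) in (0, 1]
    (hypothetical input layout).
    """
    return [edge for edge, p in edge_probs.items() if rng.random() < p]

def connected(world, source, target):
    """Breadth-first search over the sampled, undirected edges."""
    adjacency = {}
    for u, v in world:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False

def estimate_connection_probability(edge_probs, source, target,
                                    samples=1000, seed=0):
    """Fraction of sampled possible worlds in which source reaches target."""
    rng = random.Random(seed)
    hits = sum(connected(sample_possible_world(edge_probs, rng), source, target)
               for _ in range(samples))
    return hits / samples
```

With every edge probability equal to 1 the only possible world is the full graph, so the estimate is exactly the deterministic reachability answer. Lower probabilities give an estimate whose accuracy depends on the number of samples.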

Algorithms on evolving graphs

OPENALEX - Publications

Aris Anagnostopoulos, Ravi Kumar, Mohammad Mahdian, Eli Upfal, Fabio Vandin

Motivated by applications that concern graphs that are evolving and massive in nature, we define a new general framework for computing with such graphs. In our framework, the graph changes over time and an algorithm can only track these changes by explicitly probing the graph. This framework captures the inherent tradeoff between the complexity of maintaining an up-to-date view of the graph and the quality of results computed with the available view. We apply this framework to two classical graph connectivity problems, namely, path connectivity and minimum spanning trees, and obtain efficient algorithms.

10.1145/2090236.2090249 article EN 2012-01-08

Simultaneous Inference of Cancer Pathways and Tumor Progression from Cross-Sectional Mutation Data

OPENALEX - Publications

Benjamin J. Raphael, Fabio Vandin

Recent cancer sequencing studies provide a wealth of somatic mutation data from a large number of patients. One of the most intriguing and challenging questions arising from this data is to determine whether the temporal order of somatic mutations in a cancer follows any common progression. Since we usually obtain only one sample from a patient, such inferences are commonly made from cross-sectional data from different patients. This analysis is complicated by the extensive variation in the somatic mutations across different patients, variation that is reduced by examining combinations of mutations in various pathways. Thus far, methods...

10.1089/cmb.2014.0161 article EN Journal of Computational Biology 2015-03-18

Mining top-K frequent itemsets through progressive sampling

OPENALEX - Publications

Andrea Pietracaprina, Matteo Riondato, Eli Upfal, Fabio Vandin

10.1007/s10618-010-0185-7 article EN Data Mining and Knowledge Discovery 2010-07-22

Accurate Computation of Survival Statistics in Genome-Wide Studies

OPENALEX - Publications

Fabio Vandin, Alexandra Papoutsaki, Benjamin J. Raphael, Eli Upfal

A key challenge in genomics is to identify genetic variants that distinguish patients with different survival time following diagnosis or treatment. While the log-rank test is widely used for this purpose, nearly all implementations of the log-rank test rely on an asymptotic approximation that is not appropriate in many genomics applications. This is because: the two populations determined by a genetic variant may have very different sizes; and the evaluation of many possible variants demands highly accurate computation of very small p-values. We demonstrate this problem for cancer genomic data where...

10.1371/journal.pcbi.1004071 article EN cc-by PLoS Computational Biology 2015-05-07
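
To show what avoiding the asymptotic approximation can look like, the sketch below computes the log-rank observed-minus-expected statistic and estimates a two-sided p-value by permuting group labels. This Monte Carlo procedure is a generic stand-in and not the method of the paper; with a finite number of permutations it also cannot resolve the very small p-values the abstract is concerned with. Function names, the 0/1 group encoding, and the 0/1 event encoding (0 meaning censored) are assumptions.

```python
import random

def logrank_observed_minus_expected(times, events, groups):
    """Sum over event times of (observed - expected) events in group 1.

    times: follow-up time per patient; events: 1 if the event was observed,
    0 if censored; groups: 0/1 group label per patient (assumed encoding).
    """
    total = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = [g for ti, g in zip(times, groups) if ti >= t]
        died = [g for ti, e, g in zip(times, events, groups) if ti == t and e]
        expected = len(died) * sum(at_risk) / len(at_risk)
        total += sum(died) - expected
    return total

def permutation_pvalue(times, events, groups, permutations=2000, seed=0):
    """Two-sided Monte Carlo p-value obtained by shuffling group labels."""
    rng = random.Random(seed)
    observed = abs(logrank_observed_minus_expected(times, events, groups))
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        value = abs(logrank_observed_minus_expected(times, events, shuffled))
        if value >= observed - 1e-12:   # tolerance for floating-point ties
            hits += 1
    # The +1 terms keep the estimate strictly positive.
    return (hits + 1) / (permutations + 1)
```

For four patients with event times 1, 2, 3, 4 and the first two in group 1, the statistic is 0.5 + 2/3 = 7/6: group 1 accounts for both early events while only part of each was expected.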

SPuManTE

OPENALEX - Publications

Leonardo Pellegrina, Matteo Riondato, Fabio Vandin

We present SPuManTE, an efficient algorithm for mining significant patterns from a transactional dataset. SPuManTE controls the Family-wise Error Rate: it ensures that the probability of reporting one or more false discoveries is less than a user-specified threshold. A key ingredient of SPuManTE is UT, our novel unconditional statistical test for evaluating the significance of a pattern, which requires fewer assumptions on the data generation process and is more appropriate for a knowledge discovery setting than classical conditional tests, such as...

10.1145/3292500.3330978 article EN 2019-07-25

Permutation Strategies for Mining Significant Sequential Patterns

OPENALEX - Publications

Andrea Tonon, Fabio Vandin

The identification of significant patterns, defined as patterns whose frequency significantly deviates from what is expected under a suitable null model for the data, is a key data mining task with application in several areas. We present PROMISE, an algorithm for identifying significant sequential patterns while guaranteeing that the probability that one or more false discoveries are reported in output (i.e., the Family-Wise Error Rate, FWER) is less than a user-defined threshold. PROMISE employs the Westfall-Young method to correct for multiple...

10.1109/icdm.2019.00169 article EN 2019 IEEE International Conference on Data Mining (ICDM) 2019-11-01
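
The Westfall-Young method named in the abstract above controls the FWER by comparing each observed statistic with the permutation distribution of the most extreme statistic over all hypotheses. The sketch below shows the generic single-step max-statistic variant on a toy setting that is an assumption made here: binary pattern indicators and binary class labels, with the difference in pattern frequency between classes as the statistic. PROMISE itself works on sequential patterns with its own null models, so this is an illustration of the correction only; all names are hypothetical.

```python
import random

def frequency_gap(column, labels):
    """Absolute difference in pattern frequency between the two classes.

    column: 0/1 indicator of the pattern per transaction; labels: 0/1 class
    per transaction. Both classes must be non-empty.
    """
    n1 = sum(labels)
    n0 = len(labels) - n1
    f1 = sum(c for c, y in zip(column, labels) if y == 1) / n1
    f0 = sum(c for c, y in zip(column, labels) if y == 0) / n0
    return abs(f1 - f0)

def westfall_young_maxt(columns, labels, alpha=0.05, permutations=1000, seed=0):
    """Single-step max-statistic adjusted p-values via label permutation."""
    rng = random.Random(seed)
    observed = [frequency_gap(col, labels) for col in columns]
    shuffled = list(labels)
    max_stats = []
    for _ in range(permutations):
        rng.shuffle(shuffled)           # break any pattern/label association
        max_stats.append(max(frequency_gap(col, shuffled) for col in columns))
    adjusted = [(1 + sum(m >= t for m in max_stats)) / (1 + permutations)
                for t in observed]
    significant = [i for i, p in enumerate(adjusted) if p <= alpha]
    return significant, adjusted
```

Because every pattern is compared against the same distribution of maxima, the number of hypotheses is accounted for without assuming independence between patterns, which is the main reason permutation corrections are attractive for pattern mining.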

An Efficient Rigorous Approach for Identifying Statistically Significant Frequent Itemsets

OPENALEX - Publications

Adam Kirsch, Michael Mitzenmacher, Andrea Pietracaprina, Geppino Pucci, Eli Upfal, and 1 more

As advances in technology allow for the collection, storage, and analysis of vast amounts of data, the task of screening and assessing the significance of discovered patterns is becoming a major challenge in data mining applications. In this work, we address significance in the context of frequent itemset mining. Specifically, we develop a novel methodology to identify a meaningful support threshold s* for a dataset, such that the number of itemsets with support at least s* represents a substantial deviation from what would be expected in a random dataset with the same number of transactions...

10.1145/2220357.2220359 article EN Journal of the ACM 2012-06-01

Efficient Mining of the Most Significant Patterns with Permutation Testing

OPENALEX - Publications

Leonardo Pellegrina, Fabio Vandin

The extraction of patterns displaying significant association with a class label is a key data mining task with wide application in many domains. We study a variant of the problem that requires mining the top-k statistically significant patterns, thus providing tight control on the number of patterns reported in output. We develop TopKWY, the first algorithm to mine the top-k significant patterns while rigorously controlling the family-wise error rate of the output, and provide theoretical evidence of its effectiveness. TopKWY crucially relies on a novel strategy to explore statistically significant patterns and on several key implementation...

10.1145/3219819.3219997 article EN 2018-07-19
