Flavia Villani

ORCID: 0000-0003-3633-0610
Research Areas
  • Genomics and Phylogenetic Studies
  • Genetic Mapping and Diversity in Plants and Animals
  • Chromosomal and Genetic Variations
  • Epigenetics and DNA Methylation
  • CRISPR and Genetic Engineering
  • Bioinformatics and Genomic Networks
  • Genomics and Chromatin Dynamics
  • Genetic diversity and population structure
  • RNA and protein synthesis mechanisms
  • Genomic variations and chromosomal abnormalities
  • Genomics and Rare Diseases
  • Adipose Tissue and Metabolism
  • Molecular Biology Techniques and Applications
  • Genetic Associations and Epidemiology
  • Genetics, Aging, and Longevity in Model Organisms
  • Biomedical Text Mining and Ontologies
  • SARS-CoV-2 detection and testing
  • Gene expression and cancer classification
  • Ubiquitin and proteasome pathways
  • Evolution and Genetic Dynamics
  • Biomedical and Engineering Education
  • Scientific Computing and Data Management
  • Genetic Neurodegenerative Diseases
  • Fungal and yeast genetics research
  • Artificial Intelligence in Healthcare and Education

University of Tennessee Health Science Center
2021-2025

Institute of Genetics and Biophysics
2020-2021

National Research Council
2021

University of Naples Federico II
2020

Wen‐Wei Liao, Mobin Asri, Jana Ebler, Daniel Doerr, Marina Haukness, Glenn Hickey, Shuangjia Lu, Julian Lucas, Jean Monlong, Haley Abel, Silvia Buonaiuto, Xian Chang, Haoyu Cheng, Justin Chu, Vincenza Colonna, Jordan M. Eizenga, Xiaowen Feng, Christian Fischer, Robert S. Fulton, Shilpa Garg, Cristian Groza, Andrea Guarracino, William T. Harvey, Simon Heumos, Kerstin Howe, Miten Jain, Tsung-Yu Lu, Charles Markello, Fergal J. Martin, Matthew W. Mitchell, Katherine M. Munson, Moses Njagi Mwaniki, Adam M. Novak, Hugh E. Olsen, Trevor Pesout, David Porubský, Pjotr Prins, Jonas A. Sibbesen, Jouni Sirén, Chad Tomlinson, Flavia Villani, Mitchell R. Vollger, Lucinda Antonacci-Fulton, Gunjan Baid, Carl Baker, Anastasiya Belyaeva, Konstantinos Billis, Andrew Carroll, Pi-Chuan Chang, Sarah Cody, Daniel E. Cook, Robert Cook‐Deegan, Omar E. Cornejo, Mark Diekhans, Peter Ebert, Susan Fairley, Olivier Fédrigo, Adam L. Felsenfeld, Giulio Formenti, Adam Frankish, Yan Gao, Nanibaa’ A. Garrison, Carlos García Girón, Richard E. Green, Leanne Haggerty, Kendra Hoekzema, Thibaut Hourlier, Hanlee P. Ji, Eimear E. Kenny, Barbara A. Koenig, Alexey Kolesnikov, Jan O. Korbel, Jennifer Kordosky, Sergey Koren, HoJoon Lee, Alexandra P. Lewis, Hugo Magalhães, Santiago Marco‐Sola, Pierre Marijon, Ann M. Mc Cartney, Jennifer McDaniel, Jacquelyn Mountcastle, Maria Nattestad, Sergey Nurk, Nathan D. Olson, Alice B. Popejoy, Daniela Puiu, Mikko Rautiainen, Allison Regier, Arang Rhie, Samuel Sacco, Ashley D. Sanders, Valérie Schneider, Baergen I. Schultz, Kishwar Shafin, Michael W. Smith, Heidi J. Sofia, Ahmad Abou Tayoun, Françoise Thibaud‐Nissen, Francesca Floriana Tricomi

Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals. These assemblies cover more than 99% of the expected sequence in each genome and are accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic...

10.1038/s41586-023-05896-x article EN cc-by Nature 2023-05-10

Pangenome graphs can represent all variation between multiple reference genomes, but current approaches to build them exclude complex sequences or are based upon a single reference. In response, we developed the PanGenome Graph Builder (PGGB), a pipeline for constructing pangenome graphs without bias or exclusion. PGGB uses all-to-all alignments to build a graph in which we can identify variation, measure conservation, detect recombination events, and infer phylogenetic relationships.

10.1101/2023.04.05.535718 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2023-04-06

The Human Pangenome Reference Consortium (HPRC) presents a first draft human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals. These assemblies cover more than 99% of the expected sequence and are accurate at the structural and base-pair levels. Based on alignments of the assemblies, we generated a draft pangenome that captures known variants and haplotypes, reveals novel alleles at structurally complex loci, and adds 119 million base pairs of euchromatic polymorphic sequence and 1,529 gene...

10.1101/2022.07.09.499321 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2022-07-09

The seventh iteration of the reference genome assembly for Rattus norvegicus, mRatBN7.2, corrects numerous misplaced segments, reduces base-level errors by approximately 9-fold, and increases contiguity by 290-fold compared with its predecessor. Gene annotations are now more complete, improving the mapping precision of genomic, transcriptomic, and proteomics datasets. We jointly analyzed 163 short-read whole-genome sequencing datasets representing 120 laboratory rat strains and substrains using mRatBN7.2. We defined...

10.1016/j.xgen.2024.100527 article EN cc-by Cell Genomics 2024-03-26

The HXB/BXH family of recombinant inbred rat strains is a unique genetic resource that has been extensively phenotyped over 25 years, resulting in a vast dataset of quantitative molecular and physiological phenotypes. We built a pangenome graph from 10x Genomics Linked-Read data for 31 rats to study variation and association mapping. The pangenome includes 0.2 Gb of sequence not present in the reference mRatBN7.2, confirming the capture of substantial additional variation. We validated variants in challenging regions, including complex...

10.1016/j.isci.2025.111835 article EN cc-by-nc-nd iScience 2025-01-01

Pangenomics is a growing field within computational genomics. Many pangenomic analyses use bidirected sequence graphs as their core data model. However, implementing and correctly using this model can be difficult, and the scale of the datasets can be challenging to work at. These challenges have impeded progress in the field. Here, we present a stack of two C++ libraries, libbdsg and libhandlegraph, which use a simple, field-proven interface, designed to expose elementary features of these graphs while preventing common graph...

10.1093/bioinformatics/btaa640 article EN Bioinformatics 2020-07-10

The BXD recombinant inbred (RI) mouse strains are the largest and most deeply phenotyped panel of vertebrate organisms. RIs allow phenotyping of isogenic individuals across virtually any environment or treatment. We performed whole genome sequencing and generated a compendium of SNPs, indels, short tandem repeats, and structural variants in these strains, and used them to analyze phenomic data accumulated over the past 50 years. We show that the BXDs segregate >6 million with high minor allele which derived from...

10.1101/2022.04.21.489063 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2022-04-21

The HXB/BXH family of recombinant inbred rat strains is a unique genetic resource that has been extensively phenotyped over 25 years, resulting in a vast dataset of quantitative molecular and physiological phenotypes. We built a pangenome graph from 10x Genomics Linked-Read data for 31 rats to study variation and association mapping. The pangenome includes 0.2 Gb of sequence not present in the reference mRatBN7.2, confirming the capture of substantial additional variation. We validated variants in challenging regions, including complex...

10.1101/2024.01.10.575041 preprint EN cc-by-nd bioRxiv (Cold Spring Harbor Laboratory) 2024-01-11
HoJoon Lee, Stephanie Greer, Dmitri S. Pavlichin, Bo Zhou, Alexander E. Urban, Tsachy Weissman, Hanlee P. Ji, Wen‐Wei Liao, Mobin Asri, Jana Ebler, Daniel Doerr, Marina Haukness, Glenn Hickey, Shuangjia Lu, Julian Lucas, Jean Monlong, Haley Abel, Silvia Buonaiuto, Xian Chang, Haoyu Cheng, Justin Chu, Vincenza Colonna, Jordan M. Eizenga, Xiaowen Feng, Christian Fischer, Robert S. Fulton, Shilpa Garg, Cristian Groza, Andrea Guarracino, William T. Harvey, Simon Heumos, Kerstin Howe, Miten Jain, Tsung-Yu Lu, Charles Markello, Fergal J. Martin, Matthew W. Mitchell, Katherine M. Munson, Moses Njagi Mwaniki, Adam M. Novak, Hugh E. Olsen, Trevor Pesout, David Porubský, Pjotr Prins, Jonas A. Sibbesen, Chad Tomlinson, Flavia Villani, Mitchell R. Vollger, Lucinda Antonacci-Fulton, Gunjan Baid, Carl Baker, Anastasiya Belyaeva, Konstantinos Billis, Andrew Carroll, Pi-Chuan Chang, Sarah Cody, Daniel E. Cook, Omar E. Cornejo, Mark Diekhans, Peter Ebert, Susan Fairley, Olivier Fédrigo, Adam L. Felsenfeld, Giulio Formenti, Adam Frankish, Yan Gao, Carlos García Girón, Richard E. Green, Leanne Haggerty, Kendra Hoekzema, Thibaut Hourlier, Hanlee P. Ji, Alexey Kolesnikov, Jan O. Korbel, Jennifer Kordosky, HoJoon Lee, Alexandra P. Lewis, Hugo Magalhães, Santiago Marco‐Sola, Pierre Marijon, Jennifer McDaniel, Jacquelyn Mountcastle, Maria Nattestad, Nathan D. Olson, Daniela Puiu, Allison Regier, Arang Rhie, Samuel Sacco, Ashley D. Sanders, Valérie Schneider, Baergen I. Schultz, Kishwar Shafin, Jouni Sirén, Michael W. Smith, Heidi J. Sofia, Ahmad Abou Tayoun, Françoise Thibaud‐Nissen, Francesca Floriana Tricomi, Justin Wagner, Jonathan Wood

The human pangenome, a new reference sequence, addresses many limitations of the current GRCh38 reference. The first release is based on 94 high-quality haploid assemblies from individuals with diverse backgrounds. We employed a k-mer indexing strategy for comparative analysis across multiple assemblies, including the pangenome reference, GRCh38, and CHM13, a telomere-to-telomere assembly. Our approach enabled us to identify a valuable collection of universally conserved sequences across all assemblies, referred to as...

10.1016/j.crmeth.2023.100543 article EN cc-by-nc-nd Cell Reports Methods 2023-08-01

Short tandem repeats (STRs) are a class of rapidly mutating genetic elements typically characterized by repeated units of 1–6 bp. We leveraged whole-genome sequencing data for 152 recombinant inbred (RI) strains from the BXD family of mice to map loci that modulate genome-wide patterns of new mutations arising during parent-to-offspring transmission at STRs. We defined quantitative phenotypes describing the numbers and types of germline STR mutations in each strain and performed quantitative trait locus (QTL) analyses on these phenotypes....

10.1101/gr.277576.122 article EN cc-by-nc Genome Research 2023-05-01

The HXB/BXH family of recombinant inbred rat strains is a unique genetic resource that has been extensively phenotyped over 25 years, resulting in a vast dataset of quantitative molecular and physiological phenotypes. We built a pangenome graph from 10x Genomics Linked-Read data for 31 rats to study variation and association mapping. The pangenome length was on average 2.4 times greater than the corresponding reference mRatBN7.2, confirming the capture of substantial additional variation. We validated variants in challenging...

10.2139/ssrn.4723495 preprint EN 2024-01-01

The ability of SARS-CoV-2 to rapidly mutate represents a remarkable complication. Quantitative evaluation of the effects that these mutations have on the virus structure/function is of great relevance, and the availability of a large number of sequences since the early phases of the pandemic represents a unique opportunity to follow the adaptation of the virus to humans. Here, we evaluated the amino acid mutations and their progression by analyzing publicly available viral genomes at three stages of the pandemic (2020 March 15th and October 7th, 2021 February 7th). Mutations were...

10.1038/s41598-021-04147-1 article EN cc-by Scientific Reports 2021-12-30

Motivation: Pangenomics is a growing field within computational genomics. Many pangenomic analyses use bidirected sequence graphs as their core data model. However, implementing and correctly using this model can be difficult, and the scale of the data sets can be challenging to work at. These challenges have impeded progress in the field. Results: Here we present a stack of two C++ libraries, libbdsg and libhandlegraph, which use a simple, field-proven interface, designed to expose elementary features of these graphs while preventing...

10.1101/2020.04.23.056317 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2020-04-25

Short tandem repeats (STRs) are a class of rapidly mutating genetic elements characterized by repeated units of 1 or more nucleotides. We leveraged whole genome sequencing data for 152 recombinant inbred (RI) strains from the BXD family derived from C57BL/6J and DBA/2J mice to study the effects of genetic background on genome-wide patterns of new mutations at STRs. We defined quantitative phenotypes describing the numbers and types of germline STR mutations in each strain and identified a locus on chromosome 13 associated with the propensity...

10.1101/2022.03.02.482700 preprint EN cc-by-nc bioRxiv (Cold Spring Harbor Laboratory) 2022-03-02

Genetic variations in protein expression are implicated in a broad spectrum of common diseases and complex traits. However, the fundamental genetic architecture of this variation has received comparatively less attention than either mRNA or classical phenotypes. In this study, we systematically quantified proteins in the brains of a large family of rats using tandem mass tag (TMT)-based quantitative mass-spectrometry (MS) technology. We identified a comprehensive proteome of 8,119 proteins from Spontaneously Hypertensive...

10.1101/2024.02.17.580840 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2024-02-21

We created GNQA, a generative pre-trained transformer (GPT) knowledge base driven by performant retrieval augmented generation (RAG) with a focus on aging, dementia, Alzheimer’s and diabetes. We uploaded a corpus of three thousand peer reviewed publications on these topics into the RAG. To address concerns about inaccurate responses and GPT ‘hallucinations’, we implemented a context provenance tracking mechanism that enables researchers to validate responses against the original material and get references to the papers. To assess...

10.1101/2024.10.16.618663 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2024-10-18

DNA methylation is influenced by genetic and non-genetic factors. Here, we chart quantitative trait loci (QTLs) that modulate levels of methylation at highly conserved CpGs using liver methylome data from mouse strains belonging to the BXD Family. A regulatory hotspot on chromosome 5 had the highest density of trans-acting QTLs (trans-meQTLs) associated with multiple distant CpGs. We refer to this locus as meQTL.5a. The trans-modulated CpGs showed age-dependent changes, were enriched in developmental genes,...

10.1101/2023.04.12.536608 preprint EN cc-by-nc bioRxiv (Cold Spring Harbor Laboratory) 2023-04-13

DNA methylation is influenced by genetic and non-genetic factors. Here, we chart quantitative trait loci (QTLs) that modulate levels of methylation at highly conserved CpGs using liver methylome data from mouse strains belonging to the BXD family. A regulatory hotspot on chromosome 5 had the highest density of trans-acting QTLs (trans-meQTLs) associated with multiple distant CpGs. We refer to this locus as meQTL.5a. Trans-modulated CpGs showed age-dependent changes and were enriched in developmental genes, including...

10.1080/15592294.2023.2252631 article EN cc-by Epigenetics 2023-09-10

Linked-read whole genome sequencing methods, such as the 10x Chromium, attach a unique molecular barcode to each high molecular weight DNA molecule. The samples are then sequenced using short-read technology. During analysis, sequence reads sharing the same barcode are aligned to adjacent genomic locations. The pattern of barcode sharing between regions allows the discovery of large structural variants (SVs) in the range of 1 Kb to a few Mb. Most SV calling methods for these data, such as LongRanger, analyze one sample at a time and often produce...

10.1101/2021.11.02.467006 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2021-11-04