NFDI4DS | UHH-SEMS - Publication Details

Niina Haiminen

ORCID: 0000-0002-8663-1019

About

Contact & Profiles

OpenAlex ID: A5054652878

Research Areas

Genomics and Phylogenetic Studies
Gut microbiota and health
Chromosomal and Genetic Variations
Gene expression and cancer classification
Metabolomics and Mass Spectrometry Studies
Bioinformatics and Genomic Networks
Microbial Community Ecology and Physiology
Animal Genetics and Reproduction
Genetic and phenotypic traits in livestock
Algorithms and Data Compression
RNA and protein synthesis mechanisms
Probiotics and Fermented Foods
Identification and Quantification in Food
Genetic Mapping and Diversity in Plants and Animals
Machine Learning in Bioinformatics
Cocoa and Sweet Potato Agronomy
Dermatology and Skin Diseases
Genetic Associations and Epidemiology
Genomics and Chromatin Dynamics
Topological and Geometric Data Analysis
Biosensors and Analytical Detection
Dental Research and COVID-19
COVID-19 diagnosis using AI
Bioenergy crop production and management
Diabetes and associated disorders

IBM Research - Thomas J. Watson Research Center
2011-2025

IBM (United States)
2011-2024

ORCID
2020

Helsinki Institute for Information Technology
2006-2011

University of Helsinki
2006-2011

The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color

OPENALEX - Publications

Juan Carlos Motamayor Keithanne Mockaitis Jeremy Schmutz Niina Haiminen Donald Livingstone and 25 more

Abstract: Background: Theobroma cacao L. cultivar Matina 1-6 belongs to the most cultivated cacao type. The availability of its genome sequence and of methods for identifying genes responsible for important traits will aid researchers and breeders. Results: We describe the sequencing and assembly of Matina 1-6. The genome is 445 Mbp, which is significantly larger than that of a sequenced Criollo cultivar and more typical of other cultivars. The chromosome-scale assembly, version 1.1, contains 711 scaffolds covering 346.0 Mbp, with a contig N50 of 84.4 kbp and a scaffold N50 of 34.4...

10.1186/gb-2013-14-6-r53 article EN cc-by Genome biology 2013-06-03

Standardized multi-omics of Earth’s microbiomes reveals microbial and metabolite diversity

Justin P. Shaffer Louis‐Félix Nothias Luke Thompson Jon G. Sanders Rodolfo A. Salido and 92 more

Despite advances in sequencing, lack of standardization makes comparisons across studies challenging and hampers insights into the structure and function of microbial communities across multiple habitats on a planetary scale. Here we present a multi-omics analysis of a diverse set of 880 community samples collected for the Earth Microbiome Project. We include amplicon (16S, 18S, ITS) and shotgun metagenomic sequence data, and untargeted metabolomics data (liquid chromatography-tandem mass spectrometry and gas chromatography...

10.1038/s41564-022-01266-x article EN cc-by Nature Microbiology 2022-11-28

Phylogeny-Aware Analysis of Metagenome Community Ecology Based on Matched Reference Genomes while Bypassing Taxonomy

Qiyun Zhu Shi Huang Antonio González Imran McGrath Daniel McDonald and 21 more

We introduce the operational genomic unit (OGU) method, a metagenome analysis strategy that directly exploits sequence alignment hits to individual reference genomes as the minimum unit for assessing the diversity of microbial communities and their relevance to environmental factors. This approach is independent of taxonomic classification, granting the possibility of maximal resolution of community composition, and organizes features into an accurate hierarchy using a phylogenomic tree. The outputs are suitable for contemporary...

10.1128/msystems.00167-22 article EN mSystems 2022-04-04

Human Skin, Oral, and Gut Microbiomes Predict Chronological Age

Shi Huang Niina Haiminen Anna Paola Carrieri Rebecca Hu Lingjing Jiang and 11 more

Considerable evidence suggests that the gut microbiome changes with age or even accelerates aging in adults. Whether age-related changes are more or less prominent than those for other body sites, and whether predictions can be made about a person’s age from a microbiome sample, remain unknown. We therefore combined several large studies from different countries to determine which body site’s microbiome could most accurately predict age. We found that the skin microbiome was best, on average yielding predictions within 4 years of chronological age. This study sets the stage for future...

10.1128/msystems.00630-19 article EN cc-by mSystems 2020-02-10

Challenges in benchmarking metagenomic profilers

Zheng Sun Shi Huang Meng Zhang Qiyun Zhu Niina Haiminen and 6 more

10.1038/s41592-021-01141-3 article EN Nature Methods 2021-05-13

Explainable AI reveals changes in skin microbiome composition linked to phenotypic differences

Anna Paola Carrieri Niina Haiminen Sean Maudsley-Barton Laura‐Jayne Gardiner Barry Murphy and 14 more

Abstract: Alterations in the human microbiome have been observed in a variety of conditions such as asthma, gingivitis, dermatitis and cancer, and much remains to be learned about the links between the microbiome and health. The fusion of artificial intelligence with rich microbiome datasets can offer an improved understanding of the microbiome’s role in health. To gain actionable insights it is essential to consider both the predictive power and the transparency of the models by providing explanations for the predictions. We combine the collection of leg skin microbiome samples from two healthy...

10.1038/s41598-021-83922-6 article EN cc-by Scientific Reports 2021-02-25

SARS-CoV-2 detection status associates with bacterial community composition in patients and the hospital environment

Clarisse Marotz Pedro Belda‐Ferre Farhana Ali Promi Das Shi Huang and 30 more

SARS-CoV-2 is an RNA virus responsible for the coronavirus disease 2019 (COVID-19) pandemic. Viruses exist in complex microbial environments, and recent studies have revealed both synergistic and antagonistic effects of specific bacterial taxa on viral prevalence and infectivity. We set out to test whether bacterial communities predict SARS-CoV-2 occurrence in a hospital setting. We collected 972 samples from hospitalized patients with COVID-19, their health care providers, and hospital surfaces before, during, and after admission. We screened...

10.1186/s40168-021-01083-0 article EN cc-by Microbiome 2021-06-08

EMPress Enables Tree-Guided, Interactive, and Exploratory Analyses of Multi-omic Data Sets

Kalen Cantrell Marcus W. Fedarko Gibraan Rahman Daniel McDonald Yimeng Yang and 25 more

Standard workflows for analyzing microbiomes often include the creation and curation of phylogenetic trees. Here we present EMPress, an interactive web tool for visualizing trees in the context of microbiome, metabolome, and other community data, scalable to trees with well over 500,000 nodes. EMPress provides novel functionality, including ordination integration and animations, alongside many standard tree visualization features, and thus simplifies exploratory analyses of many forms of 'omic data. IMPORTANCE: Phylogenetic trees are integral...

10.1128/msystems.01216-20 article EN cc-by mSystems 2021-03-15

Monitoring the microbiome for food safety and quality using deep shotgun sequencing

Kristen L. Beck Niina Haiminen D. D. Chambliss Stefan Edlund Mark Kunitomi and 17 more

In this work, we hypothesized that shifts in the food microbiome can be used as an indicator of unexpected contaminants or environmental changes. To test this hypothesis, we sequenced the total RNA of 31 high protein powder (HPP) samples of poultry meal pet food ingredients. We developed an analysis pipeline employing a key eukaryotic matrix filtering step that improved microbe detection specificity to >99.96% during in silico validation. The pipeline identified 119 microbial genera per HPP sample on average, with 65 genera present in all...

10.1038/s41538-020-00083-y article EN cc-by npj Science of Food 2021-02-08

SARS-CoV-2 infectivity can be modulated through bacterial grooming of the glycocalyx

Cameron Martino Benjamin P. Kellman Daniel R. Sandoval Thomas Mandel Clausen Robert M. Cooper and 60 more

ABSTRACT: The gastrointestinal (GI) tract is a site of replication for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), and GI symptoms are often reported by patients. SARS-CoV-2 cell entry depends upon heparan sulfate (HS) proteoglycans, which commensal bacteria that bathe the human mucosa are known to modify. To explore gut HS-modifying bacterial abundances and how their presence may impact SARS-CoV-2 infection, we developed a task-based analysis of proteoglycan degradation on large-scale shotgun...

10.1128/mbio.04015-24 article EN cc-by mBio 2025-02-25

GenomicTools: a computational platform for developing high-throughput analytics in genomics

Aristotelis Tsirigos Niina Haiminen Erhan Bilal Filippo Utro

Recent advances in sequencing technology have resulted in the dramatic increase of sequencing data, which, in turn, requires efficient management of computational resources, such as computing time and memory requirements, as well as prototyping of computational pipelines. We present GenomicTools, a flexible computational platform, comprising both a command-line set of tools and a C++ API, for the analysis and manipulation of high-throughput sequencing data such as DNA-seq, RNA-seq, ChIP-seq and MethylC-seq. GenomicTools implements a variety of mathematical operations between sets of genomic regions...

10.1093/bioinformatics/btr646 article EN Bioinformatics 2011-11-22

Food authentication from shotgun sequencing reads with an application on high protein powders

Niina Haiminen Stefan Edlund D. D. Chambliss Mark Kunitomi Bart C. Weimer and 14 more

Abstract: Here we propose that using shotgun sequencing to examine food leads to accurate authentication of ingredients and detection of contaminants. To demonstrate this, we developed a bioinformatic pipeline, FASER (Food Authentication from SEquencing Reads), designed to resolve the relative composition of mixtures of eukaryotic species using RNA or DNA sequencing. Our comprehensive database includes >6000 plants and animals that may be present in food. FASER accurately identified eukaryotic species with 0.4% median absolute difference between...

10.1038/s41538-019-0056-6 article EN cc-by npj Science of Food 2019-11-19

Bacterial modification of the host glycosaminoglycan heparan sulfate modulates SARS-CoV-2 infectivity

Cameron Martino Benjamin P. Kellman Daniel R. Sandoval Thomas Mandel Clausen Clarisse Marotz and 31 more

Abstract: The human microbiota has a close relationship with human disease, and it remodels components of the glycocalyx including heparan sulfate (HS). Studies of the severe acute respiratory syndrome coronavirus (SARS-CoV-2) spike protein receptor binding domain suggest that infection requires binding to HS and angiotensin converting enzyme 2 (ACE2) in a codependent manner. Here, we show that commensal host bacterial communities can modify HS and thereby modulate SARS-CoV-2 spike protein binding, and that these communities change with host age and sex. Common human-associated bacteria...

10.1101/2020.08.17.238444 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2020-08-18

Efficient computation of Faith's phylogenetic diversity with applications in characterizing microbiomes

George Armstrong Kalen Cantrell Shi Huang Daniel McDonald Niina Haiminen and 18 more

The number of publicly available microbiome samples is continually growing. As data set size increases, bottlenecks arise in standard analytical pipelines. Faith's phylogenetic diversity (Faith's PD) is a highly utilized phylogenetic alpha diversity metric that has thus far failed to effectively scale to trees with millions of vertices. Stacked Faith's phylogenetic diversity (SFPhD) enables calculation of this widely adopted diversity metric at a much larger scale by implementing a computationally efficient algorithm. The algorithm reduces the amount of computational resources required,...

10.1101/gr.275777.121 article EN cc-by-nc Genome Research 2021-09-03

Functional profiling of COVID-19 respiratory tract microbiomes

Niina Haiminen Filippo Utro Ed Seabolt Laxmi Parida

Abstract: In response to the ongoing global pandemic, characterizing the molecular-level host interactions of the new coronavirus SARS-CoV-2 responsible for COVID-19 has been at the center of unprecedented scientific focus. However, when the virus enters the body it also interacts with the micro-organisms already inhabiting the host. Understanding the virus-host-microbiome interactions can yield additional insights into the biological processes perturbed by viral invasion. Alterations in the gut microbiome species and metabolites have been noted during...

10.1038/s41598-021-85750-0 article EN cc-by Scientific Reports 2021-03-19

DNA Extraction and Host Depletion Methods Significantly Impact and Potentially Bias Bacterial Detection in a Biological Fluid

Erika Ganda Kristen L. Beck Niina Haiminen Justin D. Silverman Ban Kawas and 4 more

Tracking the bacterial communities present in our food has the potential to inform food safety and product origin. To do so, the entire genetic material present in a sample is extracted using chemical methods or commercially available kits and sequenced using next-generation platforms to provide a snapshot of the microbial composition.

10.1128/msystems.00619-21 article EN mSystems 2021-06-15

Application of Genome Wide Association and Genomic Prediction for Improvement of Cacao Productivity and Resistance to Black and Frosty Pod Diseases

J. Alberto Romero Navarro Wilbert Phillips‐Mora Adriana Arciniegas-Leal Allan Mata-Quirós Niina Haiminen and 7 more

Chocolate is a highly valued and palatable confectionery product. It is primarily made from the processed seeds of the tree species Theobroma cacao. Cacao cultivation is relevant for small-holder farmers throughout the tropics, yet its productivity remains limited by low yields and widespread pathogens. A panel of 148 improved cacao clones was assembled based on disease resistance, and phenotypic single-tree replicated clonal evaluation was performed for 8 years. Using high-density markers, diversity was expressed relative to 10...

10.3389/fpls.2017.01905 article EN cc-by Frontiers in Plant Science 2017-11-14

OGUs enable effective, phylogeny-aware analysis of even shallow metagenome community structures

Qiyun Zhu Shi Huang Antonio G. González Imran McGrath Daniel McDonald and 21 more

Abstract: We introduce the Operational Genomic Unit (OGU), a metagenome analysis strategy that directly exploits sequence alignment hits to individual reference genomes as the minimum unit for assessing the diversity of microbial communities and their relevance to environmental factors. This approach is independent from taxonomic classification, granting the possibility of maximal resolution of community composition, and organizes features into an accurate hierarchy using a phylogenomic tree. The outputs are suitable...

10.1101/2021.04.04.438427 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2021-04-06

Segmentation and dimensionality reduction

Ella Bingham Aristides Gionis Niina Haiminen Heli Hiisilä Heikki Mannila and 1 more

Sequence segmentation and dimensionality reduction have been used as methods for studying high-dimensional sequences; they both reduce the complexity of the representation of the original data. In this paper we study the interplay of these two techniques. We formulate the problem of segmenting a sequence while modeling it with a basis of small size, thus essentially reducing the dimension of the input sequence. We give three different algorithms for this problem: all combine existing methods for sequence segmentation and dimensionality reduction. For proposed algorithms, we prove guarantees on the quality of the solutions...

10.1137/1.9781611972764.33 article EN 2006-04-20

Evaluation of Methods for De Novo Genome Assembly from High-Throughput Sequencing Reads Reveals Dependencies That Affect the Quality of the Results

Niina Haiminen David N. Kuhn Laxmi Parida Isidore Rigoutsos

Recent developments in high-throughput sequencing technology have made low-cost sequencing an attractive approach for many genome analysis tasks. Increasing read lengths, improving quality and the production of increasingly larger numbers of usable sequences per instrument-run continue to make whole-genome assembly an appealing target application. In this paper we evaluate the feasibility of de novo genome assembly from short reads (≤100 nucleotides) through a detailed study involving genomic sequences of various lengths and origin, in conjunction...

10.1371/journal.pone.0024182 article EN cc-by PLoS ONE 2011-09-07

Comparative exomics of Phalaris cultivars under salt stress

Niina Haiminen Manfred Klaas Zeyu Zhou Filippo Utro Paul Cormican and 5 more

Reed canary grass (Phalaris arundinacea) is an economically important forage and bioenergy grass of the temperate regions of the world. Despite its economic importance, it is lacking in public genomic data. We explore comparative exomics of the grass cultivars in the context of response to salt exposure. The limited data set poses challenges to the computational pipeline. As a prerequisite for the comparative study, we generate the Phalaris reference transcriptome sequence, one of the first steps in addressing the issue of paucity of processed genomic data in this species. In addition,...

10.1186/1471-2164-15-s6-s18 article EN cc-by BMC Genomics 2014-10-01

Randomization of real-valued matrices for assessing the significance of data mining results

Markus Ojala Niko Vuokko Aleksi Kallio Niina Haiminen Heikki Mannila

Randomization is an important technique for assessing the significance of data mining results. Given an input data set, a randomization method samples at random from some class of datasets that share certain characteristics with the original data. The measure of interest on the original data is then compared to the measure on the randomized samples to assess its significance. For certain types of data, e.g., gene expression matrices, it is useful to be able to sample datasets that share row and column means and variances with the original data. Testing whether the results of a data mining algorithm on such randomized datasets differ from the results on the true dataset tells us whether the results on the true data were an artifact...

10.1137/1.9781611972788.45 article EN 2008-04-24

Sequencing of a QTL-rich region of the Theobroma cacao genome using pooled BACs and the identification of trait specific candidate genes

F. Alex Feltus Christopher Saski Keithanne Mockaitis Niina Haiminen Laxmi Parida and 9 more

BAC-based physical maps provide for sequencing across an entire genome or a selected sub-genomic region of biological interest. Such a region can be approached with next-generation whole-genome sequencing and assembly as if it were an independent small genome. Using the minimum tiling path as a guide, specific BAC clones representing the prioritized genomic interval are selected, pooled, and used to prepare a sequencing library. This pooled BAC approach was taken to sequence and assemble a QTL-rich region, of ~3 Mbp and represented by twenty-seven BACs, on...

10.1186/1471-2164-12-379 article EN cc-by BMC Genomics 2011-07-27
