Michael R. Crusoe

ORCID: 0000-0002-2961-9670
Research Areas
  • Scientific Computing and Data Management
  • Research Data Management Practices
  • Distributed and Parallel Computing Systems
  • Genetics, Bioinformatics, and Biomedical Research
  • RNA and protein synthesis mechanisms
  • Genomics and Phylogenetic Studies
  • Advanced Data Storage Technologies
  • Biomedical and Engineering Education
  • Simulation Techniques and Applications
  • Data Quality and Management
  • Biomedical Text Mining and Ontologies
  • Advanced Control Systems Optimization
  • Metabolomics and Mass Spectrometry Studies
  • Cell Image Analysis Techniques
  • Gene expression and cancer classification
  • Ancient and Medieval Archaeology Studies
  • Interdisciplinary Research and Collaboration
  • Advanced Biosensing Techniques and Applications
  • Mineralogy and Gemology Studies
  • China's Ethnic Minorities and Relations
  • Algorithms and Data Compression
  • Anomaly Detection Techniques and Applications
  • Peer-to-Peer Network Technologies
  • Green IT and Sustainability
  • User Authentication and Security Systems

Zuse Institute Berlin
2025

Forschungszentrum Jülich
2023-2024

Vrije Universiteit Amsterdam
2020-2024

Freie Universität Berlin
2024

Dutch Techcentre for Life Sciences
2024

Center for Life Sciences
2023

Institut Pasteur
2022

Software (Spain)
2022

English Heritage
2022

Michigan State University
2015-2019

Michael C. Schatz, Alexander Ostrovsky, Alexandru Mahmoud, Andrew Lonie, Anna Syme, Anne Fouilloux, Anthony Bretaudeau, Anton Nekrutenko, Anup Kumar, Arthur C. Eschenlauer, Assunta D DeSanto, Aysam Guerler, Beatriz Serrano-Solano, Bérénice Batut, Björn Grüning, Bradley W. Langhorst, Bridget Carr, Bryan Raubenolt, Cameron Hyde, Catherine J. Bromhead, Christopher B. Barnett, Coline Royaux, Cristóbal Gallardo, Daniel Blankenberg, Daniel Fornika, Dannon Baker, Dave Bouvier, Dave Clements, David Anderson de Lima Morais, David López Tabernero, Delphine Larivière, Engy Nasr, Enis Afgan, Federico Zambelli, Florian Heyl, Fotis Psomopoulos, Frederik Coppens, Gareth Price, Gianmauro Cuccuru, Gildas Le Corguillé, Greg Von Kuster, Gulsum Gudukbay Akbulut, Helena Rasche, Hans-Rudolf Hotz, Ignacio Eguinoa, Igor V. Makunin, Isuru Ranawaka, James Taylor, Jayadev Joshi, Jennifer Hillman-Jackson, Jeremy Goecks, John Chilton, Kaivan Kamali, Keith Suderman, Krzysztof Poterlowicz, Le Bras Yvan, Lucille Lopez-Delisle, Luke Sargent, Madeline E. Bassetti, M. A. Tangaro, Marius van den Beek, Martin Čech, Matthias Bernt, Matthias Fahrner, Mehmet Tekman, Melanie Christine Föll, Michael C. Schatz, Michael R. Crusoe, Miguel Roncoroni, Natalie Kucher, Nate Coraor, Nicholas Stoler, Nick Rhodes, Nicola Soranzo, Niko Pinter, Nuwan Goonasekera, Pablo Moreno, Pavankumar Videm, Mélanie Pétéra, Pietro Mandreoli, Pratik Jagtap, Qiang Gu, Ralf J. M. Weber, Ross Lazarus, Ruben H.P. Vorderman, Saskia Hiltemann, Sergey Golitsynskiy, Shilpa Garg, Simon Bray, Simon Gladman, Simone Leo, Subina Mehta, Timothy J. Griffin, Vahid Jalili, Yves Vandenbrouck

Galaxy is a mature, browser-accessible workbench for scientific computing. It enables scientists to share, analyze and visualize their own data, with minimal technical impediments. A thriving global community continues to use, maintain and contribute to the project, with support from multiple national infrastructure providers that enable freely accessible analysis and training services. The Galaxy Training Network supports free, self-directed, virtual training with >230 integrated tutorials. Project engagement metrics have continued to grow...

10.1093/nar/gkac247 article EN cc-by Nucleic Acids Research 2022-04-14
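
As a hedged illustration of how Galaxy can be driven programmatically rather than through the browser, the sketch below uses the BioBlend client library to talk to a Galaxy server's REST API; the server URL and API key are placeholders, and the calls shown are only a minimal sample of the API surface.

    # Illustrative only: query a Galaxy server through its REST API using the
    # BioBlend client library. The server URL and API key are placeholders.
    from bioblend.galaxy import GalaxyInstance

    gi = GalaxyInstance(url="https://usegalaxy.example.org", key="YOUR_API_KEY")

    # List the workflows visible to this account.
    for wf in gi.workflows.get_workflows():
        print(wf["id"], wf["name"])

    # Create a history to hold the results of an analysis.
    history = gi.histories.create_history(name="bioblend-demo")
    print("new history:", history["id"])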

MGnify (http://www.ebi.ac.uk/metagenomics) provides a free-to-use platform for the assembly, analysis and archiving of microbiome data derived from sequencing microbial populations that are present in particular environments. Over the past 2 years, MGnify (formerly EBI Metagenomics) has more than doubled the number of publicly available analysed datasets held within the resource. Recently, an updated approach to data analysis has been unveiled (version 5.0), replacing the previous single pipeline with multiple analysis pipelines tailored...

10.1093/nar/gkz1035 article EN cc-by Nucleic Acids Research 2019-10-23
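
A minimal sketch of programmatic access to MGnify is shown below, assuming the public REST API root advertised by the service; the attribute names read from the response ("study-name") are assumptions and should be checked against the live API documentation.

    # Illustrative sketch: list a page of public studies from the MGnify REST API.
    # The attribute names accessed below are assumptions and may differ from the
    # live schema; the code reads them defensively.
    import requests

    API_ROOT = "https://www.ebi.ac.uk/metagenomics/api/v1"

    resp = requests.get(f"{API_ROOT}/studies", timeout=30)
    resp.raise_for_status()

    for study in resp.json().get("data", []):
        attrs = study.get("attributes", {})
        print(study.get("id"), attrs.get("study-name", "<name field may differ>"))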

The khmer package is a freely available software library for working efficiently with fixed length DNA words, or k-mers. It provides implementations of a probabilistic k-mer counting data structure, a compressible De Bruijn graph representation, graph partitioning, and digital normalization. khmer is implemented in C++ and Python and is available under the BSD license at ...

10.12688/f1000research.6924.1 preprint EN cc-by F1000Research 2015-09-25
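
The core of khmer is approximate, fixed-memory k-mer counting. The sketch below is not the khmer API; it is a tiny Count-Min-style counter in plain Python that only illustrates the kind of probabilistic data structure the abstract refers to, with arbitrary table sizes and hashing.

    # Minimal illustration of probabilistic k-mer counting (Count-Min style).
    # This is NOT the khmer API; it only sketches the idea of counting k-mers
    # approximately in a fixed amount of memory. Table sizes and the hashing
    # scheme are arbitrary choices for the example.
    import hashlib

    K = 5                                     # k-mer length (arbitrary)
    TABLE_SIZES = [999983, 999979, 999961]    # near-prime table sizes (assumed)
    tables = [[0] * size for size in TABLE_SIZES]

    def _indices(kmer):
        # Derive one table index per table from different slices of one digest.
        digest = hashlib.sha256(kmer.encode()).digest()
        return [int.from_bytes(digest[i * 8:(i + 1) * 8], "big") % size
                for i, size in enumerate(TABLE_SIZES)]

    def add(kmer):
        for table, idx in zip(tables, _indices(kmer)):
            table[idx] += 1

    def count(kmer):
        # Minimum across tables: may overcount on collisions, never undercounts.
        return min(table[idx] for table, idx in zip(tables, _indices(kmer)))

    seq = "ACGTACGTACGT"
    for i in range(len(seq) - K + 1):
        add(seq[i:i + K])

    print(count("ACGTA"))   # -> 2 ("ACGTA" occurs twice in seq)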

The Common Workflow Language (CWL) is an informal, multi-vendor working group consisting of various organizations and individuals that have an interest in the portability of data analysis workflows. Our goal is to create specifications that enable scientists to describe analysis tools and workflows that are powerful, easy to use, portable, and support reproducibility. CWL builds on technologies such as JSON-LD and Avro for data modeling and Docker for portable runtime environments. CWL is designed to express workflows for data-intensive science, such as Bioinformatics, Medical...

10.6084/m9.figshare.3115156.v2 article EN 2016-07-08
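
For readers unfamiliar with CWL, the sketch below writes a minimal CommandLineTool description (counting lines with wc -l) plus a job order from Python and runs it with the cwltool reference runner; the tool, file names and inputs are invented for illustration, and cwltool is assumed to be installed.

    # Illustrative sketch: generate and run a minimal CWL CommandLineTool.
    # File names and the example tool are placeholders, not from the paper.
    import subprocess

    tool_lines = [
        "cwlVersion: v1.2",
        "class: CommandLineTool",
        "baseCommand: [wc, -l]",
        "inputs:",
        "  infile:",
        "    type: File",
        "    inputBinding:",
        "      position: 1",
        "outputs:",
        "  line_count:",
        "    type: stdout",
        "stdout: line_count.txt",
    ]

    with open("count_lines.cwl", "w") as f:
        f.write("\n".join(tool_lines) + "\n")

    with open("job.yml", "w") as f:
        f.write("infile:\n  class: File\n  path: example.txt\n")

    with open("example.txt", "w") as f:
        f.write("one\ntwo\nthree\n")

    # cwltool validates the description and executes the tool.
    subprocess.run(["cwltool", "count_lines.cwl", "job.yml"], check=True)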

Standardizing computational reuse and portability with the Common Workflow Language.

10.1145/3486897 article EN Communications of the ACM 2022-05-20

Computational workflows describe the complex multi-step methods that are used for data collection, data preparation, analytics, predictive modelling, and simulation that lead to new data products. They can inherently contribute to the FAIR data principles: by processing data according to established metadata; by creating metadata themselves during the processing of data; and by tracking and recording data provenance. These properties aid quality assessment and secondary data usage. Moreover, workflows are digital objects in their own right. This paper argues that FAIR principles for workflows need to address...

10.1162/dint_a_00033 article EN Data Intelligence 2019-11-01

Recent trends within computational and data sciences show an increasing recognition and adoption of computational workflows as tools for productivity and reproducibility that also democratize access to platforms and processing know-how. As digital objects to be shared, discovered, and reused, workflows benefit from the FAIR principles, which stand for Findable, Accessible, Interoperable, and Reusable. The Workflows Community Initiative's FAIR Workflows Working Group (WCI-FW), a global and open community of researchers and developers working with computational workflows across disciplines...

10.1038/s41597-025-04451-9 article EN cc-by-nc-nd Scientific Data 2025-02-24

Recording the provenance of scientific computation results is key to supporting traceability, reproducibility and the quality assessment of data products. Several data models have been explored to address this need, providing representations of workflow plans and their executions as well as means of packaging the resulting information for archiving and sharing. However, existing approaches tend to lack interoperable adoption across workflow management systems. In this work we present Workflow Run RO-Crate, an extension of RO-Crate (Research Object...

10.1371/journal.pone.0309210 article EN cc-by PLoS ONE 2024-09-10
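
As a hedged sketch of the packaging format the paper extends, the code below writes a bare-bones ro-crate-metadata.json by hand; it shows only the generic RO-Crate JSON-LD skeleton, not the additional provenance entities defined by the Workflow Run RO-Crate profile, and the file names and descriptions are invented.

    # Illustrative sketch: a hand-rolled minimal ro-crate-metadata.json.
    # The Workflow Run RO-Crate profile described in the paper adds further
    # provenance entities beyond this basic skeleton.
    import json

    metadata = {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {
                "@id": "./",
                "@type": "Dataset",
                "name": "Example workflow run crate (illustrative)",
                "hasPart": [{"@id": "results.csv"}],
            },
            {
                "@id": "results.csv",
                "@type": "File",
                "name": "Output of an example workflow run",
            },
        ],
    }

    with open("ro-crate-metadata.json", "w") as f:
        json.dump(metadata, f, indent=2)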

Software Containers are changing the way scientists and researchers develop, deploy and exchange scientific software. They allow labs of all sizes to easily install bioinformatics software, maintain multiple versions of the same software and combine tools into powerful analysis pipelines. However, containers and software packages should be produced under certain rules and standards in order to be reusable, compatible and easy to integrate into pipelines and workflows. Here, we present a set of recommendations developed by the BioContainers...

10.12688/f1000research.15140.2 preprint EN cc-by F1000Research 2019-03-20

The Pan-African bioinformatics network, H3ABioNet, comprises 27 research institutions in 17 African countries. H3ABioNet is part of the Human Health and Heredity in Africa program (H3Africa), an African-led consortium funded by the US National Institutes of Health and the UK Wellcome Trust, aimed at using genomics to study and improve the health of Africans. A key role of H3ABioNet is to support H3Africa projects by building infrastructure such as portable and reproducible workflows for use on heterogeneous computing environments. Processing and analysis...

10.1186/s12859-018-2446-1 article EN cc-by BMC Bioinformatics 2018-11-29

Squamates include all lizards and snakes, and display some of the most diverse and extreme morphological adaptations among vertebrates. However, compared with birds and mammals, relatively few resources exist for comparative genomic analyses of squamates, hampering efforts to understand the molecular bases of phenotypic diversification in such a speciose clade. In particular, the ∼400 species of anole lizard represent an extensive squamate radiation. Here, we sequence and assemble the draft genomes of three anole species: Anolis...

10.1093/gbe/evy013 article EN cc-by-nc Genome Biology and Evolution 2018-01-18

Software Containers are changing the way scientists and researchers develop, deploy and exchange scientific software. They allow labs of all sizes to easily install bioinformatics software, maintain multiple versions of the same software and combine tools into powerful analysis pipelines. However, containers and software packages should be produced under certain rules and standards in order to be reusable, compatible and easy to integrate into pipelines and workflows. Here, we present a set of recommendations developed by the BioContainers Community...

10.12688/f1000research.15140.1 preprint EN cc-by F1000Research 2018-06-14

A personalized approach based on a patient's or pathogen's unique genomic sequence is the foundation of precision medicine. Genomic findings must be robust and reproducible, and experimental data capture should adhere to the findable, accessible, interoperable, and reusable (FAIR) guiding principles. Moreover, effective precision medicine requires standardized reporting that extends beyond wet-lab procedures to computational methods. The BioCompute framework (https://w3id.org/biocompute/1.3.0) enables reporting of provenance,...

10.1371/journal.pbio.3000099 article EN public-domain PLoS Biology 2018-12-31
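
A rough sketch of what a BioCompute Object looks like as JSON is given below; the domain and field names follow a reading of the BioCompute schema and should be checked against the specification at https://w3id.org/biocompute/1.3.0, and all values are placeholders rather than a validated object.

    # Illustrative sketch of a BioCompute Object (BCO) skeleton as JSON.
    # Domain and field names are assumptions to be checked against the spec;
    # the values are placeholders, not a validated BCO.
    import json

    bco = {
        "spec_version": "https://w3id.org/biocompute/1.3.0",
        "provenance_domain": {
            "name": "Example analysis (placeholder)",
            "version": "1.0",
            "contributors": [{"name": "Jane Doe", "contribution": ["createdBy"]}],
        },
        "usability_domain": [
            "Plain-language statement of what question this analysis answers."
        ],
        "description_domain": {
            "keywords": ["example"],
            "pipeline_steps": [
                {"step_number": 1, "name": "align reads (placeholder step)"}
            ],
        },
        "io_domain": {
            "input_subdomain": [{"uri": "file:///data/reads.fastq"}],
            "output_subdomain": [{"uri": "file:///data/variants.vcf"}],
        },
    }

    print(json.dumps(bco, indent=2))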

Biological, clinical, and pharmacological research now often involves analyses of genomes, transcriptomes, proteomes, and interactomes, within and between individuals and across species. Due to the large data volumes, the analysis and integration of data generated by such high-throughput technologies have become computationally intensive, and analysis can no longer happen on a typical desktop computer. In this chapter we show how to describe and execute the same analysis using a number of workflow systems and how these follow different approaches to tackle execution...

10.1007/978-1-4939-9074-0_24 article EN cc-by Methods in molecular biology 2019-01-01

Scientific workflows have been used almost universally across scientific domains, and have underpinned some of the most significant discoveries of the past several decades. Many of these workflows have high computational, storage, and/or communication demands, and thus must execute on a wide range of large-scale platforms, from large clouds to upcoming exascale high-performance computing (HPC) platforms. These executions must be managed using software infrastructure. Due to the popularity of workflows, workflow management systems (WMSs)...

10.48550/arxiv.2103.09181 preprint EN cc-by-sa arXiv (Cornell University) 2021-01-01

Information integration and workflow technologies for data analysis have always been major fields of investigation in bioinformatics. A range of popular workflow suites are available to support analyses in computational biology. Commercial providers tend to offer prepared applications remote to their clients. However, for most academic environments with local expertise, novel data collection techniques or analyses, it is essential to have all the flexibility of open-source tools and workflow descriptions. Workflows in data-driven science such as...

10.1007/s41019-017-0050-4 article EN cc-by Data Science and Engineering 2017-09-01

Scientific data analyses often combine several computational tools in automated pipelines, or workflows. Thousands of such workflows have been used in the life sciences, though their composition has remained a cumbersome manual process due to a lack of standards for annotation, assembly, and implementation. Recent technological advances have returned the long-standing vision of automated workflow composition into focus. This article summarizes a recent Lorentz Center workshop dedicated to automated composition of workflows in the life sciences. We survey...

10.12688/f1000research.54159.1 preprint EN cc-by F1000Research 2021-09-07

Scientific workflows facilitate the automation of data analysis tasks by integrating various software tools executed in a particular order. To enable transparency and reusability in workflows, it is essential to implement the FAIR principles. Here, we describe our experiences implementing the FAIR principles for metabolomics workflows using the Metabolome Annotation Workflow (MAW) as a case study. MAW is specified using the Common Workflow Language (CWL), allowing for the subsequent execution of the workflow on different workflow engines. MAW is registered using its CWL description...

10.3390/metabo14020118 article EN cc-by Metabolites 2024-02-10
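
The portability claim, that one CWL description can run unchanged on different engines, can be illustrated with the hedged sketch below, which invokes the same (placeholder) workflow file first with cwltool and then with Toil's CWL runner; both runners are assumed to be installed, and the file names are not the actual MAW sources.

    # Illustrative sketch: run the same CWL workflow with two different engines.
    # "maw.cwl" and "params.yml" are placeholder file names for this example.
    import subprocess

    workflow = "maw.cwl"
    job_order = "params.yml"

    # Reference implementation.
    subprocess.run(["cwltool", workflow, job_order], check=True)

    # Toil's CWL runner executes the identical description, e.g. when scaling
    # out to a cluster or cloud backend.
    subprocess.run(["toil-cwl-runner", workflow, job_order], check=True)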

The khmer package is a freely available software library for working efficiently with fixed length DNA words, or k-mers. It provides implementations of a probabilistic k-mer counting data structure, a compressible De Bruijn graph representation, graph partitioning, and digital normalization. khmer is implemented in C++ and Python and is available under the BSD license at http://github.com/ged-lab/khmer/.

10.6084/m9.figshare.979190.v3 article EN 2014-04-01