Vanessa Sochat

ORCID: 0000-0002-4387-3819
Research Areas
  • Scientific Computing and Data Management
  • Functional Brain Connectivity Studies
  • Distributed and Parallel Computing Systems
  • Research Data Management Practices
  • Advanced Neuroimaging Techniques and Applications
  • Cloud Computing and Resource Management
  • Advanced MRI Techniques and Applications
  • Cell Image Analysis Techniques
  • Big Data and Business Intelligence
  • Data Quality and Management
  • Neural dynamics and brain function
  • Biomedical Text Mining and Ontologies
  • Software System Performance and Reliability
  • Computational Physics and Python Applications
  • Gene expression and cancer classification
  • Mental Health Research Topics
  • Semantic Web and Ontologies
  • Software Engineering Research
  • Online Learning and Analytics
  • Fungal and yeast genetics research
  • Bioinformatics and Genomic Networks
  • Advanced Data Storage Technologies
  • Business Process Modeling and Analysis
  • Model-Driven Software Engineering Techniques
  • Machine Learning and Data Classification

Lawrence Livermore National Laboratory
2021-2024

Stanford University
2014-2021

Computing Center
2018-2021

University Research Co (United States)
2017-2020

Stanford Medicine
2015

Here we present Singularity, software developed to bring containers and reproducibility to scientific computing. Using Singularity containers, developers can work in reproducible environments of their choosing and design, and these complete environments can easily be copied and executed on other platforms. Singularity is an open source initiative that harnesses the expertise of system and software engineers and researchers alike, and integrates seamlessly into common workflows for both of these groups. As its primary use case, Singularity brings mobility of computing to both users and HPC...

10.1371/journal.pone.0177459 article EN public-domain PLoS ONE 2017-05-11

The development of magnetic resonance imaging (MRI) techniques has defined modern neuroimaging. Since its inception, tens of thousands of studies using techniques such as functional MRI and diffusion weighted imaging have allowed for the non-invasive study of the brain. Despite the fact that MRI is routinely used to obtain data for neuroscience research, there has been no widely adopted standard for organizing and describing the data collected in an imaging experiment. This renders sharing and reusing data (within or between labs) difficult if not impossible and unnecessarily...

10.1038/sdata.2016.44 article EN cc-by Scientific Data 2016-06-21

Data analysis often entails a multitude of heterogeneous steps, from the application of various command line tools to the usage of scripting languages like R or Python for the generation of plots and tables. It is widely recognized that data analyses should ideally be conducted in a reproducible way. Reproducibility enables technical validation and regeneration of results on the original or even new data. However, reproducibility alone is by no means sufficient to deliver an analysis that is of lasting impact (i.e., sustainable) for the field, or even just...

10.12688/f1000research.29032.2 preprint EN cc-by F1000Research 2021-04-19

Data analysis often entails a multitude of heterogeneous steps, from the application of various command line tools to the usage of scripting languages like R or Python for the generation of plots and tables. It is widely recognized that data analyses should ideally be conducted in a reproducible way. Reproducibility enables technical validation and regeneration of results on the original or even new data. However, reproducibility alone is by no means sufficient to deliver an analysis that is of lasting impact (i.e., sustainable) for the field, or even just one...

10.12688/f1000research.29032.1 preprint EN cc-by F1000Research 2021-01-18

Here we present NeuroVault, a web based repository that allows researchers to store, share, visualize, and decode statistical maps of the human brain. NeuroVault is easy to use and employs modern web technologies to provide informative visualization of data without the need to install additional software. In addition, it leverages the power of the Neurosynth database to provide cognitive decoding of deposited maps. The data are exposed through a public REST API enabling other services and tools to take advantage of it. NeuroVault is a new resource for researchers interested in...

10.3389/fninf.2015.00008 article EN cc-by Frontiers in Neuroinformatics 2015-04-10

Psychiatric disorders are characterized by major fluctuations in psychological function over the course of weeks and months, but the dynamic characteristics of brain function over this timescale in healthy individuals are unknown. Here, as a proof of concept to address this question, we present the MyConnectome project. An intensive phenome-wide assessment of a single human was performed over a period of 18 months, including functional and structural brain connectivity using magnetic resonance imaging, psychological function and physical health, gene expression and metabolomics. A...

10.1038/ncomms9885 article EN cc-by Nature Communications 2015-12-09

Although the prevalence of autism spectrum disorder (ASD) has risen sharply in the last few years, reaching 1 in 68, the average age of diagnosis in the United States remains close to 4, well past the developmental window when early intervention has the largest gains. This emphasizes the importance of developing accurate methods to detect risk faster than the current standards of care. In the present study, we used machine learning to evaluate one of the best and most widely used instruments for clinical assessment of ASD, the Autism Diagnostic Observation Schedule...

10.1038/tp.2015.7 article EN cc-by-nc-nd Translational Psychiatry 2015-02-24

DataLad is a Python-based tool for the joint management of code, data, and their relationship, built on top of a versatile system for data logistics (git-annex) and the most popular distributed version control system (Git). It adapts principles of open-source software development and distribution to address the technical challenges of data management, data sharing, and digital provenance collection across the life cycle of digital objects. DataLad aims to make data management as easy as managing code. It streamlines procedures to consume, publish, and update data, for data of any size or type, and to link them...

10.21105/joss.03262 article EN cc-by The Journal of Open Source Software 2021-07-01

The administration of behavioral and experimental paradigms for psychology research is hindered by lack of a coordinated effort to develop and deploy standardized paradigms. While several frameworks (Mason and Suri, 2011; McDonnell et al., 2012; de Leeuw, 2015; Lange et al., 2015) have provided infrastructure and methods for individual research groups to develop paradigms, missing is a coordinated effort to develop paradigms linked with a system to easily deploy them. This disorganization leads to redundancy in development, divergent implementations of conceptually identical tasks, disorganized...

10.3389/fpsyg.2016.00610 article EN cc-by Frontiers in Psychology 2016-04-26

Highlights: The manuscript presents a method to calculate sample sizes for fMRI experiments. The power analysis is based on the estimation of the mixture distribution of null and active peaks. The methodology is validated with simulated and real data. Mounting evidence over the last few years suggests that published neuroscience research suffers from low power, especially published fMRI experiments. Not only does low power decrease the chance of detecting a true effect, it also reduces the chance that a statistically significant result indicates a true effect...

10.1101/049429 preprint EN cc-by-nd bioRxiv (Cold Spring Harbor Laboratory) 2016-04-20

Only a tiny fraction of the data and metadata produced by an fMRI study is finally conveyed to the community. This lack of transparency not only hinders the reproducibility of neuroimaging results but also impairs future meta-analyses. In this work we introduce NIDM-Results, a format specification providing a machine-readable description of neuroimaging statistical results along with key image data summarising the experiment. NIDM-Results provides a unified representation of mass univariate analyses including a level of detail consistent with available...

10.1038/sdata.2016.102 article EN cc-by Scientific Data 2016-12-05

Here we present Singularity Hub, a framework to build and deploy Singularity containers for mobility of compute, and the singularity-python software with novel metrics for assessing reproducibility of such containers. Singularity containers make it possible for scientists and developers to package reproducible software, and Singularity Hub adds automation to this workflow by building, capturing metadata for, visualizing, and serving containers programmatically. Our novel metrics, based on custom filters of content hashes of container contents, allow for comparison of an entire container, including...

10.1371/journal.pone.0188511 article EN cc-by PLoS ONE 2017-11-29

Here we present NeuroVault, a web based repository that allows researchers to store, share, visualize, and decode statistical maps of the human brain. NeuroVault is easy to use and employs modern web technologies to provide informative visualization of data without the need to install additional software. In addition, it leverages the power of the Neurosynth database to provide cognitive decoding of deposited maps. The data are exposed through a public REST API enabling other services and tools to take advantage of it. NeuroVault is a new resource for researchers interested in conducting meta-...

10.1101/010348 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2014-10-14

Analyzing Functional Magnetic Resonance Imaging (fMRI) of resting brains to determine the spatial location and activity of intrinsic brain networks--a novel and burgeoning research field--is limited by the lack of ground truth and the tendency of analyses to overfit the data. Independent Component Analysis (ICA) is commonly used to separate the data into signal and Gaussian noise components, and then map these components on to spatial networks. Identifying noise from this data, however, is a tedious process that has proven hard to automate, particularly when...

10.1371/journal.pone.0095493 article EN cc-by PLoS ONE 2014-04-18

Python for Population Genomics (PyPop) is a software package that processes genotype and allele data and performs large-scale population genetic analyses on highly polymorphic multi-locus genotype data. In particular, PyPop tests data conformity to Hardy-Weinberg equilibrium expectations, performs Ewens-Watterson tests for selection, estimates haplotype frequencies, measures linkage disequilibrium, and tests significance. Standardized means of performing these tests is key for contemporary studies of evolutionary biology and population genetics, and these tests are central to genetic studies of disease...

10.3389/fimmu.2024.1378512 article EN cc-by Frontiers in Immunology 2024-04-02

10.21105/joss.00521 article EN cc-by The Journal of Open Source Software 2018-02-04

Computational science has been greatly improved by the use of containers for packaging software and data dependencies. In a scholarly context, the main drivers for using these containers are transparency and support of reproducibility; in turn, a workflow’s reproducibility can be greatly affected by the choices that are made with respect to building containers. In many cases, the build process for the container’s image is created from instructions provided in a Dockerfile format. In support of this approach, we present a set of rules to help researchers write understandable...

10.31219/osf.io/fsd7t preprint EN 2020-04-17

The development of magnetic resonance imaging (MRI) techniques has defined modern neuroimaging. Since its inception, tens of thousands of studies using techniques such as functional MRI and diffusion weighted imaging have allowed for the non-invasive study of the brain. Despite the fact that MRI is routinely used to obtain data for neuroscience research, there has been no widely adopted standard for organizing and describing the data collected in an imaging experiment. This renders sharing and reusing data (within or between labs) difficult if not impossible...

10.1101/034561 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2015-12-16

Here, we present the Scientific Filesystem (SCIF), an organizational format that supports exposure of executables and metadata for discoverability of scientific applications. The format includes a known filesystem structure, a definition for a set of environment variables describing it, and functions for generation of the variables and interaction with the libraries, metadata, and executables located within. SCIF makes it easy to expose metadata, multiple environments, installation steps, files, and entry points to render scientific applications consistent, modular, and discoverable. A SCIF can...

10.1093/gigascience/giy023 article EN cc-by GigaScience 2018-03-13

Only a tiny fraction of the data and metadata produced by an fMRI study is finally conveyed to the community. This lack of transparency not only hinders the reproducibility of neuroimaging results but also impairs future meta-analyses. In this work we introduce NIDM-Results, a format specification providing a machine-readable description of neuroimaging statistical results along with key image data summarising the experiment. NIDM-Results provides a unified representation of mass univariate analyses including a level of detail consistent...

10.1101/041798 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2016-04-13

Containershare is an open source library of containers, both providing itself as a template, library, and production application programming interface (API) for interested users. Specifically, it is a complete container metadata registry that can be freely deployed directly from a Github repository to validate and serve tested, tagged, and version controlled containers, each maintained in independent Github repositories. The library uses several free to use solutions to accomplish this, and brings them together programmatically with steps that are easy for the user...

10.21105/joss.00878 article EN cc-by The Journal of Open Source Software 2018-08-07

Singularity Registry is a non-centralized free and Open Source infrastructure to facilitate management and sharing of institutional or personal containers. A container is the encapsulation of an entire computational environment that can be run consistently if the platform supports it. It can aid in reproducibility (Moreews et al. 2015, Belmann et al. (2015), Boettiger (2014), Santana-Perez and Pérez-Hernández (2015), Wandell et al. (2015)) because different researchers can run the exact same software stack on different underlying (Linux Intel)...

10.21105/joss.00426 article EN cc-by The Journal of Open Source Software 2017-10-16