- Research Data Management Practices
- Scientific Computing and Data Management
- Data Quality and Management
- Big Data and Business Intelligence
- Geological Modeling and Analysis
- Distributed and Parallel Computing Systems
- Geographic Information Systems Studies
- Semantic Web and Ontologies
- Big Data Technologies and Applications
- Digital and Traditional Archives Management
- Geochemistry and Geologic Mapping
- Planetary Science and Exploration
- Privacy-Preserving Technologies in Data
- Software Engineering Research
- Space Exploration and Technology
- Demographic Modeling and Climate Adaptation
- Advanced Data Storage Technologies
- Distributed Systems and Fault Tolerance
- Scientometrics and Bibliometrics Research
- Spatial and Panel Data Analysis
- Advanced X-ray and CT Imaging
- Privacy, Security, and Data Protection
- Open Source Software Innovations
- Technology and Data Analysis
- Disaster Management and Resilience
Columbia University
2010-2024
International Data Group (Sweden)
2023
World Data System
2023
National Bureau of Statistics of China
2023
Earth Island Institute
2017-2021
Earth Science Institute of the Slovak Academy of Sciences
2020-2021
Applied Sciences (United States)
2020
Montclair State University
1999
As information and communication technology has become pervasive in our society, we are increasingly dependent on both digital data and repositories that provide access to and enable the use of such resources. Repositories must earn the trust of the communities they intend to serve and demonstrate that they are reliable and capable of appropriately managing the data they hold.
Reproducibility and reusability of research results is an important concern in scientific communication and science policy. A foundational element of reproducibility and reusability is the open and persistently available presentation of research data. However, many common approaches for primary data publication in use today do not achieve sufficient long-term robustness, openness, accessibility or uniformity. Nor do they permit comprehensive exploitation by modern Web technologies. This has led to several authoritative studies...
In recent years, a number of data identification technologies have been developed which purport to permanently identify digital objects. In this paper, nine technologies and systems for assigning persistent identifiers are assessed for their applicability to Earth science data (ARKs, DOIs, XRIs, Handles, LSIDs, OIDs, PURLs, URIs/URNs/URLs, and UUIDs). The evaluation used four use cases that focused on the suitability of each scheme to provide Unique Identifiers for Earth science data objects, to provide Locators for the objects, to serve as Citable Locators, and to uniquely identify the scientific...
NASA has a long history of collecting and openly sharing scientific data to help users better understand the sun, Earth, solar system, and universe. Over 40 repositories across five broad disciplines work to archive, manage, and care for these valuable assets. To improve interdisciplinary and transdisciplinary science, NASA developed an information governance strategy. The strategy focuses on collaborative approaches to build a more connected and cooperative stewardship community while recognizing the diversity of domain specific...
Information about data quality helps potential data users to determine whether and how data can be used and enables the analysis and interpretation of such data. Providing data quality information improves opportunities for data reuse by increasing the trustworthiness of the data. Recognizing the need for improving the quality of citizen science data, we describe quality assessment and quality control (QA/QC) issues for these data and offer perspectives on aspects of improving or ensuring citizen science data quality and on conducting research on related issues.
Open-source science builds on open and free resources that include data, metadata, software, and workflows. Informed decisions on whether and how to (re)use digital datasets are dependent on an understanding about the quality of the underpinning data and relevant information. However, quality information, being difficult to curate and often context specific, is currently not readily available for sharing within and across disciplines. To help address this challenge and to promote the creation of freely and openly shared information...
The CODATA Data Science Journal is a peer-reviewed, open access, electronic journal, publishing papers on the management, dissemination, use and reuse of research data and databases across all research domains, including science, technology, the humanities and the arts. The scope of the journal includes descriptions of data systems, their implementations and their publication, applications, infrastructures, software, legal, reproducibility and transparency issues, and the availability and usability of complex datasets, with a particular focus on the principles, policies...
The lack of clear references to and descriptions of data sets in published literature limits the usefulness of the data, as well as the reproducibility and credibility of scientific findings.
Science data collection and documentation practices have changed radically over the last several hundred years, most importantly since the advent of the digital age. Data centers, as repositories for that science data, only had their genesis in the early twentieth century, yet have in excess of 50 years of experience managing data. In the Earth Sciences, for the past 15+ years the Federation of Earth Science Information Partners (ESIP) has been working to make Earth science data more discoverable, accessible, and usable by more people. As a part of this effort, the ESIP Data Stewardship...
A dataset, small or big, is often changed to correct errors, apply new algorithms, add new data (e.g., as part of a time series), etc. In addition, datasets might be bundled into collections, distributed in different encodings, or mirrored onto different platforms. All these differences between versions of datasets need to be understood by researchers who want to cite the exact version of the dataset that was used to underpin their research. Failing to do so reduces the reproducibility of research results. Ambiguous identification of datasets also impacts researchers and...
Investments in research that produce scientific and scholarly data can be leveraged by enabling the resulting research data products and services to be used by broader communities and for new purposes, extending reuse beyond the initial users and purposes for which the data were originally collected. Submitting data to a repository offers opportunities for reuse in the future, providing ways for new benefits to be realized from data reuse. Improvements to data repositories that facilitate new uses of data increase the potential for data reuse and for gains in the value of open data that are associated with such reuse. Assessing and certifying the capabilities...
Data do not exist in a vacuum. To be useful, data must be accompanied by context on how they are captured, processed, analyzed, and validated, and by other information that enables interpretation and use.
Ongoing stewardship is required to keep data collections and archives in existence. Scientific data collections may face a range of risk factors that could hinder, constrain, or limit current or future data use. Identifying such risk factors to data use is a key step in preventing or minimizing data loss. This paper presents an analysis of data risk factors that scientific data collections may face, and a data risk assessment matrix to support data risk assessments to help ameliorate those risks. The goals of this work are to inform and enable effective data risk assessment by: a) individuals and organizations who manage data collections, and b) individuals and organizations who want to reduce the risks...
The reuse of software and related artifacts offers the potential for cost savings in various industries and has contributed to the development of cyberinfrastructure that is used by the Earth science community. Developing measures to enable the assessment of software in terms of its reusability can contribute to the efforts of both developers and reusers of software. Draft Reuse Readiness Levels (RRLs) have been developed as an instrument for assessing the maturity of software products for reuse. The process employed to develop the draft RRLs is described, and an initial summary of the topic areas...
Datasets carry cultural and political context at all parts of the data life cycle. Historically, Earth science data repositories' guidance and policies have been a combination of mandates from their funding agencies and the needs of their user communities, typically universities, agencies, and researchers. Consequently, repository practices have rarely taken into consideration the needs of other communities, such as the Indigenous Peoples on whose lands data are often acquired. In recent years, a number of global efforts have worked to improve the conduct of research as well as data policy and practices by the repositories that hold...
Science software has contributed to research practices, but the sustainability of scientific software presents challenges for the future use of software resources. Identifying improvements for science software practices can contribute to the re-use of science software. A focus group study was conducted to identify ways to improve science software practices for the Earth science community. A facilitated, roundtable discussion activity at the 2014 Federation of Earth Science Information Partners (ESIP) Summer Meeting elicited recommendations on community activities to improve science software. These suggestions fell into three broad themes – (1)...
Knowledge about the quality of data and metadata is important to support informed decisions on the (re)use of individual datasets and is an essential part of the ecosystem that supports open science. Quality assessments reflect the reliability and usability of data. They need to be consistently curated, fully traceable, and adequately documented, as these are crucial for sound decision- and policy-making efforts that rely on data. Quality assessments also need to be consistently represented and readily integrated across systems and tools to allow for improved sharing of information on quality at the dataset level for individual quality attribute...
Packaging software assets for reuse can improve the potential for others to adopt the software. Packaging software with appropriate documentation and other resources can facilitate decision-making by those considering adoption and enable them to implement the software more efficiently. Software that can be easily integrated is more likely to be shared and reused by recipients. The NASA Earth Science Data Systems (ESDS) Software Reuse Working Group has been chartered to oversee the process that will maximize the reuse of software components. As part of this work, a portal Web site was created to support...
This work is licensed under a Creative Commons Attribution 3.0 Unported License. Objectives: Scientific data centers and other digital repositories need to continuously improve so that they can meet the challenge of providing stewardship for scientific data that are used by scientists, policy-makers, educators and their students, and the general public. As part of its efforts to improve the capabilities and services offered to communities interested in using data on human interactions in the environment, SEDAC, the NASA Socioeconomic Data and Applications...
Software assets from existing Earth science missions can be reused for the new decadal survey missions that are being planned by NASA in response to the 2007 Earth Science National Research Council (NRC) Study. The new missions will require the development of software to curate, process, and disseminate the data to science users of interest and to the broader NASA mission community. In this paper, we discuss new tools and a blossoming community that are being developed by the Earth Science Data System (ESDS) Software Reuse Working Group (SRWG) to improve capabilities for reusing NASA software assets.