- Scientific Computing and Data Management
- Research Data Management Practices
- Distributed and Parallel Computing Systems
- Semantic Web and Ontologies
- Advanced Data Storage Technologies
- Advanced Text Analysis Techniques
- Cloud Computing and Resource Management
- Biomedical Text Mining and Ontologies
- Topic Modeling
- Software System Performance and Reliability
- Explainable Artificial Intelligence (XAI)
- Web Data Mining and Analysis
- Data Quality and Management
- Machine Learning in Materials Science
- Business Process Modeling and Analysis
- Computational Drug Discovery Methods
- Distributed Systems and Fault Tolerance
- Geological Modeling and Analysis
- Anomaly Detection Techniques and Applications
- Natural Language Processing Techniques
- Advanced Computational Techniques and Applications
- Big Data and Business Intelligence
- Service-Oriented Architecture and Web Services
- Genetics, Bioinformatics, and Biomedical Research
- Environmental Monitoring and Data Management
Sandia National Laboratories
2024-2025
Brookhaven National Laboratory
2017-2024
Texas State University
2022
Argonne National Laboratory
2022
Michigan State University
2021
Purdue University West Lafayette
2014-2016
Oak Ridge National Laboratory
2000-2014
National Center for Supercomputing Applications
2005
Knoxville College
1997
University of Tennessee at Knoxville
1997
We present a supercomputer-driven pipeline for in silico drug discovery using enhanced sampling molecular dynamics (MD) and ensemble docking. Ensemble docking makes use of MD results by docking compound databases into representative ensembles of protein binding-site conformations, thus taking into account the dynamic properties of the binding sites. We also describe preliminary results obtained for 24 systems involving eight proteins of the proteome of SARS-CoV-2. The pipeline involves temperature replica exchange enhanced sampling, making use of massively parallel...
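A minimal sketch of the ensemble-docking idea described above: each compound is scored against several MD-derived binding-site conformations, and its best score over the ensemble is kept. All names are hypothetical and the scoring function is a stand-in; a real pipeline would call an MD engine and a docking program.

```python
def dock_score(compound, conformation):
    # Placeholder scoring function; a real pipeline would invoke a
    # docking engine here and return its binding-affinity estimate.
    return abs(hash((compound, conformation))) % 100 / 10.0

def ensemble_dock(compounds, conformations):
    """Return each compound's best (lowest) score over the conformational ensemble."""
    results = {}
    for c in compounds:
        # Docking against an ensemble, not one static structure, is what
        # lets the pipeline account for binding-site flexibility.
        results[c] = min(dock_score(c, conf) for conf in conformations)
    return results

scores = ensemble_dock(["ligand_A", "ligand_B"], ["conf_1", "conf_2", "conf_3"])
ranked = sorted(scores, key=scores.get)  # most promising compounds first
```

Taking the minimum (best) score across conformations is one common aggregation choice; averaging or Boltzmann weighting are alternatives.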
Recent trends within computational and data sciences show an increasing recognition and adoption of workflows as tools for productivity and reproducibility that also democratize access to platforms and processing know-how. As digital objects to be shared, discovered, and reused, workflows benefit from the FAIR principles, which stand for Findable, Accessible, Interoperable, and Reusable. The Workflows Community Initiative's FAIR Workflows Working Group (WCI-FW), a global and open community of researchers and developers working with workflows across disciplines...
Understanding the Earth's climate system and how it might be changing is a preeminent scientific challenge. Global climate models are used to simulate past, present, and future climates, and experiments are executed continuously on an array of distributed supercomputers. The resulting data archive, spread over several sites, currently contains upwards of 100 TB of simulation data and is growing rapidly. Looking toward mid-decade and beyond, we must anticipate and prepare for research data holdings of many petabytes. The Earth System Grid (ESG)...
The increase of complexity and advancement in ecological and environmental sciences encourages scientists across the world to collect data from multiple places, times, and thematic scales to verify their hypotheses. Accumulated over time, such data not only increases in amount, but also in the diversity of its sources spread around the world. This poses a huge challenge for scientists who have to manually search for information. To alleviate such problems, ONEMercury has recently been implemented as part of the DataONE project to serve as a portal for accessing...
As science becomes more data-intensive and collaborative, researchers increasingly use larger and more complex data to answer research questions. The capacity of storage infrastructure, the increased sophistication and deployment of sensors, the ubiquitous availability of computer clusters, the development of new analysis techniques, and larger collaborations allow researchers to address grand societal challenges in a way that is unprecedented. In parallel, data repositories have been built to host data in response to requirements of sponsors that research data be publicly available....
dPIDs are an emerging PID technology based on decentralized architectures and self-sovereign identity [1]. They resolve to content-addressed containers, forming persistent storage systems where each object is identified by a unique PID. dPIDs are immune to content drift and resolve deterministically to their mapped content, providing a reproducible binding between the (meta)data and its identifier. As dPIDs take a network-protocol approach to PIDs, the implementation of FDOF recommendations may require further explanation [2]. This presentation is a primer on the technologies...
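The deterministic (meta)data-to-identifier binding described above can be illustrated with plain content addressing: the identifier is derived from the content itself, so resolution either returns exactly the registered bytes or the drift is detectable. This is an illustration of the general principle only, not the actual dPID protocol.

```python
import hashlib

store = {}  # stand-in for a persistent, decentralized storage system

def register(content: bytes) -> str:
    # The identifier is the hash of the content, so the binding between
    # identifier and (meta)data is reproducible by anyone.
    pid = hashlib.sha256(content).hexdigest()
    store[pid] = content
    return pid

def resolve(pid: str) -> bytes:
    content = store[pid]
    # Content drift is detectable: re-hashing must reproduce the PID.
    assert hashlib.sha256(content).hexdigest() == pid
    return content

pid = register(b"dataset v1")
assert resolve(pid) == b"dataset v1"
```

Contrast this with location-based identifiers (e.g., URLs), where the mapped content can silently change while the identifier stays the same.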
Chimbuko is the first in situ, scalable, workflow-level performance analysis tool for trace-level analysis and visualization of application performance. It was developed by the Co-design Center for Online Data Analysis and Reduction, funded by the U.S. Department of Energy's Exascale Computing Project. We provide a detailed description of Chimbuko's architecture and illustrate its online and offline capabilities with multiple use cases. We also present results on its deployment and scalability as applied to a high-energy physics workflow running at large scale...
The integrity of science and engineering research is grounded in assumptions of rigor and transparency on the part of those engaging in such research. In the HPC community, efforts to strengthen this integrity take the form of reproducibility initiatives. In a recent survey of the SC conference community, we collected information about reproducibility initiative activities. We present the results in this article. Results show that these activities have contributed to higher levels of awareness among technical program participants, and hint at contributing to greater scientific impact for...
In the emerging world of Grid Computing, shared computational, data, and other distributed resources are becoming available to enable scientific advancement through collaborative research and collaboratories. This paper describes the increasing role of ontologies in the context of Grid Computing for obtaining, comparing, and analyzing data. We present ontology entities in a declarative model that provides an outline of an information model. Relationships between the concepts are also given. The implementation of some of the concepts described in this model is discussed...
We propose an approach for improved reproducibility that includes capturing and relating provenance characteristics and performance metrics. We discuss two use cases: scientific reproducibility of results in the Energy Exascale Earth System Model (E3SM, previously ACME) and molecular dynamics workflows on HPC platforms. To capture and persist data from these workflows, we have designed and developed the Chimbuko and ProvEn frameworks. Chimbuko captures provenance and enables detailed single-workflow analysis. ProvEn is a hybrid, queryable system for storing and analyzing...
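The pairing of provenance with performance metrics can be sketched as a wrapper that records, for each workflow step, what ran, on what inputs, and how long it took. The record structure below is hypothetical and much simpler than what Chimbuko or ProvEn actually capture.

```python
import json
import time

def run_step(name, func, *args):
    """Run one workflow step and return (result, provenance record)."""
    start = time.perf_counter()
    result = func(*args)
    record = {
        "step": name,                          # what ran
        "inputs": [repr(a) for a in args],     # on what inputs
        "output": repr(result),                # what it produced
        "wall_time_s": time.perf_counter() - start,  # performance metric
    }
    return result, record

provenance = []
total, rec = run_step("sum_inputs", sum, [1, 2, 3])
provenance.append(rec)
print(json.dumps(provenance, indent=2))  # persistable, queryable trail
```

Relating the provenance trail to the recorded metrics is what allows asking whether a re-run that produced different numbers also behaved differently.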
A growing disparity between supercomputer computation speeds and I/O rates means that it is rapidly becoming infeasible to analyze application output only after it has been written to a file system. Instead, data-generating applications must run concurrently with data reduction and/or analysis operations, with which they exchange information via high-speed methods such as interprocess communications. The resulting parallel computing motif, online data analysis and reduction (ODAR), has important implications for both HPC systems...
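The ODAR motif can be sketched in miniature: a data-generating "simulation" thread streams chunks to a concurrent reduction/analysis thread through an in-memory queue, so no data round-trips through the file system. Threads and a `queue.Queue` stand in for the high-speed interprocess methods the abstract mentions.

```python
import queue
import threading

q = queue.Queue(maxsize=8)  # bounded: producer blocks if analysis falls behind
reduced = []

def simulate(n_chunks):
    for i in range(n_chunks):
        q.put(list(range(i, i + 4)))  # stand-in for one simulation output chunk
    q.put(None)                       # sentinel: no more data

def analyze():
    while (chunk := q.get()) is not None:
        reduced.append(sum(chunk))    # stand-in for online reduction/analysis

producer = threading.Thread(target=simulate, args=(5,))
consumer = threading.Thread(target=analyze)
producer.start(); consumer.start()
producer.join(); consumer.join()
# reduced now holds one value per chunk, computed while the "simulation" ran
```

The bounded queue also illustrates the back-pressure coupling between producer and analyzer that makes ODAR a systems concern, not just an application one.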
Due to the sheer volume of data, it is typically impractical to analyze the detailed performance of an HPC application running at scale. While conventional small-scale benchmarking and scaling studies are often sufficient for simple applications, many modern workflow-based applications couple multiple elements with competing resource demands and complex inter-communication patterns, which cannot easily be studied in isolation at small scale. This work discusses Chimbuko, a performance analysis framework that provides...
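One way to make at-scale trace analysis tractable, in the spirit of Chimbuko's online approach, is to keep only running statistics per function and flag executions whose duration deviates strongly from them, discarding the bulk of the trace. The threshold rule below is a deliberate simplification, not Chimbuko's actual algorithm.

```python
import math

class RunningStats:
    """Streaming mean/std (Welford's algorithm): O(1) memory per function."""
    def __init__(self):
        self.n = 0; self.mean = 0.0; self.m2 = 0.0
    def update(self, x):
        self.n += 1
        d = x - self.mean
        self.mean += d / self.n
        self.m2 += d * (x - self.mean)
    def std(self):
        return math.sqrt(self.m2 / self.n) if self.n > 1 else 0.0

def detect_anomalies(durations, sigma=3.0):
    stats, anomalies = RunningStats(), []
    for i, d in enumerate(durations):
        # Flag only once a baseline exists; then update with every sample.
        if stats.n > 10 and stats.std() > 0 and abs(d - stats.mean) > sigma * stats.std():
            anomalies.append(i)
        stats.update(d)
    return anomalies

trace = [1.0 + 0.01 * (i % 3) for i in range(100)]  # ~1 s calls
trace[60] = 50.0                                    # one pathological call
print(detect_anomalies(trace))  # -> [60]
```

Only the flagged executions (plus the compact statistics) need to be kept, which is what makes workflow-level analysis feasible at scale.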
In January 2019, the US Department of Energy, Office of Science program in Advanced Scientific Computing Research, convened a workshop to identify priority research directions (PRDs) for in situ data management (ISDM). A fundamental finding of the workshop is that the methodologies used to manage data among a variety of tasks can be used to facilitate scientific discovery from many different data sources (simulation, experiment, and sensors, for example), and that being able to do so at numerous computing scales will benefit real-time decision-making,...
The capability to replicate predictions by machine learning (ML) or artificial intelligence (AI) models, and results in scientific workflows that incorporate such ML/AI predictions, is driven by a variety of factors.
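One of those factors is control over randomness. A minimal sketch of pinning the random seed so repeated runs of a workflow step produce identical predictions; the model and data here are hypothetical, and a real workflow would also pin library versions, data snapshots, and hardware-dependent settings.

```python
import random

def train_and_predict(seed):
    # An isolated, seeded RNG rather than the global one, so this step's
    # randomness cannot be perturbed by other parts of the workflow.
    rng = random.Random(seed)
    weights = [rng.uniform(-1, 1) for _ in range(3)]  # stand-in "training"
    x = [0.5, -0.2, 0.1]                              # stand-in input
    return sum(w * xi for w, xi in zip(weights, x))   # stand-in "prediction"

run1 = train_and_predict(seed=42)
run2 = train_and_predict(seed=42)
assert run1 == run2  # identical seed -> identical prediction
```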
We have developed the Manufacturing Agent-Based Emulation System (MABES) as an open framework for the design and analysis of discrete manufacturing systems. MABES currently supports the transition from traditional to lean manufacturing in two major functions: alternative scheduling and control approaches that can be implemented across the extended enterprise, and real-time collaboration among teams during line design stages. MABES bases its support for these functions on two paradigms: distributed agents and synchronous collaboration.