- Scientific Computing and Data Management
- Distributed and Parallel Computing Systems
- Advanced Data Storage Technologies
- Research Data Management Practices
- Data Quality and Management
- Cloud Computing and Resource Management
- Distributed Systems and Fault Tolerance
- Advanced Database Systems and Queries
- Data Management and Algorithms
- Geographic Information Systems Studies
- Business Process Modeling and Analysis
- Mobile Crowdsensing and Crowdsourcing
- Geological Modeling and Analysis
- Environmental Monitoring and Data Management
- Semantic Web and Ontologies
- IoT and Edge/Fog Computing
- Explainable Artificial Intelligence (XAI)
- Service-Oriented Architecture and Web Services
- Blockchain Technology Applications and Security
- Cloud Data Security Solutions
- Machine Learning in Materials Science
- Machine Learning and Data Classification
Oak Ridge National Laboratory
2022-2024
Office of Scientific and Technical Information
2024
Naval Research Laboratory Information Technology Division
2023
University of Chicago
2016-2022
University of Illinois Chicago
2020-2022
Argonne National Laboratory
2017
Exploding data volumes and velocities, new computational methods and platforms, and ubiquitous connectivity demand new approaches to computation in the sciences. These new approaches must enable computation to be mobile, so that, for example, it can occur near data, be triggered by events (e.g., the arrival of new data), be offloaded to specialized accelerators, or run remotely where resources are available. They also require new design approaches in which monolithic applications can be decomposed into smaller components that may in turn be executed separately and on the most suitable...
funcX is a distributed function as a service (FaaS) platform that enables flexible, scalable, and high-performance remote function execution. Unlike centralized FaaS systems, funcX decouples the cloud-hosted management functionality from the edge-hosted execution functionality. funcX's endpoint software can be deployed, by users or administrators, on arbitrary laptops, clouds, clusters, and supercomputers, in effect turning them into function serving systems. funcX's cloud-hosted service provides a single location for registering, sharing, and managing both functions and endpoints...
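To make this execution model concrete, here is a minimal sketch of registering and invoking a function with the funcX Python SDK. The import path and client methods follow funcX-era releases (the project has since been renamed Globus Compute) and may vary by version; the endpoint UUID is a placeholder, not a real deployment.

```python
# Minimal sketch using the funcX SDK (now Globus Compute); the import
# path matches funcX-era releases and may differ in later versions.
from funcx import FuncXClient

def double(x):
    return 2 * x

fxc = FuncXClient()

# Register the function once with the cloud-hosted funcX service.
func_id = fxc.register_function(double)

# Dispatch an invocation to an endpoint you have deployed and may use.
endpoint_id = "00000000-0000-0000-0000-000000000000"  # placeholder UUID
task_id = fxc.run(21, endpoint_id=endpoint_id, function_id=func_id)

# Results are fetched asynchronously; get_result raises while pending.
print(fxc.get_result(task_id))  # -> 42 once the task completes
```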
Growing data volumes and velocities are driving exciting new methods across the sciences in which data analytics and machine learning are increasingly intertwined with research. These new methods require new approaches for scientific computing in which computation is mobile, so that, for example, it can occur near data, be triggered by events (e.g., the arrival of new data), or be offloaded to specialized accelerators. They also require new design approaches in which monolithic applications can be decomposed into smaller components that may in turn be executed separately and on the most...
Many interesting geospatial datasets are publicly accessible on web sites and other online repositories. However, the sheer number of such locations, plus a lack of support for cross-repository search, makes it difficult for researchers to discover and integrate relevant data. We describe here early results from a system, Klimatic, that aims to overcome these barriers to discovery and use by automating the tasks of crawling, indexing, integrating, and distributing geospatial data. Klimatic implements a scalable crawling and processing architecture that uses...
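A hypothetical sketch of the crawl-and-index loop that Klimatic automates appears below; the repository URLs, record schema, and helper names are illustrative assumptions, not Klimatic's actual interfaces.

```python
# Hypothetical crawl-then-index pattern; GEO_SOURCES and the record
# fields are placeholders, not Klimatic's real catalog format.
import requests

GEO_SOURCES = [
    "https://example.org/repo-a/catalog.json",  # placeholder repositories
    "https://example.org/repo-b/catalog.json",
]

def crawl(sources):
    """Fetch each repository catalog and yield its dataset records."""
    for url in sources:
        catalog = requests.get(url, timeout=30).json()
        for record in catalog.get("datasets", []):
            yield record

def index_dataset(index, record):
    """Store minimal cross-repository search fields for one dataset."""
    index[record["id"]] = {
        "title": record.get("title", ""),
        "bbox": record.get("bbox"),         # spatial extent, if present
        "variables": record.get("variables", []),
        "source": record.get("source_url"),
    }

index = {}
for record in crawl(GEO_SOURCES):
    index_dataset(index, record)
```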
Modern large-scale scientific discovery requires multidisciplinary collaboration across diverse computing facilities, including High Performance Computing (HPC) machines and the Edge-to-Cloud continuum. Integrated data analysis plays a crucial role in discovery, especially in the current AI era, by enabling Responsible AI development, FAIR principles, Reproducibility, and User Steering. However, the heterogeneous nature of science poses challenges such as dealing with multiple supporting tools, cross-facility...
The use and reuse of scientific data is ultimately dependent on the ability to understand what those data represent, how they were captured, and how they can be used. In many ways, data are only as useful as the metadata available to describe them. Unfortunately, due to growing data volumes, large distributed collaborations, and a desire to store data for long periods of time, scientific "data lakes" quickly become disorganized and lack the metadata necessary for researchers. New automated approaches are needed to derive metadata from files and to use these metadata for organization and discovery. Here we describe one such...
To mitigate the effects of high-velocity data expansion and to automate the organization of filesystems and data repositories, we have developed Skluma, a system that automatically processes a target filesystem or repository, extracts content- and context-based metadata, and organizes the extracted metadata for subsequent use. Skluma is able to extract diverse metadata, including aggregate values derived from embedded structured data; named entities and latent topics buried within free-text documents; and content encoded in images...
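The following hypothetical sketch illustrates the per-file extractor dispatch pattern this abstract describes: inspect a file's type, run the matching extractor, and collect the resulting metadata. The extractor names and return shapes are assumptions for illustration, not Skluma's actual code.

```python
# Hypothetical per-file extractor dispatch in the style Skluma describes.
import json
from pathlib import Path

def extract_structured(path):
    """Aggregate simple metadata from embedded structured data (JSON here)."""
    try:
        data = json.loads(path.read_text())
    except ValueError:
        return {}
    return {"type": "structured", "keys": sorted(data)} if isinstance(data, dict) else {}

def extract_freetext(path):
    """Crude stand-in for named-entity / topic extraction from free text."""
    words = path.read_text(errors="ignore").split()
    return {"type": "freetext", "n_words": len(words)}

EXTRACTORS = {".json": extract_structured, ".txt": extract_freetext}

def process_repository(root):
    """Walk a target filesystem and collect per-file metadata records."""
    records = {}
    for path in Path(root).rglob("*"):
        extractor = EXTRACTORS.get(path.suffix)
        if path.is_file() and extractor:
            records[str(path)] = extractor(path)
    return records
```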
We introduce Xtract, an automated and scalable system for bulk metadata extraction from large, distributed research data repositories. Xtract orchestrates the application of metadata extractors to groups of files, determining which extractors to apply to each file and, for each extractor and file, where to execute it. A hybrid computing model, built on the funcX federated FaaS platform, enables Xtract to balance tradeoffs between extraction time and data transfer costs by dispatching each task to the most appropriate location. Experiments on a range of clouds and supercomputers show that Xtract can...
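Below is a hypothetical sketch of the time-versus-transfer tradeoff that Xtract's hybrid model balances: estimate, for each candidate site, the cost of moving the file plus the cost of running the extractor there, and dispatch to the cheapest. The cost model and site parameters are illustrative assumptions, not Xtract's scheduler.

```python
# Hypothetical placement cost model; numbers and fields are illustrative.
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    bandwidth_mbps: float   # effective transfer rate to this site
    compute_rate: float     # extraction throughput (MB/s) at this site
    holds_data: bool        # True if the file already resides here

def estimated_cost(site, file_mb):
    """Transfer time (zero if data is local) plus extraction time, seconds."""
    transfer = 0.0 if site.holds_data else file_mb * 8 / site.bandwidth_mbps
    return transfer + file_mb / site.compute_rate

def place_task(sites, file_mb):
    """Dispatch the extraction task to the cheapest site."""
    return min(sites, key=lambda s: estimated_cost(s, file_mb))

sites = [
    Site("edge-storage", bandwidth_mbps=1000, compute_rate=5, holds_data=True),
    Site("cloud", bandwidth_mbps=100, compute_rate=50, holds_data=False),
]
# For a 200 MB file, shipping to the faster cloud beats computing in place.
print(place_task(sites, file_mb=200).name)  # -> cloud
```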
Scientists' capacity to make use of existing data is predicated on their ability to find and understand those data. While significant progress has been made with respect to data publication, and indeed one can point to a number of well-organized, highly utilized repositories, there remain many such repositories in which archived data are poorly described and thus impossible to use. We present Skluma, an automated system designed to process vast amounts of data and extract deeply embedded metadata, latent topics, and relationships between...
Scientific workflows have become integral tools in broad scientific computing use cases. Science discovery is increasingly dependent on workflows to orchestrate large and complex experiments that range from the execution of a cloud-based data preprocessing pipeline to multi-facility instrument-to-edge-to-HPC computational workflows. Given the changing landscape of scientific computing and the evolving needs of emerging applications, it is paramount that the development of novel workflow and system functionalities seek to increase efficiency, resilience, and pervasiveness...
FAIR principles require that scientific data be findable, accessible, interoperable, and reusable. To enable FAIRness, practitioners of a science repository will often construct a rich, searchable index of metadata derived from the data. Unfortunately, manual annotation methods do not scale to the many files generated by large projects; instead, automated extraction systems are needed to scalably parse these files (often with nonstandard schemas requiring specialized parsing strategies) and deposit representative metadata into...
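As a hedged illustration of the "deposit into a searchable index" step, the sketch below uses a plain inverted index in place of whatever search service a production repository would run; the field names and the deposit/search helpers are assumptions.

```python
# Hypothetical metadata deposit into an inverted index; a real repository
# would use a proper search service, this only shows the pattern.
from collections import defaultdict

def deposit(index, file_id, metadata):
    """Add one file's extracted metadata terms to the inverted index."""
    for field, value in metadata.items():
        index[f"{field}:{value}".lower()].add(file_id)

def search(index, term):
    """Return the set of files whose metadata matched the term."""
    return index.get(term.lower(), set())

index = defaultdict(set)
deposit(index, "run42.nc", {"variable": "temperature", "format": "netcdf"})
deposit(index, "run43.csv", {"variable": "salinity", "format": "csv"})
print(search(index, "variable:temperature"))  # -> {'run42.nc'}
```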
The rapid generation of data from distributed IoT devices, scientific instruments, and compute clusters presents unique data management challenges. The influx of large, heterogeneous, and complex data causes repositories to become siloed or generally unsearchable, two problems not currently well addressed by file systems. In this work, we propose Xtract, a serverless middleware to extract metadata from files spread across heterogeneous edge computing resources. In my future work, I intend to study how Xtract can automatically...
The advancement of science is increasingly intertwined with complex computational processes [1]. Scientific workflows are at the heart of this evolution, acting as essential orchestrators for a vast range of experiments. Specifically, these workflows are central to the field of Earth Sciences, where they orchestrate diverse activities, from cloud-based data preprocessing pipelines in environmental modeling to intricate multi-facility instrument-to-edge-to-HPC frameworks for seismic analysis and geophysical simulations [2]...
Many extreme-scale applications require the movement of large quantities of data to, from, and among leadership computing facilities, as well as other scientific facilities and the home institutions of facility users. These applications, particularly when leadership computing facilities are involved, can touch upon edge cases (e.g., terabyte-sized files) that had not been a focus of previous Globus optimization work, which emphasized rather the movement of many smaller (megabyte- to gigabyte-sized) files. We report here on how automated client-driven chunking can be used...
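A minimal sketch of the chunking idea follows, under the assumption that a very large file is split into fixed-size byte ranges that become independent, individually retryable transfer tasks; the chunk size and task representation are illustrative, not the Globus implementation.

```python
# Hypothetical client-driven chunking: split a large file into byte
# ranges that can be transferred as parallel, retryable tasks.
def chunk_ranges(file_size, chunk_size=64 * 2**30):  # assume 64 GiB chunks
    """Yield (offset, length) pairs covering the whole file."""
    offset = 0
    while offset < file_size:
        yield offset, min(chunk_size, file_size - offset)
        offset += chunk_size

# A 1 TiB file becomes 16 independent 64 GiB transfer tasks that can be
# dispatched concurrently and retried individually on failure.
tasks = list(chunk_ranges(2**40))
print(len(tasks))  # -> 16
```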
The Workflows Community Summit gathered 111 participants from 18 countries to discuss emerging trends and challenges in scientific workflows, focusing on six key areas: time-sensitive workflows, AI-HPC convergence, multi-facility workflows, heterogeneous HPC environments, user experience, and FAIR computational workflows. The integration of AI and exascale computing has revolutionized scientific workflows, enabling higher-fidelity models of complex, time-sensitive processes, while introducing challenges in managing heterogeneous environments and multi-facility data dependencies. The rise of large language models is driving...