- Advanced Data Storage Technologies
- Distributed and Parallel Computing Systems
- Cloud Computing and Resource Management
- Caching and Content Delivery
- Distributed Systems and Fault Tolerance
- Parallel Computing and Optimization Techniques
- Peer-to-Peer Network Technologies
- Software-Defined Networks and 5G
- Interconnection Networks and Systems
- Scientific Computing and Data Management
- Network Traffic and Congestion Control
- Advanced Optical Network Technologies
- Simulation Techniques and Applications
- Advanced Database Systems and Queries
- Software System Performance and Reliability
- Real-Time Simulation and Control Systems
- Cloud Data Security Solutions
- Experimental Learning in Engineering
- Data Management and Algorithms
- Network Security and Intrusion Detection
- Mobile Agent-Based Network Management
- Mobile Ad Hoc Networks
- Particle Detector Development and Performance
- Embedded Systems Design Techniques
- Particle Accelerators and Free-Electron Lasers
Nvidia (United Kingdom), 2024
Nvidia (United States), 2022-2024
Los Alamos National Laboratory, 2015-2021
Carnegie Mellon University, 2021
Centre for High Performance Computing, 2017
Oak Ridge National Laboratory, 2010-2015
Clemson University, 2008-2009
Understanding workload characteristics is critical for optimizing and improving the performance of current systems software, and for architecting new storage systems based on observed patterns. In this paper, we characterize the scientific workloads of one of the world's fastest HPC (High Performance Computing) storage clusters, Spider, at the Oak Ridge Leadership Computing Facility (OLCF). Spider provides an aggregate bandwidth of over 240 GB/s with over 10 petabytes of RAID 6 formatted capacity. OLCF's flagship petascale simulation platform,...
High performance computing fault tolerance depends on scalable parallel file system performance. For more than a decade, scalable bandwidth has been available from the object storage systems that underlie modern parallel file systems, and recently we have seen demonstrations of scalable metadata using dynamic partitioning of the namespace over multiple servers. But even these systems require significant numbers of dedicated servers, and some workloads still experience bottlenecks. We envision exascale file systems that do not require any dedicated server machines. Instead, job...
In this paper we look at the performance characteristics of three tools used to move large data sets over dedicated long-distance networking infrastructure. Although studies of wide area networks have been a frequent topic of interest, such analyses have tended to focus on network latency and peak throughput using traffic generators. This study instead performs an end-to-end analysis that includes reading from the source file system and committing to the remote destination system. An evaluation of data movement is also presented for configurations...
Analysis of large-scale simulation output is a core element of scientific inquiry, but analysis queries may experience significant I/O overhead when the data is not structured for efficient retrieval. While in-situ processing allows improved time-to-insight for many applications, scaling in-situ frameworks to hundreds of thousands of cores can be difficult in practice. DeltaFS in-situ indexing is a new approach for indexing massive amounts of data to achieve efficient point and small-range queries. This paper describes the challenges and lessons learned from this...
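For illustration of why such indexing pays off, a point query over unindexed simulation output must scan every record, whereas indexed output can be narrowed to one partition and binary-searched. The sketch below is a generic simplification under assumed hash-partitioned, sorted-run layout; it is not the DeltaFS implementation.

```python
import bisect

def build_partitions(records, n):
    """Hash-partition and sort records by key (hypothetical layout, not DeltaFS)."""
    parts = [[] for _ in range(n)]
    for k, v in records:
        parts[hash(k) % n].append((k, v))
    return [sorted(p) for p in parts]

def point_query(parts, key):
    """Binary-search only the one partition that can hold `key`."""
    run = parts[hash(key) % len(parts)]
    keys = [k for k, _ in run]
    i = bisect.bisect_left(keys, key)          # O(log n) instead of a full scan
    return run[i][1] if i < len(keys) and keys[i] == key else None

records = [(f"particle{j}", j * 0.5) for j in range(30)]
parts = build_partitions(records, 4)
print(point_query(parts, "particle7"))          # -> 3.5
```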
Ceph is an emerging open-source parallel distributed file and storage system. By design, it leverages unreliable commodity network and storage hardware, and provides reliability and fault-tolerance via controlled object placement and data replication. This paper presents our block I/O performance and scalability evaluation of Ceph for scientific high-performance computing (HPC) environments. Our work makes two unique contributions. First, our evaluation is performed under a realistic setup for a large-scale capability HPC environment using commercial...
File transfers over dedicated connections, supported by large parallel file systems, have become increasingly important in high-performance computing and big data workflows. It remains a challenge to achieve peak rates for such transfers due to the complexities of the I/O, host, and network transport subsystems and, equally importantly, their interactions. We present extensive measurements of disk-to-disk transfers using Lustre and XFS file systems mounted on multi-core servers over a suite of 10 Gbps emulated connections with 0-366 ms round trip...
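For context on why round trip time matters here, the buffering a single TCP stream needs to keep such a link full grows with the bandwidth-delay product. The short calculation below is my own illustration for a 10 Gbps path across the stated RTT range, not a result from the paper.

```python
# Bandwidth-delay product: bytes that must be "in flight" to keep a
# dedicated link full with a single stream (illustrative calculation).
LINK_GBPS = 10

def bdp_bytes(rtt_ms, gbps=LINK_GBPS):
    return gbps * 1e9 / 8 * (rtt_ms / 1000.0)

for rtt in (0.4, 11, 91, 183, 366):            # sample RTTs within 0-366 ms
    print(f"RTT {rtt:6.1f} ms -> BDP {bdp_bytes(rtt) / 2**20:8.1f} MiB")
```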
In recent years, non-volatile memory devices such as SSD drives have emerged as a viable storage solution due to their increasing capacity and decreasing cost. Due to the unique capability requirements in large-scale HPC (High Performance Computing) environments, a hybrid configuration (SSD and HDD) may represent one of the most balanced solutions available when considering cost and performance. Under this setting, effective data placement as well as movement with controlled overhead become a pressing challenge. In this paper, we...
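One simple way to reason about such placement is a threshold policy that keeps frequently accessed data on the SSD tier and demotes the coldest data when the SSD budget is exhausted. The sketch below is a generic toy policy under assumed access-count heuristics, not the scheme proposed in the paper.

```python
from collections import defaultdict

class HybridPlacer:
    """Toy SSD/HDD tiering policy: promote on frequent access, demote the
    least-accessed object when the SSD budget is full (illustrative only)."""

    def __init__(self, ssd_capacity_objects, hot_threshold=3):
        self.ssd_capacity = ssd_capacity_objects
        self.hot_threshold = hot_threshold
        self.access_count = defaultdict(int)
        self.on_ssd = set()

    def record_access(self, obj):
        self.access_count[obj] += 1
        if self.access_count[obj] >= self.hot_threshold and obj not in self.on_ssd:
            if len(self.on_ssd) >= self.ssd_capacity:
                coldest = min(self.on_ssd, key=lambda o: self.access_count[o])
                self.on_ssd.discard(coldest)      # demote to HDD
            self.on_ssd.add(obj)                  # promote to SSD

placer = HybridPlacer(ssd_capacity_objects=2)
for obj in ["a", "b", "a", "a", "c", "b", "b", "c", "c", "c"]:
    placer.record_access(obj)
print(sorted(placer.on_ssd))                      # objects currently on the SSD tier
```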
The trend in parallel computing toward clusters running thousands of cooperating processes per application has led to an I/O bottleneck that has only gotten more severe as CPU density has increased. Current file systems provide large amounts of aggregate bandwidth; however, they do not achieve the high degrees of metadata scalability required to manage files distributed across hundreds or thousands of storage nodes. In this paper we examine the use of collective communication between servers to improve metadata operations. In particular,...
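A minimal sketch of the general idea, assuming mpi4py is available: each server batches the metadata entries it has created and a single allgather collective distributes every batch to all servers, replacing many point-to-point exchanges. This illustrates collective server-to-server communication in general, not the paper's protocol.

```python
# Illustrative only: servers exchange batched metadata via one collective.
# Run with, e.g.:  mpiexec -n 4 python collective_metadata.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Each "metadata server" batches the file entries it created locally.
local_batch = [(f"/dir/file_{rank}_{i}", {"owner": rank}) for i in range(3)]

# One allgather distributes every batch to every server.
all_batches = comm.allgather(local_batch)

# Merge into this server's view of the shared namespace.
namespace = {path: attrs for batch in all_batches for path, attrs in batch}
if rank == 0:
    print(f"{len(namespace)} entries visible after one collective")
```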
This paper presents our design for an asynchronous object storage system intended for use in scientific and commercial big data workloads. Use cases from the target workload domains are used to motivate the key abstractions of the application programming interface (API). The architecture of the Scalable Object Store (SOS), a prototype that supports the API's facilities, is presented. SOS serves as a vehicle for future research into scalable and resilient storage. We briefly review approaches to providing efficient servers capable of quality...
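To make the flavor of an asynchronous object API concrete, here is a minimal put/get interface sketched with Python's asyncio. The class and method names are hypothetical stand-ins, not the actual SOS API.

```python
import asyncio

class AsyncObjectStore:
    """Hypothetical asynchronous object-store facade (not the SOS API):
    put/get are awaitable, so callers can overlap storage I/O with work."""

    def __init__(self):
        self._objects = {}          # in-memory stand-in for backend servers

    async def put(self, key: str, data: bytes) -> None:
        await asyncio.sleep(0.01)   # pretend network/storage latency
        self._objects[key] = data

    async def get(self, key: str) -> bytes:
        await asyncio.sleep(0.01)
        return self._objects[key]

async def main():
    store = AsyncObjectStore()
    # Issue many puts concurrently instead of waiting on each one.
    await asyncio.gather(*(store.put(f"obj{i}", b"x" * 64) for i in range(8)))
    print(len(await store.get("obj3")), "bytes read back")

asyncio.run(main())
```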
The transfer of big data is increasingly supported by dedicated channels in high-performance networks. Transport protocols play a critical role in maximizing the link utilization of such high-speed connections. We propose the Transport Profile Generator (TPG) to characterize and enhance the end-to-end throughput performance of transport protocols. TPG automates the tuning of various transport-related parameters, including socket options and protocol-specific configurations, and supports multiple NIC-to-NIC streams. To instantiate...
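The kind of parameter such a tool sweeps can be shown with standard socket options; the snippet below simply enlarges a sender's TCP buffer and enumerates a hypothetical sweep over buffer sizes and parallel streams. These are generic socket calls, not TPG's code.

```python
import socket

def tuned_stream(host, port, sndbuf_bytes):
    """Open one TCP stream with an enlarged send buffer (generic illustration
    of the socket options a tuning tool would sweep; not TPG itself)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf_bytes)
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    s.connect((host, port))
    return s

# Hypothetical sweep: several buffer sizes x several parallel streams.
buffer_sizes = [4 << 20, 16 << 20, 64 << 20]     # 4, 16, 64 MiB
stream_counts = [1, 2, 4, 8]
for buf in buffer_sizes:
    for n in stream_counts:
        print(f"would test {n} stream(s) with SO_SNDBUF={buf >> 20} MiB")
        # streams = [tuned_stream("receiver.example", 5001, buf) for _ in range(n)]
```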
Modern high performance computing platforms employ burst buffers to overcome the I/O bottleneck that limits the scale and efficiency of large-scale parallel computations. Currently there are two competing burst buffer architectures. One is to treat the burst buffer as a dedicated shared resource; the other is to integrate burst buffer hardware into each compute node. In this paper we examine the design tradeoffs associated with the node-local and shared architectures through modeling. By seeding our simulation with realistic workloads, we are able to systematically...
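A back-of-the-envelope version of such a model compares how long a checkpoint burst takes to absorb under the two architectures; the parameters and formulas below are made up for illustration and are not the paper's simulation.

```python
def burst_absorb_time(active_nodes, total_nodes, gb_per_node, pool_gbps, shared):
    """Toy burst-buffer model (illustrative, not the paper's simulation)."""
    if shared:
        # Active writers share the whole pooled bandwidth.
        return active_nodes * gb_per_node / pool_gbps
    # Node-local: each writer is limited to its own slice of the hardware.
    return gb_per_node / (pool_gbps / total_nodes)

TOTAL, GB, POOL = 1024, 4, 2048        # 1024 nodes, 4 GB each, 2 TB/s aggregate
for active in (1024, 256):             # full-machine burst vs. partial burst
    s = burst_absorb_time(active, TOTAL, GB, POOL, shared=True)
    l = burst_absorb_time(active, TOTAL, GB, POOL, shared=False)
    print(f"{active:4d} writers: shared {s:4.1f}s  node-local {l:4.1f}s")
```

Under these toy numbers the two designs tie when every node bursts, while the shared pool finishes sooner when only part of the machine writes, which is the kind of tradeoff such modeling exposes.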
Key–value (KV) store software has proven useful to a wide variety of applications, including analytics, time-series databases, and distributed file systems. To satisfy the requirements of diverse workloads, KV stores have been carefully tailored to best match the performance characteristics of the underlying solid-state block devices. Emerging KV storage device technology is promising for both simplifying the storage stack and improving the performance of persistent storage-based applications. However, while providing fast, predictable put and get...
Popular software key-value stores such as LevelDB and RocksDB are often tailored for efficient writing. Yet, they tend to also perform well on read operations. This is because while data is initially stored in a format that favors writes, it is later transformed by the DB's background workers into a format that better accommodates reads. Write-optimized stores can still block writes, however. This happens when those background workers cannot keep up with the foreground insertion workload. This paper advocates a hardware-accelerated key-value store, enabling...
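The stall described above can be seen in a toy LSM-style write path: inserts land in an in-memory table that a background worker flushes into sorted runs, and when too many unflushed tables pile up the foreground insert must wait. The sketch below is a generic illustration, not LevelDB or RocksDB internals.

```python
import queue, threading, time

MEMTABLE_LIMIT = 1000        # entries per in-memory table
MAX_PENDING_FLUSHES = 2      # how many immutable tables may queue up

flush_q = queue.Queue(maxsize=MAX_PENDING_FLUSHES)
sorted_runs = []             # stand-in for on-disk sorted tables

def background_compactor():
    """Flush memtables into sorted runs (deliberately slow to show stalls)."""
    while True:
        table = flush_q.get()
        time.sleep(0.05)                       # simulated flush/compaction cost
        sorted_runs.append(sorted(table.items()))
        flush_q.task_done()

threading.Thread(target=background_compactor, daemon=True).start()

memtable, stalls = {}, 0
for i in range(10_000):
    memtable[f"k{i}"] = i
    if len(memtable) >= MEMTABLE_LIMIT:
        if flush_q.full():
            stalls += 1                        # queue is full: the next put() blocks the writer
        flush_q.put(memtable)                  # waits here until the worker catches up
        memtable = {}

flush_q.join()
print(f"runs written: {len(sorted_runs)}, foreground stalls observed: {stalls}")
```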
With the end of Dennard scaling, specializing and distributing compute engines throughout the system is a promising technique for improving application performance. For example, NVIDIA's BlueField Data Processing Unit (DPU) integrates programmable processing elements within the network and offers specialized capabilities. These capabilities enable communication offloads onto DPUs and present new opportunities for applications to offload nonblocking or complex communication patterns such as collective operations. This...
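The application-level pattern such offloads target is communication/computation overlap with nonblocking collectives. The sketch below shows that pattern with a plain MPI nonblocking allreduce via mpi4py; it does not use any BlueField-specific library, which an actual DPU offload would.

```python
# Overlap pattern targeted by collective offload: start a nonblocking
# allreduce, compute while it progresses, then wait.  Plain MPI via mpi4py.
#   mpiexec -n 4 python overlap.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
send = np.full(1_000_000, comm.Get_rank(), dtype=np.float64)
recv = np.empty_like(send)

req = comm.Iallreduce(send, recv, op=MPI.SUM)   # collective progresses in the background

local = np.sin(np.arange(500_000)).sum()        # independent work overlapped with it

req.Wait()                                      # collective result is now ready
if comm.Get_rank() == 0:
    print("overlapped work:", local, "reduced[0]:", recv[0])
```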
Long-haul data transfers require the optimization and balancing of the performance of host and storage systems as well as network transport. An assessment of such transport methods requires the systematic generation of throughput profiles from measurements collected over different system parameters and connection lengths. We describe an experimental setup to support wide-area I/O at 10 Gbps, and present memory and disk transfer throughputs over suites of physical and emulated connections spanning several thousands of miles. The former are limited by the infrastructure and incur...
In recent years, NAND flash-based solid state drives (SSDs) have been widely used in datacenters due to their better performance compared with traditional hard disk drives. However, little is known about the reliability characteristics of SSDs in production systems. Existing works study the statistical distributions of SSD failures in the field, but they do not go deep into and investigate the unique error types and health dynamics that distinguish SSDs. In this paper, we explore SSD-specific SMART (Self-Monitoring,...
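As a concrete example of the kind of SSD-specific SMART signal involved, the snippet below flags drives whose wearout- or error-related counters cross simple thresholds. The attribute names and thresholds are hypothetical and vendor-dependent; they are not the indicators or model studied in the paper.

```python
# Illustrative health check over SSD SMART counters (hypothetical attributes).
WARN_RULES = {
    "media_wearout_remaining_pct": lambda v: v < 10,   # little P/E budget left
    "reallocated_sectors": lambda v: v > 100,
    "program_fail_count": lambda v: v > 0,
    "uncorrectable_errors": lambda v: v > 0,
}

def health_flags(smart: dict) -> list:
    """Return the names of all SMART attributes that look unhealthy."""
    return [name for name, bad in WARN_RULES.items()
            if name in smart and bad(smart[name])]

drive = {"media_wearout_remaining_pct": 7,
         "reallocated_sectors": 3,
         "uncorrectable_errors": 2,
         "power_on_hours": 41_000}
print(health_flags(drive))   # -> ['media_wearout_remaining_pct', 'uncorrectable_errors']
```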
We introduce La-pdes, a parameterized benchmark application for measuring parallel and serial discrete event simulation (PDES) performance. Applying a holistic view of PDES system performance, La-pdes tests the performance factors of (i) the (P)DES engine in terms of queue efficiency, synchronization mechanism, and load-balancing schemes; (ii) the available hardware in terms of handling computationally intensive loads, memory size, cache hierarchy, and clock speed; and (iii) the interaction with communication middleware (often MPI)...
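For readers unfamiliar with the engine factors being benchmarked, the core of any (P)DES engine is a time-ordered pending event set. The minimal serial loop below (my own illustration, not La-pdes) shows the queue operations whose efficiency factor (i) measures.

```python
import heapq, random

def run_des(n_entities=4, end_time=10.0, seed=1):
    """Minimal serial discrete-event loop (illustration, not La-pdes):
    pop the earliest event, process it, schedule a future event."""
    rng = random.Random(seed)
    pending = [(rng.random(), e) for e in range(n_entities)]   # (timestamp, entity)
    heapq.heapify(pending)                                     # the pending event set
    processed = 0
    while pending:
        now, entity = heapq.heappop(pending)                   # earliest event first
        if now >= end_time:
            break
        processed += 1
        # Event handler: schedule this entity's next event in the future.
        heapq.heappush(pending, (now + rng.expovariate(1.0), entity))
    return processed

print(run_des(), "events processed")
```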
High-performance computing (HPC) storage systems rely on access coordination to ensure that concurrent updates do not produce incoherent results. HPC storage systems typically employ pessimistic distributed locking to provide this functionality in cases where applications cannot perform their own coordination. This approach, however, introduces significant performance overhead and complicates fault handling. In this work we evaluate the viability of optimistic conditional operations as an alternative for such systems. We...
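The alternative being evaluated can be illustrated with a version-tagged compare-and-swap loop: instead of locking before an update, a writer submits the update conditioned on the version it read and retries on conflict. The sketch below is a generic illustration of optimistic conditional operations, not the paper's storage interface.

```python
import threading

class VersionedObject:
    """Generic optimistic-concurrency primitive (not the paper's API):
    updates succeed only if the caller's expected version still matches."""

    def __init__(self, value=0):
        self.value, self.version = value, 0
        self._guard = threading.Lock()         # stands in for server-side atomicity

    def read(self):
        with self._guard:
            return self.value, self.version

    def conditional_write(self, expected_version, new_value):
        with self._guard:
            if self.version != expected_version:
                return False                   # someone updated it first: conflict
            self.value, self.version = new_value, self.version + 1
            return True

def increment(obj, times):
    for _ in range(times):
        while True:                            # optimistic retry loop, no client locks
            value, version = obj.read()
            if obj.conditional_write(version, value + 1):
                break

shared = VersionedObject()
workers = [threading.Thread(target=increment, args=(shared, 1000)) for _ in range(4)]
for t in workers: t.start()
for t in workers: t.join()
print(shared.read())    # -> (4000, 4000): every update applied without locking
```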
Wide-area memory transfers between on-going computations and remote steering, analysis, and visualization sites can be utilized in several High-Performance Computing (HPC) scenarios. Dedicated network connections with high capacity, low loss rates, and no competing traffic are typically provisioned over current HPC infrastructures to support such transfers. To gain insights into these transfers, we collected throughput measurements for different versions of TCP between dedicated multi-core servers over emulated 10 Gbps...
Recent developments in software-defined infrastructures promise that scientific workflows utilizing supercomputers, instruments, and storage systems will be dynamically composed and orchestrated using software at unprecedented speed and scale in the near future. Testing of the underlying networking software, particularly during initial exploratory stages, remains a challenge due to the potential disruptions and the resource allocation and coordination needed over the multi-domain physical infrastructure. To overcome these...
In this paper we introduce the Indexed Massive Directory, a new technique for indexing data within DeltaFS. With its design as a scalable, server-less file system for HPC platforms, DeltaFS scales file system metadata performance with application scale. The Indexed Massive Directory is a novel extension to the DeltaFS data plane, enabling in-situ indexing of massive amounts of data written to a single directory simultaneously, and in an arbitrarily large number of files. We achieve this through a memory-efficient mechanism for reordering data and a log-structured storage layout to pack...
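A stripped-down version of the write path such a directory needs: each record is routed to a partition, appended to that partition's log, and a small in-memory index records its offset so that a later point query touches one log at one offset. This sketch is my own simplification and not the DeltaFS implementation.

```python
import io

class IndexedDirectory:
    """Simplified in-situ indexing of a write stream (not DeltaFS itself):
    route each record to a partition log and remember its offset."""

    def __init__(self, n_partitions=4):
        self.logs = [io.BytesIO() for _ in range(n_partitions)]   # log-structured storage
        self.index = [{} for _ in range(n_partitions)]            # key -> (offset, length)

    def append(self, key: str, payload: bytes):
        p = hash(key) % len(self.logs)
        log = self.logs[p]
        log.seek(0, io.SEEK_END)                                  # always append at the tail
        self.index[p][key] = (log.tell(), len(payload))           # index while writing
        log.write(payload)

    def lookup(self, key: str) -> bytes:
        p = hash(key) % len(self.logs)
        offset, length = self.index[p][key]
        log = self.logs[p]
        log.seek(offset)
        return log.read(length)

d = IndexedDirectory()
for i in range(1000):
    d.append(f"particle{i}", f"state-{i}".encode())
print(d.lookup("particle42"))   # -> b'state-42'
```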