- Advanced Data Storage Technologies
- Cloud Computing and Resource Management
- Caching and Content Delivery
- Distributed and Parallel Computing Systems
- Parallel Computing and Optimization Techniques
- Scientific Computing and Data Management
- Distributed Systems and Fault Tolerance
- Peer-to-Peer Network Technologies
- Algorithms and Data Compression
- Data Management and Algorithms
- Advanced Database Systems and Queries
- Cloud Data Security Solutions
- Advanced Neural Network Applications
- Software System Performance and Reliability
- Topic Modeling
- Big Data and Digital Economy
- Digital Rights Management and Security
- Text Readability and Simplification
- Natural Language Processing Techniques
- Ferroelectric and Negative Capacitance Devices
- Data Stream Mining Techniques
- Big Data Technologies and Applications
- Data Quality and Management
- Data Mining Algorithms and Applications
- IoT and Edge/Fog Computing
Carnegie Mellon University
2017-2025
University of Toronto
2012-2016
The energy consumed by data centers is starting to make up a significant fraction of the world's energy consumption and carbon emissions. A large fraction of that energy is spent on data center cooling, which has motivated a large body of work on temperature management in data centers. Interestingly, one key aspect has not been well understood: controlling the setpoint temperature at which to run a data center's cooling system. Most data centers set their thermostat based on (conservative) suggestions by manufacturers, as there is limited understanding of how higher temperatures will affect the system. At the same time, studies...
For a decade, the Ceph distributed file system followed the conventional wisdom of building its storage backend on top of local file systems. This is a preferred choice for most distributed systems today because it allows them to benefit from the convenience and maturity of battle-tested code. Ceph's experience, however, shows that this comes at a high price. First, developing a zero-overhead transaction mechanism is challenging. Second, metadata performance at the local level can significantly affect performance at the distributed level. Third, supporting emerging storage hardware...
Zoned Namespace (ZNS) SSDs are the latest evolution of host-managed flash storage, enabling improved performance at a lower cost-per-byte than traditional block interface (conventional) SSDs. To date, there is no support for arranging these new devices in arrays that offer increased throughput and reliability (RAID). We identify the key challenges of designing redundant ZNS SSD arrays, such as managing metadata updates and persisting partial stripe writes in the absence of overwrite support from the device. We present RAIZN,...
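The redundancy at the heart of such arrays can be illustrated with a minimal sketch of RAID-5-style XOR parity across a stripe; this is a generic parity example, not RAIZN's implementation, and all names are illustrative.

```python
def xor_blocks(blocks):
    """XOR a list of equal-sized byte blocks into a parity block."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            parity[i] ^= b
    return bytes(parity)

def recover_block(surviving_blocks, parity):
    """Rebuild a lost data block: XOR of survivors and parity."""
    return xor_blocks(list(surviving_blocks) + [parity])

d0, d1, d2 = b"abcd", b"efgh", b"ijkl"   # one stripe across three data zones
p = xor_blocks([d0, d1, d2])             # parity written to the parity zone
assert recover_block([d0, d2], p) == d1  # lose zone 1, rebuild its block
```

The difficulty the abstract points to is that on ZNS devices the parity zone cannot simply be overwritten in place as a stripe fills, which is what makes partial stripe writes hard to persist.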
Datacenters need to reduce embodied carbon emissions, particularly for flash, which accounts for 40% of the embodied carbon in servers. However, decreasing flash's embodied emissions is challenging due to its limited write endurance, which more than halves with each generation of denser flash. Reducing embodied emissions requires extending flash lifetime, stressing its limited endurance even further. The legacy Logical Block-Addressable Device (LBAD) interface exacerbates the problem by forcing devices to perform garbage collection, leading to additional writes. Flash-based caches...
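The extra writes that garbage collection induces are commonly summarized by the textbook greedy-GC approximation: if victim blocks still hold a fraction u of valid pages, each user write forces roughly 1/(1-u) physical writes. This is a standard model, not a figure from the paper.

```python
def write_amplification(valid_fraction):
    """Approximate GC write amplification when victim blocks are
    `valid_fraction` full of still-valid pages: those pages must be
    copied out before the block can be erased and reused."""
    assert 0 <= valid_fraction < 1
    return 1.0 / (1.0 - valid_fraction)

assert write_amplification(0.0) == 1.0  # empty victims: no extra writes
assert write_amplification(0.5) == 2.0  # half-valid victims double the writes
```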
Storage systems rely on maintenance tasks, such as backup and layout optimization, to ensure data availability and good performance. These tasks access large amounts of data and can significantly impact foreground applications. We argue that storage maintenance can be performed more efficiently by prioritizing the processing of data that is currently cached in memory. Data can be cached either because other workloads requested it previously, or due to overlapping I/O activity.
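A minimal sketch of the prioritization idea (illustrative only, not the paper's implementation): reorder the maintenance work queue so blocks already resident in the cache are processed first, avoiding extra disk reads.

```python
def order_maintenance(blocks, cached):
    """Return block IDs with currently-cached blocks first.
    `cached` is the set of block IDs resident in memory."""
    # False sorts before True, so cached blocks come first; the sort
    # is stable, preserving the original order within each group.
    return sorted(blocks, key=lambda b: b not in cached)

blocks = [1, 2, 3, 4]
cached = {2, 4}
assert order_maintenance(blocks, cached) == [2, 4, 1, 3]
```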
Analysis of large-scale simulation output is a core element of scientific inquiry, but analysis queries may experience significant I/O overhead when the data is not structured for efficient retrieval. While in-situ processing allows improved time-to-insight for many applications, scaling such frameworks to hundreds of thousands of cores can be difficult in practice. The DeltaFS in-situ indexing is a new approach for indexing massive amounts of data to achieve efficient point and small-range queries. This paper describes the challenges and lessons learned from this...
Latent sector errors (LSEs) are a common hard disk failure mode, where individual sectors become inaccessible while the rest of the disk remains unaffected. To protect against LSEs, commercial storage systems use scrubbers: background processes that verify disk data. The efficiency of different scrubbing algorithms in detecting LSEs has been studied in depth; however, no attempts have been made to evaluate or mitigate their impact on application performance. We provide the first known evaluation of the performance impact of scrubbing policies through implementation,...
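Two classic scrubbing orders studied in this line of work can be sketched in a few lines: a sequential pass over all segments, and a staggered (strided) pass that samples across the disk early. This is a toy illustration of the policies, not the paper's implementation.

```python
def sequential_order(n):
    """Scrub segments 0..n-1 in order."""
    return list(range(n))

def staggered_order(n, stride):
    """Visit every `stride`-th segment, then shift the start by one,
    so distant regions of the disk are sampled early in the pass."""
    return [s + offset
            for offset in range(stride)
            for s in range(0, n, stride)
            if s + offset < n]

assert sequential_order(4) == [0, 1, 2, 3]
assert staggered_order(6, 2) == [0, 2, 4, 1, 3, 5]
assert sorted(staggered_order(7, 3)) == list(range(7))  # full coverage
```

Both orders scrub every segment exactly once; they differ in how soon a burst of spatially clustered LSEs is likely to be hit, and in how much seek overhead they impose on foreground I/O.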
Although large language models (LLMs) have been touted for their ability to generate natural-sounding text, there are growing concerns around possible negative effects of LLMs such as data memorization, bias, and inappropriate language. Unfortunately, the complexity and generation capacities of LLMs make validating (and correcting) such concerns difficult. In this work, we introduce ReLM, a system for validating and querying LLMs using standard regular expressions. ReLM formalizes and enables a broad range of language model evaluations, reducing complex...
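ReLM itself compiles regular expressions into automata over model tokens; a much simpler post-hoc flavor of the idea is checking sampled outputs against a pattern, sketched below with a hypothetical memorization probe (the SSN pattern and function names are illustrative, not ReLM's API).

```python
import re

# Example probe: flag samples containing an SSN-shaped string,
# a stand-in for testing whether a model regurgitates training data.
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")

def flags_memorization(sample):
    """Return True if the sample contains an SSN-shaped substring."""
    return SSN_PATTERN.search(sample) is not None

assert flags_memorization("my number is 123-45-6789")
assert not flags_memorization("no sensitive data here")
```

The advantage of the automaton-based approach over this post-hoc check is that it can constrain or enumerate model outputs matching the pattern directly, rather than sampling and filtering.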
Deep learning accelerators efficiently train over vast and growing amounts of data, placing a newfound burden on commodity networks and storage devices. A common approach to conserve bandwidth involves resizing or compressing data prior to training. We introduce Progressive Compressed Records (PCRs), a data format that uses progressive compression to reduce the overhead of fetching and transporting data, effectively reducing the training time required to achieve a target accuracy. PCRs deviate from previous formats by combining progressive...
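The progressive idea can be sketched with a toy record layout (assumed framing, not the PCR wire format): a record is stored as ordered "scans" of increasing fidelity, so a reader can fetch only a byte prefix and still decode a usable lower-fidelity sample.

```python
def encode_progressive(scans):
    """Concatenate scans with 4-byte length framing."""
    out = bytearray()
    for scan in scans:
        out += len(scan).to_bytes(4, "big") + scan
    return bytes(out)

def decode_prefix(data, num_scans):
    """Decode only the first `num_scans` scans of a progressive record,
    i.e. read a prefix of the bytes instead of the whole record."""
    scans, pos = [], 0
    for _ in range(num_scans):
        n = int.from_bytes(data[pos:pos + 4], "big")
        scans.append(data[pos + 4:pos + 4 + n])
        pos += 4 + n
    return scans

record = encode_progressive([b"coarse", b"detail1", b"detail2"])
assert decode_prefix(record, 1) == [b"coarse"]  # fewer bytes, lower fidelity
assert decode_prefix(record, 3) == [b"coarse", b"detail1", b"detail2"]
```

The bandwidth saving comes from the reader choosing `num_scans` per epoch or per sample, trading fidelity for I/O.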
Input pipelines, which ingest and transform input data, are an essential part of training Machine Learning (ML) models. However, it is challenging to implement efficient input pipelines, as doing so requires reasoning about parallelism, asynchrony, and variability in fine-grained profiling information. Our analysis of over two million ML jobs in Google datacenters reveals that a significant fraction of model training jobs could benefit from faster input data pipelines. At the same time, our analysis indicates that most jobs do not saturate host hardware, pointing...
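One of the asynchrony knobs such tuning reasons about is prefetching: overlapping data loading with training by buffering items from a background thread. A minimal sketch (illustrative, not Google's tooling):

```python
import queue
import threading

def prefetch(generator, buffer_size=4):
    """Run `generator` in a background thread, buffering up to
    `buffer_size` items so the consumer rarely waits on I/O."""
    q = queue.Queue(maxsize=buffer_size)
    done = object()  # sentinel marking end of stream

    def worker():
        for item in generator:
            q.put(item)
        q.put(done)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = q.get()
        if item is done:
            return
        yield item

# Consumer sees the same items, but production overlaps consumption.
assert list(prefetch(iter(range(5)))) == [0, 1, 2, 3, 4]
```

Choosing `buffer_size` (and the number of worker threads, in a fuller pipeline) is exactly the kind of decision the abstract says requires fine-grained profiling.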
In this paper we introduce the Indexed Massive Directory, a new technique for indexing data within DeltaFS. With its design as a scalable, server-less file system for HPC platforms, DeltaFS scales file system metadata performance with application scale. The Indexed Massive Directory is a novel extension to the DeltaFS data plane, enabling in-situ indexing of massive amounts of data written to a single directory simultaneously, and in an arbitrarily large number of files. We achieve this through a memory-efficient mechanism for reordering and indexing data, and a log-structured storage layout to pack...
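A toy sketch of the reordering idea (illustrative, not DeltaFS's code): buffer incoming writes in memory, partition them by key, and flush each partition as a sorted run, so a later point query needs to consult only one partition.

```python
from collections import defaultdict

def partition_and_sort(writes, num_partitions):
    """Group (key, value) writes by hash partition and sort each run,
    mimicking an in-memory reorder step before a log-structured flush."""
    runs = defaultdict(list)
    for key, value in writes:
        runs[hash(key) % num_partitions].append((key, value))
    return {p: sorted(run) for p, run in runs.items()}

writes = [("c", 3), ("a", 1), ("b", 2), ("a", 4)]
runs = partition_and_sort(writes, 2)
# No write is lost or duplicated by the reorder:
assert sorted(kv for run in runs.values() for kv in run) == sorted(writes)
```

The memory efficiency challenge the abstract alludes to is doing this reordering at scales where buffering full sorted runs per writer is not affordable.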
Complex storage stacks providing data compression, indexing, and analytics help leverage the massive amounts of data generated today to derive insights. It is challenging to perform this computation, however, while fully utilizing the underlying storage media. This is because, although servers with large core counts are widely available, single-core performance and memory bandwidth per core grow more slowly than the core count per die. Computational storage offers a promising solution to this problem by adding dedicated compute resources along the data processing path. We present...
An increasing demand for cross-cloud and cross-region data access is bringing forth challenges related to high data transfer costs and latency. In response, we introduce Macaron, an auto-configuring cache system designed to minimize the cost of remote data access. A key insight behind Macaron is that cloud cache size is tied to cost, not hardware limits, shifting the way we think about cache design and eviction policies. Macaron dynamically configures cache size and utilizes a mix of storage types to adapt to workload changes and reduce costs. We demonstrate that Macaron reduces remote data access costs by 65%...
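The key insight can be illustrated with back-of-the-envelope arithmetic (all prices and hit rates below are made up for illustration): in the cloud, cache capacity costs money per GB, so the "right" cache size balances storage rent against the transfer cost of misses, rather than filling a fixed device.

```python
def net_saving(cache_gb, hit_rate, storage_cost_per_gb,
               demand_gb, transfer_cost_per_gb):
    """Transfer cost avoided by cache hits, minus the cost of
    renting the cache capacity itself."""
    avoided = hit_rate * demand_gb * transfer_cost_per_gb
    return avoided - cache_gb * storage_cost_per_gb

# Growing the cache only pays off while extra hits beat extra rent:
small = net_saving(100, 0.50, 0.02, 1000, 0.09)   # 45.0 - 2.0  = 43.0
large = net_saving(1000, 0.55, 0.02, 1000, 0.09)  # 49.5 - 20.0 = 29.5
assert small > large  # past a point, a bigger cache loses money
```

This is why the abstract frames cache size as a cost variable to be configured dynamically as the workload's hit-rate curve shifts.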