- Theoretical and Experimental Particle Physics Studies
- High-Energy Particle Collisions Research
- Particle Detector Development and Performance
- Quantum Chromodynamics and Particle Interactions
- Dark Matter and Cosmic Phenomena
- Computational Physics and Python Applications
- Cosmology and Gravitation Theories
- Neutrino Physics Research
- Distributed and Parallel Computing Systems
- Advanced Data Storage Technologies
- Scientific Computing and Data Management
- Big Data Technologies and Applications
- Radiation Detection and Scintillator Technologies
- Medical Imaging Techniques and Applications
- Parallel Computing and Optimization Techniques
- Astrophysics and Cosmic Phenomena
- Black Holes and Theoretical Physics
- Distributed Systems and Fault Tolerance
- Atomic and Subatomic Physics Research
- Particle Accelerators and Free-Electron Lasers
- Superconducting Materials and Applications
- Structural Analysis of Composite Materials
- Digital Radiography and Breast Imaging
- Caching and Content Delivery
- Cloud Computing and Remote Desktop Technologies
- European Organization for Nuclear Research, 2016-2025
- Government of Catalonia, 2024
- A. Alikhanyan National Laboratory, 2024
- Institute of High Energy Physics, 2024
- SR Research (Canada), 2024
- Federación Española de Enfermedades Raras, 2024
- Atlas Scientific (United States), 2024
- The University of Adelaide, 2016-2023
- Max Planck Institute for Physics, 2019-2023
- Brandeis University, 2019-2020
Particle physics has an ambitious and broad experimental programme for the coming decades. This programme requires large investments in detector hardware, either to build new facilities and experiments, or to upgrade existing ones. Similarly, it requires commensurate investment in the R&D of software to acquire, manage, process, and analyse the sheer amounts of data to be recorded. In planning for the HL-LHC in particular, it is critical that all of the collaborating stakeholders agree on the software goals and priorities, and that the efforts complement each other. In this spirit, this white paper...
Rucio is an open-source software framework that provides scientific collaborations with the functionality to organize, manage, and access their data at scale. The data can be distributed across heterogeneous data centers at widely distributed locations. Rucio was originally developed to meet the requirements of the high-energy physics experiment ATLAS, and it is continuously extended to support the LHC experiments and other diverse scientific communities. In this article, we detail the fundamental concepts of Rucio, describe the architecture along with implementation details,...
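Rucio's central abstraction is the declarative replication rule: users state how many copies of a dataset must exist on which class of storage, and the system works out placement and repair. Below is a minimal sketch, assuming a reachable Rucio server and the `rucio` client package; the scope, dataset name, and RSE expression are hypothetical placeholders, and method signatures should be checked against the installed client version.

```python
# Minimal sketch of declaring a replication rule via the Rucio client.
# Assumes a configured Rucio server and account; the scope, name, and
# RSE expression are hypothetical placeholders.
from rucio.client import Client

client = Client()

# Ask Rucio to keep two copies of a dataset on any Tier-1 disk endpoint;
# the rule engine decides where the replicas go and repairs losses.
rule_ids = client.add_replication_rule(
    dids=[{'scope': 'user.jdoe', 'name': 'jdoe.analysis.2024'}],
    copies=2,
    rse_expression='tier=1&type=DISK',
    lifetime=30 * 24 * 3600,  # seconds; the rule expires after 30 days
)
print(rule_ids)
```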
Rucio is the next-generation Distributed Data Management (DDM) system, benefiting from recent advances in cloud and "Big Data" computing to address the scaling requirements of HEP experiments. It is an evolution of the ATLAS DDM system Don Quijote 2 (DQ2), which has demonstrated very large scale data management capabilities, with more than 140 petabytes spread worldwide across 130 sites and accesses from 1,000 active users. However, DQ2 is reaching its limits in terms of scalability, requiring a large number of support staff to operate and being...
Power consumption has become a critical issue in large scale clusters. Existing solutions for addressing the servers' energy consumption suggest "shrinking" the set of active machines, at least until more power-proportional hardware devices become available. This paper demonstrates, however, that leveraging the sleeping state may lead to unacceptably poor performance and low data availability if distributed services are not aware of the power management's actions. Therefore, we present an architecture for cluster services in which deployed...
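The underlying invariant is easy to state: a node may only be put to sleep if every data block it hosts keeps at least one replica on a node that stays awake. A minimal sketch of that check, with a hypothetical replica map (not the paper's actual architecture):

```python
# Sketch: only allow a node to sleep if every block it hosts still has
# at least one replica on a node that remains awake. The replica map is
# a hypothetical illustration.

def can_sleep(candidate, replicas, awake_nodes):
    """replicas: dict block_id -> set of nodes holding a copy."""
    remaining = awake_nodes - {candidate}
    for hosts in replicas.values():
        if candidate in hosts and not (hosts & remaining):
            return False  # candidate holds the only awake copy
    return True

replicas = {'b1': {'n1', 'n2'}, 'b2': {'n3'}, 'b3': {'n2', 'n3'}}
awake = {'n1', 'n2', 'n3'}
print(can_sleep('n1', replicas, awake))  # True: b1 survives on n2
print(can_sleep('n3', replicas, awake))  # False: b2 would become unavailable
```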
ATLAS has recorded more than 8 petabytes (PB) of RAW data since the LHC started running at the end of 2009. Many derived data products and complementary simulation data have also been produced by the collaboration and, in total, 90PB are currently stored in the Worldwide LHC Computing Grid by ATLAS. All of these data are managed by the ATLAS Distributed Data Management system, called Don Quijote 2 (DQ2). DQ2 has evolved rapidly to help ATLAS Computing operations manage these large quantities of data across the many grid sites at which ATLAS runs, and to help physicists get access to the data.
Rucio is a software framework designed to facilitate scientific collaborations in efficiently organising, managing, and accessing extensive volumes of data through customizable policies. Data can be distributed across globally distributed locations and heterogeneous data centres, integrating various storage and network technologies into a single federated entity. Rucio offers advanced features such as data recovery and adaptive replication, and it exhibits high scalability, modularity, and extensibility. Originally developed to meet...
The ATLAS experiment at CERN's LHC stores detector and simulation data in raw and derived data formats across more than 150 Grid sites world-wide, currently in total about 200PB on disk and 250PB on tape. Data have different access characteristics due to various computational workflows, and can be accessed from different media, such as remote I/O, disk cache on hard drives or SSDs. Also, larger data centers provide the majority of offline storage capability via tape systems. For the High-Luminosity LHC (HL-LHC), the estimated data storage requirements are...
Rucio is the next-generation Distributed Data Management (DDM) system, benefiting from recent advances in cloud and "Big Data" computing to address the scaling requirements of HEP experiments. It is an evolution of the ATLAS DDM system Don Quijote 2 (DQ2), which has demonstrated very large scale data management capabilities, with more than 160 petabytes spread worldwide across 130 sites and accesses from 1,000 active users. However, DQ2 is reaching its limits in terms of scalability, requiring a large number of support staff to operate and being...
The ATLAS experiment's data management system is constantly tracing the file movement operations that occur on the Worldwide LHC Computing Grid (WLCG). Due to the large scale of the WLCG, statistical analysis of the traces is infeasible in real-time. Factors that contribute to the scalability problems include the capability for users to initiate on-demand queries, the high dimensionality of tracer entries combined with the very low cardinality of their parameters, and the size of the namespace. These issues are alleviated through the adoption of an incremental model...
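The incremental idea can be illustrated compactly: rather than re-scanning the full trace archive for each query, running aggregates keyed on the queried dimensions are updated once per incoming trace. A sketch under assumed trace fields (`site`, `dataset`, `timestamp` are illustrative, not the real schema):

```python
# Sketch of incremental aggregation over a trace stream: keep running
# counters keyed by the dimensions users query on, updated per trace.
# Field names are illustrative.
from collections import defaultdict

class IncrementalTraceStats:
    def __init__(self):
        # (site, dataset, hour-bucket) -> access count
        self.counts = defaultdict(int)

    def ingest(self, trace):
        hour = trace['timestamp'] // 3600  # bucket to the hour
        self.counts[(trace['site'], trace['dataset'], hour)] += 1

    def accesses(self, site=None, dataset=None):
        return sum(
            n for (s, d, _), n in self.counts.items()
            if (site is None or s == site) and (dataset is None or d == dataset)
        )

stats = IncrementalTraceStats()
stats.ingest({'site': 'CERN-PROD', 'dataset': 'data15.raw', 'timestamp': 7200})
stats.ingest({'site': 'CERN-PROD', 'dataset': 'data15.raw', 'timestamp': 7300})
print(stats.accesses(site='CERN-PROD'))  # 2, without re-reading any archive
```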
The ATLAS Distributed Data Management system stores more than 150PB of physics data across 120 sites globally. To cope with the anticipated workload of the coming decade, Rucio, the next-generation data management system, has been developed. Replica management, as one of the key aspects of the system, has to satisfy critical performance requirements in order to keep pace with the experiment's high rate of continual data generation. The challenge lies in meeting these objectives while still giving users and applications a powerful toolkit to control their...
With this contribution we present some recent developments made to Rucio, the data management system of the High-Energy Physics Experiment ATLAS. Already managing 300 petabytes of both official and user data, Rucio has seen incremental improvements throughout LHC Run-2 and is currently laying the groundwork for HEP computing in the HL-LHC era. The focus of this contribution is (a) the automations that have been put in place, such as data rebalancing or dynamic replication, as well as their supporting infrastructures, such as real-time networking metrics...
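To give a flavour of the rebalancing automation, here is a deliberately simplified sketch, not the production algorithm: datasets are drained from storage elements above a fullness threshold toward the least-full one.

```python
# Simplified sketch of automated rebalancing (not ATLAS's production
# algorithm): drain datasets from over-full storage elements toward the
# least-full one until usage falls below a threshold.

def rebalance(rses, datasets, threshold=0.85):
    """rses: dict name -> {'used': bytes, 'total': bytes}
    datasets: dict name -> {'rse': name, 'bytes': size}
    Returns a list of (dataset, src, dst) move decisions."""
    moves = []
    for src, info in rses.items():
        for ds, meta in sorted(datasets.items(), key=lambda kv: -kv[1]['bytes']):
            if info['used'] / info['total'] <= threshold:
                break  # source is no longer over-full
            if meta['rse'] != src:
                continue
            dst = min(rses, key=lambda r: rses[r]['used'] / rses[r]['total'])
            if dst == src:
                break
            moves.append((ds, src, dst))
            info['used'] -= meta['bytes']
            rses[dst]['used'] += meta['bytes']
            meta['rse'] = dst
    return moves

rses = {'RSE_A': {'used': 95, 'total': 100}, 'RSE_B': {'used': 20, 'total': 100}}
datasets = {'ds1': {'rse': 'RSE_A', 'bytes': 30}, 'ds2': {'rse': 'RSE_A', 'bytes': 10}}
print(rebalance(rses, datasets))  # [('ds1', 'RSE_A', 'RSE_B')]
```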
This paper describes a popularity prediction tool for data-intensive data management systems, such as the ATLAS distributed data management system (DDM). It is fed by the tracer infrastructure of the DDM system, which produces historical reports about data usage, providing information about files, datasets, users, and the sites where data was accessed. The tool described in this contribution uses this information to make predictions about the future popularity of data. It finds trends in data usage using a set of neural networks and a set of input parameters, and predicts the number of accesses in the near-term future. This prediction can then be used in a second step to improve the distribution...
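The paper's actual networks are not reproduced here; as an illustrative stand-in, a small feed-forward regressor can map a sliding window of past weekly access counts to the next week's count (synthetic data, scikit-learn assumed):

```python
# Illustrative stand-in for the prediction step (not the paper's actual
# networks): learn next-week dataset accesses from a sliding window of
# past weekly access counts, using synthetic data.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# Hypothetical weekly access series for one dataset.
weeks = np.clip(100 + 30 * np.sin(np.arange(120) / 4) + rng.normal(0, 5, 120), 0, None)

window = 8
X = np.array([weeks[i:i + window] for i in range(len(weeks) - window)])
y = weeks[window:]

model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
model.fit(X[:-10], y[:-10])    # train on all but the last 10 weeks

pred = model.predict(X[-10:])  # predict the held-out weeks
print(np.round(pred, 1))
```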
Performance evaluations of large-scale systems require the use of representative workloads with certifiably similar or dissimilar characteristics. To quantify the similarity of workload characteristics, we describe a novel measure comprising two efficient methods that are suitable for large workloads. One method uses the discrete wavelet transform to assess periodic time and frequency characteristics in the workload. The second method evaluates the dependencies of descriptive workload attributes via association rule learning. Both methods are evaluated to find...
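A minimal sketch of the wavelet-based half of such a measure, assuming PyWavelets (`pywt`): each workload series is decomposed, summarised by its per-level energy shares, and the signatures are compared. The paper's exact signature construction may differ.

```python
# Sketch: compare two workloads' time/frequency behaviour by decomposing
# each series with a discrete wavelet transform and comparing per-level
# energy distributions. Illustrative, not the paper's exact measure.
import numpy as np
import pywt

def dwt_signature(series, wavelet='haar', level=4):
    coeffs = pywt.wavedec(series, wavelet, level=level)
    energy = np.array([np.sum(c ** 2) for c in coeffs])
    return energy / energy.sum()  # energy share per decomposition level

def signature_distance(a, b):
    return float(np.linalg.norm(dwt_signature(a) - dwt_signature(b)))

t = np.arange(256)
periodic = np.sin(2 * np.pi * t / 24)                  # daily-like pattern
aperiodic = np.random.default_rng(1).normal(size=256)  # no periodicity
print(signature_distance(periodic, periodic + 0.1))  # small: similar
print(signature_distance(periodic, aperiodic))       # larger: dissimilar
```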
Transparent use of commercial cloud resources for scientific experiments is a hard problem. In this article, we describe the first steps of the Data Ocean R&D collaboration between the high-energy physics experiment ATLAS and the Google Cloud Platform, to allow the seamless use of Google Compute Engine and Google Cloud Storage for physics analysis. We start by describing the three preliminary use cases that were identified at the beginning of the project. The following sections then detail the work that was done in the data management system Rucio and the workflow management system PanDA...
This paper describes a monitoring framework for large scale data management systems with frequent access. It generates meaningful information from the collected tracing data and can be queried on demand for specific usage patterns with respect to source and destination locations, time period intervals, and other searchable parameters. The feasibility of such a system at the petabyte scale is demonstrated by describing the implementation and operational experience gained in the real-world ATLAS experiment employing the proposed framework. Our...
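A sketch of the kind of on-demand query such a framework serves, over hypothetical trace records (field names are illustrative):

```python
# Sketch of an on-demand usage query over collected trace records:
# count transfers per (source, destination) pair within a time window.
# Record fields are hypothetical illustrations.
from collections import Counter

def usage_by_route(traces, t_start, t_end):
    routes = Counter()
    for tr in traces:
        if t_start <= tr['time'] < t_end:
            routes[(tr['src'], tr['dst'])] += 1
    return routes

traces = [
    {'src': 'CERN-PROD', 'dst': 'BNL-ATLAS', 'time': 100},
    {'src': 'CERN-PROD', 'dst': 'BNL-ATLAS', 'time': 150},
    {'src': 'BNL-ATLAS', 'dst': 'TRIUMF-LCG2', 'time': 400},
]
print(usage_by_route(traces, 0, 200))
# Counter({('CERN-PROD', 'BNL-ATLAS'): 2})
```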
The ATLAS Distributed Data Management (DDM) system has evolved drastically in the last two years, with the Rucio software fully replacing the previous system before the start of LHC Run-2. The ATLAS DDM system now manages more than 250 petabytes spread over 130 storage sites and can handle file transfer rates of up to 30Hz. In this paper, we discuss the experience acquired in developing, commissioning, running, and maintaining such a large system. First, we describe the general architecture of the system and its integration with external services like the WLCG File...
This contribution details the deployment of Rucio, the ATLAS Distributed Data Management system. The main complication is that Rucio interacts with a wide variety of external services, and connects globally distributed data centres under different technological and administrative control, at an unprecedented data volume. It is therefore not possible to create a duplicate instance for testing or integration. Every software upgrade or configuration change is thus potentially disruptive and requires fail-safe automatic error...
For many scientific projects, data management is an increasingly complicated challenge. The number of data-intensive instruments generating unprecedented volumes of data is growing, and their accompanying workflows are becoming more complex. Their storage and computing resources are heterogeneous and are distributed at numerous geographical locations belonging to different administrative domains and organisations. These resources do not necessarily coincide with the places where data is produced, nor where it is stored, analysed by researchers, or...
ATLAS has recorded almost 5PB of RAW data since the LHC started running at the end of 2009. Many more derived data products and complementary simulation data have also been produced by the collaboration and, in total, 70PB is currently stored in the Worldwide LHC Computing Grid by ATLAS. All of this data is managed by the ATLAS Distributed Data Management system, called Don Quijote 2 (DQ2). DQ2 has evolved rapidly to help ATLAS Computing operations manage these large quantities of data across the many grid sites at which ATLAS runs, and to help physicists get access to the data. In this paper we describe new...
Data grids are used in large-scale scientific experiments to access and store nontrivial amounts of data, combining the storage resources of multiple data centers into one system. This enables users and automated services to use the storage in a common and efficient way. However, as data grids grow, it becomes a hard problem for developers and operators to estimate how modifications to policy, hardware, or software affect the performance metrics of the grid. In this paper we address the modeling of operational data grids. We first analyze the data grid middleware system of the ATLAS...
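To give a flavour of such modeling, here is a toy discrete-event simulation of a transfer queue, useful for what-if questions like how extra transfer slots change waiting times. All parameters are hypothetical, and this is not the paper's model.

```python
# Toy discrete-event model of a grid transfer queue: transfers arrive at
# random intervals and occupy one of a fixed number of slots. Comparing
# slot counts illustrates the kind of what-if analysis grid models enable.
import heapq, random

def simulate(n_transfers=1000, slots=10, mean_arrival=1.0, mean_service=8.0, seed=0):
    rng = random.Random(seed)
    free_at = [0.0] * slots          # time at which each slot becomes free
    heapq.heapify(free_at)
    t, total_wait = 0.0, 0.0
    for _ in range(n_transfers):
        t += rng.expovariate(1.0 / mean_arrival)   # next transfer arrives
        slot_free = heapq.heappop(free_at)
        start = max(t, slot_free)                  # wait if all slots busy
        total_wait += start - t
        heapq.heappush(free_at, start + rng.expovariate(1.0 / mean_service))
    return total_wait / n_transfers

print(f"mean wait, 10 slots: {simulate(slots=10):.2f}")
print(f"mean wait, 16 slots: {simulate(slots=16):.2f}")
```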
Rucio is the successor of the current Don Quijote 2 (DQ2) system for the distributed data management (DDM) of the ATLAS experiment. The reasons for replacing DQ2 are manifold, but besides high maintenance costs and architectural limitations, scalability concerns are at the top of the list. Current expectations are that the amount of data will be three to four times as much as it is today by the end of 2014. Furthermore, the availability of more powerful computing resources is pushing additional pressure onto the DDM system, as it increases the demands on data provisioning. Although DQ2 is capable of handling...
To prepare the migration to the new ATLAS Data Management system, called Rucio, a renaming campaign of all the physical files produced by ATLAS is needed. It represents around 300 million files split between ∼120 sites with 6 different storage technologies. It must be done in a transparent way in order not to disrupt the ongoing computing activities. An infrastructure to perform this renaming has been developed and is presented in this paper, along with its performance.
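Rucio's documented deterministic layout derives each file's physical path from an MD5 hash of its `scope:name` identifier, so the campaign amounts to computing the new path for every file and renaming it in place on the storage. A sketch of that mapping, omitting site-specific prefixes and other storage details:

```python
# Sketch of Rucio's documented deterministic path convention: the
# physical path is derived from an MD5 hash of "scope:name". Real
# deployments prepend site prefixes and may handle scopes differently.
import hashlib

def rucio_path(scope, name):
    digest = hashlib.md5(f"{scope}:{name}".encode()).hexdigest()
    return f"{scope}/{digest[0:2]}/{digest[2:4]}/{name}"

# Hypothetical file, for illustration only.
print(rucio_path('data12_8TeV', 'RAW.01234._000001.data'))
# e.g. data12_8TeV/ab/cd/RAW.01234._000001.data
```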