D Rebatto

ORCID: 0000-0002-9064-7777
About
Research Areas
  • Distributed and Parallel Computing Systems
  • Advanced Data Storage Technologies
  • Parallel Computing and Optimization Techniques
  • Scientific Computing and Data Management
  • Particle physics theoretical and experimental studies
  • Cloud Computing and Resource Management
  • Particle Detector Development and Performance
  • Distributed systems and fault tolerance
  • IoT and Edge/Fog Computing
  • Interconnection Networks and Systems
  • Superconducting Materials and Applications
  • Service-Oriented Architecture and Web Services
  • Big Data Technologies and Applications

Istituto Nazionale di Fisica Nucleare, Sezione di Milano
2008-2024

University of Milan
2011-2012

Istituto Nazionale di Fisica Nucleare
2006-2007

The gLite Workload Management System (WMS) is a collection of components that provide the service responsible for distributing and managing tasks across the computing and storage resources available on the Grid. The WMS receives requests for job execution from clients, finds the appropriate resources, then dispatches and follows the jobs until completion, handling failure whenever possible. Besides single batch-like jobs, compound job types handled by the WMS include Directed Acyclic Graphs (a set where...

10.1088/1742-6596/119/6/062007 article EN Journal of Physics Conference Series 2008-07-01
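The match-and-dispatch cycle this abstract describes can be sketched as follows. The resource attributes, rank rule and `submit` helper are illustrative assumptions; the real WMS performs ClassAd-based matchmaking against information-system data.

```python
# Illustrative sketch of a WMS-style match-and-dispatch loop
# (hypothetical attributes; not the actual gLite API).

def match(job, resources):
    """Return the resources satisfying the job's requirements."""
    return [r for r in resources
            if r["free_slots"] > 0 and r["arch"] == job["arch"]]

def submit(job, ce):
    """Stand-in for the real submission call to a Computing Element."""
    ce["free_slots"] -= 1
    return True

def dispatch(job, resources, max_retries=3):
    """Try matched resources in rank order, retrying on failure."""
    for _attempt in range(max_retries):
        # Simple rank: prefer the resource with the most free slots.
        candidates = sorted(match(job, resources),
                            key=lambda r: -r["free_slots"])
        for ce in candidates:
            if submit(job, ce):
                return ce["name"]
    return None  # job aborted after exhausting retries

resources = [{"name": "ce1", "arch": "x86_64", "free_slots": 4},
             {"name": "ce2", "arch": "x86_64", "free_slots": 10}]
job = {"id": "job-1", "arch": "x86_64"}
print(dispatch(job, resources))  # ce2: the highest-ranked match
```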

Resource management and scheduling of distributed, data-driven applications in a Grid environment are challenging problems. Although significant results were achieved in the past few years, the development and proper deployment of generic, reliable, standard components present issues that still need to be completely solved. Interested domains include workload management, resource discovery, matchmaking and brokering, accounting, authorization policies, resource access, reliability and dependability. The evolution...

10.5170/cern-2005-002.899 article EN 2004-01-01

The Large Hadron Collider at CERN will start data acquisition in 2007. The ATLAS (A Toroidal LHC Apparatus) experiment is preparing for data handling and analysis via a series of Data Challenges and production exercises to validate its computing model and to provide useful samples for detector and physics studies. The last Data Challenge, begun in June 2004 and ended in early 2005, was the first performed completely in a Grid environment. Immediately afterwards, a new production activity was necessary in order to provide event samples for the physics workshop taking place in 2005 in Rome. This...

10.1109/e-science.2005.18 article EN 2006-01-05

The CREAM CE implements a Grid job management service available to end users and to other higher-level submission services. It allows the submission and monitoring of computational jobs on local resource management systems. CREAM, which is part of the gLite middleware, is in EGI production, where it is used by several user communities in different scenarios. In this paper, after a quick description of its architecture and functionality, we report on the status of the service, focusing on results, feedback and issues that had to be addressed. We also discuss...

10.1088/1742-6596/331/6/062024 article EN Journal of Physics Conference Series 2011-12-23

The ATLAS experiment has been running continuous simulated event production for more than two years. A considerable fraction of the jobs is daily submitted and handled via the gLite Workload Management System, which overcomes several limitations of the previous LCG Resource Broker. The WMS was tested very intensively for the LHC experiments' use cases for six months, both in terms of performance and reliability. The tests were carried out by the Experiment Integration Support team (in close contact with the experiments) together with the EGEE...

10.1088/1742-6596/119/5/052009 article EN Journal of Physics Conference Series 2008-07-01

The status, functionality and design rationale of the 'Batch Local Ascii Helper' (BLAH) service for submitting and controlling jobs on a batch system are presented. BLAH is part of both the Condor and CREAM systems. At the end of the major cycle represented by the EU EGEE projects, a reflection on its evolution provides some insight into technological choices that can stand the proof of time.

10.1088/1742-6596/331/6/062039 article EN Journal of Physics Conference Series 2011-12-23
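The core idea of BLAH, a thin line-oriented ASCII layer hiding batch-system differences from Condor and CREAM, can be sketched like this. The command names, reply format and adapter methods are simplified assumptions, not the actual BLAH protocol.

```python
# Sketch of a BLAH-like abstraction: one ASCII command protocol
# mapped onto batch-system-specific adapters (illustrative only).

class PBSAdapter:
    """Hypothetical adapter; a real one would shell out to qsub/qdel."""
    def submit(self, job_desc):
        return "pbs.12345"
    def cancel(self, job_id):
        return True

ADAPTERS = {"pbs": PBSAdapter()}

def handle(line):
    """Parse one ASCII command line: '<CMD> <batch-system> <arg>'."""
    cmd, system, arg = line.split(maxsplit=2)
    adapter = ADAPTERS[system]
    if cmd == "JOB_SUBMIT":
        return f"S {adapter.submit(arg)}"        # S = success + job id
    if cmd == "JOB_CANCEL":
        return "S" if adapter.cancel(arg) else "E"
    return "E unknown command"

print(handle("JOB_SUBMIT pbs /path/to/job.desc"))  # S pbs.12345
```

Adding support for another batch system only requires a new adapter, which is the portability property the abstract attributes to BLAH.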

While accelerated computing instances providing access to NVIDIA™ GPUs have already been available for a couple of years in commercial public clouds like Amazon EC2, the EGI Federated Cloud put into production its first OpenStack-based GPU-equipped site at the end of 2015. However, many sites which provide GPUs or MIC coprocessors to enable high-performance processing are not yet directly supported in a federated manner by the HTC and cloud platforms. In fact, to use the accelerator card capabilities at the resource centre level, users must...

10.22323/1.293.0020 article EN cc-by-nc-nd 2017-12-06

The European DataTAG project has taken a major step towards making the concept of a worldwide computing Grid a reality. In collaboration with the companion U.S. project iVDGL, it realized an intercontinental testbed spanning Europe and the U.S., integrating architecturally different Grid implementations based on the Globus toolkit. WorldGrid has been successfully demonstrated at SuperComputing 2002 and IST2002, where real HEP application jobs were transparently submitted using native mechanisms and run on the resources available, independently...

10.48550/arxiv.cs/0306045 preprint EN other-oa arXiv (Cornell University) 2003-01-01

The Logging and Bookkeeping service tracks jobs passing through the Grid. It collects important events generated by both the grid middleware components and the applications, and processes them at a chosen LB server to provide the job state. The events are transported through secure and reliable channels. Job tracking is fully distributed and does not depend on a single information source; robustness is achieved through speculative state computation in case of reordered, delayed or lost events. The service is easily adaptable to modified...

10.5170/cern-2005-002.798 article EN 2004-01-01
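The speculative state computation mentioned above can be illustrated with a minimal sketch: derive the job state as the most advanced state implied by any event received so far, so that late or lost events cannot move the job backwards. The state names and the linear ordering rule are simplified assumptions, not the actual L&B state machine.

```python
# Sketch of speculative job-state computation over possibly
# reordered, delayed or lost events (simplified state model).

# Linear ordering of states a job normally passes through.
STATE_ORDER = ["SUBMITTED", "WAITING", "READY", "SCHEDULED",
               "RUNNING", "DONE"]

def job_state(events):
    """Return the most advanced state implied by any received event."""
    best = -1
    for ev in events:
        best = max(best, STATE_ORDER.index(ev["state"]))
    return STATE_ORDER[best] if best >= 0 else "UNKNOWN"

# Events arrive out of order: RUNNING before SCHEDULED, WAITING lost.
events = [{"state": "SUBMITTED"},
          {"state": "RUNNING"},
          {"state": "SCHEDULED"}]
print(job_state(events))  # RUNNING
```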

In preparation for first data at the LHC, a series of Data Challenges of increasing scale and complexity have been performed. Large quantities of simulated data were produced on three different Grids, integrated into the ATLAS production system. During 2006, the emphasis moved towards providing a stable and continuous production, as is required in the immediate run-up to first data, and thereafter. Here, we discuss the experience of production done on EGEE resources, using submission based on the gLite WMS, the CondorG system and Condor Glide-ins. The overall walltime...

10.1109/e-science.2007.47 article EN 2007-12-01

In the ATLAS computing model Grid resources are managed by PanDA, a system designed for production and distributed analysis, and data are stored under various formats in ROOT files. End-user physicists can choose to use either the ATHENA framework or directly ROOT, which gives users the possibility of using PROOF to exploit the power of multi-core machines and to dynamically manage analysis facilities. Since those facilities are, in general, not dedicated to PROOF only, PROOF-on-Demand (PoD) is used to enable PROOF on top of an existing resource management...

10.1088/1742-6596/513/3/032102 article EN Journal of Physics Conference Series 2014-06-11

The Worldwide LHC Computing Grid (WLCG) is a large-scale collaboration which gathers computing resources from more than 170 centers worldwide. To fulfill the requirements of new applications and to improve the long-term sustainability of the grid middleware, newly available solutions are being investigated. Like open-source and commercial players, the HEP community has also recognized the benefits of integrating cloud technologies into legacy, grid-based workflows. Since March 2021, INFN has entered the field by establishing...

10.1051/epjconf/202429504048 article EN cc-by EPJ Web of Conferences 2024-01-01

Grid middleware stacks, including gLite, have matured to the state of being able to process up to millions of jobs per day. Logging and Bookkeeping, the gLite job-tracking service, keeps pace with this rate; however, it is not designed to provide a long-term archive of information on executed jobs.

10.1088/1742-6596/119/6/062034 article EN Journal of Physics Conference Series 2008-07-01

WorldGRID is an intercontinental testbed spanning Europe and the US, integrating architecturally different Grid implementations based on the Globus toolkit. The testbed has been successfully demonstrated during demos at SuperComputing 2002 (Baltimore) and IST2002 (Copenhagen), where real HEP application jobs were transparently submitted using "native" mechanisms and run on the resources available, independently of their location. To monitor the behavior and performance of such a testbed and to spot problems as soon as they arise, DataTAG...

10.48550/arxiv.cs/0306018 preprint EN other-oa arXiv (Cornell University) 2003-01-01

In 2012, 14 Italian institutions participating in the LHC experiments won a grant from the Ministry of Research (MIUR), with the aim of optimising analysis activities and, in general, the Tier2/Tier3 infrastructure. We report on the activities being researched, aiming at a considerable improvement in the ease of access to resources by physicists, including those with no specific computing interests. We focused on items like distributed storage federations, batch-like analysis facilities, and provisioning of user interfaces on demand on cloud systems. R&D...

10.1088/1742-6596/664/3/032006 article EN Journal of Physics Conference Series 2015-12-23

The High Throughput Computing paradigm typically involves a scenario whereby a given, estimated processing power is made available and sustained by the computing environment over a medium/long period of time. As a consequence, performance goals are in general targeted at maximizing resource utilization to obtain the expected throughput, rather than at minimizing the run time of individual jobs. This does not mean that optimal resource selection through adequate workload management is not desired nor effective; nonetheless,...

10.1088/1742-6596/331/6/062029 article EN Journal of Physics Conference Series 2011-12-23

The EU-funded project EMI aims at providing a unified, standardized, easy-to-install software distribution for distributed computing infrastructures. CREAM is one of the middleware products part of the distribution: it implements a Grid job management service which allows the submission and monitoring of computational jobs on local resource management systems. In this paper we discuss some new features being implemented in the CREAM Computing Element: the implementation of the Execution Service (EMI-ES) specification (an agreement in the consortium on...

10.1088/1742-6596/396/3/032004 article EN Journal of Physics Conference Series 2012-12-13

The large amount of data produced by the ATLAS experiment requires new computing paradigms for processing and analysis, involving many computing centres spread around the world. The workload is managed by regional federations, called "clouds". The Italian cloud consists of a main (Tier-1) center, located in Bologna, four secondary (Tier-2) centers, and a few smaller (Tier-3) sites. In this contribution we describe the facilities and the activities of data processing, simulation and software development performed within the cloud, and we discuss the tests...

10.1088/1742-6596/396/4/042052 article EN Journal of Physics Conference Series 2012-12-13

ATLAS data are distributed centrally to Tier-1 and Tier-2 sites. The first stages of selection and analysis take place mainly at these centres, with the final, iterative and interactive stages taking place mostly at Tier-3 clusters. The Italian cloud consists of a Tier-1, four Tier-2s, and Tier-3 sites at each institute. Tier-3s that are grid-enabled are used to test code that will then be run on a larger scale at the Tier-2s. All sites offer interactive access to their users and the possibility to run PROOF. This paper describes the hardware and software infrastructure choices taken, the operational...

10.1088/1742-6596/331/5/052001 article EN Journal of Physics Conference Series 2011-12-23

With this work we present the activity and performance optimization of the Italian computing centers supporting the ATLAS experiment, forming the so-called Italian Cloud. We describe the activities of the Tier-2s Federation inside the computing model and some original contributions. StoRM, a new Storage Resource Manager developed by INFN as a replacement of Castor at CNAF (the Tier-1), is under test at the Tier-2 centers. We also show a failover solution for the LFC, based on Oracle DataGuard, load-balancing DNS and LFC daemon reconfiguration, realized between the sites in Roma....

10.1109/nssmic.2009.5401639 article EN 2009-10-01

We present the approach of the University of Milan Physics Department and the local INFN unit to allow and encourage the sharing of computing, storage and networking resources among different research areas (the largest resources being those composing the WLCG Tier-2 centre, tailored to the needs of the ATLAS experiment). Computing resources are organised as independent HTCondor pools, with a global master in charge of monitoring them and optimising their usage. The configuration has to provide satisfactory throughput for both serial and parallel (multicore, MPI)...

10.1088/1742-6596/664/5/052041 article EN Journal of Physics Conference Series 2015-12-23
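The role of the global master described above, monitoring independent pools and steering work toward spare capacity, can be sketched as a simple least-loaded routing rule. The pool names and load metric are illustrative assumptions, not the site's actual HTCondor configuration.

```python
# Sketch of a "global master" routing jobs across independent
# HTCondor-style pools by spare capacity (illustrative only).

pools = {"atlas-t2": {"slots": 1000, "busy": 950},
         "theory":   {"slots": 200,  "busy": 50},
         "generic":  {"slots": 100,  "busy": 90}}

def free_slots(pool):
    return pool["slots"] - pool["busy"]

def route(job_id):
    """Send the job to the pool with the most free slots."""
    name = max(pools, key=lambda n: free_slots(pools[n]))
    pools[name]["busy"] += 1
    return name

print(route("job-1"))  # theory: 150 free slots vs 50 and 10
```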

With the advent of recent European Union (EU) funded projects aimed at achieving an open, coordinated and proactive collaboration among communities that provide distributed computing services, stricter requirements on quality standards will be asked of middleware providers. Such a highly competitive and dynamic environment, organized to comply with a business-oriented model, has already started pursuing quality criteria, thus requiring formally defined, rigorous procedures, interfaces and roles for each step of the software...

10.1088/1742-6596/331/6/062026 article EN Journal of Physics Conference Series 2011-12-23

The European Middleware Initiative (EMI) project aims to deliver a consolidated set of middleware products based on the four major providers in Europe: ARC, dCache, gLite and UNICORE. The CREAM (Computing Resource Execution And Management) Service, a service for job management operations at the Computing Element (CE) level, is a software product which is part of the EMI distribution. In this paper we discuss some new functionality of the CREAM CE introduced with the first EMI major release (EMI-1, codename Kebnekaise). The integration...

10.1088/1742-6596/396/3/032003 article EN Journal of Physics Conference Series 2012-12-13

Workload management in the EMI project, Paolo Andreetto, Sara Bertocco, Fabio Capannini, Marco Cecchi, Alvise Dorigo, Eric Frizziero, Alessio Gianelle, Aristotelis Kretsis, Massimo Mezzadri, Salvatore Monforte, Francesco Prelz, David Rebatto, Sgaravatto, Luigi Zangrando

10.1088/1742-6596/396/3/032021 article IT Journal of Physics Conference Series 2012-12-13