A. Klimentov

ORCID: 0000-0003-2748-4829
Research Areas
  • Particle physics theoretical and experimental studies
  • High-Energy Particle Collisions Research
  • Particle Detector Development and Performance
  • Quantum Chromodynamics and Particle Interactions
  • Distributed and Parallel Computing Systems
  • Dark Matter and Cosmic Phenomena
  • Scientific Computing and Data Management
  • Computational Physics and Python Applications
  • Advanced Data Storage Technologies
  • Neutrino Physics Research
  • Cosmology and Gravitation Theories
  • Big Data Technologies and Applications
  • Medical Imaging Techniques and Applications
  • Radiation Detection and Scintillator Technologies
  • Parallel Computing and Optimization Techniques
  • Advanced Mathematical Theories
  • Astrophysics and Cosmic Phenomena
  • Cloud Computing and Resource Management
  • Advanced Database Systems and Queries
  • Particle Accelerators and Free-Electron Lasers
  • Research Data Management Practices
  • Software System Performance and Reliability
  • Data Mining Algorithms and Applications
  • Advanced Data Processing Techniques
  • Atomic and Subatomic Physics Research

Brookhaven National Laboratory
2016-2025

The University of Adelaide
2013-2023

Brandeis University
2023

University of Birmingham
2020

Technion – Israel Institute of Technology
2019

Kurchatov Institute
2015-2019

European Organization for Nuclear Research
2015-2019

Ministry of Industry and Information Technology
2019

Applied BioPhysics (United States)
2019

Czech Technical University in Prague
2019

Machine learning is an important applied research area in particle physics, beginning with applications to high-level physics analysis in the 1990s and 2000s, followed by an explosion of applications in event identification and reconstruction in the 2010s. In this document we discuss promising future research and development areas in machine learning in particle physics, a roadmap for their implementation, software and hardware resource requirements, collaborative initiatives with the data science community, academia and industry, and training the particle physics community in data science. The main objective is to connect...

10.1088/1742-6596/1085/2/022008 article EN Journal of Physics Conference Series 2018-09-01

The Production and Distributed Analysis (PanDA) system is a data-driven workload management system engineered to operate at the LHC data processing scale. PanDA provides a solution for scientific experiments to fully leverage their distributed heterogeneous resources, showcasing scalability, usability, flexibility, and robustness. The system has successfully proven itself through nearly two decades of steady operation in the ATLAS experiment, addressing intricate requirements such as diverse resources distributed worldwide...

10.1007/s41781-024-00114-3 article EN cc-by Computing and Software for Big Science 2024-01-23

This paper presents a summary of beam-induced backgrounds observed in the ATLAS detector and discusses methods to tag and remove background-contaminated events in data. Trigger-rate-based monitoring of beam-related backgrounds is presented. The correlations with machine conditions, such as the residual pressure in the beam-pipe, are discussed. Results from dedicated beam-background simulations are shown, and their qualitative agreement with data is evaluated. Data taken during the passage of unpaired, i.e. non-colliding, proton bunches are used...

10.1088/1748-0221/8/07/p07004 article EN Journal of Instrumentation 2013-07-17

An important foundation underlying the impressive success of data processing and analysis in the ATLAS experiment [1] at the LHC [2] is the Production and Distributed Analysis (PanDA) workload management system [3]. PanDA was designed specifically for ATLAS and proved to be highly successful in meeting all the distributed computing needs of the experiment. However, the core design is not experiment-specific. The system is capable of serving other data-intensive scientific applications. The Alpha-Magnetic Spectrometer [4], an astro-particle experiment on the International Space Station,...

10.1088/1742-6596/513/3/032062 article EN Journal of Physics Conference Series 2014-06-11

The Production and Distributed Analysis (PanDA) system has been developed to meet ATLAS production and analysis requirements for a data-driven workload management system capable of operating at the Large Hadron Collider (LHC) data processing scale. Heterogeneous resources used by the experiment are distributed worldwide at hundreds of sites, thousands of physicists analyse the data remotely, the volume of processed data is beyond the exabyte scale, dozens of scientific applications are supported, while data processing requires more than a few billion hours...

10.1088/1742-6596/898/5/052002 article EN Journal of Physics Conference Series 2017-10-01

The ATLAS experiment at CERN is one of the largest scientific machines built to date and will have ever-growing computing needs as the Large Hadron Collider collects an increasingly larger volume of data over the next 20 years. ATLAS is conducting R&D projects on Amazon Web Services and Google Cloud as complementary resources for distributed computing, focusing on some of the key features of commercial clouds: lightweight operation, elasticity and availability of multiple chip architectures. The proof of concept phases concluded with...

10.1051/epjconf/202429507002 article EN cc-by EPJ Web of Conferences 2024-01-01

This paper presents a novel approach to the joint optimization of job scheduling and data allocation in grid computing environments. We formulate this problem as a mixed integer quadratically constrained program. To tackle the nonlinearity in the constraints, we alternately fix a subset of the decision variables and optimize the remaining ones via Mixed Integer Linear Programming (MILP). We solve the MILP at each iteration with an off-the-shelf solver. Our experimental results show that our method significantly outperforms existing...

10.48550/arxiv.2502.00261 preprint EN arXiv (Cornell University) 2025-01-31
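
The alternating-fix scheme described in this abstract can be illustrated in a few lines. The sketch below is a toy reconstruction, not the paper's formulation: it assumes the PuLP library, an invented two-site instance, and a bilinear cost x·(1−y) coupling job placement x to replica placement y; fixing either block of variables makes the remaining subproblem a plain MILP.

```python
import pulp

# Toy instance (all names and numbers are illustrative, not from the paper).
jobs = ["j1", "j2", "j3"]
sites = ["s1", "s2"]
data = ["d1", "d2"]
needs = {"j1": "d1", "j2": "d1", "j3": "d2"}   # dataset each job reads
size = {"d1": 5.0, "d2": 3.0}                  # dataset size (TB)
disk = {"s1": 6.0, "s2": 6.0}                  # disk capacity per site (TB)
cpu = {("j1", "s1"): 2.0, ("j1", "s2"): 3.0,   # CPU cost of job at site
       ("j2", "s1"): 2.5, ("j2", "s2"): 1.5,
       ("j3", "s1"): 4.0, ("j3", "s2"): 1.0}

def _solve(prob, var):
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return {k: round(v.value()) for k, v in var.items()}, pulp.value(prob.objective)

def solve_jobs(y):
    """Replica placement y is fixed, so the term x*(1-y) is linear in x."""
    prob = pulp.LpProblem("jobs", pulp.LpMinimize)
    x = {(j, s): pulp.LpVariable(f"x_{j}_{s}", cat="Binary") for j in jobs for s in sites}
    for j in jobs:  # each job runs at exactly one site
        prob += pulp.lpSum(x[j, s] for s in sites) == 1
    # CPU cost plus WAN transfer whenever the input replica is absent
    prob += pulp.lpSum(x[j, s] * (cpu[j, s] + size[needs[j]] * (1 - y[needs[j], s]))
                       for j in jobs for s in sites)
    return _solve(prob, x)

def solve_data(x):
    """Job placement x is fixed, so the transfer term is linear in y."""
    prob = pulp.LpProblem("data", pulp.LpMinimize)
    y = {(d, s): pulp.LpVariable(f"y_{d}_{s}", cat="Binary") for d in data for s in sites}
    for d in data:  # keep at least one replica of every dataset
        prob += pulp.lpSum(y[d, s] for s in sites) >= 1
    for s in sites:  # respect site disk capacity
        prob += pulp.lpSum(size[d] * y[d, s] for d in data) <= disk[s]
    prob += pulp.lpSum(x[j, s] * (cpu[j, s] + size[needs[j]] * (1 - y[needs[j], s]))
                       for j in jobs for s in sites)
    return _solve(prob, y)

# Alternate between the two blocks until the objective stops improving.
y = {(d, s): int(s == "s1") for d in data for s in sites}  # initial replicas at s1
best = float("inf")
for _ in range(10):
    x, _ = solve_jobs(y)
    y, obj = solve_data(x)
    if obj >= best - 1e-9:
        break
    best = obj
print("objective:", best, "| jobs:", x, "| replicas:", y)
```

On this toy instance the loop converges in two passes; each pass solves a small MILP with CBC, mirroring the "fix a block, solve the rest" iteration the abstract describes.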

Machine learning has been applied to several problems in particle physics research, beginning with applications to high-level physics analysis in the 1990s and 2000s, followed by an explosion of applications in event identification and reconstruction in the 2010s. In this document we discuss promising future research and development areas for machine learning in particle physics. We detail a roadmap for their implementation, software and hardware resource requirements, collaborative initiatives with the data science community, academia and industry, and training the community in data science...

10.48550/arxiv.1807.02876 preprint EN other-oa arXiv (Cornell University) 2018-01-01

The ATLAS experiment at CERN relies on a worldwide distributed computing Grid infrastructure to support its physics program at the Large Hadron Collider. ATLAS has integrated cloud resources to complement its Grid infrastructure and conducted an R&D program on Google Cloud Platform. These initiatives leverage key features of commercial cloud providers: lightweight configuration and operation, elasticity, and availability of diverse infrastructure. This paper examines the seamless integration of cloud services as a conventional site within the workflow management and data...

10.1142/s0217751x24500544 preprint EN arXiv (Cornell University) 2024-03-23

The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS and ALICE are among the largest collaborations ever assembled in the sciences and are at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, both experiments rely on...

10.1088/1742-6596/608/1/012040 article EN Journal of Physics Conference Series 2015-05-22

The Big Data processing needs of the ATLAS experiment grow continuously, as more data and more use cases emerge. For data processing, ATLAS adopted the transformation approach, where software applications transform input data into outputs. In the production system, each transformation is represented by a task, a collection of many jobs, submitted to the workload management system (PanDA) and executed on the Grid. Our experience shows that the rate of task submission grows exponentially over the years. To scale up for new challenges, we started the ProdSys2 project. PanDA has been...

10.1088/1742-6596/664/6/062005 article EN Journal of Physics Conference Series 2015-12-23
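
To make the task/job hierarchy in this abstract concrete, here is a hypothetical mini-model in Python: a task groups many jobs, and each job applies one transformation mapping input files to outputs. All class and field names are illustrative assumptions, not PanDA's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class Job:
    inputs: list[str]
    transform: str                      # e.g. "reco" or "simul" (illustrative)
    outputs: list[str] = field(default_factory=list)

    def run(self) -> None:
        # A real transformation would invoke experiment software here.
        self.outputs = [f.replace(".in", ".out") for f in self.inputs]

@dataclass
class Task:
    name: str
    jobs: list[Job]

    def submit(self) -> None:
        for job in self.jobs:           # a WMS would schedule these on the Grid
            job.run()

task = Task("reprocess-demo", [Job([f"run{i}.in"], "reco") for i in range(3)])
task.submit()
print([j.outputs for j in task.jobs])
```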

Experiments at the Large Hadron Collider (LHC) face unprecedented computing challenges. Heterogeneous resources are distributed worldwide at hundreds of sites, thousands of physicists analyse the data remotely, the volume of processed data is beyond the exabyte scale, while data processing requires more than a few billion hours of computing usage per year. The PanDA (Production and Distributed Analysis) system was developed to meet the scale and complexity of LHC distributed computing for the ATLAS experiment. In the process, the old batch job paradigm of locally managed computing in HEP...

10.1088/1742-6596/664/6/062035 article EN Journal of Physics Conference Series 2015-12-23

This document describes the design of the new Production System of the ATLAS experiment at the LHC [1]. The Production System is the top-level workflow manager which translates physicists' needs for production and analysis processing into actual workflows executed across over a hundred Grid sites used globally by ATLAS. As the workload has increased in volume and complexity in recent years (the task count is above one million, with each task containing hundreds or thousands of jobs), there is a need to upgrade the system to meet the challenging requirements of the next run...

10.1088/1742-6596/513/3/032078 article EN Journal of Physics Conference Series 2014-06-11

The computing systems used by LHC experiments have historically consisted of the federation of hundreds to thousands of distributed resources, ranging from small to mid-size resources. In spite of the impressive scale of the existing solutions, these resources will be insufficient to meet projected future demands. This paper is a case study of how the ATLAS experiment embraced Titan, a DOE leadership computing facility, in conjunction with traditional high-throughput computing to reach sustained production scales of approximately 52M core-hours a year. The three...

10.1109/escience.2017.43 article EN 2017-10-01

The PanDA (Production and Distributed Analysis) workload management system (WMS) was developed to meet the scale and complexity of LHC distributed computing for the ATLAS experiment. While PanDA currently distributes jobs to more than 100,000 cores at well over 100 Grid sites, future data taking runs will require more resources than Grid computing can possibly provide. To alleviate these challenges, ATLAS is engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers.

10.1088/1742-6596/664/9/092020 article EN Journal of Physics Conference Series 2015-12-23

The ATLAS experiment at CERN's LHC stores detector and simulation data in raw and derived formats across more than 150 Grid sites world-wide, currently in total about 200PB on disk and 250PB on tape. Data have different access characteristics due to various computational workflows, and can be accessed from different media, such as remote I/O, disk cache on hard drives or SSDs. Also, the larger data centers provide the majority of offline storage capability via tape systems. For the High Luminosity LHC (HL-LHC), the estimated data storage requirements are...

10.1051/epjconf/202024504035 article EN cc-by EPJ Web of Conferences 2020-01-01

The second generation of the ATLAS Production System, called ProdSys2, is a distributed workload manager that runs daily hundreds of thousands of jobs, from dozens of different specific workflows, across more than a hundred heterogeneous sites. It achieves high utilization by combining dynamic job definition based on many criteria, such as input and output size, memory requirements and CPU consumption, with manageable scheduling policies supporting all kinds of computational resources: GRID, clouds, supercomputers...

10.1088/1742-6596/898/5/052016 article EN Journal of Physics Conference Series 2017-10-01

A new type of parallel workflow has been developed for the ATLAS experiment at the Large Hadron Collider that makes use of distributed computing combined with a cloud-based infrastructure. This workflow has been applied to a specific analysis using ATLAS data, one popularly referred to as Simulation-Based Inference (SBI). The JAX library is used in parts to compute gradients as well as to accelerate program execution with just-in-time compilation, which becomes essential in a full SBI analysis and can also offer significant speed-ups in more traditional types of analysis.

10.1051/epjconf/202429504007 article EN cc-by EPJ Web of Conferences 2024-01-01
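
The gradient-plus-JIT pattern mentioned in this abstract looks roughly like the following minimal sketch. The toy Gaussian model, parameter name mu, and fit loop are invented for illustration; only the jax.jit / jax.value_and_grad usage reflects the technique described.

```python
import jax
import jax.numpy as jnp

# Toy summary-statistic model: a Gaussian whose mean depends on a physics
# parameter mu (the model and numbers are illustrative, not the paper's).
def neg_log_likelihood(mu, data):
    pred = 1.0 + 0.5 * mu              # hypothetical detector response
    return jnp.mean(0.5 * (data - pred) ** 2)

# value_and_grad differentiates through the model; jit compiles the step.
loss_and_grad = jax.jit(jax.value_and_grad(neg_log_likelihood))

key = jax.random.PRNGKey(0)
data = 1.0 + 0.5 * 2.0 + 0.1 * jax.random.normal(key, (1000,))  # mu_true = 2

mu, lr = 0.0, 1.0
for step in range(100):                # simple gradient-descent fit
    loss, g = loss_and_grad(mu, data)
    mu -= lr * g
print(f"fitted mu ~ {float(mu):.3f}, loss = {float(loss):.4f}")
```

The same two transformations scale to far larger models; JIT compilation pays off whenever the likelihood is evaluated many times, as in an SBI fit.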

Monitoring services play a crucial role in the day-to-day operation of distributed computing systems. The ATLAS Experiment at the LHC uses the Production and Distributed Analysis workload management system (PanDA WMS), which allows a million computational jobs to run daily at over 170 centers of the WLCG and on opportunistic resources, utilizing 600k cores simultaneously on average. The BigPanDA monitor is an essential part of the monitoring infrastructure that provides a wide range of views, from top-level summaries to a single job...

10.1051/epjconf/202429504010 article EN cc-by EPJ Web of Conferences 2024-01-01

The Vera C. Rubin Observatory will produce an unprecedented astronomical data set for studies of the deep and dynamic universe. Its Legacy Survey of Space and Time (LSST) will image the entire southern sky every three to four days, producing tens of petabytes of raw image and associated calibration data over the course of the experiment's run. More than 20 terabytes of data must be stored every night, and annual campaigns to reprocess the entire dataset since the beginning of the survey will be conducted over ten years. The Production and Distributed Analysis (PanDA) system was evaluated by the Data Management team...

10.1051/epjconf/202429504026 article EN cc-by EPJ Web of Conferences 2024-01-01

In recent years, advanced and complex analysis workflows have gained increasing importance in the ATLAS experiment at CERN, one of the large scientific experiments at the LHC. Support for such workflows has allowed users to exploit remote computing resources and service providers distributed worldwide, overcoming limitations on local resources and services. The spectrum of computing options keeps increasing across the Worldwide LHC Computing Grid (WLCG), volunteer computing, high-performance computing, commercial clouds, and emerging service levels like Platform-as-a-Service...

10.1051/epjconf/202429504053 article EN cc-by EPJ Web of Conferences 2024-01-01

Machine Learning (ML) has become one of the most important tools for High Energy Physics analysis. As the size of datasets increases at the Large Hadron Collider (LHC), and at the same time search spaces become bigger in order to exploit physics potentials, more computing resources are required for processing these ML tasks. In addition, complex and advanced workflows are developed in which one task may depend on the results of previous ones. How to make use of the vast distributed CPUs/GPUs of the WLCG for these big ML tasks has become a popular research area. In this paper, we present our...

10.1051/epjconf/202429504019 article EN cc-by EPJ Web of Conferences 2024-01-01
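
The core pattern this abstract describes, farming independent ML trials out to many workers, can be sketched with Python's standard library alone. The model, parameter grid, scoring function, and worker count below are invented for illustration; a production system would dispatch real training payloads to Grid or cloud nodes instead of local processes.

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import product

def run_trial(params):
    """Stand-in for one ML training task; a real payload would train a
    model on a worker node and return its validation score."""
    lr, depth = params
    # Hypothetical score surface with an optimum near lr=0.1, depth=6.
    return params, -((lr - 0.1) ** 2 + 0.01 * (depth - 6) ** 2)

if __name__ == "__main__":
    grid = list(product([0.01, 0.05, 0.1, 0.5], [2, 4, 6, 8]))  # search space
    with ProcessPoolExecutor(max_workers=4) as pool:            # "worker slots"
        results = list(pool.map(run_trial, grid))
    best_params, best_score = max(results, key=lambda r: r[1])
    print("best:", best_params, "score:", best_score)
```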

Operational analytics is a direction of research related to the analysis of the current state of computing processes and the prediction of future states, in order to anticipate imbalances and take timely measures to stabilize a complex system. There are two relevant areas in ATLAS Distributed Computing that are currently the focus of such studies: user physics analysis, including forecasting the popularity of data samples among users, and evaluating WLCG centers for their readiness to process payloads. Studying these is challenging due to the complexity involved, as it...

10.1051/epjconf/202429504033 article EN cc-by EPJ Web of Conferences 2024-01-01

The production system for Grid Data Processing handles petascale ATLAS data reprocessing and Monte Carlo activities. It empowers further processing steps on the Grid performed by dozens of physics groups with coordinated access to computing resources worldwide, including additional resources sponsored by regional facilities. The system provides knowledge management of configuration parameters for massive tasks, reproducibility of results, scalable database access, orchestrated workflow and performance monitoring, dynamic workload...

10.1088/1742-6596/396/3/032049 article EN Journal of Physics Conference Series 2012-12-13