- Distributed and Parallel Computing Systems
- Parallel Computing and Optimization Techniques
- Advanced Data Storage Technologies
- Cloud Computing and Resource Management
- Scientific Computing and Data Management
- Ionosphere and magnetosphere dynamics
- Distributed systems and fault tolerance
- Magnetic confinement fusion research
- Solar and Space Plasma Dynamics
- Fluid Dynamics and Turbulent Flows
- Computational Fluid Dynamics and Aerodynamics
- Cloud Data Security Solutions
- Privacy-Preserving Technologies in Data
- Low-power high-performance VLSI design
- Lattice Boltzmann Simulation Studies
- Computational Drug Discovery Methods
- Meteorological Phenomena and Simulations
- Gas Dynamics and Kinetic Theory
- Advanced Image and Video Retrieval Techniques
- Advanced Neural Network Applications
- Machine Learning in Materials Science
- Interconnection Networks and Systems
- Advanced Memory and Neural Computing
- Matrix Theory and Algorithms
- Software System Performance and Reliability
Max Planck Computing and Data Facility
2022-2023
KTH Royal Institute of Technology
2013-2022
ORCID
2020
Swedish e-Science Research Centre
2011-2019
Khyber Teaching Hospital
2014-2018
Universidad Autónoma de Madrid
2018
University of Kassel
2018
ARM (United Kingdom)
2018
Computing Center
2016
University College Dublin
2013
The past few years have seen the creation of the first production-level Grid infrastructures that offer their users a dependable service at an unprecedented scale. Depending on the flavor of middleware services these infrastructures deploy (for instance Condor, gLite, Globus, UNICORE, to name only a few), different programming interfaces are provided. Despite ongoing efforts to standardize these interfaces, there are still significant differences in how applications can interface to a Grid infrastructure. In this paper we describe (gLite) and...
The NVIDIA Volta GPU microarchitecture introduces a specialized unit, called Tensor Core, that performs one matrix-multiply-and-accumulate on 4x4 matrices per clock cycle. The Tesla V100 accelerator, featuring the Volta microarchitecture, provides 640 Tensor Cores with a theoretical peak performance of 125 Tflop/s in mixed precision. In this paper, we investigate current approaches to program Tensor Cores, their performance, and the precision loss due to mixed-precision computation. Currently, there are three different ways of programming Tensor Cores: CUDA...
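The precision loss mentioned in this abstract can be illustrated with a small software emulation. The sketch below is not how Tensor Cores are programmed and makes no claim about the hardware's internal rounding; it assumes half-precision inputs with single-precision accumulation rounded after every addition. The helper names (`to_fp16`, `to_fp32`, `mma_4x4`, `peak_tflops`) are hypothetical, and the 1.53 GHz clock and 64 fused multiply-adds per Tensor Core per cycle are assumptions not stated in the text above.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def mma_4x4(a, b, c):
    """Emulate D = A*B + C on 4x4 matrices in mixed precision.

    Assumption: A and B are rounded to FP16, and the running sum is
    rounded to FP32 after every addition.
    """
    d = [[0.0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            acc = to_fp32(c[i][j])
            for k in range(4):
                # The product of two FP16 values is exactly representable in FP32.
                prod = to_fp16(a[i][k]) * to_fp16(b[k][j])
                acc = to_fp32(acc + prod)
            d[i][j] = acc
    return d


def peak_tflops(cores=640, fma_per_clock=64, clock_ghz=1.53):
    """Back-of-envelope peak: cores x FMAs per clock x 2 flops x clock rate."""
    return cores * fma_per_clock * 2 * clock_ghz / 1000.0


if __name__ == "__main__":
    a = [[0.1] * 4 for _ in range(4)]
    zero = [[0.0] * 4 for _ in range(4)]
    d = mma_4x4(a, a, zero)
    print("mixed precision:", d[0][0], "double precision:", 4 * 0.1 * 0.1)
    print("estimated peak (Tflop/s):", peak_tflops())
```

In this emulation, most of the error comes from rounding the inputs to half precision, since 0.1 is not representable in FP16. Under the stated assumptions the peak estimate lands close to the 125 Tflop/s figure quoted above.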
Task-based programming models for shared memory, such as Cilk Plus and OpenMP 3, are well established and documented. However, with the increase in parallel, many-core, and heterogeneous systems, a number of research-driven projects have developed more diversified task-based support, employing various programming and runtime features. Unfortunately, despite the fact that dozens of different task-based systems exist today and are actively used for parallel and high-performance computing (HPC), no comprehensive overview or classification...
Cloud computing is revolutionizing many ecosystems by providing organizations with resources featuring easy deployment, connectivity, configuration, automation, and scalability. This paradigm shift raises a broad range of security and privacy issues that must be taken into consideration. Multi-tenancy, loss of control, and trust are key challenges in cloud computing environments. This paper reviews the existing technologies and a wide array of both earlier and state-of-the-art projects on cloud security and privacy. We categorize the research according...
Docking and scoring large libraries of ligands against target proteins forms the basis of structure-based virtual screening. The problem is trivially parallelizable, and calculations are generally carried out on computer clusters or workstations in a brute-force manner, by docking and scoring all available ligands. In this study we propose a strategy that is based on iteratively docking a set of ligands to form a training set, training a ligand-based model on this set, and predicting the remainder of the ligands to exclude those predicted as 'low-scoring' ligands. Then, another set is docked,...
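The iterative strategy outlined in this abstract can be sketched as a loop. The code below is a toy illustration under stated assumptions, not the study's method: ligands are represented by a single integer descriptor, `toy_dock` stands in for an expensive docking run (higher score is better here), and `fit_nearest_neighbour` is a hypothetical stand-in for the ligand-based model, whose actual form is not given in the excerpt.

```python
import random
import statistics


def toy_dock(ligand):
    """Hypothetical docking score: ligands near descriptor value 70 score best."""
    return -abs(ligand - 70)


def fit_nearest_neighbour(docked):
    """Toy ligand-based model (assumption): predict a ligand's score from its
    nearest docked neighbour and flag it as low-scoring if that prediction
    falls below the median docked score."""
    threshold = statistics.median(docked.values())
    known = list(docked.items())

    def predicted_low(ligand):
        _, score = min(known, key=lambda item: abs(item[0] - ligand))
        return score < threshold

    return predicted_low


def iterative_screen(ligands, dock, fit, batch_size, rounds, seed=0):
    """Dock a batch, train on all docked ligands, drop predicted low scorers, repeat."""
    rng = random.Random(seed)
    remaining = list(ligands)
    rng.shuffle(remaining)
    docked = {}  # ligand -> docking score
    for _ in range(rounds):
        batch, remaining = remaining[:batch_size], remaining[batch_size:]
        for ligand in batch:
            docked[ligand] = dock(ligand)
        if not remaining:
            break
        predicted_low = fit(docked)
        remaining = [lig for lig in remaining if not predicted_low(lig)]
    return docked, remaining


if __name__ == "__main__":
    docked, remaining = iterative_screen(range(100), toy_dock,
                                         fit_nearest_neighbour,
                                         batch_size=10, rounds=3)
    print(len(docked), "ligands docked,", len(remaining), "still candidates")
```

The saving comes from never docking the ligands the model excludes. How well that works in practice depends on the quality of the model, which this toy example does not attempt to represent.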
Traditional scientific and emerging data analytics applications require fast, power-efficient, large, and persistent memories. Combining all these characteristics within a single memory technology is expensive, and hence future supercomputers will feature different memory technologies side by side. However, it is a complex task to program hybrid-memory systems and to identify the best object-to-memory mapping. We envision that programmers will probably resort to using default configurations with only minimal interventions on...
The aim of the EGEE (Enabling Grids for E-Science in Europe) project is to create a reliable and dependable European Grid infrastructure for e-Science. The objective of the Middleware Re-engineering and Integration Research Activity is to provide robust middleware components, deployable on several platforms and operating systems, corresponding to the core services for resource access, data management, information collection, authentication & authorization, matchmaking and brokering, and monitoring and accounting. For achieving this objective,...
The dynamics of a plasmoid chain is studied with three-dimensional Particle-in-Cell simulations. The evolution of the system with and without a uniform guide field, whose strength is 1/3 of the asymptotic magnetic field, is investigated. The plasmoid chain forms by spontaneous magnetic reconnection: the tearing instability rapidly disrupts the initial current sheet, generating several small-scale plasmoids that grow in size by coalescing and kinking. The plasmoid kink is mainly driven by the coalescence process. It is found that the presence of the guide field strongly influences the evolution of the plasmoid chain. Without main reconnection...
We present a case study of porting NekBone, a skeleton version of the Nek5000 code, to a parallel GPU-accelerated system. Nek5000 is a computational fluid dynamics code based on the spectral element method used for the simulation of incompressible flow. The original NekBone Fortran source code has been used as the base and enhanced by OpenACC directives. The profiling of NekBone provided an assessment of the suitability of the code for GPU systems and indicated possible kernel optimizations. Porting NekBone to GPU systems required little effort and a small number of additional lines...
Hardware accelerators have become a de facto standard to achieve high performance on current supercomputers, and there are indications that this trend will increase in the future. Modern accelerators feature high-bandwidth memory next to the computing cores. For example, the Intel Knights Landing (KNL) processor is equipped with 16 GB of high-bandwidth memory (HBM) that works together with conventional DRAM memory. Theoretically, HBM can provide ~4× higher bandwidth than conventional DRAM. However, many factors affect what applications effectively achieve,...
The performance of Deep-Learning (DL) computing frameworks relies on the performance of data ingestion and checkpointing. In fact, during training, a considerably high number of relatively small files are first loaded and pre-processed on CPUs and then moved to the accelerator for computation. In addition, checkpointing and restart operations are carried out to allow DL computing frameworks to restart quickly from a checkpoint. Because of this, I/O affects the performance of DL applications. In this work, we characterize the I/O performance and scaling of TensorFlow, an open-source programming framework developed by Google...
Many production Grid and e-Science infrastructures have begun to offer services to end-users during the past several years, with an increasing number of scientific applications that require access to a wide variety of resources and services in multiple Grids. Therefore, the Grid Interoperation Now Community Group of the Open Grid Forum organizes and manages interoperation efforts among those production Grid infrastructures to reach the goal of a world-wide Grid vision on a technical level in the near future. This contribution highlights fundamental approaches of the group and discusses open standards...
A spectral method for kinetic plasma simulations based on the expansion of the velocity distribution function in a variable number of Hermite polynomials is presented. A set of non-linear equations is solved to determine the expansion coefficients satisfying the Vlasov and Poisson equations. In this paper, we first show that this technique combines the fluid and kinetic approaches into one framework. Second, we present an adaptive strategy to increase and decrease the number of Hermite functions dynamically during the simulation. The method is applied to Landau damping and two-stream...
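The expansion described in this abstract can be written down as a hedged sketch. The one-dimensional electrostatic, normalized form of the Vlasov and Poisson equations and the symbols $C_n$, $\Psi_n$, and $N_H(t)$ are assumptions introduced here for illustration; the excerpt does not state the paper's exact basis or normalization.

```latex
% Assumed setting: 1D electrostatic, normalized Vlasov-Poisson system for
% electrons with a fixed neutralizing ion background.
\begin{align}
  \frac{\partial f}{\partial t} + v\,\frac{\partial f}{\partial x}
    - E\,\frac{\partial f}{\partial v} &= 0, \\
  \frac{\partial E}{\partial x} &= 1 - \int_{-\infty}^{\infty} f \,\mathrm{d}v, \\
  % Truncated Hermite expansion; N_H(t) may grow or shrink during the run.
  f(x, v, t) &\approx \sum_{n=0}^{N_H(t) - 1} C_n(x, t)\, \Psi_n(v).
\end{align}
```

Substituting the truncated sum into the Vlasov equation yields coupled equations for the coefficients $C_n$. In Hermite expansions of this kind the lowest-order coefficients are tied to fluid-like moments such as density and mean velocity, which is one way to read the claim that fluid and kinetic descriptions share one framework.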
We demonstrate the improvements to an implicit Particle-in-Cell code, iPic3D, on the example of a dipolar magnetic field immersed in a flow of plasma and show the formation of a magnetosphere. We address the problem of modelling multi-scale phenomena during the formation of a magnetosphere by implementing an adaptive sub-cycling technique to resolve the motion of particles located close to the dipole centre, where the magnetic field intensity is maximum. In addition, we implemented new open boundary conditions to model the inflow and outflow of plasma. We present results of global...
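The idea of adaptive sub-cycling near the dipole centre can be illustrated with a toy particle mover. This is a minimal sketch under stated assumptions and is not the iPic3D implementation: it uses an explicit Boris rotation with no electric field, a point dipole in arbitrary units, and a hypothetical criterion that limits the gyration angle per sub-step to `max_angle`. The number of sub-steps is chosen once at the start of each step, which is a simplification.

```python
import math


def dipole_b(pos, moment=1.0):
    """Field of a z-aligned point dipole at the origin (arbitrary units)."""
    x, y, z = pos
    r2 = x * x + y * y + z * z
    r5 = r2 ** 2.5
    return (3 * moment * x * z / r5,
            3 * moment * y * z / r5,
            moment * (3 * z * z - r2) / r5)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def boris_rotate(v, b, qm, dt):
    """Rotate the velocity about B (Boris scheme, no electric field)."""
    t = tuple(0.5 * qm * dt * bi for bi in b)
    t2 = sum(ti * ti for ti in t)
    s = tuple(2 * ti / (1 + t2) for ti in t)
    v_prime = tuple(vi + ci for vi, ci in zip(v, cross(v, t)))
    return tuple(vi + ci for vi, ci in zip(v, cross(v_prime, s)))


def substeps(b, qm, dt, max_angle=0.1):
    """Hypothetical criterion: keep the gyration angle per sub-step below max_angle."""
    omega = abs(qm) * math.sqrt(sum(bi * bi for bi in b))
    return max(1, math.ceil(omega * dt / max_angle))


def push(pos, v, qm, dt):
    """Advance one particle by dt, sub-cycling where the local field is strong."""
    n = substeps(dipole_b(pos), qm, dt)
    h = dt / n
    for _ in range(n):
        v = boris_rotate(v, dipole_b(pos), qm, h)
        pos = tuple(p + h * vi for p, vi in zip(pos, v))
    return pos, v, n


if __name__ == "__main__":
    for start in [(0.5, 0.0, 0.0), (5.0, 0.0, 0.0)]:
        _, _, n = push(start, (0.0, 0.1, 0.0), qm=1.0, dt=0.1)
        print("start", start, "sub-steps used:", n)
```

Because the dipole field strength falls off with the cube of the distance, only particles close to the centre pay for extra sub-steps, while distant particles keep the global time step.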