Misbah Mubarak

ORCID: 0000-0002-9923-9825
Research Areas
  • Interconnection Networks and Systems
  • Parallel Computing and Optimization Techniques
  • Advanced Data Storage Technologies
  • Distributed and Parallel Computing Systems
  • Cloud Computing and Resource Management
  • Simulation Techniques and Applications
  • Data Visualization and Analytics
  • Software-Defined Networks and 5G
  • Complex Network Analysis Techniques
  • Scientific Computing and Data Management
  • Distributed systems and fault tolerance
  • Anomaly Detection Techniques and Applications
  • Semiconductor materials and devices
  • Peer-to-Peer Network Technologies
  • Topological and Geometric Data Analysis
  • Time Series Analysis and Forecasting
  • Software System Performance and Reliability
  • Radiation Effects in Electronics
  • Blockchain Technology in Education and Learning
  • Neuroscience and Neural Engineering
  • Advanced Memory and Neural Computing
  • Low-power high-performance VLSI design
  • Simulation and Modeling Applications
  • Video Analysis and Summarization
  • Data Mining Algorithms and Applications

Sultan Ageng Tirtayasa University
2023

Sandia National Laboratories
2020

Argonne National Laboratory
2015-2020

National Research Council
2020

Institute of Electronics, Computer and Telecommunication Engineering
2020

Prince of Songkla University
2020

Webb Institute
2020

Sandia National Laboratories California
2018-2020

Rensselaer Polytechnic Institute
2012-2015

With the increasing complexity of today's high-performance computing (HPC) architectures, simulation has become an indispensable tool for exploring the design space of HPC systems, in particular networks. In order to make effective decisions, simulations of these systems must possess the following properties: (1) have high accuracy and fidelity, (2) produce results in a timely manner, and (3) be able to analyze a broad range of network workloads. Most state-of-the-art frameworks, however, are constrained in one or more...

10.1109/tpds.2016.2543725 article EN publisher-specific-oa IEEE Transactions on Parallel and Distributed Systems 2016-04-07

A low-latency and low-diameter interconnection network will be an important component of future exascale architectures. The dragonfly topology, a two-level directly connected network, is a candidate for exascale architectures because of its low diameter and reduced latency. To date, small-scale simulations with a few thousand nodes have been carried out to examine the dragonfly topology. However, future exascale machines will have millions of cores and up to 1 million nodes. In this paper, we focus on the modeling and simulation of large-scale dragonfly networks using...

10.1109/sc.companion.2012.56 article EN 2012-11-01

High-radix, low-diameter dragonfly networks will be a common choice in next-generation supercomputers. Preliminary studies show that random job placement with adaptive routing should be the rule of thumb to utilize such networks, since it uniformly distributes traffic and alleviates congestion. Nevertheless, in this work we find that while random job placement coupled with adaptive routing is good at load balancing network traffic, it cannot guarantee the best performance for every job. The improvement for communication-intensive applications comes...

10.1109/sc.2016.63 article EN 2016-11-01

10.5555/3014904.3014990 article EN IEEE International Conference on High Performance Computing, Data, and Analytics 2016-11-13

The fat-tree topology is one of the most commonly used network topologies in HPC systems. Vendors support several options that can be configured when deploying fat-tree networks on production systems, such as link bandwidth, number of rails, number of planes, and tapering. This paper showcases the use of simulations to compare the impact of these design options on representative applications, libraries, and multi-job workloads. We present advances in the TraceR-CODES simulation framework that enable this analysis and evaluate its prediction accuracy against...

10.1145/3126908.3126967 article EN 2017-11-08

A high-bandwidth, low-latency interconnect will be a critical component of future exascale systems. The torus network topology, which uses multidimensional links to improve path diversity and exploit locality between nodes, is a potential candidate for exascale interconnects.

10.1145/2601381.2601383 article EN 2014-05-18

Accurate analysis of HPC storage system designs is contingent on the use of I/O workloads that are truly representative of expected use. However, analyses are generally bound to specific workload modeling techniques such as synthetic benchmarks or trace replay mechanisms, despite the fact that no single technique is appropriate for all cases. In this work, we present the design of IOWA, a novel abstraction that allows arbitrary consumer components to obtain I/O workloads from a range of diverse input sources. Thus, researchers can choose...

10.1145/2832087.2832091 article EN 2015-11-11

HPC systems have shifted to burst buffer storage and high-radix interconnect topologies in order to meet the challenges of large-scale, data-intensive scientific computing. Both of these technologies have been studied in detail independently, but the interaction between them is not well understood. I/O traffic and communication traffic from concurrently scheduled applications may interfere with each other in unexpected ways, and this behavior may vary considerably depending on resource allocation, scheduling, and routing policies. In...

10.1109/cluster.2017.25 article EN 2017-09-01

The overall efficiency of an extreme-scale supercomputer largely relies on the performance of its network interconnects. Several state-of-the-art supercomputers use networks based on the increasingly popular Dragonfly topology. It is crucial to study the behavior and performance of different parallel applications running on Dragonfly networks in order to make optimal system configurations and design choices, such as job scheduling and routing strategies. However, to study these temporal behaviors, we would need a tool to analyze and correlate numerous sets of multivariate...

10.1016/j.visinf.2018.04.010 article EN cc-by-nc-nd Visual Informatics 2018-03-01

As supercomputers close in on exascale performance, the increased number of processors and processing power translates to an increased demand on the underlying network interconnect. The Slim Fly topology, a new low-diameter, low-latency interconnection network, is gaining interest as one possible solution for next-generation supercomputing interconnect systems. In this paper, we present a high-fidelity flit-level model leveraging the Rensselaer Optimistic Simulation System (ROSS) and Co-Design of Exascale Storage (CODES)...

10.1145/2901378.2901389 article EN 2016-05-13

Dragonfly networks are being widely adopted in high-performance computing systems. On these networks, however, interference caused by resource sharing can lead to significant network congestion and performance variability. We present a comparative analysis exploring the trade-off between localizing communication and balancing network traffic. We conduct trace-based simulations for applications with different communication patterns, using multiple job placement policies and routing mechanisms. We perform an in-depth analysis on...

10.1109/ipdps.2018.00120 article EN IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2018-05-01

Among the low-diameter, high-radix networks being deployed in next-generation HPC systems, dual-rail fat-tree networks are a promising approach. Adding additional injection connections (rails) to one or more network planes allows multi-rail fat-tree networks to alleviate communication bottlenecks. These multi-rail networks necessitate new design considerations, such as routing choices, job placements, and scalability of rails. We extend our fat-tree model in the CODES parallel simulation framework to support...

10.1109/ccgrid.2017.102 article EN 2017-05-01

Performance modeling of extreme-scale applications on accurate representations of potential architectures is critical for designing next-generation supercomputing systems because it is impractical to construct prototype systems at scale with new network hardware in order to explore designs and policies. However, these simulations often rely on static application traces that can be difficult to work with because of their size and lack of flexibility to extend or scale up without rerunning the original application. To address this problem, we have...

10.1145/3064911.3064923 article EN 2017-05-16

High-radix, low-diameter, hierarchical networks based on the Dragonfly topology are common picks for building next-generation HPC systems. However, effective tools are lacking for analyzing network performance and exploring design choices for such emerging networks at scale. In this paper, we present visual analytics methods that couple data aggregation techniques with interactive visualizations for analyzing large-scale networks. We create a system based on these techniques. To facilitate analysis and exploration of network behaviors, our...

10.1109/cluster.2017.26 article EN 2017-09-01

Burst buffers (BBs) are increasingly exploited in contemporary supercomputers to bridge the performance gap between compute and storage systems. The design of BBs, particularly the placement of these devices in the underlying network topology, impacts both performance and cost. As the cost of other components such as memory and accelerators is increasing, it is becoming more important that HPC centers provision BBs tailored to their workloads. This work contributes a provisioning system to provide accurate, multi-tenant simulations that model...

10.1109/cluster.2019.8891051 article EN 2019-09-01

Understanding and tuning the performance of extreme-scale parallel computing systems demands a streaming approach due to the computational cost of applying offline algorithms to vast amounts of log data. Analyzing large streaming data is challenging because the rate of receiving data and the limited time to comprehend it make it difficult for analysts to sufficiently examine the data without missing important changes or patterns. To support streaming data analysis, we introduce a visual analytic framework comprising three modules: management, interactive...

10.1109/pacificvis48177.2020.9280 article EN 2020-05-08

Dragonfly-class networks are considered promising interconnects for next-generation supercomputers. While Dragonfly+ networks offer more path diversity than the original Dragonfly design, they are still prone to performance variability due to their hierarchical architecture and resource sharing design. Event-driven network simulators are indispensable tools for navigating complex system design. In this study, we quantitatively evaluate a variety of application communication interactions on a 3,456-node Dragonfly+ system by using the CODES toolkit. This...

10.1145/3316480.3325517 article EN 2019-05-29

With the rapid growth of machine learning applications, the workloads of future HPC systems are anticipated to be a mix of scientific simulation, big data analytics, and machine learning applications. Simulation is a great research vehicle to understand the performance implications of co-running scientific applications with machine learning applications on large-scale systems. In this paper, we present Union, a workload manager that provides an automatic framework to facilitate hybrid workload simulation in CODES. Furthermore, we use Union, along with CODES, to investigate various hybrid workloads composed of traditional...

10.1109/ipdps47924.2020.00089 article EN IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2020-05-01

MPI collective operations are a critical and frequently used part of most MPI-based large-scale scientific applications. In previous work, we have enabled the Rensselaer Optimistic Simulation System (ROSS) to predict the performance of MPI point-to-point messaging on high-fidelity million-node network simulations of torus and dragonfly interconnects. The main contribution of this work is an extension of these models to support MPI collective communication using the optimistic event scheduling capability of ROSS. We demonstrate that both...

10.5555/2693848.2694239 article EN Winter Simulation Conference 2014-12-07

Critical to the scalability of parallel adaptive simulations are control functions including load balancing, reduced inter-process communication, and optimal data decomposition. In distributed meshes, many mesh-based applications frequently access neighborhood information for computational purposes, which must be transmitted efficiently to avoid performance degradation when the neighbors are on different processors. This article presents an algorithm for creating and deleting data copies, referred to as ghost copies, which localize...

10.1155/2013/654971 article EN cc-by Scientific Programming 2013-01-01

Two-tiered direct network topologies such as Dragonflies have been proposed for future post-petascale and exascale machines, since they provide a high-radix, low-diameter, fast interconnection network. Such topologies call for redesigning MPI collective communication algorithms in order to attain the best performance. Yet as increasingly more applications share a machine, it is not clear how these topology-aware algorithms will react to interference with concurrent jobs accessing the same network. In this paper, we study three broadcast...

10.1109/cluster.2016.26 article EN 2016-09-01

Network contention between concurrently running jobs on HPC systems is a primary cause of performance variability. Optimizing job allocation and avoiding network sharing are hence crucial to alleviate the potential performance degradation. In order to do so effectively, an understanding of the interference among jobs, their communication patterns, and contention in the network is required. In this work, we choose three representative applications from the DOE Design Forward Project and conduct detailed simulations on a torus network model to analyze both intra- and...

10.1109/icpads.2016.0040 article EN 2016-12-01