NFDI4DS | UHH-SEMS - Publication Details

darkNoC

OPENALEX - Publications

Haseeb Bokhari Haris Javaid Muhammad Shafique Jörg Henkel Sri Parameswaran

In this paper, we propose a novel NoC architecture, called darkNoC, where multiple layers of architecturally identical, but physically different routers are integrated, leveraging the extra transistors available due to dark silicon. Each layer is separately optimized for a particular voltage-frequency range by the adroit use of multi-Vt circuit optimization. At a given time, only one network layer is illuminated while all the others are dark. We provide architectural support for seamless integration of the layers, and fast...

10.1145/2593069.2593117 article EN 2014-05-27

Optimizing Validation Phase of Hyperledger Fabric

Haris Javaid Chengchen Hu Gordon Brebner

Blockchain technologies are on the rise, and Hyperledger Fabric is one of the most popular permissioned blockchain platforms. In this paper, we re-architect the validation phase of Fabric based on our analysis from a fine-grained breakdown of the phase's latency. Our optimized validation phase uses a chaincode cache during the validation of transactions, initiates state database reads in parallel with validation, and writes to the ledger and databases in parallel. Our experiments reveal performance improvements of 2x for CouchDB and 1.3x for LevelDB. Notably, our optimizations can be adopted in future...

10.1109/mascots.2019.00038 article EN 2019-09-25

Approximate Integer and Floating-Point Dividers with Near-Zero Error Bias

Hassaan Saadat Haris Javaid Sri Parameswaran

We propose approximate dividers with near-zero error bias for both integer and floating-point numbers. The integer divider, INZeD, is designed using a novel, analytically deduced error-correction method in an approximate log based divider. The floating-point divider, FaNZeD, is based on a highly optimized mantissa divider that is inspired by INZeD. Both of the dividers are configurable.

10.1145/3316781.3317773 article EN 2019-05-23

Low-power adaptive pipelined MPSoCs for multimedia

Haris Javaid Muhammad Shafique Sri Parameswaran Jörg Henkel

Pipelined MPSoCs provide a high throughput implementation platform for multimedia applications, with reduced design time and improved flexibility. Typically, a pipelined MPSoC is balanced at design-time using worst-case parameters. Where there is a widely varying workload, such designs consume an exorbitant amount of power. In this paper, we propose a novel adaptive pipelined MPSoC architecture that adapts itself to varying workloads. Our architecture consists of Main Processors and Auxiliary Processors with a distributed run-time balancing approach, where each...

10.1145/2024724.2024951 article EN Proceedings of the 48th Design Automation Conference 2011-06-05

REALM: Reduced-Error Approximate Log-based Integer Multiplier

Hassaan Saadat Haris Javaid Aleksandar Ignjatović Sri Parameswaran

We propose a new error-configurable approximate unsigned integer multiplier named REALM. It incorporates a novel error-reduction method into the classical approximate log-based multiplier. Each power-of-two-interval of the input operands is partitioned into M×M segments, and an error-reduction factor for each segment is analytically determined. These factors can be used across any power-of-two-interval, so we quantize only M² factors and store...

10.23919/date48585.2020.9116315 article EN Design, Automation & Test in Europe Conference & Exhibition (DATE) 2020-03-01
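
For readers unfamiliar with the "classical log-based multiplier" that the REALM abstract builds on, the following is a minimal sketch of Mitchell's logarithmic approximation for unsigned integers. It is an illustration only: the function name `mitchell_multiply` is hypothetical, and REALM's own segment-wise error-reduction factors are not modelled here because the abstract above is truncated before describing how they are stored and applied.

```python
def mitchell_multiply(a: int, b: int) -> int:
    """Approximate a * b for unsigned integers with Mitchell's log-based method.

    Each operand is written as 2**k * (1 + x) with 0 <= x < 1, and
    log2(1 + x) is approximated by x. This is the classical scheme only,
    not the REALM design.
    """
    if a < 0 or b < 0:
        raise ValueError("operands must be unsigned")
    if a == 0 or b == 0:
        return 0

    ka = a.bit_length() - 1          # position of the leading one of a
    kb = b.bit_length() - 1          # position of the leading one of b
    fa = a - (1 << ka)               # fractional part of a, scaled by 2**ka
    fb = b - (1 << kb)               # fractional part of b, scaled by 2**kb

    # (xa + xb) scaled by 2**(ka + kb); exact integer arithmetic, no rounding.
    frac_sum = (fa << kb) + (fb << ka)
    one = 1 << (ka + kb)

    if frac_sum < one:
        # xa + xb < 1: antilog gives 2**(ka+kb) * (1 + xa + xb)
        return one + frac_sum
    # xa + xb >= 1: antilog gives 2**(ka+kb+1) * (xa + xb)
    return 2 * frac_sum


if __name__ == "__main__":
    print(mitchell_multiply(4, 8))   # 32 (exact for powers of two)
    print(mitchell_multiply(5, 6))   # 28 (exact product is 30)
    print(mitchell_multiply(3, 3))   # 8  (exact product is 9)
```

The classical scheme never overestimates, and its relative error peaks at 1/9 (about 11.1%) when both fractional parts equal 0.5, as in 3 × 3. That one-sided error is the kind of bias an error-reduction method would aim to shrink.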

Blockchain Machine: A Network-Attached Hardware Accelerator for Hyperledger Fabric

Haris Javaid Ji Yang Nathania Santoso Mohit Upadhyay S. Mohan and 2 more

In this paper, we demonstrate how Hyperledger Fabric, one of the most popular permissioned blockchains, can benefit from network-attached acceleration. The scalability and peak performance of Fabric is primarily limited by the bottlenecks present in its block validation/commit phase. We propose Blockchain Machine, a hardware accelerator coupled with a hardware-friendly communication protocol, to act as the validator peer. It can be adapted to applications and their smart contracts, and is targeted for server FPGA...

10.1109/icdcs54860.2022.00033 article EN 2022 IEEE 42nd International Conference on Distributed Computing Systems (ICDCS) 2022-07-01

Rapid Design Space Exploration of Application Specific Heterogeneous Pipelined Multiprocessor Systems

Haris Javaid Aleksander Ignjatovic Sri Parameswaran

This paper describes a rapid design methodology to create a pipeline of processors to execute streaming applications. The methodology seeks a system with the smallest area while its runtime is within a specified constraint. Initially, a heuristic is used to rapidly explore a large number of processor configurations to find the near Pareto front of the design space, and then an exact integer linear programming (ILP) formulation (EIF) is used to find an optimal solution. A reduced ILP formulation (RIF) is used ... or if the EIF does not find a solution in a given time window. The methodology was integrated into...

10.1109/tcad.2010.2061353 article EN IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 2010-10-20

A design flow for application specific heterogeneous pipelined multiprocessor systems

Haris Javaid Sri Parameswaran

This paper describes a rapid design methodology to create a pipeline of processors to execute streaming applications. The methodology is in two separate phases: the first phase uses a heuristic to rapidly search through a large number of processor configurations (configurations differ by the base processor, the additional instructions and cache sizes) to find the near Pareto front; the second phase utilizes either the above heuristic or an ILP (Integer Linear Programming) formulation to search a smaller design space to find an appropriate final implementation. By utilization of the fast heuristic with...

10.1145/1629911.1629979 article EN 2009-07-26

System-level application-aware dynamic power management in adaptive pipelined MPSoCs for multimedia

Haris Javaid Muhammad Shafique Jörg Henkel Sri Parameswaran

System-level dynamic power management (DPM) schemes in Multiprocessor System on Chips (MPSoCs) exploit the idleness of processors to reduce energy consumption by putting idle processors into low-power states. In the presence of multiple low-power states, the challenge is to predict the duration of the idle period with high accuracy so that the most beneficial power state can be selected for the idle processor. In this work, we propose a novel dynamic power management scheme for adaptive pipelined MPSoCs, suitable for multimedia applications. We leverage application knowledge in the form of future workload...

10.5555/2132325.2132465 article EN International Conference on Computer Aided Design 2011-11-07

Optimal synthesis of latency and throughput constrained pipelined MPSoCs targeting streaming applications

Haris Javaid Xin He Aleksander Ignjatovic Sri Parameswaran

A streaming application, characterized by a kernel that can be broken down into independent tasks which can be executed in a pipelined fashion, inherently allows its implementation on a pipeline of Application Specific Instruction set Processors (ASIPs), called a pipelined MPSoC. The latency and throughput requirements of streaming applications put constraints on the design of such a pipelined MPSoC, where each ASIP has a number of available configurations differing by additional instructions, and instruction and data cache sizes. Thus, the design space of a pipelined MPSoC is all...

10.1145/1878961.1878978 article EN 2010-10-24

SuperNet

Haseeb Bokhari Haris Javaid Muhammad Shafique Jörg Henkel Sri Parameswaran

Designers of the on-chip interconnect for manycore chips are faced with the dilemma of meeting the performance, power and reliability requirements of different operational scenarios. In this paper, we propose a multimode on-chip interconnect called SuperNet. This interconnect can be configured to run in three modes: energy efficient mode; performance mode; and reliability mode. Our proposed interconnect is based on two parallel multi-Vt optimized packet switched network-on-chip (NoC) meshes. We describe the circuit design techniques and architectural modifications required...

10.1145/2744769.2744912 article EN 2015-06-02

ApproxTrain: Fast Simulation of Approximate Multipliers for DNN Training and Inference

Jing Gong Hassaan Saadat Hasindu Gamaarachchi Haris Javaid Xiaobo Sharon Hu and 1 more

Edge training of deep neural networks (DNNs) is a desirable goal for continuous learning; however, it is hindered by the enormous computational power required by training. Hardware approximate multipliers have shown their effectiveness in gaining resource efficiency in DNN inference accelerators; however, training with approximate multipliers is largely unexplored. To build resource-efficient accelerators supporting DNN training, a thorough evaluation of training convergence and accuracy for different DNN architectures is needed. This article presents ApproxTrain, an...

10.1109/tcad.2023.3253045 article EN IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 2023-03-06

System-level application-aware dynamic power management in adaptive pipelined MPSoCs for multimedia

Haris Javaid Muhammad Shafique Jörg Henkel Sri Parameswaran

System-level dynamic power management (DPM) schemes in Multiprocessor System on Chips (MPSoCs) exploit the idleness of processors to reduce energy consumption by putting idle processors into low-power states. In the presence of multiple low-power states, the challenge is to predict the duration of the idle period with high accuracy so that the most beneficial power state can be selected for the idle processor. In this work, we propose a novel dynamic power management scheme for adaptive pipelined MPSoCs, suitable for multimedia applications. We leverage application knowledge in the form of future workload...

10.1109/iccad.2011.6105394 article EN IEEE/ACM International Conference on Computer-Aided Design (ICCAD) 2011-11-01

Malleable NoC: dark silicon inspired adaptable Network-on-Chip

Haseeb Bokhari Haris Javaid Muhammad Shafique Jörg Henkel Sri Parameswaran

Network on Chip (NoC) has been envisioned as a scalable fabric for many core chips. However, NoCs can consume a considerable share of chip power. Moreover, diverse applications are executed in these multicore chips, where each application imposes a unique load on the NoC. To realise a NoC which is Energy and Delay efficient, we propose combining multiple VF optimized routers for each node (in traditional NoCs, we have only a single router per node) for an efficient NoC for Dark Silicon chips. We present a generic NoC with routers designed for different...

10.5555/2755753.2757101 article EN Design, Automation, and Test in Europe 2015-03-09

Malleable NoC: Dark Silicon Inspired Adaptable Network-on-Chip

Haseeb Bokhari Haris Javaid Muhammad Shafique Jörg Henkel Sri Parameswaran

Network on Chip (NoC) has been envisioned as a scalable fabric for many core chips. However, NoCs can consume a considerable share of chip power. Moreover, diverse applications are executed in these multicore chips, where each application imposes a unique load on the NoC. To realise a NoC which is Energy and Delay efficient, we propose combining multiple VF optimized routers for each node (in traditional NoCs, we have only a single router per node) for an efficient NoC for Dark Silicon chips. We present a generic NoC with routers designed for different...

10.7873/date.2015.0694 article EN Design, Automation & Test in Europe Conference & Exhibition (DATE), 2015 2015-01-01

Fidelity metrics for estimation models

Haris Javaid Aleksander Ignjatovic Sri Parameswaran

Estimation models play a vital role in many aspects of day to day life. Extremely complex estimation models are employed in the design space exploration of SoCs, and the efficacy of these estimation models is usually measured by the absolute error of the estimation compared to known actual results. Such absolute error based metrics can often result in over-designed estimation models, with a number of researchers suggesting that the fidelity of an estimation model (correlation between the ordering of the estimated points and the ordering of the actual points) should be examined instead of, or in addition to, the absolute error. In this paper, for the first time, we...

10.5555/2133429.2133431 article EN International Conference on Computer Aided Design 2010-11-07
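
The notion of fidelity described in the abstract above (agreement between the ordering of estimated points and the ordering of actual points) can be illustrated with a small sketch. The function `pairwise_fidelity` below is a hypothetical helper that simply counts the fraction of point pairs ranked the same way by both series; it is one common reading of the idea, not the specific metrics proposed in the paper, whose definitions are cut off in the snippet.

```python
from itertools import combinations


def _sign(x: float) -> int:
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def pairwise_fidelity(actual, estimated) -> float:
    """Fraction of point pairs whose relative order is the same in both lists.

    1.0 means the estimator ranks every pair of design points exactly as the
    actual results do, regardless of how large the absolute errors are.
    """
    if len(actual) != len(estimated):
        raise ValueError("actual and estimated must have the same length")
    pairs = list(combinations(range(len(actual)), 2))
    if not pairs:
        raise ValueError("need at least two design points")
    agree = sum(
        1
        for i, j in pairs
        if _sign(actual[i] - actual[j]) == _sign(estimated[i] - estimated[j])
    )
    return agree / len(pairs)


if __name__ == "__main__":
    actual = [10, 20, 30, 40]
    # Large absolute error, but the ordering is fully preserved.
    print(pairwise_fidelity(actual, [100, 220, 290, 430]))  # 1.0
    # Small absolute error, but the last two points swap order (5 of 6 pairs agree).
    print(pairwise_fidelity(actual, [11, 19, 41, 39]))
```

The two calls show why absolute error and fidelity can disagree: the first estimator is off by roughly a factor of ten yet ranks every design point correctly, while the second is close in value but misorders one pair.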

Performance Estimation of Pipelined MultiProcessor System-on-Chips (MPSoCs)

Haris Javaid Aleksander Ignjatovic Sri Parameswaran

The paradigm of pipelined MPSoC (processors connected in a pipeline) is well suited to the data flow nature of multimedia applications. Often design space exploration is performed to optimize execution time, latency or throughput of a pipelined MPSoC, where the variants of the system are processor configurations due to customizable options in each of the processors. Since there can be billions of combinations of processor configurations (design points), the challenge is to quickly provide estimates of the performance metrics of those design points. Hence, in this article, we propose analytical models...

10.1109/tpds.2013.268 article EN IEEE Transactions on Parallel and Distributed Systems 2013-10-18

Energy-Efficient Adaptive Pipelined MPSoCs for Multimedia Applications

Haris Javaid Muhammad Shafique Jörg Henkel Sri Parameswaran

Pipelined MPSoCs provide a high throughput implementation platform for multimedia applications. They are typically balanced at design-time considering worst-case scenarios so that a given throughput can be fulfilled at all times. Such pipelined MPSoCs lack runtime adaptability and result in inefficient resource utilization and power/energy consumption under dynamic workload. In this paper, we propose a novel adaptive architecture and a distributed processor manager to enable runtime adaptation of pipelined MPSoCs. The proposed architecture consists of main...

10.1109/tcad.2014.2298196 article EN IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 2014-04-18

Synthesis of heterogeneous pipelined multiprocessor systems using ILP

Haris Javaid Sri Parameswaran

Streaming applications can be implemented with a pipeline of processors. Each processor in the pipeline is an Application Specific Instruction Set Processor (ASIP), with the result being a heterogeneous pipelined MPSoC system. Since ASIPs can have differing configurations, finding the optimal set of configurations for a multiprocessor architecture is a difficult problem.

10.1145/1450135.1450137 article EN 2008-10-19

Fidelity metrics for estimation models

Haris Javaid Aleksander Ignjatovic Sri Parameswaran

Estimation models play a vital role in many aspects of day to day life. Extremely complex estimation models are employed in the design space exploration of SoCs, and the efficacy of these estimation models is usually measured by the absolute error of the estimation compared to known actual results. Such absolute error based metrics can often result in over-designed estimation models, with a number of researchers suggesting that the fidelity of an estimation model (correlation between the ordering of the estimated points and the ordering of the actual points) should be examined instead of, or in addition to, the absolute error. In this paper, for the first time, we...

10.1109/iccad.2010.5653959 article EN IEEE/ACM International Conference on Computer-Aided Design (ICCAD) 2010-11-01

Rapid runtime estimation methods for pipelined MPSoCs

Haris Javaid Andhi Janapsatya Mohammad Shihabul Haque Sri Parameswaran

The pipelined Multiprocessor System on Chip (MPSoC) paradigm is well suited to the data flow nature of streaming applications. A pipelined MPSoC is a system where processing elements (PEs) are connected in a pipeline. Each PE is implemented using one of a number of processor configurations (configurations differ by instruction sets and cache sizes) available for that PE. The goal is to select a pipelined MPSoC with a mapping of a processor configuration to every PE. To estimate the run-time of a pipelined MPSoC, designers typically perform cycle-accurate simulation of the whole system. Since...

10.1109/date.2010.5457178 article EN Design, Automation & Test in Europe Conference & Exhibition (DATE) 2010-03-01

Multi-ASIP based parallel and scalable implementation of motion estimation kernel for high definition videos

Hong Chinh Doan Haris Javaid Sri Parameswaran

Parallel implementations of motion estimation for high definition videos typically exploit various forms of parallelism (GOP-, frame-, slice- and macroblock-level) to deliver real-time throughput. Although parallel implementations deliver real-time throughput, they often suffer from limited flexibility and scalability due to the form of parallelism and architecture used. In this work, we use Group Of MacroBlocks (GOMB) and Intra-MB (IMB) parallelism with a multi-ASIP (Application Specific Instruction set Processor) architecture to provide a flexible and scalable platform for motion estimation of high definition videos. Multiple...

10.1109/estimedia.2011.6088526 article EN 2011-10-01

Efficient FPGA-based ECDSA Verification Engine for Permissioned Blockchains

Rashmi Agrawal Ji Yang Haris Javaid

Permissioned blockchain platforms heavily depend on cryptography to provide a layer of trust within the network, thus verification of cryptographic signatures often becomes the bottleneck. ECDSA is the most commonly used cryptographic scheme in permissioned blockchains. In this work, we propose an efficient implementation of ECDSA signature verification on an FPGA, in order to improve the performance of permissioned blockchains that aim to use FPGA-based hardware accelerators. We propose several optimizations for modular arithmetic (e.g., custom multipliers and fast reduction)...

10.1109/asap54787.2022.00032 article EN 2022-07-01

Improving Energy Efficiency of Permissioned Blockchains Using FPGAs

Nathania Santoso Haris Javaid

Permissioned blockchains like Hyperledger Fabric have become quite popular for implementation of enterprise applications. Recent research has mainly focused on improving the performance of permissioned blockchains without any consideration of their power/energy consumption. In this paper, we conduct a comprehensive empirical study to understand the energy efficiency (throughput/energy) of the validator peer in Hyperledger Fabric (a major bottleneck node). We pick a number of optimizations from the literature (allocated CPUs, software block cache and...

10.1109/icpads56603.2022.00031 article EN 2023-01-01