Giorgos Dimitrakopoulos

ORCID: 0000-0003-3688-7865
Research Areas
  • Interconnection Networks and Systems
  • Parallel Computing and Optimization Techniques
  • Low-power high-performance VLSI design
  • Embedded Systems Design Techniques
  • VLSI and Analog Circuit Testing
  • Advanced Memory and Neural Computing
  • VLSI and FPGA Design Techniques
  • Radiation Effects in Electronics
  • Supercapacitor Materials and Fabrication
  • Quantum-Dot Cellular Automata
  • Ferroelectric and Negative Capacitance Devices
  • Software-Defined Networks and 5G
  • Advancements in Semiconductor Devices and Circuit Design
  • Advanced Optical Network Technologies
  • Advanced MIMO Systems Optimization
  • Coding theory and cryptography
  • Advancements in Battery Materials
  • Advanced Neural Network Applications
  • Power Line Communications and Noise
  • Graphene research and applications
  • Cellular Automata and Applications
  • Advanced Bandit Algorithms Research
  • Advanced Graph Neural Networks
  • Manufacturing Process and Optimization
  • Cryptography and Residue Arithmetic

Democritus University of Thrace
2016-2025

University of Patras
2003-2018

University of Western Macedonia
2010-2012

Foundation for Research and Technology Hellas
2008-2009

National Technical University of Athens
1997-2005

Research Academic Computer Technology Institute
2004-2005

National Polytechnic School
2003

Manufacturing, through the Industry 4.0 concept, is moving to the next phase of digitalization, supported by innovative technologies such as the Internet of Things, Cloud technology, and Augmented and Virtual Reality. These technologies will also play an important role in manufacturing education, supporting advanced life-long training of a skilled workforce. Advanced networked ecosystems, called Education 4.0, develop skills and build competences for the new era of manufacturing. Towards that, this work presents how the adoption of cyber-physical...

10.1016/j.promfg.2018.04.005 article EN Procedia Manufacturing 2018-01-01

Parallel-prefix adders offer a highly efficient solution to the binary addition problem and are well suited for VLSI implementations. A novel framework is introduced, which allows the design of parallel-prefix Ling adders. The proposed approach saves one logic level in the implementation compared to structures based on the traditional definition of the carry-lookahead equations, and reduces the fanout requirements of the design. Experimental results reveal that the proposed adders achieve delay reductions of up to 14 percent compared to the fastest architectures presented for the traditional carry equations.

10.1109/tc.2005.26 article EN IEEE Transactions on Computers 2005-01-12
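
As a concrete reference for the Ling reformulation above, the following Python sketch models the carry recurrences behaviorally and checks them against integer addition. It is a bit-serial model for clarity, not the parallel-prefix circuit from the paper; the bit width and helper names are illustrative assumptions.

```python
# Behavioral sketch (not the paper's circuit) of the Ling-carry recurrence,
# checked against plain integer addition. The carries are unrolled serially
# for clarity; a parallel-prefix tree would compute the same H terms.

def bits(x, n):
    """n least-significant bits of x, LSB first."""
    return [(x >> i) & 1 for i in range(n)]

def ling_add(a, b, n=8):
    ai, bi = bits(a, n), bits(b, n)
    g = [x & y for x, y in zip(ai, bi)]   # generate
    p = [x ^ y for x, y in zip(ai, bi)]   # propagate (used for the sum bits)
    t = [x | y for x, y in zip(ai, bi)]   # transfer (used by the Ling recurrence)

    H, c = [0] * n, [0] * n
    for i in range(n):
        link = t[i - 1] & H[i - 1] if i > 0 else 0
        H[i] = g[i] | link                # Ling pseudo-carry: H_i = g_i + t_{i-1}H_{i-1}
        c[i] = t[i] & H[i]                # real carry recovered as c_i = t_i * H_i

    s = [p[0]] + [p[i] ^ c[i - 1] for i in range(1, n)]
    return sum(bit << i for i, bit in enumerate(s)) | (c[n - 1] << n)

# Exhaustive check for 8-bit operands.
assert all(ling_add(a, b) == a + b for a in range(256) for b in range(256))
```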

Structured sparsity has been proposed as an efficient way to prune the complexity of Machine Learning (ML) applications and to simplify the handling of sparse data in hardware. Accelerating ML models, whether for training or inference, heavily relies on matrix multiplications that can be efficiently executed on vector processors or custom engines. This work aims to integrate the simplicity of structured sparsity into vector execution to speed up the corresponding matrix multiplications. Initially, the implementation of structured-sparse multiplication...

10.48550/arxiv.2501.10189 preprint EN arXiv (Cornell University) 2025-01-17
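
A minimal NumPy sketch of the data layout behind 2:4 structured sparsity, assuming the common "keep 2 nonzeros out of every 4 weights" pattern; the compression format, function names, and gather-based multiply are illustrative and do not model the paper's vector-processor integration.

```python
# Minimal NumPy sketch of 2:4 structured sparsity: every group of 4 weights
# keeps at most 2 nonzeros, stored as (values, indices). This models the data
# layout and the gather-style multiply only, not the paper's vector execution.
import numpy as np

def compress_2_4(w):
    """Keep the 2 largest-magnitude weights in every group of 4 columns."""
    rows, cols = w.shape
    groups = w.reshape(rows, cols // 4, 4)
    idx = np.argsort(-np.abs(groups), axis=2)[:, :, :2]   # kept positions per group
    vals = np.take_along_axis(groups, idx, axis=2)        # kept values per group
    return vals, idx

def sparse_matvec(vals, idx, x):
    """y = W_sparse @ x, gathering only the activations the kept weights need."""
    rows, ngroups, _ = vals.shape
    xg = x.reshape(ngroups, 4)
    gathered = xg[np.arange(ngroups)[None, :, None], idx]  # shape (rows, ngroups, 2)
    return (vals * gathered).sum(axis=(1, 2))

rng = np.random.default_rng(0)
w, x = rng.standard_normal((8, 16)), rng.standard_normal(16)
vals, idx = compress_2_4(w)

# The compressed product matches the dense product of the pruned weight matrix.
w_pruned = np.zeros((8, 4, 4))
np.put_along_axis(w_pruned, idx, vals, axis=2)
assert np.allclose(sparse_matvec(vals, idx, x), w_pruned.reshape(8, 16) @ x)
```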

In this work, we propose a new algorithm for designing diminished-1 modulo 2^n + 1 multipliers. The implementation of the proposed algorithm requires n + 3 partial products that are reduced by a tree architecture into two summands, which are finally added by a diminished-1 modulo 2^n + 1 adder. The proposed multipliers, compared to existing implementations, offer enhanced operation speed, and their regular structure allows efficient VLSI implementations.

10.1109/tc.2005.63 article EN IEEE Transactions on Computers 2005-03-07
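
The sketch below illustrates the arithmetic such multipliers rely on, under the usual diminished-1 conventions: operands are stored as the value minus one, and the congruence 2^n = -1 (mod 2^n + 1) folds high-order bits back with alternating sign. It is a reference model with illustrative names, not the proposed partial-product tree.

```python
# Sketch of the arithmetic behind diminished-1 modulo (2**n + 1) multipliers.
# It models two facts used by such designs, not the paper's circuit: (1) an
# operand A in 1..2**n is stored as A - 1 in n bits; (2) since 2**n is congruent
# to -1 modulo 2**n + 1, bit groups above weight n fold back with alternating
# sign, which is how partial products are reduced to a small number of summands.

def fold_mod(x, n):
    """Reduce x modulo 2**n + 1 by folding n-bit chunks with alternating sign."""
    m, acc, sign = (1 << n) + 1, 0, 1
    while x:
        acc += sign * (x & ((1 << n) - 1))
        x >>= n
        sign = -sign
    return acc % m

def dim1_mul(a_d1, b_d1, n):
    """Reference diminished-1 multiply: inputs/outputs are the value minus one."""
    a, b = a_d1 + 1, b_d1 + 1
    p = fold_mod(a * b, n)
    assert p != 0, "a zero residue needs a separate flag in diminished-1 designs"
    return p - 1

# Check against ordinary modular arithmetic for n = 4 (modulus 17).
n, m = 4, 17
assert all(dim1_mul(a - 1, b - 1, n) == (a * b) % m - 1
           for a in range(1, m) for b in range(1, m) if (a * b) % m != 0)
```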

Two architectures for modulo 2^n + 1 adders are introduced in this paper. The first one is built around a sparse carry computation unit that computes only some of the carries of the addition. This approach is enabled by the introduction of the inverted circular idempotency property of the parallel-prefix carry operator, and its regularity and area efficiency are further enhanced by a new prefix operator. The resulting diminished-1 adders can be...

10.1109/tc.2010.261 article EN IEEE Transactions on Computers 2010-12-22
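
For context, a behavioral model of diminished-1 modulo 2^n + 1 addition using the standard inverted end-around-carry rule; the paper's contribution concerns how the carries are computed in hardware (sparsity, new prefix operators), which this sketch does not capture.

```python
# Behavioral model of diminished-1 modulo (2**n + 1) addition, assuming the
# standard rule "add the inverted end-around carry". The zero result, which
# diminished-1 designs flag separately, is skipped in the check below.

def dim1_add(a_d1, b_d1, n):
    """Diminished-1 sum of A = a_d1 + 1 and B = b_d1 + 1 modulo 2**n + 1."""
    t = a_d1 + b_d1
    carry_out = t >> n                      # carry of the n-bit addition
    return (t + (1 - carry_out)) & ((1 << n) - 1)

# Exhaustive check for n = 4 (modulus 17).
n, m = 4, 17
for a in range(1, m):
    for b in range(1, m):
        s = (a + b) % m
        if s != 0:
            assert dim1_add(a - 1, b - 1, n) == s - 1
```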

In the last decade, Teaching Factories, which enable a two-way knowledge transfer in manufacturing education, have been built up by industry and academia. Such initiatives, from the local to the worldwide level, help both parties mutually benefit. This paper introduces a framework for the delivery of industrial learning and training to young engineers, creating at the same time the prerequisites for SMEs to explore new technologies through the Teaching Factory paradigm. In particular, within this framework, participants will be receivers of valuable knowledge, able...

10.1016/j.promfg.2018.04.014 article EN Procedia Manufacturing 2018-01-01

The need for efficient implementation of simple crossbar schedulers has increased in recent years due to the advent of on-chip interconnection networks that require low-latency message delivery. The core function of any scheduler is arbitration, which resolves conflicting requests for the same output. Since the delay of the arbiters directly determines the operation speed of the scheduler, the design of faster arbiters is of paramount importance. In this paper, we present a new bit-level algorithm and circuit techniques for programmable priority arbiters that offer...

10.1109/iccd.2008.4751932 article EN 2008-10-01
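
A software model of programmable-priority arbitration, assuming a round-robin-style pointer and the classic "duplicate the request vector" search; only the function an arbiter implements is modeled here, not the bit-level circuit techniques of the paper.

```python
# Software model of programmable-priority arbitration. The requester at index
# `ptr` has the highest priority and the search proceeds circularly from there.

def arbitrate(req, ptr, n):
    """Return a one-hot grant for n requesters, or 0 if nobody requests."""
    if req == 0:
        return 0
    doubled = req | (req << n)        # unrolled circular order
    shifted = doubled >> ptr          # the priority position moves to bit 0
    first = shifted & -shifted        # isolate the least-significant set bit
    g = first << ptr                  # move the winner back to its position
    return (g | (g >> n)) & ((1 << n) - 1)

# Round-robin usage: the pointer advances to just past the last winner.
n, req, ptr = 4, 0b1011, 0
for _ in range(3):
    grant = arbitrate(req, ptr, n)
    ptr = grant.bit_length() % n
    print(f"grant={grant:04b} next_ptr={ptr}")
```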

In this paper, a new leading-zero counter (or detector) is presented. New Boolean relations for the bits of the count are derived that allow their computation to be performed using standard carry-lookahead techniques. Using the proposed approach, various design choices can be explored, leading to different circuit topologies for the counting unit. The resulting circuits can be efficiently implemented either in static or dynamic logic and require significantly less energy per operation compared to already known architectures. Their integration with...

10.1109/tvlsi.2008.2000458 article EN IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2008-07-01
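
To pin down the function being implemented, here is a behavioral tree-structured leading-zero counter in Python; the paper's point is that the count bits admit carry-lookahead-style Boolean relations, which this recursive model does not exhibit.

```python
# Behavioral tree-structured leading-zero counter for power-of-two widths.

def lzc(bits_msb_first):
    """Return (all_zero, count) for a bit list given MSB first."""
    if len(bits_msb_first) == 1:
        return bits_msb_first[0] == 0, 0
    half = len(bits_msb_first) // 2
    hi_zero, hi_cnt = lzc(bits_msb_first[:half])
    lo_zero, lo_cnt = lzc(bits_msb_first[half:])
    if not hi_zero:
        return False, hi_cnt              # leading one lies in the upper half
    return lo_zero, half + lo_cnt         # upper half all zero: add its width

def lzc_int(x, width=8):
    msb_first = [(x >> (width - 1 - i)) & 1 for i in range(width)]
    zero, cnt = lzc(msb_first)
    return width if zero else cnt

assert all(lzc_int(x) == 8 - x.bit_length() for x in range(1, 256))
assert lzc_int(0) == 8
```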

Scalable Network-on-Chip (NoC) architectures should achieve high-throughput and low-latency operation without exceeding the stringent area/energy constraints of modern Systems-on-Chip (SoC), even when operating at a high clock frequency. Such requirements directly impact the NoC routers and interfaces comprising the architecture. This paper focuses on router micro-architecture and presents ShortPath, a pipelined router architecture that enables high-speed implementations by parallelizing as much as possible - resorting...

10.1109/tc.2016.2519916 article EN IEEE Transactions on Computers 2016-01-20

Machine learning adoption has seen a widespread bloom in recent years, with neural network implementations being at the forefront. In light of these developments, vector processors are currently experiencing a resurgence of interest, due to their inherent amenability to accelerating the data-parallel algorithms required in machine learning environments. In this paper, we propose a scalable and high-performance RISC-V vector processor core. The presented design employs a triptych of novel mechanisms that work synergistically to achieve the desired...

10.1109/iscas45731.2020.9181071 article EN 2020 IEEE International Symposium on Circuits and Systems (ISCAS) 2020-09-29

Approximate computation has evolved recently as a viable alternative for maximizing energy efficiency. One aspect of approximate computing involves the design of hardware units that return a sufficiently accurate result for the examined occasion, rather than an exact result. As long as units are allowed to compute approximately, they can be designed in multiple new ways. In this work, we focus on the synthesis of approximate parallel-prefix adders. Instead of exploring specific architectures, as done by state-of-the-art approaches,...

10.1109/tvlsi.2023.3287631 article EN IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2023-06-30
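
As a generic illustration of approximate addition, and not the synthesis flow proposed in the paper, the sketch below speculates the carry into the upper half from only a few bits below the split and measures how often the speculation fails; the split point and window width are arbitrary assumptions.

```python
# Generic carry-speculation approximate adder: the carry entering the upper
# bits is guessed from the k bits just below the split, which shortens the
# carry chain at the cost of rare errors.
import random

def approx_add(a, b, n=16, split=8, k=4):
    mask_lo = (1 << split) - 1
    a_lo, b_lo = a & mask_lo, b & mask_lo
    window = (1 << k) - 1
    spec = ((a_lo >> (split - k)) & window) + ((b_lo >> (split - k)) & window)
    carry = spec >> k                               # speculated carry into `split`
    hi = ((a >> split) + (b >> split) + carry) << split
    lo = (a_lo + b_lo) & mask_lo
    return (hi | lo) & ((1 << (n + 1)) - 1)

# Estimate how often the shortened carry chain produces a wrong sum.
random.seed(0)
trials = 10_000
errors = sum(approx_add(a, b) != a + b
             for a, b in ((random.getrandbits(16), random.getrandbits(16))
                          for _ in range(trials)))
print(f"error rate ~ {errors / trials:.2%}")
```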

The acceleration of deep-learning kernels in hardware relies on matrix multiplications that are executed efficiently on Systolic Arrays (SA). To effectively trade off training/inference quality with cost, SA accelerators employ reduced-precision Floating-Point (FP) arithmetic. In this work, we demonstrate the need for new pipeline organizations to reduce the latency and improve the energy efficiency of the FP operators used in the chained multiply-add operation imposed by the structure of the SA. The proposed skewed design reorganizes...

10.1109/aicas57966.2023.10168556 preprint EN 2022 IEEE 4th International Conference on Artificial Intelligence Circuits and Systems (AICAS) 2023-06-11
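
The following sketch only illustrates the chained reduced-precision multiply-add that a systolic-array column performs, with IEEE half precision (via NumPy) standing in for the reduced-precision FP operators; the paper's skewed pipeline organization is not modeled.

```python
# Numerical sketch of the chained multiply-add along a systolic-array column:
# each processing element performs one rounded multiply-add per step.
import numpy as np

def chained_mac(a, b, dtype=np.float16):
    """Accumulate sum(a[i]*b[i]) the way a column of PEs would."""
    acc = dtype(0)
    for x, y in zip(a.astype(dtype), b.astype(dtype)):
        acc = dtype(acc + x * y)          # one PE: multiply, add, round to dtype
    return float(acc)

rng = np.random.default_rng(1)
a, b = rng.standard_normal(256), rng.standard_normal(256)
print(f"fp64 reference    : {np.dot(a, b):+.6f}")
print(f"fp16 chained MACs : {chained_mac(a, b):+.6f}")
```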

Large systems-on-chip (SoCs) and chip multiprocessors (CMPs), incorporating tens to hundreds of cores, create a significant integration challenge. Interconnecting a huge number of architectural modules in an efficient manner calls for scalable solutions that would offer both high-throughput and low-latency communication. Switches are the basic building blocks of such interconnection networks, and their design critically affects the performance of the whole system. So far, innovation in switch design has relied mostly...

10.1109/tc.2012.116 article EN IEEE Transactions on Computers 2012-06-05

The efficiency of modern Networks-on-Chip (NoC) is no longer judged solely by their physical scalability, but also by their ability to deliver high performance, Quality-of-Service (QoS), and flow isolation at the minimum possible cost. Although traditional architectures supporting Virtual Channels (VC) offer the resources for partitioning and isolation, an adversarial workload can still interfere with and degrade the performance of other workloads that are active in a different set of VCs. In this paper, we present PhaseNoC,...

10.7873/date.2015.0418 article EN Design, Automation &amp; Test in Europe Conference &amp; Exhibition (DATE), 2015 2015-01-01

As multicore systems transition to the many-core realm, the pressure on the interconnection network is substantially elevated. The network-on-chip (NoC) is expected to undertake the expanding demands of the ever-increasing numbers of processing elements, while its area/power footprint remains severely constrained. Hence, low-cost NoC designs that achieve high-throughput and low-latency operation are imperative for future scalability. While the buffers of the routers are key enablers of high performance, they are also major consumers of area and power. In...

10.1109/tvlsi.2014.2383442 article EN IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2015-01-09

The need for higher throughput and lower communication latency in modern networks-on-chip (NoC) has led to low-diameter, high-radix topologies that exploit the speed provided by on-chip wires - after appropriate wire engineering - to transfer flits over longer distances in a single clock cycle. In this paper, motivated by the same principle of fast link traversal, we propose the RapidLink NoC architecture, which exploits said wires to transfer flits rapidly between adjacent routers using double-data-rate (DDR) traversals. RapidLink is enhanced with...

10.1109/tcsi.2017.2734689 article EN IEEE Transactions on Circuits and Systems I Regular Papers 2017-08-14

The efficiency of modern Networks-on-Chip (NoC) is no longer judged solely by their physical scalability, but also by their ability to deliver high performance, Quality-of-Service (QoS), and flow isolation at the minimum possible cost. Although traditional architectures supporting Virtual Channels (VC) offer the resources for partitioning and isolation, an adversarial workload can still interfere with and degrade the performance of other workloads that are active in a different set of VCs. In this paper, we present PhaseNoC,...

10.5555/2755753.2757066 article EN Design, Automation, and Test in Europe 2015-03-09

Convolution is one of the most critical operations in various application domains, and its computation should combine high performance with energy efficiency. This requirement holds both for standard convolution and for other spatial variants, such as dilated, strided, or transposed convolutions. In this work, we focus on the design of a streaming engine, called LazyDCstream, that is tuned for dilated convolution. LazyDCstream utilizes a sliding-window architecture for input data reuse and leverages the already-known decomposition...

10.1109/tvlsi.2022.3233882 article EN IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2023-01-09
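
A small NumPy sketch of the known decomposition referred to above, assuming a 1-D "valid" convolution: a dilated convolution equals several standard convolutions over subsampled phases of the input, interleaved back. Shapes and names are illustrative and do not represent LazyDCstream's streaming datapath.

```python
# Dilated convolution via d standard convolutions on the d input phases.
import numpy as np

def dilated_conv1d(x, w, d):
    """Direct 'valid' dilated convolution (cross-correlation form)."""
    k = len(w)
    out_len = len(x) - d * (k - 1)
    return np.array([np.dot(w, x[i:i + d * k:d]) for i in range(out_len)])

def dilated_via_phases(x, w, d):
    """Same result using d standard convolutions on subsampled inputs."""
    k = len(w)
    y = np.empty(len(x) - d * (k - 1))
    for p in range(d):
        phase = x[p::d]                                   # subsampled input
        conv = np.correlate(phase, w, mode="valid")       # standard convolution
        y[p::d] = conv[:len(y[p::d])]                     # interleave back
    return y

rng = np.random.default_rng(2)
x, w, d = rng.standard_normal(64), rng.standard_normal(3), 2
assert np.allclose(dilated_conv1d(x, w, d), dilated_via_phases(x, w, d))
```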

Two architectures for parallel-prefix modulo 2^n - 1 adders are presented in this paper. For large wordlengths, we introduce sparse adders that achieve a significant reduction of the wiring complexity without imposing any delay penalty. Then, a Ling-carry formulation of the addition is presented. Ling adders save one logic level in the implementation and provide high-speed solutions for smaller adder widths, where the wiring overhead remains small. The...

10.1109/icecs.2005.4633502 article EN 2005-12-01
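
For reference, a behavioral model of modulo 2^n - 1 addition with an end-around carry, the function that the sparse and Ling parallel-prefix architectures implement; the prefix structure itself is not modeled.

```python
# Behavioral model of modulo (2**n - 1) addition with an end-around carry;
# the double representation of zero (0 and 2**n - 1) is kept as-is.

def mod_2n_minus_1_add(a, b, n):
    t = a + b
    t = (t & ((1 << n) - 1)) + (t >> n)     # wrap the carry-out back to bit 0
    return t & ((1 << n) - 1)

# Exhaustive check for n = 4 (modulus 15); 0 and 15 both denote zero.
n, mod = 4, 15
for a in range(1 << n):
    for b in range(1 << n):
        assert mod_2n_minus_1_add(a, b, n) % mod == (a + b) % mod
```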