- Interconnection Networks and Systems
- Parallel Computing and Optimization Techniques
- Low-power high-performance VLSI design
- Embedded Systems Design Techniques
- VLSI and Analog Circuit Testing
- Advanced Memory and Neural Computing
- VLSI and FPGA Design Techniques
- Radiation Effects in Electronics
- Supercapacitor Materials and Fabrication
- Quantum-Dot Cellular Automata
- Ferroelectric and Negative Capacitance Devices
- Software-Defined Networks and 5G
- Advancements in Semiconductor Devices and Circuit Design
- Advanced Optical Network Technologies
- Advanced MIMO Systems Optimization
- Coding theory and cryptography
- Advancements in Battery Materials
- Advanced Neural Network Applications
- Power Line Communications and Noise
- Graphene research and applications
- Cellular Automata and Applications
- Advanced Bandit Algorithms Research
- Advanced Graph Neural Networks
- Manufacturing Process and Optimization
- Cryptography and Residue Arithmetic
Democritus University of Thrace
2016-2025
University of Patras
2003-2018
University of Western Macedonia
2010-2012
Foundation for Research and Technology Hellas
2008-2009
National Technical University of Athens
1997-2005
Research Academic Computer Technology Institute
2004-2005
National Polytechnic School
2003
Manufacturing, through the Industry 4.0 concept, is moving to the next phase of digitalization. Supported by innovative technologies such as the Internet of Things, cloud technology, and Augmented and Virtual Reality, this transition will also play an important role in manufacturing education, supporting advanced life-long training of a skilled workforce. Advanced, networked learning ecosystems, called Education 4.0, develop skills and build competences for the new era of manufacturing. Towards that, this work presents how the adoption of cyber-physical...
Parallel-prefix adders offer a highly efficient solution to the binary addition problem and are well-suited for VLSI implementations. A novel framework is introduced, which allows the design of parallel-prefix Ling adders. The proposed approach saves one logic level of implementation compared to structures based on the traditional definition of the carry-lookahead equations and reduces the fanout requirements of the design. Experimental results reveal that delay reductions of up to 14 percent are achieved when the fastest architectures are implemented with the presented equations.
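As a rough illustration of the Ling recurrence this line of work builds on (a minimal bit-level sketch with hypothetical function names, not the paper's framework), the snippet below checks that the Ling pseudo-carry h[i] = g[i] | (t[i-1] & h[i-1]), with the real carry recovered as c[i] = t[i] & h[i], produces the same sum as the conventional carry recurrence c[i] = g[i] | (t[i] & c[i-1]).

```python
# Sketch only: compares the classic carry recurrence with the Ling
# pseudo-carry formulation for binary addition (g = generate, p = XOR
# propagate, t = OR propagate/transfer).

def add_via_carries(a, b, n):
    """Reference addition using the conventional carry recurrence."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    t = [((a >> i) | (b >> i)) & 1 for i in range(n)]
    c, s = 0, 0
    for i in range(n):
        s |= (p[i] ^ c) << i           # sum bit uses the incoming carry
        c = g[i] | (t[i] & c)          # carry out of bit i
    return s | (c << n)

def add_via_ling(a, b, n):
    """Same addition computed through Ling pseudo-carries h[i]."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    t = [((a >> i) | (b >> i)) & 1 for i in range(n)]
    h, c_in, s = 0, 0, 0
    for i in range(n):
        s |= (p[i] ^ c_in) << i
        h = g[i] | ((t[i - 1] if i > 0 else 0) & h)  # Ling recurrence
        c_in = t[i] & h                # real carry recovered from h
    return s | (c_in << n)

if __name__ == "__main__":
    import random
    for _ in range(1000):
        a, b = random.getrandbits(16), random.getrandbits(16)
        assert add_via_carries(a, b, 16) == add_via_ling(a, b, 16) == a + b
```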
Structured sparsity has been proposed as an efficient way to prune the complexity of Machine Learning (ML) applications and to simplify the handling of sparse data in hardware. Accelerating ML models, whether for training or inference, relies heavily on matrix multiplications that can be efficiently executed on vector processors or custom engines. This work aims to integrate the simplicity of structured sparsity into the execution flow to speed up the corresponding matrix multiplications. Initially, the implementation of structured-sparse matrix multiplication...
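For intuition only, the sketch below shows one common structured-sparsity pattern (2:4, i.e. at most two non-zeros per group of four weights) stored as values plus in-group indices, so a dot product touches only the kept positions. The 2:4 choice and all function names are assumptions for the example, not the format or hardware proposed in this work.

```python
# Illustrative 2:4 structured sparsity: compress each group of 4 weights to
# its 2 largest-magnitude entries and multiply using only those positions.
import numpy as np

def compress_2_4(w):
    """Keep the 2 largest-magnitude entries of every group of 4 weights."""
    vals, idxs = [], []
    for g in range(0, len(w), 4):
        group = w[g:g + 4]
        keep = sorted(int(k) for k in np.argsort(np.abs(group))[-2:])
        idxs.append(keep)
        vals.append([group[k] for k in keep])
    return np.array(vals), np.array(idxs)

def sparse_dot(vals, idxs, x):
    """Dot product that reads only the activations selected by the indices."""
    acc = 0.0
    for g, (v, k) in enumerate(zip(vals, idxs)):
        acc += v[0] * x[4 * g + k[0]] + v[1] * x[4 * g + k[1]]
    return acc

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w, x = rng.standard_normal(16), rng.standard_normal(16)
    vals, idxs = compress_2_4(w)
    # The compressed product matches the dense product of the pruned weights.
    w_pruned = np.zeros_like(w)
    for g in range(4):
        w_pruned[4 * g + idxs[g][0]] = vals[g][0]
        w_pruned[4 * g + idxs[g][1]] = vals[g][1]
    assert np.isclose(sparse_dot(vals, idxs, x), w_pruned @ x)
```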
In this work, we propose a new algorithm for designing diminished-1 modulo 2^n + 1 multipliers. The implementation of the proposed algorithm requires n + 3 partial products that are reduced by a tree architecture into two summands, which are finally added by a diminished-1 modulo 2^n + 1 adder. The proposed multipliers, compared to existing implementations, offer enhanced operation speed, and their regular structure allows efficient VLSI implementations.
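To make the number representation concrete, here is a small arithmetic reference model of diminished-1 modulo 2^n + 1 multiplication (the representation such multipliers assume, not the partial-product algorithm proposed above; function names are illustrative and the special zero encoding is left out).

```python
# Diminished-1 representation: a non-zero operand A in [1, 2**n] is stored
# as d = A - 1 using n bits; products are reduced modulo 2**n + 1.

def to_dim1(a, n):
    assert 1 <= a <= 2 ** n, "zero operands usually need a separate flag"
    return a - 1

def from_dim1(d, n):
    return d + 1

def dim1_mul(da, db, n):
    """Multiply two diminished-1 operands, returning a diminished-1 result."""
    m = 2 ** n + 1
    prod = ((da + 1) * (db + 1)) % m     # true product modulo 2**n + 1
    assert prod != 0, "a result of 0 mod 2**n+1 needs the special zero encoding"
    return prod - 1

if __name__ == "__main__":
    n = 8
    for a, b in [(3, 200), (255, 256), (17, 89)]:
        d = dim1_mul(to_dim1(a, n), to_dim1(b, n), n)
        assert from_dim1(d, n) == (a * b) % (2 ** n + 1)
```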
Two architectures for modulo 2^n + 1 adders are introduced in this paper. The first one is built around a sparse carry computation unit that computes only some of the carries of the addition. This approach is enabled by the introduction of the inverted circular idempotency property of the parallel-prefix carry operator, and its regularity and area efficiency are further enhanced by a new prefix operator. The resulting diminished-1 adders can be...
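A behavioural reference of the arithmetic these adders implement may help: diminished-1 modulo 2^n + 1 addition adds the two operands and feeds the complemented carry-out back in. This is only the functional rule, not the sparse prefix architecture of the paper, and the function name is hypothetical.

```python
# Diminished-1 modulo 2**n + 1 addition with an inverted end-around carry.
# Results equal to 0 mod 2**n + 1 need the usual special zero encoding and
# are skipped in the check below.

def dim1_add(da, db, n):
    mask = (1 << n) - 1
    t = da + db
    c_out = t >> n                      # carry out of the n-bit addition
    return (t + (1 - c_out)) & mask     # complemented carry fed back in

if __name__ == "__main__":
    n, m = 8, 2 ** 8 + 1
    for a in range(1, m):
        for b in range(1, m):
            if (a + b) % m == 0:
                continue                # zero result: special encoding
            assert dim1_add(a - 1, b - 1, n) == (a + b) % m - 1
```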
In the last decade, Teaching Factories, which enable a two-way knowledge transfer in manufacturing education, have been built up by industry and academia. Such initiatives, from the local to the worldwide level, help both parties mutually benefit. This paper introduces a framework for the delivery of industrial learning and the training of young engineers, creating at the same time the prerequisites for SMEs to explore new technologies through the Teaching Factory paradigm. In particular, in this framework, participants will be receivers of valuable knowledge and will be able...
The need for efficient implementation of simple crossbar schedulers has increased in recent years due to the advent of on-chip interconnection networks that require low-latency message delivery. The core function of any scheduler is arbitration, which resolves conflicting requests for the same output. Since the delay of the arbiters directly determines the operation speed of the scheduler, the design of faster arbiters is of paramount importance. In this paper, we present a new bit-level algorithm and circuit techniques for programmable priority arbiters that offer...
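As a behavioural sketch of what a programmable-priority arbiter does (not the bit-level circuit technique presented in the paper; the round-robin masking scheme and function names are assumptions), the request vector is masked from the priority pointer upwards, and if that region is empty the arbiter wraps around; the lowest set bit of the chosen region wins.

```python
# Behavioural model of a programmable-priority (round-robin style) arbiter.

def fixed_priority_grant(req):
    """Grant the lowest-indexed request (req & -req isolates that bit)."""
    return req & -req

def programmable_priority_grant(req, ptr, n):
    """Grant the first request at or above position `ptr`, wrapping around."""
    upper = req & ~((1 << ptr) - 1)     # requests with index >= ptr
    region = upper if upper else req    # wrap around if none above the pointer
    return fixed_priority_grant(region) & ((1 << n) - 1)

if __name__ == "__main__":
    n = 8
    req = 0b0100_1010                   # requests at positions 1, 3, 6
    assert programmable_priority_grant(req, 0, n) == 1 << 1
    assert programmable_priority_grant(req, 2, n) == 1 << 3
    assert programmable_priority_grant(req, 7, n) == 1 << 1   # wrap-around
```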
In this paper, a new leading-zero counter (or detector) is presented. New Boolean relations for the bits of the count are derived that allow their computation to be performed using standard carry-lookahead techniques. Using the proposed approach, various design choices can be explored, leading to different circuit topologies for the counting unit. The proposed circuits are efficiently implemented either in static or in dynamic logic and require significantly less energy per operation compared to already known architectures. Their integration with...
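For reference, the snippet below models the leading-zero-count function itself with a simple recursive-halving scheme, where each level contributes one bit of the count depending on whether the upper half is all zeros. This is only a behavioural model with hypothetical names; the carry-lookahead-style Boolean relations derived in the paper are not reproduced.

```python
# Behavioural leading-zero counting, checked against a direct reference.

def lzc_reference(x, n):
    """Number of leading zeros of an n-bit value."""
    return n - x.bit_length()

def lzc_halving(x, n):
    """Recursive halving: one count bit is decided per level."""
    assert n & (n - 1) == 0, "power-of-two width assumed for the sketch"
    if n == 1:
        return 1 - x
    half = n // 2
    upper = x >> half
    if upper == 0:                      # count bit = 1, search the lower half
        return half + lzc_halving(x & ((1 << half) - 1), half)
    return lzc_halving(upper, half)     # count bit = 0, search the upper half

if __name__ == "__main__":
    import random
    for _ in range(1000):
        v = random.getrandbits(32)
        assert lzc_halving(v, 32) == lzc_reference(v, 32)
```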
Scalable Network-on-Chip (NoC) architectures should achieve high-throughput and low-latency operation without exceeding the stringent area/energy constraints of modern Systems-on-Chip (SoC), even when operating at a high clock frequency. Such requirements directly impact the NoC routers and interfaces comprising the architecture. This paper focuses on router micro-architecture and presents ShortPath, a pipelined router architecture that enables high-speed implementations by parallelizing as much as possible - resorting...
Machine learning adoption has seen a widespread bloom in recent years, with neural network implementations being at the forefront. In light of these developments, vector processors are currently experiencing a resurgence of interest, due to their inherent amenability to accelerating the data-parallel algorithms required in machine learning environments. In this paper, we propose a scalable and high-performance RISC-V vector processor core. The presented core employs a triptych of novel mechanisms that work synergistically to achieve the desired...
Approximate computation has evolved recently as a viable alternative for maximizing energy efficiency. One aspect of approximate computing involves the design of hardware units that return a sufficiently accurate result for the examined occasion, rather than an exact result. As long as such units are allowed to compute approximately, they can be designed in multiple new ways. In this work, we focus on the synthesis of approximate parallel-prefix adders. Instead of exploring specific approximate architectures, as done by state-of-the-art approaches,...
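To ground the idea of trading accuracy for a shorter carry chain, here is one well-known approximation (a lower-part OR adder), shown purely as an example of an approximate adder; it is not the synthesis flow of this work, and the parameter names are illustrative.

```python
# Lower-part OR adder: the k least-significant bits are approximated with a
# bitwise OR and no carry enters the exact upper part, so the error stays
# confined to the approximated bits.
import random

def approx_add(a, b, n, k):
    mask = (1 << k) - 1
    low = (a | b) & mask                      # approximate lower part, no carries
    high = ((a >> k) + (b >> k)) << k         # exact upper part
    return (high | low) & ((1 << (n + 1)) - 1)

if __name__ == "__main__":
    n, k = 16, 4
    worst = 0
    for _ in range(10000):
        a, b = random.getrandbits(n), random.getrandbits(n)
        worst = max(worst, abs((a + b) - approx_add(a, b, n, k)))
    assert worst < (1 << (k + 1))             # error bounded by the cut point
    print("worst observed error:", worst)
```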
The acceleration of deep-learning kernels in hardware relies on matrix multiplications that are executed efficiently on Systolic Arrays (SA). To effectively trade off training/inference quality with hardware cost, SA accelerators employ reduced-precision Floating-Point (FP) arithmetic. In this work, we demonstrate the need for new pipeline organizations to reduce the latency and improve the energy efficiency of FP operators in the chained multiply-add operation imposed by the structure of the SA. The proposed skewed design reorganizes...
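The chained multiply-add structure referred to above can be pictured with a tiny functional model: each processing element adds its local product to the partial sum arriving from the previous element. This is only a behavioural sketch with assumed names, not the skewed pipeline organization proposed here.

```python
# Behavioural model of a systolic-array column as a chain of multiply-adds.

def systolic_column(weights, activations):
    """Dot product formed by chaining multiply-add PEs."""
    partial = 0.0
    for w, a in zip(weights, activations):
        partial = partial + w * a       # one fused multiply-add per PE
    return partial

def systolic_matmul(W, X):
    """Matrix product where every output element flows through one column."""
    return [[systolic_column(w_row, x_col) for x_col in zip(*X)] for w_row in W]

if __name__ == "__main__":
    W = [[1.0, 2.0], [3.0, 4.0]]
    X = [[5.0, 6.0], [7.0, 8.0]]
    assert systolic_matmul(W, X) == [[19.0, 22.0], [43.0, 50.0]]
```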
Large systems-on-chip (SoCs) and chip multiprocessors (CMPs), incorporating tens to hundreds of cores, create a significant integration challenge. Interconnecting a huge number of architectural modules in an efficient manner calls for scalable solutions that offer both high-throughput and low-latency communication. The switches are the basic building blocks of such interconnection networks, and their design critically affects the performance of the whole system. So far, innovation in switch design has relied mostly...
The efficiency of modern Networks-on-Chip (NoC) is no longer judged solely by their physical scalability, but also by their ability to deliver high performance, Quality-of-Service (QoS), and flow isolation at the minimum possible cost. Although traditional architectures supporting Virtual Channels (VCs) offer the resources for partitioning and isolation, an adversarial workload can still interfere with and degrade the performance of other workloads that are active in a different set of VCs. In this paper, we present PhaseNoC,...
As multicore systems transition to the many-core realm, the pressure on the interconnection network is substantially elevated. The network-on-chip (NoC) is expected to undertake the expanding demands of ever-increasing numbers of processing elements, while its area/power footprint remains severely constrained. Hence, low-cost NoC designs that achieve high-throughput and low-latency operation are imperative for future scalability. While the buffers of the routers are key enablers of high performance, they are also major consumers of area and power. In...
The need for higher throughput and lower communication latency in modern networks-on-chip (NoC) has led to low- and high-radix topologies that exploit the speed provided by on-chip wires, after appropriate wire engineering, to transfer flits over longer distances in a single clock cycle. In this paper, motivated by the same principle of fast link traversal, we propose the RapidLink NoC architecture, which exploits said fast wires to transfer flits rapidly between adjacent routers using double-data-rate (DDR) link traversals. RapidLink is enhanced with...
Convolution is one of the most critical operations in various application domains, and its computation should combine high performance with energy efficiency. This requirement holds both for standard convolution and for other spatial variants, such as dilated, strided, or transposed convolutions. In this work, we focus on the design of a streaming engine, called LazyDCstream, that is tuned for dilated convolution. LazyDCstream utilizes a sliding-window architecture for input data reuse and leverages the already-known decomposition...
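One widely known decomposition of dilated convolution (shown here in 1-D as a functional sketch, not the LazyDCstream datapath; all names are illustrative) splits the input into d phases, runs a standard convolution on each phase, and interleaves the results.

```python
# A 1-D convolution with dilation d equals d standard convolutions, one per
# input phase, with their outputs interleaved.

def dilated_conv1d(x, w, d):
    """Direct dilated convolution (valid mode)."""
    taps = (len(w) - 1) * d
    return [sum(w[k] * x[i + k * d] for k in range(len(w)))
            for i in range(len(x) - taps)]

def dilated_via_phases(x, w, d):
    """Same result computed as d standard convolutions on subsampled phases."""
    out = [0.0] * (len(x) - (len(w) - 1) * d)
    for p in range(d):
        phase = x[p::d]                             # every d-th sample, offset p
        for q in range(len(phase) - len(w) + 1):
            i = q * d + p
            if i < len(out):
                out[i] = sum(w[k] * phase[q + k] for k in range(len(w)))
    return out

if __name__ == "__main__":
    x = [float(v) for v in range(20)]
    w = [1.0, -2.0, 0.5]
    assert dilated_via_phases(x, w, 3) == dilated_conv1d(x, w, 3)
```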
Two architectures for parallel-prefix modulo 2^n - 1 adders are presented in this paper. For large wordlengths, we introduce sparse architectures that achieve a significant reduction of the wiring complexity without imposing any delay penalty. Then, a Ling-carry formulation of modulo 2^n - 1 addition is presented. Ling adders save one logic level of implementation and provide high-speed solutions for smaller adder widths, where the wiring complexity is small. The...
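As a reference for the arithmetic these architectures implement, modulo 2^n - 1 addition re-enters the carry-out at the least-significant position (end-around carry). The sketch below models only that rule, not the sparse or Ling prefix organizations of the paper, and the function name is hypothetical.

```python
# Modulo 2**n - 1 addition with an end-around carry.  Zero has the usual
# double representation: both 0 and 2**n - 1.

def mod_2n_minus_1_add(a, b, n):
    mask = (1 << n) - 1
    t = a + b
    return (t + (t >> n)) & mask        # carry out re-enters at the LSB

if __name__ == "__main__":
    n, m = 6, (1 << 6) - 1
    for a in range(1 << n):
        for b in range(1 << n):
            s = mod_2n_minus_1_add(a, b, n)
            assert s % m == (a + b) % m  # equal up to the double zero encoding
```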