- Semiconductor materials and devices
- Advanced Memory and Neural Computing
- Advancements in Semiconductor Devices and Circuit Design
- Radiation Effects in Electronics
- Low-power high-performance VLSI design
- Parallel Computing and Optimization Techniques
- Ferroelectric and Negative Capacitance Devices
- Integrated Circuits and Semiconductor Failure Analysis
- VLSI and Analog Circuit Testing
- Neural dynamics and brain function
- Neural Networks and Reservoir Computing
- Embedded Systems Design Techniques
- Interconnection Networks and Systems
- Real-Time Systems Scheduling
- CCD and CMOS Imaging Sensors
- Advanced Neural Network Applications
- Neuroscience and Neural Engineering
- Probabilistic and Robust Engineering Design
- Photoreceptor and optogenetics research
- ECG Monitoring and Analysis
- Distributed and Parallel Computing Systems
- Simulation Techniques and Applications
- Advanced Data Storage Technologies
- Adversarial Robustness in Machine Learning
- Real-time simulation and control systems
IMEC
2011-2019
National Technical University of Athens
2011-2018
KU Leuven
2011-2017
Institute of Communication and Computer Systems
2015
National and Kapodistrian University of Athens
2014
Carbon Nanotube Field-Effect Transistors (CNFETs) are highly promising to improve the energy efficiency of digital logic circuits. Here, we quantify the Very-Large-Scale Integrated (VLSI) circuit-level energy efficiency of CNFETs versus advanced technology options (ATOs) currently under consideration [e.g., silicon-germanium (SiGe) channels and progressing from today's FinFETs to gate-all-around nanowires/nanosheets]. We use industry-practice physical designs of VLSI processor cores in future nodes with millions...
Rapid advances in semiconductor fabrication technology have enabled the proliferation of miniaturized body-worn sensors capable of long-term pervasive biomedical signal monitoring. In this paper, we present a novel deep learning-based framework (BiometricNET) for biometric identification using data collected from wrist-worn Photoplethysmography (PPG) signals in ambulatory environments. We formulated a completely personalized, data-driven approach, a four-layer neural network employing two convolution...
In this paper, we present a design-technology tradeoff analysis to implement a fully connected neural network using non-volatile OxRRAM cells. The requirement of a high number of distinct levels in the synaptic weight has been established as the primary bottleneck for a single NVM unit. We propose a mixed-radix encoding system for a multi-device unit, achieving classification accuracy (94%) including device variability. To our knowledge, this is the first paper to discuss the tradeoff between design and technology with silicon data....
To enable energy-efficient embedded execution of Deep Neural Networks (DNNs), the critical sections of these workloads, their multiply-accumulate (MAC) operations, need to be carefully optimized. The state of the art (SotA) pursues this through run-time precision-scalable MAC operators, which can support the varying precision needs of DNNs in an energy-efficient way. Yet, to implement this adaptable operation, most solutions rely on separately optimized low-precision multipliers and a precision-variable accumulation scheme, with possible disadvantages...
Simulations of an inverter and a 32-bit SRAM bit slice are performed based on an atomistic approach. The circuits' devices are populated with individual defects, which have realistic carrier-capture and emission behaviour. The wide distribution of defect time scales accounts for both fast (Random Telegraph Noise, RTN) and near-permanent (Bias Temperature Instability, BTI) defects. This property of the model allows the detection of workload dependency in the delay of the circuits.
Bias Temperature Instability (BTI) is a major concern for the reliability of deca-nanometer to nanometer devices. Older modeling approaches fail to capture time-dependent device variability or maintain a crude view of the device's stress. Previously, a two-state atomistic model has been introduced, which is based on gate stack defect kinetics. Its complexity has been preventing its seamless integration in simulations of large device inventories over typical system lifetimes. In this paper, we present an approach that alleviates...
The advent of high-performance computing (HPC) in recent years has led to its increasing use in brain studies through computational models. The scale and complexity of such models are constantly increasing, leading to challenging requirements. Even though modern HPC platforms can often deal with such challenges, the vast diversity of the modeling field does not permit a homogeneous acceleration platform to effectively address the complete array of requirements. In this paper we propose and build BrainFrame, a heterogeneous acceleration platform that...
As technology nodes approach deca-nanometer dimensions, many phenomena threaten the binary correctness of processor operation. Computer architects typically enhance their designs with reliability, availability and serviceability (RAS) schemes to correct such errors, in many cases at the cost of extra clock cycles, which, in turn, leads to performance variability. The goal of the current paper is to absorb this variability using Dynamic Voltage and Frequency Scaling (DVFS). A closed-loop implementation is proposed, which...
In this paper, standard cell design for the iN7 CMOS platform technology, targeting the tightest contacted poly pitch (CPP) of 42 nm and a metal pitch of 32 nm in FinFET technology, is presented. Three standard cell architectures for iN7, a 7.5-Track library, a 6.5-Track library, and a 6-Track library, have been designed. Scaling boosters are introduced for the libraries progressively: first, an extra MOL layer to enable an efficient layout of the three libraries, starting with the 7.5-Track library; second, a fully self-aligned gate contact for the 6.5-Track and 6-Track libraries; and third, the 6-Track library includes a buried rail track for supply. The cells on...
Prior art on Bias Temperature Instability (BTI) and Random Telegraph Noise (RTN) shows their importance for digital system reliability. Reaction-diffusion models align poorly with deca-nanometer dimension experiments. Modern atomistic models capture time-zero and time-dependent effects but are complicated and constrained by memory. We propose an atomistic BTI/RTN transient simulator that can be massively threaded across any many-core platform with a hypervisor. Compared to a commercial reference, we achieve a 7x maximum speedup...
Atomistic-based approaches accurately model Bias Temperature Instability (BTI) phenomena, but they suffer from prolonged execution times, preventing their seamless integration in system-level analysis flows. In this paper we present a comprehensive flow that combines the accuracy of Capture Emission Time (CET) maps with the efficiency of the Compact Digital Waveform (CDW) representation. That way, we capture the true workload-dependent BTI-induced degradation of selected CPU components. First, we show existing works...
In-vivo and in-vitro experiments are routinely used in neuroscience to unravel brain functionality. Although they are a powerful experimentation tool, they are also time-consuming and, often, restrictive. Computational neuroscience attempts to solve this by using biologically-plausible and biophysically-meaningful neuron models, most prominent among which are the conductance-based models. Their computational complexity calls for accelerator-based computing to mount large-scale or real-time neuroscientific experiments. In this paper, we...
Biologically accurate neuron simulations are increasingly important in research related to brain activity. They are computationally intensive and feature data and task parallelism. In this paper, we present a case study for the mapping of a biologically accurate inferior-olive (InfOli) neural cell simulator on a many-core platform. The Single-Chip Cloud Computer (SCC) is an experimental processor created by Intel Labs. The target neurons provide a major input to the cerebellum and are involved in motor skills and space perception. We...
In this paper, we propose EDA methodologies for efficient, datapath-wide reliability analysis under Bias Temperature Instability (BTI). The proposed flow combines the efficiency of atomistic, pseudo-transient BTI modeling with the accuracy of commercial Static Timing Analysis (STA) tools. In order to reduce the transistor inventory that needs to be tracked by the STA solver, we develop a threshold-pruning methodology to identify the variation-critical part of the design. That way, we accelerate variation-aware iterations, maximum...
Detailed thermal analysis is usually performed exclusively at design time, since it is a computationally intensive task. In this paper, we introduce a novel methodology for fast, yet accurate, thermal analysis. The introduced methodology is software-supported by a new open source tool that enables hierarchical thermal analysis with adaptive levels of granularity. Experimental results prove the efficiency of our approach, which leads to an average reduction of the execution overhead of up to 70%, with a penalty in accuracy ranging between 2% and 8%.
The development of physiologically plausible neuron models comes with increased complexity, which poses a challenge for many-core computing. In this work, we have chosen an extension of the demanding Hodgkin-Huxley model for the neurons of the Inferior Olivary Nucleus, an area of vital importance for motor skills. The computing fabric of choice is an Intel Xeon-Xeon Phi system, widely used in modern computing infrastructure. The target application is parallelized using combinations of MPI and OpenMP. The best configurations are scaled up to human InfOli numbers.
Brain modeling has been receiving significant attention over the years, both for its neuroscientific potential and for its challenges in the context of high-performance computing. The development of physiologically plausible neuron models comes at the cost of increased complexity. In this work, we have selected a highly computationally demanding model of the Inferior-Olivary Nucleus (InfOli) based on the Hodgkin-Huxley (HH) model. This brain region, functionally coupled with the cerebellum, is of vital importance for motor skills...
Silicon design miniaturization has dramatically improved the integration scale in one chip, highlighting at the same time reliability issues. Error-correction mechanisms deal with these issues, ensuring the Reliability, Availability and Serviceability (RAS) of operation while paying a price in performance. The current study deploys a run-time mechanism that mitigates the correction overhead, guaranteeing performance dependability. In this direction, a closed-loop controller absorbs the RAS-induced delay by triggering Dynamic...
Transient errors are a major concern for the correct operation of low-level cache memories. Aggressive integration requires effective mitigation of such errors, without extreme overheads in power, timing, or silicon area. We demonstrate a hybrid (hardware-software) scheme that mitigates bit flips in data that reside in caches. The methodology is shown to be applicable to streaming applications, and we illustrate this with a video decoding case study on a state-of-the-art many-core chip. The single-chip cloud computer is an...
In this paper we propose optimization algorithms for the runtime management of gracefully degradable adaptive MPSoCs. Assuring the reliability of all hardware components in a system becomes increasingly difficult. On top of growing defect densities and the rising complexity of conventional testing, wear-out effects may reduce the availability of on-chip resources during the system lifetime. However, the adaptability of modern MPSoCs can provide a means of permanent fault tolerance and graceful degradation via runtime management. We have developed...
Transistor miniaturization, combined with the dawn of novel switching semiconductor structures, calls for careful examination of the variability and aging of the computer fabric. Time-zero and time-dependent phenomena need to be carefully considered so that the dependability of digital systems can be guaranteed. Already, architectures contain many mechanisms to detect and correct physically induced reliability violations. In many cases, guarantees on functional correctness come at a quantifiable performance cost. The current paper...