- Advanced Neural Network Applications
- Advanced Memory and Neural Computing
- CCD and CMOS Imaging Sensors
- Parallel Computing and Optimization Techniques
- Embedded Systems Design Techniques
- Adversarial Robustness in Machine Learning
- Radiation Effects in Electronics
- Anomaly Detection Techniques and Applications
- Distributed systems and fault tolerance
- Real-Time Systems Scheduling
- Real-time simulation and control systems
- Interconnection Networks and Systems
- Industrial Vision Systems and Defect Detection
- Hand Gesture Recognition Systems
- Image and Video Stabilization
- Software Testing and Debugging Techniques
- Software Reliability and Analysis Research
- Advanced Image and Video Retrieval Techniques
- Fault Detection and Control Systems
- Stochastic Gradient Optimization Techniques
- Gaze Tracking and Assistive Technology
- Time Series Analysis and Forecasting
- Silicon Carbide Semiconductor Technologies
- Face recognition and analysis
- Electrostatic Discharge in Electronics
Karlsruhe Institute of Technology
2019-2024
Institut für Informationsverarbeitung
2019-2023
Enabling the use of Deep Neural Networks (DNNs) for time-series-based applications on low-power devices such as wearables opens up a wide range of new features and services. However, inference requires an enormous number of operations to be performed by the computing platform. In addition, Long Short-Term Memory (LSTM)-based networks require memory to store the internal cell state for future calculations. In this paper, we therefore propose a hardware/software co-design based LSTM hardware accelerator architecture...
Since their breakthrough, the complexity of Deep Neural Networks (DNNs) has been rising steadily. As a result, accelerators for DNNs are now used in many domains. However, designing and configuring an accelerator that perfectly meets the requirements of a given application is a challenging task. In this paper, we therefore present our approach to support the design process. With an analytical model of a systolic array, we can estimate performance, energy consumption, and area for each option. To determine these metrics, usually cycle...
For AI-based systems in safety-critical domains, it is essential to understand the impact of random hardware faults affecting the target accelerators. The high degree of data reuse makes Deep Neural Network (DNN) accelerators susceptible to significant fault propagation and hence to hazardous predictions. Therefore, we present SiFI-AI, a simulation framework for fault injection in DNN accelerators. SiFI-AI proposes a hybrid approach combining fast AI inference with cycle-accurate RTL simulation. Time-expensive RTL simulation is only used...
In the era of electric vehicles, the reliability of power electronics has become crucial as the industry evolves. Changes in electrical parameters caused by aging and degradation can lead to performance deterioration and eventually to total failure (end-of-life) of electronic components, whereas the lifetime of transistors depends to a large extent on temperature fluctuations. Therefore, it is desired to extend the lifetime by minimizing temperature swings without affecting vehicle dynamics. In this paper, we propose LETSCOPE (Lifecycle Extensions...
Time series-based applications such as the recognition of handwriting benefit from using Deep Neural Networks (DNNs) in terms of accuracy and efficiency. Due to the strict power and memory limitations of embedded platforms in the Internet-of-Things (IoT), the inference of DNNs is usually performed on more powerful and less constrained devices. However, using mobile devices such as smartphones or tablets leads to high system requirements. In this paper, we present our approach for distributing the computational workload between a sensor pen and a...
In the future, it is expected that safety-critical and non-critical applications will be executed on the same hardware. Therefore, future hardware systems should be capable of providing runtime support for higher reliability requirements and for the performance of non-critical applications equally. In this paper, we present a run-time adaptive cache with a coarse-grained safety mechanism to tackle this emerging challenge. For non-critical applications, the cache operates in a mode without any safety mechanisms. On the other hand, a checkpointing and rollback feature fault...
The goal of modern high-performance computing platforms is to combine low power consumption and high throughput. Within the European Processor Initiative (EPI), such an SoC platform to meet the novel exascale requirements is built and investigated. As part of this project, we introduce an embedded Field Programmable Gate Array (eFPGA), adding flexibility to accelerate various workloads. In this article, we show our approach to design an eFPGA tile that supports the EPI SoC. While eFPGAs are inherently reconfigurable, their initial has...
Embedded image processing applications like multicamera-based object detection or semantic segmentation are often based on Convolutional Neural Networks (CNNs) to provide precise and reliable results. The deployment of CNNs in embedded systems, however, imposes additional constraints such as latency restrictions and limited energy consumption in the sensor platform. These requirements have to be considered during the hardware/software co-design of Artificial Intelligence (AI) applications. In addition,...
Within the European Processor Initiative (EPI), an objective is to build an embedded high-performance processing platform for future automotive applications such as autonomous driving. An embedded Field-Programmable Gate Array (eFPGA) enables the platform to be extended to the needs and requirements of various stakeholders. In this paper, we give an overview of the project and our contributions to define the architecture of the eFPGA, which is suitable for the market. Therefore, we describe our concept to explore the eFPGA architecture. It is motivated by a sound use case that...
Machine learning and data processing are trending topics at the moment. However, there is still a lack of a standard process to support the fast, simple, and effective development of machine learning models for academia and industry combined. Processes such as KDD or CRISP-DM are highly specialized in data mining or business cases. Therefore, engineers often refer to individual approaches to solve a problem. Especially in teaching, the lack of a standard process is a challenge. Students typically get a better understanding if a systematic approach...
Hardware accelerators for deep neural networks (DNNs) have established themselves over the past decade. Most developments have worked towards higher efficiency with an individual application in mind. This highlights the strong relationship between co-designing the accelerator together with the requirements of the application. For a structured design flow, however, a tool to evaluate a DNN accelerator embedded in a System on Chip (SoC) platform is currently lacking. To address this gap in the state of the art, we introduce FLECSim, a framework that enables...
Neural networks achieve high accuracy in tasks like image recognition or segmentation. However, their application in safety-critical domains is limited due to their black-box nature and their vulnerability to specific types of attacks. To mitigate this, methods for detecting out-of-distribution or adversarial attacks in parallel to the network inference were introduced. These methods are hard to compare because they were developed for different use cases, datasets, and networks. To fill this gap, we introduce EFFECT, an end-to-end framework...
Convolutional Neural Networks (CNNs) show tremendous performance in many Computer Vision (CV) tasks like image segmentation, which is crucial to autonomous driving. However, they are computationally demanding and usually not robust to corruptions like weather influences. In this paper, we introduce our mixed-precision inference method to overcome these two challenges. Therefore, we enable CNN execution on modern embedded systems-on-chip (SoCs) that feature a DNN accelerator and reconfigurable fabric. In case of a change, we can...
Distributed systems can be found in various applications, e.g., in robotics or autonomous driving, to achieve higher flexibility and robustness. Thereby, data-flow-centric applications such as Deep Neural Network (DNN) inference benefit from partitioning the workload over multiple compute nodes in terms of performance and energy efficiency. However, mapping large models onto distributed embedded systems is a complex task, due to low latency and high throughput requirements combined with strict energy and memory...
A key challenge in computing convolutional neural networks (CNNs), besides the vast number of computations, is the large number of associated energy-intensive transactions from main memory to local memory. In this paper, we present our methodical approach to maximize and prune coarse-grained regular blockwise sparsity in activation feature maps during CNN inference on dedicated dataflow architectures. Regular sparsity that fits the target accelerator, e.g., a systolic array or vector processor, allows simplified resource...
The European Processor Initiative (EPI) is developing a processor for various sectors, including the automotive industry. To benchmark the new processor, EPI uses a test vehicle to demonstrate different use cases, like semi-autonomous driving. In this paper, we focus on object detection and describe the use cases in the perception stage of autonomous driving. Therefore, we introduce four applications that include face recognition, blind spot detection, and near-range and far-range spatial detection to cover a wide range of domains. Each use case runs...