- Advanced Memory and Neural Computing
- Ferroelectric and Negative Capacitance Devices
- Semiconductor materials and devices
- Neural Networks Stability and Synchronization
- Advanced Neural Network Applications
- Machine Learning and ELM
- Photonic and Optical Devices
- Image and Signal Denoising Methods
- Advanced Image Processing Techniques
- Neural Networks and Reservoir Computing
- Advanced Sensor and Energy Harvesting Materials
- Advanced Vision and Imaging
- Parallel Computing and Optimization Techniques
- Optical Network Technologies
- Remote-Sensing Image Classification
- Low-power high-performance VLSI design
- Quantum-Dot Cellular Automata
- Adversarial Robustness in Machine Learning
- Embedded Systems Design Techniques
- Particle accelerators and beam dynamics
- Advancements in Semiconductor Devices and Circuit Design
- Wireless Body Area Networks
- Image Processing Techniques and Applications
- Video Surveillance and Tracking Methods
- Fire Detection and Safety Systems
University of Notre Dame
2014-2023
Neural network accelerators with low latency and energy consumption are desirable for edge computing. To create such accelerators, we propose a design flow for accelerating the extremely low bit-width neural network (ELB-NN) in embedded FPGAs with hybrid quantization schemes. This flow covers both network training and FPGA-based network deployment, which facilitates design space exploration and simplifies the tradeoff between network accuracy and computation efficiency. Using this flow helps hardware designers to deliver network accelerators in edge devices under strict resource and power...
Co-exploration of neural architectures and hardware design is promising due to its capability to simultaneously optimize network accuracy and hardware efficiency. However, state-of-the-art neural architecture search algorithms for the co-exploration are dedicated to the conventional von Neumann computing architecture, whose performance is heavily limited by the well-known memory wall. In this article, we are the first to bring the computing-in-memory architecture, which can easily transcend the memory wall, to interplay with the neural architecture search, aiming to find the most efficient neural architectures with high...
Deep neural network (DNN) accelerators with improved energy and delay are desirable for meeting the requirements of hardware targeted for IoT and edge computing systems. Convolutional neural networks (CoNNs) belong to one of the most popular types of DNN architectures. This article presents the design and evaluation of an accelerator for CoNNs. The system-level architecture is based on mixed-signal, cellular neural networks (CeNNs). Specifically, we present (i) the implementation of different layers, including convolution, ReLU, and pooling, in a CoNN using...
This paper discusses the development and evaluation of a Cellular Neural Network (CeNN) friendly deep learning network for solving the MNIST digit recognition problem. Prior work has shown that CeNNs leveraging emerging technologies such as tunnel transistors can improve the energy or EDP of CeNNs, while simultaneously offering richer/more complex functionality. Important questions to address are what applications can benefit from CeNNs, and whether CeNNs can eventually outperform other alternatives at the application-level in...
Urban land usage classification is a critical task in big data based smart city applications that aim to understand the social-economic functions and physical attributes of urban environments. This paper focuses on the migratable urban land usage classification problem using remote sensing data (i.e., satellite images). Our goal is to accurately classify the land usage of locations in a target city where ground truth data is not available by leveraging a classification model from a source city where such data is available. This problem is motivated by the limitation of current solutions that primarily rely on a rich set of ground-truth data for accurate...
Emerging memory devices are an attractive choice for implementing very energy-efficient in-situ matrix-vector multiplication (MVM) for use in intelligent edge platforms. Despite their great potential, device-level non-idealities have a large impact on the application-level accuracy of deep neural network (DNN) inference. We introduce a low-density parity-check code (LDPC) based approach to correct non-ideality induced errors encountered during in-situ MVM. We first encode the weights using error correcting codes...
The memory wall bottleneck is a key challenge across many data-intensive applications. Multi-level FeFET-based embedded non-volatile memories are a promising solution for denser and more energy-efficient on-chip memory. However, reliable multi-level cell storage requires careful optimizations to minimize the design overhead costs. In this work, we investigate the interplay between FeFET device characteristics, programming schemes, and memory array architecture, and explore different design choices to optimize performance,...
In this article, we perform a uniform benchmarking for the convolutional neural network (CoNN) based on the cellular neural network (CeNN) using a variety of beyond-CMOS technologies. Representative charge-based and spintronic device technologies are implemented to enable energy-efficient CeNN related computations. To alleviate the delay and energy overheads of the fully connected layer, a hybrid CeNN-based CoNN system is proposed. It is shown that low-power FETs and spintronic devices are promising candidates to implement CoNNs based on CeNNs. Specifically,...
A Cellular Neural Network (CNN) is a powerful processor that can significantly improve the performance of spatio-temporal applications such as pattern recognition, image processing, and motion detection, when compared to the more traditional von Neumann architecture. In this paper, we show how tunneling field effect transistors (TFETs) can be utilized to enhance CNNs. Specifically, the power consumption of TFET-based CNNs can be lower than that of MOSFET-based CNNs due to improved voltage controlled current sources (VCCSs) - an important...
Deep Neural Networks (DNNs) have achieved tremendous success in many application domains. Inspired by this success, specialized accelerators have been and continue to be developed to process DNN workloads in an energy-efficient manner. The design space for DNN accelerators can be extremely large since they can employ different datapaths, data mapping strategies, circuits, and device technologies. To explore the design space for developing accelerators, it is important to quickly estimate the energy cost associated with an accelerator. This paper introduces...
A new spintronic nonvolatile memory cell analogous to 1T DRAM with non-destructive READ is proposed. The cells can be used as neural computing units. A dual-circuit neural network architecture is proposed to leverage these devices against the complex operations involved in convolutional networks. Simulations based on HSPICE and MATLAB were performed to study the performance of this architecture when classifying images, as well as the effect of varying the size and stability of the nanomagnets. The spintronic cells outperform a purely charge-based implementation of the same network,...
Traditional CMOS based von Neumann architectures face daunting challenges in performing complex computational tasks at high speed and with low power on spatio-temporal data, e.g., image processing, pattern recognition, etc. In this study, we discuss the utilities of various steep slope, beyond-CMOS emerging devices for image processing applications within the non-von Neumann computing paradigm of cellular neural networks (CNNs). In general, the steep subthreshold swing obviates the output transfer hardware used in a conventional...
Cellular neural networks (CNNs) are a powerful analog architecture that can outperform traditional von Neumann architectures for spatio-temporal information processing applications, e.g., image processing and speech recognition. Much of the existing work reports the energy dissipation of CNNs at the chip level, which includes the energy of sensors, actuators, and other components. As such, the impacts of various system variables, application templates, characteristics of the resistive element, etc., on the energy profile of a CNN cannot be easily determined. In this work,...
Video quality can suffer from limited internet speed while being streamed by users. Compression artifacts start to appear when the bitrate decreases to match the available bandwidth. Existing algorithms either focus on removing the compression artifacts at the same video resolution, or on upscaling the video resolution but not removing the compression artifacts. Super resolution-only approaches will amplify the artifacts along with the details by default. We propose a lightweight convolutional neural network (CNN)-based algorithm which simultaneously performs artifact reduction and...
Co-exploration of neural architectures and hardware design is promising to simultaneously optimize network accuracy and hardware efficiency. However, state-of-the-art neural architecture search algorithms for the co-exploration are dedicated to the conventional von Neumann computing architecture, whose performance is heavily limited by the well-known memory wall. In this paper, we are the first to bring the computing-in-memory architecture, which can easily transcend the memory wall, to interplay with the neural architecture search, aiming to find the most efficient neural architectures with high network accuracy and maximized hardware efficiency. Such a novel...
Deep neural network (DNN) accelerators with improved energy and delay are desirable for meeting the requirements of hardware targeted for IoT and edge computing systems. Convolutional neural networks (CoNNs) belong to one of the most popular types of DNN architectures. This paper presents the design and evaluation of an accelerator for CoNNs. The system-level architecture is based on mixed-signal, cellular neural networks (CeNNs). Specifically, we present (i) the implementation of different layers, including convolution, ReLU, and pooling, in a CoNN using...
Optical deep learning (DL) accelerators have attracted significant interest due to their latency and power advantages. In this article, we focus on incoherent optical designs. A challenge is that there is no known solution to perform single-wavelength accumulation (a key operation required for DL workloads) using optical signals efficiently. Therefore, we devise a hybrid approach, where accumulation is done in the electrical domain, and multiplication is performed in the optical domain. The technology enabler of our design is the transistor laser,...
As cost and performance benefits associated with Moore's Law scaling slow, researchers are studying alternative architectures (e.g., based on analog and/or spiking circuits) and computational models (e.g., convolutional and recurrent neural networks) to perform application-level tasks faster, more energy efficiently, and more accurately. We investigate cellular neural network (CeNN)-based co-processors at the application-level for these metrics. While it is well-known that CeNNs can be well-suited for spatio-temporal information processing,...