- Advanced Memory and Neural Computing
- Ferroelectric and Negative Capacitance Devices
- Parallel Computing and Optimization Techniques
- Magnetic properties of thin films
- Neural Networks and Reservoir Computing
- Semiconductor materials and devices
- Energy Harvesting in Wireless Networks
- Advanced Data Storage Technologies
- Neural Networks and Applications
- Phase-change materials and chalcogenides
- Superconducting Materials and Applications
- Magnetic confinement fusion research
- Advancements in Semiconductor Devices and Circuit Design
- Algorithms and Data Compression
- Physics of Superconductivity and Magnetism
- Photoreceptor and optogenetics research
- Network Packet Processing and Optimization
- Genomics and Phylogenetic Studies
- Machine Learning and ELM
- VLSI and Analog Circuit Testing
- Quantum Computing Algorithms and Architecture
- RNA and protein synthesis mechanisms
- Advanced Optical Imaging Technologies
- Green IT and Sustainability
- Advanced biosensing and bioanalysis techniques
University of Minnesota
2018-2024
Twin Cities Orthopedics
2019-2024
Northeastern University
2023
University of Minnesota System
2018-2019
Computational Random Access Memory (CRAM) is a platform that makes small modifications to a standard spintronics-based memory array to organically enable logic operations within the array. CRAM provides true in-memory computation: it can perform computations within the array itself, as against other methods that send tasks to a separate processor module or to near-memory logic at the periphery of the array. This paper describes how the CRAM structure can be built and utilized, accounting for considerations at the device, gate, and functional levels. Techniques...
We present the Spin Hall Effect (SHE) Computational Random Access Memory (CRAM) for in-memory computation, incorporating considerations at the device, gate, and functional levels. For two specific applications (2-D convolution and neuromorphic digit recognition), we show that SHE-CRAM is 3x faster and has over 4x lower energy than a prior STT-based CRAM implementation, and is 2000x faster and at least 130x more energy-efficient than state-of-the-art near-memory processing.
There is increasing demand to bring machine learning capabilities to low-power devices. By integrating machine learning computation into such deployed devices, a number of new applications become possible. In some applications, such devices will not even have a battery, and must rely solely on energy harvesting techniques. This puts extreme constraints on the hardware, which must be efficient and capable of tolerating interruptions due to power outages. Here, we propose an in-memory accelerator utilizing non-volatile spintronic memory....
Neural networks span a wide range of applications of industrial and commercial significance. Binary neural networks (BNNs) are particularly effective in trading accuracy for performance, energy efficiency, or hardware/software complexity. Here, we introduce a spintronic, reconfigurable in-memory BNN accelerator, PIMBALL (Processing In Memory BNN AcceLLerator), which allows massively parallel and efficient computation. PIMBALL is capable of being used as a standard spintronic memory (STT-MRAM) array...
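The core kernel a BNN accelerator parallelizes is the XNOR-popcount dot product over {+1, -1} values encoded as single bits. A minimal software sketch of that kernel (illustrative only; the function name and encoding here are assumptions, not from the paper, and the actual accelerator performs this inside the memory array):

```python
def bnn_dot(a_bits, w_bits):
    """Binary dot product with +1/-1 values encoded as 1/0 bits.

    XNOR counts bit agreements; the signed dot product is then
    (agreements - disagreements) = 2 * agreements - n.
    """
    n = len(a_bits)
    agreements = sum(1 for a, w in zip(a_bits, w_bits) if a == w)  # XNOR + popcount
    return 2 * agreements - n

# Example: activations and weights over {+1, -1}, bit-encoded
a = [1, 0, 1, 1]   # +1, -1, +1, +1
w = [1, 1, 0, 1]   # +1, +1, -1, +1
print(bnn_dot(a, w))  # 2 agreements, 2 disagreements -> 0
```

Hardware realizes the same arithmetic with bitwise XNOR gates and a popcount tree, which is why binarization trades accuracy for large energy and area savings.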
Recent years have witnessed increasing interest in the processing-in-memory (PIM) paradigm of computing due to its promise to improve performance through the reduction of energy-hungry and long-latency memory accesses. Joined with the explosion of data to be processed in genomics - particularly genome sequencing - PIM has become a promising candidate for accelerating such applications, since they do not scale up well on conventional von Neumann systems. In this article, we present an in-memory accelerator...
Stochastic computing (SC) has emerged as a promising solution for performing complex functions on large amounts of data to meet future demands. However, the hardware needed to generate random bit-streams using conventional CMOS-based technologies drastically increases area and delay cost. Area costs can be reduced with spintronic RNGs; however, this alone does not alleviate the problem, since stochastic bit generation is still performed separately from computation. In this paper, we present an SC method embedding...
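The SC principle underlying this line of work: a value in [0, 1] is encoded as the fraction of 1s in a random bit-stream, and multiplication reduces to a single AND gate across two independent streams. A minimal software sketch under those assumptions (function names are illustrative):

```python
import random

def to_stream(p, n, rng):
    """Encode probability p as a random bit-stream of length n."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def sc_multiply(x, y, n=4096, seed=0):
    """Stochastic multiplication: ANDing two independent bit-streams
    yields a stream whose 1-density approximates x * y."""
    rng = random.Random(seed)
    sx = to_stream(x, n, rng)
    sy = to_stream(y, n, rng)
    return sum(a & b for a, b in zip(sx, sy)) / n

est = sc_multiply(0.5, 0.5)
# est approximates 0.25; error shrinks as the stream length n grows
```

The sketch makes the cost trade-off visible: accuracy scales with stream length, so generating and consuming the streams where the data already reside (rather than in separate RNG hardware) is what the paper targets.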
Evaluating CAD solutions to physical implementation problems has been extremely challenging due to the unavailability of modern benchmarks in the public domain. This work aims to address this challenge by proposing a process-portable machine learning (ML)-based methodology for synthesizing synthetic power delivery networks (PDNs) that obfuscate intellectual property information. In particular, the proposed approach leverages generative adversarial networks (GANs) and transfer learning techniques to create realistic PDN...
Processing-in-Memory (PIM) architectures have gained popularity due to their ability to alleviate the memory wall by performing large numbers of operations within the memory itself. On top of this, non-volatile memory (NVM) technologies offer highly energy-efficient operations, rendering processing in NVM especially promising. Unfortunately, a major drawback is that NVM has limited endurance. Even when used as standard memory, NVM cells face limited lifetimes, which are exacerbated by imbalanced usage of cells. PIM significantly increases the number...
High-resolution Fast Fourier Transform (FFT) computation is important for various applications, while increased memory access and parallelism requirements limit traditional hardware. In this work, we explore acceleration opportunities for high-resolution FFTs in spintronic computational RAM (CRAM), which supports true in-memory processing semantics. We experiment with Spin-Torque-Transfer (STT) and Spin-Hall-Effect (SHE) based CRAMs in implementing CRAFFT, an in-memory FFT accelerator. For a one-million-point fixed-point FFT,...
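The butterfly structure an in-memory FFT accelerator maps onto array rows can be seen in a radix-2 Cooley-Tukey decomposition. A minimal floating-point software sketch of that structure (the paper's accelerator uses fixed-point arithmetic inside the array; this reference version is for illustration only):

```python
import cmath

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT (input length must be a
    power of two). Each recursion level is one stage of butterflies:
    an add/subtract pair combined with a twiddle-factor multiply."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])          # FFT of even-indexed samples
    odd = fft(x[1::2])           # FFT of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor
        out[k] = even[k] + t             # butterfly: upper output
        out[k + n // 2] = even[k] - t    # butterfly: lower output
    return out
```

Every stage touches all n points, which is why memory bandwidth, not arithmetic, dominates large FFTs and makes in-memory execution attractive.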
Traditional Von Neumann computing is falling apart in the era of exploding data volumes, as the overhead of data transfer becomes forbidding. Instead, it is more energy-efficient to fuse compute capability with the memory where the data reside. This is particularly critical for pattern matching, a key computational step in large-scale data analytics, which involves repetitive search over very large databases residing in memory. Emerging spintronic technologies show remarkable versatility for the tight integration of logic and memory. In this article,...
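The repetitive search described above amounts to scanning every stored word and comparing it against a query, e.g. under a Hamming-distance tolerance. A minimal software sketch of that kernel (the function name and distance criterion are illustrative assumptions; the in-memory substrate evaluates all rows in parallel rather than looping):

```python
def hamming_match(database, query, max_dist):
    """Exhaustive pattern match: return indices of stored words within
    max_dist Hamming distance of the query. Words are plain integers
    treated as bit vectors."""
    dist = lambda a, b: bin(a ^ b).count("1")  # differing-bit count
    return [i for i, word in enumerate(database) if dist(word, query) <= max_dist]

# Example: exact match plus one-bit-tolerant match
db = [0b1010, 0b1011, 0b0101]
print(hamming_match(db, 0b1010, 1))  # -> [0, 1]
```

In a Von Neumann machine every word in `db` must cross the memory bus per query; fusing the comparison into the array removes exactly that traffic.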
This article presents a method for analyzing the parasitic effects of interconnects on the performance of the STT-MTJ-based computational random access memory (CRAM) in-memory computation platform. CRAM is a platform that makes a small reconfiguration to a standard spintronics-based array to enable logic operations within the array. The analysis in this work develops a methodology that quantifies the way in which wire parasitics limit array size and configuration, and studies the impact of cell- and array-level design choices on noise margin. Finally,...
Beyond-edge devices can operate outside the reach of the power grid and without batteries. Such devices can be deployed in large numbers in regions that are difficult to access. Using machine learning, these devices can solve complex problems and relay valuable information back to a host. Many such devices in low Earth orbit are even used as nanosatellites. Due to the harsh and unpredictable nature of the environment, they must be highly energy-efficient, capable of operating intermittently over a wide temperature range, and tolerant to radiation. Here, we propose a non-volatile...
RNA sequence (RNA-Seq) abundance quantification is an important application in different fields of genomic studies, e.g., the analysis of functionally similar genes in a biological sample. It depends on the availability of a high volume of sequence data for accurate estimation, which is made possible by next-generation sequencing platforms. The large-scale processing requirements of this application push conventional computing systems to their limits due to the excessive data movement required between processing and memory elements....
Spiking Neural Networks (SNNs) represent a biologically inspired computation model capable of emulating neural activity in the human brain and brain-like structures. Their main promise is very low energy consumption. Classic Von Neumann architecture based SNN accelerators in hardware, however, often fall short of addressing demanding data transfer requirements efficiently at scale. In this article, we propose a promising alternative to overcome these scalability limitations, based on networks of in-memory accelerators, which can...
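The neuron model most SNN accelerators evaluate in parallel is the leaky integrate-and-fire (LIF) unit: the membrane potential decays, accumulates input, and emits a spike (then resets) on crossing a threshold. A minimal software sketch of one such neuron (parameter names and values are illustrative, not from the article):

```python
def lif_run(inputs, leak=0.9, threshold=1.0):
    """Leaky integrate-and-fire neuron over a sequence of input currents.

    Per time step: potential decays by `leak`, adds the input, and a
    spike is emitted (with reset to 0) once `threshold` is reached.
    Returns the output spike train as a list of 0/1 values.
    """
    v, spikes = 0.0, []
    for current in inputs:
        v = v * leak + current
        if v >= threshold:
            spikes.append(1)
            v = 0.0          # reset after firing
        else:
            spikes.append(0)
    return spikes

print(lif_run([0.5, 0.5, 0.5]))  # -> [0, 0, 1]
```

Because activity is sparse binary events rather than dense multiply-accumulates, the dominant cost at scale is moving spikes and weights, which is the data-transfer bottleneck the in-memory approach attacks.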
This article describes how 3-D XPoint memory arrays can be used as in-memory computing accelerators. We first show that thresholded matrix-vector multiplication (TMVM), the fundamental computational kernel in many applications including machine learning (ML), can be implemented within a 3-D XPoint array without requiring data to leave the array for processing. Using this implementation of TMVM, we then discuss a binary neural network inference engine. We apply the core concept to address issues such as system scalability, where we connect...
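Functionally, TMVM computes one output bit per matrix row: 1 if the row's dot product with the input vector meets a threshold, else 0. A minimal software sketch of that behavior (illustrative only; in the array itself the dot product is realized by summing bitline currents, not by a loop):

```python
def tmvm(matrix, vector, threshold):
    """Thresholded matrix-vector multiplication.

    Each output bit is 1 iff the corresponding row's dot product with
    the input vector reaches the threshold."""
    return [1 if sum(m * v for m, v in zip(row, vector)) >= threshold else 0
            for row in matrix]

# Example: two rows, threshold 2
print(tmvm([[1, 1, 0], [0, 1, 1]], [1, 1, 0], 2))  # -> [1, 0]
```

With binary weights and activations this is exactly one layer of a binarized neural network, which is why TMVM suffices as the kernel for the inference engine the article builds.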
Traditional Von Neumann computing is falling apart in the era of exploding data volumes, as the overhead of data transfer becomes forbidding. Instead, it is more energy-efficient to fuse compute capability with the memory where the data reside. This is particularly critical for pattern matching, a key computational step in large-scale data analytics, which involves repetitive search over very large databases residing in memory. Emerging spintronic technologies show remarkable versatility for the tight integration of logic and memory. In this paper, we...
Superconducting circuits, like the Adiabatic Quantum-Flux-Parametron (AQFP), offer exceptional energy efficiency but face challenges in physical design due to sophisticated spacing and timing constraints. Current design tools often neglect the importance of constraint adherence throughout the entire flow. In this paper, we propose SuperFlow, a fully-customized RTL-to-GDS flow tailored for AQFP devices. SuperFlow leverages a synthesis tool based on CMOS technology to transform any input RTL netlist into an AQFP-based...
Adiabatic Quantum-Flux-Parametron (AQFP) is a superconducting logic family with extremely high energy efficiency. By employing the distinct polarity of current to denote '0' and '1', AQFP devices serve as excellent carriers for binary neural network (BNN) computations. Although recent research has made initial strides toward developing an AQFP-based BNN accelerator, several critical challenges remain, preventing the design from being a comprehensive solution. In this paper, we propose SupeRBNN,...
There is increasing demand to bring machine learning capabilities to low-power devices. By integrating machine learning computation into such deployed devices, a number of new applications become possible. In some applications, such devices will not even have a battery, and must rely solely on energy harvesting techniques. This puts extreme constraints on the hardware, which must be efficient and capable of tolerating interruptions due to power outages. Here, as a representative example, we propose an in-memory support vector machine accelerator...
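At inference time a (linear) support vector machine reduces to a single dot product plus a sign test, which is what makes it a natural fit for an in-memory accelerator under intermittent power: the state is just the stored weights. A minimal software sketch of that decision function (names and values are illustrative assumptions, not from the paper):

```python
def svm_decide(weights, bias, x):
    """Linear SVM inference: classify x by the sign of w . x + b.
    The in-memory accelerator evaluates the dot product where the
    weights are stored, so a power outage loses no state."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else -1

# Example with a hypothetical 2-D trained model
w, b = [1.0, -2.0], 0.5
print(svm_decide(w, b, [1.0, 0.25]))  # score = 1.0  -> class +1
print(svm_decide(w, b, [0.0, 1.0]))  # score = -1.5 -> class -1
```

Non-volatility matters here precisely because the computation is stateless between inputs: after an outage, execution can resume with the weights intact.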
Embedded/edge computing comes with a very stringent hardware resource (area) budget and a need for extreme energy efficiency. This motivates repurposing, i.e., reconfiguring resources on demand, where the overhead of reconfiguration itself is subject to the same tight budgets in area and energy. Numerous applications running in constrained environments such as wearable devices and the Internet-of-Things incorporate CAM (Content Addressable Memory) as a key computational building block. In this paper we present CAMeleon --...
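A CAM inverts the usual memory interface: instead of reading the word at an address, it returns the addresses of all rows whose content matches a query, optionally ignoring masked bit positions (ternary-CAM style). A minimal software sketch of that lookup (function and parameter names are illustrative; hardware compares all rows in a single cycle rather than iterating):

```python
def cam_search(cam_rows, query, mask=None):
    """Content-addressable search: return the indices of all rows that
    match the query on unmasked bit positions (mask bit 0 = don't care)."""
    mask = mask or [1] * len(query)
    return [i for i, row in enumerate(cam_rows)
            if all(m == 0 or r == q for r, q, m in zip(row, query, mask))]

rows = [[1, 0, 1], [1, 1, 1], [0, 0, 1]]
print(cam_search(rows, [1, 0, 1]))                  # exact match -> [0]
print(cam_search(rows, [1, 0, 1], mask=[1, 0, 1]))  # ignore bit 1 -> [0, 1]
```

The one-cycle all-rows compare is what makes CAM attractive as a building block, and also what makes its area cost high enough to motivate repurposing the same cells as ordinary storage when no search is needed.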