Shubham Jain

ORCID: 0000-0002-2291-7712
Research Areas
  • Advanced Memory and Neural Computing
  • Ferroelectric and Negative Capacitance Devices
  • Advanced Neural Network Applications
  • Semiconductor materials and devices
  • Phase Change Materials Research
  • Adsorption and Cooling Systems
  • Solar Thermal and Photovoltaic Systems
  • Adversarial Robustness in Machine Learning
  • Parallel Computing and Optimization Techniques
  • Low-power high-performance VLSI design
  • Machine Learning and Data Classification
  • VLSI and Analog Circuit Testing
  • Neuroscience and Neural Engineering
  • Imbalanced Data Classification Techniques
  • Advancements in Semiconductor Devices and Circuit Design
  • Advanced Data Storage Technologies
  • Magnetic properties of thin films
  • Solar-Powered Water Purification Methods
  • Music and Audio Processing
  • Anomaly Detection Techniques and Applications
  • Heat Transfer and Optimization
  • Explainable Artificial Intelligence (XAI)
  • Data Stream Mining Techniques
  • Music Technology and Sound Studies
  • Speech and Audio Processing

IBM (United States)
2021-2025

IBM Research - Thomas J. Watson Research Center
2020-2024

Bhabha Hospital
2023

Indian Institute of Technology Delhi
2009-2023

Athlone Institute of Technology
2021-2023

Visa (United Kingdom)
2022

Visa (United States)
2022

Purdue University West Lafayette
2017-2021

Bharati Vidyapeeth Deemed University
2019

Motilal Nehru National Institute of Technology
2016

In this paper we propose a novel model for unconditional audio generation based on generating one sample at a time. We show that our model, which profits from combining memory-less modules, namely autoregressive multilayer perceptrons, with stateful recurrent neural networks in a hierarchical structure, is able to capture the underlying sources of variation in temporal sequences over very long time spans, on three datasets of different nature. Human evaluation of the generated samples indicates they are preferred...

10.48550/arxiv.1612.07837 preprint EN other-oa arXiv (Cornell University) 2016-01-01
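The entry above describes a hierarchy of a stateful recurrent module operating at a coarse time scale and a memory-less autoregressive MLP emitting one sample at a time. A minimal Python sketch of that idea follows; the sizes, random weights, and 256-level output alphabet are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

# Toy sketch of hierarchical sample-by-sample generation: a stateful RNN
# summarizes each frame of past samples, and a memory-less MLP predicts the
# next sample conditioned on that summary. Weights are random, for shape only.
rng = np.random.default_rng(0)
FRAME, HIDDEN, LEVELS = 16, 32, 256            # frame size, RNN width, quantization levels

W_xh = rng.normal(scale=0.1, size=(FRAME, HIDDEN))    # frame -> hidden
W_hh = rng.normal(scale=0.1, size=(HIDDEN, HIDDEN))   # hidden -> hidden (stateful)
W_mlp1 = rng.normal(scale=0.1, size=(HIDDEN + FRAME, HIDDEN))
W_mlp2 = rng.normal(scale=0.1, size=(HIDDEN, LEVELS))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

samples = [LEVELS // 2] * FRAME   # seed with "silence"
h = np.zeros(HIDDEN)              # recurrent state carries long-range context

for t in range(4 * FRAME):        # generate a few frames, one sample at a time
    if t % FRAME == 0:            # slow tier: update RNN state once per frame
        frame = np.array(samples[-FRAME:]) / LEVELS - 0.5
        h = np.tanh(frame @ W_xh + h @ W_hh)
    # fast tier: memory-less MLP conditioned on RNN state + recent samples
    recent = np.array(samples[-FRAME:]) / LEVELS - 0.5
    logits = np.tanh(np.concatenate([h, recent]) @ W_mlp1) @ W_mlp2
    samples.append(int(rng.choice(LEVELS, p=softmax(logits))))

print(samples[FRAME:FRAME + 8])   # first few generated (random-weight) samples
```

With trained weights, the recurrent state would carry long-range structure while the per-sample MLP handles fine detail.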

In-memory computing is a promising approach to addressing the processor-memory data transfer bottleneck in computing systems. We propose spin-transfer torque compute-in-memory (STT-CiM), a design for in-memory computing with spin-transfer torque magnetic RAM (STT-MRAM). The unique properties of spintronic memory allow multiple wordlines within an array to be simultaneously enabled, opening up the possibility of directly sensing functions of the values stored in multiple rows using a single access. We propose modifications to STT-MRAM peripheral circuits that leverage this...

10.1109/tvlsi.2017.2776954 article EN publisher-specific-oa IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2017-12-28
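As a rough behavioral illustration of sensing a function of several stored rows in one access, the sketch below (plain Python, not the paper's circuit or peripheral design) models two enabled wordlines whose per-column cell currents sum on the bitlines; comparing the sum against two reference levels yields bitwise OR and AND, from which XOR follows. The array contents and sizes are made up.

```python
import numpy as np

# Behavioral sketch of in-array bitwise logic via multi-wordline sensing.
rng = np.random.default_rng(1)
array = rng.integers(0, 2, size=(8, 16))   # 8 words x 16 bits stored in the array

def cim_read(row_a, row_b):
    i_bl = array[row_a] + array[row_b]     # summed "current" per bitline: 0, 1, or 2 units
    bitwise_or  = (i_bl >= 1).astype(int)  # low reference: at least one cell conducting
    bitwise_and = (i_bl >= 2).astype(int)  # high reference: both cells conducting
    bitwise_xor = bitwise_or ^ bitwise_and # derived from the two sensed outputs
    return bitwise_and, bitwise_or, bitwise_xor

a, o, x = cim_read(2, 5)
assert np.array_equal(a, array[2] & array[5])
assert np.array_equal(o, array[2] | array[5])
assert np.array_equal(x, array[2] ^ array[5])
print("in-array AND:", a)
```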

Resistive crossbars designed with nonvolatile memory devices have emerged as promising building blocks for deep neural network (DNN) hardware, due to their ability to compactly and efficiently realize vector-matrix multiplication (VMM), the dominant computational kernel in DNNs. However, a key challenge is that resistive crossbars suffer from a range of device- and circuit-level nonidealities, such as driver resistance, sensing resistance, sneak paths, interconnect parasitics, and nonlinearities in the peripheral circuits,...

10.1109/tcad.2020.3000185 article EN IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 2020-06-04
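To make the VMM mapping concrete, here is a toy Python model: weights become cell conductances, inputs become row voltages, and each column current approximates one dot product. The single series-resistance term below is an illustrative stand-in for the nonidealities listed above, not the paper's detailed device/circuit model; all numeric values are assumptions.

```python
import numpy as np

# Toy crossbar VMM: ideal column currents vs. currents with a simple
# series-resistance nonideality applied to every cell.
rng = np.random.default_rng(2)
ROWS, COLS = 64, 32
G_MIN, G_MAX = 1e-6, 1e-4                 # assumed conductance range (siemens)

weights = rng.uniform(0.0, 1.0, size=(ROWS, COLS))
G = G_MIN + weights * (G_MAX - G_MIN)     # weight -> conductance mapping
v = rng.uniform(0.0, 0.5, size=ROWS)      # input vector -> row voltages

i_ideal = v @ G                           # ideal column currents (the VMM result)

R_SERIES = 2e3                            # toy driver/interconnect resistance (ohms)
G_eff = 1.0 / (1.0 / G + R_SERIES)        # each cell in series with a parasitic resistance
i_nonideal = v @ G_eff

rel_err = np.abs(i_nonideal - i_ideal) / np.abs(i_ideal)
print(f"mean relative column-current error: {rel_err.mean():.2%}")
```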

Traditional computing systems based on the von Neumann architecture are fundamentally bottlenecked by data transfers between processors and memory. The emergence of data-intensive workloads, such as machine learning (ML), creates an urgent need to address this bottleneck by designing platforms that utilize the principle of colocated memory and processing units. Such an approach, known as "in-memory computing," can potentially eliminate data movement costs by performing computations inside the memory array itself. Crossbars of resistive nonvolatile memory (NVM)...

10.1109/jproc.2020.3003007 article EN publisher-specific-oa Proceedings of the IEEE 2020-07-15

Resistive crossbars have shown strong potential as the building blocks of future neural fabrics, due to their ability to natively execute vector-matrix multiplication (the dominant computational kernel in DNNs). However, a key challenge that arises in resistive crossbars is that non-idealities in the synaptic devices, interconnects, and peripheral circuits lead to errors in the computations performed. When large-scale DNNs are executed on resistive crossbar systems, these errors compound and result in unacceptable degradation in application-level...

10.1145/3362035 article EN ACM Transactions on Embedded Computing Systems 2019-11-15

Advances in deep neural networks (DNNs) and the availability of massive real-world data have enabled superhuman levels of accuracy on many AI tasks and ushered in explosive growth in AI workloads across the spectrum of computing devices. However, their superior accuracy comes at a high computational cost, which necessitates approaches beyond traditional paradigms to improve their operational efficiency. Leveraging the application-level insight of error resilience, we demonstrate how approximate computing (AxC) can significantly boost efficiency...

10.1109/jproc.2020.3029453 article EN Proceedings of the IEEE 2020-11-10

The growing prevalence and computational demands of Artificial Intelligence (AI) workloads have led to the widespread use of hardware accelerators in their execution. Scaling the performance of AI accelerators across generations is pivotal to the success of commercial deployments. The intrinsic error-resilient nature of AI workloads presents a unique opportunity for performance/energy improvement through precision scaling. Motivated by recent algorithmic advances in precision scaling for inference and training, we designed RaPiD...

10.1109/isca52012.2021.00021 article EN 2021-06-01

Deep neural networks (DNNs) have gained tremendous popularity in recent years due to their ability to achieve superhuman accuracy on a wide variety of machine learning tasks. However, the compute and memory requirements of DNNs have grown rapidly, creating a need for energy-efficient hardware. Resistive crossbars have attracted significant interest for the design of next-generation DNN accelerators because they can natively execute massively parallel vector-matrix multiplications within dense memory arrays. However, crossbar-based computations face major...

10.1109/tvlsi.2021.3063543 article EN IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2021-03-17

We introduce a highly heterogeneous and programmable compute-in-memory (CIM) accelerator architecture for deep neural network (DNN) inference. This architecture combines spatially distributed CIM memory array "tiles" for weight-stationary, energy-efficient multiply-accumulate (MAC) operations with special-function compute cores for auxiliary digital computation. Massively parallel vectors of neuron activation data are exchanged over short distances using a dense and efficient circuit-switched 2-D mesh,...

10.1109/tvlsi.2022.3221390 article EN IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2022-11-21

The use of lower precision has emerged as a popular technique to optimize the compute and storage requirements of complex deep neural networks (DNNs). In the quest for lower precision, recent studies have shown that ternary DNNs (which represent weights and activations by signed ternary values) offer a promising sweet spot, achieving accuracy close to full precision on complex tasks. We propose TiM-DNN, a programmable in-memory accelerator specifically designed to execute ternary DNNs. TiM-DNN supports various ternary representations including...

10.1109/tvlsi.2020.2993045 article EN IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2020-05-20
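A small sketch of the signed ternary representation mentioned above: values snap to {-1, 0, +1} around a threshold, so a dot product needs only additions and subtractions. The thresholding heuristic and scaling below are common choices for ternarization and are not claimed to be TiM-DNN's exact scheme.

```python
import numpy as np

# Ternarize a tensor to {-1, 0, +1} with a per-tensor scale, then compare a
# ternary dot product against the full-precision one.
rng = np.random.default_rng(3)

def ternarize(x, delta_frac=0.7):
    delta = delta_frac * np.abs(x).mean()          # threshold below which values snap to 0
    t = np.where(x > delta, 1, np.where(x < -delta, -1, 0))
    scale = np.abs(x[t != 0]).mean() if np.any(t != 0) else 1.0
    return t.astype(np.int8), scale

w = rng.normal(size=256)
a = rng.normal(size=256)
w_t, w_s = ternarize(w)
a_t, a_s = ternarize(a)

y_full = float(w @ a)                              # full-precision dot product
y_tern = float((w_t.astype(int) @ a_t.astype(int)) * w_s * a_s)
print(f"full precision: {y_full:+.3f}   ternary approx: {y_tern:+.3f}")
```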

Deep Neural Networks (DNNs) represent the state-of-the-art in many Artificial Intelligence (AI) tasks involving images, videos, text, and natural language. Their ubiquitous adoption is limited by the high computation and storage requirements of DNNs, especially for energy-constrained inference at the edge using wearable and IoT devices. One promising approach to alleviating these computational challenges is implementing DNNs in low-precision fixed-point (<16 bits) representations. However, the quantization error inherent in any...

10.1145/3195970.3196012 article EN 2018-06-19
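For reference, a short sketch of the low-precision fixed-point representation the entry refers to: values are rounded to a grid with a given number of fractional bits and saturated to the signed word's range, and the resulting error grows as the word shrinks. The specific formats and synthetic data are illustrative assumptions.

```python
import numpy as np

# Quantize a tensor to signed fixed point and measure the quantization error
# at a few word lengths.
def to_fixed(x, total_bits=8, frac_bits=4):
    scale = 2 ** frac_bits
    qmin, qmax = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
    q = np.clip(np.round(x * scale), qmin, qmax)   # round, then saturate
    return q / scale                               # dequantized value

rng = np.random.default_rng(4)
acts = rng.normal(scale=1.5, size=10_000)          # stand-in for activations

for bits in (16, 8, 6, 4):
    approx = to_fixed(acts, total_bits=bits, frac_bits=bits // 2)
    mse = np.mean((acts - approx) ** 2)
    print(f"{bits:2d}-bit fixed point: MSE = {mse:.5f}")
```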

Deep Neural Networks (DNNs) have emerged as the method of choice for solving a wide range of machine learning tasks. The enormous computational demand posed by DNNs is a key challenge for computing system designers and has most commonly been addressed through the design of DNN accelerators. However, these specialized accelerators, which utilize large quantities of multiply-accumulate units and on-chip memory, are prohibitive in area- and cost-constrained systems such as wearable devices and IoT sensors. In this work, we take...

10.1109/tc.2018.2879434 article EN IEEE Transactions on Computers 2018-11-06

Fixed-point implementations (FxP) are prominently used to realize Deep Neural Networks (DNNs) efficiently on energy-constrained platforms. The choice of bit-width is often constrained by the ability of FxP to represent the entire range of numbers in a data structure with sufficient resolution. At low bit-widths (<8 bits), state-of-the-art DNNs invariably suffer a loss in classification accuracy due to quantization/saturation errors.

10.1145/3316781.3317783 article EN 2019-05-23
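The range-versus-resolution tension can be shown in a few lines: with a fixed word length, allocating more fractional bits reduces rounding error on small values but increases saturation error on large ones. The 8-bit formats and the synthetic heavy-tailed data below are illustrative only, not the paper's experiments.

```python
import numpy as np

# Compare fixed-point error for different splits of an 8-bit word between
# integer and fractional bits on data with a few large outliers.
def fxp_error(x, total_bits, frac_bits):
    scale = 2 ** frac_bits
    qmin, qmax = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
    q = np.clip(np.round(x * scale), qmin, qmax) / scale
    return np.mean((x - q) ** 2)

rng = np.random.default_rng(5)
x = np.concatenate([rng.normal(scale=0.5, size=9000),    # mostly small values...
                    rng.normal(scale=8.0, size=1000)])   # ...plus a heavy-tailed minority

for frac_bits in (2, 4, 6):
    print(f"8-bit word, {frac_bits} fractional bits: MSE = {fxp_error(x, 8, frac_bits):.4f}")
```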

Deep Neural Networks (DNNs) represent the state-of-the-art in many Artificial Intelligence (AI) tasks involving images, videos, text, and natural language. Their ubiquitous adoption is limited by the high computation and storage requirements of DNNs, especially for energy-constrained inference at the edge using wearable and IoT devices. One promising approach to alleviating these computational challenges is implementing DNNs in low-precision fixed-point (<16 bits) representations. However, the quantization error inherent in any...

10.1109/dac.2018.8465893 article EN 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC) 2018-06-01

Memory Augmented Neural Networks (MANNs) enhance a deep neural network with an external differentiable memory, enabling them to perform complex tasks well beyond the capabilities of conventional networks. We identify a unique challenge that arises in MANNs due to soft reads and writes, each of which requires access to all memory locations. This characteristic of MANN workloads severely limits their performance on CPUs, GPUs, and classical DNN accelerators. We present the first effort to design a hardware architecture that improves...

10.1145/3316781.3317935 article EN 2019-05-23
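A minimal sketch of why soft accesses are costly: a soft read is an attention-weighted sum over every memory slot, and a soft write blends an update into every slot, so each access touches the entire memory. The sizes and the cosine-similarity addressing below are illustrative assumptions, not the paper's design.

```python
import numpy as np

# Differentiable ("soft") read and write over an external memory matrix.
rng = np.random.default_rng(6)
N_SLOTS, WIDTH = 128, 64
memory = rng.normal(size=(N_SLOTS, WIDTH))

def soft_read(key):
    sims = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)
    w = np.exp(sims) / np.exp(sims).sum()    # attention weights over all slots
    return w @ memory, w                     # weighted sum touches every location

def soft_write(weights, erase, add):
    global memory
    # every slot is partially erased and partially updated, in proportion to its weight
    memory = memory * (1 - np.outer(weights, erase)) + np.outer(weights, add)

key = rng.normal(size=WIDTH)
read_vec, w = soft_read(key)
soft_write(w, erase=np.full(WIDTH, 0.1), add=rng.normal(size=WIDTH))
print("read vector norm:", np.linalg.norm(read_vec).round(3))
```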

We propose a non-volatile memory based on cross-coupled reconfigurable ferroelectric transistors (R-FEFETs) that features differential read along with low-power computation-in-memory (CiM). Exploiting the dynamic modulation of hysteresis in R-FEFETs, we achieve the aforementioned functionalities with just 2 access transistors (in addition to the R-FEFETs). The proposed design not only enhances the sense margin during read, but also enables natural computation of AND and NOR logic functions between two bits stored in the array,...

10.1109/islped.2019.8824948 article EN 2019-07-01

Intrinsic application resilience, a property exhibited by many emerging domains, allows designers to optimize computing platforms by approximating selected computations within an application without any perceivable loss in its output quality. At the circuit level, this is often achieved by designing circuits that are more efficient but realize slightly modified functionality. Most prior efforts on approximate circuit design hardwire the degree of approximation into the implementation. This severely limits their...

10.3850/9783981537079_0416 article EN 2016-01-01

In-memory computing is a promising approach to alleviating the processor-memory data transfer bottleneck in computing systems. While spintronics has attracted great interest as a non-volatile memory technology, recent work has shown that its unique properties can also enable in-memory computing. We summarize efforts in this direction, and describe three different designs that enhance STT-MRAM to perform logic, arithmetic, and vector operations and evaluate transcendental functions within memory arrays.

10.23919/date.2018.8342277 article EN Design, Automation & Test in Europe Conference & Exhibition (DATE) 2018-03-01

The rapid emergence of AI models, specifically large language models (LLMs) requiring large amounts of compute, drives the need for dedicated inference hardware. During deployment, compute utilization (and thus power consumption) can vary significantly across the layers of a model, the number of tokens, precision, and batch size [1]. Such wide variation, which may occur at fast time scales, poses unique challenges in optimizing performance within the system-level specifications of discrete accelerator cards, including...

10.1109/isscc49657.2024.10454301 article EN IEEE International Solid-State Circuits Conference (ISSCC) 2024-02-18