Radhika Jain
- Parallel Computing and Optimization Techniques
- Ferroelectric and Negative Capacitance Devices
- Advancements in Semiconductor Devices and Circuit Design
- Advanced Neural Network Applications
- VLSI and Analog Circuit Testing
- Low-Power High-Performance VLSI Design
IBM Research - Thomas J. Watson Research Center
2021-2024
Low-precision computation is the key enabling factor for achieving high compute densities (TOPS/W and TOPS/mm²) in AI hardware accelerators across cloud and edge platforms. However, robust deep learning (DL) model accuracy equivalent to that of high-precision computation must be maintained. Improvements in bandwidth, architecture, and power management are also required to harness the benefit of reduced precision by feeding and supporting...
Reduced-precision computation is a key enabling factor for energy-efficient acceleration of deep learning (DL) applications. This article presents a 7-nm four-core mixed-precision artificial intelligence (AI) chip that supports four compute precisions—FP16, Hybrid-FP8 (HFP8), INT4, and INT2—to meet the diverse application demands of training and inference. The chip leverages cutting-edge algorithmic advances to demonstrate leading-edge power efficiency for 8-bit floating-point (FP8) training and INT4 inference without...
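The abstract above refers to INT4 inference. As a rough illustration of the underlying idea (not the chip's actual datapath or the paper's quantization scheme), here is a minimal sketch of symmetric per-tensor INT4 quantization, in which FP32 values are mapped to 4-bit integer codes in [-8, 7] via a single scale factor:

```python
import numpy as np

def quantize_int4(x, scale=None):
    """Symmetric per-tensor quantization to the INT4 range [-8, 7]."""
    if scale is None:
        # Map the largest magnitude in the tensor to code 7.
        scale = np.max(np.abs(x)) / 7.0
    q = np.clip(np.round(x / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate FP32 values from INT4 codes."""
    return q.astype(np.float32) * scale

# Hypothetical weight values for illustration only.
w = np.array([0.9, -0.35, 0.05, -0.7], dtype=np.float32)
q, s = quantize_int4(w)
w_hat = dequantize(q, s)
print(q)                           # 4-bit codes stored in int8, e.g. [ 7 -3  0 -5]
print(np.max(np.abs(w - w_hat)))   # rounding error is bounded by scale/2
```

Mixed-precision accelerators of this kind keep weights and activations in such compact integer formats so that multiply-accumulate arrays can pack far more operations per watt than FP16/FP32 datapaths, while accuracy is preserved by careful choice of scales (often per-channel rather than per-tensor).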
The rapid emergence of AI models, specifically large language models (LLMs) requiring vast amounts of compute, drives the need for dedicated inference hardware. During deployment, compute utilization (and thus power consumption) can vary significantly across the layers of a model, the number of tokens, precision, and batch size [1]. Such wide variation, which may occur at fast time scales, poses unique challenges in optimizing performance within the system-level specifications of discrete accelerator cards, including...