Wojciech Romaszkan

ORCID: 0000-0003-0906-7079
About
Research Areas
  • Stochastic Gradient Optimization Techniques
  • Advanced Neural Network Applications
  • Advanced Memory and Neural Computing
  • Error Correcting Code Techniques
  • Neural Networks and Applications
  • Transportation Systems and Safety
  • Ferroelectric and Negative Capacitance Devices
  • Cryptographic Implementations and Security
  • Green IT and Sustainability
  • Chaos-based Image/Signal Encryption
  • Anomaly Detection Techniques and Applications
  • Network Packet Processing and Optimization
  • Mobile Ad Hoc Networks
  • Machine Learning and ELM
  • Coding theory and cryptography
  • Age of Information Optimization
  • Network Time Synchronization Technologies
  • CCD and CMOS Imaging Sensors
  • Mining and Industrial Processes
  • Power Line Communications and Noise
  • Advanced Data Storage Technologies
  • Radiation Effects in Electronics
  • Engine and Fuel Emissions
  • Brain Tumor Detection and Classification
  • Real-Time Systems Scheduling

University of California, Los Angeles
2020-2025

Amazon (United States)
2023

As privacy and latency requirements force a move towards edge Machine Learning inference, resource-constrained devices are struggling to cope with large, computationally complex models. For Convolutional Neural Networks, those limitations can be overcome by taking advantage of enormous data reuse opportunities and amenability to reduced precision. To do that, however, a level of compute density unattainable for conventional binary arithmetic is required. Stochastic Computing can deliver such density, but it...

10.23919/date48585.2020.9116289 article EN Design, Automation & Test in Europe Conference & Exhibition (DATE), 2015 2020-03-01
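The compute density mentioned above comes from how Stochastic Computing represents and multiplies values. A minimal Python sketch of that idea (illustrative only, not the paper's hardware; all names here are made up): values in [0, 1] become random bit streams, and a single AND gate per bit pair performs multiplication.

```python
import numpy as np

def to_stream(p, length, rng):
    """Encode a value p in [0, 1] as a unipolar stochastic bit stream:
    each bit is 1 with probability p."""
    return (rng.random(length) < p).astype(np.uint8)

def sc_multiply(a, b, length=1024, seed=0):
    """Multiply two values in [0, 1] with one AND per bit pair.
    The mean of the resulting stream estimates a * b."""
    rng = np.random.default_rng(seed)
    sa = to_stream(a, length, rng)
    sb = to_stream(b, length, rng)
    return np.mean(sa & sb)

print(sc_multiply(0.5, 0.8))   # ~0.4; accuracy improves with stream length
print(0.5 * 0.8)               # exact reference
```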

10.1109/vlsitechnologyandcir46783.2024.10631484 article EN 2022 IEEE Symposium on VLSI Technology and Circuits (VLSI Technology and Circuits) 2024-06-16

Deep learning has grown in capability and size in recent years, prompting research on alternative computing methods to cope with the increased compute cost. Stochastic Computing (SC) promises higher efficiency through its compact arithmetic units, but accuracy issues have prevented its wide adoption, and accuracy-improving techniques have sacrificed runtime or training performance. In this work, we propose Range-Extended SC Accumulation to deal with the accuracy limitations of SC. By modifying the functionality of OR-based accumulation, we increase the computation...

10.1109/tcad.2023.3284289 article EN IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 2023-06-08
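The paper's exact accumulation scheme is not reproduced here, but the limitation it targets can be sketched: plain OR-based accumulation of unipolar streams approximates 1 - prod(1 - p_i) rather than the true sum, so it saturates as values accumulate. A small illustrative Python sketch (stream length and names are assumptions):

```python
import numpy as np

def to_stream(p, length, rng):
    """Unipolar encoding: each bit is 1 with probability p."""
    return (rng.random(length) < p).astype(np.uint8)

def or_accumulate(values, length=4096, seed=0):
    """Accumulate streams with a bitwise OR and read back the mean.
    Approximates 1 - prod(1 - p_i), i.e. a saturating, not exact, sum."""
    rng = np.random.default_rng(seed)
    acc = np.zeros(length, dtype=np.uint8)
    for p in values:
        acc |= to_stream(p, length, rng)
    return acc.mean()

vals = [0.2, 0.3, 0.25, 0.15]
print(sum(vals))            # exact sum: 0.9
print(or_accumulate(vals))  # ~0.64, the saturation that range extension addresses
```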

We present the first programmable and precision-tunable Stochastic Computing (SC) neural network (NN) inference accelerator. The use of SC makes it possible to achieve a multiply-accumulate (MAC) density of 38.4k MAC/mm2, enabling a level of spatial data reuse unachievable in conventional, fixed-point architectures. This extensive reuse amortizes the cost of stream conversion and reduces the number of memory accesses, which can otherwise consume significant energy and add latency. Our accelerator is a stand-alone architecture, with custom...

10.1109/lssc.2022.3200064 article EN IEEE Solid-State Circuits Letters 2022-01-01

Stochastic computing (SC) has seen a renaissance in recent years as a means for machine learning acceleration due to its compact arithmetic and approximation properties. Still, SC accuracy remains an issue, with prior works either not fully utilizing the computational density or suffering from significant accuracy losses. In this work, we propose GEO - a Generation and Execution Optimized Stochastic Computing Accelerator for Neural Networks, which optimizes the stream generation and execution components of SC and bridges the gap between...

10.23919/date51398.2021.9473911 article EN Design, Automation & Test in Europe Conference & Exhibition (DATE), 2015 2021-02-01
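GEO's specific generation and execution optimizations are not reproduced here; as a hedged illustration of why stream generation matters for SC accuracy, the sketch below compares an AND-based multiply that uses independent random sequences against one that reuses the same sequence for both operands, where the resulting correlation collapses the product toward min(a, b).

```python
import numpy as np

def sc_mul(a, b, length=4096, share_rng=False, seed=0):
    """AND-based stochastic multiply with independent or shared random sequences.
    Sharing the sequence correlates the streams and biases the result toward min(a, b)."""
    rng = np.random.default_rng(seed)
    r1 = rng.random(length)
    r2 = r1 if share_rng else rng.random(length)
    sa = (r1 < a)
    sb = (r2 < b)
    return np.mean(sa & sb)

print(sc_mul(0.6, 0.7, share_rng=False))  # ~0.42 = 0.6 * 0.7
print(sc_mul(0.6, 0.7, share_rng=True))   # ~0.60 = min(0.6, 0.7), a large correlation error
```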

As the adoption of Neural Networks continues to proliferate across different classes of applications and systems, edge devices have been left behind. Their strict energy and storage limitations make them unable to cope with the sizes of common network models. While many compression methods, such as precision reduction and sparsity, have been proposed to alleviate this, they don’t go quite far enough. To push model size to its absolute limits, we combine binarization and sparsity in Pruned-Permuted-Packed XNOR networks (3PXNet), which can be efficiently implemented...

10.1145/3371157 article EN ACM Transactions on Embedded Computing Systems 2020-01-31
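The packed XNOR arithmetic behind binarized networks like 3PXNet can be illustrated with a short sketch (this is not the 3PXNet implementation; packing width and helper names are assumptions): values in {-1, +1} are packed into machine words, and the dot product reduces to XNOR plus popcount.

```python
import numpy as np

def pack_bits(x):
    """Map a {-1, +1} vector to a packed bit array (a 1 bit encodes +1)."""
    return np.packbits((x > 0).astype(np.uint8))

def xnor_dot(xp, wp, n):
    """Binary dot product over n elements: popcount of XNOR, rescaled to +/-1 arithmetic."""
    xnor = ~(xp ^ wp)                        # XNOR of packed words
    matches = np.unpackbits(xnor)[:n].sum()  # popcount over the valid n bits
    return 2 * int(matches) - n              # matches minus mismatches

rng = np.random.default_rng(0)
x = rng.choice([-1, 1], size=64)
w = rng.choice([-1, 1], size=64)
print(xnor_dot(pack_bits(x), pack_bits(w), 64))  # equals the exact dot product below
print(int(np.dot(x, w)))
```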

Stochastic computing (SC) has recently emerged as a promising method for efficient machine learning acceleration. Its high compute density, affinity with dense linear algebra primitives, and approximation properties have an uncanny level of synergy with deep neural network computational requirements. However, there is a conspicuous lack of works trying to integrate SC hardware with sparsity awareness, which has brought significant performance improvements to conventional architectures. In this work, we...

10.1109/tcad.2022.3197503 article EN IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 2022-08-09
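The proposed sparsity-aware SC hardware is not reproduced here; the sketch below only illustrates the general idea the abstract points at: in a stream-based dot product, zero weights contribute nothing, so their stream generation and AND/accumulate work can be skipped entirely. Stream length, the software-style accumulation, and all names are assumptions.

```python
import numpy as np

def sparse_sc_dot(acts, weights, length=2048, seed=0):
    """Approximate dot(acts, weights) for values in [0, 1], skipping zero weights.
    Each nonzero product is estimated by AND-ing two stochastic streams."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for a, w in zip(acts, weights):
        if w == 0.0:          # sparsity awareness: no streams generated, no gates toggled
            continue
        sa = rng.random(length) < a
        sw = rng.random(length) < w
        total += np.mean(sa & sw)
    return total

acts = [0.9, 0.1, 0.5, 0.7]
wts  = [0.4, 0.0, 0.0, 0.6]   # 50% sparse: half the MACs are skipped
print(sparse_sc_dot(acts, wts))               # ~0.78
print(sum(a * w for a, w in zip(acts, wts)))  # exact: 0.78
```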

High compute density improves data reuse and is key to reducing off-chip memory accesses and achieving high energy efficiency in ML accelerators. Compute-in-Memory (CIM) promises such density but requires ADCs and DACs that add to the macro's area [1], [2], limiting its density. Besides, CIM's analog compute is sensitive to process variability and mismatches. Transistor nonlinearity also significantly degrades accuracy. Stochastic Computing (SC), which represents numbers as the probability of 1s in random binary streams, is a digital-compute...

10.1109/a-sscc56115.2022.9980613 article EN 2022 IEEE Asian Solid-State Circuits Conference (A-SSCC) 2022-11-06

Researchers have long touted a vision of the future enabled by a proliferation of internet-of-things devices, including smart sensors, homes, and cities. Increasingly, embedding intelligence in such devices involves the use of deep neural networks. However, their storage and processing requirements make them prohibitive for cheap, off-the-shelf platforms. Overcoming those limitations is necessary for enabling such widely-applicable devices. While many ways of making models smaller and more efficient have been developed, there is a lack...

10.48550/arxiv.2310.07940 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Quantization is spearheading the increase in performance and efficiency of neural network computing systems that are making headway into commodity hardware. We present SWIS - Shared Weight bIt Sparsity, a quantization framework for efficient inference acceleration delivering improved storage compression through an offline weight decomposition and scheduling algorithm. SWIS can achieve up to a 54.3% (19.8%) point accuracy improvement compared to truncation when quantizing MobileNet-v2 to 4 (2) bits post-training (with...

10.48550/arxiv.2103.01308 preprint EN other-oa arXiv (Cornell University) 2021-01-01
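SWIS's decomposition and scheduling algorithm is not reproduced here; as background for the comparison the abstract makes, the sketch below shows the simple post-training truncation baseline it is measured against, plus the bit-serial view of a weight as a sum of shifted bit-slices that bit-sparsity methods build on. Function names and the 8-bit starting point are assumptions.

```python
import numpy as np

def truncate_weights(w8, bits):
    """Keep only the top `bits` of signed 8-bit weights by zeroing low-order bits
    (the simple post-training baseline the abstract compares against)."""
    shift = 8 - bits
    return (w8.astype(np.int32) >> shift) << shift

def bit_slices(w8, bits=8):
    """Decompose 8-bit weight magnitudes into bit-planes, so that
    |w| = sum_k slice_k * 2**k; bit-serial schemes process one slice per pass."""
    mags = np.abs(w8).astype(np.uint8)
    return [(mags >> k) & 1 for k in range(bits)]

w = np.array([97, -45, 3, -120], dtype=np.int8)
print(truncate_weights(w, 4))   # coarse 4-bit versions of the weights
slices = bit_slices(w)
recon = sum(s.astype(np.int32) * (1 << k) for k, s in enumerate(slices))
print(recon * np.sign(w))       # magnitudes recombined and signs reapplied: original weights
```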