Borivoje Nikolić

ORCID: 0000-0003-2324-1715
Research Areas
  • Low-power high-performance VLSI design
  • Advancements in Semiconductor Devices and Circuit Design
  • Analog and Mixed-Signal Circuit Design
  • Semiconductor materials and devices
  • Radio Frequency Integrated Circuit Design
  • VLSI and FPGA Design Techniques
  • Parallel Computing and Optimization Techniques
  • VLSI and Analog Circuit Testing
  • Cooperative Communication and Network Coding
  • Advancements in PLL and VCO Technologies
  • Advanced Wireless Communication Techniques
  • Embedded Systems Design Techniques
  • Error Correcting Code Techniques
  • Advanced Memory and Neural Computing
  • Ferroelectric and Negative Capacitance Devices
  • Radiation Effects in Electronics
  • Advanced MIMO Systems Optimization
  • Integrated Circuits and Semiconductor Failure Analysis
  • Millimeter-Wave Propagation and Modeling
  • Microwave Engineering and Waveguides
  • Advanced Data Storage Technologies
  • Electromagnetic Compatibility and Noise Suppression
  • Wireless Communication Security Techniques
  • Algorithms and Data Compression
  • Interconnection Networks and Systems

University of California, Berkeley
2016-2025

Berkeley College
2009-2024

University of California System
2012-2022

Associazione Medici Diabetologi
2021

Institut Supérieur d'Électronique de Paris
2016

Agilent Technologies (United States)
2013

Berkeley Systems (United States)
2004

University of California, Davis
1997-2003

Texas Instruments (United States)
2001-2003

Institute for Technology of Nuclear and other Mineral Raw Materials
2000

Design and experimental evaluation of a new sense-amplifier-based flip-flop (SAFF) is presented. It was found that the main speed bottleneck of existing SAFFs is the cross-coupled set-reset (SR) latch in the output stage. The new flip-flop uses a new output-stage latch topology that significantly reduces delay and improves driving capability. The performance of this flip-flop is verified by measurements on a test chip implemented in 0.18 µm effective channel length CMOS. The demonstrated speed places it among the fastest flip-flops used in state-of-the-art processors. Measurement...

10.1109/4.845191 article EN IEEE Journal of Solid-State Circuits 2000-06-01

Continued improvement in computing efficiency requires functional specialization of hardware designs. Agile design methodologies have been proposed to alleviate the increased costs of custom silicon architectures, but their practice thus far has been accompanied by challenges in the integration and validation of complex systems-on-a-chip (SoCs). We present the Chipyard framework, an integrated SoC design, simulation, and implementation environment for specialized compute systems. Chipyard includes configurable, composable,...

10.1109/mm.2020.2996616 article EN publisher-specific-oa IEEE Micro 2020-05-22

We present FireSim, an open-source simulation platform that enables cycle-exact microarchitectural simulation of large scale-out clusters by combining FPGA-accelerated simulation of silicon-proven RTL designs with a scalable, distributed network simulation. Unlike prior tools, FireSim runs on Amazon EC2 F1, a public cloud FPGA platform, which greatly improves usability, provides elasticity, and lowers the cost of large-scale FPGA-based experiments. We describe the design and implementation of FireSim and show how it can provide sufficient...

10.1109/isca.2018.00014 article EN 2018-06-01

DNN accelerators are often developed and evaluated in isolation without considering the cross-stack, system-level effects in real-world environments. This makes it difficult to appreciate the impact of System-on-Chip (SoC) resource contention, OS overheads, and programming-stack inefficiencies on overall performance/energy-efficiency. To address this challenge, we present Gemmini, an open-source, full-stack DNN accelerator generator. Gemmini generates a wide design-space of efficient ASIC accelerators from a flexible...

10.1109/dac18074.2021.9586216 article EN 2021-11-08

A 1.8-V 14-b 12-MS/s pseudo-differential pipeline analog-to-digital converter (ADC) using a passive capacitor error-averaging technique and a nested CMOS gain-boosting technique is described. The ADC is optimized for low-voltage low-power applications by applying an optimum stage-scaling algorithm at the architectural level and opamp and comparator sharing at the circuit level. Prototyped in a 0.18-µm 6M-1P CMOS process, this ADC achieves a peak signal-to-noise plus distortion ratio (SNDR) of 75.5 dB and a 103-dB spurious-free dynamic...

10.1109/jssc.2004.836232 article EN IEEE Journal of Solid-State Circuits 2004-11-30

The class of low-density parity-check (LDPC) codes is attractive, since such codes can be decoded using practical message-passing algorithms, and their performance is known to approach the Shannon limits for suitably large block lengths. For the intermediate block lengths relevant in applications, however, many LDPC codes exhibit a so-called "error floor," corresponding to a significant flattening in the curve that relates...

10.1109/tit.2009.2034781 article EN IEEE Transactions on Information Theory 2010-01-01

We present an adaptive digital technique to calibrate pipelined analog-to-digital converters (ADCs). Rather than achieving linearity by adjustment of analog component values, the new approach infers component errors from conversion results and applies digital postprocessing to correct those results. The scheme proposed here draws a close analogy to the channel equalization problem commonly encountered in digital communications. We show that, with the help of a slow but accurate ADC, a code-domain finite-impulse-response filter is sufficient...

10.1109/tcsi.2003.821306 article EN IEEE Transactions on Circuits and Systems I Fundamental Theory and Applications 2004-01-01

This paper presents methods for efficient energy-performance optimization at the circuit and micro-architectural levels. The optimal balance between energy and performance is achieved when the sensitivity of energy to a change in performance is equal for all design variables. The sensitivity-based optimizations minimize energy subject to a delay constraint. Energy savings of about 65% can be achieved without delay penalty with equalization of sensitivities to sizing, supply, and threshold voltage in a 64-bit adder, compared to the reference design sized for minimum delay. Circuit optimization is effective...

10.1109/jssc.2004.831796 article EN IEEE Journal of Solid-State Circuits 2004-07-27

This paper presents a power- and area-efficient 24-way time-interleaved successive-approximation-register (SAR) analog-to-digital converter (ADC) that achieves 2.8 GS/s and 8.1 ENOB in 65 nm CMOS. To minimize the power and area, the capacitors in the capacitive DAC are sized to meet the thermal noise requirements rather than the matching requirements, leading to an LSB capacitance of 50 aF. An on-chip digital background calibration is used to calibrate the capacitor mismatches in individual ADC channels, as well as the inter-channel offset,...

10.1109/jssc.2013.2239005 article EN IEEE Journal of Solid-State Circuits 2013-01-28

Increased process variability presents a major challenge for future SRAM scaling. Fast and accurate validation of SRAM read stability and writeability margins is crucial for estimating yield in large SRAM arrays. Conventional read/write metrics are characterized through test structures that are able to provide only limited hardware measurement data and cannot be used to investigate cell bit fails in functional SRAM arrays. This work presents a method for large-scale characterization of read stability and writeability in functional SRAM arrays using direct bit-line measurements. A test chip is implemented in 45 nm CMOS...

10.1109/jssc.2009.2032698 article EN IEEE Journal of Solid-State Circuits 2009-11-01

Large arrays of radios have been exploited for beamforming and null steering in both radar and communication applications, but cost and form factor limitations have precluded their use in commercial systems. This paper discusses how to build arrays that enable multiuser massive multiple-input-multiple-output (MIMO) and aggressive spatial multiplexing with many users sharing the same spectrum. The focus is on the energy- and cost-efficient realization of these arrays in order to enable new applications. Distributed algorithms are proposed, and the optimum...

10.1109/jproc.2015.2492539 article EN publisher-specific-oa Proceedings of the IEEE 2015-12-17

A grouped-parallel low-density parity-check (LDPC) decoder is designed for the (2048,1723) Reed-Solomon-based LDPC (RS-LDPC) code suitable for 10GBASE-T Ethernet. A two-step decoding scheme reduces the wordlength to 4 bits while lowering the error floor below $10^{-14}$ BER. The proposed post-processor is conveniently integrated with the decoder,...

10.1109/jssc.2010.2042255 article EN IEEE Journal of Solid-State Circuits 2010-03-24

Domain specialization under energy constraints in deeply-scaled CMOS has been driving the need for agile development of Systems on a Chip (SoCs). While digital subsystems have design flows that are conducive to rapid iterations from specification to layout, analog and mixed-signal modules face the challenge of a long human-in-the-middle iteration loop that requires expert intuition to verify that post-layout circuit parameters meet the original specification. Existing automated solutions that optimize circuit parameters for a given target...

10.23919/date48585.2020.9116200 article EN Design, Automation & Test in Europe Conference & Exhibition (DATE), 2015 2020-03-01

The final phase of CMOS technology scaling provides continued increases in already vast transistor counts, but only minimal improvements in energy efficiency, thus requiring innovation in circuits and architectures. However, even huge teams are struggling to complete large, complex designs on schedule using traditional rigid development flows. This article presents an agile hardware development methodology, which the authors adopted for 11 RISC-V microprocessor tape-outs on modern 28-nm and 45-nm CMOS processes in the past five...

10.1109/mm.2016.11 article EN IEEE Micro 2016-03-01

We present BAG2, a framework for the development of process-portable Analog and Mixed Signal (AMS) circuit generators. Such generators are parametrized design procedures that produce schematics, layouts, and verification testbenches for a circuit, given input specifications. This paper expands on previous work by introducing universal AMS [...] into [...], as well as two new layout engines, XBase and Laygo, that enable [...]. We have developed various complex driving examples, including a time-interleaved SAR ADC and a SerDes transceiver frontend....

10.1109/cicc.2018.8357061 article EN 2022 IEEE Custom Integrated Circuits Conference (CICC) 2018-04-01

The design and experimental evaluation of a clocked adiabatic logic (CAL) is described in this paper. CAL is a dual-rail logic that operates from a single-phase AC power-clock supply. This new low-energy logic makes it possible to integrate all power control circuitry on the chip, resulting in better system efficiency, lower cost, and simpler power distribution. CAL can also be operated from a DC power supply in a nonenergy-recovery mode compatible with standard CMOS logic. In the adiabatic mode, the power-clock waveform is generated using an on-chip switching transistor and a small...

10.1109/92.863629 article EN IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2000-08-01

Intrinsic variations and challenging leakage control in today's bulk-Si MOSFETs limit the scaling of SRAM. Design tradeoffs of six-transistor (6-T) and four-transistor (4-T) SRAM cells are presented in this work. It is found that 6-T and 4-T FinFET-based SRAM cells designed with built-in feedback achieve significant improvements in the cell static noise margin (SNM) without area penalty. Up to 2x improvement in SNM can be achieved in these cells. A sub-100pA per-cell standby current offer similar as feedback, making them attractive...

10.1145/1077603.1077607 article EN 2005-01-01

Dual-supply voltage design using a clustered voltage scaling (CVS) scheme is an effective approach to reduce chip power. The optimal CVS design relies on a level converter implemented in a flip-flop to minimize energy, delay, and area penalties due to level conversion. Additionally, circuit robustness against supply bounce is a key property that differentiates a good design. Novel flip-flops presented in this paper incorporate a half-latch and a precharged level converter. These flip-flops are optimized in the energy-delay space to achieve over 30% reduction of...

10.1109/tvlsi.2003.821548 article EN IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2004-02-01

Two decoding schedules and the corresponding serialized architectures for low-density parity-check (LDPC) decoders are presented. They are applied to codes with parity-check matrices generated either randomly or using geometric properties of elements in Galois fields. Both have low computational requirements. The original concurrent decoding schedule has a large storage requirement that is dependent on the total number of edges in the underlying bipartite graph, while a new, staggered decoding schedule, which uses an approximation of the belief propagation,...

10.1109/glocom.2001.965981 article EN 2002-11-13

Many classes of high-performance low-density parity-check (LDPC) codes are based on parity check matrices composed of permutation submatrices. We describe the design of a parallel-serial decoder architecture that can be used to map any LDPC code with such a structure to a hardware emulation platform. High-throughput emulation allows for the exploration of the low bit-error rate (BER) region and provides statistics of the error traces, which illuminate the causes of the error floors of the (2048, 1723) Reed-Solomon based LDPC (RS-LDPC) code and the (2209, 1978) array-based LDPC code....

10.1109/tcomm.2009.11.080105 article EN IEEE Transactions on Communications 2009-11-01

A methodology for energy–delay optimization of digital circuits is presented. This methodology is applied to minimizing the delay of representative carry-lookahead adders under energy constraints. The impact of various design choices, including the carry-lookahead tree structure and logic style, is analyzed in the energy–delay space and verified through optimization. The result is demonstrated on the fastest adder found, a 240-ps Ling sparse domino adder in 1 V, 90 nm CMOS....

10.1109/jssc.2008.2010795 article EN IEEE Journal of Solid-State Circuits 2009-01-29

The error-correcting performance of low-density parity check (LDPC) codes, when decoded using practical iterative decoding algorithms, is known to be close to Shannon limits for codes with suitably large blocklengths. A substantial limitation to the use of finite-length LDPC codes is the presence of an error floor in the low frame error rate (FER) region. This paper develops a deterministic method of predicting error floors, based on high signal-to-noise ratio (SNR) asymptotics, applied to absorbing sets within structured LDPC codes. The approach...

10.1109/jsac.2009.090809 article EN IEEE Journal on Selected Areas in Communications 2009-07-29

A test-chip in a low-power 45 nm technology, featuring uniaxial strained-Si, has been built to study variability in CMOS circuits. Systematic layout-induced variation, die-to-die (D2D), wafer-to-wafer (W2W) and within-die (WID) variability have been measured over multiple wafers, analyzed, and attributed to likely causes in the manufacturing process. Delay is characterized using an array of ring oscillators and transistor leakage...

10.1109/jssc.2009.2022217 article EN IEEE Journal of Solid-State Circuits 2009-07-28