- Low-power high-performance VLSI design
- Semiconductor materials and devices
- Advancements in Semiconductor Devices and Circuit Design
- 3D IC and TSV technologies
- Parallel Computing and Optimization Techniques
- Electromagnetic Compatibility and Noise Suppression
- VLSI and Analog Circuit Testing
- Radio Frequency Integrated Circuit Design
- Advancements in PLL and VCO Technologies
- VLSI and FPGA Design Techniques
- Embedded Systems Design Techniques
- Interconnection Networks and Systems
- Integrated Circuits and Semiconductor Failure Analysis
- Analog and Mixed-Signal Circuit Design
- Semiconductor materials and interfaces
- Silicon Carbide Semiconductor Technologies
- Ferroelectric and Negative Capacitance Devices
- Mechanical and Optical Resonators
- Electrostatic Discharge in Electronics
- Copper Interconnects and Reliability
- Distributed and Parallel Computing Systems
- Advanced Memory and Neural Computing
- Advanced Battery Technologies Research
- Radiation Effects in Electronics
- Force Microscopy Techniques and Applications
IBM (United States)
2010-2025
IBM Research - Thomas J. Watson Research Center
1997-2022
IBM Research - Austin
2002-2014
University of California, Berkeley
2003
University of Illinois Urbana-Champaign
1982-1986
Short, medium, and long on-chip interconnections having linewidths of 0.45-52 µm are analyzed in a five-metal-layer structure. We study capacitive coupling for short lines, inductive coupling for medium-length lines, inductance and resistance of the current return path in the power buses, and line resistive losses for the global wiring. Design guidelines and technology changes are proposed to achieve minimum delay and contain crosstalk for local and global wiring. Conditional expressions are given to determine when transmission-line effects are important for accurate delay and crosstalk prediction.
A global clock distribution strategy used on several microprocessor chips is described. The clock network consists of buffered tunable trees or treelike networks, with the final level of trees all driving a single common grid covering most of the chip. This topology combines the advantages of both trees and grids. A new tuning method was required to efficiently tune such a large, strongly connected interconnect network consisting of up to 6 m of wire and modeled with 50000 resistors, capacitors, and inductors. Variations are described to handle different...
A shift-and-ratio method for extracting MOSFET channel length is presented. In this method, mobility can be any function of gate voltage, and high source-drain resistance does not affect the extraction results. It is shown to yield more accurate and consistent channel lengths for deep-submicrometer CMOS devices at room and low temperatures. It is also found that, for both nFET and pFET, the source-drain resistance is essentially independent of temperature from 300 K to 77 K.
The advances in the growth of pseudomorphic silicon-germanium epitaxial layers, combined with the strong need for high-speed complementary circuits, have led to increased interest in silicon-based heterojunction field-effect transistors. Metal-oxide-semiconductor field-effect transistors (MOSFET's) with SiGe channels are guided by different design rules than state-of-the-art silicon MOSFET's. The selection of the transistor gate material, the optimization of the SiGe channel profile, the method of threshold voltage adjustment, and the silicon-cap...
The first device performance results are presented from experiments designed to assess FET technology feasibility in the 0.1-µm gate-length regime. Low-temperature design considerations for these dimensions lead to a 0.15-V threshold and a 0.6-V power supply, with a forward-biased substrate. Self-aligned and almost fully scaled devices and simple circuits were fabricated by direct-write electron-beam lithography at all levels, with gate lengths down to 0.07 µm. Measured device characteristics yielded over 750-mS/mm...
A novel subsurface SiGe-channel p-MOSFET is demonstrated in which modulation doping is used to control the threshold voltage without degrading the channel mobility. A device design consisting of a graded SiGe channel, an n⁺ polysilicon gate, and p⁺ modulation doping is used. A boron-doped layer is located underneath the undoped SiGe channel to minimize process sensitivity and maximize transconductance. Low-field hole mobilities of 220 cm²/V·s at 300 K and 980 cm²/V·s at 82 K were achieved in functional submicrometer p-MOSFETs.
The IBM POWER4 processor is a 174-million-transistor chip that runs at a clock frequency of greater than 1.3 GHz. It contains two microprocessor cores, high-speed buses, and an on-chip memory subsystem. The complexity and size of POWER4, together with its high operating frequency, presented a number of significant challenges for its multi-site design team. This paper describes the circuit and physical design of POWER4 and gives the results that were achieved. Emphasis is placed on aspects of the design methodology, clock distribution, circuits, power, integration,...
POWER5 offers significantly increased performance over previous POWER designs by incorporating simultaneous multithreading, an enhanced memory subsystem, and extensive RAS and power management support. The 276M-transistor processor is implemented in 130nm silicon-on-insulator technology with 8 levels of Cu metallization and operates at >1.5 GHz.
The fourth-generation POWER processor chip contains 170M transistors and includes 2 microprocessor cores, a shared L2, the directory for an off-chip L3, and all the logic needed to interconnect multiple chips to form an SMP. It is implemented in a 0.18 µm SOI technology with 7 layers of Cu interconnect, functions in systems at 1.1 GHz, and dissipates 115 W at 1.5 V.
Clock distribution has become an increasingly challenging problem for VLSI designs, consuming an increasing fraction of resources such as wiring, power, and design time. Unwanted differences or uncertainties in clock network delays degrade performance or cause functional errors. Three dramatically different strategies being used in the industry to address these challenges are compared. Novel modeling and measurement techniques are used to investigate on-chip transmission-line effects that are important for high-performance clock networks.
The POWER6™ is a dual-core microprocessor fabricated in a 65nm SOI process with 10 levels of low-k copper interconnects. Chips with split- and connected-core power supplies are fabricated, modeled, and tested, showing both the advantages and disadvantages of each. On-chip noise measurements are compared to simulation. The measurements and simulation show that the shorted core grid design has less noise and a higher maximum frequency.
With the advances in the speed of high-performance chips, inductance effects in some on-chip interconnects have become significant. Specific networks such as clock distributions and other highly optimized circuits are especially impacted by inductance. Several difficult aspects have to be overcome to obtain valid waveforms for problems where inductances contribute significantly. Mainly, the geometries are very complex, and the interactions between capacitive and inductive currents have to be taken into account simultaneously. In this...
A resonant global clock-distribution network operating at 4.6GHz is designed in a 90nm 1.0V CMOS technology. Unique to this approach is the set of on-chip spiral inductors that resonate with the clock capacitance, resulting in 20% recycling of the clock power.
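As a rough illustration of the resonance condition this abstract relies on, the sketch below applies the textbook ideal LC relation f0 = 1 / (2π√(LC)) to estimate the inductance that would resonate with a given clock capacitance at 4.6 GHz. The capacitance value and the function names are assumptions made for illustration only; they are not taken from the paper, and a real design would also have to account for losses and decoupling.

```python
import math


def resonant_inductance(f0_hz: float, c_farads: float) -> float:
    """Inductance (H) that resonates with capacitance C at f0, ideal LC tank."""
    omega = 2.0 * math.pi * f0_hz
    return 1.0 / (omega ** 2 * c_farads)


def resonant_frequency(l_henries: float, c_farads: float) -> float:
    """Resonant frequency (Hz) of an ideal LC tank."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henries * c_farads))


F0_HZ = 4.6e9        # operating frequency quoted in the abstract
C_CLOCK_F = 100e-12  # hypothetical total clock capacitance (assumption)

L_TOTAL_H = resonant_inductance(F0_HZ, C_CLOCK_F)

if __name__ == "__main__":
    print(f"Effective inductance needed: {L_TOTAL_H * 1e12:.2f} pH")
    print(f"Check: f0 = {resonant_frequency(L_TOTAL_H, C_CLOCK_F) / 1e9:.3f} GHz")
```

Under these assumptions the required effective inductance comes out in the picohenry range, which hints at why several spiral inductors would be distributed across the grid and act in parallel rather than relying on a single large inductor.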
This work presents a new approach to global clock distribution in which tree-driven grids are augmented with on-chip spiral inductors to resonate the clock capacitance. In this scheme, the energy of the fundamental frequency resonates between electric and magnetic forms, with the reduced admittance of the clock network allowing for significantly lower gain requirements in the buffering network. The substantial improvements in jitter and power resulting from this approach are presented using measurement results from two test chips, one fabricated in 90-nm and the other...
Clock network synthesis (CNS) is one of the most important design challenges in high-performance synchronized VLSI designs. However, without appropriate problem examples and real-world objectives, research can become less relevant to industrial design flows. To address the need of the research community, we organize a clock network synthesis contest, and a benchmark suite is released. Since full-specification physical and electrical requirements of a leading-edge processor clock distribution would be cumbersome and impractical for this contest, we make...
The 12-core 649mm² POWER8™ leverages IBM's 22nm eDRAM SOI technology [1] and microarchitectural enhancements to deliver up to 2.5× the socket performance [2] of its 32nm predecessor, POWER7+™ [3]. POWER8 contains 4.2B transistors and 31.5μF of deep-trench decoupling capacitance. Three thin-oxide transistor V_t's are used for power/performance...
POWER8™ is a 12-core processor fabricated in IBM's 22 nm SOI technology with core and cache improvements driven by big data applications, providing 2.5× socket performance over POWER7+™. Core throughput is supported by 7.6 Tb/s of off-chip I/O bandwidth, which is provided by three primary interfaces, including two new variants of Elastic Interface as well as embedded PCI Gen-3. Power efficiency is improved with several techniques. An on-chip controller based on an embedded PowerPC™ 405 applies per-core DVFS by adjusting DPLLs...
Increasing transistor counts in modern processors can create instantaneous changes in current, driving nanosecond-speed supply voltage (V_DD) droops that require extra guardband for correct product operation. The POWER9 processor uses an adaptive clock strategy to reduce the timing margin needed during power supply droop events by embedding analog voltage-droop monitors (VDMs) that direct a digital...
SiGe HBTs have achieved record peak f_T values and impressive digital circuit ECL RO delays, but no analog circuit results have been reported. In this work we investigate the leverage of SiGe HBTs for analog circuits by optimizing the Ge profile for a high βV_A product under the constraint of breakdown voltage and effective strain of the SiGe layer. Analytical calculations of β, V_A, and βV_A for SiGe-HBTs as a function of Ge profile predict the largest performance advantage over Si BJTs for the most steeply graded Ge profile. SiGe-HBT transistors are...
Short, medium, and long on-chip interconnections having line widths of 0.45-52 µm are analyzed in a five-metal-layer structure. We study capacitive coupling for short lines, inductive coupling for medium-length lines, inductance and resistance of the current return path in the power buses, and resistive losses for the global wiring. Design guidelines and technology changes are proposed to achieve minimum delay and contain crosstalk for local and global wiring. Conditional expressions are given to determine when transmission-line effects are important for accurate delay and crosstalk prediction.
The clock distribution on the Power4 dual-processor chip supplies a single critical 1.5 GHz clock from one SOI-optimized PLL to 15,200 pins on a large chip with 20 ps skew and 35 ps jitter. The network contains 64 tuned trees driving a single grid, and specialized tools were used to achieve the targets on schedule with no adjustment circuitry.
Several models for 1/f noise in silicon which give identical predictions for spectra were found to give distinct non-Gaussian effects, as shown by Monte Carlo simulations. Measurements on silicon-on-sapphire resistors ranging in area to less than 1 µm² revealed both non-Gaussian effects and sample-to-sample spectral variations. The results were qualitatively similar to those expected for a simple superposition of two-level trapping systems and dissimilar to those expected for a random walk in a random potential. However, some modulation the was found.
Timing uncertainty in microprocessors is comprised of several sources, including PLL jitter, clock distribution skew and jitter, across-chip device variations, and power supply noise. The on-chip measurement macro called SKITTER (SKew+jITTER) was designed to measure timing uncertainty from all sources combined by measuring the number of logic stages that complete in a cycle. This completed delay has proven to be a very sensitive monitor of power supply noise, which has emerged as a dominant component of timing uncertainty. This paper describes the Skitter macro and experiences at IBM...
CMOS emitter-coupled logic (ECL) receiver circuits consisting of a differential-amplifier stage and an inverter stage are shown to convert 100-mV input signals to on-chip CMOS levels even with worst-case parameter variations in a 5-V 1-µm CMOS technology. Two different circuits are used to cover the range of power supply options; a third circuit provides a comparison case. The differential amplifiers feature built-in feedback for compensation of common-mode variations. The devices are designed with large widths, minimum channel lengths, and an interleaved...