- Low-power high-performance VLSI design
- Atomic and Molecular Physics
- Advanced Memory and Neural Computing
- Advanced Chemical Physics Studies
- Parallel Computing and Optimization Techniques
- Analog and Mixed-Signal Circuit Design
- Neuroscience and Neural Engineering
- Atmospheric Ozone and Climate
- Semiconductor materials and devices
- CCD and CMOS Imaging Sensors
- Advancements in Semiconductor Devices and Circuit Design
- Advancements in PLL and VCO Technologies
- VLSI and Analog Circuit Testing
- Ferroelectric and Negative Capacitance Devices
- X-ray Spectroscopy and Fluorescence Analysis
- Embedded Systems Design Techniques
- Advanced Measurement and Metrology Techniques
- VLSI and FPGA Design Techniques
- Spectroscopy and Laser Applications
- Atomic and Subatomic Physics Research
- Electromagnetic Compatibility and Noise Suppression
- Cold Atom Physics and Bose-Einstein Condensates
- Physical Unclonable Functions (PUFs) and Hardware Security
- Photochemistry and Electron Transfer Studies
- Interconnection Networks and Systems
Northwestern University
2016-2025
Northwest Normal University
2023
General Motors (United States)
2015-2019
General Motors (Poland)
2016
The University of Tokyo
2008-2015
Tokyo Medical and Dental University
2015
Shihezi University
2015
MaxLinear (United States)
2010-2014
Texas Instruments (United States)
2009-2011
University of Minnesota
2005-2009
Computing-in-memory (CIM) techniques, which incorporate analog computing inside memory macros, have shown significant advantages in computing efficiency for deep learning applications. While earlier CIM designs were limited by lower bit precision, e.g., binary weights [1], recent works have shown 4-to-8b precision for the weights/inputs and up to 20b for the output values [2, 3]. Sparsity and application features have also been exploited at the system level to further improve computation efficiency [4, 5]. To enable higher bit-wise operations commonly utilized...
Cross sections and rate coefficients for total and fine-structure resolved charge transfer in collisions of O+ with H and H+ with O are presented for collision energies between 0.1 meV/u and 10 MeV/u and temperatures up to 10^7 K. The results are obtained utilizing new quantal and semiclassical molecular-orbital close-coupling, classical trajectory Monte Carlo, and continuum distorted wave calculations in conjunction with previous experimental and theoretical data. Applications to various astrophysical and atmospheric environments are discussed.
Despite recent progress on building highly efficient deep neural network (DNN) accelerators, few works have targeted improving the end-to-end performance of deep-learning tasks, where inter-layer pre/post-processing, data alignment, and data movement across memory and processing units often dominate the execution time. An improvement to the end-to-end computation requires cohesive cooperation between the accelerator and the CPU with efficient data flow management. Figure 15.2.1 shows the most commonly used heterogeneous architecture, containing a...
Processors for next-generation mobile devices will need to operate across a wide supply voltage range in order to support both high-performance and power-efficiency modes of operation. However, the effects of local transistor threshold voltage (V_T) variation, already a significant issue in today's advanced process technologies, are further exacerbated at low...
The high cost of IC design has made chip protection one of the first priorities of the semiconductor industry. Although there is a common impression that combinational circuits must be designed without any cycles, circuits with cycles can be combinational as well. Such cyclic circuits can be used to reliably lock ICs. Moreover, since the memristor is compatible with the CMOS structure, it is possible to efficiently obfuscate cyclic circuits using polymorphic memristor-CMOS gates. In this case, the layouts of circuits with different functionalities look exactly identical, making it impossible even for an...
Memristive systems present a low-power alternative to silicon-based electronics for neuromorphic and in-memory computation. 2D materials have been increasingly explored for memristive applications due to their novel biomimetic functions, ultrathin geometry for ultimate scaling limits, and potential for fabricating large-area, flexible, and printed devices. While the switching mechanism in memristors based on single 2D nanosheets is similar to that of conventional oxide memristors, the switching mechanism in nanosheet composite films is complicated...
This article reports on a new charging process and Coulomb-force-directed assembly of nanoparticles onto charged surface areas with sub-100-nm resolution. The charging is accomplished using a flexible nanostructured thin silicon electrode. Electrical nanocontacts as small as 50 nm have been created by placing the nanostructured electrode onto an electret surface. The nanocontacts have been used to inject charge into 50-nm-sized areas. Nanoparticles were assembled onto the charge patterns, and a lateral resolution of 60 nm has been observed for the first time. A comparison of the nanoparticle patterns...
Time-series classification (TSC) is a challenging problem in machine learning, and significant efforts have been made to improve its speed and computation efficiency. Among various approaches, the dynamic time warping (DTW) algorithm is one of the most prevalent methods for TSC due to its succinctness and generality. To improve the throughput of the DTW operation, this work presents a mixed-signal DTW accelerator utilizing time-domain (TD) computing, where signals are encoded and processed using pulses. A pipelined operation is enabled by...
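For readers unfamiliar with DTW, the following is a minimal software sketch of the textbook dynamic-programming recurrence. It is not the mixed-signal, time-domain implementation described in the abstract; the function name `dtw_distance` and the absolute-difference local cost are assumptions chosen for illustration.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic DTW distance between two 1-D sequences (illustrative sketch).

    Uses the recurrence D[i][j] = cost(i, j) + min(D[i-1][j], D[i][j-1], D[i-1][j-1])
    with an absolute-difference local cost (an assumption, not taken from the paper).
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    inf = float("inf")
    # Keep only two rows of the cost matrix to save memory.
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # D[0][0] = 0; the rest of the boundary stays infinite
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


if __name__ == "__main__":
    # A time-stretched copy of a sequence aligns with zero cost.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

In a nearest-neighbor TSC setup, a query series would be assigned the label of the training series with the smallest such distance; the inner `min` over three neighbors is the step a hardware accelerator has to evaluate for every cell of the matrix.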
This paper describes an interconnect technique for subthreshold circuits to improve global wire delay and reduce the delay variation due to process-voltage-temperature (PVT) fluctuations. By internally boosting the gate voltage of the driver transistors, the operating region is shifted from subthreshold to super-threshold, enhancing performance and improving tolerance to PVT variations. Simulations of a clock distribution network using the proposed driver show a 66%-76% reduction in the 3σ clock skew value and an 84%-88% reduction in clock tree delay compared to conventional drivers. A...
On-chip resonant supply noise in the mid-frequency range (i.e., 50-300 MHz) has been identified as the dominant supply noise component in modern microprocessors. To overcome the limited efficiency of conventional decoupling capacitors (decaps) in reducing the resonant supply noise, this paper proposes a low-power digital switched decoupling capacitor circuit. By adaptively switching the connectivity of the decaps according to the measured supply noise, the amount of charge provided by the decaps is dramatically boosted, leading to an increased damping of the on-chip supply network. Analysis on the charge transfer during the switching events...
Control of on-chip power supply noise has become a major challenge for continuous scaling of CMOS technology. Conventional passive decoupling capacitors (decaps) exhibit significant area and leakage penalties. To improve the efficiency of power supply regulation, this paper proposes a distributed active decap circuit for use in digital integrated circuits (ICs). The proposed design uses an operational amplifier to boost the performance of conventional decaps. Simulations proved its enhanced decoupling effect in comparison with conventional decaps and also...
The high cost of IC design has made chip protection one of the first priorities of the semiconductor industry. In addition, with the growing number of untrusted foundries, the possibility of an inside foundry attack is escalating. However, by taking advantage of polymorphic gates, the layouts of circuits with different functionalities look exactly identical, making it impossible even for an inside foundry attacker to distinguish the defined functionality of an IC by looking at its layout. Moreover, since the memristor is compatible with the CMOS structure, it is possible to efficiently...
Charge transfer rate coefficients for collisions of C+ with H and H+ with C are presented for temperatures from 30,000 to 10^7 K and 10 to 10^7 K, respectively. The rate coefficients were calculated from recommended cross sections deduced in a recent theoretical and experimental investigation that took into account previous measurements. Nonadiabatic radial coupling is the dominant mechanism for both reactions above ~50,000 K, but at lower temperatures the reaction proceeds primarily by radiative charge transfer. Implications, due to the magnitude of the rate coefficients, for various...
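As background, a thermal rate coefficient is conventionally obtained from a cross section by averaging over a Maxwellian distribution of relative collision energies. The standard textbook relation is shown below; the abstract does not state the exact expression the authors used, and the symbols ($\mu$ for the reduced mass of the colliding pair, $E$ for the center-of-mass collision energy, $k_B$ for the Boltzmann constant) are notation assumed here.

$$
\alpha(T) = \left(\frac{8}{\pi \mu}\right)^{1/2} (k_B T)^{-3/2} \int_0^{\infty} \sigma(E)\, E\, e^{-E / k_B T}\, dE
$$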
This paper presents a statistical leakage estimation method for FinFET devices considering the unique width quantization property. Monte Carlo simulations show that the conventional approach underestimates the average leakage current of FinFET devices by as much as 43%, while the proposed approach gives a precise estimation with an error of less than 5%. A design example on dynamic logic circuits shows the effectiveness of the proposed method.
A multimedia applications processor is fabricated using a 28nm low-power process technology for ultra-low-power applications. Based on a 4-issue, 32-register version of the TMS320C64X+ VLIW DSP, this System-on-Chip (SoC) includes 32kB L1 and 128kB L2 caches, and I2S, SPI, UART, MultiMediaCard, and external memory interfaces (Fig. 7.5.1). The design incorporates over 600k instances of custom low-voltage logic cells and 43 instances (1.6 Mb) of custom 6T SRAM. Utilizing ultra-low-voltage (ULV) optimized standard-cell libraries and SRAM...
This work presents the first 3D/4D sparse CNN (SCNN) accelerator for point cloud image recognition on low-power devices. A special hopping-index rule book method and an efficient data search technique were developed to mitigate the overhead of coordinate management for SCNN. A 65nm test chip for 3D/4D images was demonstrated with 7.09–13.6 TOPS/W power efficiency and a state-of-the-art frame rate.
While neural network (NN) accelerators have been significantly developed in recent years, a CPU is still essential for data management and pre-/post-processing in a commonly used heterogeneous architecture, which usually contains an NN accelerator and a processor core, with data transfer performed by a direct memory access (DMA) engine. This work presents a special neural network processor, referred to as the systolic neural CPU (SNCPU), with a unified architecture combining deep learning and general-purpose computing fifth-generation reduced...
Virtual Reality (VR) and Mixed Reality (MR) systems, e.g., Meta Quest and Apple Vision Pro, have recently gained significant interest in consumer electronics, creating a new wave of developments in the metaverse for gaming, social networking, workforce assistance, online shopping, etc. Strong technological innovations in AI computing and multi-modular human activity tracking and control have produced immersive and realistic virtual user experiences. However, most existing VR headsets only rely on traditional joysticks or...
The demand for real-time computing on edge devices from emerging applications, e.g., AI, has exploded in recent years. Lately, physics-based scientific computing has also drawn significant interest, driven by the growth of VR, IoT, robotics, etc. Fig. 20.4.1 shows examples of physics-based computation, including structural deformation for photorealistic VR/MR, robot dynamic control, temperature monitoring for additive manufacturing, and leak-gas tracking. Unfortunately, hardware support for numerical computation is relatively poor, hindering...