- Parallel Computing and Optimization Techniques
- Advanced Memory and Neural Computing
- Interconnection Networks and Systems
- Ferroelectric and Negative Capacitance Devices
- Advanced Data Storage Technologies
- Semiconductor Materials and Devices
- 3D IC and TSV Technologies
- Advanced Neural Network Applications
- Low-Power High-Performance VLSI Design
- Advancements in Semiconductor Devices and Circuit Design
- Embedded Systems Design Techniques
- VLSI and FPGA Design Techniques
- Advanced Graph Neural Networks
- Graph Theory and Algorithms
- Radiation Effects in Electronics
- Adversarial Robustness in Machine Learning
- Quantum Computing Algorithms and Architecture
- VLSI and Analog Circuit Testing
- Machine Learning and ELM
- Domain Adaptation and Few-Shot Learning
- Energy Harvesting in Wireless Networks
- Integrated Circuits and Semiconductor Failure Analysis
- Advanced Image and Video Retrieval Techniques
- CCD and CMOS Imaging Sensors
- Quantum Information and Cryptography
Hong Kong University of Science and Technology
2023-2025
University of Hong Kong
2023-2025
Xinjiang Agricultural University
2024-2025
Hohai University
2022-2025
Westlake University
2025
Tsinghua University
2014-2024
Alibaba Group (China)
2020-2024
Hunan University
2022-2024
East China Normal University
2021-2024
Tumor Hospital of Guangxi Medical University
2024
Processing-in-memory (PIM) is a promising solution to address the "memory wall" challenges for future computer systems. Prior proposed PIM architectures put additional computation logic in or near memory. The emerging metal-oxide resistive random access memory (ReRAM) has shown its potential to be used as main memory. Moreover, with its crossbar array structure, ReRAM can perform matrix-vector multiplication efficiently, and it has been widely studied to accelerate neural network (NN) applications. In this work, we...
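The crossbar matrix-vector multiplication mentioned above can be illustrated with a minimal functional model (an idealized sketch, not the paper's implementation): weights are stored as cell conductances, inputs are applied as word-line voltages, and each bit line sums currents by Kirchhoff's current law. All names and values here are illustrative.

```python
import numpy as np

# Idealized analog MVM in a ReRAM crossbar: weights live as cell
# conductances G (siemens), inputs arrive as word-line voltages V
# (volts), and each bit line accumulates current I = G^T @ V.

def crossbar_mvm(conductance, voltages):
    """Ideal crossbar: output currents on each bit line."""
    return conductance.T @ voltages  # I_j = sum_i G[i, j] * V[i]

rng = np.random.default_rng(0)
G = rng.uniform(1e-6, 1e-4, size=(4, 3))  # 4x3 array of cell conductances
V = rng.uniform(0.0, 0.2, size=4)         # read voltages on word lines
I = crossbar_mvm(G, V)

# The analog result matches the digital dot product it replaces.
assert np.allclose(I, G.T @ V)
```

Real crossbars add non-idealities (wire resistance, ADC quantization, device variation) that this sketch omits.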
Various new nonvolatile memory (NVM) technologies have emerged recently. Among all the investigated NVM candidates, spin-torque-transfer memory (STT-RAM, or MRAM), phase-change random-access memory (PCRAM), and resistive memory (ReRAM) are regarded as the most promising. As the ultimate goal of this research is to deploy them into multiple levels of the memory hierarchy, it is necessary to explore a wide design space to find the proper implementation at different hierarchy levels, from highly latency-optimized caches to density-...
Domain-specific hardware is becoming a promising topic against the backdrop of the improvement slowdown for general-purpose processors due to the foreseeable end of Moore's Law. Machine learning, especially deep neural networks (DNNs), has become the most dazzling such domain, witnessing successful applications across a wide spectrum of artificial intelligence (AI) tasks. The incomparable accuracy of DNNs is achieved by paying the cost of hungry memory consumption and high computational complexity, which greatly impedes their deployment...
Spiking neural networks (SNNs), which enable energy-efficient implementation on emerging neuromorphic hardware, are gaining more attention. Yet so far, SNNs have not shown performance competitive with artificial neural networks (ANNs), due to the lack of effective learning algorithms and programming frameworks. We address this issue from two aspects: (1) We propose a neuron normalization technique to adjust neural selectivity and develop a direct learning algorithm for deep SNNs. (2) Via narrowing the rate coding window and converting...
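To make the rate-coding idea concrete, here is a toy leaky-integrate-and-fire (LIF) neuron, the standard building block of SNNs. This is a generic textbook model, not the paper's specific neuron or normalization technique; all constants are illustrative.

```python
import numpy as np

def lif_forward(inputs, v_th=1.0, decay=0.9):
    """Simulate one LIF neuron over T time steps; returns the spike train."""
    v, spikes = 0.0, []
    for x in inputs:
        v = decay * v + x                 # leaky integration of input current
        s = 1.0 if v >= v_th else 0.0     # fire when membrane crosses threshold
        spikes.append(s)
        v = v * (1.0 - s)                 # hard reset after a spike
    return np.array(spikes)

# A constant input of 0.3 makes the neuron fire every 4th step.
spikes = lif_forward(np.full(20, 0.3))
rate = spikes.mean()  # rate coding: the firing rate encodes the activation
```

Stronger inputs shorten the inter-spike interval, which is why a narrower rate-coding window (fewer time steps) trades accuracy for latency.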
Magnetic random access memory (MRAM) is a promising memory technology, which has fast read access, high density, and non-volatility. Using 3D heterogeneous integration, it becomes feasible and cost-efficient to stack MRAM atop conventional chip multiprocessors (CMPs). However, one disadvantage of MRAM is its long write latency and high write energy. In this paper, we first stack MRAM-based L2 caches directly atop CMPs and compare them against SRAM counterparts in terms of performance. We observe that the direct stacking might harm performance due...
Processing-in-memory (PIM) provides high bandwidth, massive parallelism, and high energy efficiency by implementing computations in main memory, thereby eliminating the overhead of data movement between the CPU and memory. While most recent work has focused on PIM in DRAM memory with 3D die-stacking technology, we propose to leverage the unique features of emerging non-volatile memory (NVM), such as resistance-based storage and current sensing, to enable efficient PIM design in NVM. We present Pinatubo, a <u>P</u>rocessing <u>I</u>n...
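The bulk bitwise operations that Pinatubo targets can be sketched functionally: instead of reading two memory rows out to the CPU, modified sense amplifiers combine them in place. The model below represents rows as `uint8` arrays and the multi-row activation as an element-wise reduction; it is a behavioral illustration only, not the circuit design.

```python
import numpy as np

# Behavioral model of in-memory bulk bitwise operations: the op is
# applied element-wise across entire rows, the way modified sense
# amplifiers would combine multiple activated rows in one access.

def pim_bitwise(op, *rows):
    """Compute a bulk bitwise op over whole memory rows."""
    out = rows[0].copy()
    for r in rows[1:]:
        out = op(out, r)
    return out

row_a = np.array([0b1100, 0b1010], dtype=np.uint8)
row_b = np.array([0b1010, 0b0110], dtype=np.uint8)

assert np.array_equal(pim_bitwise(np.bitwise_or, row_a, row_b),
                      np.array([0b1110, 0b1110], dtype=np.uint8))
```

The energy win comes from the access pattern, not the arithmetic: one in-memory row activation replaces two full row reads plus a CPU-side loop.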
Due to little consideration of hardware constraints, e.g., the limited connections between physical qubits that enable two-qubit gates, most quantum algorithms cannot be directly executed on Noisy Intermediate-Scale Quantum (NISQ) devices. Dynamically remapping logical qubits to physical qubits in the compiler is needed to execute all gates in the algorithm, which introduces additional SWAP operations and inevitably reduces the fidelity of the algorithm. Previous solutions for finding such remappings suffer from high complexity, poor initial mapping quality, and poor flexibility...
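A toy example of why remapping is needed (illustrative only, not the paper's algorithm): on a line-coupled device 0-1-2-3, a two-qubit gate between qubits mapped to physical 0 and 3 requires SWAPs to bring them adjacent. Each SWAP adds gates and lowers fidelity, which is the cost a good mapper minimizes.

```python
from collections import deque

coupling = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}  # linear topology 0-1-2-3

def shortest_path(src, dst):
    """BFS over the coupling graph."""
    prev, q = {src: None}, deque([src])
    while q:
        u = q.popleft()
        if u == dst:
            break
        for v in coupling[u]:
            if v not in prev:
                prev[v] = u
                q.append(v)
    path = [dst]
    while prev[path[-1]] is not None:
        path.append(prev[path[-1]])
    return path[::-1]

def swaps_for_gate(q0, q1):
    """SWAPs needed to make q0 adjacent to q1 (naively route q0 toward q1)."""
    path = shortest_path(q0, q1)
    return [(path[i], path[i + 1]) for i in range(len(path) - 2)]

# CNOT(0, 3) on a line needs two SWAPs before the gate can execute.
assert swaps_for_gate(0, 3) == [(0, 1), (1, 2)]
```

Real mappers consider the whole gate schedule at once, since routing one gate greedily can strand qubits needed by the next gate.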
Long interconnects are becoming an increasingly important problem from both power and performance perspectives. This motivates designers to adopt on-chip network-based communication infrastructures and three-dimensional (3D) designs where multiple device layers are stacked together. Considering the current trend towards increasing use of chip multiprocessing, it is timely to consider 3D multiprocessor design and memory networking issues, especially in the context of data management for large L2 caches. The overall...
Caching techniques have been an efficient mechanism for mitigating the effects of the processor-memory speed gap. Traditional multi-level SRAM-based cache hierarchies, especially in the context of chip multiprocessors (CMPs), present many challenges in area requirements, core-to-cache balance, power consumption, and design complexity. New advancements in technology enable caches to be built from other technologies, such as embedded DRAM (EDRAM), magnetic RAM (MRAM), and phase-change RAM (PRAM), in both 2D chips and 3D...
Magnetic Random Access Memory (MRAM) has been considered a promising memory technology due to many attractive properties. Integrating MRAM with CMOS logic may incur extra manufacturing cost due to its hybrid magnetic-CMOS fabrication process. Stacking MRAM on top of CMOS logic using 3D integration is a way to minimize this cost overhead. In this paper, we discuss the circuit design issues for MRAM and present a cache model. Based on this model, we compare MRAM caches against SRAM and DRAM caches in terms of area, performance, and energy. Finally, we conduct...
The scalability of DRAM faces challenges from increasing power consumption and the difficulty of building high-aspect-ratio capacitors. Consequently, emerging memory technologies including Phase Change Memory (PCM), Spin-Transfer Torque RAM (STT-RAM), and Resistive RAM (ReRAM) are being actively pursued as replacements for DRAM main memory. Among these candidates, ReRAM has superior characteristics such as high density, low write energy, and high endurance, making it a very attractive and cost-efficient alternative to DRAM. In this...
As an emerging field of machine learning, deep learning shows excellent ability in solving complex problems. However, the size of the networks becomes increasingly large due to the demands of practical applications, which poses a significant challenge to constructing high-performance implementations of deep neural networks. In order to improve performance as well as maintain low power cost, in this paper we design a deep learning accelerator unit (DLAU), which is a scalable accelerator architecture for large-scale networks using field-programmable gate array (FPGA) hardware...
Data movement between the processing units and the memory in the traditional von Neumann architecture is creating the "memory wall" problem. To bridge the gap, two approaches, the memory-rich processor (more on-chip memory) and the compute-capable memory (processing-in-memory), have been studied. However, the first one has strong computing capability but limited memory capacity/bandwidth, whereas the second is the exact opposite.
Recently, due to the availability of big data and the rapid growth of computing power, artificial intelligence (AI) has regained tremendous attention and investment. Machine learning (ML) approaches have been successfully applied to solve many problems in academia and industry. Although the explosion of applications is driving the development of ML, it also imposes severe challenges of processing speed and scalability on conventional computer systems. Computing platforms that are dedicatedly designed for AI have been considered, ranging...
Inspired by the great success of neural networks, graph convolutional networks (GCNs) have been proposed to analyze graph data. GCNs mainly include two phases with distinct execution patterns. The Aggregation phase behaves like graph processing, showing a dynamic and irregular pattern. The Combination phase acts more like a neural network, presenting a static and regular pattern. The hybrid patterns require a design that alleviates irregularity and exploits regularity. Moreover, to achieve higher performance and energy efficiency, the design needs to leverage the high intra-vertex...
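The two phases can be seen side by side in a minimal GCN layer on a toy graph (a sketch using simple row normalization rather than the symmetric normalization of the original GCN formulation; shapes and weights are illustrative):

```python
import numpy as np

# Aggregation: a sparse, irregular gather over neighbors (A_norm @ X).
# Combination: a dense, regular matrix multiply with weights (... @ W).

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)      # adjacency of a 3-node path graph
A_hat = A + np.eye(3)                        # add self-loops
deg = A_hat.sum(axis=1)
A_norm = A_hat / deg[:, None]                # row-normalize (illustrative)

X = np.arange(6, dtype=float).reshape(3, 2)  # node feature matrix
W = np.ones((2, 4))                          # layer weight matrix

H = np.maximum(A_norm @ X @ W, 0.0)          # aggregate, then combine (ReLU)
assert H.shape == (3, 4)
```

The asymmetry is visible in the memory behavior: `A_norm @ X` follows graph structure (data-dependent accesses), while `@ W` is a fixed-shape dense GEMM, which is why a hybrid accelerator treats the two phases differently.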
High density, low leakage, and non-volatility are the attractive features of Spin-Transfer-Torque RAM (STT-RAM), which have made it a strong competitor against SRAM as a universal memory replacement in multi-core systems. However, STT-RAM suffers from high write latency and write energy, which has impeded its widespread adoption. To this end, we look at trading off STT-RAM's non-volatility property (data retention time) to overcome these problems. We formulate the relationship between retention time and write latency, and find the optimal retention time for...
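A hedged first-order model of the trade-off (the constants and the exact relation here are illustrative, not the paper's formulation): retention time grows roughly exponentially with the thermal stability factor Delta, so relaxing Delta shortens retention from years to seconds while making writes faster and cheaper.

```python
import math

# Illustrative retention model: t_ret ≈ tau0 * exp(Delta), where tau0 is
# the attempt time (~1 ns) and Delta is the thermal stability factor.
# Lowering Delta relaxes retention but reduces the current/time needed
# to flip the magnetic cell, i.e., faster and lower-energy writes.

TAU0 = 1e-9  # attempt time in seconds (assumed typical value)

def retention_seconds(delta):
    return TAU0 * math.exp(delta)

assert retention_seconds(40) > 365 * 24 * 3600   # years-scale retention
assert retention_seconds(20) < 1.0               # sub-second retention
```

A cache-oriented design can therefore pick a Delta whose retention just covers the expected cache-line lifetime, refreshing or flushing the rare lines that live longer.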
Persistent memory is an emerging technology which allows in-memory persistent data objects to be updated at much higher throughput than when using disks as storage. Previous designs use logging or copy-on-write mechanisms to update persistent data, which unfortunately reduces system performance to roughly half that of a native system with no persistence support. One of the great challenges in this application class is therefore how to efficiently enable atomic, consistent, and durable updates that ensure data survives power outages and/or system failures. Our...
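The logging mechanism mentioned above can be sketched as minimal undo logging (a generic illustration, not the paper's design): the old value is recorded before each in-place update, so recovery can roll back a partially applied transaction. Real systems also need cache-line flushes and fences (e.g., clwb/sfence) to order the log ahead of the data; this sketch models only the logic.

```python
# Undo logging for atomic persistent updates: record old values before
# mutating, replay the log backwards on abort/crash to restore a
# consistent state, discard the log on commit.

class UndoLog:
    def __init__(self):
        self.entries = []                  # (obj, key, old_value), oldest first

    def record(self, obj, key):
        self.entries.append((obj, key, obj[key]))  # log BEFORE the update

    def commit(self):
        self.entries.clear()               # updates durable; log discarded

    def abort(self):
        for obj, key, old in reversed(self.entries):
            obj[key] = old                 # roll back, newest first
        self.entries.clear()

store = {"balance": 100}
log = UndoLog()
log.record(store, "balance")
store["balance"] = 40                      # in-place update after logging
log.abort()                                # simulate crash recovery
assert store["balance"] == 100
```

The performance cost the abstract refers to comes from writing every datum twice (log plus data) and from the ordering fences between the two writes.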
Energy harvesting has been widely investigated as a promising method of providing power for ultra-low-power applications. Such energy sources include solar energy, radio-frequency (RF) radiation, piezoelectricity, thermal gradients, etc. However, the power supplied by these sources is highly unreliable and dependent upon ambient environmental factors. Hence, it is necessary to develop specialized systems that are tolerant to this variation and also capable of making forward progress on their computation tasks. The simulation...
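"Forward progress" under unreliable power is typically achieved by checkpointing to non-volatile state, so a power failure only loses work done since the last checkpoint. The toy model below (illustrative names and failure rate; not any specific system's scheme) shows a loop that survives repeated simulated outages:

```python
import random

# "Non-volatile" checkpointed state: survives simulated power failures.
nv = {"i": 0, "acc": 0}

def run_with_power_failures(n, seed=1):
    """Sum range(n) despite random power losses of volatile state."""
    rng = random.Random(seed)
    while nv["i"] < n:
        i, acc = nv["i"], nv["acc"]        # reboot: restore from checkpoint
        while i < n:
            if rng.random() < 0.1:         # power failure: volatile state lost
                break
            acc += i                        # one unit of useful work
            i += 1
            nv["i"], nv["acc"] = i, acc     # checkpoint after each step
    return nv["acc"]

assert run_with_power_failures(10) == sum(range(10))
```

Checkpointing every step maximizes tolerance but maximizes overhead; real intermittent-computing systems tune checkpoint frequency against the expected outage rate.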
In this letter, a flexible memory simulator, NVMain 2.0, is introduced to help the research community model not only commodity DRAMs but also emerging memory technologies, such as die-stacked DRAM caches, non-volatile memories (e.g., STT-RAM, PCRAM, and ReRAM) including multi-level cells (MLC), and hybrid non-volatile plus DRAM memory systems. Compared to existing simulators, NVMain 2.0 features a friendly user interface with compelling simulation speed and the capability of providing sub-array-level parallelism, fine-grained refresh, and an MLC data encoder...
As technology scales, interconnects have become a major performance bottleneck and a major source of power consumption for microprocessors. Increasing interconnect costs make it necessary to consider alternative ways of building modern processors. One promising option is the 3D architecture, in which multiple device layers, with direct vertical interconnects tunneling through them, are put together on the same chip. As the fabrication of 3D integrated circuits has become viable, developing CAD tools and architectural techniques is imperative to explore the design...