Dawen Xu

ORCID: 0000-0003-4204-0898
Research Areas
  • Advanced Memory and Neural Computing
  • Advanced Neural Network Applications
  • Advanced Steganography and Watermarking Techniques
  • Low-power high-performance VLSI design
  • Prosthetics and Rehabilitation Robotics
  • CCD and CMOS Imaging Sensors
  • Advancements in Semiconductor Devices and Circuit Design
  • Muscle activation and electromyography studies
  • Advanced Image and Video Retrieval Techniques
  • Semiconductor materials and devices
  • Stroke Rehabilitation and Recovery
  • Chaos-based Image/Signal Encryption
  • Video Coding and Compression Technologies
  • Digital Media Forensic Detection
  • Parallel Computing and Optimization Techniques
  • Advanced Vision and Imaging
  • Ferroelectric and Negative Capacitance Devices
  • Radiation Effects in Electronics
  • Adversarial Robustness in Machine Learning
  • Neural Networks and Reservoir Computing
  • Advanced Image Processing Techniques
  • Mechanical Circulatory Support Devices
  • Graphene research and applications
  • Image and Video Stabilization
  • Machine Learning and ELM

Hefei University of Technology
2019-2023

Nanjing University of Aeronautics and Astronautics
2019-2021

Institute of Computing Technology
2019-2021

Chinese Academy of Sciences
2019-2021

Wan Fang Hospital
2006

This article presents a novel reconfigurable torque-controllable variable stiffness actuator (RVSA) for a knee exoskeleton. The concept of reconfiguring the pulley block is proposed to make the RVSA work in different torque and stiffness ranges. The reconfigurability allows the RVSA to achieve a variety of passive behaviors, namely softening, linear, and hardening behaviors. Compared with existing actuators based on the principle of changing the spring preload, an advantage of the RVSA is that it can achieve a wider range of joint stiffness. Moreover, a model is established...

10.1109/tmech.2021.3063374 article EN IEEE/ASME Transactions on Mechatronics 2021-03-03

Prior works typically conducted the fault analysis of neural network accelerator computing arrays with simulation and focused on the prediction accuracy loss of the neural network models. There is still a lack of systematic fault analysis of the neural network acceleration system that considers both accuracy degradation and system exceptions, such as system stall and running overtime. To this end, we implemented a representative neural network accelerator and corresponding fault injection modules on a Xilinx ARM-FPGA platform and evaluated the system reliability under different fault injection rates when a series of typical neural network models are deployed on the system. The entire...

10.1109/tvlsi.2020.3046075 article EN IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2021-01-10

Convolutional neural networks (CNNs) have become the state-of-the-art technique in many classification tasks in IoT systems. However, low-power and area-constrained edge devices are unable to afford the expensive cost of CNNs. Resistive random access memory (RRAM) is attractive for establishing a CNN accelerator at the edge due to its features of scalability and in-situ dot-product. Mapping a network architecture onto general-purpose RRAM suffers from a severe issue of resource underutilization. The quantization offers an...

10.1109/dac18072.2020.9218724 article EN 2020-07-01

A regular 2D computing array is widely utilized for the processing of major neural network operations in many deep learning accelerators (DLAs). Hardware failures on the computing array can lead to considerable computing errors and prediction accuracy loss. Prior works proposed to add homogeneous redundant PEs to each row or column of the regular computing array to mitigate faulty PEs, but they may fail to recover from faults when the number of faulty PEs in a row or column exceeds the number of redundant PEs in the corresponding row or column. The problem gets worse when the faulty PEs are not evenly distributed across the array. To address the problem, we...

10.1109/iccd50377.2020.00087 article EN 2020 IEEE 38th International Conference on Computer Design (ICCD) 2020-10-01

Hardware faults on the regular 2-D computing array of a typical deep learning accelerator (DLA) can lead to dramatic prediction accuracy loss. Prior redundancy design approaches typically have each homogeneous redundant processing element (PE) mitigate faulty PEs for a limited region of the computing array rather than the entire array to avoid excessive hardware overhead. However, they fail to recover the computing array when the number of faulty PEs in any region exceeds the number of redundant PEs in the same region. The mismatch problem deteriorates when the fault injection rate rises and the faults are unevenly distributed....

10.1109/tcad.2021.3124763 article EN IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 2021-11-02

Purpose: Hand motor dysfunction has seriously reduced people’s quality of life. The purpose of this paper is to solve this problem; different soft exoskeleton robots have been developed because of their good application prospects in assistance. In this paper, a new soft hand exoskeleton is designed to help people conduct rehabilitation training. Design/methodology/approach: The proposed soft exoskeleton is an under-actuated cable-driven mechanism, which optimizes the force transmission path and many local structures. Specifically, the optimized cables are...

10.1108/ir-06-2020-0127 article EN Industrial Robot: the international journal of robotics research and application 2020-09-03

With the advancements of neural networks, customized accelerators are increasingly adopted in massive AI applications. To gain higher energy efficiency or performance, many hardware design optimizations such as near-threshold logic and overclocking can be utilized. In these cases, computing errors may happen, and they are difficult to capture by conventional training on general-purpose processors (GPPs). Applying the offline trained neural network models to accelerators with errors directly may lead to considerable prediction accuracy loss....

10.1109/asap.2019.00-23 article EN 2019-07-01

The increasing hardware failures caused by shrinking semiconductor technologies pose a substantial influence on neural network accelerators, and improving the resilience of neural network execution becomes a great design challenge, especially for mission-critical applications such as self-driving and medical diagnosis. The reliability analysis of the neural network execution is a key step to understand the influence of the hardware failures, and thus is highly demanded. Prior works typically conducted the fault analysis with simulation and concentrated on the prediction accuracy loss of the models. There is still a lack...

10.1109/asap49362.2020.00024 article EN 2020-07-01

The generative neural network is a new category of neural networks, and it has been widely utilized in many applications such as content generation, unsupervised learning, segmentation, and pose estimation. It typically involves massive computing-intensive deconvolution operations that cannot be fitted to conventional neural network processors directly. However, prior works mainly investigated specialized hardware architectures through intensive hardware modifications to the existing deep learning processors to accelerate deconvolution together with...

10.1109/tc.2020.3001033 article EN IEEE Transactions on Computers 2020-01-01

Robot-assisted cooperative rehabilitation training has shown superiority in helping individuals with motion impairment problems to regain their motor functions. This paper presents the development and evaluation of a new control scheme for an end-effector-type rehabilitation robot to provide upper extremity training with the desired compliance and intensity. Firstly, the overall mechanical structure and the real-time control system are introduced. Secondly, an integral fuzzy sliding mode impedance control strategy combined with time-delay estimation (IFSMIC-TDE) is...

10.1109/access.2019.2949197 article EN cc-by IEEE Access 2019-01-01

Neural networks, especially the convolutional neural network (CNN), have become prevalent, and numerous CNN accelerators have been developed to achieve higher performance. While the clock frequency determines the operation speed and has a direct influence on the performance of the accelerators, we propose to apply overclocking, a circuit optimization approach that enables a higher clock frequency, to general CNN accelerators. This technique brings significant performance improvement, but it leads to moderate timing errors, wrong computing results, and low prediction...

10.1109/itc-asia.2019.00039 article EN 2019-09-01

Artificial Intelligence of Things (AIoT) processors fabricated with newer technology nodes suffer rising soft errors due to shrinking transistor sizes and lower power supply. Soft errors on AIoT processors, particularly deep learning accelerators (DLAs) with massive computing, may cause substantial computing errors. These computing errors are difficult to capture by conventional training on general-purpose processors such as CPUs and GPUs in a server. Applying the offline trained neural network models to edge accelerators directly may lead to considerable prediction accuracy loss....

10.1109/tvlsi.2021.3089224 article EN IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2021-07-26

Deformable convolution networks (DCNs), proposed to address image recognition with geometric or photometric variations, typically involve deformable convolution that convolves on arbitrary locations of input features. The locations change with different inputs and induce considerable dynamic and irregular memory accesses that cannot be handled by classic neural network accelerators (NNAs). Moreover, the bilinear interpolation (BLI) operation, which is required to obtain deformed features in DCNs, also cannot be deployed on existing NNAs directly....

10.1145/3597431 article EN ACM Transactions on Design Automation of Electronic Systems 2023-05-15

Carbon nanotube field-effect transistors (CNFETs) emerge as a promising alternative to conventional CMOS for much higher speed and power efficiency. They are particularly suitable for building the power-hungry last level cache (LLC). However, process variation (PV) in CNFETs substantially affects the operation stability and thus the worst-case timing, which limits the LLC frequency dramatically given a fully synchronous design. To address this problem, we developed a variation-aware cache such that each part of the cache can run at its...

10.1145/3287624.3287700 article EN Proceedings of the 24th Asia and South Pacific Design Automation Conference 2019-01-18

The recently proposed Deformable Convolutional Networks (DCNs) greatly enhance the performance of conventional Convolutional Neural Networks (CNNs) on vision recognition tasks by allowing flexible input sampling during inference runtime. DCNs introduce an additional convolutional layer for adaptive offset generation, followed by a bilinear interpolation (BLI) algorithm to integerize the generated non-integer offset values. Finally, a regular convolution is performed on the loaded input pixels. Compared with conventional CNNs, DCN demonstrated significantly...

10.1145/3453688.3461480 article EN Proceedings of the Great Lakes Symposium on VLSI 2021 2021-06-18

Deformable convolution networks (DCNs), proposed to address image recognition with geometric or photometric variations, typically involve deformable convolution that convolves on arbitrary locations of input features. The locations change with different inputs and induce considerable dynamic and irregular memory accesses which cannot be handled by classic neural network accelerators (NNAs). Moreover, the bilinear interpolation (BLI) operation that is required to obtain deformed features in DCNs also cannot be deployed on existing NNAs directly....

10.48550/arxiv.2107.02547 preprint EN other-oa arXiv (Cornell University) 2021-01-01

The carbon nanotube field-effect transistor (CNFET), which promises both higher clock speed and energy efficiency, becomes an attractive alternative to the conventional power-hungry CMOS cache. We observe that a CNFET-based cache constructed with typical SRAM cells has distinct energy consumption when reading/writing 0 and 1 from/to it. For instance, the energy consumption of writing 1 to a cell is almost 10X higher than that of writing 0. With this observation, we propose an energy-efficient cache design called CNT-Cache to take advantage of this feature. It predicts the cache line access pattern...

10.23919/date48585.2020.9116395 article EN 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE) 2020-03-01