- Parallel Computing and Optimization Techniques
- Interconnection Networks and Systems
- Low-power high-performance VLSI design
- Data Mining Algorithms and Applications
- Embedded Systems Design Techniques
- Network Security and Intrusion Detection
- VLSI and FPGA Design Techniques
- Analog and Mixed-Signal Circuit Design
- Security and Verification in Computing
- Radiation Effects in Electronics
- Matrix Theory and Algorithms
- Advanced Malware Detection Techniques
- VLSI and Analog Circuit Testing
- Photonic and Optical Devices
- Mathematical Inequalities and Applications
- Error Correcting Code Techniques
- Advancements in PLL and VCO Technologies
- Software Engineering Research
University of Washington
2018-2021
Seattle University
2019-2020
Silicon Labs (United States)
2018-2020
Cornell University
2019
Qiqihar University
2012
Rapidly emerging workloads require rapidly developed chips. The Celerity 16-nm open-source SoC was implemented in nine months using an architectural trifecta to minimize development time: a general-purpose tier comprised of Linux-capable RISC-V cores, a massively parallel tiled manycore array that can be scaled to arbitrary sizes, and a specialization tier that uses high-level synthesis (HLS) to create an algorithmic neural-network accelerator. These tiers are tied together with efficient heterogeneous remote...
This article introduces BlackParrot, which aims to be the default open-source, Linux-capable, cache-coherent, 64-bit RISC-V multicore used by the world. In executing this goal, our research aims to advance the world's knowledge about the "software engineering of hardware." Although originally bootstrapped by the University of Washington and Boston University via DARPA funding, BlackParrot strives to be community driven and infrastructure agnostic; a multicore that is Pareto optimal in terms of power, performance, area, and complexity. In order to ensure easy use,...
This letter presents a 16-nm 496-core RISC-V network-on-chip (NoC). The mesh achieves 1.4 GHz at 0.98 V, yielding a peak throughput of 695 Giga instructions/s (GRVIS), an energy efficiency of 314.89 GRVIS/W, and a record 825,320 CoreMark benchmark score. Unlike the previously reported score [1], this new score was obtained without modifying the core code. The main feature is the NoC architecture, which uses only 1881 μm²...
This paper presents a 16 nm 496-core RISC-V network-on-chip (NoC). The mesh achieves 1.4 GHz at 0.98 V, yielding a peak throughput of 695 Giga instructions/s (GRVIS) and a record 812,350 CoreMark benchmark score. The main feature is the NoC architecture, which uses only 1881 μm² per router node, enables highly scalable and dense compute, and provides up to 361 Tb/s of aggregate bandwidth.
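The peak-throughput figure quoted above can be sanity-checked with simple arithmetic. The sketch below is only an illustration, not the authors' methodology: it assumes each core retires at most one instruction per cycle and that there is one router node per core, neither of which the abstract states. The function and variable names are hypothetical.

```python
# Back-of-the-envelope check of the abstract's headline numbers.
# Assumptions (not stated in the abstract): one instruction per core
# per cycle at peak, and exactly one router node per core.

CORES = 496                 # core count from the abstract
FREQ_GHZ = 1.4              # clock frequency at 0.98 V, from the abstract
ROUTER_AREA_UM2 = 1881.0    # area per router node, from the abstract


def peak_throughput_ginst(cores: int, freq_ghz: float, ipc: float = 1.0) -> float:
    """Peak throughput in giga-instructions per second."""
    return cores * freq_ghz * ipc


def total_router_area_mm2_for(nodes: int, area_um2: float) -> float:
    """Total router area in mm^2 (1 mm^2 = 1e6 um^2)."""
    return nodes * area_um2 / 1e6


peak_ginst = peak_throughput_ginst(CORES, FREQ_GHZ)
total_router_area_mm2 = total_router_area_mm2_for(CORES, ROUTER_AREA_UM2)

print(f"peak throughput: {peak_ginst:.1f} G instructions/s")
print(f"total router area: {total_router_area_mm2:.3f} mm^2")
```

Under these assumptions the product comes to about 694.4 G instructions/s, which is consistent with the quoted 695 GRVIS after rounding, and the routers together would occupy a little under 1 mm².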
A sparse matrix-matrix multiplication (SpMM) accelerator with 48 heterogeneous cores and a reconfigurable memory hierarchy is fabricated in 40-nm CMOS. The compute fabric consists of dedicated floating-point units and general-purpose Arm Cortex-M0 and Cortex-M4 cores. The on-chip memory reconfigures as scratchpad or cache, depending on the phase of the algorithm. The units are interconnected with synthesizable coalescing crossbars for efficient access. The 2.0-mm × 2.6-mm chip exhibits 12.6× (8.4×) energy efficiency gain, 11.7×...
Network-on-chip design has been an active area of academic research for two decades, but many proposed ideas have not been adopted in real chips because they have complex behavior or create significant risks in chip implementation. For this reason, existing chips just employ fast, replicated vanilla dimension-ordered mesh NoCs. However, these networks do not come close to utilizing the full available VLSI wiring capabilities, and propagate packets at speeds that are significantly below the raw speed of wires. The ideal...
A sparse matrix-matrix multiplication (SpMM) accelerator with 48 heterogeneous cores and a reconfigurable memory hierarchy is fabricated in 40 nm CMOS. On-chip memories are reconfigured as scratchpad or cache and are interconnected with synthesizable coalescing crossbars for efficient access in each phase of the algorithm. The 2.0 mm × 2.6 mm chip exhibits 12.6× (8.4×) energy efficiency gain, 11.7× (77.6×) off-chip bandwidth gain, and 17.1× (36.9×) compute density gain against a high-end CPU (GPU) across a diverse set...
Conventional wisdom states that network-on-chip router area grows quadratically with the channel width, and this perception has fundamentally shaped the assumptions of thousands of NoC papers that have been written to date, and of many chip designs. However, this assumption is not entirely true. Simple analysis and empirical data from this paper show that, in modern standard cell technology, a router's logic actually grows only linearly; it is solely the wire routing that grows quadratically. If we think of the router as a standalone block done in hierarchical VLSI...
A sparse matrix-matrix multiplication (SpMM) accelerator with 48 heterogeneous cores and a reconfigurable memory hierarchy is fabricated in 40 nm CMOS. On-chip memories are reconfigured as scratchpad or cache and are interconnected with synthesizable coalescing crossbars for efficient access in each phase of the algorithm. The 2.0 mm × 2.6 mm chip exhibits...
A new system model for software vulnerability discovery is developed, based on fuzzing, feature matching of API sequences, and data mining. Overcoming the disadvantages of older techniques, this method effectively improves the detection of potential unknown security vulnerabilities in software. In addition, it is more automated and performs better at finding vulnerabilities.
This paper proposes a fully synthesizable digital low-dropout (DLDO) regulator using an automatic offset control and reuse technique, implemented in 65-nm CMOS technology. To realize the DLDO design, all components of the core blocks are made with standard logic cells. The proposed technique is adopted to cancel the offset voltage from the cells automatically, provide adaptive equivalent thresholds for the comparison window, and speed up the dropout response. Besides, a modified comparator-triggered oscillator introduces output...
Considering the efficiency problem of software vulnerability discovery in Linux systems, a new system program with a data mining algorithm is proposed in this paper. An improved REL based on a one-dimensional linked list is proposed, along with a sequence call algorithm, which we then use for the analysis and detection of vulnerabilities. A model, LRE, was designed. Finally, experimental results show its validity in terms of reducing the false alarm rate and improving security vulnerability discovery.