- Ferroelectric and Negative Capacitance Devices
- Advanced Graph Neural Networks
- Embedded Systems Design Techniques
- Graph Theory and Algorithms
- Multimodal Machine Learning Applications
- Parallel Computing and Optimization Techniques
- Topic Modeling
- Natural Language Processing Techniques
- VLSI and FPGA Design Techniques
- Interconnection Networks and Systems
- Advanced Memory and Neural Computing
University of California, Los Angeles
2022-2025
Carnegie Mellon University
2021
Graph convolutional networks (GCNs) have been introduced to effectively process non-Euclidean graph data. However, GCNs incur large amounts of irregularity in computation and memory access, which prevents the efficient use of traditional neural network accelerators. Moreover, existing dedicated GCN accelerators demand high memory volumes and are difficult to implement onto resource-limited edge devices. In this work, we propose LW-GCN, a lightweight FPGA-based accelerator with a software-hardware co-designed...
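The irregularity the abstract above refers to can be seen in a minimal NumPy sketch of one GCN layer (hypothetical sizes and graph, not from the LW-GCN paper): the feature transform is dense and regular, while neighbor aggregation follows the sparse edge list and produces data-dependent gather/scatter memory accesses.

```python
import numpy as np

# Minimal sketch of one GCN layer forward pass. Two phases:
#   1) dense feature transform X @ W  -> regular compute, accelerator-friendly
#   2) sparse neighbor aggregation    -> irregular, edge-driven memory access
rng = np.random.default_rng(0)
num_nodes, in_feats, out_feats = 8, 4, 3
edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (6, 7)]  # sparse graph

X = rng.standard_normal((num_nodes, in_feats))   # node features
W = rng.standard_normal((in_feats, out_feats))   # layer weights

XW = X @ W                                       # phase 1: dense, regular
H = np.zeros((num_nodes, out_feats))
for src, dst in edges:                           # phase 2: sparse, irregular
    H[dst] += XW[src]                            # per-edge gather/scatter
H = np.maximum(H, 0.0)                           # ReLU activation
print(H.shape)
```

In a real workload the adjacency structure is far larger and stored in a compressed sparse format, but the split between a regular dense phase and an irregular sparse phase is the same.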
Linear algebra computations can be greatly accelerated using spatial accelerators on FPGAs. As a standard building block of linear algebra applications, BLAS covers a wide range of compute patterns that vary vastly in data reuse, bottleneck resources, matrix storage layouts, and data types. However, existing implementations of BLAS routines on FPGAs are stuck in the dilemma of productivity versus performance. They either require extensive human effort or fail to leverage the properties of the routines for acceleration. We introduce Lasa, a framework...
Graph Convolutional Networks (GCNs) have shown great results but come with large computation costs and memory overhead. Recently, sampling-based approaches have been proposed to alter input sizes, which allows GCN workloads to align with hardware constraints. Motivated by this flexibility, we propose an FPGA-based accelerator, named SkeletonGCN, along with multiple software-hardware co-optimizations to improve training efficiency. We first quantize all feature and adjacency matrices of the GCN from FP32 to SINT16. We then...
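The FP32-to-SINT16 step mentioned above can be sketched with a common symmetric per-tensor quantization rule; this is an illustrative scheme, not necessarily SkeletonGCN's exact one.

```python
import numpy as np

# Symmetric per-tensor quantization of an FP32 array to signed 16-bit
# integers (SINT16). The scale maps the largest magnitude to the int16 range.
def quantize_sint16(x: np.ndarray):
    max_abs = float(np.abs(x).max())
    scale = max_abs / 32767.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -32768, 32767).astype(np.int16)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover an approximation of the original FP32 values.
    return q.astype(np.float32) * scale

x = np.array([[0.5, -1.25], [2.0, 0.0]], dtype=np.float32)
q, s = quantize_sint16(x)
x_hat = dequantize(q, s)
print(np.abs(x - x_hat).max())  # quantization error, bounded by scale / 2
```

Halving the operand width this way roughly doubles the number of multiply-accumulate units and on-chip buffers a fixed FPGA budget can hold, which is the motivation for quantizing before training on the accelerator.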
We investigate the use of multimodal information contained in images as an effective method for enhancing the commonsense of Transformer models for text generation. We perform experiments using BART and T5 on concept-to-text generation, specifically the task of generative commonsense reasoning, or CommonGen. We call our approach VisCTG: Visually Grounded Concept-to-Text Generation. VisCTG involves captioning images representing appropriate everyday scenarios, and using these captions to enrich and steer the generation process. Comprehensive evaluation...
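The enrichment step described above can be sketched as simple input construction: captions of retrieved images are combined with the concept set before the string is fed to a seq2seq model such as BART or T5. The caption source and the separator token are illustrative assumptions, not the paper's exact format.

```python
# Hypothetical sketch of VisCTG-style input enrichment: grounding captions
# are prepended to the concept set to steer the seq2seq generation.
def build_visctg_input(concepts: list[str], captions: list[str]) -> str:
    # Captions first (visual grounding), then the concepts to cover.
    return " ".join(captions) + " | " + " ".join(concepts)

concepts = ["dog", "frisbee", "catch", "throw"]
captions = ["a dog catches a frisbee in a park."]  # stub for an image captioner
model_input = build_visctg_input(concepts, captions)
print(model_input)
```

The resulting string would then be tokenized and passed to the generation model in place of the bare concept set.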
Linear algebra can often be significantly expedited by spatial accelerators on FPGAs. As a broadly adopted linear algebra library, BLAS requires extensive optimizations for routines that vary vastly in data reuse, bottleneck resources, matrix storage layouts, and data types. Existing solutions are stuck in the dilemma of productivity versus performance. We introduce Lasa, a framework composed of a programming model and a compiler, which addresses this dilemma by abstracting (for productivity) and specializing (for performance) the architecture of the accelerator. Lasa...
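The claim that BLAS routines "vary vastly in data reuse" can be made concrete with a back-of-envelope arithmetic-intensity comparison between a Level-2 routine (GEMV) and a Level-3 routine (GEMM); the sizes below are arbitrary, not from the Lasa paper.

```python
# Arithmetic intensity = flops per matrix/vector element touched.
# Level-2 GEMV reuses nothing in A; Level-3 GEMM reuses every operand
# many times, so the two routines stress different accelerator resources.
def gemv_intensity(m: int, n: int) -> float:
    flops = 2 * m * n              # y = A @ x: one mul + one add per element
    elems = m * n + n + m          # A, x, y each touched once
    return flops / elems

def gemm_intensity(m: int, n: int, k: int) -> float:
    flops = 2 * m * n * k          # C = A @ B
    elems = m * k + k * n + m * n  # unique elements of A, B, C
    return flops / elems

print(gemv_intensity(1024, 1024))        # ~2: no reuse, memory-bound
print(gemm_intensity(1024, 1024, 1024))  # ~683: heavy reuse, compute-bound
```

A memory-bound routine like GEMV wants wide memory ports and streaming buffers, while a compute-bound routine like GEMM wants large systolic arrays with operand reuse, which is why a one-size-fits-all FPGA design underperforms across the BLAS suite.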