- Advanced Neural Network Applications
- Domain Adaptation and Few-Shot Learning
- Adversarial Robustness in Machine Learning
- Advanced Memory and Neural Computing
- Human Pose and Action Recognition
- Stochastic Gradient Optimization Techniques
- Machine Learning and ELM
- Neural Networks and Applications
- Generative Adversarial Networks and Image Synthesis
- Anomaly Detection Techniques and Applications
- Advanced Image and Video Retrieval Techniques
- Video Analysis and Summarization
- Artificial Intelligence in Games
- CCD and CMOS Imaging Sensors
- Error Correcting Code Techniques
- Parallel Computing and Optimization Techniques
- Video Surveillance and Tracking Methods
- IoT and Edge/Fog Computing
- Sparse and Compressive Sensing Techniques
- Cell Image Analysis Techniques
- Ferroelectric and Negative Capacitance Devices
- Advanced Data Compression Techniques
- Neural Dynamics and Brain Function
- Speech Recognition and Synthesis
- Network Security and Intrusion Detection
Clemson University
2022-2024
Ocean University of China
2024
Northeastern University
2007-2023
Universidad del Noreste
2019-2022
Northwest University
2022
William & Mary
2019
Syracuse University
2017-2018
Chinese Academy of Sciences
2018
Institute of Computing Technology
2018
Stony Brook University
2006-2007
Large-scale deep neural networks (DNNs) are both compute and memory intensive. As the size of DNNs continues to grow, it is critical to improve their energy efficiency and performance while maintaining accuracy. For DNNs, model size is an important factor affecting performance, scalability, and energy efficiency. Weight pruning achieves good compression ratios but suffers from three drawbacks: 1) the irregular network structure after pruning, which affects throughput; 2) the increased training complexity; and 3) the lack of a rigorous guarantee...
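To make the irregularity concrete, here is a minimal magnitude-based pruning sketch in NumPy (a generic baseline, not this paper's method; all names are illustrative). The surviving weights land at scattered positions, which is exactly the irregular structure that hurts throughput:

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights until `sparsity` of them are zero."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    # The threshold is the k-th smallest absolute value.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    return np.where(np.abs(weights) > threshold, weights, 0.0)

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))
w_pruned = magnitude_prune(w, sparsity=0.9)
print(f"nonzeros: {np.count_nonzero(w_pruned)} / {w.size}")  # irregularly scattered
```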
With the emergence of a spectrum of high-end mobile devices, many applications that formerly required desktop-level computation capability are being transferred to these devices. However, executing Deep Neural Network (DNN) inference is still challenging considering the high computation and storage demands, specifically if real-time performance with high accuracy is needed. Weight pruning of DNNs has been proposed, but existing schemes represent two extremes in the design space: non-structured pruning is fine-grained and accurate, but not hardware...
Structured weight pruning is a representative model compression technique for DNNs that reduces storage and computation requirements and accelerates inference. An automatic hyperparameter determination process is necessary due to the large number of flexible hyperparameters. This work proposes AutoCompress, an automatic structured pruning framework with the following key performance improvements: (i) effectively incorporating the combination of structured pruning schemes in the automatic process; (ii) adopting the state-of-the-art ADMM-based structured weight pruning as the core algorithm, and proposing an innovative...
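ADMM-based pruning, which AutoCompress builds on, alternates an SGD step on a quadratically regularized loss with a Euclidean projection onto the sparsity constraint set. A minimal sketch of one round of the Z- and dual-variable updates (parameter values are illustrative, and the W-update via SGD is only indicated in a comment):

```python
import numpy as np

def project_to_sparsity(x: np.ndarray, num_nonzero: int) -> np.ndarray:
    """Euclidean projection onto {Z : ||Z||_0 <= num_nonzero}: keep the largest magnitudes."""
    if num_nonzero >= x.size:
        return x.copy()
    threshold = np.partition(np.abs(x).ravel(), -num_nonzero)[-num_nonzero]
    return np.where(np.abs(x) >= threshold, x, 0.0)

rho = 1e-3                       # ADMM penalty parameter (illustrative)
W = np.random.default_rng(1).normal(size=(128, 128))
U = np.zeros_like(W)             # scaled dual variable
# W-update (not shown): SGD on loss(W) + (rho/2) * ||W - Z + U||^2
Z = project_to_sparsity(W + U, num_nonzero=int(0.1 * W.size))  # Z-update
U = U + W - Z                                                  # dual update
```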
Model compression techniques for Deep Neural Networks (DNNs) have been widely acknowledged as an effective way to achieve acceleration on a variety of platforms, and DNN weight pruning is a straightforward and effective method. There are currently two mainstream pruning methods representing two extremes of pruning regularity: non-structured, fine-grained pruning can achieve high sparsity and accuracy, but is not hardware friendly; structured, coarse-grained pruning exploits hardware-efficient structures, but suffers from an accuracy drop when the pruning rate is high. In...
Large deep neural network (DNN) models pose the key challenge to energy efficiency due to the significantly higher energy consumption of off-chip DRAM accesses than arithmetic or SRAM operations. This motivates intensive research on model compression with two main approaches. Weight pruning leverages the redundancy in the number of weights; it can be performed in a non-structured manner, which offers a flexible pruning rate but incurs index overhead from irregular weights, or in a structured manner, which preserves the full matrix structure at a lower pruning rate. Weight quantization...
Channel pruning has been broadly recognized as an effective technique to reduce the computation and memory cost of deep convolutional neural networks. However, conventional pruning methods have limitations in that they are restricted to the pruning process only and require a fully pre-trained large model. Such limitations may lead to sub-optimal model quality as well as excessive training cost. In this paper, we propose a novel channel exploration methodology, dubbed CHEX, to rectify these problems. As opposed to the pruning-only strategy, we repeatedly prune...
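CHEX's actual channel-exploration criterion is more involved than a norm heuristic, so the following is only a generic channel-pruning sketch in PyTorch showing the kind of per-filter ranking such methods start from (function names are ours):

```python
import torch
import torch.nn as nn

def channel_importance(conv: nn.Conv2d) -> torch.Tensor:
    """Generic importance score: L2 norm of each output filter."""
    return conv.weight.detach().flatten(1).norm(dim=1)

def keep_top_channels(conv: nn.Conv2d, keep: int) -> torch.Tensor:
    """Indices of the `keep` most important output channels."""
    return torch.topk(channel_importance(conv), keep).indices.sort().values

conv = nn.Conv2d(64, 128, kernel_size=3)
kept = keep_top_channels(conv, keep=96)
# A prune-and-regrow scheme in the spirit of CHEX would periodically re-rank
# and reactivate previously pruned channels instead of committing once.
```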
With the trend to deploy Deep Neural Network (DNN) inference models on edge devices with limited resources, quantization techniques have been widely used to reduce on-chip storage and improve computation throughput. However, existing DNN quantization work deploying below 8-bit either suffers from evident accuracy loss or faces a big gap between the theoretical improvement of throughput and the practical speedup. In this work, we propose a general framework, called FILM-QNN, to quantize and accelerate multiple DNN models across...
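FILM-QNN's specific intra-layer mixed-precision scheme is not reproduced here; the sketch below only shows the per-channel uniform "fake" quantization that such sub-8-bit frameworks are built on (bit assignments and names are illustrative):

```python
import torch

def quantize_per_channel(w: torch.Tensor, bits: int) -> torch.Tensor:
    """Symmetric per-output-channel uniform quantization (simulated/fake-quant)."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().flatten(1).max(dim=1).values / qmax      # one scale per channel
    scale = scale.clamp(min=1e-8).view(-1, *([1] * (w.dim() - 1)))
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    return q * scale  # dequantized values, for accuracy simulation

w = torch.randn(128, 64, 3, 3)
# A mixed-precision assignment might give 4 bits to quantization-tolerant
# channels and 8 bits to sensitive ones.
w4, w8 = quantize_per_channel(w, 4), quantize_per_channel(w, 8)
```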
Weight pruning methods for deep neural networks (DNNs) have been demonstrated to achieve a good model pruning rate without loss of accuracy, thereby alleviating the significant computation/storage requirements of large-scale DNNs. Structured weight pruning has been proposed to overcome the limitations of irregular network structure and enable actual GPU acceleration. However, in prior work, the pruning rate (degree of sparsity) and GPU acceleration are limited (to less than 50%) when accuracy needs to be maintained. In this work, we overcome these limitations by proposing a unified,...
Hardware acceleration of deep learning systems has been extensively investigated in industry and academia. The aim of this paper is to achieve ultra-high energy efficiency and performance for hardware implementations of deep neural networks (DNNs). An algorithm-hardware co-optimization framework is developed that is applicable to different DNN types, sizes, and application scenarios. The algorithm part adopts general block-circulant matrices for a fine-grained tradeoff between accuracy and compression ratio. It applies to both...
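The payoff of circulant structure is that a b x b block needs only b stored values, and its matrix-vector product reduces to FFTs. A self-contained NumPy sketch (the block layout and names are ours):

```python
import numpy as np

def circulant_matvec(col: np.ndarray, x: np.ndarray) -> np.ndarray:
    """y = C @ x for the circulant matrix C whose first column is `col`.
    FFT cuts the cost from O(b^2) to O(b log b) and storage from b^2 to b."""
    return np.real(np.fft.ifft(np.fft.fft(col) * np.fft.fft(x)))

def block_circulant_matvec(cols: np.ndarray, x: np.ndarray) -> np.ndarray:
    """y = W @ x where W consists of p x q circulant blocks of size b;
    `cols` has shape (p, q, b) holding each block's defining vector."""
    p, q, b = cols.shape
    xb = x.reshape(q, b)
    return np.stack([
        sum(circulant_matvec(cols[i, j], xb[j]) for j in range(q))
        for i in range(p)
    ]).ravel()

# Sanity check against a dense circulant block.
rng = np.random.default_rng(2)
c, v = rng.normal(size=8), rng.normal(size=8)
C = np.stack([np.roll(c, k) for k in range(8)], axis=1)  # first column is c
assert np.allclose(C @ v, circulant_matvec(c, v))
```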
State-of-the-art DNN structures involve high computation and a great demand for memory storage, which pose an intensive challenge on hardware resources. To mitigate these challenges, weight pruning techniques have been studied. However, an accurate solution for extreme structured pruning that combines different types of sparsity still awaits unraveling, due to the extremely reduced number of weights in such networks. In this paper, we propose a two-step (filter and column prune) framework by incorporating the alternating direction method of multipliers...
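As a rough illustration of combining two sparsity types, the sketch below zeroes whole filters and then whole columns of the reshaped weight matrix by L2 norm; in the ADMM formulation this kind of operation plays the role of the projection step, and the norm-based selection is only an illustrative stand-in:

```python
import numpy as np

def prune_filters_then_columns(W: np.ndarray, keep_filters: int, keep_cols: int) -> np.ndarray:
    """Two-step structured pruning on a conv weight of shape (C_out, C_in, k, k):
    zero whole filters (rows of the reshaped matrix), then whole columns."""
    M = W.reshape(W.shape[0], -1).copy()
    weak_rows = np.argsort(np.linalg.norm(M, axis=1))[:-keep_filters]
    M[weak_rows, :] = 0.0
    weak_cols = np.argsort(np.linalg.norm(M, axis=0))[:-keep_cols]
    M[:, weak_cols] = 0.0
    return M.reshape(W.shape)

W = np.random.default_rng(4).normal(size=(64, 32, 3, 3))
W_pruned = prune_filters_then_columns(W, keep_filters=32, keep_cols=144)
```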
Recently, a new trend of exploring sparsity for accelerating neural network training has emerged, embracing the paradigm of training on the edge. This paper proposes a novel Memory-Economic Sparse Training (MEST) framework targeting accurate and fast execution on edge devices. The proposed MEST framework consists of enhancements by Elastic Mutation (EM) and Soft Memory Bound (&S) that ensure superior accuracy at high sparsity ratios. Different from existing works on sparse training, this work reveals the importance of sparsity schemes on the performance...
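A sketch of what one elastic-mutation step could look like in PyTorch: the weakest active weights are dropped and an equal number of inactive positions are activated, so the sparsity (and thus the memory bound) stays fixed. MEST's actual mutation and growth criteria differ in detail; names here are ours:

```python
import torch

@torch.no_grad()
def elastic_mutation(weight: torch.Tensor, mask: torch.Tensor, mutate_frac: float) -> torch.Tensor:
    """Drop the weakest active weights, grow the same number elsewhere."""
    active = mask.bool()
    n_mut = int(active.sum().item() * mutate_frac)
    if n_mut == 0:
        return mask
    # Drop: smallest-magnitude active weights.
    scores = weight.abs().masked_fill(~active, float("inf")).ravel()
    drop = torch.topk(scores, n_mut, largest=False).indices
    # Grow: random inactive positions (random growth avoids storing dense
    # gradients, in keeping with the memory-economic goal).
    inactive = (~active).ravel().nonzero().squeeze(1)
    grow = inactive[torch.randperm(inactive.numel())[:n_mut]]
    new_mask = mask.ravel().clone()
    new_mask[drop], new_mask[grow] = 0.0, 1.0
    return new_mask.view_as(mask)
```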
Both industry and academia have extensively investigated hardware acceleration. To address the demands of increasing computational capability and memory requirements, in this work we propose a structured weight matrices (SWM)-based compression technique for both Field Programmable Gate Array (FPGA) and application-specific integrated circuit (ASIC) implementations. On the algorithm side, the SWM-based framework adopts block-circulant matrices to achieve a fine-grained tradeoff between accuracy and compression ratio. It can reduce...
Deep learning solutions are being increasingly deployed in mobile applications, at least for the inference phase. Due to the large model size and computational requirements, compression of deep neural networks (DNNs) becomes necessary, especially considering the real-time requirements of embedded systems. In this paper, we extend prior work on systematic DNN weight pruning using ADMM (Alternating Direction Method of Multipliers). We integrate ADMM regularization with masked mapping/retraining, thereby...
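Masked retraining itself is simple to state: once the ADMM phase fixes which positions are zero, every optimizer update is followed by re-applying the binary masks so pruned weights stay exactly zero. A PyTorch sketch (the surrounding training loop and mask construction are assumed):

```python
import torch
import torch.nn as nn

def masked_retrain_step(model: nn.Module, masks: dict[str, torch.Tensor],
                        loss: torch.Tensor, opt: torch.optim.Optimizer) -> None:
    """One retraining step under fixed pruning masks."""
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name in masks:
                p.mul_(masks[name])  # keep pruned positions at exactly zero
```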
Weight pruning is a popular technique to reduce the size and computation complexity of Convolutional Neural Networks (CNNs). Despite its success in reducing model size, weight pruning has brought limited benefit to CNN inference performance, due to the irregularity introduced by sparse convolution operations. In this work, we aim to improve the performance of sparse convolutions on GPUs by mitigating this irregularity. We find that existing optimization techniques for sparse matrix computations fail to accelerate sparse convolutions, and we observe that the main...
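For intuition, a pruned convolution can be lowered (via im2col) to a sparse-times-dense GEMM; the scattered nonzeros of the sparse matrix are the source of the irregular memory accesses. A SciPy sketch under that assumption (shapes are illustrative):

```python
import numpy as np
from scipy.sparse import csr_matrix

rng = np.random.default_rng(3)
W = rng.normal(size=(128, 64 * 3 * 3))            # 3x3 conv, C_in=64, C_out=128
W[np.abs(W) < 1.5] = 0.0                          # ~87% unstructured sparsity
patches = rng.normal(size=(64 * 3 * 3, 56 * 56))  # im2col'd input
out = csr_matrix(W) @ patches                     # sparse x dense GEMM
```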
To address the large model size and intensive computation requirements of deep neural networks (DNNs), weight pruning techniques have been proposed; they generally fall into two categories, i.e., static regularization-based pruning and dynamic regularization-based pruning. However, the former currently suffers from either complex workloads or accuracy degradation, while the latter takes a long time to tune its parameters to achieve the desired pruning rate without accuracy loss. In this paper, we propose a unified DNN weight pruning framework with dynamically updated...
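Generically, dynamically updated regularization can be pictured as a quadratic penalty toward the current sparse target whose strength grows over training, tightening the pruning gradually; the schedule and form below are illustrative only, not the paper's exact formulation:

```python
import torch

def dynamic_reg_penalty(weight: torch.Tensor, target: torch.Tensor,
                        step: int, rho0: float = 1e-4, growth: float = 1.01) -> torch.Tensor:
    """Quadratic pull toward the sparse target, with a penalty that grows per step."""
    rho = rho0 * growth ** step
    return 0.5 * rho * (weight - target).pow(2).sum()
```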
Weight pruning of deep neural networks (DNNs) has been proposed to satisfy the limited storage and computing capability of mobile edge devices. However, previous pruning methods mainly focus on reducing the model size and/or improving performance without considering the privacy of user data. To mitigate this concern, we propose a privacy-preserving-oriented pruning and acceleration framework that does not require the private training dataset. At the algorithm level of the framework, a systematic weight pruning technique based on the alternating direction...
Weight pruning is a powerful technique to realize model compression. We propose PCNN, a fine-grained regular 1D pruning method. A novel index format called Sparsity Pattern Mask (SPM) is presented to encode the sparsity in PCNN. Leveraging SPM with limited patterns and non-zero sequences of equal length, PCNN can be efficiently employed in hardware. Evaluated on VGG-16 and ResNet-18, our method achieves a compression rate of up to 8.4× with only 0.2% accuracy loss. We also implement a pattern-aware architecture in a 55nm process, achieving 9.0×...
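The SPM idea can be sketched as follows: fix a small library of allowed per-kernel sparsity patterns, then store only a pattern index plus the non-zero values for each kernel. The patterns below are made up for illustration; PCNN selects its own:

```python
import torch

# Four allowed 1D patterns over a flattened 3x3 kernel (4 nonzeros each),
# so non-zero sequences all have equal length.
PATTERNS = torch.tensor([
    [1, 1, 0, 0, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 1, 0, 1, 0, 0],
    [1, 0, 0, 1, 1, 1, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 1, 1, 0],
], dtype=torch.float32)

def assign_patterns(weight: torch.Tensor):
    """Per 3x3 kernel, pick the pattern preserving the most weight magnitude;
    returns the pruned weights and the pattern indices (the SPM)."""
    mags = weight.flatten(2).abs()                    # (C_out, C_in, 9)
    scores = torch.einsum("oik,pk->oip", mags, PATTERNS)
    spm = scores.argmax(dim=-1)                       # (C_out, C_in)
    mask = PATTERNS[spm].view_as(weight)
    return weight * mask, spm

w = torch.randn(8, 4, 3, 3)
w_pruned, spm = assign_patterns(w)
```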