CMN: a co-designed neural architecture search for efficient computing-in-memory-based mixture-of-experts
DOI:
10.1007/s11432-024-4144-y
Publication Date:
2024-09-26T09:02:13Z
AUTHORS (9)
ABSTRACT
Artificial intelligence (AI) has advanced substantially in recent years, notably with the advent of large language models (LLMs) employing mixture-of-experts (MoE) techniques and exhibiting human-like cognitive skills. As a promising hardware solution for edge MoE implementations, the computing-in-memory (CIM) architecture collocates memory and computing within a single device, significantly reducing data movement and the associated energy consumption. However, given the diversity of edge application scenarios and constraints, determining the optimal MoE network structure on CIM systems, such as each expert's location, quantity, and dimension, remains elusive. To this end, we introduce a software-hardware co-designed neural architecture search (NAS) framework, CIM-based MoE NAS (CMN), which identifies high-performing MoE structures under specific hardware constraints. Results on NYUD-v2 dataset segmentation with the RRAM (SRAM) CIM system show that CMN discovers optimized MoE configurations under energy, latency, and performance constraints, achieving 29.67× (43.10×) energy savings, 175.44× (109.89×) speedup, and a 12.24× smaller model size compared to the baseline MoE-enabled Visual Transformer. This co-design opens an avenue toward high-performance MoE deployments in edge CIM systems.
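The abstract refers to searching over MoE structural choices such as expert quantity and dimension. As a rough illustration of the kind of layer whose hyperparameters such a search would tune, the following is a minimal, self-contained NumPy sketch of a top-1-gated mixture-of-experts layer. All names, dimensions, and the gating scheme here are hypothetical and do not reflect the paper's actual CMN implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class MoELayer:
    """Toy top-1-gated MoE layer (illustrative only, not the paper's CMN).

    The structural hyperparameters a NAS framework would search over
    (n_experts, d_hidden) are arbitrary placeholders here.
    """

    def __init__(self, d_model, d_hidden, n_experts):
        # Gating network: scores each token against every expert.
        self.gate = rng.standard_normal((d_model, n_experts)) * 0.02
        # Each expert is a small two-layer MLP.
        self.w1 = rng.standard_normal((n_experts, d_model, d_hidden)) * 0.02
        self.w2 = rng.standard_normal((n_experts, d_hidden, d_model)) * 0.02

    def forward(self, x):  # x: (tokens, d_model)
        scores = softmax(x @ self.gate)        # (tokens, n_experts)
        top1 = scores.argmax(axis=-1)          # route each token to one expert
        out = np.zeros_like(x)
        for e in range(self.w1.shape[0]):
            idx = np.where(top1 == e)[0]       # tokens routed to expert e
            if idx.size == 0:
                continue
            h = np.maximum(x[idx] @ self.w1[e], 0.0)       # ReLU MLP
            out[idx] = (h @ self.w2[e]) * scores[idx, e:e + 1]
        return out

layer = MoELayer(d_model=8, d_hidden=16, n_experts=4)
y = layer.forward(rng.standard_normal((5, 8)))
print(y.shape)
```

Because only one expert fires per token, the per-token compute cost stays roughly constant as experts are added; the search described in the abstract additionally weighs such choices against CIM energy and latency constraints.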