Bayesian neural architecture search using a training-free performance metric

Keywords: Recurrent neural network; Architecture optimization; Bayesian optimization; Neural architecture search; EO Data Science
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)
Fields of science (FOS): Computer and information sciences; 02 engineering and technology; 0202 electrical engineering, electronic engineering, information engineering

DOI: 10.1016/j.asoc.2021.107356
Publication Date: 2021-03-29T22:29:39Z

AUTHORS (4)

Andrés Camero

Hao Wang

Enrique Alba

Thomas Bäck

ABSTRACT

Recurrent neural networks (RNNs) are a powerful approach for time series prediction. However, their performance is strongly affected by their architecture and hyperparameter settings. The architecture optimization of RNNs is a time-consuming task, where the search space is typically a mixture of real, integer and categorical values. To allow for shrinking and expanding the size of the network, the representation of architectures often has a variable length. In this paper, we propose to tackle the architecture optimization problem with a variant of the Bayesian Optimization (BO) algorithm. To reduce the evaluation time of candidate architectures, Mean Absolute Error Random Sampling (MRS), a training-free method for estimating network performance, is adopted as the objective function for BO. We also propose three fixed-length encoding schemes to cope with the variable-length architecture representation. The result is a new perspective on the accurate and efficient design of RNNs, which we validate on three problems. Our findings show that 1) the BO algorithm can explore different network architectures using the proposed encoding schemes and successfully designs well-performing architectures, and 2) using MRS significantly reduces the optimization time without compromising performance relative to architectures obtained from the actual training procedure.
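
The two ideas in the abstract, a fixed-length encoding for variable-length architectures and a training-free score based on random weight sampling, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The zero-padding encoding, the `MAX_LAYERS` bound, the function names, the toy tanh RNN, and the use of an empirical fraction of low-MAE samples as the score are all assumptions made for illustration. The abstract does not describe the three proposed encodings or the exact MRS estimator.

```python
import numpy as np

MAX_LAYERS = 4  # hypothetical upper bound on the number of hidden layers


def encode(hidden_sizes, look_back):
    """Pad a variable-length list of layer sizes to a fixed-length vector."""
    if len(hidden_sizes) > MAX_LAYERS:
        raise ValueError("architecture exceeds MAX_LAYERS")
    padded = list(hidden_sizes) + [0] * (MAX_LAYERS - len(hidden_sizes))
    return np.array(padded + [look_back], dtype=int)


def decode(vector):
    """Drop zero-sized layers to recover the variable-length architecture."""
    *sizes, look_back = vector.tolist()
    return [s for s in sizes if s > 0], look_back


def random_rnn_predict(windows, hidden_sizes, rng):
    """Run a stacked tanh RNN with freshly sampled weights (no training)."""
    seq = windows[..., None]  # shape: (n_windows, look_back, 1)
    for h in hidden_sizes:
        w_in = rng.normal(0.0, 1.0, (seq.shape[-1], h))
        w_rec = rng.normal(0.0, 1.0, (h, h))
        state = np.zeros((seq.shape[0], h))
        outputs = []
        for t in range(seq.shape[1]):
            state = np.tanh(seq[:, t] @ w_in + state @ w_rec)
            outputs.append(state)
        seq = np.stack(outputs, axis=1)
    w_out = rng.normal(0.0, 1.0, (seq.shape[-1], 1))
    return (seq[:, -1] @ w_out).ravel()


def random_sampling_score(vector, series, n_samples=50, threshold=0.5, seed=0):
    """Fraction of random weight draws whose MAE falls below `threshold`.

    Simplifying assumption: an empirical fraction stands in for whatever
    probability estimate the actual MRS method uses.
    """
    hidden_sizes, look_back = decode(vector)
    rng = np.random.default_rng(seed)
    # Each window of `look_back` values predicts the value that follows it.
    windows = np.lib.stride_tricks.sliding_window_view(series, look_back)[:-1]
    targets = series[look_back:]
    maes = np.array([
        np.mean(np.abs(random_rnn_predict(windows, hidden_sizes, rng) - targets))
        for _ in range(n_samples)
    ])
    return float(np.mean(maes < threshold))


if __name__ == "__main__":
    toy_series = np.sin(np.linspace(0.0, 20.0, 200))
    candidate = encode([8, 4], look_back=5)
    print(candidate, random_sampling_score(candidate, toy_series))
```

In a search loop, an optimizer such as BO would propose fixed-length vectors like `candidate` and use the score as its objective. No network is trained at any point, which is where the time saving described in the abstract would come from.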

REFERENCES (44)

CITATIONS (14)
