Multi-objective Q-learning-based hyper-heuristic with Bi-criteria selection for energy-aware mixed shop scheduling

Job shop Heuristics
DOI: 10.1016/j.swevo.2021.100985 Publication Date: 2021-09-16T09:29:45Z
ABSTRACT
Owing to diverse customer demands and enormous product variety, mixed shop production systems are applied in practice to improve responsiveness and productivity while saving energy. This work addresses a mixed job-shop and flow-shop production scheduling problem with a speed-scaling policy and a no-idle time strategy. A mixed-integer linear programming model is formulated to determine the speed level of operations and the sequence of job-shop and flow-shop products, aiming at the simultaneous optimization of production efficiency and energy consumption. Then, a multi-objective Q-learning-based hyper-heuristic with Bi-criteria selection (QHH-BS) is developed to obtain a set of high-quality Pareto frontier solutions. In this algorithm, a new three-layer encoding is designed to represent the production sequence of job-shop and flow-shop products; Pareto-based and indicator-based selection criteria are implemented sequentially to encourage diversity and convergence; and Q-learning with a multi-objective metric-based reward mechanism is applied to select one of three prominent optimizers in each iteration for better exploration and exploitation. Three conclusions are drawn from extensive experiments: (1) Bi-criteria selection is superior to single-criterion selection; (2) the Q-learning-based hyper-heuristic is more effective and robust than single-optimizer algorithms and simple hyper-heuristics; (3) QHH-BS outperforms existing state-of-the-art multi-objective algorithms in convergence and diversity.
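As a rough illustration of two ideas named in the abstract, the sketch below shows a Pareto non-dominance filter (minimization) and a tabular Q-learning selector that picks one of several optimizers per iteration. It is a minimal sketch under stated assumptions, not the QHH-BS implementation: the abstract does not specify the state definition, the reward formula, or the exploration policy, so the choice of "state = index of the previously applied optimizer", the epsilon-greedy policy, and all names (`dominates`, `pareto_front`, `QSelector`) are hypothetical.

```python
import random


def dominates(a, b):
    """Return True if objective vector a Pareto-dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the non-dominated points, preserving input order."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]


class QSelector:
    """Tabular Q-learning over (state, optimizer) pairs.

    Assumption: a state is the index of the previously applied optimizer,
    and an action is the index of the optimizer to apply next.
    """

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9,
                 epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability (assumed policy)
        self.rng = random.Random(seed)
        self.q = [[0.0] * n_actions for _ in range(n_states)]

    def choose(self, state):
        """Epsilon-greedy choice of the next optimizer index."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        row = self.q[state]
        return row.index(max(row))

    def update(self, state, action, reward, next_state):
        """Standard one-step Q-learning update."""
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])
```

In a hyper-heuristic loop, one would call `choose`, apply the selected optimizer to the population, compute a reward from a multi-objective quality metric (for example, whether the metric improved), and then call `update`; the exact reward used by the authors is not given in the abstract.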
REFERENCES (46)
CITATIONS (52)