Optimizing Robustness and Accuracy in Mixture of Experts: A Dual-Model Approach
DOI:
10.48550/arxiv.2502.06832
Publication Date:
2025-02-05
AUTHORS (4)
ABSTRACT
Mixture of Experts (MoE) models have shown remarkable success in leveraging specialized expert networks for complex machine learning tasks. However, their susceptibility to adversarial attacks presents a critical challenge for deployment in robust applications. This paper addresses the question of how to incorporate robustness into MoEs while maintaining high natural accuracy. We begin by analyzing the vulnerability of MoE components, finding that the expert networks are notably more susceptible to adversarial attacks than the router. Based on this insight, we propose a targeted robust training technique that integrates a novel loss function to enhance the robustness of the MoE, requiring only the robustification of one additional expert without compromising training or inference efficiency. Building on this, we introduce a dual-model strategy that linearly combines a standard MoE model with our robustified MoE model using a smoothing parameter. This approach allows flexible control over the robustness-accuracy trade-off. We further provide theoretical foundations by deriving certified robustness bounds for both the single MoE and the dual-model setting. To push the boundaries of accuracy, a joint training strategy, JTDMoE, enhances accuracy beyond what is achievable with separately trained models. Experimental results on the CIFAR-10 and TinyImageNet datasets with ResNet18 and Vision Transformer (ViT) architectures demonstrate the effectiveness of the proposed methods.
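To make the dual-model idea concrete, below is a minimal PyTorch sketch of the linear combination described in the abstract. The class and parameter names (DualMoE, standard_moe, robust_moe, alpha) are hypothetical and do not reproduce the authors' training pipeline or loss function; the sketch only assumes that a standard MoE and a robustified MoE are already trained and that their predictions are blended with a smoothing parameter.

```python
# Minimal sketch (hypothetical names) of blending a standard MoE with a
# robustified MoE using a smoothing parameter alpha, as described in the
# abstract. Not the authors' released implementation.
import torch
import torch.nn as nn


class DualMoE(nn.Module):
    def __init__(self, standard_moe: nn.Module, robust_moe: nn.Module, alpha: float = 0.5):
        super().__init__()
        self.standard_moe = standard_moe  # trained for high natural accuracy
        self.robust_moe = robust_moe      # adversarially robustified variant
        self.alpha = alpha                # smoothing parameter in [0, 1]

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Linear combination of the two models' class probabilities:
        # alpha = 0 recovers the standard model, alpha = 1 the robust one,
        # and intermediate values trade robustness against natural accuracy.
        p_standard = torch.softmax(self.standard_moe(x), dim=-1)
        p_robust = torch.softmax(self.robust_moe(x), dim=-1)
        return (1.0 - self.alpha) * p_standard + self.alpha * p_robust
```

Blending in probability space (after softmax) is one reasonable reading of "linearly combines"; combining raw logits instead would be an equally simple variant of the same idea.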