The paper introduces NEO-KD, a knowledge distillation approach for adversarial training in multi-exit networks. It reduces adversarial transferability across exits and improves adversarial test accuracy, as shown by experiments on anytime and budgeted prediction setups across several datasets.
(1) Seokil Ham, KAIST;

(2) Jungwuk Park, KAIST;

(3) Dong-Jun Han, Purdue University;

(4) Jaekyun Moon, KAIST.

Abstract and 1. Introduction

2. Related Works

3. Proposed NEO-KD Algorithm and 3.1 Problem Setup: Adversarial Training in Multi-Exit Networks

3.2 Algorithm Description

4. Experiments and 4.1 Experimental Setup

4.2. Main Experimental Results

4.3. Ablation Studies and Discussions

5. Conclusion, Acknowledgement and References

A. Experiment Details

B. Clean Test Accuracy and C. Adversarial Training via Average Attack

D. Hyperparameter Tuning

E. Discussions on Performance Degradation at Later Exits

F. Comparison with Recent Defense Methods for Single-Exit Networks

G. Comparison with SKD and ARD and H. Implementations of Stronger Attacker Algorithms

5. Conclusion

In this paper, we proposed a new knowledge-distillation-based adversarial training strategy for robust multi-exit networks. Our solution, NEO-KD, reduces adversarial transferability in the network while guiding the outputs for adversarial examples to closely follow the ensemble outputs of the neighbor exits on clean data, significantly improving the overall adversarial test accuracy. Extensive experimental results on both anytime and budgeted prediction setups using various datasets confirmed the effectiveness of our method compared to baselines relying on existing adversarial training or knowledge distillation techniques for multi-exit networks.
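To make the neighbor-ensemble idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that "neighbor exits" means the exits directly adjacent to exit i (clipped at the first and last exit), that the ensemble is a plain average of clean-data softmax outputs, and that the distance is a KL divergence. The function names (`neighbor_ensemble`, `neighbor_distillation_loss`) are hypothetical, and the part of the method that reduces adversarial transferability is not sketched here.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def neighbor_ensemble(clean_probs, i):
    """Average the clean-data predictions of exit i and its adjacent exits.

    Assumption: neighbors are exits i-1, i, i+1, clipped at the ends.
    """
    lo, hi = max(0, i - 1), min(len(clean_probs) - 1, i + 1)
    neighbors = clean_probs[lo:hi + 1]
    n_classes = len(neighbors[0])
    return [sum(p[c] for p in neighbors) / len(neighbors)
            for c in range(n_classes)]

def kl_divergence(target, pred, eps=1e-12):
    """KL(target || pred); using KL as the distance is an assumption."""
    return sum(t * (math.log(t + eps) - math.log(p + eps))
               for t, p in zip(target, pred))

def neighbor_distillation_loss(clean_logits, adv_logits):
    """Mean over exits of the distance between each exit's adversarial
    prediction and the neighbor ensemble of clean predictions.

    Both arguments are lists with one logit vector per exit.
    """
    clean_probs = [softmax(z) for z in clean_logits]
    adv_probs = [softmax(z) for z in adv_logits]
    losses = [kl_divergence(neighbor_ensemble(clean_probs, i), adv_probs[i])
              for i in range(len(adv_probs))]
    return sum(losses) / len(losses)

# Toy example: three exits, two classes (made-up logits).
clean = [[2.0, 0.0], [1.5, 0.5], [3.0, -1.0]]
adv = [[0.5, 1.0], [1.0, 0.8], [2.0, 0.0]]
loss = neighbor_distillation_loss(clean, adv)
```

In a training loop, a term like this would be added to the usual adversarial training loss, with the clean-data targets treated as fixed (no gradient) so that only the adversarial outputs are pulled toward the ensemble; that weighting and gradient handling are likewise assumptions here.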


Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2019R1I1A2A02061135), by the Center for Applied Research in Artificial Intelligence (CARAI) grant funded by DAPA and ADD (UD230017TD), and by IITP funds from MSIT of Korea (No. 2020-0-00626).


This paper is available on arXiv under a CC 4.0 license.