Abstract
Deep neural networks (DNNs) are crucial in safety-critical applications but vulnerable to adversarial attacks, where subtle perturbations cause misclassification. Existing defense mechanisms struggle with small perturbations and face accuracy-robustness trade-offs. This study introduces the "Reverse Attack" method to address these challenges. Our approach uniquely reconstructs and classifies images by applying perturbations opposite to the attack direction, using a complementary "Revenant" classifier to maintain accuracy on original images. The proposed method significantly outperforms existing strategies, maintaining clean image accuracy with only a 2.92% decrease while achieving over 70% robust accuracy against all benchmarked adversarial attacks. This contrasts with current mechanisms, which typically suffer an 18% reduction in clean image accuracy and achieve only 36% robustness against adversarial examples. We evaluate our method on the CIFAR-10 dataset using ResNet50, testing against various attacks including PGD and components of AutoAttack. Although our approach incurs additional computational cost during reconstruction, it represents a significant advancement in robust defenses against adversarial attacks while preserving clean input performance. This balanced approach paves the way for more reliable DNNs in critical applications. Future work will focus on optimization and on exploring applicability to larger datasets and more complex architectures.
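The core idea summarized in the abstract, reconstructing an input by applying a perturbation opposite to the attack direction, can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a toy linear classifier with known weights `w`, a known label `y`, and an FGSM-style attack, purely to show the "reverse" step undoing the attack step.

```python
import numpy as np

# Toy linear classifier: score = w . x (all names here are illustrative
# assumptions, not the paper's actual model or notation).
w = np.array([1.0, -2.0])

def grad_loss(x, y):
    # Gradient w.r.t. x of a margin loss -y * (w . x) is simply -y * w.
    return -y * w

def fgsm_attack(x, y, eps=0.5):
    # FGSM-style attack: step in the signed-gradient direction
    # that increases the loss.
    return x + eps * np.sign(grad_loss(x, y))

def reverse_step(x_adv, y, eps=0.5):
    # "Reverse" idea: apply the perturbation opposite to the attack
    # direction, recomputed at the adversarial point.
    return x_adv - eps * np.sign(grad_loss(x_adv, y))

x = np.array([2.0, 1.0])
y = 1.0
x_adv = fgsm_attack(x, y)
x_rec = reverse_step(x_adv, y)
print(np.allclose(x_rec, x))  # → True: the reverse step undoes the attack
```

For a linear model the gradient sign is constant, so the reverse step recovers the input exactly; for a deep network (and without knowing the true attack direction) the reconstruction is only approximate, which is where the paper's Revenant classifier and evaluation against PGD and AutoAttack come in.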
| Original language | English |
| --- | --- |
| Pages (from-to) | 176485-176497 |
| Number of pages | 13 |
| Journal | IEEE Access |
| Volume | 12 |
| DOIs | |
| Publication status | Published - 2024 |
Bibliographical note
Publisher Copyright: © 2013 IEEE.
All Science Journal Classification (ASJC) codes
- General Computer Science
- General Materials Science
- General Engineering