<div id="top"></div>

<br />
<div align="center">

<h2 align="center">Defense Against Adversarial Attacks using Convolutional Auto-Encoders</h2>

<p align="center">
Official code implementation
<br />
<br />
<a href="https://arxiv.org/abs/2312.03520">View Paper</a>
·
<a href="https://github.com/Shreyasi2002/Adversarial_Attack_Defense/issues">Report Bug</a>
·
<a href="https://github.com/Shreyasi2002/Adversarial_Attack_Defense/pulls">Request Feature</a>
</p>
</div>


<!-- TABLE OF CONTENTS -->
<details>
<summary><b>Table of Contents</b></summary>
<ol>
<li>
<a href="#about">About</a>
</li>
<li>
<a href="#usage-instructions">Usage Instructions</a>
<ul>
<li><a href="#project-structure">Project Structure</a></li>
<li><a href="#install-dependencies">Install Dependencies</a></li>
<li><a href="#train-vgg16">Train VGG16</a></li>
<li><a href="#adversarial-attacks">Adversarial Attacks</a></li>
<li><a href="#train-and-test-autoencoder">Train and Test AutoEncoder</a></li>
</ul>
</li>
<li>
<a href="#results">Results</a>
</li>
<li>
<a href="#citation">Citation</a>
</li>
</ol>
</details>

## About
Deep learning models, while achieving state-of-the-art performance on many tasks, are susceptible to adversarial attacks that exploit inherent vulnerabilities in their architectures.
Adversarial attacks manipulate the input data with imperceptible perturbations, causing the model to misclassify the data or produce erroneous outputs. Szegedy et al. (https://arxiv.org/abs/1312.6199) discovered
that Deep Neural Network models can be manipulated into making wrong predictions by adding small perturbations to the input image.

<div align="center">
<img src="images/attack-dnn.png" width="80%"/>

Fig 1: Szegedy et al. fooled AlexNet into classifying a perturbed image of a dog as an ostrich
</div>
A U-shaped convolutional auto-encoder is used to reconstruct the original input from adversarial images generated by FGSM and PGD attacks, effectively removing the adversarial perturbations.
The autoencoder is trained to minimise the mean squared error (MSE) loss between the original unperturbed image and the image reconstructed from an adversarial example.
While doing so, random Gaussian noise is added after encoding the image to make the model more robust:
perturbing the latent representation by a small magnitude and then decoding it mirrors how adversarial examples are generated in the first place.
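Below is a minimal PyTorch-style sketch of this training objective. It assumes generic `encoder` and `decoder` modules and a `noise_std` hyperparameter; the names are illustrative, not the exact ones used in `AElib/autoencoder.py`.

```python
import torch
import torch.nn.functional as F

def reconstruction_loss(encoder, decoder, x_adv, x_clean, noise_std=0.1):
    # Encode the adversarial image into its latent representation
    z = encoder(x_adv)
    # Perturb the latent code with small Gaussian noise for extra robustness
    z_noisy = z + noise_std * torch.randn_like(z)
    # Decode the noisy latent code and compare against the clean image
    x_rec = decoder(z_noisy)
    return F.mse_loss(x_rec, x_clean)
```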

<div align="center">
<img src="images/convAE.png" width="80%"/>

Fig 2: Architecture of the proposed Convolutional AutoEncoder
</div>

## Usage Instructions

### Project Structure
```
📂 Adversarial-Defense
|_📁 AElib
|_📄 VGG.py # VGG16 architecture for training on MNIST and Fashion-MNIST datasets
|_📄 autoencoder.py # Architecture of Convolutional AutoEncoder with GELU activation
|_📄 utils.py # Utility functions
|_📄 attacks.py # Implementation of PGD and FGSM Attacks
|_📁 images
|_📁 models # Trained VGG16 models and the AutoEncoder models for different attacks and datasets
|_📁 notebooks # Jupyter notebooks containing detailed explanations with visualisations
|_📄 fgsm-attack-on-mnist-and-fashion-mnist-dataset.ipynb
|_📄 pgd-attack-on-mnist-and-fashion-mnist.ipynb
|_📄 vgg16-on-mnist-and-fashion-mnist.ipynb
|_📄 defense-against-adversarial-attacks-autoencoder.ipynb
|_📄 adverse_attack.py # Applying Adversarial attacks on the desired dataset
|_📄 train_vgg16.py # Training VGG16 for multi-class classification
|_📄 LICENSE
|_📄 requirements.txt
|_📄 autoencoder.py # Training and Testing AutoEncoder for Adversarial Defense
|_📄 .gitignore
```

If you are more comfortable with Jupyter notebooks, you can refer to the `notebooks` folder in this repository :)

### Install Dependencies
Run the following command -
```bash
pip install -r requirements.txt
```
Since the pre-trained VGG16 models are large, they are not included in this repository. You can download them from here - https://www.kaggle.com/datasets/shreyasi2002/vgg16-models-mnist-fashion-mnist

**Important**: This project uses a GPU to train and test the models.

To check the availability of a GPU, run the following command - `!nvidia-smi`
<br/>
If you see an output like this, you are good to go :)
```
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05 Driver Version: 535.104.05 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Tesla T4 Off | 00000000:00:04.0 Off | 0 |
| N/A 38C P8 9W / 70W | 0MiB / 15360MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
```
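Alternatively, assuming the PyTorch stack used by this project, you can run a quick sanity check from Python (not part of the repository scripts):

```python
import torch

# Verify that PyTorch can see a CUDA-capable GPU
print(torch.cuda.is_available())          # True if a GPU is usable
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. "Tesla T4"
```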


### Train VGG16
VGG16 is a popular architecture for image classification and is easy to use with transfer learning.
<div align="center">
<img src="./images/vgg.png" width="80%" />

Fig 3: VGG16 Architecture
</div>

Run the following command to train the VGG16 model from scratch -
```bash
!python train_vgg16.py [-h] [--dataset {mnist,fashion-mnist}] [--lr LR] [--epochs EPOCHS]
```
1. `--dataset` argument to select the dataset - MNIST or Fashion-MNIST (default is MNIST)
2. `--lr` argument to set the learning rate (default is 0.001)
3. `--epochs` argument to set the number of epochs (default is 10)

You can also use the `-h` flag to view the documentation.

Example Usage : `!python train_vgg16.py --dataset mnist --lr 0.0001 --epochs 20`

More information about this can also be found here - https://www.kaggle.com/code/shreyasi2002/vgg16-on-mnist-and-fashion-mnist
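For reference, a common way to adapt a standard VGG16 to single-channel inputs like MNIST looks like the sketch below. This is only illustrative - the actual architecture used here lives in `AElib/VGG.py` and may differ.

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from a standard (untrained) VGG16
model = models.vgg16(weights=None)
# Accept 1 input channel (grayscale) instead of 3
model.features[0] = nn.Conv2d(1, 64, kernel_size=3, padding=1)
# Output 10 classes instead of 1000
model.classifier[6] = nn.Linear(4096, 10)

# Dummy batch of grayscale images (MNIST images would be upsampled to a VGG-friendly size)
x = torch.randn(8, 1, 224, 224)
print(model(x).shape)  # torch.Size([8, 10])
```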


### Adversarial Attacks
This project supports two types of attacks:
1. FGSM (Fast Gradient Sign Method) Adversarial Attack (https://www.kaggle.com/code/shreyasi2002/fgsm-attack-on-mnist-and-fashion-mnist-dataset)
2. PGD (Projected Gradient Descent) Attack (https://www.kaggle.com/code/shreyasi2002/pgd-attack-on-mnist-and-fashion-mnist)

To apply an attack to a dataset, use the following command -
```bash
!python adverse_attack.py [-h] [--attack {fgsm,pgd}] [--dataset {mnist,fashion-mnist}]
[--epsilon EPSILON]
```
1. `--attack` argument to select the attack type - PGD or FGSM (default is PGD since it is a stronger attack)
2. `--dataset` argument to select the dataset - MNIST or Fashion-MNIST (default is MNIST)
3. `--epsilon` argument to set the strength of the attack. For the FGSM attack, keep this value in the range [0, 0.8]; for the PGD attack, keep it in the range [0, 0.3].

You can also use the `-h` flag to view the documentation.

Example Usage : `!python adverse_attack.py --attack pgd --dataset fashion-mnist --epsilon 0.3`
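Under the hood, FGSM perturbs each pixel by a single ε-scaled step along the sign of the input gradient. A generic PyTorch sketch is shown below; the repository's own implementation is in `AElib/attacks.py` and may differ in details.

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, epsilon):
    # Compute the gradient of the loss with respect to the input image
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Step by epsilon in the direction that increases the loss
    x_adv = x + epsilon * x.grad.sign()
    # Keep pixel values in the valid range
    return x_adv.clamp(0, 1).detach()
```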

The higher the `epsilon (ε)` value, the stronger the attack. As evident from Fig 4, large epsilon (ε) values (here 1.0) corrupt the label semantics, making it impossible to recover the original image. Hence, it is recommended to keep ε below 1.0.
<div align="center">
<img src="./images/attack-mnist-1.0-1.png" width="80%" />

Fig 4: FGSM Attack (ε = 1.0) on the MNIST dataset
</div>
Since the PGD adversarial examples are more natural-looking, as seen in Fig 5, I have created a dataset of adversarial examples for the MNIST dataset. Feel free to play around with it :)
<div align="center">
<img src="./images/attack-fashion-mnist-0.3-1.png" width="80%" />

Fig 5: PGD Attack (ε = 0.3) on the Fashion MNIST dataset
</div>
Link to Dataset - https://www.kaggle.com/datasets/shreyasi2002/corrupted-mnist
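For comparison, PGD applies several smaller FGSM-style steps and projects the result back into an ε-ball around the original image. Again, this is a generic sketch with illustrative step-size defaults; see `AElib/attacks.py` for the actual implementation.

```python
import torch
import torch.nn.functional as F

def pgd(model, x, y, epsilon, alpha=0.01, steps=40):
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Take a small signed-gradient step ...
        x_adv = x_adv.detach() + alpha * grad.sign()
        # ... then project back into the epsilon-ball and the valid pixel range
        x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon)
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```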


### Train and Test AutoEncoder
Use the following command to train the autoencoder model from scratch -
```bash
!python autoencoder.py [-h] [--attack {fgsm,pgd}]
[--dataset {mnist,fashion-mnist}]
--action train
[--use_pretrained {True,False}] [--epsilon EPSILON]
[--epochs EPOCHS]
```
Use the following command to test the model -
```bash
!python autoencoder.py [-h] [--attack {fgsm,pgd}]
[--dataset {mnist,fashion-mnist}]
--action test
--use_pretrained True
```
1. `--attack` argument to select the attack type - PGD or FGSM (default is PGD since it is a stronger attack)
2. `--dataset` argument to select the dataset - MNIST or Fashion-MNIST (default is MNIST)
3. `--action` argument to either train the model or test a pre-trained model
4. `--use_pretrained` argument set to True to test a model or to continue training from an existing pre-trained model
5. `--epsilon` argument to set the strength of the attack (training only)
6. `--epochs` argument to set the number of epochs (training only)

Example Usages:
1. `!python autoencoder.py --attack fgsm --dataset fashion-mnist --action train --use_pretrained False --epsilon 0.6 --epochs 10`
2. `!python autoencoder.py --attack pgd --dataset mnist --action test --use_pretrained True`
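Conceptually, the test-time defense is: craft an adversarial image, pass it through the trained autoencoder, and feed the reconstruction to the classifier. A minimal sketch of that pipeline, assuming `classifier`, `autoencoder`, a data `loader`, and an attack function like the FGSM sketch above (all names illustrative):

```python
import torch

@torch.no_grad()
def defended_accuracy(classifier, autoencoder, loader, attack_fn, epsilon, device="cuda"):
    correct, total = 0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        # Craft adversarial examples against the classifier
        with torch.enable_grad():
            x_adv = attack_fn(classifier, x, y, epsilon)
        # Remove the perturbation by reconstructing with the autoencoder
        x_rec = autoencoder(x_adv)
        # Classify the purified images
        preds = classifier(x_rec).argmax(dim=1)
        correct += (preds == y).sum().item()
        total += y.size(0)
    return correct / total
```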


## Results
The AutoEncoder reconstructs images that closely match the originals, as shown below -
<div align="center">
<img src="./images/reconstruction_fgsm_mnist_2.png" width="80%" />

<img src="./images/reconstruction_pgd_mnist_3.png.png" width="80%" />

Fig 6: Comparison of the adversarial image, reconstructed image and the original image
</div>
With the defense in place, the accuracy of the pre-trained VGG-16 classifier under the FGSM attack increases by 65.61 percentage points on MNIST and 59.76 percentage points on Fashion-MNIST. For the PGD attack, the accuracy increases by 89.88 and 43.49 percentage points respectively. This shows the efficacy of the model in defending against adversarial attacks.
<div align="center">
<table>
<thead>
<tr>
<td><b>Attack</b></td>
<td colspan=2><b>Accuracy (w/o defense)</b></td>
<td colspan=2><b>Accuracy (with defense)</b></td>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>MNIST</td>
<td>Fashion-MNIST</td>
<td>MNIST</td>
<td>Fashion-MNIST</td>
</tr>
<tr>
<td>FGSM (ε = 0.60)</td>
<td>0.2648</td>
<td>0.1417</td>
<td>0.9209</td>
<td>0.7393</td>
</tr>
<tr>
<td>PGD (ε = 0.15)</td>
<td>0.0418</td>
<td>0.2942</td>
<td>0.9406</td>
<td>0.7291</td>
</tr>
</tbody>
</table>
</div>


## Citation
Please cite this paper as -
```
Mandal, Shreyasi. "Defense Against Adversarial Attacks using
Convolutional Auto-Encoders." arXiv preprint arXiv:2312.03520 (2023).
```
