This repository is dedicated to exploring adversarial examples using the ResNet18v1 computer vision classification model.
"An adversarial example is an instance with small, intentional feature perturbations that cause a machine learning model to make a false prediction." - Interpretable Machine Learning
The primary objective was to craft a function that takes any given image along with a specified target class and generates a modified version of the image. This modified image, infused with adversarial noise, should be misclassified by the model as the chosen target class, regardless of the original image's content. Importantly, the modifications to the original image should remain imperceptible to the human eye.
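A minimal sketch of that idea, assuming PyTorch and the torchvision ResNet-18 ImageNet weights (the function name, optimizer, step count, and epsilon bound below are illustrative assumptions, not the repository's exact implementation):

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

def make_adversarial(image, target_class, eps=8 / 255, steps=50, lr=0.01):
    """image: normalized tensor of shape (1, 3, H, W); target_class: ImageNet class index."""
    model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1).eval()
    noise = torch.zeros_like(image, requires_grad=True)
    optimizer = torch.optim.Adam([noise], lr=lr)
    target = torch.tensor([target_class])

    for _ in range(steps):
        optimizer.zero_grad()
        logits = model(image + noise)
        # Minimizing cross-entropy against the *target* class pulls the prediction toward it.
        loss = F.cross_entropy(logits, target)
        loss.backward()
        optimizer.step()
        with torch.no_grad():
            # Keep the perturbation small so it stays imperceptible; the perturbed image
            # itself is not clamped (see the fix described for notebook 02 below).
            noise.clamp_(-eps, eps)

    return (image + noise).detach()
```

Optimizing only the noise tensor leaves the original pixels untouched and makes it straightforward to bound the perturbation.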
The repository features three main notebooks:
- notebooks/00_adv_noise_exp_sandbox.ipynb: A sandbox for various experimental approaches, including a noteworthy experiment with patch-based attacks and GradCAM, which visualizes the significance of image features; for insights into this technique, see the end of the notebook.
- notebooks/01_adv_noise_cleanup.ipynb: Showcases the creation of counterfactual examples that successfully trick the model into misclassifying images as a user-specified class. The method was validated on a subset of 50 images from the original ImageNet dataset, with every instance successfully deceiving the model.
- notebooks/02_fixed_01_adv_noise_cleanup.ipynb: Fixes the adversarial example fitting. The primary correction was removing the clamp from the image-fitting process, which was producing artifacts and preventing the images from being denormalized.
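For context on the denormalization mentioned above, here is a minimal sketch of the usual ImageNet normalize/denormalize round trip (the mean/std values are the standard ImageNet statistics; the helper names are illustrative, not the notebook's own):

```python
import torch

# Standard ImageNet channel statistics (assumed; adjust if the notebooks use different ones).
MEAN = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
STD = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)

def normalize(img):
    # img in [0, 1], shape (1, 3, H, W) -> model input space
    return (img - MEAN) / STD

def denormalize(img):
    # inverse transform, back to [0, 1] for viewing or saving the adversarial image
    return img * STD + MEAN
```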
To begin experimenting with adversarial examples, set up your environment by following these steps:
- Create a new conda environment:
  conda create --name advexamples python=3.11
- Activate the environment:
  conda activate advexamples
- Install the necessary dependencies:
  pip install -r requirements.txt
- Start Jupyter Lab to access the notebooks:
  jupyter lab --no-browser --ip 0.0.0.0 --port 8888 --allow-root --notebook-dir=.
The development of this repository was informed by a review of the literature and of various methods in the field of adversarial machine learning. Below are key resources and methods that inspired and guided our experiments:
- Interpretable Machine Learning - Adversarial Examples
- Interpretable Machine Learning - Counterfactual Explanations
- A Guide to Adversarial Attacks
- Szegedy et al. (2013) “Intriguing Properties of Neural Networks” - an iterative white-box approach that uses gradient descent to fit epsilon perturbations.
- Goodfellow et al. (2014) “Explaining and Harnessing Adversarial Examples” introduced the Fast Gradient Sign Method (FGSM), a one-pass method detailed in A Deep Dive into the Fast Gradient Sign Method (a minimal sketch appears after this list).
- Athalye et al. (2017) propose the “Expectation Over Transformation (EOT)” method, which optimizes the perturbation over a distribution of image transformations (augmentations) so the attack remains effective under them.
- Su et al. (2019) “One Pixel Attack for Fooling Deep Neural Networks” uses differential evolution to find a single pixel whose modification changes the model's prediction.
- Brown et al. (2017) “Adversarial Patch” - a particularly interesting method that explores whether an adversarial patch (for example, an image of an anime girl) could encode a target class such as a rifle and, when placed on an image, mislead the model.
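To illustrate the one-pass nature of FGSM referenced in the list above, here is a minimal untargeted sketch, assuming a normalized input tensor and the torchvision ResNet-18 weights (the epsilon value is an arbitrary assumption):

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

def fgsm(image, true_label, eps=0.01):
    """image: normalized tensor (1, 3, H, W); true_label: ImageNet class index."""
    model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1).eval()
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), torch.tensor([true_label]))
    loss.backward()
    # Single step in the direction that increases the loss for the true class.
    return (image + eps * image.grad.sign()).detach()
```

Unlike the iterative approach sketched earlier, FGSM needs only one gradient computation, trading attack strength for speed.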