This repository contains the final Python notebooks used in the three Kaggle challenges proposed during the course. We used Colab and Kaggle servers to train our models; since those platforms keep their own file history, we did not always remember to update this repository, so some intermediate modifications to the files may be missing here. The datasets used in each challenge are contained in a separate repo, imported as a submodule. Artificial Neural Networks have shown impressive results in a broad range of application domains. The challenges are simply a set of problems taken from image processing, presented in an order chosen to progressively increase the complexity of the tasks.
The repo is organized as follows:
- DL-CompetitionsDatasets: contains the datasets;
- dataSetStatistics.py: used to evaluate some characteristics of the datasets;
- image_classification.ipynb: Python notebook for the first challenge;
- image_segmentation.ipynb: Python notebook for the second challenge;
- question_answering.ipynb: Python notebook for the third challenge;
- resize_on_disk.ipynb: Python notebook to transform the dataset of the third challenge.
-
The first competition consists of a classification problem. In an image classification problem, given an image, the goal is to predict the correct class to which the image belongs. The task requires categorizing 307 images into 20 different classes.
In this challenge we used: Convolutional Neural Networks, basic data augmentation techniques (zoom, rotation, horizontal and vertical flip), transfer learning with and without fine-tuning (ResNet, DenseNet201, InceptionV3, InceptionResNetV2, Xception), and ensembles with K-folding.
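The K-fold ensembling idea can be sketched as follows: train one model per fold and average the per-class probabilities the fold models assign to the test set. This is a minimal numpy/sklearn illustration with toy data; a simple nearest-centroid classifier stands in for the CNNs the notebooks actually trained, and all sizes are illustrative.

```python
import numpy as np
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)

# Toy data standing in for image features: 100 samples, 20 classes.
X = rng.normal(size=(100, 32))
y = rng.integers(0, 20, size=100)
X_test = rng.normal(size=(10, 32))

def train_model(X_tr, y_tr):
    """Stand-in for fitting a CNN on one fold: per-class mean features."""
    return np.stack([X_tr[y_tr == c].mean(axis=0) if np.any(y_tr == c)
                     else np.zeros(X_tr.shape[1]) for c in range(20)])

def predict_proba(centroids, X_eval):
    """Softmax over negative squared distances to the class centroids."""
    d = ((X_eval[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    logits = -d
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Train one model per fold and average their test-set predictions.
fold_probs = []
for train_idx, _ in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = train_model(X[train_idx], y[train_idx])
    fold_probs.append(predict_proba(model, X_test))

ensemble = np.mean(fold_probs, axis=0)   # average the 5 folds' probabilities
predictions = ensemble.argmax(axis=1)
```

Averaging probabilities rather than hard labels lets confident folds outweigh uncertain ones, which is the usual reason this ensemble beats any single fold.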
For more information on the competition or on the techniques applied, take a look at the two links below.
-
In this second challenge we were asked to segment an image. Image segmentation can be seen as a classification problem applied to each pixel of the input figure. The dataset, most likely a subset of the Inria dataset, contains aerial orthorectified color images (you can see an example below). The challenge consists in determining which pixels belong to a building.
In this challenge we used: U-Net models; transfer learning with pretrained networks such as DenseUNet and ResUNet; data augmentation (horizontal/vertical flip, zoom); preprocessing and postprocessing with techniques from image analysis and computer vision (histogram equalization, Gaussian filters, morphological transformations provided by OpenCV); increasing the number of input channels with the edge map obtained through a Laplacian filter; and a custom data augmentation aimed at enriching the dataset by creating synthetic aerial images.
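Two of the preprocessing steps above can be sketched in plain numpy (the notebooks used the OpenCV equivalents, `cv2.equalizeHist` and `cv2.Laplacian`; the image here is random toy data):

```python
import numpy as np

def equalize_hist(channel):
    """Histogram equalization of a uint8 channel: spread the CDF over 0..255."""
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf = (cdf - cdf.min()) * 255 / max(cdf.max() - cdf.min(), 1)
    return cdf[channel].astype(np.uint8)

def laplacian(channel):
    """3x3 Laplacian filter: an edge map usable as an extra input channel."""
    k = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=np.float32)
    p = np.pad(channel.astype(np.float32), 1, mode="edge")
    h, w = channel.shape
    return sum(k[i, j] * p[i:i + h, j:j + w]
               for i in range(3) for j in range(3))

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

eq = np.stack([equalize_hist(img[..., c]) for c in range(3)], axis=-1)
edges = laplacian(img.mean(axis=-1))
img4 = np.concatenate([eq, edges[..., None]], axis=-1)  # RGB + Laplacian channel
```

The 4-channel result is what a U-Net with an adapted first convolution would then consume.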
For more information on the competition or on the techniques applied, take a look at the two links below.
-
This was the most difficult challenge we faced. In this task the network takes two inputs: i) a synthetic scene containing several objects with different geometric shapes and/or finishes (colour, material), and ii) a question about the existence of something in the scene (e.g., 'Is there a yellow thing?') or about counting (e.g., 'How many big objects are there?'). The network has to produce a suitable answer by choosing from a set of predefined answers: yes, no, 0, 1, ..., 9. So in a certain sense, it can be seen as a classification problem.
An example. Q: What number of other matte objects are the same shape as the small rubber object? A: 1. Even though the challenge used a subset of CLEVR, the dataset was huge: more than 12 GB. As a consequence, the first thing we did was to accelerate the training procedure (a batch of 64 elements took 2 seconds to be processed). After reading A simple neural network module for relational reasoning, it became clear that the task could be solved using images with lower resolution. In this way, we reduced the time taken to process a batch by a factor of about 8, which also allowed us to exploit more efficient caching mechanisms.
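The resolution reduction can be illustrated with a simple block-average downsample (the actual resize_on_disk.ipynb presumably used an image library; sizes here are illustrative):

```python
import numpy as np

def downsample(img, factor):
    """Block-average downsampling: each factor x factor tile becomes one pixel."""
    h, w, c = img.shape
    h, w = h - h % factor, w - w % factor        # crop to a multiple of factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(1, 3))

img = np.arange(320 * 480 * 3, dtype=np.float32).reshape(320, 480, 3)
small = downsample(img, 4)   # 320x480 -> 80x120: 16x fewer pixels per image
```

Shrinking each side by 4 cuts the pixel count 16-fold, which is what makes both the per-batch compute drop and in-memory caching of the whole dataset feasible.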
The basic architecture that we used was a combination of three NNs. A CNN processed the image, while an embedding layer + LSTM examined the question. The two outputs were then combined and fed through dense layers to produce a one-hot encoded answer.
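A minimal Keras sketch of this two-branch architecture, assuming illustrative sizes (vocabulary, sequence length, 64x64 images, and 12 answers: yes, no, 0..9) that do not come from the original notebooks:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

VOCAB, SEQ_LEN, N_ANSWERS = 100, 20, 12   # illustrative hyperparameters

# Image branch: a small CNN producing a fixed-size feature vector.
img_in = layers.Input(shape=(64, 64, 3), name="scene")
x = layers.Conv2D(32, 3, activation="relu")(img_in)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(64, 3, activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)

# Question branch: word embedding followed by an LSTM.
q_in = layers.Input(shape=(SEQ_LEN,), name="question")
q = layers.Embedding(VOCAB, 64)(q_in)
q = layers.LSTM(64)(q)

# Fusion: concatenate both feature vectors, classify over the answers.
merged = layers.Concatenate()([x, q])
h = layers.Dense(128, activation="relu")(merged)
out = layers.Dense(N_ANSWERS, activation="softmax")(h)

model = Model(inputs=[img_in, q_in], outputs=out)
```

Because the answer set is fixed, the final softmax turns the whole task into ordinary categorical classification.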
We tried several approaches: tackling counting and boolean questions with separate networks, GRUs, different pre-trained feature extractors, pretrained word embeddings, and attention mechanisms; we also designed a custom data generator to provide evenly distributed batches.
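The evenly distributed batch idea can be sketched like this: sample the same number of examples from every answer class for each batch. This is a toy numpy version (the real generator would also load and preprocess the corresponding images and questions):

```python
import numpy as np

def balanced_batches(labels, batch_size, rng):
    """Yield index batches with even representation of every answer class."""
    by_class = {c: np.flatnonzero(labels == c) for c in np.unique(labels)}
    per_class = max(batch_size // len(by_class), 1)
    while True:
        batch = np.concatenate([rng.choice(idx, per_class, replace=True)
                                for idx in by_class.values()])
        rng.shuffle(batch)
        yield batch[:batch_size]

rng = np.random.default_rng(0)
labels = rng.integers(0, 4, size=1000)     # 4 toy answer classes
gen = balanced_batches(labels, 64, rng)
batch = next(gen)
counts = np.bincount(labels[batch], minlength=4)   # 16 samples per class
```

Resampling this way keeps rare answers (e.g., high counts like 8 or 9) from being drowned out by the frequent yes/no questions during training.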
For more information on the competition or on the techniques applied, take a look at the two links below.