Exploration of different dimensionality reduction techniques on the MNIST dataset. The objective is to reduce MNIST data samples to a 2-dimensional representation and analyze the resulting embeddings obtained with several approaches:
- Variational AutoEncoders (VAE)
- Dense NN and CNN classifiers (with 2-neuron bottleneck)
- Linear approaches: PCA, LDA
- Non-linear approaches: UMAP, t-SNE
Two different VAEs are tested: one using flat pixel features (only Dense layers), and another using 2D images as input (using Conv2d layers).
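As an illustration, here is a minimal PyTorch sketch of the dense (fully connected) variant; the layer widths (256 hidden units) and depth are assumptions for readability, not necessarily those used in this repo. The Conv2d variant replaces the dense encoder/decoder trunks with convolutional ones.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseVAE(nn.Module):
    """VAE over flat 784-pixel MNIST vectors with a 2D latent space."""
    def __init__(self, z_dim=2):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(784, 256), nn.ReLU())
        self.fc_mu = nn.Linear(256, z_dim)       # mean of q(z|x)
        self.fc_log_var = nn.Linear(256, z_dim)  # log-variance of q(z|x)
        self.dec = nn.Sequential(
            nn.Linear(z_dim, 256), nn.ReLU(),
            nn.Linear(256, 784), nn.Sigmoid())

    def forward(self, x):
        h = self.enc(x)
        mu, log_var = self.fc_mu(h), self.fc_log_var(h)
        # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)
        z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)
        return self.dec(z), mu, log_var

def vae_loss(recon, x, mu, log_var):
    # Reconstruction term + KL divergence to the unit-Gaussian prior.
    bce = F.binary_cross_entropy(recon, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
    return bce + kld
```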
- Dimensionality reduction (2D representation), z-dim:
- We observe that data samples are embedded such that visually similar classes (e.g. 0 and 6) lie close to each other in the 2D latent space (a plotting sketch follows this list).
- Learnt representations from the VAEs (varying mu and log_var along the axes)
- Surprisingly, using flat features can lead to good VAE latent representations.
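A hypothetical helper to produce such latent-space plots, reusing the `DenseVAE` sketch above: encode a batch of test digits and scatter-plot the posterior means `mu`, colored by label.

```python
import torch
import matplotlib.pyplot as plt

@torch.no_grad()
def plot_latent(vae, images, labels):
    # Embed each image as the mean of q(z|x) and plot the 2D point cloud.
    h = vae.enc(images.view(-1, 784))
    mu = vae.fc_mu(h).numpy()
    plt.scatter(mu[:, 0], mu[:, 1], c=labels, cmap="tab10", s=4)
    plt.colorbar(label="digit class")
    plt.xlabel("z[0]")
    plt.ylabel("z[1]")
    plt.show()
```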
A feed-forward neural network (taking flattened pixels as input) and a CNN (taking 2D images as input), both containing an inner layer with only 2 neurons, are trained to classify the images into digits. We then inspect this 2-neuron bottleneck to observe which compressed representation each network has learnt.
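A minimal sketch of the feed-forward classifier with the 2-neuron bottleneck; the layer widths are illustrative assumptions, and the CNN version would swap the dense trunk for Conv2d feature extractors.

```python
import torch
import torch.nn as nn

class BottleneckClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Linear(784, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, 2))                 # 2-neuron bottleneck
        self.head = nn.Sequential(nn.ReLU(), nn.Linear(2, 10))

    def forward(self, x):
        z = self.features(x.view(-1, 784))    # 2D compressed representation
        return self.head(z), z                # logits for training, z for plots
```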
- Learnt representations after training: Feed-forward NN (flat features) vs. CNN (2D features):
- Evolution of the learnt MNIST manifold (2D latent space) over training batches as the feed-forward NN learns the classification task. We can observe how the network searches for the most suitable compressed 2D representation to discriminate between the categories, making the data samples almost linearly separable (visually better than the VAE latent representations); see the tracking sketch below:
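One way to record such an animation, assuming the `BottleneckClassifier` sketch above and a fixed probe set of images, is to snapshot the bottleneck activations after every optimizer step:

```python
import torch
import torch.nn.functional as F

def train_and_track(model, loader, probe_x, epochs=1, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    snapshots = []                            # one (N, 2) embedding per batch
    for _ in range(epochs):
        for x, y in loader:
            logits, _ = model(x)
            loss = F.cross_entropy(logits, y)
            opt.zero_grad()
            loss.backward()
            opt.step()
            with torch.no_grad():             # embed the fixed probe set
                _, z = model(probe_x)
                snapshots.append(z.clone())
    return snapshots                          # render as frames of the manifold
```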
- PCA
- LDA
- UMAP
- t-SNE
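For reference, the four baselines listed above can be run on flattened MNIST pixels with scikit-learn and umap-learn. This is a sketch with default hyperparameters; note that LDA is the only supervised method of the four (it uses the labels).

```python
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.manifold import TSNE
import umap

def baseline_embeddings(X, y):
    # X: (N, 784) float array of flattened pixels, y: (N,) digit labels.
    # t-SNE is slow on the full dataset; subsample X first if needed.
    return {
        "PCA": PCA(n_components=2).fit_transform(X),
        "LDA": LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y),
        "UMAP": umap.UMAP(n_components=2).fit_transform(X),
        "t-SNE": TSNE(n_components=2).fit_transform(X),
    }
```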
- The manifolds and latent representations learnt by the NN and CNN classifiers are qualitatively and visually better, since those networks are trained with the explicit objective of class separation (their embeddings are at least more interpretable than the VAEs').
- Linear approaches struggle to find a good low-dimensional representation.
- UMAP provides well-separated embeddings and makes outliers easy to identify, with acceptable computation time.