IPU-Net

We propose an identity-preserving end-to-end image-to-image translation deep neural network which is capable of super-resolving very low-resolution faces to their high-resolution counterparts while preserving identity-related information. We achieved this by training a very deep convolutional encoder-decoder network with a symmetric contracting path between corresponding layers. This network was trained with a combination of a reconstruction and an identity-preserving loss, on multi-scale low-resolution conditions.

Important Links

Paper
FaceNet model
VggFace2 dataset

Model Architecture

Our proposed method includes two deep neural networks which are designed to receive low-resolution faces and produce the super-resolution counterparts which are optimized for both visual quality and face recognition. The first network is a deep convolutional encoder-decoder architecture with symmetric skip-connections between the corresponding layers (UNet-based architecture) which helps to map the low-resolution and high-resolution face pairs more accurately through learning richer semantic representations and sharing low-level information across the network and the other network is a pre-trained face recognition model with an Inception-ResNet architecture.

Our identity-preserveing encoder-decoder network consists of 7 downsampling and 7 upsampling blocks with symmetric contracting paths (skip connections) for better localization of pixels between low-resolution and high-resolution pairs. In each downsampling block, we applied strided convolution with stride 2 followed by batch normalization and leaky relu layers. In each upsampling block, we used transposed convolution for doubling the spatial dimensions and also batch normalization, dropout, and relu activation functions. We added skip connections between each layer i in the encoder and layer (n - i + 1) in the decoder, where n is the total number of layers. Each skip connection simply concatenates all channels at layer i with those at layer (n - i + 1). These skip-connections helped to localize the feature maps better through sharing information between the encoder and the decoder.

Below figure depicts the proposed model architecture. It consists of an identity-preserving face hallucination image-to-image translation model and a pre-trained face recognition network. The deep convolutional face hallucination model translates the the bicubic upsampled low-resolution faces to super-resolved version during training (i.e. image-to-image translation). The pre-trained face recognition model produces discriminative, low-dimensional embeddings for both super-resolved and high-resolution faces in order to compute the identity loss.

Requirements:

Python 3.7
TensorFlow 2.0+
Pillow
Scikit-learn
OpenCV

How to train

To Train the model, run this:

python train.py

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
figures		figures
.gitignore		.gitignore
README.md		README.md
train.py		train.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

IPU-Net

Important Links

Model Architecture

Requirements:

How to train

Result

About

Releases

Packages

Languages

vahid0001/IPU-Net

Folders and files

Latest commit

History

Repository files navigation

IPU-Net

Important Links

Model Architecture

Requirements:

How to train

Result

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages