This code pattern will enable you to build an application that predicts how "unique" or "memorable" images are. You'll do this through the Keras deep learning library, using the MemNet architecture. The dataset this neural network will be trained on is called "LaMem" (Large-scale Image Memorability), by MIT. In order to process the 45,000 training images and 10,000 testing images (227x227 RGB) efficiently, we'll be training the neural network on a PowerAI machine on NIMBIX, enabling us to benefit from NVLink (direct CPU-GPU memory interconnect) without needing any extra code. Once the model has been trained on PowerAI, we'll convert it to a CoreML model and expose it via a web application written in Swift, running on a Kitura server on macOS.
When the reader has completed this pattern, they'll understand how to:
- Train a Keras model on PowerAI.
- Use a custom loss function with a Keras model.
- Convert tf.keras models that deal with images to CoreML models.
- Use the Apple Vision framework with a CoreML model in Swift to get VNCoreMLFeatureValueObservations.
- Host a Web Server with Kitura
- Expose a Mustache HTTP template through Kitura
TODO: add flow diagram
- A Keras model is trained with the LaMem dataset.
- The Keras model is converted to a CoreML model.
- The user uploads their image to the Kitura web app.
- The Kitura web app uses the CoreML model for predictions.
- The user receives the neural network's prediction.
- IBM Power Systems: A server built with open technologies and designed for mission-critical applications.
- IBM PowerAI: A software platform that makes deep learning, machine learning, and AI more accessible and better performing.
- Kitura: Kitura is a free and open-source web framework written in Swift, developed by IBM and licensed under Apache 2.0. It’s an HTTP server and web framework for writing Swift server applications.
- Artificial Intelligence: Artificial intelligence can be applied to disparate solution spaces to deliver disruptive technologies.
- Swift on the Server: Build powerful, fast and secure server side Swift apps for the Cloud.
- If you don't already have a PowerAI server, you can acquire one from Nimbix or from the PowerAI offering on IBM Cloud.
- macOS 10.13 (High Sierra) or later
- Clone the repo
- Download the LaMem data
- Train the Keras model
- Convert the Keras model to a CoreML model
- Run the Kitura web app
Clone the powerai-image-memorability repo onto both your PowerAI server and local macOS machine. In a terminal, run:
git clone https://www.github.com/IBM/powerai-image-memorability
To download the LaMem dataset, head over to the powerai_serverside directory, and run the following command:
wget http://memorability.csail.mit.edu/lamem.tar.gz
Once the dataset is done downloading, run the following command to extract that data:
tar -xvf lamem.tar.gz
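Before starting a long training run, it can help to confirm that the labels load as expected. The sketch below assumes each split file in the extracted archive is plain text with one "image file name, memorability score" pair per line, separated by whitespace. That layout and the helper names parse_split and load_split are assumptions for illustration; train.py is not confirmed to read the data this way.

```python
from pathlib import Path


def parse_split(text):
    """Parse lines of the assumed form '<image file name> <score>'."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, score = line.split()
        pairs.append((name, float(score)))
    return pairs


def load_split(path):
    """Read a split file from disk and parse it (path is supplied by the caller)."""
    return parse_split(Path(path).read_text())
```

For example, load_split("lamem/splits/train_1.txt") would return a list of (file name, score) tuples if the archive extracts to that (assumed) path.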
To train the Keras model, run the following command inside of the powerai_serverside directory:
python train.py
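The pattern uses a custom loss function, but its definition is not shown in this walkthrough. As a rough sketch only, the plain-Python function below assumes a Euclidean loss (half the mean of the squared differences between predicted and true memorability scores). That choice and the name euclidean_loss are assumptions, and train.py may define its loss differently. In Keras, a custom loss is a function of (y_true, y_pred) written with tensor operations and passed to model.compile(loss=...).

```python
def euclidean_loss(y_true, y_pred):
    """Half the mean squared difference between two equal-length score lists.

    Plain-Python illustration of the arithmetic only; a Keras loss would
    express the same computation with tensor operations.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one score is required")
    squared_error = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    return squared_error / (2 * len(y_true))
```

A perfect prediction gives a loss of zero, and the loss grows quadratically as predicted scores move away from the true ones.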
Once the Python script is done running, you'll see a memnet_model.h5 model in the powerai_serverside directory. Copy that over to the webapp directory on the macOS machine that you'd like to run the frontend on.
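A copy between machines can be interrupted, so a quick sanity check on the transferred file may save a failed conversion later. The sketch below relies on the fact that an HDF5 file without a user block begins with a fixed 8-byte signature; it assumes memnet_model.h5 is such a file, and the helper name looks_like_hdf5 is hypothetical. It only checks the header and does not prove that the model is complete or loadable.

```python
# The 8-byte signature that opens an HDF5 file with no user block.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file at `path` starts with the HDF5 signature."""
    with open(path, "rb") as f:
        return f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
```

Calling looks_like_hdf5("memnet_model.h5") from the webapp directory should return True for an intact copy.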
Inside of the webapp directory on your macOS machine, run the following Python script to convert your Keras model to a CoreML model:
python convert_model.py memnet_model.h5
This may take a few minutes, but when it finishes, you should see a lamem.mlmodel file in the webapp directory.
Then, you're ready to roll! Run the following command to build & run your application:
swift build && swift run
Now, you can head over to localhost:3333 in your favourite web browser, upload an image, and calculate its memorability.
TODO: add screenshot