Final Project - One Film, Four Flavors

Yudi Wang, yuw043@ucsd.edu

Abstract Proposal

I love the famous bamboo forest fight scene from the film Crouching Tiger, Hidden Dragon, and I feel it can be understood from different angles. In this project, I combine video style transfer and audio style transfer. I trimmed 4 clips from the original movie and tried to transfer them into 4 different styles: playful, sad, angry, and horrible. To do this, I pair each clip with a different stylized video and a different stylized audio track, both created by the algorithms.

The basic idea of video style transfer is to first divide the video into a sequence of frames, then apply the arbitrary style transfer algorithm to these frames one by one. The last, simple step is to combine the stylized frames, which yields the style-transferred video. For the style transfer step, I chose the arbitrary-style-transfer model, which is flexible and manageable. The original video format is MP4 and the transferred one is AVI, which has low memory consumption.
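The three steps described above (split into frames, stylize each frame, recombine) can be sketched roughly as follows. This is a minimal sketch, not the notebook's code: it assumes OpenCV (`cv2`) for decoding the MP4 and writing the AVI, all function names are hypothetical, and `match_color_stats` is only a placeholder that matches per-channel color statistics. The actual project uses a VGG-19-based arbitrary style transfer model in its place.

```python
import numpy as np


def read_frames(path):
    """Decode a video file into a list of BGR frames (assumes opencv-python)."""
    import cv2  # imported lazily so the rest of the sketch only needs NumPy
    cap = cv2.VideoCapture(path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)
    cap.release()
    return frames, fps


def write_avi(path, frames, fps):
    """Encode a list of equally sized BGR frames into an AVI file."""
    import cv2
    height, width = frames[0].shape[:2]
    fourcc = cv2.VideoWriter_fourcc(*"XVID")
    writer = cv2.VideoWriter(path, fourcc, fps, (width, height))
    for frame in frames:
        writer.write(frame)
    writer.release()


def match_color_stats(frame, style, eps=1e-5):
    """Placeholder stylizer (an assumption, NOT the VGG-19 model):
    shift each channel's mean and std to those of the style image."""
    f = frame.astype(np.float32)
    s = style.astype(np.float32)
    f_mean, f_std = f.mean(axis=(0, 1)), f.std(axis=(0, 1))
    s_mean, s_std = s.mean(axis=(0, 1)), s.std(axis=(0, 1))
    out = (f - f_mean) / (f_std + eps) * s_std + s_mean
    return np.clip(out, 0, 255).astype(np.uint8)


def stylize_video(frames, style, stylize_frame=match_color_stats):
    """Apply the stylizer to every frame independently, one by one."""
    return [stylize_frame(frame, style) for frame in frames]
```

With a real model, one would pass it in as `stylize_frame`, for example `write_avi("out.avi", stylize_video(frames, style, my_model), fps)`, where `my_model` is a hypothetical callable taking a frame and a style image. Because frames are processed independently, nothing enforces temporal consistency, so some flicker between frames is to be expected.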

The basic idea of audio style transfer is to synthesize audio "content" and "style" independently using the magnitudes of a short-time Fourier transform, shallow convolutional networks with randomly initialized filters, and iterative phase reconstruction with Griffin-Lim. To compare the different styles, I use the same music content with different styles. The result is quite interesting.
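The building blocks named above can be illustrated with a small NumPy-only sketch: STFT magnitudes, a shallow 1-D convolution with random filters whose Gram matrix serves as a "style" statistic, and Griffin-Lim phase reconstruction. The function names, the FFT size, the hop length, and the filter shapes are all assumptions made for illustration, not values taken from the notebook. The optimization loop that actually minimizes the content and style losses is omitted, since it needs an autodiff framework.

```python
import numpy as np

N_FFT, HOP = 512, 256  # assumed values; 50% overlap


def _window():
    # Square-root periodic Hann: its square sums to 1 at 50% overlap,
    # so plain overlap-add inverts the STFT away from the signal edges.
    return np.sqrt(np.hanning(N_FFT + 1)[:-1])


def stft(x):
    """Short-time Fourier transform, shape (freq_bins, n_frames)."""
    w = _window()
    n_frames = 1 + (len(x) - N_FFT) // HOP
    frames = np.stack([x[i * HOP:i * HOP + N_FFT] * w for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1).T


def istft(spec):
    """Inverse STFT by windowed overlap-add."""
    w = _window()
    frames = np.fft.irfft(spec.T, n=N_FFT, axis=1) * w
    x = np.zeros(N_FFT + HOP * (len(frames) - 1))
    for i, frame in enumerate(frames):
        x[i * HOP:i * HOP + N_FFT] += frame
    return x


def random_conv_features(mag, n_filters=32, width=11, seed=0):
    """Shallow 1-D convolution over time with random, untrained filters.
    Frequency bins act as input channels; a ReLU follows."""
    rng = np.random.default_rng(seed)
    n_bins, n_steps = mag.shape
    filters = rng.standard_normal((n_filters, n_bins, width)) / np.sqrt(n_bins * width)
    cols = [np.tensordot(filters, mag[:, t:t + width], axes=([1, 2], [0, 1]))
            for t in range(n_steps - width + 1)]
    return np.maximum(np.stack(cols, axis=1), 0.0)


def gram(features):
    """Gram matrix of the feature maps: the time-averaged 'style' statistic."""
    return features @ features.T / features.shape[1]


def griffin_lim(mag, n_iter=50, seed=0):
    """Estimate a waveform whose STFT magnitude approximates `mag`."""
    rng = np.random.default_rng(seed)
    phase = np.exp(2j * np.pi * rng.random(mag.shape))  # random initial phase
    for _ in range(n_iter):
        signal = istft(mag * phase)
        phase = np.exp(1j * np.angle(stft(signal)))  # keep phase, reset magnitude
    return istft(mag * phase)
```

In a full pipeline, the style loss would compare `gram(random_conv_features(...))` of the generated spectrogram against that of the style audio, the content loss would compare the feature maps directly, and `griffin_lim` would turn the optimized magnitude spectrogram back into a waveform.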

Project Report

ECE188_Final_Project_Report.pdf

Model/Data

  • For the video style transfer part: Video_Style_Transfer-2.ipynb

The adopted model is an arbitrary style transfer algorithm based on a pretrained VGG-19. I collected the style pictures via Google and, after trying all of them, kept only the good ones. The final style pictures are shown below:

The video footage was scraped from YouTube:
  1. https://www.youtube.com/watch?v=FEPf1M0n5SM&t=139s

  2. https://www.youtube.com/watch?v=KXIJv1NoXmo (Crouching Tiger, Hidden Dragon (7/8) Movie CLIP - Bamboo Forest Fight (2000) HD)

Code

The following code was run on the Colab platform:

  • trained models - Video_Style_Transfer.ipynb & Audio_Style_Transfer.ipynb

Results

For audio style transfer:

The process can be found at the following link:

https://drive.google.com/file/d/19J2Poh7G0ao1kJXAwahaEhx9if50gdFH/view?usp=sharing

For video style transfer:

To get the best result, I tried tens of painting-style images. The final results are shown below:

The final version of the video can be viewed at the link below:

https://drive.google.com/file/d/12pjGmoAiRq0AkjolQnozrbZgFUJSfF16/view?usp=sharing

Technical Notes

Reference

[1] D. Ulyanov and V. Lebedev, "Audio texture synthesis and style transfer," 2016.

[2] P. K. Mital, "Time Domain Neural Audio Style Transfer," Workshop on Machine Learning for Creativity and Design, Neural Information Processing Systems Conference 2017 (NIPS 2017), December 3-9, 2017.
