Yifan Hou, yihou@ucsd.edu
When people see an image, they often imagine sounds to go with it. In this project, I generate different background music for different images: when a user chooses or uploads an image, the program generates corresponding audio for it. To do this, I first use a well-established, pre-trained image recognition model to classify images. I then train a second convolutional neural network that reads audio as spectrogram images, optimizing it so that the distribution of its output gets as close as possible to that of the first network. Once trained, the two networks allow us to produce the best-matched sound for a scene; a rough sketch of this idea follows below.
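The sketch below illustrates this teacher-student idea, assuming Keras with the TensorFlow backend. It is not the code in main.ipynb: the teacher network (ResNet50 here), the spectrogram shape, the class count, the placeholder data names, and the best_matched_sound helper are all assumptions made for the example, and it assumes the final audio is selected from the training sounds rather than synthesized from scratch.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

NUM_CLASSES = 1000          # assumption: teacher predicts ImageNet-style classes
SPEC_SHAPE = (128, 128, 1)  # assumption: spectrograms rendered as 128x128 images

# 1) Teacher: a frozen, well-established pre-trained image classifier.
teacher = keras.applications.ResNet50(weights="imagenet")
teacher.trainable = False

# 2) Student: a small CNN that reads audio spectrograms and predicts
#    the same kind of class distribution as the teacher.
student = keras.Sequential([
    keras.Input(shape=SPEC_SHAPE),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.GlobalAveragePooling2D(),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

# 3) Train the student so its output distribution matches the teacher's
#    soft predictions on the images paired with each sound (KL divergence loss).
student.compile(optimizer="adam", loss="kld")
# soft_targets = teacher.predict(preprocessed_paired_images)   # (N, NUM_CLASSES)
# student.fit(sound_spectrograms, soft_targets, epochs=10)

# 4) Inference: classify the uploaded image, then return the training sound
#    whose student-network distribution is closest to the image's distribution.
def best_matched_sound(image_batch, sound_specs, sound_files):
    img_dist = teacher.predict(image_batch)[0]   # image_batch: (1, 224, 224, 3), preprocessed
    snd_dists = student.predict(sound_specs)     # (num_sounds, NUM_CLASSES)
    eps = 1e-9                                   # avoid log(0)
    kl = np.sum(img_dist * np.log((img_dist + eps) / (snd_dists + eps)), axis=1)
    return sound_files[int(np.argmin(kl))]       # smallest KL = best match
```

Matching by KL divergence mirrors the training objective: the sound whose student-network output distribution is closest to the image's class distribution is treated as the best match for the scene.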
Files included with this repository:
- two trained models (the pre-trained image recognition model and the SoundNet model)
- training data (the sound files used for training)
Code for generating the project:
- Jupyter notebook: main.ipynb
Results (generated audio files and the images they correspond to):
- beach.mp3 and beach2.mp3 for the beach image
- train.mp3 for the train image
- town.mp3 and town2.mp3 for the town image
Dependencies: Python 3, Jupyter Notebook, Keras, SciPy
References (papers, techniques, and repositories used):
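- SoundNet: Y. Aytar, C. Vondrick, and A. Torralba, "SoundNet: Learning Sound Representations from Unlabeled Video," NIPS 2016 (the source of the SoundNet model used above)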