This repository has been archived by the owner on Mar 3, 2023. It is now read-only.

Kinetics 700 #8

Open
wants to merge 3 commits into
base: master

daniel-j-h

Related to #6: let's get the Kinetics-700 in.

@Showmax I updated

classes.json
kinetics_test.json
kinetics_train.json
kinetics_val.json

based on the official Kinetics 700 dataset. However, the JSON files are now too big to host in the GitHub repository (remote: error: File resources/kinetics_train.json is 158.91 MB; this exceeds GitHub's file size limit of 100.00 MB). Where would you prefer I put them? In Git LFS?
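For reference, a minimal sketch of how oversized files could be spotted before pushing. The function name `oversized_files` is hypothetical, and the limit is assumed to be 100 MiB based on the error message above.

```python
from pathlib import Path

# Assumption: the "100.00 MB" in the push error corresponds to 100 MiB.
GITHUB_LIMIT_BYTES = 100 * 1024 * 1024


def oversized_files(directory, limit=GITHUB_LIMIT_BYTES):
    """Return (path, size_in_bytes) for every file under `directory` larger than `limit`."""
    found = []
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file():
            size = path.stat().st_size
            if size > limit:
                found.append((path, size))
    return found


if __name__ == "__main__":
    # "resources" is the folder named in the error message above.
    for path, size in oversized_files("resources"):
        print(f"{path}: {size / (1024 * 1024):.2f} MiB")
```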

Then a quick question regarding categories.json: I could not find this file in the Kinetics 700 dataset; how did you generate it? Manually? How can we update it to Kinetics 700?

I'm opening this pull request early so we can discuss here.

Thanks!

@ondrejbiza
Collaborator

Thanks for the pull request! Maybe we can write a script that downloads the JSON files and generates the metadata? I think I made categories.json with a modified version of list_categories.py.

@daniel-j-h
Author

The JSON files (and other files) are all bundled up in the archive linked below. It would be best to host the files we care about somewhere convenient, e.g. as release files on this GitHub repository, so we could download them easily.

https://storage.googleapis.com/deepmind-media/research/Kinetics_700.zip
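A rough sketch of what such a download script might look like, using only the standard library. The function names are hypothetical, the archive layout is an assumption (JSON members are simply flattened into one folder, so members with the same file name would overwrite each other), and whether the URL still resolves has not been checked here.

```python
import io
import urllib.request
import zipfile
from pathlib import Path

# URL taken from the comment above.
KINETICS_700_URL = "https://storage.googleapis.com/deepmind-media/research/Kinetics_700.zip"


def extract_json_members(zip_bytes, out_dir):
    """Write every .json member of the archive into out_dir, ignoring subfolders."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".json"):
                # Assumption: file names are unique across the archive.
                target = out_dir / Path(name).name
                target.write_bytes(archive.read(name))
                written.append(target)
    return written


def download_kinetics_700(out_dir, url=KINETICS_700_URL):
    """Download the whole archive into memory, then extract its JSON files."""
    with urllib.request.urlopen(url) as response:
        data = response.read()
    return extract_json_members(data, out_dir)
```

Calling `download_kinetics_700("resources")` would place the extracted JSON files in `resources/`; the folder name is only an example.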

@ondrejbiza
Collaborator

I've never done that before, but I can hopefully take a look over the weekend.

@ondrejbiza
Collaborator

Hi Daniel,
I think we don't actually need to release any big files. The train.json from Kinetics-700 can be compressed into 20 MB. We just need to add a simple bash script to decompress it or instructions to do it by hand.
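As an illustration of the compress-and-decompress step, here is a small Python sketch using gzip instead of the bash script mentioned above. The function names and file paths are hypothetical, and the resulting file size is not something this sketch guarantees.

```python
import gzip
import shutil


def compress_json(src_path, dst_path):
    """Gzip src_path into dst_path at the highest compression level."""
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb", compresslevel=9) as dst:
        shutil.copyfileobj(src, dst)


def decompress_json(src_path, dst_path):
    """Restore the original file from a gzip archive."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
```

For example, `compress_json("resources/kinetics_train.json", "resources/kinetics_train.json.gz")` would produce the file to commit, and `decompress_json` would be run once after cloning.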

However, there are three small issues:

  1. We don't have the classes.json file for Kinetics-700. When I made it for Kinetics-400, I think I just copied all the classes from the paper and formatted them to json by hand. Could you do the same for Kinetics-700?

  2. I guess categories aren't really defined for Kinetics-700. We could probably just put everything into a single category "all" to be backwards compatible with the scripts.

  3. Do you think it's worth it to keep the option to download Kinetics-400 only? I can code up a switch between the two datasets after you make the classes.json.
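One possible alternative to copying the class names by hand is to derive them from the annotation labels. The sketch below assumes that each entry in the train JSON has the shape `{"annotations": {"label": ...}}` and that categories.json maps a category name to a list of classes; both formats are assumptions, and the function names are hypothetical.

```python
import json


def classes_from_annotations(annotations):
    """Collect the distinct labels from an annotation dict, sorted alphabetically."""
    # Assumption: each value looks like {"annotations": {"label": "..."}, ...}.
    labels = {entry["annotations"]["label"] for entry in annotations.values()}
    return sorted(labels)


def single_category(classes, name="all"):
    """Put every class into one category, as suggested in point 2."""
    return {name: list(classes)}


def write_metadata(train_json_path, classes_path, categories_path):
    """Read the train annotations and write classes and categories files."""
    with open(train_json_path) as f:
        annotations = json.load(f)
    classes = classes_from_annotations(annotations)
    with open(classes_path, "w") as f:
        json.dump(classes, f, indent=2)
    with open(categories_path, "w") as f:
        json.dump(single_category(classes), f, indent=2)
```

The train split is used here because a test split may not carry labels.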

Thanks,
Ondrej

@karenli1995

karenli1995 commented Sep 9, 2020

@ondrejba Here is the classes.json for Kinetics700:
https://gist.github.com/karenli1995/fc410a304ab2a856631fa73f6a02801f

3 participants