Commit
0 parents
commit 4e9e13b
Showing 85 changed files with 12,480 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
*.pt | ||
*.pth | ||
*.pkl | ||
*.pyc | ||
*.txt | ||
__pycache__ | ||
det_results | ||
.vscode |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,167 @@ | ||
# YOWOv2 | ||
|
||
## Requirements | ||
- We recommend using Anaconda to create a conda environment: | ||
```Shell | ||
conda create -n yowo python=3.6 | ||
``` | ||
|
||
- Then, activate the environment: | ||
```Shell | ||
conda activate yowo | ||
``` | ||
|
||
- Finally, install the requirements: | ||
```Shell | ||
pip install -r requirements.txt | ||
``` | ||
|
||
## Visualization | ||
|
||
Coming soon ... | ||
|
||
# Dataset | ||
You can download **UCF24** from the following links: | ||
|
||
## UCF101-24: | ||
* Google drive | ||
|
||
Link: https://drive.google.com/file/d/1Dwh90pRi7uGkH5qLRjQIFiEmMJrAog5J/view?usp=sharing | ||
|
||
* BaiduYun Disk | ||
|
||
Link: https://pan.baidu.com/s/11GZvbV0oAzBhNDVKXsVGKg | ||
|
||
Password: hmu6 | ||
|
||
## AVA | ||
You can follow the instructions [here](https://github.com/yjh0410/AVA_Dataset) to prepare the **AVA** dataset. | ||
|
||
# Experiment | ||
* UCF101-24 | ||
|
||
| Model | Clip | GFLOPs | Params | F-mAP | V-mAP | FPS | Weight | | ||
|----------------|--------|--------|---------|-------|-------|---------|--------------| | ||
| YOWOv2-Nano | 16 | 2.6 | 3.5 M | 78.8 | 48.0 | 42 | [ckpt](https://github.com/yjh0410/YOWOv2/releases/download/yowo_v2_weight/yowo_v2_nano_ucf24.pth) | | ||
| YOWOv2-Tiny | 16 | 5.8 | 10.9 M | 80.5 | 51.3 | 50 | [ckpt](https://github.com/yjh0410/YOWOv2/releases/download/yowo_v2_weight/yowo_v2_tiny_ucf24.pth) | | ||
| YOWOv2-Medium | 16 | 24.1 | 52.0 M | 83.1 | 50.7 | 42 | [ckpt](https://github.com/yjh0410/YOWOv2/releases/download/yowo_v2_weight/yowo_v2_medium_ucf24.pth) | | ||
| YOWOv2-Large | 16 | 107.1 | 109.7 M | 85.2 | 52.0 | 30 | [ckpt](https://github.com/yjh0410/YOWOv2/releases/download/yowo_v2_weight/yowo_v2_large_ucf24.pth) | | ||
| YOWOv2-Nano | 32 | 4.0 | 3.5 M | 79.4 | 49.0 | 42 | [ckpt](https://github.com/yjh0410/YOWOv2/releases/download/yowo_v2_weight/yowo_v2_nano_ucf24_k32.pth) | | ||
| YOWOv2-Tiny | 32 | 9.0 | 10.9 M | 83.0 | 51.2 | 50 | [ckpt](https://github.com/yjh0410/YOWOv2/releases/download/yowo_v2_weight/yowo_v2_tiny_ucf24_k32.pth) | | ||
| YOWOv2-Medium | 32 | 27.3 | 52.0 M | 83.7 | 52.5 | 40 | [ckpt](https://github.com/yjh0410/YOWOv2/releases/download/yowo_v2_weight/yowo_v2_medium_ucf24_k32.pth) | | ||
| YOWOv2-Large | 32 | 183.9 | 109.7 M | 87.0 | 52.8 | 22 | [ckpt](https://github.com/yjh0410/YOWOv2/releases/download/yowo_v2_weight/yowo_v2_large_ucf24_k32.pth) | | ||
|
||
* AVA v2.2 | ||
|
||
| Model | Clip | mAP | FPS | Weight | | ||
|----------------|------------|-----------|---------|--------------| | ||
| YOWOv2-Nano | 16 | 12.6 | 40 | [ckpt](https://github.com/yjh0410/YOWOv2/releases/download/yowo_v2_weight/yowo_v2_nano_ava.pth) | | ||
| YOWOv2-Tiny | 16 | 14.9 | 49 | [ckpt](https://github.com/yjh0410/YOWOv2/releases/download/yowo_v2_weight/yowo_v2_tiny_ava.pth) | | ||
| YOWOv2-Medium | 16 | 18.4 | 41 | [ckpt](https://github.com/yjh0410/YOWOv2/releases/download/yowo_v2_weight/yowo_v2_medium_ava.pth) | | ||
| YOWOv2-Large | 16 | 20.2 | 29 | [ckpt](https://github.com/yjh0410/YOWOv2/releases/download/yowo_v2_weight/yowo_v2_large_ava.pth) | | ||
| YOWOv2-Nano | 32 | | | | | ||
| YOWOv2-Tiny | 32 | 15.6 | 49 | [ckpt](https://github.com/yjh0410/YOWOv2/releases/download/yowo_v2_weight/yowo_v2_tiny_ava_k32.pth) | | ||
| YOWOv2-Medium | 32 | 18.4 | 40 | [ckpt](https://github.com/yjh0410/YOWOv2/releases/download/yowo_v2_weight/yowo_v2_medium_ava_k32.pth) | | ||
| YOWOv2-Large | 32 | 21.7 | 22 | [ckpt](https://github.com/yjh0410/YOWOv2/releases/download/yowo_v2_weight/yowo_v2_large_ava_k32.pth) | | ||
|
||
|
||
## Train YOWOv2 | ||
* UCF101-24 | ||
|
||
```Shell | ||
python train.py --cuda -d ucf24 --root path/to/dataset -v yowo_v2_nano --num_workers 4 --eval_epoch 1 --max_epoch 8 --lr_epoch 2 3 4 5 --lr 0.0001 -ldr 0.5 -bs 8 -accu 16 | ||
``` | ||
|
||
Alternatively, run the script: | ||
|
||
```Shell | ||
sh train_ucf.sh | ||
``` | ||
|
||
* AVA | ||
```Shell | ||
python train.py --cuda -d ava_v2.2 --root path/to/dataset -v yowo_v2_nano --num_workers 4 --eval_epoch 1 --max_epoch 10 --lr_epoch 3 4 5 6 --lr 0.0001 -ldr 0.5 -bs 8 -accu 16 --eval | ||
``` | ||
|
||
Alternatively, run the script: | ||
|
||
```Shell | ||
sh train_ava.sh | ||
``` | ||
|
||
## Test YOWOv2 | ||
* UCF101-24 | ||
For example: | ||
|
||
```Shell | ||
python test.py --cuda -d ucf24 -v yowo_v2_nano --weight path/to/weight -size 224 --show | ||
``` | ||
|
||
* AVA | ||
For example: | ||
|
||
```Shell | ||
python test.py --cuda -d ava_v2.2 -v yowo_v2_nano --weight path/to/weight -size 224 --show | ||
``` | ||
|
||
## Test YOWOv2 on AVA video | ||
For example: | ||
|
||
```Shell | ||
python test_video_ava.py --cuda -d ava_v2.2 -v yowo_v2_nano --weight path/to/weight --video path/to/video --show | ||
``` | ||
|
||
Note that ```path/to/video``` can point to any video on your local device, not only AVA videos. | ||
|
||
## Evaluate YOWOv2 | ||
* UCF101-24 | ||
For example: | ||
|
||
```Shell | ||
# Frame mAP | ||
python eval.py \ | ||
--cuda \ | ||
-d ucf24 \ | ||
-v yowo_v2_nano \ | ||
-bs 16 \ | ||
-size 224 \ | ||
--weight path/to/weight \ | ||
--cal_frame_mAP | ||
``` | ||
|
||
```Shell | ||
# Video mAP | ||
python eval.py \ | ||
--cuda \ | ||
-d ucf24 \ | ||
-v yowo_v2_nano \ | ||
-bs 16 \ | ||
-size 224 \ | ||
--weight path/to/weight \ | ||
--cal_video_mAP | ||
``` | ||
|
||
* AVA | ||
|
||
Run the following command to calculate frame mAP@0.5 IoU: | ||
|
||
```Shell | ||
python eval.py \ | ||
--cuda \ | ||
-d ava_v2.2 \ | ||
-v yowo_v2_nano \ | ||
-bs 16 \ | ||
--weight path/to/weight | ||
``` | ||
|
||
## Demo | ||
```Shell | ||
# run demo (pass -d ava_v2.2 instead of -d ucf24 to use a model trained on AVA) | ||
python demo.py --cuda -d ucf24 -v yowo_v2_nano -size 224 --weight path/to/weight --video path/to/video --show | ||
``` | ||
|
||
## References | ||
If you are using our code, please consider citing our paper. | ||
|
||
Coming soon ... |
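
The training commands in the README above pass `-bs 8 -accu 16` and `--lr 0.0001 -ldr 0.5 --lr_epoch 2 3 4 5`. The sketch below shows what these values would imply, assuming that `-accu` is the number of gradient-accumulation steps, `-ldr` is a decay ratio, and `--lr_epoch` lists the epochs at which the learning rate is multiplied by that ratio. These semantics are inferred from the flag names and are not confirmed by the README; both helper functions are hypothetical and are not part of the repository.

```python
def effective_batch_size(batch_size, accumulate):
    """Samples contributing to one optimizer step under gradient accumulation."""
    return batch_size * accumulate


def step_decay_lr(base_lr, decay_ratio, milestones, epoch):
    """Learning rate at `epoch`, decayed once for every milestone already reached.

    Assumption: the decay takes effect at the start of each milestone epoch.
    """
    reached = sum(1 for m in milestones if epoch >= m)
    return base_lr * decay_ratio ** reached


# Values taken from the UCF101-24 training command in the README.
print(effective_batch_size(8, 16))  # 128

for epoch in range(8):
    print(epoch, step_decay_lr(1e-4, 0.5, [2, 3, 4, 5], epoch))
```

Under these assumptions each optimizer step sees 128 clips, and the learning rate is halved four times before training ends.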
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
from .dataset_config import dataset_config | ||
from .yowo_v2_config import yowo_v2_config | ||
|
||
|
||
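def build_model_config(args): | ||
    print('==============================') | ||
    print('Model Config: {} '.format(args.version.upper())) | ||
|
||
    # Fail loudly on an unknown version instead of returning an unbound m_cfg. | ||
    if 'yowo_v2_' not in args.version: | ||
        raise ValueError('Unknown model version: {}'.format(args.version)) | ||
    m_cfg = yowo_v2_config[args.version] | ||
|
||
    return m_cfg | ||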
|
||
|
||
def build_dataset_config(args): | ||
    print('==============================') | ||
    print('Dataset Config: {} '.format(args.dataset.upper())) | ||
|
||
    d_cfg = dataset_config[args.dataset] | ||
|
||
    return d_cfg |
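
The two builders above only read `args.version` and `args.dataset`, which presumably come from the `-v` and `-d` flags used in the README commands. A minimal sketch of that wiring follows. The long option names `--version` and `--dataset`, the trimmed dictionaries, and the `parse_args` and `lookup_configs` helpers are all assumptions made for illustration; they are not the repository's code.

```python
import argparse

# Hypothetical, trimmed stand-ins for the real config dictionaries.
yowo_v2_config = {'yowo_v2_nano': {'backbone_3d': 'shufflenetv2', 'head_dim': 64}}
dataset_config = {'ucf24': {'valid_num_classes': 24, 'multi_hot': False}}


def parse_args(argv):
    """Parse only the two options the config builders depend on."""
    parser = argparse.ArgumentParser()
    parser.add_argument('-v', '--version', default='yowo_v2_nano')
    parser.add_argument('-d', '--dataset', default='ucf24')
    return parser.parse_args(argv)


def lookup_configs(args):
    """Return (model config, dataset config), rejecting unknown names early."""
    if args.version not in yowo_v2_config:
        raise ValueError('Unknown model version: {}'.format(args.version))
    if args.dataset not in dataset_config:
        raise ValueError('Unknown dataset: {}'.format(args.dataset))
    return yowo_v2_config[args.version], dataset_config[args.dataset]


args = parse_args(['-v', 'yowo_v2_nano', '-d', 'ucf24'])
m_cfg, d_cfg = lookup_configs(args)
print(m_cfg['head_dim'], d_cfg['valid_num_classes'])
```

Validating both names before the lookup gives a readable error for a mistyped `-v` or `-d` value rather than a bare `KeyError`.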
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,92 @@ | ||
# Dataset configuration | ||
|
||
|
||
dataset_config = { | ||
'ucf24': { | ||
# dataset | ||
'gt_folder': './evaluator/groundtruths_ucf_jhmdb/groundtruths_ucf/', | ||
# input size | ||
'train_size': 224, | ||
'test_size': 224, | ||
# transform | ||
'jitter': 0.2, | ||
'hue': 0.1, | ||
'saturation': 1.5, | ||
'exposure': 1.5, | ||
'sampling_rate': 1, | ||
# cls label | ||
'multi_hot': False, # one hot | ||
# optimizer | ||
'optimizer': 'adamw', | ||
'momentum': 0.9, | ||
'weight_decay': 5e-4, | ||
# warmup strategy | ||
'warmup': 'linear', | ||
'warmup_factor': 0.00066667, | ||
'wp_iter': 500, | ||
# class names | ||
'valid_num_classes': 24, | ||
'label_map': ( | ||
'Basketball', 'BasketballDunk', 'Biking', 'CliffDiving', | ||
'CricketBowling', 'Diving', 'Fencing', 'FloorGymnastics', | ||
'GolfSwing', 'HorseRiding', 'IceDancing', 'LongJump', | ||
'PoleVault', 'RopeClimbing', 'SalsaSpin', 'SkateBoarding', | ||
'Skiing', 'Skijet', 'SoccerJuggling', 'Surfing', | ||
'TennisSwing', 'TrampolineJumping', 'VolleyballSpiking', 'WalkingWithDog' | ||
), | ||
}, | ||
|
||
'ava_v2.2':{ | ||
# dataset | ||
'frames_dir': 'frames/', | ||
'frame_list': 'frame_lists/', | ||
'annotation_dir': 'annotations/', | ||
'train_gt_box_list': 'ava_v2.2/ava_train_v2.2.csv', | ||
'val_gt_box_list': 'ava_v2.2/ava_val_v2.2.csv', | ||
'train_exclusion_file': 'ava_v2.2/ava_train_excluded_timestamps_v2.2.csv', | ||
'val_exclusion_file': 'ava_v2.2/ava_val_excluded_timestamps_v2.2.csv', | ||
'labelmap_file': 'ava_v2.2/ava_action_list_v2.2_for_activitynet_2019.pbtxt', # 'ava_v2.2/ava_action_list_v2.2.pbtxt', | ||
'class_ratio_file': 'config/ava_categories_ratio.json', | ||
'backup_dir': 'results/', | ||
# input size | ||
'train_size': 224, | ||
'test_size': 224, | ||
# transform | ||
'jitter': 0.2, | ||
'hue': 0.1, | ||
'saturation': 1.5, | ||
'exposure': 1.5, | ||
'sampling_rate': 1, | ||
# cls label | ||
'multi_hot': True, # multi hot | ||
# train config | ||
'optimizer': 'adamw', | ||
'momentum': 0.9, | ||
'weight_decay': 5e-4, | ||
# warmup strategy | ||
'warmup': 'linear', | ||
'warmup_factor': 0.00066667, | ||
'wp_iter': 500, | ||
# class names | ||
'valid_num_classes': 80, | ||
'label_map': ( | ||
'bend/bow(at the waist)', 'crawl', 'crouch/kneel', 'dance', 'fall down', # 1-5 | ||
'get up', 'jump/leap', 'lie/sleep', 'martial art', 'run/jog', # 6-10 | ||
'sit', 'stand', 'swim', 'walk', 'answer phone', # 11-15 | ||
'brush teeth', 'carry/hold (an object)', 'catch (an object)', 'chop', 'climb (e.g. a mountain)', # 16-20 | ||
'clink glass', 'close (e.g., a door, a box)', 'cook', 'cut', 'dig', # 21-25 | ||
'dress/put on clothing', 'drink', 'drive (e.g., a car, a truck)', 'eat', 'enter', # 26-30 | ||
'exit', 'extract', 'fishing', 'hit (an object)', 'kick (an object)', # 31-35 | ||
'lift/pick up', 'listen (e.g., to music)', 'open (e.g., a window, a car door)', 'paint', 'play board game', # 36-40 | ||
'play musical instrument', 'play with pets', 'point to (an object)', 'press','pull (an object)', # 41-45 | ||
'push (an object)', 'put down', 'read', 'ride (e.g., a bike, a car, a horse)', 'row boat', # 46-50 | ||
'sail boat', 'shoot', 'shovel', 'smoke', 'stir', # 51-55 | ||
'take a photo', 'text on/look at a cellphone', 'throw', 'touch (an object)', 'turn (e.g., a screwdriver)', # 56-60 | ||
'watch (e.g., TV)', 'work on a computer', 'write', 'fight/hit (a person)', 'give/serve (an object) to (a person)', # 61-65 | ||
'grab (a person)', 'hand clap', 'hand shake', 'hand wave', 'hug (a person)', # 66-70 | ||
'kick (a person)', 'kiss (a person)', 'lift (a person)', 'listen to (a person)', 'play with kids', # 71-75 | ||
'push (another person)', 'sing to (e.g., self, a person, a group)', 'take (an object) from (a person)', # 76-78 | ||
'talk to (e.g., self, a person, a group)', 'watch (a person)' # 79-80 | ||
), | ||
} | ||
} |
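
The `multi_hot` flag distinguishes the two datasets: UCF101-24 assigns one action per box (one hot), while AVA allows several actions per box (multi hot). The sketch below shows how a target vector of length `valid_num_classes` could be built in each case. It assumes 0-based indices into `label_map`; the repository may index classes differently, and `encode_labels` is a hypothetical helper, not a function from the code base.

```python
def encode_labels(class_ids, num_classes, multi_hot):
    """Build a target vector from 0-based class ids.

    multi_hot=False expects exactly one id (UCF101-24 style);
    multi_hot=True accepts any number of ids (AVA style).
    """
    if not multi_hot and len(class_ids) != 1:
        raise ValueError('one-hot targets need exactly one class id')
    target = [0.0] * num_classes
    for cid in class_ids:
        if not 0 <= cid < num_classes:
            raise ValueError('class id {} out of range'.format(cid))
        target[cid] = 1.0
    return target


# Index 2 is 'Biking' in the ucf24 label_map.
ucf_target = encode_labels([2], num_classes=24, multi_hot=False)

# Indices 10 and 78 are 'sit' and 'talk to (e.g., self, a person, a group)'
# in the ava_v2.2 label_map.
ava_target = encode_labels([10, 78], num_classes=80, multi_hot=True)

print(sum(ucf_target), sum(ava_target))
```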
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,84 @@ | ||
# Model configuration | ||
|
||
|
||
yowo_v2_config = { | ||
'yowo_v2_nano': { | ||
# backbone | ||
## 2D | ||
'backbone_2d': 'yolo_free_nano', | ||
'pretrained_2d': True, | ||
'stride': [8, 16, 32], | ||
## 3D | ||
'backbone_3d': 'shufflenetv2', | ||
'model_size': '1.0x', | ||
'pretrained_3d': True, | ||
'memory_momentum': 0.9, | ||
# head | ||
'head_dim': 64, | ||
'head_norm': 'BN', | ||
'head_act': 'lrelu', | ||
'num_cls_heads': 2, | ||
'num_reg_heads': 2, | ||
'head_depthwise': True, | ||
}, | ||
|
||
'yowo_v2_tiny': { | ||
# backbone | ||
## 2D | ||
'backbone_2d': 'yolo_free_tiny', | ||
'pretrained_2d': True, | ||
'stride': [8, 16, 32], | ||
## 3D | ||
'backbone_3d': 'shufflenetv2', | ||
'model_size': '2.0x', | ||
'pretrained_3d': True, | ||
'memory_momentum': 0.9, | ||
# head | ||
'head_dim': 64, | ||
'head_norm': 'BN', | ||
'head_act': 'lrelu', | ||
'num_cls_heads': 2, | ||
'num_reg_heads': 2, | ||
'head_depthwise': False, | ||
}, | ||
|
||
'yowo_v2_medium': { | ||
# backbone | ||
## 2D | ||
'backbone_2d': 'yolo_free_large', | ||
'pretrained_2d': True, | ||
'stride': [8, 16, 32], | ||
## 3D | ||
'backbone_3d': 'shufflenetv2', | ||
'model_size': '2.0x', | ||
'pretrained_3d': True, | ||
'memory_momentum': 0.9, | ||
# head | ||
'head_dim': 128, | ||
'head_norm': 'BN', | ||
'head_act': 'silu', | ||
'num_cls_heads': 2, | ||
'num_reg_heads': 2, | ||
'head_depthwise': False, | ||
}, | ||
|
||
'yowo_v2_large': { | ||
# backbone | ||
## 2D | ||
'backbone_2d': 'yolo_free_large', | ||
'pretrained_2d': True, | ||
'stride': [8, 16, 32], | ||
## 3D | ||
'backbone_3d': 'resnext101', | ||
'pretrained_3d': True, | ||
'memory_momentum': 0.9, | ||
# head | ||
'head_dim': 256, | ||
'head_norm': 'BN', | ||
'head_act': 'silu', | ||
'num_cls_heads': 2, | ||
'num_reg_heads': 2, | ||
'head_depthwise': False, | ||
}, | ||
|
||
} |
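
One detail of the configuration above is easy to miss: the `shufflenetv2` variants carry a `model_size` key, while `yowo_v2_large` (which uses `resnext101`) does not. Code that reads the config therefore has to treat `model_size` as optional. The sketch below illustrates one way to do that on a trimmed copy of two entries; `describe_3d_backbone` is a hypothetical helper and the repository may handle this differently.

```python
# Trimmed copy of two entries from yowo_v2_config, for illustration only.
yowo_v2_config = {
    'yowo_v2_nano': {
        'backbone_2d': 'yolo_free_nano',
        'backbone_3d': 'shufflenetv2',
        'model_size': '1.0x',
        'head_dim': 64,
    },
    'yowo_v2_large': {
        'backbone_2d': 'yolo_free_large',
        'backbone_3d': 'resnext101',
        'head_dim': 256,
    },
}


def describe_3d_backbone(cfg):
    """Return the 3D backbone name, appending model_size only when it is set."""
    size = cfg.get('model_size')  # absent for resnext101
    if size is None:
        return cfg['backbone_3d']
    return '{}-{}'.format(cfg['backbone_3d'], size)


for name, cfg in yowo_v2_config.items():
    print(name, describe_3d_backbone(cfg), cfg['head_dim'])
```

Using `dict.get` keeps a single code path for all variants and avoids a `KeyError` on the large model.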
Empty file.