SafeWatch: An Efficient Safety-Policy Following Video Guardrail Model with Transparent Explanations

🔥🔥 For recent news, please check the project page!

Dataset Download

Please download the dataset from Zhaorun/SafeWatch-Bench.
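A minimal sketch of a programmatic download, assuming Zhaorun/SafeWatch-Bench is a dataset repository on the Hugging Face Hub and that the huggingface_hub package is installed. The local directory name is a hypothetical placeholder, and access may require logging in first.

```python
# Assumption: "Zhaorun/SafeWatch-Bench" is a Hugging Face Hub dataset repo.
REPO_ID = "Zhaorun/SafeWatch-Bench"


def download_safewatch_bench(local_dir="data/SafeWatch-Bench"):
    """Download a snapshot of the dataset repo into local_dir.

    local_dir is a hypothetical path; point it wherever the evaluation
    scripts expect the dataset to live.
    """
    # Imported lazily so this module loads even without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",
        local_dir=local_dir,
    )
```

Calling download_safewatch_bench() returns the local path of the downloaded snapshot, which can then be used when checking the dataset paths for inference.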

Video Guardrail Inference

First, run the following script to get the model outputs for all videos under the real and genai subsets. Make sure the dataset paths are correct before running it:

python eval_benchmark.py -m model_name -c checkpoint_path

checkpoint_path is optional. If it is not provided, the model is loaded from the default checkpoint specified by the model name.
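The optional checkpoint argument can be sketched as a small wrapper that appends -c only when a path is given. The helper names build_eval_command and run_eval are hypothetical and not part of the repository; only the -m and -c flags come from the command above.

```python
import subprocess
import sys


def build_eval_command(model_name, checkpoint_path=None):
    """Build the argument list for eval_benchmark.py.

    The -c flag is added only when checkpoint_path is given; otherwise
    the script falls back to the default checkpoint for the model name.
    """
    cmd = [sys.executable, "eval_benchmark.py", "-m", model_name]
    if checkpoint_path is not None:
        cmd += ["-c", checkpoint_path]
    return cmd


def run_eval(model_name, checkpoint_path=None):
    """Run the inference script and raise if it exits with an error."""
    subprocess.run(build_eval_command(model_name, checkpoint_path), check=True)
```

For example, run_eval("model_name") uses the default checkpoint, while run_eval("model_name", "checkpoint_path") passes the checkpoint explicitly.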

📝 Evaluation Scripts

To evaluate per-category guardrail performance (ACC) and overall guardrail performance (Avg ACC, F1, AUPRC), run the following script:

python metrics/eval_guardrail.py
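For readers unfamiliar with the reported metrics, the sketch below shows one common way to compute accuracy, F1, and AUPRC for a binary flagged/not-flagged decision. It is an illustration only, not the code in metrics/eval_guardrail.py; it assumes labels are 0 or 1 and estimates AUPRC as average precision without special handling of tied scores.

```python
def accuracy(labels, preds):
    """Fraction of predictions that match the ground-truth labels."""
    return sum(int(y == p) for y, p in zip(labels, preds)) / len(labels)


def f1_score(labels, preds):
    """Harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_precision(labels, scores):
    """AUPRC estimate: mean precision at the rank of each positive item."""
    n_pos = sum(labels)
    if n_pos == 0:
        return 0.0
    # Rank items from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos
```

Per-category ACC would apply accuracy to the videos of one category at a time, and Avg ACC would average those values across categories.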

To evaluate explanation quality, first sample a subset of videos for evaluation by running:

python eval_judge/sample_exp_eval.py

Then run the following script to get the judge outputs:

python eval_judge/explanation_judge.py

Finally, run the following script to get the evaluation results:

python metrics/eval_explanation.py
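The three explanation-evaluation steps above must run in order, since each consumes the previous step's output. A minimal sketch of chaining them, assuming it is run from the repository root and that the scripts need no extra arguments; the names EXPLANATION_EVAL_STEPS and run_explanation_eval are hypothetical.

```python
import subprocess
import sys

# Script paths taken from the steps above, in the order they should run.
EXPLANATION_EVAL_STEPS = [
    "eval_judge/sample_exp_eval.py",     # sample a subset of videos
    "eval_judge/explanation_judge.py",   # collect the judge outputs
    "metrics/eval_explanation.py",       # compute the evaluation results
]


def run_explanation_eval(runner=subprocess.run):
    """Run each step in order, stopping at the first failing script.

    runner defaults to subprocess.run; with check=True a non-zero exit
    code raises CalledProcessError, so later steps are not executed.
    """
    for script in EXPLANATION_EVAL_STEPS:
        runner([sys.executable, script], check=True)
```

Stopping on the first failure avoids scoring stale or partial judge outputs from an earlier run.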

📖 Acknowledgement

Please cite the paper as follows if you use the data or code from SafeWatch:

@article{chen2024safewatch,
  title={Safewatch: An efficient safety-policy following video guardrail model with transparent explanations},
  author={Chen, Zhaorun and Pinto, Francesco and Pan, Minzhou and Li, Bo},
  journal={arXiv preprint arXiv:2412.06878},
  year={2024}
}

📖 Contact

Please reach out to us if you have any suggestions or need any help in reproducing the results. You can submit an issue or pull request, or send an email to zhaorun@uchicago.edu.

🔑 License

This repository is released under the MIT License.

