[ICLR 2025] Official implementation for "SafeWatch: An Efficient Safety-Policy Following Video Guardrail Model with Transparent Explanations"

BillChan226/SafeWatch

🔥🔥 For recent news, please check the Project page!

Dataset Download

Please download the dataset from Zhaorun/SafeWatch-Bench.

Video Guardrail Inference

First, run the following script to get the model outputs for all videos under the real and genai subsets. Make sure the dataset paths are correct:

python eval_benchmark.py -m model_name -c checkpoint_path

checkpoint_path is optional. If it is not provided, the model is loaded from the default checkpoint specified by the model name.
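The fallback described above can be sketched as follows. This is a minimal illustration, not the code in eval_benchmark.py: the DEFAULT_CHECKPOINTS table, the resolve_checkpoint helper, and the model name "example-model" are all hypothetical, and only the -m and -c flags come from the command shown above.

```python
import argparse

# Hypothetical table of default checkpoints keyed by model name.
# The real defaults live in the repository and may be organized differently.
DEFAULT_CHECKPOINTS = {
    "example-model": "checkpoints/example-model",
}


def build_parser():
    """Build a parser that mirrors the -m / -c flags of the command above."""
    parser = argparse.ArgumentParser(description="Video guardrail inference (sketch)")
    parser.add_argument("-m", dest="model_name", required=True,
                        help="name of the model to evaluate")
    parser.add_argument("-c", dest="checkpoint_path", default=None,
                        help="optional checkpoint path; falls back to the model default")
    return parser


def resolve_checkpoint(model_name, checkpoint_path=None):
    """Return the explicit checkpoint if given, else the default for the model."""
    if checkpoint_path is not None:
        return checkpoint_path
    if model_name not in DEFAULT_CHECKPOINTS:
        raise ValueError(f"no default checkpoint known for model {model_name!r}")
    return DEFAULT_CHECKPOINTS[model_name]


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(resolve_checkpoint(args.model_name, args.checkpoint_path))
```

An explicit -c value always takes precedence, so a fine-tuned checkpoint can be evaluated without editing the default table.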

📝 Evaluation Scripts

To evaluate per-category guardrail performance (ACC) and overall guardrail performance (Avg ACC, F1, AUPRC), please run the following script:

python metrics/eval_guardrail.py
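For readers unfamiliar with these metrics, the sketch below shows one way they could be computed from binary labels and model scores. It is an illustration under stated assumptions, not the logic of metrics/eval_guardrail.py: the record layout, the category names, the 0.5 decision threshold, and all function names are hypothetical, and AUPRC is approximated by average precision, a common step-wise estimate.

```python
from collections import defaultdict


def accuracy(labels, preds):
    """Fraction of predictions that match the labels."""
    return sum(int(y == p) for y, p in zip(labels, preds)) / len(labels)


def f1_score(labels, preds):
    """F1 for the positive (unsafe = 1) class."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_precision(labels, scores):
    """Step-wise AUPRC estimate: mean precision at the rank of each positive.

    Tied scores are ranked in input order, which is a simplification.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        return 0.0
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


def evaluate(records, threshold=0.5):
    """records: iterable of (category, label, score) tuples (hypothetical layout)."""
    by_category = defaultdict(lambda: ([], []))
    labels, scores = [], []
    for category, label, score in records:
        pred = int(score >= threshold)
        by_category[category][0].append(label)
        by_category[category][1].append(pred)
        labels.append(label)
        scores.append(score)
    preds = [int(s >= threshold) for s in scores]
    per_category_acc = {c: accuracy(y, p) for c, (y, p) in by_category.items()}
    return {
        "per_category_acc": per_category_acc,
        "avg_acc": sum(per_category_acc.values()) / len(per_category_acc),
        "f1": f1_score(labels, preds),
        "auprc": average_precision(labels, scores),
    }


# Toy data with placeholder category names; not taken from SafeWatch-Bench.
example_records = [
    ("category_a", 1, 0.9), ("category_a", 0, 0.2), ("category_a", 1, 0.4),
    ("category_b", 0, 0.7), ("category_b", 1, 0.8), ("category_b", 0, 0.1),
]

if __name__ == "__main__":
    print(evaluate(example_records))
```

Here Avg ACC is the unweighted mean of the per-category accuracies, so small categories count as much as large ones; whether the repository weights categories this way is an assumption.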

For evaluating explanation quality, please first sample a subset of videos for evaluation by running:

python eval_judge/sample_exp_eval.py

Then run the following script to get the judge outputs:

python eval_judge/explanation_judge.py

Finally, run the following script to get the evaluation results:

python metrics/eval_explanation.py

📖 Acknowledgement

Please cite the paper as follows if you use the data or code from SafeWatch:

@article{chen2024safewatch,
  title={Safewatch: An efficient safety-policy following video guardrail model with transparent explanations},
  author={Chen, Zhaorun and Pinto, Francesco and Pan, Minzhou and Li, Bo},
  journal={arXiv preprint arXiv:2412.06878},
  year={2024}
}

📖 Contact

Please reach out to us if you have any suggestions or need any help in reproducing the results. You can submit an issue or pull request, or send an email to zhaorun@uchicago.edu.

🔑 License

This repository is released under the MIT License.
