HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models

hbseong97/HarmAug

This repository contains the code for reproducing HarmAug, introduced in:

HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models

Seanie Lee*, Haebin Seong*, Dong Bok Lee, Minki Kang, Xiaoyin Chen, Dominik Wagner, Yoshua Bengio, Juho Lee, Sung Ju Hwang (*: Equal contribution)

[arXiv link]
[Model link]
[Dataset link]

[Figure: concept_figure]

[Figure: overall_comparison_broken]

Reproduction Steps

First, we recommend creating a conda environment with Python 3.10.

conda create -n harmaug python=3.10
conda activate harmaug
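
Before installing anything, it can help to confirm that the activated environment really provides Python 3.10. The following is a minimal sketch, not part of the repository: the helper name `is_python_310` is hypothetical, and it assumes the interpreter is reachable as `python` on the PATH.

```shell
# Hypothetical helper (not part of the repository): succeeds only if the
# given interpreter (default: "python") reports major.minor version 3.10.
is_python_310() {
  py="${1:-python}"
  # Ask the interpreter for its own version; fail if it cannot be run.
  ver=$("$py" -c 'import sys; print("%d.%d" % sys.version_info[:2])' 2>/dev/null) || return 1
  [ "$ver" = "3.10" ]
}

if is_python_310; then
  echo "Python 3.10 is active"
else
  echo "Python 3.10 is not active; try 'conda activate harmaug' first" >&2
fi
```

The check only compares the major and minor version, so any 3.10.x patch release passes.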

After that, install the requirements.

pip install -r requirements.txt

Then, download the necessary files from Google Drive and place them in the appropriate folders.

mv kd_dataset@harmaug.json ./data
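
If `./data` does not exist yet, a plain `mv` would rename the file to `data` instead of moving it into a folder. A minimal sketch that guards against this, assuming the file was downloaded into the current directory; the helper name `place_dataset` is hypothetical and not part of the repository:

```shell
# Hypothetical helper (not part of the repository): move a downloaded file
# into a target folder (default: ./data), creating the folder if needed.
place_dataset() {
  src="$1"
  dest_dir="${2:-./data}"
  # Refuse to continue if the download is not where we expect it.
  if [ ! -f "$src" ]; then
    echo "missing: $src" >&2
    return 1
  fi
  mkdir -p "$dest_dir"
  mv -- "$src" "$dest_dir/"
}
```

From the download folder, this would be called as `place_dataset kd_dataset@harmaug.json`.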

Finally, you can start the knowledge distillation process.

bash script/kd.sh
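
Since the run depends on the earlier steps, a short pre-flight check can catch a missing file before the script starts. This is a sketch under assumptions: the helper name `require_files` is hypothetical, the two paths are taken from the steps above, and the commands are assumed to run from the repository root.

```shell
# Hypothetical pre-flight check (not part of the repository): report every
# missing path on stderr and fail if at least one is missing.
require_files() {
  missing=0
  for path in "$@"; do
    if [ ! -e "$path" ]; then
      echo "missing: $path" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Paths assumed from the reproduction steps; run from the repository root.
if require_files script/kd.sh "data/kd_dataset@harmaug.json"; then
  echo "ready to run: bash script/kd.sh"
fi
```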

Reference

To cite our paper, please use the following BibTeX entry:

@inproceedings{
lee2025harmaug,
title={HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models},
author={Seanie Lee and Haebin Seong and Dong Bok Lee and Minki Kang and Xiaoyin Chen and Dominik Wagner and Yoshua Bengio and Juho Lee and Sung Ju Hwang},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=y3zswp3gek}
}
