HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models

This repository contains code for reproducing HarmAug introduced in

HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models

Seanie Lee*, Haebin Seong*, Dong Bok Lee, Minki Kang, Xiaoyin Chen, Dominik Wagner, Yoshua Bengio, Juho Lee, Sung Ju Hwang (*: Equal contribution)

[arXiv link]
[Model link]
[Dataset link]

Reproduction Steps

First, we recommend creating a conda environment with Python 3.10.

conda create -n harmaug python=3.10
conda activate harmaug
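
As a quick sanity check after activating the environment, the short sketch below confirms that the active interpreter is Python 3.10. It is not part of the repository; the helper name `python_matches` is hypothetical and is used here only for illustration.

```python
import sys


def python_matches(required=(3, 10), actual=None):
    """Return True when the interpreter's major.minor version equals `required`.

    `actual` defaults to the running interpreter; it is a parameter only so
    the check can be exercised with other version tuples.
    """
    if actual is None:
        actual = sys.version_info[:2]
    return tuple(actual[:2]) == tuple(required)


if __name__ == "__main__":
    found = f"{sys.version_info.major}.{sys.version_info.minor}"
    if python_matches():
        print(f"Python {found}: matches the recommended 3.10")
    else:
        print(f"Python {found}: the README recommends 3.10")
```

Run it inside the activated `harmaug` environment; a mismatch usually means the environment was not activated in the current shell.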

After that, install the requirements.

pip install -r requirements.txt

Then, download the necessary files from Google Drive and put them into their appropriate folders.

mv kd_dataset@harmaug.json ./data
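
Before starting training, it can help to confirm that the dataset file landed in `./data` and parses cleanly. The sketch below is a minimal check and not part of the repository. It assumes the file is a single JSON document, which is inferred only from the `.json` extension; if the file is actually in JSON Lines format, `json.load` will raise an error. The helper name `summarize_json` is hypothetical.

```python
import json
from pathlib import Path

# Path taken from the `mv` command above; adjust it if the file lives elsewhere.
DATASET_PATH = Path("data") / "kd_dataset@harmaug.json"


def summarize_json(path):
    """Load a JSON file and return (top-level type name, number of top-level items)."""
    with open(path, encoding="utf-8") as f:
        content = json.load(f)
    # len() works for both lists and dicts; any other top-level value counts as one item.
    size = len(content) if isinstance(content, (list, dict)) else 1
    return type(content).__name__, size


if __name__ == "__main__":
    if DATASET_PATH.is_file():
        kind, size = summarize_json(DATASET_PATH)
        print(f"{DATASET_PATH}: top-level {kind} with {size} items")
    else:
        print(f"{DATASET_PATH} not found; download it and move it into ./data first")
```

The script makes no assumption about the fields inside each record; it only reports the top-level container and its size, so it stays valid whatever schema the dataset uses.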

Finally, you can start the knowledge distillation process.

bash script/kd.sh

Reference

To cite our paper, please use this BibTeX entry:

@inproceedings{
lee2025harmaug,
title={HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models},
author={Seanie Lee and Haebin Seong and Dong Bok Lee and Minki Kang and Xiaoyin Chen and Dominik Wagner and Yoshua Bengio and Juho Lee and Sung Ju Hwang},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=y3zswp3gek}
}
