MILE: A Mutation Testing Framework of In-Context Learning Systems (SETTA 2024)

Zeming Wei, Yihao Zhang, and Meng Sun.

Accepted by SETTA 2024. Preprint: https://arxiv.org/abs/2409.04831

Usage

Download the SST2, AGnews, mrpc, QNLI, RTE, and WMT datasets and move them into the folder ./data. Alternatively, copy the data folder directly from BatchICL.
Edit the paths to your LLMs in paths.py.
Calculate the accuracy with eval_acc.py. Example:

python eval_acc.py --model vicuna --task all --shots 20 --test-example 250

Create the folder ./results, then run the mutation testing with main.py. The logs are saved in ./results. Example:

python main.py --model vicuna --mutants 20 --test-example 250 --shots 20 --task SST2
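
Because the later analysis step needs logs for every model and task, it can help to script the runs. The following is a minimal sketch, not part of the repository: it only reuses the flags shown in the example above, and the task names are an assumption taken from the dataset list (only SST2 appears in the README's own example, so check main.py for the accepted values). The helper names build_command and run_all are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# Assumption: task names mirror the dataset names listed in the Usage section.
# Only "SST2" is confirmed by the README's own example.
TASKS = ["SST2", "AGnews", "mrpc", "QNLI", "RTE", "WMT"]


def build_command(model, task, mutants=20, test_examples=250, shots=20):
    """Build the main.py command line using the flags from the README example."""
    return [
        sys.executable, "main.py",
        "--model", model,
        "--mutants", str(mutants),
        "--test-example", str(test_examples),
        "--shots", str(shots),
        "--task", task,
    ]


def run_all(models, tasks=TASKS, dry_run=True):
    """Create ./results, then run (or only print) one command per model/task pair."""
    Path("results").mkdir(exist_ok=True)
    commands = [build_command(m, t) for m in models for t in tasks]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))            # show what would be executed
        else:
            subprocess.run(cmd, check=True)  # stop on the first failing run
    return commands


if __name__ == "__main__":
    # Dry run by default so nothing expensive starts by accident.
    run_all(["vicuna"], dry_run=True)
```

Running the script as-is only prints the commands; pass dry_run=False to run_all once the printed commands look right.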

Calculate the Standard and Group-wise Mutation Scores with analysis.py and mutator_analysis.py. Both scripts require complete logs for all models and tasks. Example:

python analysis.py --num 50
python mutator_analysis.py
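
For readers new to mutation testing, the sketch below illustrates the textbook notion of a mutation score: the fraction of mutants that the test set "kills", where a mutant is killed if at least one test input receives a different prediction than under the original prompt. This is an illustrative assumption only. The Standard and Group-wise Mutation Scores computed by analysis.py and mutator_analysis.py are defined in the paper and may differ; the function names here are hypothetical.

```python
def is_killed(original_preds, mutant_preds):
    """A mutant counts as killed if any test input gets a different prediction."""
    return any(o != m for o, m in zip(original_preds, mutant_preds))


def mutation_score(original_preds, mutants_preds):
    """Fraction of mutants killed by the test set (textbook definition)."""
    if not mutants_preds:
        raise ValueError("at least one mutant is required")
    killed = sum(is_killed(original_preds, m) for m in mutants_preds)
    return killed / len(mutants_preds)


# Toy example: predictions on three test inputs under the original prompt
# and under four mutated prompts.
original = ["pos", "neg", "pos"]
mutants = [
    ["pos", "neg", "pos"],  # identical predictions: survives
    ["neg", "neg", "pos"],  # first prediction differs: killed
    ["pos", "neg", "neg"],  # last prediction differs: killed
    ["pos", "neg", "pos"],  # identical predictions: survives
]
print(mutation_score(original, mutants))  # 0.5
```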

Citation

@InProceedings{wei2024mile,
    title     = {MILE: A Mutation Testing Framework of In-Context Learning Systems},
    author    = {Wei, Zeming and Zhang, Yihao and Sun, Meng},
    booktitle = {SETTA},
    year      = {2024}
}
