To obtain the candidate and validated patches generated by all the code language models (referring to RQ1 and RQ2 in the paper), downloading the artifact is sufficient.
To reproduce the results or re-finetune the models, an Ubuntu system is recommended.
- Developers also need Docker installed in order to run the code in our Docker image.
- To re-generate patches with all ten code language models evaluated in the paper (or to re-finetune them), developers need at least 8 GPUs with 8*12 = 96 GB of GPU memory in total (the CodeGen-6B and InCoder-6B models are particularly resource-intensive). To re-generate patches with only the smaller models (e.g., CodeT5-large, PLBART-large), a single GPU is enough (referring to RQ3 in the paper). A quick check of the local GPU setup is sketched below.
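
The following snippet is a minimal sketch (not part of the artifact's scripts) for checking whether the local machine meets the GPU requirements above; it assumes PyTorch with CUDA support is already installed.

```python
# Sketch: verify GPU count and total GPU memory against the requirements above.
import torch

REQUIRED_GPUS = 8        # needed to re-run all ten models
REQUIRED_TOTAL_GB = 96   # 8 * 12 GB, as stated above

if not torch.cuda.is_available():
    raise SystemExit("No CUDA-capable GPU detected.")

num_gpus = torch.cuda.device_count()
total_gb = sum(
    torch.cuda.get_device_properties(i).total_memory for i in range(num_gpus)
) / (1024 ** 3)

print(f"Detected {num_gpus} GPU(s) with {total_gb:.1f} GB of memory in total.")

if num_gpus >= REQUIRED_GPUS and total_gb >= REQUIRED_TOTAL_GB:
    print("Sufficient for re-generating patches with all ten models.")
else:
    print("Likely sufficient only for the smaller models (e.g., CodeT5-large, PLBART-large).")
```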