A Code Discriminator Integrating General Semantics with Code Details
Paper: https://arxiv.org/abs/2412.17429
Condor consists of two main components: contrastive learning at the embedding level, which captures code details (upper section), and data-level augmentation through intermediate code, which supplements code details that are not recorded in existing datasets (lower section).
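To make the embedding-level idea concrete, here is a minimal sketch of a generic InfoNCE-style contrastive loss. It is not taken from this repository: the function names, the cosine similarity, the temperature value, and the toy vectors are all assumptions chosen for illustration, and the objective actually used by Condor may differ (see the paper and /src).

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for a single anchor embedding.

    Hypothetical helper for illustration only, not Condor's implementation.
    The loss is low when the anchor is closer to the positive than to
    any of the negatives.
    """
    # The positive pair goes first, followed by all negative pairs.
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denom - logits[0]


# Toy 2-d "embeddings" (assumed values, not real model outputs).
anchor = [1.0, 0.0]
near = [1.0, 0.0]
far = [0.0, 1.0]

loss_good = info_nce_loss(anchor, positive=near, negatives=[far])
loss_bad = info_nce_loss(anchor, positive=far, negatives=[near])
print(loss_good < loss_bad)
```

In a setup like this, near-identical code variants that differ only in a small detail would typically serve as the hard negatives, so that the encoder learns to separate them.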
You can find the CodeNanoFix dataset in /data.
You can find the model with fine-tuned weights in /models.
You can find the training details in /src.
Training entry: GoGoGoxx.py, for example:
python GoGoGoSingleDS.py