diff --git a/README.md b/README.md
index 9269a04..4e95231 100644
--- a/README.md
+++ b/README.md
@@ -104,7 +104,7 @@ OUTPUT_PATH= # 测试集预测输出.xlsx文件位置
 + (2022/11/29) 根据`GMago123`的反馈([ISSUE#3](https://github.com/xlxwalex/FCGEC/issues/3)) ,为了便于提交评测,我们在Joint的输出XLSX文件中加入了对应的UID字段
 + (2022/12/06) 根据`谭*环`(邮件)的建议,为了方便对比Seq2Seq类型的模型,我们提供了可将FCGEC数据的编辑操作转换为序列到序列数据的脚本[convert_fcgec_to_seq2seq.py](https://github.com/xlxwalex/FCGEC/blob/main/model/STG-correction/scripts/convert_fcgec_to_seq2seq.py) ,使用方式请见[`scripts`](https://github.com/xlxwalex/FCGEC/tree/main/model/STG-correction/scripts) 下的README文件
 + (2023/01/25) 根据`Candice52`的反馈([ISSUE#7](https://github.com/xlxwalex/FCGEC/issues/7)),我们提供了将其他Seq2Seq格式的数据转换为本工作中的操作格式的脚本(论文中的Algorithm 1代码) [convert_seq2seq_to_operation.py](https://github.com/xlxwalex/FCGEC/blob/main/model/STG-correction/scripts/convert_seq2seq_to_operation.py) ,使用方式请见[`scripts`](https://github.com/xlxwalex/FCGEC/tree/main/model/STG-correction/scripts) 下的README文件
-+ (2023/03/03) 根据`Helenailse1`的建议([ISSUE#10](https://github.com/xlxwalex/FCGEC/issues/10))),我们新增了对标点、字母等非中文符号的映射
++ (2023/03/03) 根据`Helenailse1`的建议([ISSUE#10](https://github.com/xlxwalex/FCGEC/issues/10)),我们新增了对标点、字母等非中文符号的映射
 
 ### 模型参数文件
 我们提供在FCGEC语料上训练好的模型checkpoint文件,以方便大家使用测试:
diff --git a/README_EN.md b/README_EN.md
index 6988bd2..b2d7f6b 100644
--- a/README_EN.md
+++ b/README_EN.md
@@ -105,9 +105,10 @@ PLM_PATH= # Pre-trained model path
 OUTPUT_PATH= # Prediction output of .xlsx file for STG model
 ```
 ### New Features
-+ (2022/11/29) Add `uuid` column to the output xlsx file of [`run_stg_tti.sh`](https://github.com/xlxwalex/FCGEC/blob/main/model/STG-correction/run_stg_tti.sh).
++ (2022/11/29) Add `uuid` column to the output xlsx file of [`run_stg_joint.sh`](https://github.com/xlxwalex/FCGEC/blob/main/model/STG-correction/run_stg_tti.sh).
 + (2022/12/06) Add a script file [convert_fcgec_to_seq2seq.py](https://github.com/xlxwalex/FCGEC/blob/main/model/STG-correction/scripts/convert_fcgec_to_seq2seq.py) which can convert our FCGEC data to seq2seq format. More details are shown in the README of [`scripts`](https://github.com/xlxwalex/FCGEC/tree/main/model/STG-correction/scripts) folder.
 + (2023/01/25) Add a script file [convert_seq2seq_to_operation.py](https://github.com/xlxwalex/FCGEC/blob/main/model/STG-correction/scripts/convert_seq2seq_to_operation.py) which can convert our seq2seq format data to operation format data. More details are shown in the README of [`scripts`](https://github.com/xlxwalex/FCGEC/tree/main/model/STG-correction/scripts) folder.
++ (2023/03/03) Add new codes for processing special charaters (e.g., punctuations, alphabets) for [`run_stg_joint.sh`](https://github.com/xlxwalex/FCGEC/blob/main/model/STG-correction/run_stg_joint.sh).
 
 ### Checkpoint File
 We provide checkpoint files of our STG trained on the FCGEC corpus to facilitate testing: