
Evaluation results are inconsistent during training and after saving the trained model #405

@YUXIN-commit opened this issue Dec 18, 2024
Hi MONAI Team,
First of all, thumbs up for your great work. I am facing an issue with the SwinUNETR model: evaluation results during training and at test time are inconsistent.
During training I get good results, but when I save the trained SwinUNETR model and evaluate it, the results are very bad. I didn't change anything; I just downloaded your GitHub repo and trained the SwinUNETR model, yet the results are inconsistent. I am attaching screenshots from training the model for 100 epochs. Please help me figure out the issue. Waiting for your response. Thank you in advance.

Training results: [screenshot]

Results after saving the trained SwinUNETR model (much worse): [screenshot]

@YUXIN-commit (Author)

I encountered another issue: too many false positives on the test data when using test.py. Has anyone encountered this problem before? Could you please let me know how to resolve it?

As shown in the image: [screenshot]

@ras-lyg commented Jan 15, 2025

Hello, I've encountered the same issue. When I downloaded the author’s pre-trained weights to test on the BTCV dataset, the results were very good. However, when I used my own brain supplement dataset for training, the testing performance was very poor despite achieving high accuracy during the training process. Have you managed to resolve this issue? Looking forward to your reply.

@YUXIN-commit (Author)

> Hello, I've encountered the same issue. When I downloaded the author’s pre-trained weights to test on the BTCV dataset, the results were very good. However, when I used my own brain supplement dataset for training, the testing performance was very poor despite achieving high accuracy during the training process. Have you managed to resolve this issue? Looking forward to your reply.

Hello, I have solved the issue of overly fragmented test predictions; it was mainly caused by the strict requirements on the sliding window inference size. You can adjust the roi_size and overlap to resolve this problem. Additionally, I found that setting include_background to False in DiceMetric during training makes the reported training performance closer to the test performance.
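
For reference, here is a minimal sketch of those two adjustments using MONAI's standard sliding_window_inference and DiceMetric APIs. The roi_size, overlap, class count, checkpoint name, and the evaluate() helper are illustrative assumptions, not the exact settings of this repository's scripts.

```python
# Minimal sketch, assuming MONAI-style dictionary data and a trained checkpoint.
# ROI_SIZE, OVERLAP, NUM_CLASSES, and the function/checkpoint names are hypothetical.
import torch
from monai.inferers import sliding_window_inference
from monai.metrics import DiceMetric
from monai.networks.nets import SwinUNETR
from monai.transforms import AsDiscrete

NUM_CLASSES = 14          # e.g. BTCV: background + 13 organs (adjust to your dataset)
ROI_SIZE = (96, 96, 96)   # should match the patch size used during training
OVERLAP = 0.5             # raise this if stitched predictions look fragmented


def evaluate(checkpoint_path, val_loader, device="cuda"):
    """Run sliding-window inference and report foreground-only mean Dice."""
    model = SwinUNETR(img_size=ROI_SIZE, in_channels=1, out_channels=NUM_CLASSES).to(device)
    model.load_state_dict(torch.load(checkpoint_path, map_location=device))
    model.eval()

    # include_background=False so the background class cannot inflate the score
    dice_metric = DiceMetric(include_background=False, reduction="mean")
    post_pred = AsDiscrete(argmax=True, to_onehot=NUM_CLASSES)
    post_label = AsDiscrete(to_onehot=NUM_CLASSES)

    with torch.no_grad():
        for batch in val_loader:
            image = batch["image"].to(device)
            label = batch["label"].to(device)
            logits = sliding_window_inference(
                image, ROI_SIZE, sw_batch_size=4, predictor=model, overlap=OVERLAP
            )
            preds = [post_pred(p) for p in logits]
            labels = [post_label(l) for l in label]
            dice_metric(y_pred=preds, y=labels)

    return dice_metric.aggregate().item()
```

If the same DiceMetric configuration is used in the training validation loop, the number reported during training and the number from test-time evaluation measure the same thing, which makes them comparable.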

@ras-lyg commented Jan 15, 2025

Thank you for your response. However, it seems that our issues are somewhat different. I am using SwinUNETR to train on other datasets to obtain weights, and then using the author's test.py for testing. Despite achieving an accuracy of 0.865 during training, the testing performance is not good, with many classes having a Dice score of only 0.

@YUXIN-commit (Author)

> Thank you for your response. However, it seems that our issues are somewhat different. I am using SwinUNETR to train on other datasets to obtain weights, and then using the author's test.py for testing. Despite achieving an accuracy of 0.865 during training, the testing performance is not good, with many classes having a Dice score of only 0.

I feel the same way, but I think the high Dice score during training comes from include_background being set to True, which includes the background class (class 0) in the calculation and inflates the mean Dice.
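
As a hypothetical toy illustration of that effect (not taken from the repo), the snippet below builds a small one-hot volume where the prediction is nearly perfect on the background and one foreground class but completely misses another; including the easily predicted background raises the reported mean Dice.

```python
# Toy example: background inclusion inflates mean Dice. Shapes and class layout are made up.
import torch
from monai.metrics import DiceMetric

# 1-sample, 3-class one-hot volumes (background + 2 foreground classes).
label = torch.zeros(1, 3, 8, 8, 8)
label[:, 0] = 1.0
label[:, 0, 2:6, 2:6, 2:6] = 0.0
label[:, 1, 2:6, 2:6, 2:4] = 1.0   # foreground class 1
label[:, 2, 2:6, 2:6, 4:6] = 1.0   # foreground class 2

pred = label.clone()
pred[:, 2] = 0.0                   # prediction misses class 2 entirely
pred[:, 0, 2:6, 2:6, 4:6] = 1.0    # and calls those voxels background instead

with_bg = DiceMetric(include_background=True, reduction="mean")
no_bg = DiceMetric(include_background=False, reduction="mean")
with_bg(y_pred=pred, y=label)
no_bg(y_pred=pred, y=label)
print("mean Dice with background:   ", with_bg.aggregate().item())
print("mean Dice without background:", no_bg.aggregate().item())
```

Even in this tiny example the background-inclusive mean Dice is noticeably higher than the foreground-only one, despite one foreground class being missed entirely.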

@AbdulRehman0004
Hi. Has anyone tried this model with BraTS training? I am getting a very bad Dice score, around 0.04, and I can't figure out the reason.
Can anyone help, please?
