
Evaluation results are inconsistent during training and after saving the trained model #405

@YUXIN-commit opened this issue Dec 18, 2024
Hi MONAI Team,
First of all, thumbs up for your great work. I am facing an issue with the SwinUNETR model: evaluation results during training and at test time are inconsistent.
During training I get good results, but when I save the trained SwinUNETR model and evaluate it, the results are very bad. I didn't change anything; I just downloaded your GitHub repo and trained the SwinUNETR model, yet the results are inconsistent. I am attaching screenshots from training the model for 100 epochs. Please help me figure out the issue. Waiting for your response. Thank you in advance.

Training results: [screenshot]

Results after saving the trained SwinUNETR model (much worse): [screenshot]

@YUXIN-commit (Author)

I encountered another issue: too many false positives on the test data when using test.py. Has anyone encountered this problem before? Could you please let me know how to resolve it?

As shown in the image: [screenshot]

@ras-lyg commented Jan 15, 2025

Hello, I've encountered the same issue. When I downloaded the author’s pre-trained weights to test on the BTCV dataset, the results were very good. However, when I used my own brain supplement dataset for training, the testing performance was very poor despite achieving high accuracy during the training process. Have you managed to resolve this issue? Looking forward to your reply.

@YUXIN-commit (Author)

> Hello, I've encountered the same issue. When I downloaded the author’s pre-trained weights to test on the BTCV dataset, the results were very good. However, when I used my own brain supplement dataset for training, the testing performance was very poor despite achieving high accuracy during the training process. Have you managed to resolve this issue? Looking forward to your reply.

Hello, I have solved the issue of overly fragmented test predictions; it was mainly caused by the strict requirements on the sliding window inference size. You can adjust the roi_size and overlap to resolve this problem. Additionally, I found that setting include_background to False in DiceMetric during training makes the reported training performance closer to the test performance.
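
For reference, here is a minimal sketch of those two adjustments using MONAI's standard sliding_window_inference and DiceMetric APIs. The roi_size, overlap, class count, checkpoint name, and the evaluate() helper are illustrative assumptions, not the exact settings of this repository's scripts.

```python
# Minimal sketch, assuming MONAI-style dictionary data and a trained checkpoint.
# ROI_SIZE, OVERLAP, NUM_CLASSES, and the function/checkpoint names are hypothetical.
import torch
from monai.inferers import sliding_window_inference
from monai.metrics import DiceMetric
from monai.networks.nets import SwinUNETR
from monai.transforms import AsDiscrete

NUM_CLASSES = 14          # e.g. BTCV: background + 13 organs (adjust to your dataset)
ROI_SIZE = (96, 96, 96)   # should match the patch size used during training
OVERLAP = 0.5             # raise this if stitched predictions look fragmented


def evaluate(checkpoint_path, val_loader, device="cuda"):
    """Run sliding-window inference and report foreground-only mean Dice."""
    model = SwinUNETR(img_size=ROI_SIZE, in_channels=1, out_channels=NUM_CLASSES).to(device)
    model.load_state_dict(torch.load(checkpoint_path, map_location=device))
    model.eval()

    # include_background=False so the background class cannot inflate the score
    dice_metric = DiceMetric(include_background=False, reduction="mean")
    post_pred = AsDiscrete(argmax=True, to_onehot=NUM_CLASSES)
    post_label = AsDiscrete(to_onehot=NUM_CLASSES)

    with torch.no_grad():
        for batch in val_loader:
            image = batch["image"].to(device)
            label = batch["label"].to(device)
            logits = sliding_window_inference(
                image, ROI_SIZE, sw_batch_size=4, predictor=model, overlap=OVERLAP
            )
            preds = [post_pred(p) for p in logits]
            labels = [post_label(l) for l in label]
            dice_metric(y_pred=preds, y=labels)

    return dice_metric.aggregate().item()
```

If the same DiceMetric configuration is used in the training validation loop, the number reported during training and the number from test-time evaluation measure the same thing, which makes them comparable.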

@ras-lyg commented Jan 15, 2025

Thank you for your response. However, it seems that our issues are somewhat different. I am using SwinUNETR to train on other datasets to obtain weights, and then using the author's test.py for testing. Despite achieving an accuracy of 0.865 during training, the testing performance is not good, with many classes having a Dice score of only 0.

@YUXIN-commit (Author)

> Thank you for your response. However, it seems that our issues are somewhat different. I am using SwinUNETR to train on other datasets to obtain weights, and then using the author's test.py for testing. Despite achieving an accuracy of 0.865 during training, the testing performance is not good, with many classes having a Dice score of only 0.

I feel the same way, but I think the high Dice score during training comes from include_background being set to True, which includes the background class (class 0) in the calculation and inflates the mean Dice.
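
As a hypothetical toy illustration of that effect (not taken from the repo), the snippet below builds a small one-hot volume where the prediction is nearly perfect on the background and one foreground class but completely misses another; including the easily predicted background raises the reported mean Dice.

```python
# Toy example: background inclusion inflates mean Dice. Shapes and class layout are made up.
import torch
from monai.metrics import DiceMetric

# 1-sample, 3-class one-hot volumes (background + 2 foreground classes).
label = torch.zeros(1, 3, 8, 8, 8)
label[:, 0] = 1.0
label[:, 0, 2:6, 2:6, 2:6] = 0.0
label[:, 1, 2:6, 2:6, 2:4] = 1.0   # foreground class 1
label[:, 2, 2:6, 2:6, 4:6] = 1.0   # foreground class 2

pred = label.clone()
pred[:, 2] = 0.0                   # prediction misses class 2 entirely
pred[:, 0, 2:6, 2:6, 4:6] = 1.0    # and calls those voxels background instead

with_bg = DiceMetric(include_background=True, reduction="mean")
no_bg = DiceMetric(include_background=False, reduction="mean")
with_bg(y_pred=pred, y=label)
no_bg(y_pred=pred, y=label)
print("mean Dice with background:   ", with_bg.aggregate().item())
print("mean Dice without background:", no_bg.aggregate().item())
```

Even in this tiny example the background-inclusive mean Dice is noticeably higher than the foreground-only one, despite one foreground class being missed entirely.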

@AbdulRehman0004
Hi. Has anyone tried this model with BraTS training? I am getting a very bad Dice score, around 0.04, and I can't figure out the reason.
Can anyone help, please?
