Guidance on Training from Scratch or Fine-Tuning #164

Open
alfausa1 opened this issue Jan 15, 2025 · 12 comments
Labels
documentation Improvements or additions to documentation

Comments

@alfausa1

Hi,
I would like to ask how many images you would recommend for training a model from scratch, and what weights you would suggest starting with.

My use case is object segmentation on plain backgrounds. The general model currently works quite well for most cases, but there are a few specific scenarios that could be improved. This is why I’m considering training or fine-tuning.

I have a dataset of around 7,000 images at 2K resolution. What would you recommend in this case?

Thank you in advance for your help!

@ZhengPeng7
Owner

For common cases with no extremely complicated shapes, 500-1,000 images should be enough for training from scratch.
If your cases are very different from the training sets I used for the general-version weights, I suggest training from scratch when you have enough images; otherwise, fine-tuning could be the better option.

In your case, I recommend training from scratch. BTW, you can check the model efficiency part of the README; FP16 + compile==True + PyTorch==2.5.1 saves GPU memory, so you can apply less downscaling to your 2K data.
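A minimal sketch of that memory-saving setup, assuming a standard PyTorch training step (the tiny model, input shapes, and loss below are stand-ins, not BiRefNet's actual API; `torch.compile` is left commented out since it needs PyTorch >= 2.0 and a working compiler toolchain):

```python
import torch
import torch.nn as nn

# Stand-in for the real segmentation model (BiRefNet is loaded via the repo).
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.Conv2d(8, 1, 3, padding=1))

device = "cuda" if torch.cuda.is_available() else "cpu"
# CUDA autocast uses float16; CPU autocast only supports bfloat16.
amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16
model = model.to(device)
# model = torch.compile(model)  # "compile==True" in the README's terms

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
# Gradient scaling is only needed for float16 on CUDA; it's a no-op otherwise.
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

x = torch.rand(2, 3, 64, 64, device=device)                  # batch of image crops
y = (torch.rand(2, 1, 64, 64, device=device) > 0.5).float()  # binary masks

# Forward pass runs in reduced precision; gradients are scaled for fp16.
with torch.autocast(device_type=device, dtype=amp_dtype):
    loss = nn.functional.binary_cross_entropy_with_logits(model(x), y)

scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
```

With less activation memory per pixel, larger crops of the 2K images fit on the same GPU, which is the point of the suggestion above.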

@Roshan-digi5

Hello,

First of all, thank you for your incredible work and contributions!

I want to train a model specifically for removing backgrounds from car images. I have a dataset of approximately 80,000 images. Could you guide me on the best practices to follow, which model and settings would be most suitable, and whether there are any tutorials available for training or fine-tuning a model?

@ZhengPeng7
Owner

I've written a fine-tuning guideline in my README. For fine-tuning settings, you can keep the defaults except for the number of epochs. If you still have problems after following it, please tell me.
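As a minimal illustration of the resume-from-pretrained-weights step that fine-tuning starts from (the tiny `Conv2d`, the file path, and the epoch/learning-rate values here are stand-ins; the real checkpoint and settings come from the README and config.py):

```python
import os
import tempfile

import torch
import torch.nn as nn

# Stand-in "pretrained" model; in practice this is the general BiRefNet checkpoint.
pretrained = nn.Conv2d(3, 1, 3, padding=1)
ckpt_path = os.path.join(tempfile.gettempdir(), "pretrained_stand_in.pth")
torch.save(pretrained.state_dict(), ckpt_path)

# Fine-tuning starts from those weights instead of a random init.
model = nn.Conv2d(3, 1, 3, padding=1)
model.load_state_dict(torch.load(ckpt_path, map_location="cpu"))

# Per the advice above: keep the defaults, but shorten the schedule.
epochs = 20  # far fewer than a from-scratch run
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
```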

@Roshan-digi5

Thank you, I will let you know in case of any issue.

@alfausa1
Author

Hi,

Thank you so much for taking the time to reply!

I wanted to ask specifically about the configuration, losses, and backbone you would recommend for my use case. Are there any particular hyperparameters or architectures you find especially suitable for this type of task? Any additional guidance would be greatly appreciated.

Thanks again for your support!

@ZhengPeng7
Owner

In my mind, car segmentation should involve fewer fine contour details and no need for transparency. If so, you can train the model for fewer epochs with a higher weight on the IoU loss to accelerate convergence.
I may come up with more points in the future, but currently that's all.
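To give a concrete shape to "a higher weight on the IoU loss", here is one common formulation, a soft (differentiable) IoU term combined with BCE; the `iou_weight` value is an illustrative assumption, not the repo's default:

```python
import torch
import torch.nn.functional as F

def soft_iou_loss(logits, target, eps=1e-6):
    """Soft (differentiable) IoU / Jaccard loss on sigmoid probabilities."""
    prob = torch.sigmoid(logits)
    inter = (prob * target).sum(dim=(1, 2, 3))
    union = (prob + target - prob * target).sum(dim=(1, 2, 3))
    return (1.0 - (inter + eps) / (union + eps)).mean()

def combined_loss(logits, target, iou_weight=2.0):
    """BCE plus an up-weighted IoU term, favoring fast convergence on coarse shapes."""
    bce = F.binary_cross_entropy_with_logits(logits, target)
    return bce + iou_weight * soft_iou_loss(logits, target)
```

The IoU term rewards overlap of the whole predicted region at once, which is why up-weighting it helps when the target shapes are simple and per-pixel detail matters less.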

@alfausa1
Author

Sorry for not specifying earlier: my use case is object segmentation on a plain background (not cars). Many objects do have transparency, and some have small details like tiny holes.

@ZhengPeng7
Owner

That would be a general case. I'm not sure about it (otherwise, I would have added the updates to the default settings).

@alfausa1
Author

alfausa1 commented Jan 21, 2025

Thank you very much again! The model trained with DIS performs really well in most cases, but we have identified some corner cases where it fails. Would you recommend fine-tuning only on those specific cases where it fails (not the entire 7k, just the problematic ones), or fine-tuning on the entire dataset instead?

How much VRAM would I need? I have read it is around 25 GB with FP16?

@ZhengPeng7
Owner

If you find it works worse on some specific cases, training only on those would help a lot; hard negative samples usually teach the model more.
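A tiny sketch of how one might collect those failing cases for a focused fine-tuning set (the `mask_iou` helper, the `predictions` dict, and the 0.8 threshold are all hypothetical, just to illustrate the selection):

```python
import torch

def mask_iou(pred, gt, eps=1e-6):
    """IoU between two binary masks given as bool tensors."""
    inter = (pred & gt).float().sum()
    union = (pred | gt).float().sum()
    return ((inter + eps) / (union + eps)).item()

# Hypothetical: image name -> (predicted mask, ground-truth mask)
predictions = {
    "good.png": (torch.ones(4, 4, dtype=torch.bool), torch.ones(4, 4, dtype=torch.bool)),
    "bad.png":  (torch.zeros(4, 4, dtype=torch.bool), torch.ones(4, 4, dtype=torch.bool)),
}

# Keep only images where the current model scores below the threshold.
hard_cases = [n for n, (p, g) in predictions.items() if mask_iou(p, g) < 0.8]
print(hard_cases)  # → ['bad.png']
```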

Yeah, following the settings there with compile, FP16, and batch_size == 2, the training would take ~25 GB.

@Roshan-digi5

I'm following the guidelines you created but I'm still unable to understand some steps. I have updated my dataset paths as described, up to step 2. After that, what changes need to be made in config.py and in train.py? Is there any more guidance, or a Colab demo for fine-tuning?

@ZhengPeng7
Owner

OK, thanks for the suggestion. I'll try to record a video of ~1 min to start basic fine-tuning.

@ZhengPeng7 ZhengPeng7 added the documentation Improvements or additions to documentation label Feb 5, 2025