switch logistic classifier to random forest as default classifier #1031
All benchmarks (diff):
@NickCrews, there's a lot of noise in the recall/precision differences. Should we increase the repetitions?
You know you can run these benchmarks locally, right? That might help with any debugging you need to do. Look at the workflow to see what to do; I can help if needed.

It seems easy enough to try increasing the reps and see whether the metrics become more stable. Play around locally and see how many reps you need to get stability. I think this is done by adjusting the attributes described at https://asv.readthedocs.io/en/stable/benchmarks.html#benchmark-attributes

This variation seems inherent in the fact that we are using non-deterministic algorithms, so I don't see it as a problem with our testing methodology. It could be considered a problem with the actual implementation, though: if we get this much variation between runs, should dedupe run a bunch of trials and then choose the settings from the best run? That smells like overfitting to me.

Somewhat related, a nice-to-have would be the ability to pass random_state into all the classes and functions to make them deterministic, as sklearn and similar libraries do. That would help with this problem by making two different benchmark runs more comparable. BUT, if we happen to choose a random_state that isn't representative, then these benchmarks won't be very helpful for predicting real-life behavior (maybe I'm thinking about this wrong).

Anyway, I think yes, increasing the reps for the benchmarks would be the best bet.
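To make the trade-off between more reps and a fixed seed concrete, here is a minimal sketch using only the Python standard library. It does not use dedupe or asv: `noisy_recall` is a hypothetical stand-in for one train-and-score run, and the mean of 0.90 and noise level of 0.05 are made-up numbers for illustration. It shows that averaging over more reps shrinks the run-to-run spread of the reported metric, and that seeding makes a single run reproducible.

```python
import random
import statistics


def noisy_recall(rng):
    """Hypothetical stand-in for one train-and-score run (not dedupe's API).

    Draws a recall value around an assumed true value of 0.90 with
    assumed noise of 0.05, clamped to the valid range [0, 1].
    """
    return min(1.0, max(0.0, rng.gauss(0.90, 0.05)))


def mean_recall(reps, seed):
    """Average the metric over `reps` trials, using a seeded generator."""
    rng = random.Random(seed)
    return statistics.mean(noisy_recall(rng) for _ in range(reps))


def spread(reps, n_runs=200):
    """Standard deviation of the averaged metric across independent runs.

    Each run uses a different seed, mimicking separate benchmark runs.
    """
    return statistics.stdev(mean_recall(reps, seed) for seed in range(n_runs))


if __name__ == "__main__":
    for reps in (1, 5, 25):
        print(f"reps={reps:2d}  spread of reported recall={spread(reps):.4f}")
```

The spread falls roughly with the square root of the number of reps, so going from 1 to 25 reps should cut the noise by about a factor of five. A fixed seed removes the noise between runs entirely, but only by always reporting the same single draw, which is the representativeness concern raised above.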
would close #990