
Updated. GeneticEngine to latest version (2024 competition) #189

Merged: 27 commits merged into cavalab:docker-compose on Oct 14, 2024

Conversation

@alcides commented Oct 1, 2024

This updates GeneticEngine to the latest version.

It also includes three other algorithms:

  • Random Search
  • 1+1
  • HillClimbing

As previously described, these are fundamentally different algorithms, not just different hyperparameter configurations.
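
For illustration, a minimal sketch of the (1+1) / hill-climbing style of search in plain Python (a generic illustration only, not GeneticEngine's actual API):

```python
import random

def one_plus_one(init, mutate, fitness, budget=1000):
    """Generic (1+1) strategy: keep a single parent and replace it
    whenever a mutated copy is at least as good (minimisation)."""
    parent = init()
    parent_fit = fitness(parent)
    for _ in range(budget):
        child = mutate(parent)
        child_fit = fitness(child)
        if child_fit <= parent_fit:
            parent, parent_fit = child, child_fit
    return parent, parent_fit

# Toy usage: minimise x^2 over floats.
best, err = one_plus_one(
    init=lambda: random.uniform(-10, 10),
    mutate=lambda x: x + random.gauss(0, 0.5),
    fitness=lambda x: x * x,
)
```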

@alcides changed the title from "Updated. GeneticEngine to latest version" to "Updated. GeneticEngine to latest version (2024 competition)" on Oct 1, 2024
@gAldeia commented Oct 9, 2024

Hi @alcides,
Thanks for the PR! I’ve been working on setting up GitHub Actions to build your Docker container and test the new features. Sorry for the delay.

It seems you’re using Python 3.12, which isn’t compatible with the specified version of setuptools. Would it make sense to loosen the setuptools and pip pins, so mamba can resolve them based on the fixed Python version?

Also, another change: can you move algorithms/<your method>/regressor.py to experiment/methods/<your method>/regressor.py? We previously stored everything together, but this caused issues with automation scripts copying files, leading to errors from file duplication. The new structure should be cleaner and less error-prone.

Could you try making these changes and committing them to your PR? That will trigger the GitHub Actions workflow.

Thanks again!

@alcides commented Oct 9, 2024

Yes, I will try to address this by tomorrow. Thanks!

@alcides commented Oct 11, 2024

@gAldeia I believe the tests are not passing because the MAX_TIME signal kills the fit method before the model is ready for prediction.

In the GeneticEngine regressor, we set max_time to 8 hours minus 1 minute, which seems to be too long for the CI test. My suggestion would be for the benchmark script to set max_time accordingly. I assumed that's how it works nowadays, but I can't find it. Some regressors adjust max_time based on the number of instances.

Should we standardize on a specific max_time for CI testing, and have it set dynamically for the larger runs?
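
For concreteness, something like the sketch below is what I have in mind on the regressor side (the environment variable name and the GeneticEngineRegressor call are placeholders, not SRBench's actual interface):

```python
import os

def resolve_max_time(default_budget=8 * 60 * 60, margin=60):
    """Pick a max_time that stays slightly under the signal-based timeout.

    SRBENCH_MAX_TIME is a hypothetical variable: CI would export a small
    budget, while full experiments fall back to the 8-hour default.
    """
    budget = int(os.environ.get("SRBENCH_MAX_TIME", default_budget))
    return max(budget - margin, 1)

# e.g. in the regressor configuration (illustrative only):
# est = GeneticEngineRegressor(max_time=resolve_max_time())
```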

@alcides commented Oct 13, 2024

I believe it now passes all necessary tests.

@gAldeia commented Oct 14, 2024

Hi @alcides,

> I believe the tests are not passing because the MAX_TIME signal kills the fit method before the model is ready for prediction.

Thanks for pointing that out! I agree that having a max_time limit for the CI makes sense, but we'll need to implement it in a fault-tolerant way. It should also work with regressors that don't have a max_time attribute. I'll start working on this idea.
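
A rough sketch of that fault-tolerant behaviour (names and the budget value are placeholders, not the eventual CI implementation):

```python
import time

def run_ci_fit(est, X, y, ci_budget=60):
    """Fit an estimator under CI, shrinking max_time only if it exists.

    No kill signal is sent here; the goal is to let the algorithm finish a
    short run so that predict() can be exercised afterwards.
    """
    if hasattr(est, "max_time"):
        est.max_time = ci_budget  # seconds; placeholder value
    start = time.time()
    est.fit(X, y)
    print(f"fit finished in {time.time() - start:.1f}s")
    return est
```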

Also, I appreciate your effort in making your algorithm fully compatible with SRBench, especially as we work on redesigning everything.

One minor issue regarding the Operon change you made: I'm not confident I understood what you were trying to do, so I didn't suggest a way to fix it, but I left a comment on it.

@alcides commented Oct 14, 2024

I think having a recommended default max_time for CI (for those of us integrating with SRBench) would mostly solve the problem. Alternatively, the scripts could define a reasonable max_time (again, for regressors that support it) that is slightly less than the signal-based timeout.

The Operon change has been reverted. Thanks for pointing it out!

@gAldeia commented Oct 14, 2024

Thanks @alcides for fixing it! I'll merge your commit now.


Continuing the discussion, I think I see what you mean.

For CI, we should set a very small max_time (if the algorithm supports it), but be tolerant and avoid sending a kill signal if it takes longer; the goal is just to run the algorithm successfully.

For official SRBench experiments, we specify a max_time but allow a small margin so algorithms can wrap up what they are doing when the time runs out. Also, when running the job on a cluster, the max time should apply to the algorithm's fit method only.
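
As a sketch of that last point, the hard timeout could wrap only the fit call and fire slightly after the algorithm's own max_time (Unix-only signal handling; the margin value is illustrative, not a decided default):

```python
import signal

class FitTimeout(Exception):
    pass

def fit_with_margin(est, X, y, max_time, margin=300):
    """Hard-stop only the fit call, a few minutes after the soft budget.

    The estimator is asked to stop at max_time; the SIGALRM-based kill only
    fires max_time + margin seconds later, so well-behaved algorithms can
    wrap up cleanly.
    """
    if hasattr(est, "max_time"):
        est.max_time = max_time

    def _raise(signum, frame):
        raise FitTimeout(f"fit exceeded {max_time + margin}s")

    previous = signal.signal(signal.SIGALRM, _raise)
    signal.alarm(max_time + margin)
    try:
        est.fit(X, y)
    finally:
        signal.alarm(0)
        signal.signal(signal.SIGALRM, previous)
    return est
```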

@gAldeia merged commit 105888b into cavalab:docker-compose on Oct 14, 2024
17 of 19 checks passed