Releases: ymetz/rlhfblender
Release 0.3.1
Changes:
- New Demo Models without Git LFS dependency (to maximize compatibility)
- Updated Correction Modal
- Updated UI State Logic
- Multi-Config Experiments (i.e. changing UI configs over training)
- Logging of meta events such as submit/reset for reaction time measurements
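As a rough illustration of how logged meta events could feed a reaction time measurement, here is a minimal Python sketch. The event format is an assumption made for this example: the `MetaEvent` class, its `kind` and `timestamp` fields, and the `reaction_times` helper are hypothetical and are not taken from the RLHF-Blender codebase. The sketch also assumes that a reaction time is the interval between a `reset` event and the next `submit` event.

```python
from dataclasses import dataclass


@dataclass
class MetaEvent:
    # Hypothetical event record; the real log schema may differ.
    kind: str         # assumed values: "reset" or "submit"
    timestamp: float  # seconds since the start of the session


def reaction_times(events):
    """Pair each submit with the most recent unmatched reset before it."""
    times = []
    last_reset = None
    for event in sorted(events, key=lambda e: e.timestamp):
        if event.kind == "reset":
            last_reset = event.timestamp
        elif event.kind == "submit" and last_reset is not None:
            times.append(event.timestamp - last_reset)
            last_reset = None  # each reset is matched at most once
    return times


if __name__ == "__main__":
    log = [
        MetaEvent("reset", 0.0),
        MetaEvent("submit", 2.5),
        MetaEvent("reset", 3.0),
        MetaEvent("submit", 4.0),
    ]
    print(reaction_times(log))
```

A submit with no preceding reset is ignored here rather than raising an error; whether that is the right choice depends on how the actual logs are structured.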
Release 0.3.0 - Text Feedback, Experiment Intro, Study URLs & more
Many updates & Code Cleanup:
This update adds text feedback as a new modality and brings a major redesign, including an updated intro modal, color scheme, and controls. Saved setup configurations can now be accessed via custom URLs, which makes deploying custom study setups easy and scalable.
- Text Feedback
- Updated Design
- Setup Saving & Loading with custom URL for deployment
- Massively simplified data generation
- Prototype interface for keyboard shortcuts
- Code restructuring in frontend (reworked state management and increased modularity)
- Updated demo models, compatible with gymnasium & newest StableBaselines3
- New & improved intro modal
- Updated Docs
Next steps:
- Finishing keyboard shortcuts
- Finishing input for multi-modal and text tasks/scenarios
- Reward model implementations
For questions and bug reports, feel free to reach out to @ymetz.
Pre-Release 0.3.0
First pre-release:
- Experiments to collect and log feedback (https://rlhfblender.readthedocs.io/en/latest/guide/quickstart.html)
- Registration of gym environments (https://rlhfblender.readthedocs.io/en/latest/guide/add_new_experiment.html)
- Fully functional user interface
To-dos until full release:
- Reward Modeling components need testing and verification
- User Tracking with Matomo
- Tutorial/Jupyter Notebook to showcase analysis of logged feedback