load bf16 directly, and some "quality of life" handling of fp32/fp16/bf16 precisions #265
Commits on Apr 27, 2024
- 09cd67e: code to load bf16 weights directly, and also re-wire the position of tensors to put the layernorms at the end. the training loop seems to work ok, the tests pass, and the loss and optimization look ok, but the gradients don't match, which can't be right. so there is a bug, but it's a bit too late in the day for me to debug right now; creating a PR and going to sleep, will fix tomorrow
- 09d935c
- d4a642b
- e067a27
- 9d6fd30
- a58b8d5
- 0062707: fix a really bad bug in how i was checking the gradients, where i loaded them in the old order, so yeah...
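The gradient check these commits refer to can be pictured as an element-wise comparison against reference values with a per-tensor tolerance. The following is only a minimal sketch under that assumption: `grads_match` and its parameters are hypothetical names for illustration, not code from this PR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the PR): returns 1 when every element of
   `got` is within `tol` of `expected`, and 0 otherwise. */
int grads_match(const float *got, const float *expected, size_t n, float tol) {
    for (size_t i = 0; i < n; i++) {
        float d = got[i] - expected[i];
        if (d < 0.0f) d = -d;          /* absolute difference */
        if (!(d <= tol)) return 0;     /* written this way so NaN also fails */
    }
    return 1;
}
```

A check like this only means something if both buffers hold the same tensor in the same layout, so reference gradients have to be read in the same tensor order the model uses.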
Commits on Apr 28, 2024
- 9a91b40: bring back original ordering. i also had to bump the thresholds by 3X for some tensors and i don't exactly know why sad
- 82d7907
- 4f7d8d9: allow user to make different precisions, add prints and error handling around precisions
- a3f5ad9: reshuffle the ifdefs to make bf16 the default if no PRECISION is requested via defines
- 9d70d9a: profile and test only use bf16. but the train script can be run with fp32 or bf16 or fp16. fp16 will error, though
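As a rough illustration of "bf16 the default if no PRECISION is requested via defines", a preprocessor chain can put the bf16 case in the final `#else` branch so that it is chosen when nothing else is defined. This is a sketch only: the macro names `PRECISION_FP32` and `PRECISION_FP16`, the `floatX` typedef, and the helper functions are assumptions made for the example, and `uint16_t` stands in for a real 16-bit float type since plain C has none.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical macro names, chosen for illustration only. */
#if defined(PRECISION_FP32)
typedef float floatX;
#define PRECISION_NAME "fp32"
#elif defined(PRECISION_FP16)
typedef uint16_t floatX;    /* 16-bit storage stand-in for a half type */
#define PRECISION_NAME "fp16"
#else
/* Nothing requested via defines: fall through to bf16. */
typedef uint16_t floatX;    /* 16-bit storage stand-in for a bf16 type */
#define PRECISION_NAME "bf16"
#endif

/* Name of the precision selected at compile time. */
const char *precision_name(void) { return PRECISION_NAME; }

/* Size in bytes of one parameter at the selected precision. */
size_t precision_bytes(void) { return sizeof(floatX); }
```

Compiling with no extra flags selects the bf16 branch; passing something like `-DPRECISION_FP32` on the compiler command line would select fp32 in this sketch.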