-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Why Gelu is zero for u8/u32/i64? #878
Comments
might have to be |
How would you implement it in any reasonable way for integers? Getting even close to the standard tanh-derived approximation for GeLU u8/u32/i64 would either involve rough approximation or converting to f32/f64 and converting back, which defeats the point. It would probably be a "good enough" integer approximation for GeLU to just have a step function f(x) where f(x) = 0 if x <= 0, f(x) = x if x > 0. This would be limited to an i64 implementation, obviously. For u8/u32, it would be functionally equivalent to the identity function f(x) = x. Outside of this, I can't think of reasonably close to GeLU for u8/u32/i64. If this approach is amendable to the maintainers, I'd love to put a PR together for it. |
Since gelu itself is not defined as an integer to integer operation (it is a Real to Real or Float to float). This particular implementation is mapped into integers to integers, so a "correct" implementation is not possible. That being said, many implementations still could be possible that rounds the output value. The more important issue is that this is not well documented. If there is a plan to support this but it hasn't been decided yet it should be |
My suggested approximation above was such a simple implementation that I just threw it together real quick. |
candle/candle-core/src/op.rs
Lines 573 to 584 in 5f83c13
The text was updated successfully, but these errors were encountered: