Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why Gelu is zero for u8/u32/i64? #878

Open
npuichigo opened this issue Sep 17, 2023 · 4 comments
Open

Why Gelu is zero for u8/u32/i64? #878

npuichigo opened this issue Sep 17, 2023 · 4 comments

Comments

@npuichigo
Copy link
Contributor

#[inline(always)]
fn u8(_: u8) -> u8 {
0
}
#[inline(always)]
fn u32(_: u32) -> u32 {
0
}
#[inline(always)]
fn i64(_: i64) -> i64 {
0
}

@chris-ha458
Copy link

might have to be todo!

@shaneish
Copy link

shaneish commented Sep 27, 2023

How would you implement it in any reasonable way for integers? Getting even close to the standard tanh-derived approximation for GeLU u8/u32/i64 would either involve rough approximation or converting to f32/f64 and converting back, which defeats the point.

It would probably be a "good enough" integer approximation for GeLU to just have a step function f(x) where f(x) = 0 if x <= 0, f(x) = x if x > 0. This would be limited to an i64 implementation, obviously. For u8/u32, it would be functionally equivalent to the identity function f(x) = x. Outside of this, I can't think of reasonably close to GeLU for u8/u32/i64.

If this approach is amendable to the maintainers, I'd love to put a PR together for it.

@chris-ha458
Copy link

Since gelu itself is not defined as an integer to integer operation (it is a Real to Real or Float to float).

This particular implementation is mapped into integers to integers, so a "correct" implementation is not possible.

That being said, many implementations still could be possible that rounds the output value.

The more important issue is that this is not well documented.
Even if somebody does not intentionally invokes this, (they shouldn't), it could be invoked inadvertantly by type inference or type coercion.

If there is a plan to support this but it hasn't been decided yet it should be todo!
if it is agreed that it should not be invoked, then either the traitbounds can be changed so that invoking it becomes a compile time error or a panic! should be there so it becomes a runtime error (the former is more preferred imo)

@shaneish
Copy link

shaneish commented Sep 27, 2023

My suggested approximation above was such a simple implementation that I just threw it together real quick.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants