Why Gelu is zero for u8/u32/i64? #878

npuichigo · 2023-09-17T14:12:07Z

Lines 573 to 584 in 5f83c13

    
    #[inline(always)]
    fn u8(_: u8) -> u8 {
        0
    }
    #[inline(always)]
    fn u32(_: u32) -> u32 {
        0
    }
    #[inline(always)]
    fn i64(_: i64) -> i64 {
        0
    }

chris-ha458 · 2023-09-20T14:05:03Z

These might have to be todo!() instead.

shaneish · 2023-09-27T00:37:03Z

How would you implement it in any reasonable way for integers? Getting even close to the standard tanh-derived approximation of GeLU for u8/u32/i64 would involve either a rough approximation or converting to f32/f64 and back, which defeats the point.
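For reference, here is a minimal sketch of the round-trip option mentioned above, assuming the usual tanh approximation of GeLU. The function names `gelu_f64` and `gelu_i64_roundtrip` are hypothetical and are not part of the crate.

```rust
/// GeLU via the common tanh approximation, computed in f64.
fn gelu_f64(x: f64) -> f64 {
    let c = (2.0 / std::f64::consts::PI).sqrt();
    0.5 * x * (1.0 + (c * (x + 0.044715 * x * x * x)).tanh())
}

/// Hypothetical integer wrapper: widen to f64, apply GeLU, round to the
/// nearest integer, and narrow back. Note that `x as f64` is lossy for
/// magnitudes above 2^53.
fn gelu_i64_roundtrip(x: i64) -> i64 {
    gelu_f64(x as f64).round() as i64
}
```

At integer inputs the float result stays within 0.5 of max(0, x), so after rounding this sketch returns x for positive inputs and 0 otherwise.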

A "good enough" integer approximation of GeLU would probably be a simple piecewise function (essentially ReLU): f(x) = 0 if x <= 0, f(x) = x if x > 0. The clamp only matters for the i64 implementation, obviously; for u8/u32 it is functionally equivalent to the identity function f(x) = x. Beyond that, I can't think of anything reasonably close to GeLU for u8/u32/i64.

If this approach is amenable to the maintainers, I'd love to put a PR together for it.
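The proposal above could be sketched roughly as follows. The `gelu_like_*` names are hypothetical, chosen only for illustration; this is not necessarily the code in the linked PR.

```rust
/// i64: clamp negatives to zero, pass positives through unchanged.
fn gelu_like_i64(x: i64) -> i64 {
    x.max(0)
}

/// u8: unsigned values are never negative, so this is the identity.
fn gelu_like_u8(x: u8) -> u8 {
    x
}

/// u32: same reasoning as u8.
fn gelu_like_u32(x: u32) -> u32 {
    x
}
```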

chris-ha458 · 2023-09-27T00:58:17Z

gelu itself is not defined as an integer-to-integer operation (it is real-to-real, or float-to-float).

This particular implementation maps integers to integers, so a "correct" implementation is not possible.

That being said, many implementations that round the output value are still possible.

The more important issue is that this is not well documented.
Even if nobody invokes this intentionally (and they shouldn't), it could be invoked inadvertently through type inference or type coercion.

If there is a plan to support this but the details haven't been decided yet, it should be todo!
If it is agreed that it should not be invoked, then either the trait bounds can be changed so that invoking it becomes a compile-time error, or a panic! should be added so that it becomes a runtime error (I prefer the former).
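To illustrate the compile-time option, here is a minimal sketch assuming a hypothetical `FloatElem` trait that is implemented only for float types. It does not reflect how the crate's unary-op traits are actually laid out.

```rust
/// Hypothetical trait: only float element types implement it.
trait FloatElem: Copy {
    fn gelu(self) -> Self;
}

impl FloatElem for f32 {
    fn gelu(self) -> Self {
        let c = (2.0f32 / std::f32::consts::PI).sqrt();
        0.5 * self * (1.0 + (c * (self + 0.044715 * self * self * self)).tanh())
    }
}

impl FloatElem for f64 {
    fn gelu(self) -> Self {
        let c = (2.0f64 / std::f64::consts::PI).sqrt();
        0.5 * self * (1.0 + (c * (self + 0.044715 * self * self * self)).tanh())
    }
}

/// Applies GeLU elementwise; the bound rejects integer slices at compile time.
fn gelu_all<T: FloatElem>(xs: &[T]) -> Vec<T> {
    xs.iter().map(|&x| x.gelu()).collect()
}

// gelu_all(&[1u8, 2, 3]); // does not compile: `u8` does not implement `FloatElem`
```

The runtime alternative would keep the integer methods and put a panic! (or todo!) in their bodies, which defers the error until the method is actually called.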

shaneish · 2023-09-27T01:09:19Z

My suggested approximation above was such a simple implementation that I just threw it together real quick.

shaneish mentioned this issue Sep 27, 2023

GeLU implementation for u8/u32/i64 #970

Open
