Replies: 2 comments 1 reply
-
There are several questions here.

Why do we use f32 as the default scalar type parameter instead of f64? On a 64-bit machine, there is essentially no difference between the performance cost of f32 and f64 operations. However, using f32 reduces the size (typically by half) of most math and geometry types, which improves cache locality and, in theory, performance wherever memory bandwidth is the bottleneck. The other consideration is that I want the numeric behaviour and stability of my code to match the GPU side in cases where you really care about error tolerance (for example, handling self-intersection in a raytracer): you can count on having f32 on the GPU, but not f64. That will be helpful if you migrate your code to the GPU later. Of course, maybe we should be able to change the default scalar type through a cargo feature in the future.

SIMD plan? Yes, I do have a SIMD optimization plan. The current plan is to create custom "wide" types using std simd and implement a separate set of computation methods on them. Each "wide" (vector) version is paired with its scalar version. This is a good reference: https://github.com/fu5ha/ultraviolet. The current scalar math/geom types could also get some limited SIMD optimization, but I believe much of that is already covered by the compiler's auto-vectorization. Low priority, I think.

Multi-threading? No. At the primitive level, it's impossible and meaningless.
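To make the two points above concrete, here is a minimal sketch of a scalar-generic vector with `f32` as the default type parameter, paired with a "wide" structure-of-arrays counterpart. All names here (`Vec3`, `Vec3x4`, `scalar_sizes`, `from_scalars`) are hypothetical and not taken from the crate, and plain `[f32; 4]` arrays stand in for real SIMD registers so that the sketch builds on stable Rust.

```rust
use std::mem::size_of;
use std::ops::{Add, Mul};

/// Hypothetical scalar-generic vector; `f32` is the default scalar type.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Vec3<T = f32> {
    pub x: T,
    pub y: T,
    pub z: T,
}

impl<T: Copy + Add<Output = T> + Mul<Output = T>> Vec3<T> {
    pub fn new(x: T, y: T, z: T) -> Self {
        Self { x, y, z }
    }

    /// Dot product on a single (scalar) vector.
    pub fn dot(self, rhs: Self) -> T {
        self.x * rhs.x + self.y * rhs.y + self.z * rhs.z
    }
}

/// Byte sizes of the f32 and f64 instantiations, in that order.
pub fn scalar_sizes() -> (usize, usize) {
    (size_of::<Vec3<f32>>(), size_of::<Vec3<f64>>())
}

/// Hypothetical "wide" counterpart: four `Vec3<f32>` values stored lane-wise.
/// The arrays are an assumption standing in for a SIMD register type.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Vec3x4 {
    pub x: [f32; 4],
    pub y: [f32; 4],
    pub z: [f32; 4],
}

impl Vec3x4 {
    /// Transpose four scalar vectors into lane-wise storage.
    pub fn from_scalars(v: [Vec3; 4]) -> Self {
        Self {
            x: v.map(|p| p.x),
            y: v.map(|p| p.y),
            z: v.map(|p| p.z),
        }
    }

    /// Four dot products at once, one per lane.
    pub fn dot(self, rhs: Self) -> [f32; 4] {
        let mut out = [0.0f32; 4];
        for i in 0..4 {
            out[i] = self.x[i] * rhs.x[i] + self.y[i] * rhs.y[i] + self.z[i] * rhs.z[i];
        }
        out
    }
}
```

With this layout, `scalar_sizes()` reports 12 bytes for `Vec3<f32>` and 24 bytes for `Vec3<f64>`, which is the halving mentioned above. The lane-wise loop in `Vec3x4::dot` is the kind of code a SIMD type would replace with a few vector instructions.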
-
Cool~ For the SIMD part, I think exposing the types and usage the way ultraviolet does is kind of weird... I mean, having to spell out Vec3x8 every time and fill it with data like an array before ultraviolet can optimize it with SIMD is not very friendly. In my mind, the user should not have to care about any of this: just define a Vec3 or Mat4, and when the scalar type is not wide (such as float32 or int32), the library should figure it out internally and use SIMD to speed things up. But ultraviolet could be a good reference, of course~ Finally, for multithreading, I'm just wondering if we can use Rayon (https://docs.rs/rayon/latest/rayon/) to speed up the for loops~ It seems pretty handy.
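As a rough illustration of what speeding up a for loop across cores could look like, here is a sketch that normalizes a batch of vectors in parallel. It deliberately uses `std::thread::scope` from the standard library instead of Rayon so that it has no external dependencies. With Rayon, the manual chunking would presumably collapse into a `par_iter_mut().for_each(...)` call. The function name `normalize_all` and the `[f32; 3]` representation are assumptions made for the example, not part of the crate.

```rust
use std::thread;

/// Hypothetical helper: normalize many 3D vectors in place, split across threads.
/// Uses scoped std threads as a stand-in for a Rayon parallel iterator.
pub fn normalize_all(points: &mut [[f32; 3]], threads: usize) {
    let threads = threads.max(1);
    // Ceiling division so every element lands in a chunk; clamp to at least 1
    // because `chunks_mut` panics on a chunk size of zero.
    let chunk_len = ((points.len() + threads - 1) / threads).max(1);

    thread::scope(|s| {
        for chunk in points.chunks_mut(chunk_len) {
            // Each thread owns a disjoint mutable sub-slice, so no locking is needed.
            s.spawn(move || {
                for p in chunk {
                    let len = (p[0] * p[0] + p[1] * p[1] + p[2] * p[2]).sqrt();
                    if len > 0.0 {
                        p[0] /= len;
                        p[1] /= len;
                        p[2] /= len;
                    }
                }
            });
        }
    });
}
```

The parallelism here applies to a batch of values rather than to a single primitive operation, and it only pays off when the batch is large enough to outweigh the cost of spawning threads.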
-
Starting from scratch, beginning with a review of the math crate.
Most of the calculations should be f32, and most platforms are x64 now.
Could we consider adding SIMD to speed things up?
Not quite sure whether multicore could be applied here; I didn't see many for loops...