Use FFM API instead of Unsafe #172

Closed
wants to merge 44 commits into from

wendigo
Contributor

@wendigo wendigo commented May 11, 2024

This is a raw version; I will follow up with cleanup and refactoring.

@wendigo
Contributor Author

wendigo commented May 12, 2024

cc @dain

@wendigo
Contributor Author

wendigo commented May 13, 2024

MemoryCopyBenchmark.copy      ARRAY_COPY          0  thrpt   10  474491681.129 ±  3265306.927  ops/s
MemoryCopyBenchmark.copy      ARRAY_COPY         32  thrpt   10  285042279.681 ±  2462686.492  ops/s
MemoryCopyBenchmark.copy      ARRAY_COPY        128  thrpt   10  159996540.251 ± 15054813.264  ops/s
MemoryCopyBenchmark.copy      ARRAY_COPY        512  thrpt   10   72404594.618 ±   489402.830  ops/s
MemoryCopyBenchmark.copy      ARRAY_COPY       1024  thrpt   10   47357057.119 ±  1201945.213  ops/s
MemoryCopyBenchmark.copy      ARRAY_COPY    1048576  thrpt   10      41088.703 ±     4961.142  ops/s
MemoryCopyBenchmark.copy      ARRAY_COPY  134217728  thrpt   10        312.411 ±       12.321  ops/s
MemoryCopyBenchmark.copy           SLICE          0  thrpt   10  295805908.140 ±  2085091.805  ops/s
MemoryCopyBenchmark.copy           SLICE         32  thrpt   10  189399971.406 ±  3461827.652  ops/s
MemoryCopyBenchmark.copy           SLICE        128  thrpt   10  140171047.804 ±   977496.348  ops/s
MemoryCopyBenchmark.copy           SLICE        512  thrpt   10   76250723.985 ±  2047258.969  ops/s
MemoryCopyBenchmark.copy           SLICE       1024  thrpt   10   49895982.908 ±   543046.942  ops/s
MemoryCopyBenchmark.copy           SLICE    1048576  thrpt   10      55195.630 ±     2027.515  ops/s
MemoryCopyBenchmark.copy           SLICE  134217728  thrpt   10        226.097 ±        4.955  ops/s
MemoryCopyBenchmark.copy     CUSTOM_LOOP          0  thrpt   10  541104779.380 ±  4493537.557  ops/s
MemoryCopyBenchmark.copy     CUSTOM_LOOP         32  thrpt   10  265335751.007 ±  2960040.019  ops/s
MemoryCopyBenchmark.copy     CUSTOM_LOOP        128  thrpt   10  127626464.503 ± 15330290.165  ops/s
MemoryCopyBenchmark.copy     CUSTOM_LOOP        512  thrpt   10   52954182.844 ±  1601716.854  ops/s
MemoryCopyBenchmark.copy     CUSTOM_LOOP       1024  thrpt   10   31604216.249 ±   248654.945  ops/s
MemoryCopyBenchmark.copy     CUSTOM_LOOP    1048576  thrpt   10      26121.004 ±     2754.518  ops/s
MemoryCopyBenchmark.copy     CUSTOM_LOOP  134217728  thrpt   10        198.818 ±        3.548  ops/s
MemoryCopyBenchmark.copy          UNSAFE          0  thrpt   10  331847053.122 ±   539142.963  ops/s
MemoryCopyBenchmark.copy          UNSAFE         32  thrpt   10  187917069.206 ±  7990464.460  ops/s
MemoryCopyBenchmark.copy          UNSAFE        128  thrpt   10  133080275.413 ±   589945.484  ops/s
MemoryCopyBenchmark.copy          UNSAFE        512  thrpt   10   76809746.683 ±   565967.987  ops/s
MemoryCopyBenchmark.copy          UNSAFE       1024  thrpt   10   49797502.548 ±   480430.134  ops/s
MemoryCopyBenchmark.copy          UNSAFE    1048576  thrpt   10      52984.040 ±     3870.650  ops/s
MemoryCopyBenchmark.copy          UNSAFE  134217728  thrpt   10        221.023 ±       22.489  ops/s
MemoryCopyBenchmark.copy             FFM          0  thrpt   10  311388867.857 ±  1783603.147  ops/s
MemoryCopyBenchmark.copy             FFM         32  thrpt   10  212324332.813 ±  1894856.373  ops/s
MemoryCopyBenchmark.copy             FFM        128  thrpt   10  149091866.397 ± 17204132.219  ops/s
MemoryCopyBenchmark.copy             FFM        512  thrpt   10   72916935.083 ±   482069.949  ops/s
MemoryCopyBenchmark.copy             FFM       1024  thrpt   10   47854814.809 ±   350433.072  ops/s
MemoryCopyBenchmark.copy             FFM    1048576  thrpt   10      42325.853 ±     2703.085  ops/s
MemoryCopyBenchmark.copy             FFM  134217728  thrpt   10        310.401 ±       19.358  ops/s
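
For readers who want a concrete picture of what some of these variants might look like, here is a minimal sketch of three copy strategies over heap byte arrays. It is not the code from `MemoryCopyBenchmark`: the class name `CopySketch` and the method names are hypothetical, the mapping to the ARRAY_COPY, CUSTOM_LOOP, and FFM labels is an assumption, and the SLICE and UNSAFE variants are left out. It assumes Java 22 or later, where `java.lang.foreign` is a final API.

```java
import java.lang.foreign.MemorySegment;
import java.util.Arrays;

// Hypothetical illustration only; not the benchmark code from this PR.
public class CopySketch {
    // Assumed ARRAY_COPY-style variant: delegate to the JVM's array copy.
    static void copyWithArrayCopy(byte[] src, int srcOff, byte[] dst, int dstOff, int len) {
        System.arraycopy(src, srcOff, dst, dstOff, len);
    }

    // Assumed CUSTOM_LOOP-style variant: plain byte-by-byte loop.
    static void copyWithLoop(byte[] src, int srcOff, byte[] dst, int dstOff, int len) {
        for (int i = 0; i < len; i++) {
            dst[dstOff + i] = src[srcOff + i];
        }
    }

    // Assumed FFM-style variant: wrap both arrays as heap segments and
    // use the bulk MemorySegment.copy (offsets and length are in bytes).
    static void copyWithFfm(byte[] src, int srcOff, byte[] dst, int dstOff, int len) {
        MemorySegment.copy(
                MemorySegment.ofArray(src), srcOff,
                MemorySegment.ofArray(dst), dstOff,
                len);
    }

    public static void main(String[] args) {
        byte[] src = new byte[1024];
        for (int i = 0; i < src.length; i++) {
            src[i] = (byte) i;
        }
        byte[] a = new byte[1024];
        byte[] b = new byte[1024];
        byte[] c = new byte[1024];
        copyWithArrayCopy(src, 0, a, 0, src.length);
        copyWithLoop(src, 0, b, 0, src.length);
        copyWithFfm(src, 0, c, 0, src.length);
        // All three strategies should produce identical destination arrays.
        System.out.println(Arrays.equals(a, b) && Arrays.equals(b, c));
    }
}
```

A sketch like this only shows that the strategies are functionally equivalent; the throughput differences in the table above can only be reproduced with a proper JMH harness.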

@wendigo
Contributor Author

wendigo commented May 13, 2024

I got it working, but I'm worried about performance. @dain, can you take a look?
