go1.24
nikolaydubina committed Feb 18, 2025
1 parent 4762126 commit 808efaf
Showing 2 changed files with 72 additions and 85 deletions.
8 changes: 1 addition & 7 deletions 07-09_lru_test.go
@@ -72,8 +72,6 @@ func (m *LRUCacheBasic) Hit(i uint8) {
 func (m *LRUCacheBasic) LeastRecentlyUsed() uint8 { return m.vals[len(m.vals)-1] }
 
 func BenchmarkLRU(b *testing.B) {
-	var out uint8
-
 	vs := []struct {
 		name string
 		f    interface {
@@ -93,13 +91,9 @@ func BenchmarkLRU(b *testing.B) {
 				v.f.Hit(x)
 
 				if (i % 1000) == 0 {
-					out = v.f.LeastRecentlyUsed()
+					v.f.LeastRecentlyUsed()
 				}
 			}
 		})
 	}
-
-	if (out*2 - out - out) != 0 {
-		b.Fatal("never")
-	}
 }
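
For readers without the full file, the two implementations compared in `BenchmarkLRU` can be sketched as follows. This is a minimal sketch, not the repository's code: `lruBasic` is a hypothetical move-to-front slice that is consistent with the `LeastRecentlyUsed` shown in the diff (last element of `vals`), and `lruMatrix` assumes that `LRUCache` follows the reference-matrix method from section 7-9 of "Hacker's Delight", packed here into a `uint64` for 8 units. All type names, constructors, and the test sequence are illustrative assumptions.

```go
package main

import "fmt"

// lruBasic is a hypothetical stand-in for LRUCacheBasic: a slice ordered
// from most recently used (index 0) to least recently used (last index).
type lruBasic struct{ vals []uint8 }

func newLRUBasic() *lruBasic {
	return &lruBasic{vals: []uint8{0, 1, 2, 3, 4, 5, 6, 7}}
}

// Hit moves unit i to the front, shifting the more recent entries back by one.
func (m *lruBasic) Hit(i uint8) {
	for j, v := range m.vals {
		if v == i {
			copy(m.vals[1:j+1], m.vals[:j])
			m.vals[0] = i
			return
		}
	}
}

func (m *lruBasic) LeastRecentlyUsed() uint8 { return m.vals[len(m.vals)-1] }

// lruMatrix is an assumed reference-matrix LRU for 8 units.
// Row i of the 8x8 bit matrix lives in bits [8i, 8i+8) of m.
type lruMatrix struct{ m uint64 }

// newLRUMatrix hits units 7..0 so the initial order matches newLRUBasic.
func newLRUMatrix() *lruMatrix {
	c := &lruMatrix{}
	for i := 7; i >= 0; i-- {
		c.Hit(uint8(i))
	}
	return c
}

// Hit sets row i to all ones, then clears column i in every row.
func (c *lruMatrix) Hit(i uint8) {
	k := uint(i & 7)
	c.m |= uint64(0xFF) << (8 * k)
	c.m &^= uint64(0x0101010101010101) << k
}

// LeastRecentlyUsed returns the unit whose row is all zeros.
func (c *lruMatrix) LeastRecentlyUsed() uint8 {
	for i := uint(0); i < 8; i++ {
		if (c.m>>(8*i))&0xFF == 0 {
			return uint8(i)
		}
	}
	return 0 // not reached once every unit has been hit at least once
}

func main() {
	a, b := newLRUBasic(), newLRUMatrix()
	for _, x := range []uint8{3, 7, 1, 3, 0} {
		a.Hit(x)
		b.Hit(x)
	}
	fmt.Println(a.LeastRecentlyUsed(), b.LeastRecentlyUsed())
}
```

In this sketch the matrix variant updates its state with two bitwise operations per `Hit`, whereas the slice variant searches and shifts elements.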
149 changes: 71 additions & 78 deletions README.md
@@ -26,21 +26,13 @@ An interactive Go showcase of ["Hacker's Delight"](https://en.wikipedia.org/wiki

### Observations

* native `Abs` performance is the same
* native `Div` and `Mod` by small constants performance is better
* native `math/bits.Mul32` and `math/bits.Mul64`[^3] performance is the same
* native `math/bits.LeadingZeros` performance is better than `LeadingZeroes` algorithm
* native `math.Sqrt` performance is better
* native `math.Pow(x, 1./3)`[^4] performance is worse than `Cbrt` algorithm 💥
* native `math.Pow(x, 1./3)`[^4][^5] underflows but `Cbrt` algorithm is correct 💥
* native `math.Pow(x, n)`[^4] performance is worse than `Pow` algorithm 💥
* native `math.Log2(x)`[^4] performance is worse than `Log2` algorithm 💥
* native `math.Log10(x)`[^4] performance is worse than `Log10` algorithm 💥
* native `math.Log10(x)` [^4][^5] overflows for `math.MaxUint64` but `Log10` algorithm is correct 💥
* native `1/math.Sqrt(x)` performance is `1.5x` better than `RSqrtFloat32` algorithm
* native `switch` performance is better than `CycleThreeValues` algorithm
* native `crc32.Checksum` performance is `30x`~`500x` better than `CRC32` algorithms
* simplistic `LRUCache` performance is `3x` worse than `LRUCache` algorithm 💥
* simplistic `LRUCache` performance is `2x` worse than `LRUCache` algorithm 💥

<details><summary>Appendix: Benchmarks</summary>

@@ -49,81 +41,82 @@ $ go test -bench .
goos: darwin
goarch: arm64
pkg: github.com/nikolaydubina/go-hackers-delight
cpu: Apple M3 Max
BenchmarkNoop/---------------------------------16 1000000000 0.0000001 ns/op
BenchmarkAbs/basic-16 1000000000 0.9826 ns/op
BenchmarkAbs/Abs-16 1000000000 0.9647 ns/op
BenchmarkAbs/Abs2-16 1000000000 0.9943 ns/op
BenchmarkAbs/Abs3-16 1000000000 0.9819 ns/op
BenchmarkAbs/Abs4-16 1000000000 1.003 ns/op
BenchmarkAbs/AbsFastMul-16 1000000000 0.9598 ns/op
BenchmarkAvg/basic-16 973716225 2.045 ns/op
BenchmarkAvg/AvgFloor-16 602586224 2.050 ns/op
BenchmarkAvg/AvgCeil-16 582029594 2.054 ns/op
BenchmarkCycleThree/basic-16 767160418 1.560 ns/op
BenchmarkCycleThree/CycleThreeValues-16 438818894 2.729 ns/op
BenchmarkLeadingZeros/uint32/basic-16 1000000000 0.9419 ns/op
BenchmarkLeadingZeros/uint32/LeadingZerosUint32-16 1000000000 1.124 ns/op
BenchmarkLeadingZeros/uint64/basic-16 1000000000 0.9230 ns/op
BenchmarkLeadingZeros/uint64/LeadingZerosUint64-16 898095195 1.336 ns/op
BenchmarkCompress/Compress-16 100000000 10.60 ns/op
BenchmarkCompress/Compress2-16 55584826 21.52 ns/op
BenchmarkLRU/basic-16 246358870 4.870 ns/op
BenchmarkLRU/LRUCache-16 960896830 1.239 ns/op
BenchmarkMul/uint32/basic-16 593555838 1.892 ns/op
BenchmarkMul/uint32/MultiplyHighOrder32-16 951445552 2.046 ns/op
BenchmarkMul/uint64/basic-16 977065424 1.220 ns/op
BenchmarkMul/uint64/MultiplyHighOrder64-16 675693746 2.042 ns/op
BenchmarkDivMod/DivMod/3/basic-16 1000000000 0.8500 ns/op
BenchmarkDivMod/DivMod/3/DivMod3Signed-16 605588445 1.970 ns/op
BenchmarkDivMod/DivMod/3/DivMod3Signed2-16 1000000000 1.078 ns/op
BenchmarkDivMod/DivMod/7/basic-16 1000000000 0.8311 ns/op
BenchmarkDivMod/DivMod/7/DivMod7Signed-16 582087586 2.105 ns/op
BenchmarkDivMod/Div/3/basic-16 1000000000 0.8325 ns/op
BenchmarkDivMod/Div/3/Div3Signed-16 793883130 1.509 ns/op
BenchmarkDivMod/Div/3/Div3ShiftSigned-16 907116610 1.320 ns/op
BenchmarkDivMod/Div/7/basic-16 1000000000 0.8344 ns/op
BenchmarkDivMod/Div/7/Div7Signed-16 755509315 1.590 ns/op
BenchmarkDivMod/Div/7/Div7ShiftSigned-16 841563656 1.424 ns/op
BenchmarkDivMod/Mod/3/basic-16 1000000000 0.8309 ns/op
BenchmarkDivMod/Mod/3/Mod3Signed-16 812136249 1.466 ns/op
BenchmarkDivMod/Mod/3/Mod3Signed2-16 1000000000 0.8410 ns/op
BenchmarkDivMod/Mod/7/basic-16 1000000000 0.8332 ns/op
BenchmarkDivMod/Mod/7/Mod7Signed-16 766677633 1.564 ns/op
BenchmarkDivMod/Mod/7/Mod7Signed2-16 1000000000 1.095 ns/op
BenchmarkDivMod/Mod/10/basic-16 1000000000 0.8318 ns/op
BenchmarkDivMod/Mod/10/Mod10Signed-16 868932930 1.441 ns/op
BenchmarkDivMod/DivExact/7/basic-16 1000000000 0.9247 ns/op
BenchmarkDivMod/DivExact/7/DivExact7-16 1000000000 0.9238 ns/op
BenchmarkDivMod/DivExact/7/Div7Signed-16 718667949 1.668 ns/op
BenchmarkDivMod/DivExact/7/Div7ShiftSigned-16 802988229 1.490 ns/op
BenchmarkCbrt/basic-16 47340079 26.01 ns/op
BenchmarkCbrt/Cbrt-16 85196262 14.55 ns/op
BenchmarkPow/basic-16 24005180 48.25 ns/op
BenchmarkPow/Pow-16 65121390 19.08 ns/op
BenchmarkLog/uint32/2/basic-16 99810775 12.06 ns/op
BenchmarkLog/uint32/2/Log2-16 984283590 1.223 ns/op
BenchmarkLog/uint32/10/basic-16 140540709 8.516 ns/op
BenchmarkLog/uint32/10/Log10-16 539441811 2.220 ns/op
BenchmarkLog/uint64/2/basic-16 100000000 11.73 ns/op
BenchmarkLog/uint64/2/Log2-16 839779903 1.419 ns/op
BenchmarkLog/uint64/10/basic-16 142679388 8.419 ns/op
BenchmarkLog/uint64/10/Log10-16 538269764 2.228 ns/op
BenchmarkSqrt/basic-16 1000000000 1.019 ns/op
BenchmarkSqrt/SqrtNewton-16 188142513 6.538 ns/op
BenchmarkSqrt/SqrtBinarySearch-16 74752382 15.71 ns/op
BenchmarkSqrt/SqrtShiftAndSubtract-16 136426688 8.834 ns/op
BenchmarkCRC32/basic-16 220094371 5.395 ns/op
BenchmarkCRC32/CRC32Basic-16 448540 2510 ns/op
BenchmarkCRC32/CRC32TableLookup-16 8108901 147.5 ns/op
BenchmarkRSqrtFloat32/basic-16 1000000000 0.9354 ns/op
BenchmarkRSqrtFloat32/RSqrtFloat32-16 828149971 1.448 ns/op
BenchmarkAbs/basic-16 1489353 811.6 ns/op
BenchmarkAbs/Abs-16 1460946 818.8 ns/op
BenchmarkAbs/Abs2-16 1464883 816.4 ns/op
BenchmarkAbs/Abs3-16 1443763 830.9 ns/op
BenchmarkAbs/Abs4-16 1431141 882.4 ns/op
BenchmarkAbs/AbsFastMul-16 1494900 803.1 ns/op
BenchmarkAvg/basic-16 611107 1970 ns/op
BenchmarkAvg/AvgFloor-16 811885 1603 ns/op
BenchmarkAvg/AvgCeil-16 609648 1916 ns/op
BenchmarkCycleThree/basic-16 1000000000 1.105 ns/op
BenchmarkCycleThree/CycleThreeValues-16 1000000000 1.000 ns/op
BenchmarkLeadingZeros/uint32/basic-16 1474354 816.2 ns/op
BenchmarkLeadingZeros/uint32/LeadingZerosUint32-16 1000000 1174 ns/op
BenchmarkLeadingZeros/uint64/basic-16 1483987 808.8 ns/op
BenchmarkLeadingZeros/uint64/LeadingZerosUint64-16 892615 1349 ns/op
BenchmarkCompress/Compress-16 150361084 8.064 ns/op
BenchmarkCompress/Compress2-16 54269787 21.47 ns/op
BenchmarkLRU/basic-16 447112422 2.689 ns/op
BenchmarkLRU/LRUCache-16 625848724 1.878 ns/op
BenchmarkMul/uint32/basic-16 610756 1963 ns/op
BenchmarkMul/uint32/MultiplyHighOrder32-16 590781 1851 ns/op
BenchmarkMul/uint64/basic-16 109549 11017 ns/op
BenchmarkMul/uint64/MultiplyHighOrder64-16 71486 19700 ns/op
BenchmarkDivMod/DivMod/3/basic-16 1483515 809.9 ns/op
BenchmarkDivMod/DivMod/3/DivMod3Signed-16 630376 1903 ns/op
BenchmarkDivMod/DivMod/3/DivMod3Signed2-16 1000000 1078 ns/op
BenchmarkDivMod/DivMod/7/basic-16 1470925 813.2 ns/op
BenchmarkDivMod/DivMod/7/DivMod7Signed-16 617737 1977 ns/op
BenchmarkDivMod/Div/3/basic-16 1474382 816.0 ns/op
BenchmarkDivMod/Div/3/Div3Signed-16 855132 1387 ns/op
BenchmarkDivMod/Div/3/Div3ShiftSigned-16 958530 1245 ns/op
BenchmarkDivMod/Div/7/basic-16 1480100 808.9 ns/op
BenchmarkDivMod/Div/7/Div7Signed-16 831228 1438 ns/op
BenchmarkDivMod/Div/7/Div7ShiftSigned-16 890666 1347 ns/op
BenchmarkDivMod/Mod/3/basic-16 1494051 807.6 ns/op
BenchmarkDivMod/Mod/3/Mod3Signed-16 896326 1306 ns/op
BenchmarkDivMod/Mod/3/Mod3Signed2-16 1491132 804.5 ns/op
BenchmarkDivMod/Mod/7/basic-16 1489940 804.5 ns/op
BenchmarkDivMod/Mod/7/Mod7Signed-16 865971 1394 ns/op
BenchmarkDivMod/Mod/7/Mod7Signed2-16 1000000 1108 ns/op
BenchmarkDivMod/Mod/10/basic-16 1470753 816.9 ns/op
BenchmarkDivMod/Mod/10/Mod10Signed-16 1000000 1113 ns/op
BenchmarkDivMod/DivExact/7/basic-16 1492069 803.8 ns/op
BenchmarkDivMod/DivExact/7/DivExact7-16 1452621 820.2 ns/op
BenchmarkDivMod/DivExact/7/Div7Signed-16 763095 1558 ns/op
BenchmarkDivMod/DivExact/7/Div7ShiftSigned-16 869890 1382 ns/op
BenchmarkCbrt/basic-16 4669 260382 ns/op
BenchmarkCbrt/Cbrt-16 7956 150951 ns/op
BenchmarkPow/basic-16 2311 520212 ns/op
BenchmarkPow/Pow-16 6334 188887 ns/op
BenchmarkLog/uint32/2/basic-16 10000 120539 ns/op
BenchmarkLog/uint32/2/Log2-16 96388 12463 ns/op
BenchmarkLog/uint32/10/basic-16 13887 86154 ns/op
BenchmarkLog/uint32/10/Log10-16 54652 21962 ns/op
BenchmarkLog/uint64/2/basic-16 10000 120627 ns/op
BenchmarkLog/uint64/2/Log2-16 81048 15237 ns/op
BenchmarkLog/uint64/10/basic-16 14146 84683 ns/op
BenchmarkLog/uint64/10/Log10-16 51010 23544 ns/op
BenchmarkSqrt/basic-16 134250 8954 ns/op
BenchmarkSqrt/SqrtNewton-16 18208 63912 ns/op
BenchmarkSqrt/SqrtBinarySearch-16 7302 168653 ns/op
BenchmarkSqrt/SqrtShiftAndSubtract-16 10000 114464 ns/op
BenchmarkCRC32/basic-16 181860 6610 ns/op
BenchmarkCRC32/CRC32Basic-16 434 2778007 ns/op
BenchmarkCRC32/CRC32TableLookup-16 7068 168972 ns/op
BenchmarkRSqrtFloat32/basic-16 139954 8663 ns/op
BenchmarkRSqrtFloat32/RSqrtFloat32-16 99153 12287 ns/op
PASS
ok github.com/nikolaydubina/go-hackers-delight 91.183s
ok github.com/nikolaydubina/go-hackers-delight 83.522s

```
</details>

[^1]: showcase in `C` https://github.com/hcs0/Hackers-Delight
[^2]: showcase in `Rust` https://github.com/victoryang00/Delight
[^3]: given manual inlining of generic type, which produces equivalent Go code
[^4]: we are comparing native float64 result converted to uint32, as there is no better standard function
[^5]: which is due to `float64` not having enough precision
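
Footnotes 4 and 5 contrast a `float64` result converted to an integer with a pure integer algorithm. As an illustration, a bit-by-bit integer cube root in the style of the book's `icbrt` routine might look like the sketch below. The function name `icbrt` and the use of `uint64` intermediates are assumptions made here for clarity; this is not the repository's `Cbrt`.

```go
package main

import "fmt"

// icbrt returns floor(cbrt(x)) using only shifts, adds, and compares.
// It builds the root one bit at a time, from the highest bit down.
func icbrt(x uint32) uint32 {
	r := uint64(x) // remainder still to be accounted for
	var y uint64   // root accumulated so far
	for s := 30; s >= 0; s -= 3 {
		y *= 2
		// b is the amount the cube grows if the next root bit is set:
		// (y+1)^3 - y^3 = 3y(y+1) + 1, scaled to the current bit position.
		b := (3*y*(y+1) + 1) << uint(s)
		if r >= b {
			r -= b
			y++
		}
	}
	return uint32(y)
}

func main() {
	for _, x := range []uint32{26, 27, 64, 1000} {
		fmt.Println(x, icbrt(x))
	}
}
```

Because every step is exact integer arithmetic, perfect cubes map to their exact roots, with no dependence on how a floating-point result happens to round before truncation.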
