Sparse in-mem HNSW graph #4888

Open. Wants to merge 1 commit into base: master
ray6080 (Contributor) commented Feb 12, 2025

Description

Introduce SparseInMemHNSWGraph, a more space-efficient graph that serves as the upper layer of InMemHNSWIndex during index construction.
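
To illustrate why a sparse representation can save space for the upper layer, here is a minimal sketch. It is not the code in this PR: the class name `SparseUpperLayerSketch`, the `offset_t` alias, and the hash-map layout are all assumptions made for illustration. The idea is that only a small fraction of nodes is typically promoted to the upper layer of an HNSW index, so keying neighbor lists by node offset avoids reserving a fixed-size slot for every node in the table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical sketch only; names and layout are assumptions, not the PR's API.
using offset_t = std::uint64_t;

class SparseUpperLayerSketch {
public:
    explicit SparseUpperLayerSketch(std::size_t maxDegree) : maxDegree{maxDegree} {
        assert(maxDegree > 0);
    }

    // Adds a directed edge src -> dst. Storage for src is created lazily,
    // so nodes that never appear in the upper layer cost nothing.
    // Returns false when src already holds maxDegree neighbors.
    bool addEdge(offset_t src, offset_t dst) {
        auto& nbrs = adjacency[src];
        if (nbrs.size() >= maxDegree) {
            return false;
        }
        nbrs.push_back(dst);
        return true;
    }

    // Returns the neighbor list of node, or an empty list if node is absent.
    const std::vector<offset_t>& getNeighbors(offset_t node) const {
        static const std::vector<offset_t> empty;
        auto it = adjacency.find(node);
        return it == adjacency.end() ? empty : it->second;
    }

    // Number of nodes that actually occupy memory in this layer.
    std::size_t numStoredNodes() const { return adjacency.size(); }

private:
    std::size_t maxDegree;
    std::unordered_map<offset_t, std::vector<offset_t>> adjacency;
};
```

In this sketch, adding edges for node 5 allocates exactly one map entry regardless of how large the node offsets are, whereas a dense array would need a slot for every offset up to the maximum. The sketch is single-threaded; a real index builder would need synchronization if edges are inserted concurrently.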

Benchmark Result

Master commit hash: dfabf90eab17ec0dc0f87d18464152412e1fd8ee
Branch commit hash: 5351c3a194a4c3e2171f04e79c4ba1a629111e2d

| Query Group | Query Name | Mean Time - Commit (ms) | Mean Time - Master (ms) | Diff |
| --- | --- | ---: | ---: | ---: |
| aggregation | q24 | 742.45 | 736.85 | 5.60 (0.76%) |
| aggregation | q28 | 6379.45 | 6358.04 | 21.40 (0.34%) |
| filter | q14 | 143.85 | 128.29 | 15.56 (12.13%) |
| filter | q15 | 145.23 | 126.50 | 18.73 (14.81%) |
| filter | q16 | 320.14 | 306.18 | 13.96 (4.56%) |
| filter | q17 | 462.29 | 446.79 | 15.50 (3.47%) |
| filter | q18 | 1926.94 | 1922.93 | 4.01 (0.21%) |
| filter | zonemap-node | 107.14 | 88.87 | 18.27 (20.56%) |
| filter | zonemap-node-lhs-cast | 105.61 | 90.75 | 14.86 (16.38%) |
| filter | zonemap-node-null | 105.46 | 90.66 | 14.80 (16.33%) |
| filter | zonemap-rel | 5688.28 | 5394.15 | 294.13 (5.45%) |
| fixed_size_expr_evaluator | q07 | 599.11 | 581.95 | 17.16 (2.95%) |
| fixed_size_expr_evaluator | q08 | 828.49 | 801.57 | 26.91 (3.36%) |
| fixed_size_expr_evaluator | q09 | 827.07 | 803.44 | 23.63 (2.94%) |
| fixed_size_expr_evaluator | q10 | 260.64 | 236.67 | 23.97 (10.13%) |
| fixed_size_expr_evaluator | q11 | 253.91 | 229.64 | 24.27 (10.57%) |
| fixed_size_expr_evaluator | q12 | 250.59 | 231.70 | 18.89 (8.15%) |
| fixed_size_expr_evaluator | q13 | 1470.19 | 1465.25 | 4.95 (0.34%) |
| fixed_size_seq_scan | q23 | 134.58 | 111.76 | 22.82 (20.42%) |
| join | q29 | 715.24 | 703.37 | 11.87 (1.69%) |
| join | q30 | 10690.08 | 11083.57 | -393.49 (-3.55%) |
| join | q31 | 5.43 | 9.98 | -4.55 (-45.57%) |
| join | SelectiveTwoHopJoin | 57.46 | 59.99 | -2.53 (-4.21%) |
| ldbc_snb_ic | q35 | 2699.55 | 2607.02 | 92.53 (3.55%) |
| ldbc_snb_ic | q36 | 478.11 | 485.56 | -7.45 (-1.53%) |
| ldbc_snb_is | q32 | 5.06 | 4.47 | 0.59 (13.11%) |
| ldbc_snb_is | q33 | 15.48 | 14.83 | 0.65 (4.40%) |
| ldbc_snb_is | q34 | 1.23 | 1.25 | -0.02 (-1.36%) |
| multi-rel | multi-rel-large-scan | 1404.23 | 1392.59 | 11.65 (0.84%) |
| multi-rel | multi-rel-lookup | 25.08 | 32.54 | -7.46 (-22.94%) |
| multi-rel | multi-rel-small-scan | 91.63 | 102.16 | -10.53 (-10.30%) |
| order_by | q25 | 147.38 | 131.92 | 15.46 (11.72%) |
| order_by | q26 | 470.60 | 452.45 | 18.15 (4.01%) |
| order_by | q27 | 1461.59 | 1420.37 | 41.22 (2.90%) |
| recursive_join | recursive-join-bidirection | 313.02 | 296.22 | 16.80 (5.67%) |
| recursive_join | recursive-join-dense | 7401.56 | 7444.01 | -42.45 (-0.57%) |
| recursive_join | recursive-join-path | 24164.21 | 24117.33 | 46.88 (0.19%) |
| recursive_join | recursive-join-sparse | 1055.29 | 1057.45 | -2.16 (-0.20%) |
| recursive_join | recursive-join-trail | 7395.44 | 7418.08 | -22.64 (-0.31%) |
| scan_after_filter | q01 | 188.72 | 175.01 | 13.71 (7.83%) |
| scan_after_filter | q02 | 174.24 | 159.85 | 14.39 (9.00%) |
| shortest_path_ldbc100 | q37 | 91.01 | 97.65 | -6.63 (-6.79%) |
| shortest_path_ldbc100 | q38 | 382.15 | 377.28 | 4.87 (1.29%) |
| shortest_path_ldbc100 | q39 | 61.96 | 64.85 | -2.89 (-4.45%) |
| shortest_path_ldbc100 | q40 | 466.89 | 464.15 | 2.74 (0.59%) |
| var_size_expr_evaluator | q03 | 2110.99 | 2149.45 | -38.46 (-1.79%) |
| var_size_expr_evaluator | q04 | 2237.16 | 2203.44 | 33.72 (1.53%) |
| var_size_expr_evaluator | q05 | 2721.30 | 2620.11 | 101.19 (3.86%) |
| var_size_expr_evaluator | q06 | 1341.70 | 1345.39 | -3.70 (-0.27%) |
| var_size_seq_scan | q19 | 1516.90 | 1459.82 | 57.08 (3.91%) |
| var_size_seq_scan | q20 | 2491.02 | 2352.12 | 138.90 (5.91%) |
| var_size_seq_scan | q21 | 2313.83 | 2311.06 | 2.76 (0.12%) |
| var_size_seq_scan | q22 | 130.48 | 128.13 | 2.35 (1.83%) |
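
The Diff column appears to be the commit time minus the master time, with the percentage taken relative to master. For aggregation q24, 742.45 - 736.85 = 5.60 ms, and 5.60 / 736.85 is about 0.76%. A minimal sketch of that arithmetic follows; `BenchDiff` and `computeDiff` are hypothetical names, not part of the benchmark tooling, and because the bot presumably works from unrounded timings, recomputing from the displayed values can differ in the last digit.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper; mirrors how the Diff column appears to be derived.
struct BenchDiff {
    double absMs; // commit - master, rounded to 2 decimals
    double pct;   // absolute diff as a percentage of master, rounded to 2 decimals
};

inline double roundTo2(double x) {
    return std::round(x * 100.0) / 100.0;
}

inline BenchDiff computeDiff(double commitMs, double masterMs) {
    assert(masterMs > 0.0); // the percentage is undefined for a zero baseline
    const double absMs = commitMs - masterMs;
    return {roundTo2(absMs), roundTo2(100.0 * absMs / masterMs)};
}
```

A positive value means the branch was slower than master on that query; a negative value means it was faster.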

codecov bot commented Feb 12, 2025

Codecov Report

Attention: Patch coverage is 92.79279% with 8 lines in your changes missing coverage. Please review.

Project coverage is 86.56%. Comparing base (bf8c8b8) to head (4a9b75b).
Report is 1 commit behind head on master.

| Files with missing lines | Patch % | Lines |
| --- | ---: | --- |
| src/storage/index/hnsw_graph.cpp | 87.71% | 7 Missing ⚠️ |
| src/include/storage/index/hnsw_graph.h | 97.67% | 1 Missing ⚠️ |

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #4888   +/-   ##
=======================================
  Coverage   86.55%   86.56%           
=======================================
  Files        1409     1409           
  Lines       60916    61012   +96     
  Branches     7493     7501    +8     
=======================================
+ Hits        52726    52813   +87     
- Misses       8021     8030    +9     
  Partials      169      169           


@ray6080 ray6080 marked this pull request as ready for review February 13, 2025 05:35
@ray6080 ray6080 requested review from andyfengHKU and removed request for benjaminwinger February 13, 2025 05:36
Benchmark Result

Master commit hash: 0e61d7391827730cb6df21ce4abd40b02e227171
Branch commit hash: 63a7b73c83b984a5398acf94e7dfa5c625c36063

| Query Group | Query Name | Mean Time - Commit (ms) | Mean Time - Master (ms) | Diff |
| --- | --- | ---: | ---: | ---: |
| aggregation | q24 | 733.37 | 725.06 | 8.31 (1.15%) |
| aggregation | q28 | 6355.42 | 6365.47 | -10.05 (-0.16%) |
| filter | q14 | 134.75 | 127.96 | 6.79 (5.31%) |
| filter | q15 | 135.53 | 125.90 | 9.63 (7.65%) |
| filter | q16 | 309.36 | 302.95 | 6.42 (2.12%) |
| filter | q17 | 452.65 | 447.74 | 4.91 (1.10%) |
| filter | q18 | 1942.70 | 1927.66 | 15.03 (0.78%) |
| filter | zonemap-node | 97.33 | 88.86 | 8.47 (9.53%) |
| filter | zonemap-node-lhs-cast | 97.77 | 89.13 | 8.64 (9.70%) |
| filter | zonemap-node-null | 97.03 | 90.69 | 6.33 (6.98%) |
| filter | zonemap-rel | 5493.43 | 5604.53 | -111.11 (-1.98%) |
| fixed_size_expr_evaluator | q07 | 579.48 | 571.83 | 7.65 (1.34%) |
| fixed_size_expr_evaluator | q08 | 811.95 | 802.87 | 9.08 (1.13%) |
| fixed_size_expr_evaluator | q09 | 812.12 | 799.89 | 12.23 (1.53%) |
| fixed_size_expr_evaluator | q10 | 244.81 | 236.77 | 8.04 (3.39%) |
| fixed_size_expr_evaluator | q11 | 237.35 | 231.22 | 6.13 (2.65%) |
| fixed_size_expr_evaluator | q12 | 234.93 | 227.30 | 7.63 (3.36%) |
| fixed_size_expr_evaluator | q13 | 1464.09 | 1449.45 | 14.64 (1.01%) |
| fixed_size_seq_scan | q23 | 121.44 | 110.52 | 10.92 (9.88%) |
| join | q29 | 702.08 | 713.01 | -10.93 (-1.53%) |
| join | q30 | 9768.09 | 10109.53 | -341.44 (-3.38%) |
| join | q31 | 4.25 | 7.46 | -3.21 (-43.04%) |
| join | SelectiveTwoHopJoin | 55.39 | 58.13 | -2.74 (-4.71%) |
| ldbc_snb_ic | q35 | 2716.53 | 2680.25 | 36.27 (1.35%) |
| ldbc_snb_ic | q36 | 487.89 | 477.54 | 10.35 (2.17%) |
| ldbc_snb_is | q32 | 6.04 | 3.90 | 2.14 (54.85%) |
| ldbc_snb_is | q33 | 13.91 | 12.09 | 1.82 (15.09%) |
| ldbc_snb_is | q34 | 1.18 | 1.14 | 0.03 (3.00%) |
| multi-rel | multi-rel-large-scan | 1323.69 | 1372.62 | -48.93 (-3.56%) |
| multi-rel | multi-rel-lookup | 20.61 | 10.87 | 9.75 (89.66%) |
| multi-rel | multi-rel-small-scan | 96.93 | 96.04 | 0.89 (0.92%) |
| order_by | q25 | 137.79 | 136.56 | 1.23 (0.90%) |
| order_by | q26 | 459.64 | 451.63 | 8.00 (1.77%) |
| order_by | q27 | 1397.03 | 1413.47 | -16.43 (-1.16%) |
| recursive_join | recursive-join-bidirection | 298.15 | 293.37 | 4.79 (1.63%) |
| recursive_join | recursive-join-dense | 7361.24 | 7358.13 | 3.11 (0.04%) |
| recursive_join | recursive-join-path | 23939.47 | 24082.51 | -143.04 (-0.59%) |
| recursive_join | recursive-join-sparse | 1061.97 | 1056.68 | 5.30 (0.50%) |
| recursive_join | recursive-join-trail | 7356.57 | 7316.10 | 40.48 (0.55%) |
| scan_after_filter | q01 | 179.18 | 173.03 | 6.15 (3.56%) |
| scan_after_filter | q02 | 165.19 | 159.90 | 5.29 (3.31%) |
| shortest_path_ldbc100 | q37 | 96.22 | 88.39 | 7.83 (8.86%) |
| shortest_path_ldbc100 | q38 | 343.96 | 352.52 | -8.56 (-2.43%) |
| shortest_path_ldbc100 | q39 | 63.82 | 65.89 | -2.07 (-3.14%) |
| shortest_path_ldbc100 | q40 | 444.33 | 443.60 | 0.73 (0.17%) |
| var_size_expr_evaluator | q03 | 2066.50 | 2102.24 | -35.74 (-1.70%) |
| var_size_expr_evaluator | q04 | 2269.38 | 2202.42 | 66.95 (3.04%) |
| var_size_expr_evaluator | q05 | 2610.38 | 2673.80 | -63.42 (-2.37%) |
| var_size_expr_evaluator | q06 | 1331.07 | 1339.30 | -8.23 (-0.61%) |
| var_size_seq_scan | q19 | 1439.93 | 1453.15 | -13.22 (-0.91%) |
| var_size_seq_scan | q20 | 2470.32 | 2538.69 | -68.37 (-2.69%) |
| var_size_seq_scan | q21 | 2287.75 | 2321.33 | -33.58 (-1.45%) |
| var_size_seq_scan | q22 | 130.33 | 127.85 | 2.48 (1.94%) |

Benchmark Result

Master commit hash: bf8c8b88cbdeb1a10e58e53a07a156ea72932d40
Branch commit hash: c7114b7923917ee409cac02a4703e4b7c35adffc

| Query Group | Query Name | Mean Time - Commit (ms) | Mean Time - Master (ms) | Diff |
| --- | --- | ---: | ---: | ---: |
| aggregation | q24 | 725.46 | 722.70 | 2.76 (0.38%) |
| aggregation | q28 | 6356.04 | 6396.89 | -40.85 (-0.64%) |
| filter | q14 | 126.39 | 130.34 | -3.95 (-3.03%) |
| filter | q15 | 125.77 | 125.08 | 0.69 (0.55%) |
| filter | q16 | 305.37 | 306.73 | -1.35 (-0.44%) |
| filter | q17 | 451.28 | 448.03 | 3.25 (0.73%) |
| filter | q18 | 1926.66 | 1934.88 | -8.21 (-0.42%) |
| filter | zonemap-node | 89.04 | 94.15 | -5.11 (-5.43%) |
| filter | zonemap-node-lhs-cast | 90.76 | 90.39 | 0.37 (0.41%) |
| filter | zonemap-node-null | 90.68 | 89.94 | 0.74 (0.82%) |
| filter | zonemap-rel | 5626.43 | 5587.69 | 38.75 (0.69%) |
| fixed_size_expr_evaluator | q07 | 572.74 | 570.73 | 2.01 (0.35%) |
| fixed_size_expr_evaluator | q08 | 800.53 | 801.53 | -1.00 (-0.13%) |
| fixed_size_expr_evaluator | q09 | 802.62 | 800.49 | 2.13 (0.27%) |
| fixed_size_expr_evaluator | q10 | 237.03 | 236.62 | 0.41 (0.17%) |
| fixed_size_expr_evaluator | q11 | 229.98 | 230.25 | -0.27 (-0.12%) |
| fixed_size_expr_evaluator | q12 | 228.40 | 226.52 | 1.88 (0.83%) |
| fixed_size_expr_evaluator | q13 | 1454.77 | 1458.46 | -3.70 (-0.25%) |
| fixed_size_seq_scan | q23 | 113.42 | 108.92 | 4.50 (4.13%) |
| join | q29 | 775.90 | 768.96 | 6.94 (0.90%) |
| join | q30 | 10073.93 | 11630.24 | -1556.32 (-13.38%) |
| join | q31 | 5.72 | 8.94 | -3.22 (-36.00%) |
| join | SelectiveTwoHopJoin | 53.81 | 57.24 | -3.42 (-5.98%) |
| ldbc_snb_ic | q35 | 2750.09 | 2621.22 | 128.87 (4.92%) |
| ldbc_snb_ic | q36 | 489.33 | 484.63 | 4.70 (0.97%) |
| ldbc_snb_is | q32 | 4.63 | 5.68 | -1.06 (-18.59%) |
| ldbc_snb_is | q33 | 13.12 | 15.43 | -2.31 (-14.95%) |
| ldbc_snb_is | q34 | 1.35 | 1.20 | 0.15 (12.87%) |
| multi-rel | multi-rel-large-scan | 1324.79 | 1499.32 | -174.53 (-11.64%) |
| multi-rel | multi-rel-lookup | 20.55 | 44.94 | -24.39 (-54.27%) |
| multi-rel | multi-rel-small-scan | 92.27 | 88.16 | 4.11 (4.67%) |
| order_by | q25 | 128.81 | 134.53 | -5.72 (-4.25%) |
| order_by | q26 | 454.02 | 461.64 | -7.62 (-1.65%) |
| order_by | q27 | 1413.93 | 1407.49 | 6.44 (0.46%) |
| recursive_join | recursive-join-bidirection | 308.36 | 305.79 | 2.58 (0.84%) |
| recursive_join | recursive-join-dense | 6416.55 | 7092.58 | -676.03 (-9.53%) |
| recursive_join | recursive-join-path | 23740.44 | 23969.41 | -228.98 (-0.96%) |
| recursive_join | recursive-join-sparse | 1053.17 | 1047.42 | 5.75 (0.55%) |
| recursive_join | recursive-join-trail | 6703.53 | 7036.01 | -332.49 (-4.73%) |
| scan_after_filter | q01 | 173.91 | 172.68 | 1.23 (0.71%) |
| scan_after_filter | q02 | 158.59 | 157.25 | 1.34 (0.85%) |
| shortest_path_ldbc100 | q37 | 86.57 | 82.22 | 4.35 (5.29%) |
| shortest_path_ldbc100 | q38 | 341.39 | 325.19 | 16.20 (4.98%) |
| shortest_path_ldbc100 | q39 | 62.60 | 60.50 | 2.10 (3.47%) |
| shortest_path_ldbc100 | q40 | 410.17 | 299.15 | 111.02 (37.11%) |
| var_size_expr_evaluator | q03 | 2110.43 | 2073.31 | 37.12 (1.79%) |
| var_size_expr_evaluator | q04 | 2210.34 | 2212.19 | -1.85 (-0.08%) |
| var_size_expr_evaluator | q05 | 2637.00 | 2644.54 | -7.54 (-0.29%) |
| var_size_expr_evaluator | q06 | 1332.31 | 1329.93 | 2.38 (0.18%) |
| var_size_seq_scan | q19 | 1470.53 | 1470.74 | -0.21 (-0.01%) |
| var_size_seq_scan | q20 | 2569.08 | 2772.24 | -203.16 (-7.33%) |
| var_size_seq_scan | q21 | 2315.06 | 2392.45 | -77.39 (-3.23%) |
| var_size_seq_scan | q22 | 126.57 | 125.08 | 1.48 (1.19%) |
