[Neurips23] ParlayANN Submission for OOD track #186

magdalendobson · 2023-10-25T19:33:28Z

This PR contains our submission for the OOD track. We tested on an AWS c6i.2xlarge and found that at 90% recall our entry averaged around 7000-7500 QPS, while the baseline was around 3500. The build took about 4 hours. On this machine, when building and then querying in the same run, main memory did not seem to be released between saving the index files and reloading them, so the run overran the memory limit. It was therefore necessary to build the index in one run and then load and query it in a separate run. I observed the same problem with the DiskANN baseline. Just wanted to note that in case it comes up during evaluation.
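As a rough illustration of the build-in-one-run, query-in-another workaround described above, the sketch below runs a build step in a child process, so the operating system reclaims its memory when the child exits, and then loads the result in the parent. The build snippet, `build_in_child`, and `load_index` are hypothetical stand-ins and not the submission's or the benchmark framework's actual code.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical stand-in for an index build: it allocates a large buffer,
# writes a small "index" file, then exits so the OS reclaims the memory.
BUILD_SNIPPET = """
import sys
buf = bytearray(8 * 1024 * 1024)
with open(sys.argv[1], "wb") as f:
    f.write(bytes(buf[:1024]))
"""


def build_in_child(index_path):
    """Run the build in a separate Python process and wait for it to finish."""
    subprocess.run([sys.executable, "-c", BUILD_SNIPPET, index_path], check=True)


def load_index(index_path):
    """Hypothetical loader: read the saved index back from disk."""
    with open(index_path, "rb") as f:
        return f.read()


def build_then_load():
    """Build in a child process, then load in this process."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "index.bin")
        build_in_child(path)
        return load_index(path)
```

Isolating the build in its own process sidesteps the question of whether every build-time data structure is freed, at the cost of one extra process launch.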

I have edited the neurips23.yml file to test our build using the random-xs dataset. Please let me know if there is anything else needed.

harsha-simhadri

Thanks for the heads up on memory. Will try out your PR.

.github/workflows/neurips23.yml

benchmark/dataset_io.py

benchmark/runner.py

magdalendobson · 2023-10-25T20:11:25Z

I think my latest commit fixes those three issues.

neurips23/ood/diskann/Dockerfile

harsha-simhadri · 2023-10-25T21:50:50Z

I am unable to run this since the Docker timeout is set to 30 minutes in your branch. You may want to rebase onto the latest commit on this repo's main to get the latest runner config.

magdalendobson · 2023-10-26T02:17:26Z

> I am unable to run this since the Docker timeout is set to 30 minutes in your branch. You may want to rebase onto the latest commit on this repo's main to get the latest runner config.

I just pulled from main and rebased.

maumueller · 2023-10-27T12:17:53Z

@magdalendobson Do you think the memory issue is happening because of some detail in the eval framework? It seems to me that https://github.com/harsha-simhadri/big-ann-benchmarks/pull/186/files#diff-923b8101f105761e2d658a67e990261dd3658cc86f6482d9231316d184cb101aR100-R108 might be the culprit because build_vamana_index might not free the memory after writing the index to disk.

magdalendobson · 2023-10-27T15:53:25Z

Hmm, yeah, I think it's possible. Let me experiment with modifying the destructors of some of my objects and I'll get back to you.

magdalendobson · 2023-10-27T19:18:11Z

@maumueller thanks for the suggestion. I took a closer look and it did turn out that some of my data structures were not freeing memory properly. You should now be able to re-install and then run my submission without memory problems.

harsha-simhadri · 2023-10-28T05:39:14Z

I see the following results after restarting from a crash that occurred after the index build:

vamana,vamana,text2image-10M,10,8568.28073422188,0.0,1000000.0,13098176.0,1528.6819382196045,0,0,ood,0.8773770000000001
vamana,vamana,text2image-10M,10,7777.599618424573,0.0,1000000.0,13098176.0,1684.089776101532,0,0,ood,0.88975
vamana,vamana,text2image-10M,10,7128.303182660356,0.0,1000000.0,13098176.0,1837.488623079529,0,0,ood,0.8999010000000001
vamana,vamana,text2image-10M,10,6873.170646655227,0.0,1000000.0,13098176.0,1905.6963188269044,0,0,ood,0.9041650000000001
vamana,vamana,text2image-10M,10,6607.938732179394,0.0,1000000.0,13098176.0,1982.1878699047852,0,0,ood,0.908165
vamana,vamana,text2image-10M,10,6376.8978543202775,0.0,1000000.0,13098176.0,2054.0043606196596,0,0,ood,0.911952
vamana,vamana,text2image-10M,10,6148.750960929249,0.0,1000000.0,13098176.0,2130.2173536103824,0,0,ood,0.9153420000000001
vamana,vamana,text2image-10M,10,5739.398208736937,0.0,1000000.0,13098176.0,2282.1514597229,0,0,ood,0.92154
vamana,vamana,text2image-10M,10,5551.22053214029,0.0,1000000.0,13098176.0,2359.512817796478,0,0,ood,0.924339
vamana,vamana,text2image-10M,10,4789.167993183752,0.0,1000000.0,13098176.0,2734.958560368347,0,0,ood,0.935529
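To read these rows against the 90% recall target mentioned at the top of the thread, a short Python sketch can pick the fastest configuration that meets a recall threshold. It assumes the fifth field is QPS and the last field is recall. The thread gives no header for these rows, so those column positions are a guess based on the numbers discussed here. Only three of the rows above are included, to keep the example short.

```python
import csv
import io

# Three rows copied from the results above (no header was given).
ROWS = """\
vamana,vamana,text2image-10M,10,7128.303182660356,0.0,1000000.0,13098176.0,1837.488623079529,0,0,ood,0.8999010000000001
vamana,vamana,text2image-10M,10,6873.170646655227,0.0,1000000.0,13098176.0,1905.6963188269044,0,0,ood,0.9041650000000001
vamana,vamana,text2image-10M,10,6607.938732179394,0.0,1000000.0,13098176.0,1982.1878699047852,0,0,ood,0.908165
"""


def best_qps_at_recall(csv_text, min_recall=0.9, qps_col=4, recall_col=-1):
    """Return (qps, recall) for the highest-QPS row with recall >= min_recall.

    The column positions are assumptions; returns None if no row qualifies.
    """
    best = None
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        qps = float(row[qps_col])
        recall = float(row[recall_col])
        if recall >= min_recall and (best is None or qps > best[0]):
            best = (qps, recall)
    return best
```

Under that column assumption, the first included row falls just short of 0.9 recall, so the function selects the row at about 6873 QPS and 0.904 recall.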

magdalendobson · 2023-10-28T18:01:21Z

That seems about right. Are you sure you pulled the latest version if you're still having the crash issue? My latest push fixed it on the 16 GB machine I tested on. You might have to increment the CACHEBUST argument in the Dockerfile. Also, are there any outstanding edits needed? I think I fixed everything you requested earlier.

harsha-simhadri · 2023-10-29T00:39:41Z

Let me merge this PR since it's functional. We will do a final run later, and it will include your latest commit.

landrumb and others added 16 commits September 14, 2023 13:06

initial commit

a150d6d

added default alpha

99ef4ba

fixed bad dockerfile

0fea75f

cache bust

66c8066

fixed timeout

134e866

added additional search configs to get past .9

07fae9f

one more query config

ee9924e

added two pass arg

79ee157

fixing arg in diskann dockerfile

cb5e609

fixed merge conflict

2e1b64a

committing to switch branches

193aabd

committing to switch branches

78e26ba

committing to switch branches

a4a8bef

added vamana.py

72cf68e

fixed issue in file detection

9abd7be

finalizing before PR

00a2626

harsha-simhadri requested changes Oct 25, 2023

.github/workflows/neurips23.yml

benchmark/dataset_io.py (outdated)

benchmark/runner.py (outdated)

changes requested for PR

61938a8

harsha-simhadri reviewed Oct 25, 2023

neurips23/ood/diskann/Dockerfile (outdated)

changes for PR

65f94eb

landrumb and others added 8 commits October 25, 2023 22:10

initial commit

96b5307

added two pass arg

690dfab

added default alpha

795e6b3

cache bust

774cb47

added additional search configs to get past .9

2a08366

one more query config

d65e895

committing to switch branches

89258d4

committing to switch branches

76d8443

Magdalen Dobson added 6 commits October 25, 2023 22:12

committing to switch branches

92e0185

added vamana.py

11ab3fc

fixed issue in file detection

8a2c9ee

finalizing before PR

f2983d0

changes requested for PR

1b16b0f

merged

3c46abc

harsha-simhadri merged commit 72b61f6 into harsha-simhadri:main Oct 29, 2023
15 of 22 checks passed
