Research and set up vector database system #3
Note: An embeddings model will need to be chosen to go with the vector database.
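As a rough illustration of what that note means in practice, here's a minimal sketch of generating an embedding with the OpenAI Node SDK. The model name and helper are placeholders, since the actual embeddings model is still an open question:

```ts
import OpenAI from 'openai';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Turn a chunk of course material into a vector we can store in whichever DB we pick.
// The model here is an assumption, not a decision.
async function embedChunk(text: string): Promise<number[]> {
  const res = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text,
  });
  return res.data[0].embedding;
}
```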
Nice, thanks for listing this out. What do you think of using pgvector and Supabase? So we get the best of both worlds? A customizable system for storing vectors, and a managed instance of the database for easy setup? Check out this. It seems like all new Supabase DBs support HNSW vector indexes now too. Idk, what's your take? Pinecone and Chroma would be nice as well. What we really want is a system which will allow us to store the most vectors, the easiest.
That's a pretty good idea! I think that Supabase combined with pgvector would be a really powerful combo. It would allow us to use SQL, reducing the learning curve of the DB, and Supabase would allow us to host it on the cloud and have a central DB instead of individual ones. HOWEVER, the database size on the free plan of Supabase is only 500 MB (https://supabase.com/pricing). If the goal is a system to store the most vectors, then we would need to know roughly how many documents we plan on uploading and whether that would be under 500 MB. For reference, uploading a chunk of 2000 words would take up ~10 KB, so that's roughly 50,000 chunks of that size before hitting the cap. We could possibly filter out stopwords from the chunks to make them smaller, but we would need to test whether that affects the context given to the LLM at runtime. Supabase does have really good integration with pgvector though, and I like the support for the new HNSW indexes. I think the Supabase + pgvector combo is the best pick for the project as long as we don't think we are going to go over 500 MB of chunks.
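For concreteness, a minimal sketch of the Supabase + pgvector pattern being discussed here, assuming a `documents` table with a pgvector `embedding` column and a user-defined `match_documents` SQL function (the table, column, and function names are illustrative, not part of any existing schema):

```ts
import { createClient } from '@supabase/supabase-js';

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_ANON_KEY!);

// Store a chunk alongside its embedding. The `documents` table and its `embedding`
// vector column are assumed to exist, e.g. created in a pgvector migration.
async function storeChunk(content: string, embedding: number[]) {
  const { error } = await supabase.from('documents').insert({ content, embedding });
  if (error) throw error;
}

// Similarity search via an RPC to a SQL function like `match_documents`, which
// would use pgvector's distance operators (and an HNSW index) under the hood.
async function findSimilar(queryEmbedding: number[], matchCount = 5) {
  const { data, error } = await supabase.rpc('match_documents', {
    query_embedding: queryEmbedding,
    match_count: matchCount,
  });
  if (error) throw error;
  return data;
}
```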
Oh my goodness, thank you, Cyrus, for doing the research and finding that. I really appreciate it! Especially because that 500 MB limit is definitely something!!! 😅 We DON'T want chunk storage to be a bottleneck, especially if people want to upload lecture videos or other multimodal content in the future. I think Pinecone, or one of the other options mentioned, would be a better fit since there are fewer scaling concerns, and we can easily add metadata filters for each course to handle everything more efficiently. Let's chat about this at our next meeting. Lots of interesting stuff here.
@Nyumat Are we looking to use JS for the entire project? Would something like Go be considered at all?
Yeah, ideally. Most of these AI SDKs have Python and TS/JS wrappers so that's why it's relevant here.
I'm not opposed, but what would we gain from it that, say, Python wouldn't provide? And where would we look to add it? I've seen stuff like Gofast out and about, but it seems like the benefits we'd get from Go (speed, better gRPC libs, multi-threading) are out of scope for this project, at least. Curious to hear what you think.
Yeah, makes sense. I mention Go just because it would be different and JS is always used for everything. The JS ecosystem has a new framework, runtime, library, etc. come out every second 😂.
Lol, yeah, agreed. Well, I'll tell you this: when we did BeavsAI in the past, we had tons of memory inefficiencies that became a bottleneck once we tried to productionize the application. I did some research and found there's a great LangChain alternative for Go, used to build these AI applications. If you can get a running proof of concept (I can assist if need be, or we can do it live during our Wednesday meeting), I'd be down to consider it, especially given all the performance we'd gain from using it compared to Python and JS.
Hey @Nyumat, we didn't get a chance to talk at the meeting Wednesday. I think the best step forward is Pinecone. We get 2 GB of storage with the free tier, and it looks pretty simple to set up. I'm gonna get to work on implementing the vector DB this week; let me know if you have any questions or concerns.
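As a rough sketch of what that setup could look like with the Pinecone TypeScript SDK, including the per-course metadata filters mentioned earlier (the index name, course tag, and helper names are placeholders, not decisions):

```ts
import { Pinecone } from '@pinecone-database/pinecone';

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
const index = pc.index('course-chunks'); // hypothetical index name

// Upsert a chunk's embedding, tagging it with its course so we can filter later.
async function upsertChunk(id: string, embedding: number[], course: string, text: string) {
  await index.upsert([{ id, values: embedding, metadata: { course, text } }]);
}

// Query for the closest chunks, restricted to a single course via a metadata filter.
async function queryCourse(queryEmbedding: number[], course: string) {
  const res = await index.query({
    vector: queryEmbedding,
    topK: 5,
    filter: { course: { $eq: course } },
    includeMetadata: true,
  });
  return res.matches;
}
```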
Huge, I'll be on the lookout! And yeah, my bad, couldn't make the meeting in person.
@s2xon, what do you think of assisting me in building a Discord bot (in Go, as I've been enjoying using it recently again) to extend our current application's knowledge base? Ideally, we'd create a Discord bot that first retrieves, and from then on listens to, messages in the CS Discord Server (
Sadge ghosted |
Vector databases are the core of how we power ML applications to do things such as retrieving relevant information (there's a small similarity-search sketch after the list below). I could talk about it a lot more in depth, but I recommend you read the linked article from Pinecone, the leading vector DB solution, if you'd like to know more.
For our options, there are many, many tools we can use here, so I'll list them out:
- Pinecone
- Supabase
- Dewy
I could go on and on, but I found this nice comparison on Reddit of the popular options.
Some others include Astra and Momento.
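To make "retrieve relevant information" concrete, here's a tiny, database-agnostic sketch of what any of these options do under the hood: rank stored chunks by cosine similarity to a query embedding. The in-memory store is purely illustrative; a real vector DB does this at scale with indexes like HNSW.

```ts
// Cosine similarity between two embedding vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored chunks by similarity to the query embedding and keep the top k.
function topKChunks(
  query: number[],
  chunks: { text: string; embedding: number[] }[],
  k = 5,
) {
  return chunks
    .map((c) => ({ text: c.text, score: cosineSimilarity(query, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```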