My beloved Argeia, I thank you greatly!
Your patience with my computing studies is my greatest help!
Many thanks to Svilen Stanoev, who introduced me to the concept of partitioned datasets in object storage some years ago!
Thank you, Svilen!
Many thanks to the contributors of Apache Arrow, DuckDB and Hugging Face Tokenizers!
Many thanks to the teams of Fly.io, Tigris Data, MinIO, Cloudflare R2 and Hugging Face Datasets!
Many thanks to the creators of HarperDB, Inc.! Their system introduced me to the "exploded data model".
This paradigm heavily influenced the partitioned index of Reteti, which is never read in its entirety during search.
https://huggingface.co/docs/tokenizers/index
https://arrow.apache.org/docs/python/api.html
https://duckdb.org/docs/
https://min.io/docs/minio/linux/developers/python/API.html
https://huggingface.co/docs/huggingface_hub/index
https://www.gradio.app/docs
https://fly.io/docs/
https://www.tigrisdata.com/docs/overview/
https://commoncrawl.org/blog/news-dataset-available
https://huggingface.co/datasets/CloverSearch/cc-news-mutlilingual
https://huggingface.co/datasets/CloverSearch/data_article_count
https://stackoverflow.com/questions/2564137/how-to-terminate-a-thread-when-main-program-ends