v0.1.22
This Sycamore release adds support for Python 3.12 and a connector for the Qdrant vector database, along with many bug fixes and enhancements. Thanks to @Anush008 for contributing the Qdrant support!
What's Changed
- bump sdk to 0.1.4 by @HenryL27 in #823
- Fix issue with empty tool response leading to hallucinations. by @mdwelsh in #818
- Fix bug where prompt is modified by OpenAIEntityExtractor. by @mdwelsh in #824
- Fix poetry.lock with missing dependency. by @mdwelsh in #825
- Query trace viewer for Luna demo, and better PDF previews. by @mdwelsh in #828
- Batch Processing Bug Fix by @karanataryn in #829
- Get local mode working 1/n by @eric-anderson in #826
- Changing titles for some posts by @AbhijitP-009 in #827
- Transform to convert Document into Markdown. by @alexaryn in #811
- Fix query trace viewer. by @mdwelsh in #830
- Ingest more fields into OpenSearch schema for NTSB demo. by @mdwelsh in #834
- Fix bug with trace view. by @mdwelsh in #833
- Improved sorting of elements by bbox for one and two columns. by @alexaryn in #801
- Make PDFMiner Pipelined by @karanataryn in #807
- Fix error message on None value passed to DateTimeStandardizer. by @mdwelsh in #835
- Sundry improvements while using luna in a customer. by @eric-anderson in #832
- fix to pass string to tokenizer by @Soeb-aryn in #831
- Some improvements to query plans for Luna demo. by @mdwelsh in #836
- Update requires_modules type annotations to work with mypy. by @bsowell in #837
- Lazily Set Table Text Representation by @karanataryn in #839
- Have Luna use .keyword field for path field. by @mdwelsh in #841
- Add a simple logical query plan compare function by @baitsguy in #840
- Improve luna property handling by @eric-anderson in #842
- Add support for Python 3.12. by @bsowell in #838
- Fix Luna UI to show query plan operators. by @mdwelsh in #847
- bugfix to extract text summaries (don't just randomly assert) by @RitxmSaha in #848
- Ignore bad tables by @MarkLindblad in #849
- Add support for caching intermediate results of Luna queries. by @mdwelsh in #850
- add read.opensearch(reconstruct_document=True) option by @baitsguy in #845
- Fold in query-demo capability to query-ui. by @mdwelsh in #852
- Define parallelism on nodes by @eric-anderson in #853
- Basic documentation for APS markdown option. by @alexaryn in #854
- Implement output_format in Aryn SDK partition_file(). by @alexaryn in #857
- Add local-inference extra to sycamore-ai dependency in apps/query-ui. by @mdwelsh in #859
- Super basic FastAPI wrapper to Sycamore Query. by @mdwelsh in #855
- Support output_format in ArynPartitioner. by @alexaryn in #858
- Fix tile cannot extend outside image by @dhruvkaliraman7 in #856
- Support Jupyter saving to S3 by @eric-anderson in #860
- Add PaddleOCR and Refactor Text Extraction by @karanataryn in #745
- Fix broken test. by @mdwelsh in #863
- Get Local Mode working 2/n by @eric-anderson in #861
- Remove package-mode by @eric-anderson in #865
- Add similarity scoring and rerank transform by @baitsguy in #864
- adding docs for AssignDocProperties, Standardizer and ExtractTableProperties by @Soeb-aryn in #866
- Add newline before text elements. by @alexaryn in #862
- handle file paths in the sdk by @HenryL27 in #869
- Add packaging library to aryn-sdk pyproject.toml. by @bsowell in #870
- Do some escaping of special Markdown characters. by @alexaryn in #867
- fix type annotation for file by @HenryL27 in #871
- Element ordering and test improvements by @baitsguy in #872
- Test fixes and more local mode by @baitsguy in #873
- Add a few more files to .gitignore. by @bsowell in #875
- feat: Qdrant support by @Anush008 in #821
- Get llm_filter to support document structure + similarity sorting for elements by @baitsguy in #876
- Add documentation for Sycamore Query. by @mdwelsh in #878
- Move loaddata script to query-ui. by @mdwelsh in #877
- Remove deprecated query-demo UI. by @mdwelsh in #881
- Adjust Pinecone Docs for Clarity by @karanataryn in #883
- Add source_mode parameter to AutoMaterialize. by @bsowell in #885
- add optimization from training development by @HenryL27 in #886
- Fix documentation link, sentence grammar by @MarkLindblad in #879
- Clean Up Text Extraction by @karanataryn in #868
- Fix Parameter Error in Docs by @karanataryn in #888
- Enable document model in sycamore.query + query-ui improvements by @baitsguy in #884
- Fix parallelism bug. by @eric-anderson in #889
- fix issue when packages and containers do not align at all -> max([]) by @HenryL27 in #891
- Bump version to 0.1.22. by @bsowell in #892
New Contributors
Full Changelog: v0.1.21...v0.1.22