What's Changed
Non-backward-compatible changes
- Renamed criterias in LLM-as-a-Judge metrics to criteria - Breaking change by @tejaswini in #1545
New Features
- Add Replicate inference support by @elronbandel in #1544
- Add text2sql tasks by @perlitz in #1414
- Add deduplicate operator by @elronbandel in #1549
New Assets
- Add more Granite LLM-as-a-Judge artifacts by @martinscooper in #1516
- Add mtrag benchmark by @elronbandel in #1548
Documentation
- End of year summary blog post by @elronbandel in #1530
- Update notification banner styles and add 2024 summary blog link by @elronbandel in #1538
- Updated documentation and examples of LLM-as-Judge by @tejaswini in #1532
- Eval assist documentation by @tejaswini in #1537
Bug Fixes
- Fix Australian legal QA dataset by @elronbandel in #1542
- Use 1 shot for wikitq in tables_benchmark by @yifanmai in #1541
- Bugfix: indexed row major serialization fails with None cell values by @yifanmai in #1540
- Solve issue of expired token in Unitxt Assistant by @eladven in #1543
- Add a filter to wikitq by @ShirApp in #1547
- Fix the authentication problem by @eladven in #1550
- Attach assistant answers to their origins with URL link by @elronbandel in #1528
- Update end of year summary blog by @elronbandel in #1552
- Add data classification policy to CrossProviderInferenceEngine initialization based on selected model by @elronbandel in #1539
- Fix recently broken rag metrics by @elronbandel in #1554
- Finqa hash to top by @elronbandel in #1555
- Refactor safety metric to be faster and updated by @elronbandel in #1484
- Improve assistant by @elronbandel in #1556
New Contributors
- @tejaswini made their first contribution in #1532
Full Changelog: 1.17.0...1.17.1