- Remove Python 3.7 pipelines and add Python 3.11 support
- Allow pre-aggregation even if no group_by is provided
- Add new group_by logic to allow more efficient querying
- Remove numexpr as a dependency for Python 2 because it's causing problems
- Improved aggregation logic to prefer pyarrow over pandas in Python 3
- Fixed a bug where a rowgroup would be empty after filtering, causing strange NaN results after group by
- Fixed a bug where columns were not properly renamed after aggregation
- Fixed broken serialization and added tests to verify
- Unpinned pyarrow since we have a newer version now
- Implemented Python 3 (pyarrow 9+) specific logic to optimize memory consumption
- Pinned pyarrow version because the new 9.0.0 causes segfaults
- Skip row_group if output is empty
- Fixed a bug where min and max could not be calculated for an empty row group
- Refactored rowgroup_metadata_filter
- Add handling of columns that are used in a filter but missing from a parquet file. This happens when new dimensions are created but existing parquet files do not have them yet. Previously the query raised an error; with the new behaviour it returns an empty result. This is better because the real value of the dimension is unknown for that file, so its contribution to the result should also be zero. It also greatly helps with issues where old files break reporting because they have not been updated yet.
- Removed specification of parquet format (2.0 is now old)
- Fix an import issue
- Add a parameter to handle default response when a parquet file is missing
- Handle the request for non-existing columns in a parquet file
- Align requirements across requirements.txt, pyproject.toml and setup.py
- Improve performance for complex list filters
- Improve performance by only aggregating end result (at the cost of some memory efficiency)
- Fixed CircleCI Python 3 tests (rounding and pip versioning)
- Handle count by aggregated results
- Enforce order of columns for partial results
- Handle non-natural naming ("-" in column names)
- Check for filter columns that are not part of the result
- Remove all uses of categorical values, as they impede concatenation of results
- Ensure that groupby columns are seen as categorical series
- Fix Python 2 legacy differences in pyarrow
- Fix Python 2 requirements
- Updated links
- Added arrow aggregation method
- Introduced writer debug output
- Updated manifest
- Updated requirements for dependencies based on the Python version
- Initial release
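
The entry above about filter columns that are missing from a parquet file can be illustrated with a small sketch. It is not the library's implementation: the in-memory `Table` type, the `(column, operator, value)` filter shape and the `apply_filters` helper are all assumptions made for illustration, and only the behaviour (a file that lacks a filter column contributes an empty result instead of raising an error) is taken from the entry.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical in-memory stand-in for one parquet file: column name -> values.
Table = Dict[str, List[Any]]
# Hypothetical filter shape: (column, operator, value); only "==" and "in" here.
Filter = Tuple[str, str, Any]


def apply_filters(table: Table, filters: List[Filter]) -> Table:
    """Return the rows of `table` that match every filter.

    If a filter references a column the file does not have, the file
    contributes no rows instead of raising an error.
    """
    if any(column not in table for column, _, _ in filters):
        # Unknown dimension for this file: keep the schema, drop all rows.
        return {name: [] for name in table}

    row_count = len(next(iter(table.values()), []))
    keep = [True] * row_count
    for column, op, value in filters:
        for i, cell in enumerate(table[column]):
            if op == "==":
                keep[i] = keep[i] and cell == value
            elif op == "in":
                keep[i] = keep[i] and cell in value
            else:
                raise ValueError(f"unsupported operator: {op}")
    return {
        name: [v for v, k in zip(values, keep) if k]
        for name, values in table.items()
    }


# An "old" file written before the hypothetical `channel` dimension existed.
old_file = {"country": ["NL", "BE", "NL"], "revenue": [10, 20, 30]}
filtered = apply_filters(old_file, [("country", "==", "NL")])
missing = apply_filters(old_file, [("channel", "==", "web")])
```

In this sketch `filtered` keeps the two `NL` rows, while `missing` has the same columns but no rows, so aggregating it adds nothing to a combined report.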