Problem Description
We've created a dtype benchmark to help track our support of different dtypes. However, the current report format is hard to read at a glance. It may also test situations that we do not expect to support, leading to a lower support percentage than expected.
Expected behavior
Update how we run the dtype benchmark:
Run the data type across fit and sample using GaussianCopula
Run the data type across all transformers that support that data type
Run the data type across all constraints that support that data type
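As a rough illustration of the three steps above, each step can be treated as a named check that either completes or raises. This is only a sketch: `run_checks`, the check names, and the stand-in callables are hypothetical, and a real benchmark would call the synthesizer, transformers, and constraints in their place.

```python
def run_checks(checks):
    """Run each named zero-argument check; record True if it did not raise."""
    results = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = True
        except Exception:
            results[name] = False
    return results


def failing_check():
    # Stand-in for a step that does not support the dtype under test.
    raise TypeError("unsupported dtype")


# Hypothetical stand-ins for the real benchmark steps for one dtype.
checks = {
    "fit": lambda: None,
    "sample": lambda: None,
    "transformer:HypotheticalTransformer": failing_check,
    "constraint:HypotheticalConstraint": lambda: None,
}

results = run_checks(checks)
```

Recording a boolean per check, rather than stopping at the first failure, keeps one unsupported transformer or constraint from hiding the results of the others.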
We should also exclude dtype/sdtype combinations that we do not expect to work/support (see spreadsheet here, red indicates we should not support the dtype/sdtype combination). We do not need to test pyarrow dtypes since we do not officially support them (we can leave the code for testing them, and either (1) continue to test them but exclude them from the summary or (2) stop testing them entirely for now).
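One possible way to apply the exclusions is to filter the dtype/sdtype pairs before they reach the summary. The sketch below follows option (1), where pyarrow dtypes may still be tested but are left out of the summary. The `UNSUPPORTED` entries are placeholders, not the contents of the spreadsheet, and the function names are hypothetical; the only assumption about pyarrow is that pandas renders those dtypes with a `[pyarrow]` suffix, such as `int64[pyarrow]`.

```python
# Placeholder pairs only; the real set should be taken from the spreadsheet.
UNSUPPORTED = {
    ("placeholder_dtype_a", "placeholder_sdtype_x"),
}


def is_pyarrow(dtype_name):
    """pandas renders pyarrow-backed dtypes like 'int64[pyarrow]'."""
    return dtype_name.endswith("[pyarrow]")


def combos_for_summary(combos, unsupported=UNSUPPORTED):
    """Keep only the dtype/sdtype pairs that should count toward the summary."""
    return [
        (dtype, sdtype)
        for dtype, sdtype in combos
        if (dtype, sdtype) not in unsupported and not is_pyarrow(dtype)
    ]


all_combos = [
    ("int64", "numerical"),
    ("int64[pyarrow]", "numerical"),
    ("placeholder_dtype_a", "placeholder_sdtype_x"),
]
kept = combos_for_summary(all_combos)
```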
The final report should also be updated to improve readability. It should have one tab for the total results for each sdtype, as well as a summary page. Every row of the summary should be a dtype/sdtype combination and indicate whether fit worked, sample worked, the percentage of constraints that worked, and the percentage of transformers that worked.
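The summary rows described above could be built by grouping individual results per dtype/sdtype pair. This is a minimal sketch that assumes each result is a dict with `dtype`, `sdtype`, `kind`, and `passed` keys; that record layout and the function names are assumptions, not the benchmark's actual format, and the sample records are made up for illustration.

```python
from collections import defaultdict


def percent(outcomes):
    """Percentage of True values, or None when nothing was tested."""
    if not outcomes:
        return None
    return 100.0 * sum(outcomes) / len(outcomes)


def build_summary(records):
    """Return one summary row per dtype/sdtype combination."""
    grouped = defaultdict(lambda: defaultdict(list))
    for rec in records:
        key = (rec["dtype"], rec["sdtype"])
        grouped[key][rec["kind"]].append(rec["passed"])

    rows = []
    for (dtype, sdtype), kinds in sorted(grouped.items()):
        rows.append({
            "dtype": dtype,
            "sdtype": sdtype,
            "fit": all(kinds["fit"]) if kinds["fit"] else None,
            "sample": all(kinds["sample"]) if kinds["sample"] else None,
            "constraints_pct": percent(kinds["constraint"]),
            "transformers_pct": percent(kinds["transformer"]),
        })
    return rows


# Made-up records for a single combination.
records = [
    {"dtype": "int64", "sdtype": "numerical", "kind": "fit", "passed": True},
    {"dtype": "int64", "sdtype": "numerical", "kind": "sample", "passed": True},
    {"dtype": "int64", "sdtype": "numerical", "kind": "transformer", "passed": True},
    {"dtype": "int64", "sdtype": "numerical", "kind": "transformer", "passed": False},
    {"dtype": "int64", "sdtype": "numerical", "kind": "constraint", "passed": True},
]
summary = build_summary(records)
```

Returning `None` when a category has no results keeps "not tested" distinct from "0% worked" in the report.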
Additional context
See this doc for more information.