Improve dtypes benchmark report #2365

Open
frances-h opened this issue Feb 11, 2025 · 0 comments
Labels
feature request Request for a new feature internal The issue doesn't change the API or functionality

Comments

@frances-h
Contributor

Problem Description

We've created a dtype benchmark to help track our support of different dtypes. However, the current report format is difficult to understand at a glance. It may also test situations that we do not expect to support, leading to a lower support percentage than expected.

Expected behavior

Update how we run the dtype benchmark:

  1. Run the data type across fit and sample using GaussianCopula
  2. Run the data type across all transformers that support that data type
  3. Run the data type across all constraints that support that data type

We should also exclude dtype/sdtype combinations that we do not expect to work or support (see spreadsheet here; red indicates that we should not support the dtype/sdtype combination). We do not need to test pyarrow dtypes, since we do not officially support them. We can leave the code for testing them in place and either (1) continue to test them but exclude them from the summary, or (2) stop testing them entirely for now.
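One possible way to express the exclusions is a small filter applied before a combination reaches the summary. This is only a sketch: `UNSUPPORTED_COMBINATIONS`, `is_pyarrow_dtype`, and `include_in_summary` are hypothetical names, the single entry in the set is a placeholder rather than a value from the spreadsheet, and pyarrow dtypes are detected through the `[pyarrow]` suffix that pandas uses in their string names.

```python
# Placeholder only: the real entries would be the red cells of the spreadsheet.
UNSUPPORTED_COMBINATIONS = {
    ('example_dtype', 'example_sdtype'),
}


def is_pyarrow_dtype(dtype_name):
    """Detect pyarrow-backed dtypes by name, e.g. 'int64[pyarrow]'."""
    return dtype_name.endswith('[pyarrow]')


def include_in_summary(dtype_name, sdtype, unsupported=UNSUPPORTED_COMBINATIONS):
    """Return True if the combination should count towards the summary."""
    if is_pyarrow_dtype(dtype_name):
        return False
    return (dtype_name, sdtype) not in unsupported
```

With this shape, option (1) means still running the pyarrow checks and filtering afterwards, while option (2) means applying the same filter before running anything.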

The final report should also be updated to improve readability. It should have one tab for the total results for each sdtype, as well as a summary page. Every row of the summary should be a dtype/sdtype combination and indicate whether fit worked, sample worked, the percentage of constraints that worked, and the percentage of transformers that worked.
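To make the summary layout concrete, here is a minimal sketch of how one summary row could be built from per-combination results. The function names, column names, and input structure are assumptions for illustration, and the sample values are made up rather than real benchmark output.

```python
def percent_passed(outcomes):
    """Percentage of True values in a name -> bool mapping, or None if empty."""
    if not outcomes:
        return None
    return 100.0 * sum(outcomes.values()) / len(outcomes)


def summary_row(dtype, sdtype, result):
    """Build one summary row for a dtype/sdtype combination.

    `result` is assumed to hold 'fit' and 'sample' booleans plus
    'constraints' and 'transformers' mappings of name -> bool.
    """
    return {
        'dtype': dtype,
        'sdtype': sdtype,
        'fit': result['fit'],
        'sample': result['sample'],
        'constraints_pct': percent_passed(result['constraints']),
        'transformers_pct': percent_passed(result['transformers']),
    }


# Made-up result for illustration only.
example_result = {
    'fit': True,
    'sample': True,
    'constraints': {'ConstraintA': True},
    'transformers': {'TransformerA': True, 'TransformerB': False},
}
row = summary_row('int64', 'numerical', example_result)
```

Returning `None` when no constraint or transformer applies keeps "nothing to test" distinct from "0% passed" in the report.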

Additional context

See this doc for more information.

@frances-h frances-h added feature request Request for a new feature internal The issue doesn't change the API or functionality labels Feb 11, 2025