Enhance Robustness by Handling Missing Values and Group-wise Calculation of PPR #192

Open
lshpaner opened this issue May 7, 2024 · 0 comments

Background

The Aequitas library is used for auditing bias and fairness in machine learning models. One key quantity it computes is the total number of predicted positives (k), which feeds into the calculation of further fairness metrics.

Issue

Currently, the computation of k on line 130 of group.py assumes that the predictions contain no missing values, so k is inaccurate when the data does contain them. In addition, k is calculated across all groups together (line 164), which might mask disparities in predicted positives across different demographic groups.
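
To make the two concerns concrete, here is a small toy illustration in pandas. It is not the code in group.py; the column names `group` and `score` and the data are made up for the example. It shows that a pooled count of predicted positives silently skips rows with a missing score and hides how the positives are distributed across groups.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two groups, binary scores, two missing predictions.
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b"],
    "score": [1, 1, np.nan, 0, np.nan, 1],
})

# Pooled count: NaN == 1 evaluates to False, so missing rows are skipped
# without any indication of how many predictions were absent.
pooled_k = int((df["score"] == 1).sum())

# Per-group view: predicted positives and missing scores for each group.
k_per_group = df["score"].eq(1).groupby(df["group"]).sum()
missing_per_group = df["score"].isna().groupby(df["group"]).sum()

print(pooled_k)
print(k_per_group)
print(missing_per_group)
```

The pooled number alone does not reveal that group `a` contributes twice as many predicted positives as group `b`, or that each group has an unscored row.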

Suggested Improvement

It would be beneficial to handle missing values explicitly, either by excluding them with a warning or by offering an option to impute them based on user preference. Furthermore, calculating k separately for each group and then summing these values can provide a clearer view of model behavior across different groups. This approach would enhance the transparency and utility of the fairness assessment.

Below is a proposed change in the calculation method:

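# Proposed method to calculate k group-wise and handle missing values
# (assumes `score` holds the name of the binary prediction column)
valid = df.dropna(subset=[score])  # drop only rows whose prediction is missing
k_per_group = valid.groupby('group')[score].apply(lambda s: int((s == 1).sum()))
total_k = k_per_group.sum()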
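
A slightly fuller sketch of the same idea, including the "exclude with a warning" and "impute" options mentioned above, might look like the following. The function name `predicted_positives_by_group` and its parameters (`missing`, `fill_value`) are hypothetical and not part of the Aequitas API; the sketch also assumes binary scores where 1 marks a predicted positive.

```python
import warnings

import numpy as np
import pandas as pd


def predicted_positives_by_group(df, score_col="score", group_col="group",
                                 missing="exclude", fill_value=0):
    """Count predicted positives (score == 1) per group.

    missing="exclude": drop rows with a missing score and emit a warning.
    missing="impute":  replace missing scores with `fill_value` first.
    """
    if missing not in ("exclude", "impute"):
        raise ValueError("missing must be 'exclude' or 'impute'")

    scores = df[score_col]
    n_missing = int(scores.isna().sum())
    if n_missing and missing == "exclude":
        warnings.warn(
            f"Excluding {n_missing} row(s) with a missing '{score_col}' value."
        )
        scores = scores.dropna()
    elif n_missing:
        scores = scores.fillna(fill_value)

    # Group the remaining scores by the matching rows of the group column.
    k_per_group = scores.eq(1).groupby(df.loc[scores.index, group_col]).sum()
    return k_per_group, int(k_per_group.sum())


# Hypothetical example data with one missing prediction in each group.
example = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b"],
    "score": [1, np.nan, 1, 0, np.nan, 1],
})
k_excl, total_excl = predicted_positives_by_group(example)
k_imp, total_imp = predicted_positives_by_group(
    example, missing="impute", fill_value=1
)
```

Note that with `missing="exclude"`, a group whose scores are all missing would not appear in the result at all; whether that is acceptable, and which imputation value makes sense, would be for the maintainers to decide.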