Interactive tool for evaluation #122

omri374 · 2025-02-05T13:55:03Z

User Story

As a data scientist or analyst
I want to run an end-to-end interactive evaluation process for Presidio
So that I can quickly set up, configure, and assess Presidio's performance with synthetic data and structured evaluation

Acceptance Criteria

Provide an interactive pipeline (e.g., Gradio or Streamlit app) that includes:
1. Data Generation:
  - Users can define template sentences and Faker providers to generate synthetic PII data.
  - The generated dataset is structured and stored for evaluation.
2. Presidio Configuration:
  - Users can specify configuration settings via a YAML file (aligned with AnalyzerEngineProvider).
3. Entity Mapping:
  - Users can define a mapping between dataset entities and Presidio entities for evaluation.
4. Presidio Evaluation:
  - The Evaluator object runs Presidio on each sample and compares predictions to ground truth.
5. Metrics & Plots:
  - Compute and display key evaluation metrics (per-entity precision, recall, F2-score, and overall PII detection scores).
  - Generate visualizations (e.g., confusion matrices, precision-recall curves).
6. Error Analysis:
  - Highlight common failure cases (e.g., false positives, false negatives).
  - Provide interactive filtering options to inspect misclassified samples.
Ensure the workflow is modular so users can modify each step independently.
Provide a sample dataset and default configurations to help users get started quickly.
Document the process with clear instructions and examples.
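
To make the entity mapping and metrics steps concrete, here is a minimal sketch of token-level scoring. It is not the existing Evaluator implementation. The mapping contents, the function names (`map_tags`, `f_beta`, `per_entity_scores`), and the sample tags are all hypothetical and chosen only for illustration. It assumes each sample is already tokenized, with one gold tag and one predicted tag per token and `"O"` marking non-PII tokens.

```python
from collections import Counter

# Hypothetical mapping from dataset entity names to Presidio entity names.
ENTITY_MAPPING = {"name": "PERSON", "city": "LOCATION", "phone": "PHONE_NUMBER"}


def map_tags(tags, mapping):
    """Translate dataset tags to Presidio names; unmapped tags become "O"."""
    return [mapping.get(tag, "O") for tag in tags]


def f_beta(precision, recall, beta=2.0):
    """F-beta score; beta=2 weights recall more heavily than precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def per_entity_scores(gold_tags, predicted_tags, beta=2.0):
    """Count token-level TP/FP/FN per entity and derive the metrics."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(gold_tags, predicted_tags):
        if gold == pred:
            if gold != "O":
                tp[gold] += 1
            continue
        if pred != "O":
            fp[pred] += 1  # predicted an entity that is not in the ground truth
        if gold != "O":
            fn[gold] += 1  # missed (or mislabeled) a ground-truth entity

    scores = {}
    for entity in sorted(set(tp) | set(fp) | set(fn)):
        predicted_total = tp[entity] + fp[entity]
        gold_total = tp[entity] + fn[entity]
        precision = tp[entity] / predicted_total if predicted_total else 0.0
        recall = tp[entity] / gold_total if gold_total else 0.0
        scores[entity] = {
            "precision": precision,
            "recall": recall,
            "f_beta": f_beta(precision, recall, beta),
        }
    return scores


# Illustrative sample: dataset tags are mapped before comparison.
dataset_tags = ["name", "O", "O", "city", "phone", "O"]
predicted_tags = ["PERSON", "O", "LOCATION", "LOCATION", "O", "O"]

gold_tags = map_tags(dataset_tags, ENTITY_MAPPING)
scores = per_entity_scores(gold_tags, predicted_tags)

for entity, metrics in scores.items():
    print(entity, metrics)
```

In this sample the missed `PHONE_NUMBER` token is a false negative and the extra `LOCATION` token is a false positive, which are the kinds of cases the error analysis step would surface for inspection.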
