Machine Learning Classifier Comparison Tool

The Machine Learning Classifier Comparison Tool helps benchmark and compare the performance of various machine learning classifiers on a dataset. It supports optional evaluation data, cross-validation (or none if splits = 1), and an embedded parallel-coordinates visualization of the final results.

Load the training and (optional) evaluation datasets, select the classifiers, set the cross-validation parameters, and run the experiment for a selected number of runs, tracking the best, worst, average, and standard deviation of the accuracy, F1, and recall scores.
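For intuition, a single benchmarking run boils down to something like the following scikit-learn sketch. This is not the tool's actual code: the classifier, the "class" column name, and the fold/run counts are illustrative assumptions.

```python
# Sketch of one benchmarking run; classifier, column names, and counts are placeholders.
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, f1_score, recall_score

df = pd.read_csv("train.csv")
X, y = df.drop(columns=["class"]), df["class"]

accs, f1s, recs = [], [], []
for seed in range(5):  # number of runs, each with a different seed
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    for train_idx, test_idx in cv.split(X, y):
        clf = DecisionTreeClassifier(random_state=seed)
        clf.fit(X.iloc[train_idx], y.iloc[train_idx])
        pred = clf.predict(X.iloc[test_idx])
        accs.append(accuracy_score(y.iloc[test_idx], pred))
        f1s.append(f1_score(y.iloc[test_idx], pred, average="macro"))
        recs.append(recall_score(y.iloc[test_idx], pred, average="macro"))

# The tool summarizes each metric as best, worst, average, and standard deviation.
print(f"ACC best={max(accs):.3f} worst={min(accs):.3f} "
      f"avg={np.mean(accs):.3f} std={np.std(accs):.3f}")
```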

Load Data

Select Classifiers

Select Parameters

The results are displayed in a table and can be exported to a CSV file; the value in parentheses is the difference between the training and evaluation results, used for synthetic data evaluation. ACC is the accuracy, F1 is the F1 score, and REC is the recall score.

Analyze Results

The results can be visualized in a parallel coordinates plot, with a unique color for each classifier and toggles for normalization and axis visibility.

Visualize Results
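A minimal parallel-coordinates plot of per-classifier metrics can be produced with pandas and matplotlib, as in the sketch below. The scores are placeholders, and the tool's embedded plot additionally provides normalization and axis toggles.

```python
# Minimal parallel-coordinates sketch; the scores below are placeholder values.
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import parallel_coordinates

results = pd.DataFrame({
    "Classifier": ["Decision Tree", "Random Forest", "SVM"],
    "ACC avg": [0.91, 0.95, 0.93],
    "F1 avg":  [0.90, 0.94, 0.92],
    "REC avg": [0.89, 0.95, 0.91],
})

parallel_coordinates(results, class_column="Classifier", colormap="tab10")
plt.title("Average metrics per classifier")
plt.show()
```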

Features

  • Load Main Dataset
    • Load a CSV file for training/benchmarking.
    • The tool automatically identifies the class column (the column name must contain "class"); see the sketch after this list.
  • Optional Evaluation Dataset
    • Load a second CSV for evaluation.
    • If provided, cross-validation is performed on the evaluation data (training always on the main dataset).
  • Flexible Cross-Validation
    • Set the number of folds for CV (Cross-Validation Split).
    • If set to 1, no cross-validation is performed (the entire main dataset is used for training, and either the same dataset or the evaluation dataset is used for testing).
  • Multiple Classifiers
    • Choose from a variety of popular algorithms (Decision Tree, Random Forest, SVM, KNN, Logistic Regression, AdaBoost, XGBoost, etc.).
  • Hyperparameter Editing
    • Each classifier has its own parameter panel (e.g., number of neighbors for KNN, max depth for Trees, etc.).
  • Multiple Runs
    • Specify the number of runs to repeat the experiment (with different seeds) for more robust statistics.
  • Results & Visualization
    • Best, worst, average, and standard deviation (std) for Accuracy, F1, and Recall are displayed in a results table.
    • Parallel Coordinates: click “Visualize” to see an embedded parallel coordinates plot in a separate tab.
    • Export results to CSV.
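One plausible way to implement the automatic class-column detection mentioned above is a case-insensitive substring match over the CSV header, as sketched here. This is a hypothetical helper, not the tool's exact rule.

```python
# Hypothetical helper: pick the first column whose name contains "class"
# (case-insensitive), and split the frame into features and labels.
import pandas as pd

def find_class_column(df: pd.DataFrame) -> str:
    for col in df.columns:
        if "class" in col.lower():
            return col
    raise ValueError('No column containing "class" found in the CSV header.')

df = pd.read_csv("train.csv")
label_col = find_class_column(df)
X, y = df.drop(columns=[label_col]), df[label_col]
```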

Usage

  1. Load Main File (required).
  2. Optionally load an Evaluate File if you want to test on separate data.
  3. Go to Classifiers tab, pick one or more algorithms, and set the cross-validation parameters (split, runs, seed).
  4. Go to Parameters tab to tweak each classifier’s hyperparameters.
  5. Click Run Selected Classifiers to benchmark.
  6. Check results in the Results tab.
    • Export to CSV if desired.
    • Click Visualize to see a parallel coordinates chart in the Plot tab.

Getting Started

  1. Clone the repository
  2. Run pip install -r requirements.txt
  3. Run the main.py file with python main.py or python3 main.py, depending on your Python installation.

Planned Enhancements

  • Explore further graphical summaries (e.g., box plots, bar charts).
  • Automatic hyperparameter tuning with grid or random search.

Acknowledgements

License

This project is licensed under the MIT License.
