The Machine Learning Classifier Comparison Tool helps benchmark and compare the performance of various machine learning classifiers on a dataset. It supports optional evaluation data, cross-validation (or none if splits = 1), and an embedded parallel-coordinates visualization of the final results.
Load the train and evaluation datasets, select the classifiers, set the cross-validation parameters, and run the experiment for a selected number of runs, tracking the best, worst, average, and standard deviation of the accuracy, F1, and recall scores.
The results are displayed in a table and can be exported to a CSV file; the value in parentheses is the difference between the train and the evaluation dataset scores, intended for synthetic data evaluation. ACC is the accuracy, F1 is the F1 score, and REC is the recall score.
The results can be visualized in a parallel coordinates plot with a unique color for each classifier and toggles for normalization and axis selection.
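As an illustration of what is tracked, the per-run statistics in the results table could be aggregated along these lines (a minimal sketch, not the tool's internal code; the file name `train.csv`, the `class` column name, and the Random Forest choice are assumptions for the example):

```python
# Minimal sketch: repeat a stratified CV benchmark over several seeds and
# aggregate best / worst / average / std for accuracy, F1, and recall.
# File name, column name, and classifier choice are illustrative assumptions.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate

df = pd.read_csv("train.csv")
X, y = df.drop(columns=["class"]), df["class"]

scoring = {"ACC": "accuracy", "F1": "f1_macro", "REC": "recall_macro"}
per_run = []
for seed in range(5):                                   # number of runs
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    clf = RandomForestClassifier(random_state=seed)
    scores = cross_validate(clf, X, y, cv=cv, scoring=scoring)
    per_run.append({name: scores[f"test_{name}"].mean() for name in scoring})

runs = pd.DataFrame(per_run)
print(runs.agg(["max", "min", "mean", "std"]))          # best, worst, average, std
```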
- Load Main Dataset
  - Load a CSV file for training/benchmarking.
  - The tool automatically identifies the class column (requires "class" in the column name).
- Optional Evaluation Dataset
  - Load a second CSV for evaluation.
  - If provided, cross-validation is performed on the evaluation data (training always on the main dataset).
- Flexible Cross-Validation
  - Set the number of folds for CV (Cross-Validation Split).
  - If set to 1, no cross-validation is performed: the entire main dataset is used for training, and either the same dataset or the evaluation dataset is used for testing (see the sketch after this feature list).
- Multiple Classifiers
  - Choose from a variety of popular algorithms (Decision Tree, Random Forest, SVM, KNN, Logistic Regression, AdaBoost, XGBoost, etc.).
- Hyperparameter Editing
  - Each classifier has its own parameter panel (e.g., number of neighbors for KNN, max depth for trees, etc.).
- Multiple Runs
  - Specify the number of runs to repeat the experiment (with different seeds) for more robust statistics.
- Results & Visualization
  - Best, worst, average, and standard deviation (std) of Accuracy, F1, and Recall are displayed in a results table.
  - Parallel Coordinates: click “Visualize” to see an embedded parallel coordinates plot in a separate tab (a plotting sketch is included after this list).
  - Export results to CSV.
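The class-column detection and the split = 1 behavior described above can be illustrated roughly as follows (a minimal sketch under assumed names; `find_class_column`, the file name, and the Decision Tree choice are not the tool's actual code):

```python
# Sketch of two behaviors from the feature list: the class column is the one
# whose name contains "class", and a Cross-Validation Split of 1 skips CV.
# All names here are illustrative assumptions.
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

def find_class_column(df: pd.DataFrame) -> str:
    """Return the first column whose name contains 'class' (case-insensitive)."""
    for col in df.columns:
        if "class" in col.lower():
            return col
    raise ValueError("No column containing 'class' found")

train = pd.read_csv("train.csv")
target = find_class_column(train)
X, y = train.drop(columns=[target]), train[target]

splits = 1                                   # the Cross-Validation Split setting
clf = DecisionTreeClassifier(random_state=0)

if splits == 1:
    # No CV: train on the full main dataset and test on it
    # (or on the evaluation dataset, if one was loaded).
    clf.fit(X, y)
    print("ACC:", accuracy_score(y, clf.predict(X)))
else:
    cv = StratifiedKFold(n_splits=splits, shuffle=True, random_state=0)
    accs = []
    for train_idx, test_idx in cv.split(X, y):
        clf.fit(X.iloc[train_idx], y.iloc[train_idx])
        accs.append(accuracy_score(y.iloc[test_idx], clf.predict(X.iloc[test_idx])))
    print("ACC (mean over folds):", sum(accs) / len(accs))
```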
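The embedded parallel-coordinates view can likewise be approximated with pandas' built-in plotting helper; the column names and score values below are placeholders purely for illustration:

```python
# Sketch of a parallel-coordinates plot of a results table, one color per classifier.
# The scores below are made-up placeholder values, not real benchmark results.
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import parallel_coordinates

results = pd.DataFrame({
    "Classifier": ["Decision Tree", "Random Forest", "SVM"],
    "ACC avg":    [0.91, 0.95, 0.93],
    "F1 avg":     [0.90, 0.94, 0.92],
    "REC avg":    [0.89, 0.95, 0.91],
})

parallel_coordinates(results, class_column="Classifier", colormap="tab10")
plt.ylabel("Score")
plt.title("Classifier comparison (placeholder values)")
plt.show()
```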
- Load Main File (required).
- Optionally load an Evaluate File if you want to test on separate data.
- Go to the Classifiers tab, pick one or more algorithms, and set the cross-validation parameters (split, runs, seed).
- Go to the Parameters tab to tweak each classifier’s hyperparameters.
- Click Run Selected Classifiers to benchmark.
- Check results in the Results tab.
- Export to CSV if desired.
- Click Visualize to see a parallel coordinates chart in the Plot tab.
- Clone the repository
- Run `pip install -r requirements.txt`
- Run the `main.py` file with `python main.py` or `python3 main.py`, depending on your Python installation.
- Explore further graphical summaries (e.g., box plots, bar charts).
- Automatic hyperparameter tuning with grid or random search.
- Color palette from Roman Roads Project
This project is licensed under the MIT License.