We are given a dataset containing 28x28 grayscale images. Each image is either a handwritten letter or digit. The dataset can be downloaded from here: https://drive.google.com/file/d/12OYCKGQp1VybvLM157ioLU4Bjt7PWpt-/
Format of dataset: The dataset is well-balanced and contains 47 classes, as described in the image below. (10 digits, 26 capital letters and 11 small letters)
The dataset is present as a CSV file. You’ll find two CSV files: Train-set and Test-set. You are supposed to train only using the train-set and use test-set only for calculating accuracy.
Number of samples: Train set : 112,800 (2400 images per class) Test set : 18,800 ( 400 images per class)
Each line in the csv file corresponds to 1 sample. Each line will contain 785 values. The first value in all lines indicate the label ID, and the remaining 784 values corresponds to the individual pixel values of the 28 X 28 image The ASCII value of each label ID can be found in the mapping.txt file. For example, a label ID of 10 has an ASCII value of 65, which means that it corresponds to the character ‘A’. You are supposed to use all the train samples (lines) to complete the following tasks: Note: Each task must be submitted as independent runnable codes/Notebooks in a single GitHub repo. (i.e Don’t squeeze in all tasks in a single file/Model).
Given an image, you must be able to classify whether the image is a letter or a digit. Expected outcome: You are expected to use a ML-based model (like CNNs, etc.) to solve the problem with a reasonably high accuracy.
Given an image, you are supposed to design model(s) which does the following:
- If the image is a letter, you are supposed to predict if it is a vowel or consonant.
- If the image is a digit, you are supposed to predict if it is an even or odd number.
You are supposed to use only ML models that directly predicts the above, instead of doing manual predictions like using modulus operator on top of digit predictions.
Expected outcome: Given an image, your end-to-end setup must print whether it is a letter or digit, and based on that, it must automatically run the corresponding model to print if it is vowel/consonant or even/odd respectively.
Given an image, you are supposed to predict what digit or letter the image contains. That is, you will be doing a classification task for 47 classes.
Expected outcome: Given an image, you have to print what character it is (just using a single model). Also, report the class-wise accuracy if possible.
I trained a basic model which classifies all 47 charecters. Then I used this pretrained model in Task 1 and Task 2. In Task 1 and 2, I used Decision Tree to classify Number, Letter and Odd, Even, Vowel, Const. In Task 3, I used 10 CNNs to classify charecters accurately.
Procedure in Task 1 and 2:-
- Loaded data and visualized it.
- Changed the input array from 784 to 28 X 28 X 1 and then divided by 255 to normalized it.
- build the CNN model and to classify the charecters.
- Ran a Decision Tree model on the output of CNN.
- Visualized and saved the output.
Procedure of Task 3:-
- Loaded data and visualized it.
- Changed the input array from 784 to 28 X 28 X 1 and then divided by 255 to normalized it.
- build 10 CNN models and to classify the charecters.
- Combine their results.
- Visualized and saved the output.
- Normalization
- CNN
- ANN
- XGBoost
- Weights Regularizers
- Batch Normalization
- Data Augmentation
- Pruning
- Stacked models
- RandomCV
- Normalization
- CNN
- Batch Normalization
- Data Augmentation
- Pruning
- Stacked models
Models are stored in model folder.
- dataset_distribution - Dividing dataset into train validate test
- one_hot_encoding - One hot encoding of Labels
- de_encoding - Decoding of Labels
- change_to_image - Change matrix from (784,) to (28,28,1)
- create_download_link - creating downloading link
- acc - printing accuracy, classification report
- labelToDigitLetters - changing labels
- labelToOddeven_Vowelcharecter - changing labels
- It returns the CNN Model