Forest-Cover-Classification

The objective of this project is to build a deep learning model that predicts the forest cover type from cartographic variables.

  • Forest Cover Types: ['Spruce/Fir', 'Lodgepole Pine', 'Ponderosa Pine', 'Cottonwood/Willow', 'Aspen', 'Douglas-fir', 'Krummholz']
  • The dataset ('cover_data.csv') contains 581,012 observations. Each observation has 55 columns (54 features, with the last column being the class label).
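
A minimal sketch of loading and splitting the data, assuming the column layout described above; the test-set ratio and the label shift to 0-6 are illustrative assumptions, not confirmed details of the project:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("cover_data.csv")

features = df.iloc[:, :-1]   # 54 cartographic feature columns
labels = df.iloc[:, -1] - 1  # assumes raw labels 1-7; shift to 0-6 for sparse_categorical_crossentropy

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, stratify=labels, random_state=42
)
```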

Model Test Accuracy: 91.5432 %
Model Test Loss: 0.2159

Model accuracy and loss curves were produced using the sparse_categorical_crossentropy loss function and the Adam optimizer.
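
For context, here is a minimal sketch of a model trained with that loss and optimizer, continuing from the loading sketch above; the layer sizes, epoch count, and batch size are illustrative assumptions, not the project's exact configuration:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(54,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dense(7, activation="softmax"),   # one output per cover type
])

model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",  # integer labels, no one-hot encoding needed
    metrics=["accuracy"],
)

history = model.fit(X_train, y_train, epochs=20, batch_size=256, validation_split=0.1)
test_loss, test_acc = model.evaluate(X_test, y_test, verbose=0)
print(f"Test accuracy: {test_acc:.4f}, test loss: {test_loss:.4f}")
```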

From the confusion-matrix heatmap (a sketch for reproducing it is shown below), we see that Lodgepole Pine, Cottonwood/Willow, Aspen, and Douglas-fir suffer from a high percentage of misclassifications. To investigate the possible causes, one can explore the items listed under Future Improvements.
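
A sketch for reproducing the heatmap, assuming `model`, `X_test`, and `y_test` from the sketches above:

```python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

cover_types = ['Spruce/Fir', 'Lodgepole Pine', 'Ponderosa Pine',
               'Cottonwood/Willow', 'Aspen', 'Douglas-fir', 'Krummholz']

y_pred = np.argmax(model.predict(X_test), axis=1)
cm = confusion_matrix(y_test, y_pred, normalize="true")  # row-normalised rates per true class

sns.heatmap(cm, annot=True, fmt=".2f",
            xticklabels=cover_types, yticklabels=cover_types, cmap="Blues")
plt.xlabel("Predicted cover type")
plt.ylabel("True cover type")
plt.show()
```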

Future Improvements

  • Check the proportion of observations for each cover type; imbalances in the dataset will affect classification (a sketch for this check follows the list).
  • Examine how the observations are distributed across each wilderness area.
  • Find the similarities among different cover types (correlation, scatter plots, etc.). These similarities might be one reason the model is tripping up. There are ways to address this; one is to carefully remove collinear variables, keeping only one from each correlated group (see the correlation sketch after this list).
  • Remove noise, inconsistent data, and errors from the training data; this should be done carefully with domain experts.
  • Use performance metrics other than accuracy, which stops being reliable when the data is imbalanced; that is why metrics such as precision/recall and the F1-score exist (see the classification-report sketch after this list).
  • Try resampling the data (undersampling or oversampling appropriately, or stratified sampling). Downsampling could be done with thresholding.
  • The most important thing to understand here is that in deep learning the gradients of the majority classes dominate and drive the weight updates. Class weighting (first sketch below) and other more advanced techniques can ameliorate this.
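
A minimal sketch of the class-balance check and of class weighting during training; it assumes `labels`, `X_train`, `y_train`, and `model` from the sketches above, and the "balanced" weight heuristic is an illustrative choice:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Proportion of observations per cover type (first bullet above).
print(labels.value_counts(normalize=True).sort_index())

# Per-class weights so majority-class gradients do not dominate the updates (last bullet above).
weights = compute_class_weight(class_weight="balanced",
                               classes=np.unique(y_train), y=y_train)
model.fit(X_train, y_train, epochs=20, batch_size=256,
          class_weight=dict(enumerate(weights)))
```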
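A sketch of the per-class metrics, assuming `y_test`, `y_pred`, and `cover_types` from the confusion-matrix sketch:

```python
from sklearn.metrics import classification_report

# Per-class precision, recall, and F1 give a clearer picture than overall accuracy.
print(classification_report(y_test, y_pred, target_names=cover_types))
```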
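One possible way to surface collinear features, assuming the `features` DataFrame from the loading sketch; the 0.9 threshold is an arbitrary illustrative cut-off:

```python
# Flag highly correlated feature pairs as candidates for removal.
corr = features.corr().abs()
pairs = [(a, b, round(corr.loc[a, b], 3))
         for i, a in enumerate(corr.columns)
         for b in corr.columns[i + 1:]
         if corr.loc[a, b] > 0.9]
print(pairs)
```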
