Geo Guesser

About

This is a deep learning project that predicts where an image was taken. It uses a Convolutional Neural Network to analyze visual features and distinguish between locations based on their distinctive architectural, environmental and infrastructural elements.

The project currently focuses on five major capitals (you can set custom cities and locations):

  • Budapest
  • Ottawa
  • Tokyo
  • Cairo
  • Canberra

The model performs two key tasks:

  1. City Classification: Assigns an image to one of the pre-defined city categories using a softmax-activated output layer.
  2. Coordinate Regression: Estimates latitude and longitude values via a linear-activated output layer.
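As a rough illustration of how the two outputs fit together, the sketch below decodes one hypothetical prediction: a softmax over five city scores plus a raw latitude/longitude pair. The city order, the `decode` helper and the sample numbers are assumptions for illustration; the project may order labels or scale coordinates differently.

```python
import math

# City order is an assumption; it must match the label order used in training.
CITIES = ["Budapest", "Ottawa", "Tokyo", "Cairo", "Canberra"]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(city_logits, coords):
    """Combine both heads into one prediction dictionary (hypothetical helper)."""
    probs = softmax(city_logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    lat, lon = coords  # linear head: values are used as-is
    return {
        "city": CITIES[best],
        "confidence": probs[best],
        "latitude": lat,
        "longitude": lon,
    }


# Made-up numbers for illustration, not real model output.
print(decode([0.2, -1.0, 3.1, 0.4, -0.3], (35.68, 139.77)))
```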

This project is divided into three main components:

  • Backend: A FastAPI server that connects the model with the web interface.
  • Model: The EfficientNetV2S-based neural network handling both classification and regression.
  • Frontend: A web application built with Next.js, TypeScript and TailwindCSS to easily interact with the model.

Figure: The model's architecture

Features

Data Handling:

  • Uses the Mapillary API to collect street-level images and their metadata.
  • Speeds up data gathering through concurrent API requests.
  • Preprocesses images (resizing to 224x224 pixels) for efficient training.
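Concurrent collection of this kind is often done with a thread pool, since the work is dominated by waiting on the network. The sketch below shows only the pattern: `fetch_image_metadata` is a hypothetical stand-in for a real Mapillary request, and the tile tuples and worker count are assumptions, not values taken from mapillary_collection.py.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_image_metadata(tile):
    """Hypothetical stand-in for one Mapillary API request.

    A real version would issue an HTTP request for the given area and
    return the image records found there.
    """
    city, index = tile
    return {"city": city, "tile": index, "images": []}


def collect(tiles, max_workers=8):
    """Run one request per tile concurrently and gather the results."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch_image_metadata, t): t for t in tiles}
        for future in as_completed(futures):
            try:
                results.append(future.result())
            except Exception as exc:  # keep going if one request fails
                print(f"request for {futures[future]} failed: {exc}")
    return results


tiles = [(city, i) for city in ("Budapest", "Tokyo") for i in range(3)]
print(len(collect(tiles)))
```

Results arrive in completion order rather than submission order, so each record carries its own city and tile index.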

Deep Learning Model:

  • Uses the pre-trained EfficientNetV2S network with custom upper layers.
  • Employs transfer learning: the EfficientNetV2S base is frozen, and only the custom layers are trained with the Adam optimizer.
  • Uses Grad-CAM to produce heatmaps on request that reveal which image regions influence the model's decisions.
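A two-headed network on a frozen base can be sketched in Keras roughly as follows. This is not the project's actual architecture: the hidden layer size, the loss functions and the `build_model` helper are assumptions, and `weights=None` is used so the snippet runs without downloading anything (pass `weights="imagenet"` for the pre-trained base).

```python
import tensorflow as tf

NUM_CITIES = 5


def build_model(weights=None):
    """Two-head network on a frozen EfficientNetV2S base (illustrative sketch)."""
    base = tf.keras.applications.EfficientNetV2S(
        include_top=False,
        weights=weights,
        input_shape=(224, 224, 3),
        pooling="avg",
    )
    base.trainable = False  # freeze the base; only the new layers train

    inputs = tf.keras.Input(shape=(224, 224, 3))
    features = base(inputs, training=False)  # keep BatchNorm in inference mode
    x = tf.keras.layers.Dense(256, activation="relu")(features)  # size is a guess
    city = tf.keras.layers.Dense(NUM_CITIES, activation="softmax", name="city")(x)
    coords = tf.keras.layers.Dense(2, activation="linear", name="coords")(x)

    model = tf.keras.Model(inputs, [city, coords])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(),
        # Loss choices are assumptions: integer city labels, squared error on coordinates.
        loss={"city": "sparse_categorical_crossentropy", "coords": "mse"},
        metrics={"city": "accuracy"},
    )
    return model


model = build_model()
print(len(model.trainable_weights))  # kernels and biases of the three Dense layers
```

Because the base is frozen, only the weights of the added Dense layers are updated during training.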

Figure: Heatmap from the model
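For readers unfamiliar with Grad-CAM, the core combination step is small enough to show in plain Python: each feature-map channel is weighted by the spatial average of its gradients, the weighted channels are summed, and negative values are clipped. The sketch below uses a toy 2x2 example with made-up numbers; in practice the activations and gradients come from the last convolutional layer, and the project's own implementation may differ.

```python
def grad_cam(activations, gradients):
    """Combine feature maps into a heatmap, Grad-CAM style.

    activations, gradients: lists of K channels, each an H x W grid (list of rows).
    gradients[k][i][j] is d(score) / d(activations[k][i][j]).
    """
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weight = spatial average of that channel's gradients.
    weights = [sum(map(sum, g)) / (height * width) for g in gradients]
    # Weighted sum of channels, then ReLU to keep only positive evidence.
    heatmap = [
        [
            max(0.0, sum(w * a[i][j] for w, a in zip(weights, activations)))
            for j in range(width)
        ]
        for i in range(height)
    ]
    peak = max(map(max, heatmap))
    if peak > 0:  # scale to [0, 1] for display
        heatmap = [[v / peak for v in row] for row in heatmap]
    return heatmap


# Toy 2-channel, 2x2 example with made-up numbers.
acts = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
print(grad_cam(acts, grads))
```

The resulting grid is then typically resized to the input image size and overlaid as a color map.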

Training and Evaluation:

  • Achieves approximately 83% accuracy in city classification.
  • Saves best_location_model.keras, the checkpoint with the best validation coordinate accuracy during training.
  • Saves best_overall_model.keras, the checkpoint with the lowest validation loss during training.
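This README does not state how coordinate accuracy is measured. One common way to judge a latitude/longitude prediction is the great-circle distance to the true location; the sketch below computes it with the haversine formula. The function names and the choice of metric are assumptions, not the project's evaluation code.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def mean_error_km(predicted, actual):
    """Average distance between predicted and true (lat, lon) pairs."""
    pairs = list(zip(predicted, actual))
    return sum(haversine_km(*p, *t) for p, t in pairs) / len(pairs)


# Made-up prediction near Budapest, compared with an approximate city-centre point.
print(mean_error_km([(47.50, 19.05)], [(47.4979, 19.0402)]))
```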

Frontend UI:

  • Developed with Next.js, TypeScript and TailwindCSS.
  • Provides an easy-to-use interface for interacting with the model.
  • Accessible online at Location Guesser.

Requirements

Prerequisites

  1. Python 3.10 or higher: Install from python.org.
  2. pip: Python package manager (comes with Python installations).
  3. CUDA Toolkit 12.8 (optional)
  4. cuDNN 9.7.1 (optional)

Python Dependencies

Install the required Python packages from requirements.txt found in the root folder.

pip install -r requirements.txt

Installation

  1. Clone the repository
 git clone https://github.com/markbakos/geo-guesser.git
 cd geo-guesser
  2. Set up environment variables
  • In the root folder (geo-guesser), in your .env file:
MAPILLARY_KEY=[Your API key]
  3. Prepare the dataset
  • Set your desired locations to gather data from, or keep the original 5.
  • Collect images using mapillary_collection.py
  4. Use the trained model
  • Predict from the console:
python -m predict path/to/saved/image --generate_heatmap
  • Start the backend server:
uvicorn server:app
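The collection step expects MAPILLARY_KEY to be available. As a minimal sketch of how a script could read it using only the standard library, assuming simple KEY=value lines in .env: the helper names here are hypothetical, and the project itself may rely on a different loader such as python-dotenv.

```python
import os
from pathlib import Path


def load_env(path=".env"):
    """Parse simple KEY=value lines; skip blanks and # comments."""
    values = {}
    env_file = Path(path)
    if not env_file.exists():
        return values
    for line in env_file.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_mapillary_key(path=".env"):
    """Prefer a real environment variable, then fall back to the .env file."""
    key = os.environ.get("MAPILLARY_KEY") or load_env(path).get("MAPILLARY_KEY")
    if not key:
        raise RuntimeError("MAPILLARY_KEY is not set; add it to .env")
    return key
```

Failing early with a clear message avoids sending unauthenticated requests during data collection.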

Contributing

Feel free to fork this repository, make changes, and submit a pull request.

📧 Contact

For any inquiries, feel free to reach out:

Email: markbakosss@gmail.com
GitHub: markbakos