This is a Flask-based web application that allows users to search for datasets on Kaggle and download them directly. The app provides a user-friendly interface to interact with Kaggle's dataset repository, making it easier to find and retrieve datasets.
- Search for Datasets: Enter a search term to find datasets hosted on Kaggle.
- View Search Results: See a list of datasets that match your query, along with their titles.
- Download Datasets: Download selected datasets as `.zip` files directly from the application.
- `App.py`
  - The main Flask application that handles routes and renders templates.
  - Routes:
    - `/`: Home page.
    - `/search`: Search for datasets.
    - `/download/<path:dataset_ref>`: Download the selected dataset.
- `kaggle_connect.py`
  - Handles interaction with the Kaggle API.
  - Functions:
    - `search_datasets(search_term)`: Searches Kaggle for datasets matching the provided term.
    - `download_dataset(dataset_ref)`: Downloads a dataset by its reference.
- `index.html`
  - Home page with a welcome message and a link to the search page.
- `search.html`
  - A form to input a search term for finding datasets.
- `search_results.html`
  - Displays the search results and provides download links for each dataset.
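
The two functions listed for `kaggle_connect.py` could look roughly like the sketch below. This is not the project's actual code: it assumes the official `kaggle` Python package, and the `api` and `target_dir` parameters, the `get_api` helper, and the returned path are hypothetical additions that let the functions run against a stand-in client.

```python
from pathlib import Path


def get_api():
    """Return an authenticated Kaggle client (hypothetical helper).

    The import is deferred because the kaggle package looks for
    credentials as soon as it is imported.
    """
    from kaggle.api.kaggle_api_extended import KaggleApi

    api = KaggleApi()
    api.authenticate()
    return api


def search_datasets(search_term, api=None):
    """Return (ref, title) pairs for datasets matching search_term."""
    api = api or get_api()
    results = api.dataset_list(search=search_term)
    return [(str(ds.ref), str(ds.title)) for ds in results]


def download_dataset(dataset_ref, api=None, target_dir="dataset"):
    """Download dataset_ref (for example "owner/name") as a .zip into target_dir."""
    api = api or get_api()
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    api.dataset_download_files(dataset_ref, path=str(target), unzip=False)
    # Assumption: the client names the archive after the dataset slug.
    return target / (dataset_ref.split("/")[-1] + ".zip")
```

Passing the client in as an argument keeps the functions usable without network access or credentials, which is convenient when checking the Flask routes in isolation.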
- Python 3.7+
- Kaggle API credentials (download your `kaggle.json` from Kaggle and place it in `~/.kaggle/` or the project root).
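
Before starting the app, it can help to confirm that a credentials file is present in one of the two locations named above. The following is a minimal sketch using only the standard library; the function names and the `home` parameter are hypothetical and not part of the project. It checks only that the file holds non-empty `username` and `key` fields, not that Kaggle accepts them.

```python
import json
from pathlib import Path


def find_kaggle_credentials(project_root=".", home=None):
    """Return the first kaggle.json found in ~/.kaggle/ or project_root, else None."""
    home = Path(home) if home is not None else Path.home()
    candidates = [
        home / ".kaggle" / "kaggle.json",
        Path(project_root) / "kaggle.json",
    ]
    for path in candidates:
        if path.is_file():
            return path
    return None


def credentials_look_valid(path):
    """Return True if the file is a JSON object with non-empty 'username' and 'key'."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return False
    return isinstance(data, dict) and bool(data.get("username")) and bool(data.get("key"))


if __name__ == "__main__":
    found = find_kaggle_credentials()
    if found is None:
        print("No kaggle.json found")
    else:
        print(found, "valid" if credentials_look_valid(found) else "incomplete")
```

Note that the Kaggle client by default reads `~/.kaggle/kaggle.json` (or the directory named by the `KAGGLE_CONFIG_DIR` environment variable); how this app picks up a copy in the project root is not shown here.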
- Clone the repository:

      git clone https://github.com/yourusername/dataset-search-download.git
      cd dataset-search-download

- Install the required dependencies:

      pip install -r requirements.txt

- Set up Kaggle API credentials:
  - Place your `kaggle.json` file in the `~/.kaggle/` directory or in the root of the project.
- Run the Flask application:

      python App.py

- Open your web browser and navigate to http://127.0.0.1:5000/
- Use the application to search for datasets, view results, and download datasets.
- After opening http://127.0.0.1:5000/, you should see the home page. Click the "Search for Datasets" button.
- On the search page, enter the term you want to find. This example uses "College".
- A list of matching datasets appears. Select one; this example selects the first result, "College Basketball Dataset".
- The system saves the .zip file of the selected dataset to the "Downloads" folder.
- After the download, you can return to the home page to download another dataset.
- Finally, the home page shows a short message naming the last .zip file you downloaded.
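
The walkthrough above maps onto the three routes of the app. The sketch below shows one way that flow could be wired in Flask. It is an illustration under assumptions, not the contents of `App.py`: the `create_app` factory, the injected `search_fn` and `download_fn` callables, the `search_term` form field, the `last` query parameter, and the inline HTML (used instead of the real templates) are all hypothetical.

```python
from flask import Flask, redirect, request, url_for
from markupsafe import escape


def create_app(search_fn, download_fn):
    """Build the app around injected search and download callables."""
    app = Flask(__name__)

    @app.route("/")
    def index():
        # Show the last downloaded file, if the download route passed one along.
        last = request.args.get("last")
        note = f" Last download: {escape(last)}" if last else ""
        link = url_for("search")
        return f'Welcome. <a href="{link}">Search for Datasets</a>{note}'

    @app.route("/search", methods=["GET", "POST"])
    def search():
        if request.method == "POST":
            term = request.form.get("search_term", "")
            rows = search_fn(term)  # expected: iterable of (ref, title) pairs
            items = "".join(
                "<li>{} <a href=\"{}\">Download</a></li>".format(
                    escape(title), url_for("download", dataset_ref=ref)
                )
                for ref, title in rows
            )
            return f"<ul>{items}</ul>"
        return '<form method="post"><input name="search_term"><button>Search</button></form>'

    @app.route("/download/<path:dataset_ref>")
    def download(dataset_ref):
        # The path converter keeps the slash in references like "owner/name".
        saved = download_fn(dataset_ref)
        return redirect(url_for("index", last=str(saved)))

    return app
```

With this shape, a real run would pass in the functions from `kaggle_connect.py`, for example `create_app(search_datasets, download_dataset).run()`.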
.
├── App.py                  # Main Flask application
├── kaggle_connect.py       # Kaggle API integration
├── templates/              # HTML templates
│   ├── index.html
│   ├── search.html
│   └── search_results.html
├── dataset/                # Directory for downloaded datasets
└── README.md               # Project documentation
- Ensure that the Kaggle API is properly authenticated to use this application.
- The downloaded datasets are saved in the `dataset/` directory as `.zip` files.
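
To check what has been downloaded without unpacking anything, the archives in `dataset/` can be inspected with the standard library. This is a small sketch; the function name is hypothetical and the default directory is taken from the note above.

```python
import zipfile
from pathlib import Path


def list_downloaded_archives(directory="dataset"):
    """Map each .zip file name in directory to the member names it contains."""
    contents = {}
    for archive in sorted(Path(directory).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            contents[archive.name] = zf.namelist()
    return contents


if __name__ == "__main__":
    for name, members in list_downloaded_archives().items():
        print(name, members)
```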