Data Profiler

Data Profiler is a web application built with Streamlit for analyzing and visualizing datasets. Upload your data in .csv or .xlsx format and generate a profiling report that helps you detect anomalies, patterns, and trends in your data.

Features

  • Automated Data Analysis: Quickly generate detailed profiling reports by uploading your dataset.
  • Customizable Reports: Choose a display mode (Primary, Dark, or Orange) and either a minimal or a full report.
  • Support for Multiple Formats: Upload .csv or .xlsx files (up to 10 MB) for analysis.
  • Interactive UI: Easy-to-use interface with options to select specific sheets for .xlsx files.
  • Downloadable Reports: Save the profiling report as an HTML file for offline analysis.
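The format and size limits above can be sketched as a small validation helper. This is an illustrative sketch only, not the code in app.py: the function name `check_upload`, the constants, and the reading of "10 MB" as 10 × 1024 × 1024 bytes are all assumptions.

```python
from pathlib import Path

# Assumed constants mirroring the limits stated in the feature list;
# the actual application may define or enforce them differently.
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # "10 MB" interpreted as 10 MiB
ALLOWED_SUFFIXES = {".csv", ".xlsx"}


def check_upload(filename: str, size_bytes: int) -> str:
    """Return the lowercase file suffix if the upload is acceptable.

    Raises ValueError for unsupported file types or oversized files.
    """
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"Unsupported file type: {suffix or 'none'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("File exceeds the 10 MB limit")
    return suffix
```

The returned suffix could then be used to decide whether to offer sheet selection, which only applies to .xlsx files.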

Installation

To run the Data Profiler application on your local machine, follow the steps below:

1. Clone the Repository

git clone https://github.com/srinibas-masanta/data-profiler.git
cd data-profiler

2. Set Up a Virtual Environment

Create a virtual environment to isolate dependencies, then activate it using the command that matches your operating system.

python -m venv dataprofile
.\dataprofile\Scripts\activate  # On Windows
source dataprofile/bin/activate  # On macOS/Linux

3. Install Dependencies

Install the required Python packages listed in the requirements.txt file.

pip install -r requirements.txt

Alternatively, manually install the necessary packages:

pip install numpy pandas scipy matplotlib streamlit ydata-profiling streamlit-pandas-profiling openpyxl xlrd
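After installing, a quick way to confirm that the packages are importable is a short check like the one below. It is a sketch under assumptions: the helper `missing_modules` is hypothetical, and the import names (for example `ydata_profiling` for ydata-profiling and `streamlit_pandas_profiling` for streamlit-pandas-profiling) are assumed to correspond to the pip packages listed above.

```python
from importlib.util import find_spec

# Import names assumed to correspond to the pip packages listed above.
REQUIRED_MODULES = [
    "numpy", "pandas", "scipy", "matplotlib", "streamlit",
    "ydata_profiling", "streamlit_pandas_profiling", "openpyxl", "xlrd",
]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing:", ", ".join(missing) if missing else "none")
```

Run it inside the activated virtual environment; any name it reports can be installed with pip before starting the app.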

4. Run the Application

Start the Streamlit application by running the following command:

streamlit run app.py

Usage

Once the application is running, follow these steps:

  1. Upload Your Data: Use the sidebar to upload a .csv or .xlsx file (up to 10 MB).
  2. Select Options: Choose the report mode (Primary, Dark, Orange), and decide if you want a minimal report or a full report.
  3. Generate Report: Click to generate the report, which will be displayed within the app.
  4. Download Report (Optional): If desired, save the report as an HTML file using the download button.
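The options in step 2 could map onto ydata-profiling settings roughly as follows. This is a minimal sketch, not the logic in app.py: `report_options` and `build_report_html` are hypothetical helpers, and the mapping of the Dark and Orange modes to the `dark_mode` and `orange_mode` keyword shorthands of `ProfileReport` is an assumption about how the app is wired.

```python
def report_options(mode: str, minimal: bool) -> dict:
    """Translate the UI choices into keyword arguments for ProfileReport.

    Assumes "Primary" is the library's default theme and needs no extra flag.
    """
    options = {"minimal": minimal}
    if mode == "Dark":
        options["dark_mode"] = True
    elif mode == "Orange":
        options["orange_mode"] = True
    elif mode != "Primary":
        raise ValueError(f"Unknown report mode: {mode}")
    return options


def build_report_html(df, mode: str = "Primary", minimal: bool = False) -> str:
    """Generate the profiling report for a DataFrame and return it as HTML."""
    # Imported lazily so the option mapping can be used without the library.
    from ydata_profiling import ProfileReport

    profile = ProfileReport(df, **report_options(mode, minimal))
    return profile.to_html()
```

The HTML string returned by `build_report_html` is the kind of value that could be handed to a Streamlit download button for step 4.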

Project Structure

  • app.py: Main script containing the Streamlit application logic.
  • media/DP Logo.jpg: Logo shown on the welcome page of the application.
  • requirements.txt: List of all the Python dependencies required to run the application.

License

This project is licensed under the MIT License - see the LICENSE.txt file for details.