
Analysis of public reviews of housing providers in Manchester, using Natural Language Processing AI



Trustpilot Housing Reviews Analysis

License: CC0-1.0



The Trustpilot Housing Reviews Analysis project aims to extract and analyze Trustpilot reviews for various housing providers. The project consists of two main components:

  1. Data Extraction: Automates the retrieval of Trustpilot reviews for specified housing providers.
  2. Classification: Processes and classifies the extracted reviews to identify key issues and sentiment trends.

This analysis helps in understanding tenant satisfaction, common complaints, and areas requiring improvement for housing providers.

  • Funding: This project was funded by a Campion Grant awarded by the Manchester Statistical Society.
  • Report: A link to the report accompanying this project will be added here once it is published (MSS REPORT LINK WHEN PUBLISHED).


  • Automated Data Extraction: Scrapes Trustpilot reviews for selected housing providers.
  • Data Cleaning and Preprocessing: Cleans the extracted data for accurate analysis.
  • Text Classification: Categorizes reviews into predefined categories (e.g., Maintenance, Customer Service).
  • Reporting: Generates summary reports and visualizations of findings.
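The cleaning step listed above can be pictured with a minimal sketch. It is not the notebooks' actual code: the function names `clean_review` and `clean_reviews` are hypothetical, and the specific choices (unescaping HTML entities, collapsing whitespace, lowercasing, dropping exact duplicates) are assumptions about what "cleaning" might involve for scraped review text.

```python
import html
import re
from typing import Iterable, List


def clean_review(text: str) -> str:
    """Unescape HTML entities, collapse whitespace, and lowercase one review."""
    text = html.unescape(text)
    text = re.sub(r"\s+", " ", text)
    return text.strip().lower()


def clean_reviews(texts: Iterable[str]) -> List[str]:
    """Clean each review, dropping empty strings and exact duplicates.

    The original order of first occurrences is preserved.
    """
    seen = set()
    cleaned = []
    for raw in texts:
        text = clean_review(raw)
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned
```

Lowercasing and deduplicating before classification keeps keyword matching case-insensitive and stops a review that was scraped twice from being counted twice.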

Project Structure

├── Housing Association Review Classification and Theme Visualization.ipynb  #  Notebook for classifying reviews and visualizing themes in housing association data.
├── Keyword Analysis 1 Star Reviews.ipynb  #  Notebook for analyzing keywords within 1 star housing association reviews.
├── LICENSE  #  Project license file.
├──  #  Repository overview, setup instructions, and usage guidelines.
├── Themes2D.xlsx  #  Excel file containing theme data, with keywords for HACT UK Data Standards classes, for visualization and further analysis.
├── Trustpilot Review Single Page Extractor.ipynb  #  Notebook for scraping reviews from a single Trustpilot page.
└── Trustpilot Review Extraction Compilation.ipynb  #  Notebook for systematically extracting across multiple Trustpilot pages.
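
As a rough illustration of the kind of keyword analysis the 1 star notebook is described as performing, the sketch below counts the most frequent words in 1-star reviews. Everything here is an assumption made for illustration: the `rating` and `text` field names, the tiny `STOPWORDS` set, and the `top_keywords` helper are hypothetical and are not taken from the notebook.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Tiny illustrative stopword list (assumption); a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "and", "to", "of", "is", "was", "in", "for", "my", "i", "it"}


def top_keywords(reviews: Iterable[Dict[str, object]], n: int = 10) -> List[Tuple[str, int]]:
    """Return the n most frequent non-stopword tokens across 1-star reviews."""
    counts: Counter = Counter()
    for review in reviews:
        if review["rating"] != 1:
            continue  # only 1-star reviews are analysed
        tokens = re.findall(r"[a-z']+", str(review["text"]).lower())
        counts.update(token for token in tokens if token not in STOPWORDS)
    return counts.most_common(n)
```

Calling `top_keywords(reviews, n=20)` on a list of review dictionaries would give a quick shortlist of candidate complaint terms.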



  • Python 3.8+: Ensure Python is installed. It can be downloaded from python.org.

Clone the Repository

git clone
cd trustpilot-housing-reviews-analysis


Data Extraction

The data extraction component scrapes Trustpilot for reviews related to specified housing providers.

Configure Housing Providers

Edit the file to specify the housing providers you want to analyze.

# Example
housing_providers = [
    # list the housing providers to analyze here
]

Run the Data Extraction Tool

You can run the data extraction tool via the provided Jupyter notebooks.

Using Jupyter Notebook:

Open Trustpilot Review Extraction Compilation.ipynb and run the cells sequentially.
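
To make the idea of extracting across multiple Trustpilot pages concrete, here is a minimal sketch of a pagination loop. It is not the notebook's code. The URL pattern in `BASE_URL`, the `fetch_page` callable, and the `max_pages` limit are all assumptions; in practice `fetch_page` would download a page (for example with Requests) and parse its reviews (for example with BeautifulSoup4), returning an empty list once no reviews remain.

```python
from typing import Callable, Dict, List

# Assumed URL pattern for a provider's review pages; check it against the live site.
BASE_URL = "https://www.trustpilot.com/review/{provider}?page={page}"


def page_url(provider: str, page: int) -> str:
    """Build the URL of one review page for a provider (assumed pattern)."""
    return BASE_URL.format(provider=provider, page=page)


def collect_reviews(
    provider: str,
    fetch_page: Callable[[str], List[Dict[str, str]]],
    max_pages: int = 50,
) -> List[Dict[str, str]]:
    """Walk pages 1..max_pages, stopping at the first page with no reviews."""
    reviews: List[Dict[str, str]] = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(page_url(provider, page))
        if not batch:
            break  # an empty page is treated as the end of the listing
        for review in batch:
            # Tag every review with the provider it belongs to.
            reviews.append({"provider": provider, **review})
    return reviews
```

Passing the fetcher in as an argument keeps the loop testable without network access, and `max_pages` caps the number of requests sent to the site.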


Classification

The classification component processes the extracted reviews and categorizes them based on predefined criteria.

Using Jupyter Notebook:

Open Housing Association Review Classification and Theme Visualization.ipynb and run the cells sequentially.
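
One simple way to picture categorizing reviews against predefined criteria is keyword matching, sketched below. This is plain substring matching, not the scikit-learn based approach suggested by the dependency list and not necessarily what the notebook does. The theme names echo the example categories mentioned earlier, while the keyword lists and the `classify_review` helper are hypothetical. The real keywords live in Themes2D.xlsx and are not reproduced here.

```python
from typing import Dict, List

# Hypothetical themes and keywords, for illustration only.
THEME_KEYWORDS: Dict[str, List[str]] = {
    "Maintenance": ["repair", "leak", "boiler", "damp"],
    "Customer Service": ["rude", "phone", "email", "response"],
}


def classify_review(
    text: str, theme_keywords: Dict[str, List[str]] = THEME_KEYWORDS
) -> List[str]:
    """Return every theme with at least one keyword occurring in the review."""
    lowered = text.lower()
    return [
        theme
        for theme, keywords in theme_keywords.items()
        if any(keyword in lowered for keyword in keywords)
    ]
```

Because matching is by substring, "repair" also matches "repairs" and "repaired". A review can receive several themes, or none if no keyword appears.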


Required Python packages are:

  • Requests: HTTP library for fetching review pages.
  • BeautifulSoup4: HTML parsing for web scraping.
  • pandas: Data manipulation and analysis.
  • scikit-learn: Machine learning for classification.
  • Matplotlib: Data visualization.
  • Seaborn: Data visualization.
  • Jupyter Notebook: Interactive development.


Contributions are welcome! Please follow these steps:

  1. Fork the Repository

  2. Create a Feature Branch

    git checkout -b feature/YourFeature
  3. Commit Your Changes

    git commit -m "Add some feature"
  4. Push to the Branch

    git push origin feature/YourFeature
  5. Open a Pull Request


This project is licensed under CC0-1.0. See the LICENSE file for details.


For any questions or suggestions, please open an issue.

Disclaimer: This project is not affiliated with Trustpilot or any of the housing providers mentioned. It is intended for educational and analytical purposes only.

