
This exploratory data analysis (EDA) examines a dataset of hotel bookings to uncover insights into customer behaviour and market trends. Using statistical analysis and a variety of data visualization techniques, it builds an understanding of the factors influencing hotel booking decisions.


Debarpan200/EDA-Hotel-Booking-Analysis


Booking.com: Hotel Booking Analysis - EDA

Objective

To conduct a comprehensive exploratory data analysis (EDA) on the provided hotel booking dataset to identify key trends, patterns, and correlations among various factors influencing hotel bookings.

Dataset

 The dataset contains hotel reservations for a city hotel and a resort hotel. It has the following features:
 - hotel: Name of hotel (City or Resort)
 - is_canceled: Whether the booking was canceled or not (0 for not canceled, 1 for canceled)
 - lead_time: Time (in days) between the booking transaction and arrival.
 - arrival_date_year: Year of arrival
 - arrival_date_month: Month of arrival
 - arrival_date_week_number: Week number of arrival date.
 - arrival_date_day_of_month: Day of month of arrival date
 - stays_in_weekend_nights: No. of weekend nights spent in a hotel
 - stays_in_week_nights: No. of weeknights spent in a hotel
 - adults: No. of adults in a single booking record.
 - children: No. of children in a single booking record.
 - babies: No. of babies in a single booking record.
 - meal: Type of meal chosen
 - country: Country of origin of customers
 - market_segment: Market segment through which the booking was made, and for what purpose.
 - distribution_channel: Channel through which the booking was made.
 - is_repeated_guest: Whether the customer has made any booking before (0 for No, 1 for Yes)
 - previous_cancellations: No. of previous canceled bookings.
 - previous_bookings_not_canceled: No. of previous non-canceled bookings.
 - reserved_room_type: Room type reserved by a customer.
 - assigned_room_type: Room type assigned to the customer.
 - booking_changes: No. of changes made to the booking by the customer
 - deposit_type: Type of deposit at the time of making a booking (No deposit / Refundable / No refund)
 - agent: ID of the agent for the booking
 - company: ID of the company making the booking
 - days_in_waiting_list: No. of days on the waiting list.
 - customer_type: Type of customer (Transient, Group, etc.)
 - adr: Average daily rate.
 - required_car_parking_spaces: No. of car parking spaces requested in the booking
 - total_of_special_requests: Total no. of special requests.
 - reservation_status: Whether a customer has checked out, canceled, or did not show up
 - reservation_status_date: Date on which the reservation status was set.

The dataset has 119,390 rows and 32 columns in total.
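
A first look at the dataset might be sketched as below. This is a minimal sketch, not the notebook's actual code: the helper names `load_bookings` and `summarize` are hypothetical, and the CSV path is whatever the dataset file is called locally.

```python
import pandas as pd

def load_bookings(path):
    """Read the bookings CSV (path or file-like object) into a DataFrame."""
    df = pd.read_csv(path)
    # Parse the status date when the column is present (assumed ISO-like dates)
    if "reservation_status_date" in df.columns:
        df["reservation_status_date"] = pd.to_datetime(df["reservation_status_date"])
    return df

def summarize(df):
    """Return row count, column count, and per-column null counts."""
    n_rows, n_cols = df.shape
    return n_rows, n_cols, df.isnull().sum()
```

On the full dataset, the first two values returned by `summarize` should match the row and column counts stated above.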

Libraries Used

 - Pandas for data manipulation, aggregation
 - Matplotlib and Seaborn for visualisation and for examining behaviour with respect to the target variable.
 - NumPy for computationally efficient operations

Data Cleaning

 - Duplicate records were eliminated.
 - Improper data types in the data were corrected.
 - Missing values were addressed.
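
The three cleaning steps could look roughly like the following. This is a sketch under stated assumptions: the function name `clean_bookings`, the fill values (0 for `children` and `agent`, `"Unknown"` for `country`), and the tiny `sample` frame are illustrative and are not taken from the original notebook.

```python
import numpy as np
import pandas as pd

def clean_bookings(df):
    """Drop duplicate rows, fill nulls, and fix dtypes on a copy of df."""
    out = df.drop_duplicates().copy()
    # Assumed fill value: 0 for missing numeric IDs/counts, then cast to int
    for col in ("children", "agent"):
        if col in out.columns:
            out[col] = out[col].fillna(0).astype("int64")
    # Assumed placeholder for a missing country code
    if "country" in out.columns:
        out["country"] = out["country"].fillna("Unknown")
    # Dates arrive as strings; convert them to datetime
    if "reservation_status_date" in out.columns:
        out["reservation_status_date"] = pd.to_datetime(out["reservation_status_date"])
    return out.reset_index(drop=True)

# Synthetic rows for illustration only (rows 0 and 1 are exact duplicates)
sample = pd.DataFrame({
    "hotel": ["City Hotel", "City Hotel", "Resort Hotel"],
    "children": [np.nan, np.nan, 2.0],
    "agent": [9.0, 9.0, np.nan],
    "country": ["PRT", "PRT", None],
    "reservation_status_date": ["2015-07-01", "2015-07-01", "2016-01-15"],
})
cleaned = clean_bookings(sample)
```

Working on a copy keeps the raw DataFrame available for comparison before and after cleaning.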

Exploratory Data Analysis

 - Analyzing the Data
 - Plotting the DataFrame as a heatmap to visualize the null values in each column and to check for any relationship between null values.
 - Replacing null values in the Children, Country, and Agent columns, and dropping the Country column
 - Correcting data types
 - Removing Duplicate Rows
 - Data Visualization

 The following graphs and plots were primarily created using Matplotlib and the Seaborn package.
  - Bar Plots: Compared categorical variables against numerical variables to identify trends.
  - Heatmaps: Visualized the correlation matrix to understand the relationships between numerical variables.
  - Pair Plots: Explored pairwise relationships between some numerical variables.
  - Line Plots + Scatter Plots: Line plots show trends/counts over a time period, and scatter plots show the individual data points.
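
Two of the plot types listed above, a bar plot and a correlation heatmap, can be sketched as follows. The function name `plot_overview` and the synthetic `demo` rows are assumptions made for illustration; the original figures were produced from the full dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

def plot_overview(df):
    """Draw a booking-count bar plot per hotel and a correlation heatmap."""
    fig, (ax_bar, ax_heat) = plt.subplots(1, 2, figsize=(12, 5))
    # Bar plot: number of rows (bookings) for each hotel type
    sns.countplot(data=df, x="hotel", ax=ax_bar)
    ax_bar.set_title("Bookings per hotel type")
    # Heatmap: pairwise correlation of the numerical columns only
    corr = df.select_dtypes(include="number").corr()
    sns.heatmap(corr, annot=True, cmap="coolwarm", ax=ax_heat)
    ax_heat.set_title("Correlation of numerical columns")
    fig.tight_layout()
    return fig, corr

# Synthetic rows for illustration only
demo = pd.DataFrame({
    "hotel": ["City Hotel", "City Hotel", "Resort Hotel", "Resort Hotel"],
    "lead_time": [10, 120, 45, 300],
    "adr": [95.0, 80.5, 150.0, 60.0],
    "is_canceled": [0, 1, 0, 1],
})
fig, corr = plot_overview(demo)
```

Calling `fig.savefig(...)` or, in a notebook, displaying `fig` shows the result; the returned `corr` frame holds the values drawn in the heatmap.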

Questions which are answered through Visualization

  - What is the total number of hotel bookings per year?
  - What are the total monthly bookings in 2015, 2016 & 2017?
  - Which type of customer made the most bookings?
  - Which country makes the most reservations?
  - Which hotel type is the busiest?
  - What is the total number of reservations cancelled in each type of hotel?
  - Which market segments have the maximum and minimum number of bookings?
  - Which top 5 agents make the most bookings?
  - How many customers prefer not to make a booking with a pre-deposit?
  - What is the total number of repeated guests for both hotels combined?
  - Correlation heatmap of numerical columns
  - Pair plot showing the correlation between guests and booking length.
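
Several of these questions reduce to simple pandas aggregations before anything is plotted. The sketch below assumes the column names from the feature list; the function name `answer_questions` and the six synthetic rows are illustrative and do not reproduce the real results.

```python
import pandas as pd

def answer_questions(df):
    """Compute a few of the aggregates behind the questions above."""
    segments = df["market_segment"].value_counts()
    return {
        # Bookings per arrival year
        "yearly": df.groupby("arrival_date_year").size(),
        # Countries ranked by number of reservations
        "top_countries": df["country"].value_counts().head(5),
        # Cancelled reservations per hotel type (is_canceled is 0/1)
        "cancelled_by_hotel": df.groupby("hotel")["is_canceled"].sum(),
        # Market segments with the most and fewest bookings
        "max_segment": segments.idxmax(),
        "min_segment": segments.idxmin(),
        # Share of bookings made by repeated guests, in percent
        "repeat_share_pct": df["is_repeated_guest"].mean() * 100,
    }

# Synthetic rows for illustration only
bookings = pd.DataFrame({
    "hotel": ["City Hotel", "City Hotel", "City Hotel",
              "Resort Hotel", "Resort Hotel", "City Hotel"],
    "arrival_date_year": [2015, 2016, 2016, 2016, 2017, 2017],
    "country": ["PRT", "PRT", "GBR", "PRT", "FRA", "GBR"],
    "is_canceled": [1, 0, 1, 0, 1, 0],
    "market_segment": ["Online TA", "Online TA", "Direct",
                       "Online TA", "Groups", "Direct"],
    "is_repeated_guest": [0, 0, 0, 1, 0, 0],
})
answers = answer_questions(bookings)
```

Each Series in `answers` can be passed straight to a bar or line plot, for example `answers["yearly"].plot(kind="bar")`.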

Conclusion

 - The country with the most bookings is PRT.
 - The agent with the most bookings is agent 9.
 - Most hotel bookings are made by new guests; fewer than 5% of guests returned.
 - The Online (internet) platform is used to make the majority of bookings.
 - The city hotel is busier than the resort hotel.
 - The busiest months for hotels are August, September and October.
 - Customers prefer not to make bookings with a pre-deposit.
 - The City Hotel has more cancelled bookings than the Resort Hotel.
 - Guests often book multiple nights, especially for weekends.
 - Longer lead times can result in longer wait times.

Challenges Faced

- The data contained a large number of duplicates.
- It was challenging to select the best visualization techniques.
- The dataset contained a large number of null values.
- The data used improper data type formats.
