Flight-Booking-Price-Prediction

Project Objective

The objective is to analyse a flight booking dataset obtained from a platform used to book flight tickets. A thorough study of the data can uncover insights that are valuable to passengers. We apply EDA, statistical methods, and machine learning algorithms to extract meaningful information from it.

Data Description

Dataset Information: The flight booking price prediction dataset contains around 3 lakh (300,000) records with 11 attributes.

Data Pre-processing steps and inspiration

a. Loading the Dataset: We loaded the dataset using pandas.

b. Checking for Data Types: It is imperative to inspect the data types of each column to ensure consistency and appropriateness for subsequent analyses and operations.

c. Preprocessing Data: In the preprocessing step we inspected the dataset and removed the unwanted columns. We found no missing values in the dataset. We applied label encoding to the categorical columns so that they could be used for statistical analysis and machine learning models. After that, we standardized the data fed to the ML models to improve model performance.

d. Handling Outliers: The ‘Duration’, ‘days left’, and ‘price’ columns contain outliers, which we removed using the interquartile range (IQR) method.
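The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration on a toy DataFrame, not the notebook's actual code: the column names (`airline`, `class`, `duration`, `days_left`, `price`), the helper `remove_outliers_iqr`, and the conventional 1.5 × IQR fence are all assumptions.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Toy stand-in for the real dataset; column names and values are assumptions.
df = pd.DataFrame({
    "airline": ["Vistara", "Air_India", "SpiceJet", "Vistara", "Indigo", "Vistara"],
    "class": ["Economy", "Business", "Economy", "Economy", "Economy", "Business"],
    "duration": [2.0, 2.5, 2.2, 2.1, 2.3, 40.0],  # the last value is an outlier
    "days_left": [1, 10, 20, 30, 40, 45],
    "price": [5900, 6100, 6000, 5800, 6200, 6050],
})


def remove_outliers_iqr(frame, columns, k=1.5):
    """Hypothetical helper: keep rows inside [Q1 - k*IQR, Q3 + k*IQR] for every column."""
    mask = pd.Series(True, index=frame.index)
    for col in columns:
        q1, q3 = frame[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        mask &= frame[col].between(q1 - k * iqr, q3 + k * iqr)
    return frame[mask].copy()


# Drop rows that are outliers in any of the three numeric columns.
clean = remove_outliers_iqr(df, ["duration", "days_left", "price"])

# Label-encode the categorical columns (one encoder per column).
for col in ["airline", "class"]:
    clean[col] = LabelEncoder().fit_transform(clean[col])

# Standardize the features to zero mean and unit variance; the target is left as-is.
features = clean.drop(columns="price")
scaled = StandardScaler().fit_transform(features)

print(len(df), "rows before,", len(clean), "rows after outlier removal")
```

In this sketch the IQR bounds are computed once on the unfiltered frame, so the order of the columns does not affect which rows are dropped.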

Inferences from the Data

Our target is to predict price, so the EDA focuses on the price column, which depends on the other (independent) variables.

1) AIRLINE V/S PRICE

• Air India and Vistara have the highest ticket prices compared to other airlines.

• The remaining airlines' prices are more or less the same.

2) FLIGHT V/S PRICE

• Most flights are priced at up to 10000.

3) SOURCE CITY V/S PRICE

• Most source cities fall in the same price range.

4) STOPS V/S PRICE

• The price for ‘one’ stop is higher than for the other stop categories.

5) DEPARTURE TIME V/S PRICE

• Most departure times fall in the same price range.

6) ARRIVAL TIME V/S PRICE

• Evening and morning arrivals have high prices, while late-night arrivals have low prices.

7) DESTINATION CITY V/S PRICE

• Delhi has lower prices than the other destination cities.

• The remaining cities are in more or less the same range.

8) CLASS V/S PRICE

• Business class is priced higher than Economy.

9) DURATION V/S PRICE

• As duration increases, the majority of prices increase as well.

• However, the correlation is not straightforward: some long-duration flights also have low prices.

10) DAYS LEFT V/S PRICE

• As the number of days left before departure increases, prices fall.

• This indicates that booking early is a good way to save money.

11) CATEGORICAL FEATURES

• AIRLINE: Vistara and Air India have the highest counts. SpiceJet has a low count.

• SOURCE CITY: Delhi and Mumbai have the highest counts. Chennai has a low count.

• DEPARTURE TIME: Early Morning and Morning have the highest counts. Late Night has a low count.

• STOPS: One Stop has the highest count.

• ARRIVAL TIME: Night and Evening have the highest counts. Late Night has a low count.

• DESTINATION CITY: Delhi and Mumbai have the highest counts. Chennai has a low count.

• CLASS: Economy class has the highest count.
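Comparisons like the ones above are typically produced with group-by aggregates and frequency counts. The sketch below shows the idea on a small made-up DataFrame; the column names and values are assumptions chosen for illustration and are not taken from the real dataset.

```python
import pandas as pd

# Made-up sample rows; column names and values are illustrative assumptions.
df = pd.DataFrame({
    "airline": ["Vistara", "Vistara", "Air_India", "SpiceJet", "Indigo", "Vistara"],
    "days_left": [5, 20, 10, 30, 25, 2],
    "price": [9000, 8000, 7000, 3000, 3500, 10000],
})

# Feature vs. price: average price per category, highest first.
mean_price_by_airline = (
    df.groupby("airline")["price"].mean().sort_values(ascending=False)
)

# Categorical frequency: number of records per category.
airline_counts = df["airline"].value_counts()

# Numeric feature vs. price: Pearson correlation (negative means price falls
# as days_left grows).
days_left_corr = df["days_left"].corr(df["price"])

print(mean_price_by_airline)
print(airline_counts)
print(f"corr(days_left, price) = {days_left_corr:.2f}")
```

The same three calls can be repeated for any other categorical or numeric column to reproduce the rest of the comparisons.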

Choosing the algorithm for the project

We chose several models that can predict the flight ticket price. The models are listed below:

1. Linear Regression

2. Decision Tree Regressor

3. Random Forest Regressor

• The goal of all these models is to predict the ticket price.

• The independent features are all columns except flight and price.

• We evaluate model performance with the R² score, mean absolute error (MAE), mean absolute percentage error (MAPE), mean squared error (MSE), and root mean squared error (RMSE).
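A minimal sketch of fitting the three models and computing these five metrics with scikit-learn is shown below. It runs on synthetic data, so the numbers it prints are unrelated to the project's results; the `evaluate` helper, the 80/20 split, and the hyperparameters are assumptions rather than the notebook's settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import (
    mean_absolute_error,
    mean_absolute_percentage_error,
    mean_squared_error,
    r2_score,
)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the encoded, scaled features and the price target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(500, 4))
y = 5000 + 20000 * X[:, 0] + 3000 * X[:, 1] ** 2 + rng.normal(0, 500, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)


def evaluate(model):
    """Hypothetical helper: fit on the training split, score on the test split."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    mse = mean_squared_error(y_test, pred)
    return {
        "r2": r2_score(y_test, pred),
        "mae": mean_absolute_error(y_test, pred),
        "mape": mean_absolute_percentage_error(y_test, pred),  # a fraction, not a percent
        "mse": mse,
        "rmse": float(np.sqrt(mse)),
    }


models = {
    "Linear Regression": LinearRegression(),
    "Decision Tree Regressor": DecisionTreeRegressor(random_state=0),
    "Random Forest Regressor": RandomForestRegressor(n_estimators=50, random_state=0),
}
results = {name: evaluate(model) for name, model in models.items()}

for name, metrics in results.items():
    print(name, {key: round(value, 4) for key, value in metrics.items()})
```

RMSE is computed here as the square root of MSE so that the sketch does not depend on version-specific scikit-learn options.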

Motivation and reasons for choosing the models

1. Linear Regression

• We chose this model because the target, price, is continuous, which makes this a regression problem.

2. Decision Tree Regressor

• Decision tree models can handle both classification and regression problems.

• They recursively partition the input space into regions and predict the majority class (classification) or the average target value (regression) within each region.

• This lets them handle both categorical and numerical data, making them versatile across a wide range of predictive tasks.

3. Random Forest Regressor

• Random Forest offers high predictive accuracy by averaging predictions from multiple decision trees, which makes it less prone to overfitting than a single tree.

• It handles non-linear relationships well, provides feature importance insights, and is relatively resilient to outliers and, in some implementations, missing data. It scales to large datasets and makes no assumptions about the data distribution, so it is a versatile choice for many machine learning tasks.
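The two averaging ideas described above can be checked directly in scikit-learn. The sketch below uses tiny made-up data: a depth-1 regression tree predicts the mean target of each region, and a random forest's prediction equals the mean of its individual trees' predictions. The data and hyperparameters are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

# --- Decision tree: one split, each leaf predicts the mean target in its region ---
X_small = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y_small = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])

stump = DecisionTreeRegressor(max_depth=1, random_state=0).fit(X_small, y_small)
# The single split separates {1, 2, 3} from {10, 11, 12}; the leaf values are
# the means of those two groups.
stump_pred = stump.predict(np.array([[2.0], [11.0]]))

# --- Random forest: the prediction is the average over the individual trees ---
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(200, 3))
y = 10 * X[:, 0] + np.sin(6 * X[:, 1]) + rng.normal(0, 0.1, size=200)

forest = RandomForestRegressor(n_estimators=25, random_state=0).fit(X, y)
per_tree = np.stack([tree.predict(X) for tree in forest.estimators_])
manual_average = per_tree.mean(axis=0)
forest_pred = forest.predict(X)

print("stump predictions:", stump_pred)
print("forest equals mean of trees:", np.allclose(manual_average, forest_pred))

# Feature importances sum to 1 across the input features.
print("feature importances:", forest.feature_importances_.round(3))
```

Because each tree is trained on a different bootstrap sample, the individual trees disagree, and averaging them smooths out the variance of any single tree.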

Conclusion

• Based on the analysis and the models' performance on the dataset, Linear Regression, Decision Tree Regressor, and Random Forest Regressor are all suitable for predicting flight ticket prices. Of the three, Random Forest Regressor achieved the highest R² score and the lowest errors.

Linear Regression gave the following evaluation metrics.

• r2_Score: 0.9049847760699258

• mean abs error: 4625.601159976593

• mean absolute percentage error: 0.43627317283189276

• mean sq. error: 48931028.45225085

• RMSE: 6995.071726026178

Decision Tree Regressor gave the following evaluation metrics.

• r2_Score: 0.9761309874775854

• mean abs error: 1168.7175598163174

• mean absolute percentage error: 0.07422407497383977

• mean sq. error: 12292086.284203626

• RMSE: 3506.0071711569026

Random Forest Regressor gave the following evaluation metrics.

• r2_Score: 0.9852298788429149

• mean abs error: 1085.061065975238

• mean absolute percentage error: 0.07049862212289662

• mean sq. error: 7606330.740349579

• RMSE: 2757.9577118494003
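As a quick sanity check, the reported figures can be verified to be internally consistent, since RMSE is by definition the square root of MSE. The short sketch below copies the MSE and RMSE values listed above and ranks the models by RMSE; the variable names are illustrative.

```python
import math

# (MSE, RMSE) pairs copied from the results listed above.
reported = {
    "Linear Regression": (48931028.45225085, 6995.071726026178),
    "Decision Tree Regressor": (12292086.284203626, 3506.0071711569026),
    "Random Forest Regressor": (7606330.740349579, 2757.9577118494003),
}

# RMSE should equal sqrt(MSE) up to floating-point rounding.
consistent = {
    name: math.isclose(math.sqrt(mse), rmse, rel_tol=1e-6)
    for name, (mse, rmse) in reported.items()
}

# Lower RMSE means a smaller typical prediction error.
best_model = min(reported, key=lambda name: reported[name][1])

print(consistent)
print("lowest RMSE:", best_model)
```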

DrPoojaAbhijith/Project-Predicting-flight-booking-prices.
