This project used machine learning and bootstrapping to identify the best of three candidate regions for the expansion of OilyGiant, a fictional energy company, with the goal of maximizing profit and minimizing risk. Using a linear regression model and a dataset of 100,000 data points, the analysis identified Region 2 as the best choice: its average potential profit exceeds $4 million, its 95% confidence interval indicates positive returns, and its risk of loss is only 1.8%. These findings give OilyGiant a data-driven basis for allocating resources and maximizing profitability.
🐍 Python and sklearn · 🤖 Machine Learning and Cross-Validation · 👩🏽‍💻 Data Collection and Labelling · 💰 Business Metrics: Calculating Revenue, Operating Profit, Margin, and Return on Investment · 📊 Statistical Methods: Bootstrapping and Confidence Intervals · 💿 Data Sources
- This project uses pandas, numpy, and matplotlib.pyplot, along with train_test_split, StandardScaler, shuffle, LinearRegression, accuracy_score, and mean_squared_error from sklearn. It requires Python 3.11.
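
The bootstrapping step mentioned in the summary can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `bootstrap_profit`, the sample sizes, the revenue per unit, the budget, and the synthetic data are all hypothetical placeholders chosen only to make the example run. It resamples candidate wells with replacement, keeps the ones the model ranks highest, and summarizes the resulting profit distribution as a mean, a 95% percentile interval, and a risk of loss.

```python
import numpy as np


def bootstrap_profit(predicted, actual, *, n_sample=500, n_best=200,
                     revenue_per_unit=4500.0, budget=100_000_000.0,
                     n_iter=1000, seed=0):
    """Bootstrap the profit distribution for one region.

    Every default value here is a hypothetical placeholder,
    not a figure taken from the project.
    """
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    rng = np.random.default_rng(seed)
    profits = np.empty(n_iter)

    for i in range(n_iter):
        # Draw a sample of candidate wells with replacement.
        idx = rng.integers(0, len(predicted), size=n_sample)
        # Keep the wells with the highest predicted reserves.
        best = idx[np.argsort(predicted[idx])[-n_best:]]
        # Profit is computed from the actual reserves of the chosen wells.
        profits[i] = actual[best].sum() * revenue_per_unit - budget

    lower, upper = np.percentile(profits, [2.5, 97.5])
    return {
        "mean_profit": float(profits.mean()),
        "ci_95": (float(lower), float(upper)),
        "risk_of_loss": float((profits < 0).mean()),
    }


# Synthetic stand-in data (assumption): real inputs would be the model's
# validation-set predictions and the matching true reserve volumes.
data_rng = np.random.default_rng(42)
actual = data_rng.normal(90, 40, size=25_000)
predicted = actual + data_rng.normal(0, 20, size=25_000)

summary = bootstrap_profit(predicted, actual)
print(summary)
```

Running the same function on each region's predictions and comparing `mean_profit` and `risk_of_loss` is one way to rank regions. The fixed `seed` keeps the comparison reproducible.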