Used a 2021-2023 NFL play-by-play database and Python for scraping data from PFF and Pro Football Reference, statistical modeling, and machine learning. Applied data pre-processing, feature selection, and cross-validation to improve predictive accuracy.
• Developed models that estimate 4th down conversion probabilities by combining play-by-play data, team rankings, and situational variables. Built a decision-making framework that weighs game scenarios, conditions, and fine-grained ("micro") variables to predict whether a conversion attempt will succeed.
• Applied logistic regression, random forests, and feature-importance-based selection to predict the binary outcome (converted or not), using football context such as team dynamics, field conditions, and strategy to refine the model features.
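The modeling steps described above (logistic regression, random forests, cross-validation, and feature importances) could be sketched roughly as follows. This is a minimal illustration using scikit-learn on synthetic data, not the project's actual code: the feature names (ydstogo, yardline_100, score_differential), the data-generating formula, and the helper functions are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical feature set; the real project also used team rankings etc.
FEATURES = ["ydstogo", "yardline_100", "score_differential"]


def make_synthetic_attempts(n=2000, seed=0):
    """Generate made-up 4th down attempts. This is NOT real NFL data."""
    rng = np.random.default_rng(seed)
    ydstogo = rng.integers(1, 16, size=n)
    yardline_100 = rng.integers(1, 100, size=n)
    score_diff = rng.integers(-21, 22, size=n)
    # Assumed relationship: shorter distance means a higher conversion chance.
    logit = 1.2 - 0.35 * ydstogo + 0.004 * yardline_100
    prob = 1.0 / (1.0 + np.exp(-logit))
    y = (rng.random(n) < prob).astype(int)
    X = np.column_stack([ydstogo, yardline_100, score_diff]).astype(float)
    return X, y


def evaluate_models(X, y, seed=0):
    """Return the mean 5-fold cross-validated ROC AUC for each model."""
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    models = {
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "random_forest": RandomForestClassifier(
            n_estimators=200, max_depth=5, random_state=seed
        ),
    }
    return {
        name: float(cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean())
        for name, model in models.items()
    }


def feature_importances(X, y, seed=0):
    """Fit a random forest and return its impurity-based importances."""
    forest = RandomForestClassifier(
        n_estimators=200, max_depth=5, random_state=seed
    ).fit(X, y)
    return dict(zip(FEATURES, (float(v) for v in forest.feature_importances_)))


X, y = make_synthetic_attempts()
scores = evaluate_models(X, y)
importances = feature_importances(X, y)
```

Here scores maps each model name to its cross-validated AUC, and importances gives a rough ranking that could guide feature selection. Impurity-based importances can favor high-cardinality features, so permutation importance may be a reasonable cross-check.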
This analysis uses the nfl_data_py library: https://github.com/cooperdff/nfl_data_py
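As a rough sketch of how 4th down attempts might be pulled from play-by-play data and labeled, the snippet below filters a play-by-play table and derives a binary target. The column names (down, play_type, fourth_down_converted, ydstogo, yardline_100, score_differential) are assumed to follow the nflfastR play-by-play schema, and both helper functions are hypothetical. The loader calls nfl_data_py's import_pbp_data, which downloads data, so it is defined but not run here; a tiny hand-made table stands in for real data.

```python
import pandas as pd

MODEL_COLUMNS = ["ydstogo", "yardline_100", "score_differential", "converted"]


def label_fourth_down_attempts(pbp: pd.DataFrame) -> pd.DataFrame:
    """Keep 4th down run/pass plays and add a binary 'converted' target.

    Assumes nflfastR-style column names; adjust if the schema differs.
    """
    is_attempt = (pbp["down"] == 4) & pbp["play_type"].isin(["run", "pass"])
    attempts = pbp.loc[is_attempt].copy()
    attempts["converted"] = (attempts["fourth_down_converted"] == 1).astype(int)
    return attempts[MODEL_COLUMNS].reset_index(drop=True)


def load_fourth_down_attempts(years=(2021, 2022, 2023)) -> pd.DataFrame:
    """Hypothetical loader; requires nfl_data_py and network access."""
    import nfl_data_py as nfl  # imported lazily so the demo runs without it

    pbp = nfl.import_pbp_data(list(years))
    return label_fourth_down_attempts(pbp)


# Hand-made rows for illustration only (not real plays).
demo = pd.DataFrame(
    {
        "down": [4, 4, 3, 4],
        "play_type": ["run", "punt", "pass", "pass"],
        "fourth_down_converted": [1.0, 0.0, 0.0, 0.0],
        "ydstogo": [1, 8, 5, 6],
        "yardline_100": [35, 70, 50, 42],
        "score_differential": [-3, 0, 7, -10],
    }
)
labeled = label_fourth_down_attempts(demo)
```

Punts and field goals are excluded here because they are not conversion attempts; whether to also drop plays nullified by penalties is a modeling choice this sketch leaves open.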