A curated list of awesome machine learning libraries for marketing. Inspired by both awesome-production-machine-learning and awesome-machine-learning, and created and maintained by Station 10.
Note that some packages could fit into more than one section. Where that is the case, it is noted in the description, so be sure to use Ctrl + F as well as browsing by section.
Want to contribute? Please raise a Pull Request or an issue. If you find this useful please drop a ⭐️. This helps motivate us and others to update and maintain the list.
All packages are Python-based unless otherwise stated. We welcome contributions from R users!
- ChannelAttribution
Python and R library that employs a k-order Markov representation to identify structural correlations in customer journey data.
- fractribution
Data-driven Multi-Touch Attribution (MTA) by Google.
- Marketing-Attribution-Models
Heuristic and data-driven Multi-Touch Attribution.
- markov-chain-attribution
Leverages a first-order Markov chain to reallocate conversions.
- mta
Various data-driven Multi-Touch Attribution algorithms.
- pychattr
Python implementation of the excellent R ChannelAttribution library.
- shapley
Shapley Values For Attribution Modelling.
- shapley-attribution-model-zhao-naive
Shapley Value Methods for Attribution Modeling (Naive, Set-based).
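
The Shapley-based entries above share one idea: a channel's credit is its average marginal contribution across all coalitions of the other channels. The block below is a minimal sketch of that idea in plain Python, not the API of any library listed here. The function name `shapley_attribution` and the coalition conversion counts are hypothetical and exist only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_attribution(channels, value):
    """Return each channel's Shapley value.

    `value` maps a frozenset of channels to the conversions credited to
    that coalition. Coalitions missing from the mapping count as 0.
    """
    n = len(channels)
    credit = {}
    for channel in channels:
        others = [c for c in channels if c != channel]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                marginal = value.get(coalition | {channel}, 0) - value.get(coalition, 0)
                total += weight * marginal
        credit[channel] = total
    return credit


# Hypothetical conversions per channel coalition (made-up numbers).
coalition_conversions = {
    frozenset({"search"}): 10,
    frozenset({"social"}): 6,
    frozenset({"search", "social"}): 20,
}

credit = shapley_attribution(["search", "social"], coalition_conversions)
print(credit)
```

The credits sum to the value of the full coalition, which is the property that makes Shapley values attractive for attribution. Enumerating every coalition grows exponentially with the number of channels, so this brute-force form only suits small channel sets.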
- CausalImpact
(R) Causal Inference using Bayesian structural time-series models by Google.
- causalml
Uplift modeling and causal inference with ML by Uber.
- CausalPy
Causal Inference & Synthetic Control. Supports fitting with scikit-learn and PyMC models.
- dowhy
Causal Inference that supports explicit modeling and testing of causal assumptions.
- SyntheticControlMethods
Causal inference using Synthetic Control.
- tfcausalimpact
Google's CausalImpact Algorithm implemented on top of TensorFlow Probability.
- upliftml
Scalable unconstrained and constrained uplift modeling from experimental data using PySpark and H2O.
- scikit-uplift
Uplift modeling Python package that provides fast sklearn-style model implementations, evaluation metrics and visualization tools.
- btyd
Buy Till You Die and CLV statistical models in Python.
- lifetimes
CLV and Churn modelling. Deprecated and incorporated into pymc-marketing.
- lucius-ltv
CLV for subscriptions.
- gapandas4
Python package for querying the Google Analytics Data API for GA4 and displaying the results in a Pandas dataframe.
- EconML
AI, Econometrics and Causal Inference modelling.
- statsmodels
Statistical modeling including time series and econometrics.
- trimmed_match
Ad effectiveness through the design and analysis of randomized Geo Experiments by Google.
- matched_markets
Time-Based regression matched markets approach for designing Geo Experiments by Google.
- GeoexperimentsResearch
(R) Open-source implementation of the geo experiment analysis methodology developed at Google (Archived)
- GeoLift
Geo Experimentation methodology based on Synthetic Control Methods used to measure lift of ad campaigns by Facebook.
- BayesianMMM
Bayesian Media Mix Modelling with shape and carryover effects.
- dammmdatagen
(R) Media Mix Modeling Data Generator.
- lightweight-mmm
Bayesian Media Mix Models by Google.
- mamimo
Small Media Mix Models designed to be used in conjunction with ML libraries (e.g. scikit-learn).
- mmm-stan
Multiplicative Media Mix Model.
- pymc-marketing
Bayesian Media Mix, Adstock, Saturation, Customer Lifetime Value & Churn models.
- Robyn
(R) Bayesian Media Mix Models by Facebook.
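
Several of the media mix entries above mention carryover (adstock) and shape (saturation) effects. As a rough illustration, the sketch below applies a geometric adstock followed by a Hill-type saturation curve in plain Python. It is one common formulation written under stated assumptions, not the implementation of any library listed here; each library uses its own parameterization. The function names, the decay of 0.5, and the spend figures are hypothetical.

```python
def geometric_adstock(spend, decay):
    """Carry a fraction `decay` of the previous period's effect forward."""
    carried = 0.0
    adstocked = []
    for amount in spend:
        carried = amount + decay * carried
        adstocked.append(carried)
    return adstocked


def hill_saturation(x, half_saturation, slope=1.0):
    """Map spend to a response in [0, 1) with diminishing returns.

    The response is 0.5 when x equals `half_saturation`.
    """
    if x <= 0:
        return 0.0
    return x**slope / (x**slope + half_saturation**slope)


# Hypothetical weekly spend for a single channel (made-up numbers).
spend = [100.0, 0.0, 0.0, 50.0]

adstocked = geometric_adstock(spend, decay=0.5)
response = [hill_saturation(v, half_saturation=50.0) for v in adstocked]
print(adstocked)
print(response)
```

In a full model the decay, half-saturation point and slope are estimated from data rather than fixed by hand, which is what the Bayesian libraries above do.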
- amazon-denseclus
Python module for clustering both categorical and numerical data using UMAP and HDBSCAN by Amazon.
- rfm
RFM Analysis and Customer Segmentation.
- retentioneering-tools
Retentioneering: product analytics, data-driven customer journey map optimization, marketing analytics, web analytics, transaction analytics, graph visualization, and behavioral segmentation.
- ecommercetools
Data science toolkit for technical ecommerce, marketing science and technical SEO, with a wide range of features to aid analysis and model building.
- lightfm
Implementation of LightFM, a hybrid recommendation algorithm.
- openrec
Open-source and modular library for neural network-inspired recommendation algorithms.
- recmetrics
A library of metrics for evaluating recommender systems
- recommenders
Best Practices on Recommendation Systems by Microsoft.
- Surprise
Scikit for building and analyzing recommender systems that deal with explicit rating data.
- darts
Python library for user-friendly forecasting and anomaly detection on time series, built following scikit-learn conventions.
- gluonts
Probabilistic time series modeling, focusing on deep learning based models, based on PyTorch and MXNet.
- neural_prophet
Framework for interpretable time series forecasting built on PyTorch.
- orbit
Python package for Bayesian time series forecasting and inference by Uber.
- pmdarima
Pmdarima is a statistical library designed to fill the void in Python's time series analysis capabilities.
- prophet
Additive time series modelling by Facebook.
- sktime
A unified framework for ML with Time Series.
- statsforecast
Lightning ⚡️ fast forecasting with statistical and econometric models.
- stumpy
STUMPY computes something called the matrix profile, which is just an academic way of saying "for every subsequence automatically identify its corresponding nearest-neighbor"
- temporian
Temporian is an open-source Python library for preprocessing ⚡ and feature engineering 🛠 temporal data 📈 for machine learning applications 🤖.
- tbats
BATS and TBATS time series forecasting
- tsfresh
Time Series Feature extraction based on scalable hypothesis tests.
- tslearn
The machine learning toolkit for time series analysis in Python.
- lifelines
lifelines is a pure Python implementation of the best parts of survival analysis.
- pysurvival
An open source python package for Survival Analysis modeling.
- scikit-survival
Survival analysis built on top of scikit-learn.
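
For readers new to survival analysis, the sketch below computes a Kaplan-Meier survival curve in plain Python. It is a minimal illustration of the estimator, not the API of any library listed above, all of which provide tested implementations with confidence intervals and plotting. The function name `kaplan_meier` and the customer tenure data are hypothetical.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    `durations` holds the time each subject was followed; `observed` is
    True when the event (e.g. churn) occurred and False when censored.
    """
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Subjects still being followed just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Events that happen exactly at time t.
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical customer tenures in months; False means still active (censored).
durations = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]

curve = kaplan_meier(durations, observed)
for time, probability in curve:
    print(time, round(probability, 3))
```

Censored customers still count towards the at-risk set until they leave observation, which is why the estimator differs from a naive churn rate.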