update docs
ccomkhj committed Nov 8, 2023
1 parent 978d1a9 commit 84a1e04
Showing 3 changed files with 128 additions and 55 deletions.
97 changes: 43 additions & 54 deletions README.md
@@ -1,54 +1,43 @@
# constrained-linear-regression
[![PyPI version](https://badge.fury.io/py/constrained-linear-regression.svg)](https://badge.fury.io/py/constrained-linear-regression)

This is a Python implementation of constrained linear regression in scikit-learn style.
The current version supports upper and lower bounds for each slope coefficient.

It was developed in response to this Stack Overflow question: https://stackoverflow.com/questions/50410037

Installation:
```pip install constrained-linear-regression```

You can use this model, for example, if you want all coefficients to be non-negative:

```Python
from constrained_linear_regression import ConstrainedLinearRegression
# Note: load_boston was removed in scikit-learn 1.2, so this example needs an older release.
from sklearn.datasets import load_boston

X, y = load_boston(return_X_y=True)
model = ConstrainedLinearRegression(nonnegative=True)
model.fit(X, y)
print(model.intercept_)
print(model.coef_)
```
The output will look like this:
```commandline
-36.99292986145538
[0. 0.05286515 0. 4.12512386 0. 8.04017956
0. 0. 0. 0. 0. 0.02273805
0. ]
```
You can also impose arbitrary bounds on any coefficients you choose:
```Python
import numpy as np

model = ConstrainedLinearRegression()
min_coef = np.repeat(-np.inf, X.shape[1])
min_coef[0] = 0
min_coef[4] = -1
max_coef = np.repeat(4, X.shape[1])
max_coef[3] = 2
model.fit(X, y, max_coef=max_coef, min_coef=min_coef)
print(model.intercept_)
print(model.coef_)
```
The output will be
```commandline
24.060175576410515
[ 0. 0.04504673 -0.0354073 2. -1. 4.
-0.01343263 -1.17231216 0.2183103 -0.01375266 -0.7747823 0.01122374
-0.56678676]
```
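
To make the idea of bounded coefficients concrete, here is a minimal NumPy-only sketch of least squares with box constraints on the slopes, solved by projected gradient descent. It is an illustration under stated assumptions, not the package's implementation: the function name `bounded_least_squares`, its parameters, and the synthetic data are all hypothetical, and the package may use a different optimization algorithm.

```python
import numpy as np


def bounded_least_squares(X, y, min_coef, max_coef, n_iter=2000):
    """Least squares with per-coefficient bounds (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Center the data: the intercept is unconstrained, so it can be
    # recovered from the means once the slopes are known.
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Step size 1/L, where L is the largest eigenvalue of Xc^T Xc.
    step = 1.0 / np.linalg.norm(Xc, 2) ** 2
    coef = np.clip(np.zeros(X.shape[1]), min_coef, max_coef)
    for _ in range(n_iter):
        grad = Xc.T @ (Xc @ coef - yc)  # gradient of 0.5 * ||Xc @ coef - yc||^2
        # Gradient step, then projection back onto the box [min_coef, max_coef].
        coef = np.clip(coef - step * grad, min_coef, max_coef)
    intercept = y_mean - x_mean @ coef
    return coef, intercept


# Synthetic data (an assumption for this sketch): true slopes are 2, -1 and 0.5.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 3.0 + 0.01 * rng.normal(size=200)

# Force every slope into [0, 1].
coef, intercept = bounded_least_squares(
    X, y, min_coef=np.zeros(3), max_coef=np.full(3, 1.0)
)
print(coef, intercept)
```

In this sketch the first slope ends up at the upper bound and the second at the lower bound, because their unconstrained values (about 2 and -1) lie outside the allowed interval; the third stays strictly inside it.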

You can also set the `lasso` and `ridge` coefficients if you want to apply the
corresponding penalties. For `lasso`, however, the output might not exactly
match the result of `sklearn.linear_model.Lasso`, because the optimization
algorithm is different.
# Multi-Constrained Regression and Neural Network Repository

## Overview

This repository hosts techniques for constraining the weights of selected inputs in regression and multi-layer perceptron models. Inspired by the scikit-learn library, it reverse-engineers and extends that library's estimators to fit custom requirements for specific types of learning problems.

## Purpose

The purpose of this repository is to provide a resource for machine learning practitioners looking to impose constraints on the input features' weights, which could be critical in certain domains such as finance, healthcare, and operational research. The reverse-engineered solutions herein allow for greater control over the machine learning model's behavior, ensuring that the influence of some features remains within desired boundaries.

## Tutorials

We provide detailed tutorials for the following topics:

- **Multi-Constrained Linear Regression**: This tutorial takes you through the steps of creating a linear regression model that allows constraints to be placed on the weights of multiple input features.
- [Multi-Constrained Linear Regression Tutorial](tutorial/MultiConstrainedLinearRegression.md)

- **Multi-Constrained Multi-Layer Perceptron**: Explore the implementation of a multi-layer perceptron (neural network) that incorporates constraints on the weights corresponding to specific input features.
- [Multi-Constrained Multi-Layer Perceptron Tutorial](tutorial/MultiConstrainedMultiLayerPerceptron.md)

## Features

- Reverse engineering techniques applied to scikit-learn's Linear Regression and MLP models
- Custom weight constraint functionalities
- Step-by-step tutorials for implementing the above models

## Getting Started

To get started with these tutorials and code, you should clone the repository and navigate to the `tutorial` directory where you can find the markdown files with detailed explanations and code samples.

```bash
git clone https://github.com/your-github-username/multi-constrained-models.git
cd multi-constrained-models/tutorial
```

## Contributing

We welcome contributions from the community! Whether it's improving the tutorials, extending the features of the models, or fixing bugs, please feel free to fork the repo, make your changes, and submit a pull request.

## Acknowledgments

- Thanks to the scikit-learn developers for their work on creating a comprehensive machine learning library.
- This project was inspired by the need for industry-specific machine learning models that require tailored constraints.

## Contact

If you have any questions or feedback, please open an issue in the repository, and we'll get back to you as soon as possible.

2 changes: 1 addition & 1 deletion tutorial/MultiConstrainedLinearRegression.md
@@ -66,7 +66,6 @@ max_coef = np.ones((horizon, total_lags))*np.inf
max_coef[:,0] = 0 # don't use target
max_coef[1:,6] = 0
max_coef[2:,1] = 0
...
max_coef[:,6] = 0 # ignore the last
```
```Python
@@ -84,3 +83,4 @@ for i, estimator in enumerate(model.model.estimators_):
In this example, the `MultiConstrainedLinearRegression` model is used with Darts' `RegressionModel` for sequence forecasts. We apply different constraints for each day within the forecast horizon, controlled by the `min_coef` and `max_coef` parameters.

This gives us the flexibility to independently control the linearity of the regression model for different periods within the forecast horizon.

84 changes: 84 additions & 0 deletions tutorial/MultiConstrainedMultiLayerPerceptron.md
@@ -0,0 +1,84 @@
# constrained-linear-regression with Darts

This package also provides out-of-the-box compatibility with [Unit8's Darts](https://unit8co.github.io/darts/), a forecasting library. Darts supports a rich variety of models including but not limited to ARIMA, Prophet, Theta, and a host of others.

One of the major features of `MultiConstrainedMultilayerPerceptron`, when used with Darts, is the ability to set different coefficient constraints for each step of a custom forecast horizon.


Here's an example using dummy data:

```Python
import numpy as np
import pandas as pd
from constrained_linear_regression.multi_constrained_multi_layer_perceptron import MultiConstrainedMultilayerPerceptron
from darts.models import RegressionModel
from darts.timeseries import TimeSeries
from darts.dataprocessing.transformers import Scaler

# Create two pandas dataframes df_feature and df_target
date_rng = pd.date_range(start='1/1/2021', end='3/31/2021', freq='D')
df_feature = pd.DataFrame(date_rng, columns=['ds'])
df_feature['feature1'] = np.random.randint(0,100,size=(len(date_rng)))
df_feature['feature2'] = np.random.randint(0,100,size=(len(date_rng)))
df_feature['feature3'] = np.random.randint(0,100,size=(len(date_rng)))
df_feature['feature4'] = np.random.randint(0,100,size=(len(date_rng)))

df_target = pd.DataFrame(date_rng, columns=['ds'])
df_target['y'] = np.random.randint(0,100,size=(len(date_rng)))

# Load dataframes into time series
series_feature = TimeSeries.from_dataframe(df_feature, time_col='ds')
series_target = TimeSeries.from_dataframe(df_target, time_col='ds')

lag_feature = {
    "feature1": [-1, -2, -3],
    "feature2": [-1, -2],
    "feature3": 1,
    "feature4": 1,
}

def get_total_lags(lag_feature):
    total_lags = 1  # Initialized to 1 for the original target
    for key, value in lag_feature.items():
        if isinstance(value, list):
            total_lags += len(value)  # For lists, add the number of lags specified
        else:
            total_lags += 1  # For integers or other types, consider it as a single lag
    return total_lags

total_lags = get_total_lags(lag_feature)  # number of features
horizon = 14 # forecast horizon days

custom_model = MultiConstrainedMultilayerPerceptron(
    solver='lbfgs', hidden_layer_sizes=4, batch_size='auto', activation='relu'
)

model = RegressionModel(
    lags=1, output_chunk_length=horizon, lags_past_covariates=lag_feature,
    model=custom_model, multi_models=True,
    add_encoders={'transformer': Scaler()},
)

```
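
The width of the constraint matrices depends on how many lagged columns the model sees, so it can help to break the count down by source. The sketch below is a plain-Python walk-through; the helper name `count_columns` is hypothetical. It assumes, as Darts documents for integer lag values, that an integer `n` means "the last `n` lags"; for the value `1` used in this tutorial that agrees with `get_total_lags` above.

```python
# Same lag specification as in the tutorial above.
lag_feature = {
    "feature1": [-1, -2, -3],
    "feature2": [-1, -2],
    "feature3": 1,
    "feature4": 1,
}


def count_columns(lag_feature, target_lags=1):
    """Return the number of lagged columns contributed by each source."""
    counts = {"target": target_lags}
    for name, lags in lag_feature.items():
        # A list contributes one column per listed lag; an integer n is
        # assumed to contribute n columns (lags -n, ..., -1).
        counts[name] = len(lags) if isinstance(lags, list) else int(lags)
    return counts


counts = count_columns(lag_feature)
total = sum(counts.values())
print(counts)  # {'target': 1, 'feature1': 3, 'feature2': 2, 'feature3': 1, 'feature4': 1}
print(total)   # 8
```

With `horizon = 14`, each constraint matrix therefore has shape `(14, 8)`: one row per forecast step and one column per lagged input.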
```Python
# Here we set different minimum and maximum constraints for coefficients for each horizon.
min_coef = np.zeros((horizon, total_lags))
max_coef = np.ones((horizon, total_lags))*np.inf

# Custom constraints
max_coef[:,0] = 0 # don't use target
max_coef[1:,6] = 0
max_coef[2:,1] = 0
max_coef[:,6] = 0 # ignore the last
```
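
Before fitting, it can be useful to check which inputs the constraints actually switch off at each forecast step. The following sketch rebuilds the matrices from the block above with plain NumPy (using `horizon = 14` and `total_lags = 8` from this tutorial) and lists the columns whose lower and upper bounds are both zero; the variable name `pinned` is introduced here for illustration only.

```python
import numpy as np

horizon, total_lags = 14, 8  # values used in this tutorial

min_coef = np.zeros((horizon, total_lags))
max_coef = np.ones((horizon, total_lags)) * np.inf
max_coef[:, 0] = 0
max_coef[1:, 6] = 0
max_coef[2:, 1] = 0
max_coef[:, 6] = 0

# Every lower bound must not exceed its upper bound.
assert np.all(min_coef <= max_coef)

# A column is pinned to zero when both of its bounds are zero.
pinned = (min_coef == 0) & (max_coef == 0)
for step in range(3):
    print(f"day {step + 1}: {np.flatnonzero(pinned[step]).tolist()}")
# day 1: [0, 6]
# day 2: [0, 6]
# day 3: [0, 1, 6]
```

The output also shows that `max_coef[1:,6] = 0` has no extra effect here, since `max_coef[:,6] = 0` already covers every row.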
```Python
model.fit(series_target, past_covariates=series_feature, min_coef=min_coef, max_coef=max_coef)
model.model.estimator.reset() # Reset global variable between multi_models

# Predict the next 14 days, using the past 14 days.
pred = model.predict(n=horizon, past_covariates=series_feature)

# Output the learned coefficients and intercepts
for i, estimator in enumerate(model.model.estimators_):
    print(f"The coefficients for day {i+1} are {estimator.coefs_}")
    print(f"The intercept for day {i+1} is {estimator.intercepts_}")
```
In this example, the `MultiConstrainedMultilayerPerceptron` model is used with Darts' `RegressionModel` for sequence forecasts. We apply different constraints for each day within the forecast horizon, controlled by the `min_coef` and `max_coef` parameters.

This gives us the flexibility to independently control the linearity of the regression model for different periods within the forecast horizon.
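
One plausible way to bound how much each input feature can influence a multi-layer perceptron is to clip the first-layer weights after every gradient step. The NumPy sketch below illustrates that idea on a tiny one-hidden-layer network. It is an assumption made for illustration, not a description of how `MultiConstrainedMultilayerPerceptron` works internally; the function `fit_clipped_mlp`, its parameters, and the synthetic data are all hypothetical.

```python
import numpy as np


def fit_clipped_mlp(X, y, min_w, max_w, hidden=4, lr=0.01, n_iter=500, seed=0):
    """Train a one-hidden-layer ReLU network with bounded first-layer weights."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    lo, hi = min_w[:, None], max_w[:, None]  # one bound pair per input feature
    W1 = np.clip(rng.normal(scale=0.1, size=(d, hidden)), lo, hi)
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.1, size=hidden)
    b2 = 0.0
    for _ in range(n_iter):
        z = X @ W1 + b1                    # pre-activations, shape (n, hidden)
        h = np.maximum(z, 0.0)             # ReLU
        err = h @ W2 + b2 - y              # residuals, shape (n,)
        # Gradients of 0.5 * mean squared error.
        grad_W2 = h.T @ err / n
        grad_b2 = err.mean()
        dz = np.outer(err, W2) * (z > 0)   # back-propagate through ReLU
        grad_W1 = X.T @ dz / n
        grad_b1 = dz.mean(axis=0)
        # Gradient step, then clip each input's outgoing weights to its bounds.
        W1 = np.clip(W1 - lr * grad_W1, lo, hi)
        b1 = b1 - lr * grad_b1
        W2 = W2 - lr * grad_W2
        b2 = b2 - lr * grad_b2
    return W1, b1, W2, b2


# Synthetic data (an assumption for this sketch): two input features.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
y = X[:, 0] + 2.0 * X[:, 1]

# Pin the first feature's weights to zero and leave the second unbounded.
min_w = np.array([0.0, -np.inf])
max_w = np.array([0.0, np.inf])
W1, b1, W2, b2 = fit_clipped_mlp(X, y, min_w, max_w)
print(W1[0])  # all zeros: the first feature cannot affect the prediction
```

Setting both bounds of a feature to zero removes it from the network entirely, which mirrors the `max_coef[:,0] = 0` pattern used in the tutorial.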
