added ridge_regression.py #12553

Open
wants to merge 18 commits into
base: master
Changes from 9 commits
103 changes: 103 additions & 0 deletions machine_learning/ridge_regression.py
@@ -0,0 +1,103 @@
import numpy as np
from matplotlib import pyplot as plt
from sklearn import datasets

Check failure on line 3 in machine_learning/ridge_regression.py

GitHub Actions / ruff

Ruff (I001)

machine_learning/ridge_regression.py:1:1: I001 Import block is un-sorted or un-formatted

# Ridge Regression function
# reference : https://en.wikipedia.org/wiki/Ridge_regression
def ridge_cost_function(x: np.ndarray, y: np.ndarray, theta: np.ndarray, alpha: float) -> float:

Check failure on line 7 in machine_learning/ridge_regression.py

GitHub Actions / ruff

Ruff (E501)

machine_learning/ridge_regression.py:7:89: E501 Line too long (96 > 88)

As there is no test file in this pull request nor any test function or class in the file machine_learning/ridge_regression.py, please provide doctest for the function ridge_cost_function

Please provide descriptive name for the parameter: x

Please provide descriptive name for the parameter: y

    """
    Compute the Ridge regression cost function with L2 regularization.

    J(θ) = (1/2m) * Σ_i (y_i - hθ(x_i))^2 + (alpha/2) * Σ_j θ_j^2  (j = 1 to n-1)

    Where:
    - J(θ) is the cost function we aim to minimize
    - m is the number of training examples
    - hθ(x_i) = x_i · θ is the prediction for example i
    - y_i is the actual target value for example i
    - alpha is the regularization parameter
    - θ_0 (the bias term) is not penalized

    @param x: The feature matrix (m x n)
    @param y: The target vector (m,)
    @param theta: The parameters (weights) of the model (n,)
    @param alpha: The regularization parameter

    @returns: The computed cost value
    """
    m = len(y)
    predictions = np.dot(x, theta)
    cost = (1 / (2 * m)) * np.sum((predictions - y) ** 2) + \
        (alpha / 2) * np.sum(theta[1:] ** 2)

    return cost
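
The review comments on this function ask for a doctest and descriptive parameter names. A minimal sketch of how that could look is below. The names `ridge_cost`, `features`, and `targets` are illustrative assumptions, not names from this pull request, and the formula follows the docstring above (bias term `theta[0]` left unpenalized).

```python
import numpy as np


def ridge_cost(
    features: np.ndarray, targets: np.ndarray, theta: np.ndarray, alpha: float
) -> float:
    """
    Ridge cost: mean squared error term plus an L2 penalty on theta[1:].

    Hypothetical rename of ridge_cost_function, shown only to illustrate
    a doctest; the first column of `features` is assumed to be the bias column.

    >>> features = np.array([[1.0, 1.0], [1.0, 2.0]])
    >>> targets = np.array([1.0, 2.0])
    >>> ridge_cost(features, targets, np.array([0.0, 1.0]), alpha=0.5)
    0.25
    """
    m = len(targets)
    residuals = features @ theta - targets
    # The bias weight theta[0] is excluded from the penalty
    penalty = (alpha / 2) * np.sum(theta[1:] ** 2)
    return float(np.sum(residuals**2) / (2 * m) + penalty)
```

In the doctest the fit is exact, so the whole cost comes from the penalty: (0.5 / 2) * 1.0^2 = 0.25.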

def ridge_gradient_descent(x: np.ndarray, y: np.ndarray, theta: np.ndarray, alpha: float, learning_rate: float, max_iterations: int) -> np.ndarray:

Check failure on line 34 in machine_learning/ridge_regression.py

GitHub Actions / ruff

Ruff (E501)

machine_learning/ridge_regression.py:34:89: E501 Line too long (147 > 88)

As there is no test file in this pull request nor any test function or class in the file machine_learning/ridge_regression.py, please provide doctest for the function ridge_gradient_descent

Please provide descriptive name for the parameter: x

Please provide descriptive name for the parameter: y

    """
    Perform gradient descent to minimize the cost function and fit the Ridge regression model.

Check failure on line 36 in machine_learning/ridge_regression.py

GitHub Actions / ruff

Ruff (E501)

machine_learning/ridge_regression.py:36:89: E501 Line too long (94 > 88)

    @param x: The feature matrix (m x n)
    @param y: The target vector (m,)
    @param theta: The initial parameters (weights) of the model (n,)
    @param alpha: The regularization parameter
    @param learning_rate: The learning rate for gradient descent
    @param max_iterations: The number of iterations for gradient descent

    @returns: The optimized parameters (weights) of the model (n,)
    """
    m = len(y)

    for iteration in range(max_iterations):
        predictions = np.dot(x, theta)
        error = predictions - y

        # calculate the gradient
        gradient = (1 / m) * np.dot(x.T, error)
        # The (alpha / 2) * sum(theta_j^2) penalty in ridge_cost_function has
        # derivative alpha * theta_j; the bias theta[0] is not penalized
        gradient[1:] += alpha * theta[1:]
        theta -= learning_rate * gradient

        if iteration % 100 == 0:
            cost = ridge_cost_function(x, y, theta, alpha)
            print(f"Iteration {iteration}, Cost: {cost}")

    return theta
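
One way to confirm that a gradient update matches its cost function is a finite-difference check. The sketch below assumes the penalty `(alpha / 2) * sum(theta[1:] ** 2)` from the cost docstring, for which the matching gradient term is `alpha * theta[1:]`. The helpers `ridge_cost`, `ridge_gradient`, and `numerical_gradient` are hypothetical and not part of this pull request.

```python
import numpy as np


def ridge_cost(x: np.ndarray, y: np.ndarray, theta: np.ndarray, alpha: float) -> float:
    """Cost as stated in the docstring: MSE/2 plus (alpha/2) * sum(theta[1:]^2)."""
    m = len(y)
    residuals = x @ theta - y
    return float(np.sum(residuals**2) / (2 * m) + (alpha / 2) * np.sum(theta[1:] ** 2))


def ridge_gradient(
    x: np.ndarray, y: np.ndarray, theta: np.ndarray, alpha: float
) -> np.ndarray:
    """Analytic gradient of ridge_cost; the bias theta[0] is not penalized."""
    m = len(y)
    gradient = x.T @ (x @ theta - y) / m
    gradient[1:] += alpha * theta[1:]
    return gradient


def numerical_gradient(cost_fn, theta: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Central-difference approximation of the gradient of cost_fn at theta."""
    gradient = np.zeros_like(theta)
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = eps
        gradient[j] = (cost_fn(theta + step) - cost_fn(theta - step)) / (2 * eps)
    return gradient
```

Comparing `ridge_gradient(x, y, theta, alpha)` against `numerical_gradient(lambda t: ridge_cost(x, y, t, alpha), theta)` with `np.allclose` on a small random problem would make a compact doctest or unit test for the update rule.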



if __name__ == "__main__":
    import doctest
    doctest.testmod()

    # Load California Housing dataset
    california_housing = datasets.fetch_california_housing()
    x = california_housing.data[:, :2]  # 2 features for simplicity
    y = california_housing.target
    x = (x - np.mean(x, axis=0)) / np.std(x, axis=0)

    # Add a bias column (intercept) to X
    x = np.c_[np.ones(x.shape[0]), x]

    # Initialize parameters (theta)
    theta_initial = np.zeros(x.shape[1])

    # Set hyperparameters
    alpha = 0.1
    learning_rate = 0.01
    max_iterations = 1000

    optimized_theta = ridge_gradient_descent(x, y, theta_initial, alpha, learning_rate, max_iterations)
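
The preprocessing in the block above (standardize each feature, then prepend a column of ones for the intercept) could be pulled into a small helper so it can be tested without downloading a dataset. This is a sketch only: `standardize_with_bias` is a hypothetical name, and the guard against constant columns is an added assumption that the pull request does not include.

```python
import numpy as np


def standardize_with_bias(features: np.ndarray) -> np.ndarray:
    """
    Scale each column to zero mean and unit variance, then prepend a bias column.

    >>> standardize_with_bias(np.array([[1.0, 10.0], [3.0, 30.0]]))
    array([[ 1., -1., -1.],
           [ 1.,  1.,  1.]])
    """
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    # Assumption: leave constant columns unscaled rather than divide by zero
    std = np.where(std == 0, 1.0, std)
    scaled = (features - mean) / std
    return np.c_[np.ones(scaled.shape[0]), scaled]
```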

Check failure on line 87 in machine_learning/ridge_regression.py

GitHub Actions / ruff

Ruff (E501)

machine_learning/ridge_regression.py:87:89: E501 Line too long (103 > 88)

An error occurred while parsing the file: machine_learning/ridge_regression.py

Traceback (most recent call last):
  File "/opt/render/project/src/algorithms_keeper/parser/python_parser.py", line 146, in parse
    reports = lint_file(
              ^^^^^^^^^^
libcst._exceptions.ParserSyntaxError: Syntax Error @ 99:3.
parser error: error at 98:2: expected one of (, *, +, -, ..., AWAIT, EOF, False, NAME, NUMBER, None, True, [, break, continue, elif, else, lambda, match, not, pass, ~

    optimized_theta = ridge_gradient_descent(x, y, theta_initial, alpha, learning_rate, max_iterations)
  ^

    print(f"Optimized theta: {optimized_theta}")

    # Prediction
    def predict(x, theta):

Please provide return type hint for the function: predict. If the function does not return a value, please provide the type hint as: def function() -> None:

As there is no test file in this pull request nor any test function or class in the file machine_learning/ridge_regression.py, please provide doctest for the function predict

Please provide descriptive name for the parameter: x

Please provide type hint for the parameter: x

Please provide type hint for the parameter: theta

        return np.dot(x, theta)
    y_pred = predict(x, optimized_theta)

    # Plotting the results (here we visualize predicted vs actual values)
    plt.figure(figsize=(10, 6))
    plt.scatter(y, y_pred, color='b', label='Predictions vs Actual')
    plt.plot([min(y), max(y)], [min(y), max(y)], color='r', label='Perfect Fit')
    plt.xlabel("Actual values")
    plt.ylabel("Predicted values")
    plt.title("Ridge Regression: Actual vs Predicted Values")
    plt.legend()
    plt.show()
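
The review comments on `predict` ask for a return type hint, parameter type hints, a descriptive parameter name, and a doctest. A sketch that covers those requests is below, assuming the function is moved to module level so that `doctest.testmod()` can find it. The parameter name `features` is an illustrative choice, not one taken from this pull request.

```python
import numpy as np


def predict(features: np.ndarray, theta: np.ndarray) -> np.ndarray:
    """
    Return the linear predictions features @ theta.

    The first column of `features` is assumed to be the bias column of ones.

    >>> predict(np.array([[1.0, 2.0], [1.0, 3.0]]), np.array([0.5, 2.0]))
    array([4.5, 6.5])
    """
    return features @ theta
```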