add Penalties and Regularized Least Squares
format

use :cholesky and dropcollinear like in GLM

correctly use QR

test method :cholesky

change types

clean after rebase
getzze committed Sep 26, 2024
1 parent af96ef1 commit d3207a9
Showing 17 changed files with 4,124 additions and 21 deletions.
97 changes: 81 additions & 16 deletions README.md
@@ -40,6 +40,8 @@ This package implements:
* MQuantile regression (e.g. Expectile regression)
* Robust Ridge regression (using any of the previous estimator)
* Quantile regression using interior point method
* Regularized Least Squares regression
* Θ-IPOD regression, possibly with a penalty term

## Installation

@@ -55,14 +57,23 @@ julia>] add RobustModels#main

## Usage

The preferred way of performing robust regression is by calling the `rlm` function:

`m = rlm(X, y, MEstimator{TukeyLoss}(); initial_scale=:mad)`

For quantile regression, use `quantreg`:

`m = quantreg(X, y; quantile=0.5)`

For Regularized Least Squares with a penalty term, use `rlm`:

`m = rlm(X, y, L1Penalty(); method=:cgd)`

For Θ-IPOD regression with outlier detection and a penalty term, use `ipod`:

`m = ipod(X, y, L2Loss(), SquaredL2Penalty(); method=:auto)`


For robust versions of the `mean`, `std`, `var` and `sem` statistics, specify the estimator as the first argument.
Use the `dims` keyword for computing the statistics along specific dimensions.
The following functions are also implemented: `mean_and_std`, `mean_and_var` and `mean_and_sem`.
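
As a short sketch of this interface (the estimator object `MEstimator{TukeyLoss}()` mirrors the `rlm` examples; the exact defaults are an assumption, check the docstrings):

```julia
using RobustModels

x = randn(1000)
x[1:10] .+= 50   # contaminate with a few gross outliers

# Pass the estimator as the first argument
μ = mean(MEstimator{TukeyLoss}(), x)
σ = std(MEstimator{TukeyLoss}(), x)

# Compute along a specific dimension of a matrix with the `dims` keyword
X = randn(100, 3)
col_means = mean(MEstimator{TukeyLoss}(), X; dims=1)
```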
@@ -119,6 +130,16 @@ m9 = rlm(form, data, MEstimator{TukeyLoss}(); initial_scale=:L1, ridgeλ=1.0)
## Quantile regression solved by linear programming interior point method
m10 = quantreg(form, data; quantile=0.2)
refit!(m10; quantile=0.8)

## Penalized regression
m11 = rlm(form, data, SquaredL2Penalty(); method=:auto)

## Θ-IPOD regression with outlier detection
m12 = ipod(form, data, TukeyLoss(); method=:auto)

## Θ-IPOD regression with outlier detection and a penalty term
m13 = ipod(form, data, L2Loss(), L1Penalty(); method=:ama)

;

# output
@@ -136,7 +157,7 @@ With ordinary least square (OLS), the objective function is, from maximum likeli
where `yᵢ` are the values of the response variable, `𝒙ᵢ` are the covectors of individual covariates
(rows of the model matrix `X`), `𝜷` is the vector of fitted coefficients and `rᵢ` are the individual residuals.

A `RobustLinearModel` solves instead for the following loss function [1]: `L' = Σᵢ ρ(rᵢ)`
(more precisely `L' = Σᵢ ρ(rᵢ/σ)`, where `σ` is a (robust) estimate of the standard deviation of the residuals).
Several loss functions are implemented:

@@ -150,7 +171,7 @@ Several loss functions are implemented:
- `CauchyLoss`: `ρ(r) = log(1+(r/c)²)`, non-convex estimator, which also corresponds to a Student's-t distribution (with fixed degrees of freedom). It suppresses outliers more strongly, but convergence is not guaranteed.
- `GemanLoss`: `ρ(r) = ½ (r/c)²/(1 + (r/c)²)`, non-convex and bounded estimator, it suppresses outliers more strongly.
- `WelschLoss`: `ρ(r) = ½ (1 - exp(-(r/c)²))`, non-convex and bounded estimator, it suppresses outliers more strongly.
- `TukeyLoss`: `ρ(r) = if r<c; ⅙(1 - (1-(r/c)²)³) else ⅙ end`, non-convex and bounded estimator, it suppresses outliers more strongly and it is the preferred estimator for most cases.
- `YohaiZamarLoss`: `ρ(r)` is quadratic for `r/c < 2/3` and is bounded by 1; non-convex estimator, it is optimized to have the lowest bias for a given efficiency.

The values of the tuning constant `c` are optimized for each estimator so that the M-estimators have a high efficiency of 0.95. However, these estimators have a low breakdown point.
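
As an illustration, the loss objects can be evaluated directly with `RobustModels.rho` and its derivative `RobustModels.psi` (both listed in the API documentation); the exact normalization of the returned values is an implementation detail and not assumed here:

```julia
using RobustModels

loss = TukeyLoss()

# ρ is roughly quadratic for small residuals and bounded for large ones,
# which is what limits the influence of outliers.
RobustModels.rho(loss, 0.1)
RobustModels.rho(loss, 100.0)   # saturates at its maximum value

# ψ = ρ' is the influence function; for a redescending loss like Tukey
# it returns to zero for large residuals, so gross outliers get zero weight.
RobustModels.psi(loss, 100.0)
```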
@@ -178,9 +199,20 @@ the two loss functions should be the same but with different tuning constants.
### MQuantile-estimators

Using an asymmetric variant of the `L1Estimator`, quantile regression is performed
(although the `QuantileRegression` solver should be preferred because it gives an exact solution).
Identically, with an M-estimator using an asymmetric version of the loss function,
a generalization of quantiles is obtained. For instance, using an asymmetric `L2Loss` results in _Expectile Regression_.
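
As a sketch of the expectile case (assuming the exported `ExpectileEstimator` accepts the same `quantile` keyword as `quantreg`; check the docstring):

```julia
using RobustModels

# Expectile regression at the 0.8 level: an asymmetric L2 loss,
# the M-quantile generalization of the mean.
m = rlm(X, y, ExpectileEstimator(); quantile=0.8)
```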

### Quantile regression

_Quantile regression_ results from minimizing the following objective function:
`L = Σᵢ wᵢ|yᵢ - 𝒙ᵢ 𝜷| = Σᵢ wᵢ(rᵢ) |rᵢ|`,
where `wᵢ = ifelse(rᵢ>0, τ, 1-τ)` and `τ` is the quantile of interest. `τ=0.5` corresponds to _Least Absolute Deviations_.

This problem is solved exactly using linear programming techniques,
specifically, interior point methods using the internal API of [Tulip](https://github.com/ds4dm/Tulip.jl).
The internal API is considered unstable, but it results in a much lighter dependency than
including the [JuMP](https://github.com/JuliaOpt/JuMP.jl) package with Tulip backend.
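
For example, several conditional quantiles can be fitted with `quantreg` and `refit!`, mirroring the usage shown earlier:

```julia
using RobustModels

m = quantreg(X, y; quantile=0.1)   # conditional 10% quantile
β10 = coef(m)
refit!(m; quantile=0.5)            # τ = 0.5: Least Absolute Deviations
β50 = coef(m)
refit!(m; quantile=0.9)            # conditional 90% quantile
β90 = coef(m)
```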

### Robust Ridge regression

@@ -192,27 +224,60 @@ By default, all the coefficients (except the intercept) have the same penalty, w
all the feature variables have the same scale. If it is not the case, use a robust estimate of scale
to normalize every column of the model matrix `X` before fitting the regression.
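
One possible way to follow that advice (the use of `StatsBase.mad` for the robust column scale is an assumption, not part of this package's API):

```julia
using RobustModels
using StatsBase: mad

# Rescale each feature column by a robust estimate of its scale (the MAD),
# so that a single ridgeλ penalizes all features comparably.
scales = [mad(col; normalize=true) for col in eachcol(X)]
Xs = X ./ scales'
m = rlm(Xs, y, MEstimator{TukeyLoss}(); initial_scale=:mad, ridgeλ=1.0)
```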

### Regularized Least Squares

_Regularized Least Squares_ regression minimizes the following objective function:
`L = ½ Σᵢ |yᵢ - 𝒙ᵢ 𝜷|² + P(𝜷)`,
where `P(𝜷)` is a (sparse) penalty on the coefficients.

The following penalty functions are defined:
- `NoPenalty`: `cost(𝐱) = 0`, no penalty.
- `SquaredL2Penalty`: `cost(𝐱) = λ ½||𝐱||₂²`, also called Ridge.
- `L1Penalty`: `cost(𝐱) = λ||𝐱||₁`, also called LASSO.
- `ElasticNetPenalty`: `cost(𝐱) = λ · l1_ratio · ||𝐱||₁ + λ · (1 - l1_ratio) · ½||𝐱||₂²`.
- `EuclideanPenalty`: `cost(𝐱) = λ||𝐱||₂`, non-separable penalty used in Group LASSO.

Different penalties can be applied to different indices of the coefficients, using
`RangedPenalties(ranges, penalties)`. E.g., `RangedPenalties([2:5], [L1Penalty(1.0)])` defines
an L1 penalty on every coefficient except the first one, which can correspond to the intercept.
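
For instance, combining the `RangedPenalties` definition above with `rlm` (the solver choice is only indicative):

```julia
using RobustModels

# L1-penalize coefficients 2 to 5 and leave the first coefficient
# (e.g. the intercept) unpenalized.
pen = RangedPenalties([2:5], [L1Penalty(1.0)])
m = rlm(X, y, pen; method=:cgd)
```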

With a penalty, the following solvers are available (instead of the standard ones):
- `:cgd`, Coordinate Gradient Descent [2].
- `:fista`, Fast Iterative Shrinkage-Thresholding Algorithm [3].
- `:ama`, Alternating Minimization Algorithm [4].
- `:admm`, Alternating Direction Method of Multipliers [5].

To use a robust loss function with a penalty, see Θ-IPOD regression.

### Θ-IPOD regression

_Θ-IPOD regression_ (Θ-thresholding based Iterative Procedure for Outlier Detection) results from
minimizing the following objective function [6]:
`L = ½ Σᵢ |yᵢ - 𝒙ᵢ 𝜷 - γᵢ|² + P(𝜷) + Q(γ)`,
where `Q(γ)` is a sparsity-inducing penalty on the outliers `γ`, so the problem is not underdetermined.
The expression of this penalty function does not need to be known explicitly, only that it leads to
thresholding with one of the loss functions used by M-estimators; Θ-IPOD is then equivalent to solving an M-estimator problem.
This problem is solved with an alternating minimization technique for the outlier detection.
Without a penalty, the coefficients are updated at every step using an _Ordinary Least Squares_ solver.

`P(𝜷)` is an optional (sparse) penalty on the coefficients.
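
A hedged sketch of fitting a Θ-IPOD model and inspecting the detected outliers with the exported `outliers` accessor (the loss/penalty/solver combination and the zero-offset convention for non-outliers are assumptions):

```julia
using RobustModels

# Θ-IPOD with a redescending loss and an L1 penalty on the coefficients,
# following the `ipod` calls shown earlier.
m = ipod(X, y, TukeyLoss(), L1Penalty(); method=:ama)

γ = outliers(m)                  # estimated outlier offsets, one per observation
flagged = findall(!iszero, γ)    # observations flagged as outliers
```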


## Credits

This package derives from the [RobustLeastSquares](https://github.com/FugroRoames/RobustLeastSquares.jl)
package for the initial implementation, especially for the Conjugate Gradient
solver and the definition of the M-Estimator functions.

Credits to the developers of the [GLM](https://github.com/JuliaStats/GLM.jl)
and [MixedModels](https://github.com/JuliaStats/MixedModels.jl) packages
for implementing the Iteratively Reweighted Least Square algorithm.

## References

- "Robust Statistics: Theory and Methods (with R)", 2nd Edition, 2019, R. Maronna, R. Martin, V. Yohai, M. Salibián-Barrera
[1] "Robust Statistics: Theory and Methods (with R)", 2nd Edition, 2019, R. Maronna, R. Martin, V. Yohai, M. Salibián-Barrera
[2] "Regularization Paths for Generalized Linear Models via Coordinate Descent", 2009, J. Friedman, T. Hastie, R. Tibshirani
[3] "A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems", 2009, A. Beck, M. Teboulle
[4] "Applications of a splitting algorithm to decomposition in convex programming and variational inequalities", 1991, P. Tseng
[5] "Fast Alternating Direction Optimization Methods", 2014, T. Goldstein, B. O'Donoghue, S. Setzer, R. Baraniuk
[6] "Outlier Detection Using Nonconvex Penalized Regression", 2011, Y. She, A.B. Owen
1 change: 1 addition & 0 deletions _typos.toml
@@ -8,3 +8,4 @@ Artic = "Artic"

[default.extend-words]
qest = "qest"
wrk = "wrk"
28 changes: 27 additions & 1 deletion docs/src/api.md
@@ -9,6 +9,7 @@ AbstractQuantileEstimator
LossFunction
RobustLinearModel
RobustModels.RobustLinResp
RobustModels.IPODResp
GLM.LinPred
RobustModels.DensePredCG
RobustModels.SparsePredCG
@@ -17,7 +18,12 @@ GLM.SparsePredChol
GLM.DensePredQR
RobustModels.RidgePred
RobustModels.AbstractRegularizedPred
RobustModels.CGDPred
RobustModels.FISTAPred
RobustModels.AMAPred
RobustModels.ADMMPred
QuantileRegression
IPODRegression
```

## Constructors for models
@@ -30,6 +36,7 @@ fit(::Type{M}, ::Union{AbstractMatrix{T}}, ::AbstractVector{T}) where {T<:Abstra
```@docs
rlm
quantreg
ipod
fit!
refit!
```
@@ -67,10 +74,13 @@ StatsAPI.residuals
StatsModels.hasintercept
hasformula
formula
haspenalty
penalty
scale
tauscale
RobustModels.location_variance
Estimator
outliers
GLM.linpred!
RobustModels.pirls!
RobustModels.pirls_Sestimate!
@@ -111,7 +121,17 @@ HardThresholdLoss
HampelLoss
```

## Penalty functions
```@docs
NoPenalty
SquaredL2Penalty
EuclideanPenalty
L1Penalty
ElasticNetPenalty
RangedPenalties
```

## Estimator, Loss and Penalty functions methods
```@docs
RobustModels.rho
RobustModels.psi
@@ -139,4 +159,10 @@ RobustModels.set_MEstimator
RobustModels.update_weight!
RobustModels.tau_scale_estimate
RobustModels.quantile_weight
RobustModels.cost
RobustModels.proximal!
RobustModels.proximal
RobustModels.isconcrete
RobustModels.concrete!
RobustModels.concrete
```
23 changes: 23 additions & 0 deletions src/RobustModels.jl
@@ -173,6 +173,18 @@ export LossFunction,
GeneralizedQuantileEstimator,
ExpectileEstimator,
L2Estimator,
PenaltyFunction,
NoPenalty,
SquaredL2Penalty,
L1Penalty,
ElasticNetPenalty,
EuclideanPenalty,
BerhuPenalty,
CappedL1Penalty,
SCADPenalty,
MCPPenalty,
RangedPenalties,
End,
DensePredCG,
SparsePredCG,
RidgePred,
@@ -183,6 +195,11 @@
Estimator,
rlm,
quantreg,
IPODRegression,
ipod,
outliers,
penalty,
haspenalty,
loss,
tuning_constant,
refit!,
@@ -224,6 +241,9 @@ abstract type AbstractMEstimator <: AbstractEstimator end
"Generalized M-Quantile estimator"
abstract type AbstractQuantileEstimator <: AbstractMEstimator end

"Penalty function"
abstract type PenaltyFunction{T} end


"""
AbstractRobustModel
@@ -243,17 +263,20 @@ abstract type AbstractRegularizedPred{T} end

Base.broadcastable(m::AbstractEstimator) = Ref(m)
Base.broadcastable(m::LossFunction) = Ref(m)
Base.broadcastable(m::PenaltyFunction) = Ref(m)


include("tools.jl")
include("losses.jl")
include("estimators.jl")
include("penalties.jl")
include("linpred.jl")
include("regularizedpred.jl")
include("linresp.jl")
include("robustlinearmodel.jl")
include("univariate.jl")
include("quantileregression.jl")
include("ipod.jl")
include("deprecated.jl")

end # module
