Commit
add IPODRegression
getzze committed May 12, 2023
1 parent afe4c95 commit 0eb0550
Showing 6 changed files with 1,237 additions and 0 deletions.
28 changes: 28 additions & 0 deletions README.md
@@ -41,6 +41,7 @@ This package implements:
* Robust Ridge regression (using any of the previous estimators)
* Quantile regression using interior point method
* Regularized Least Square regression
* Θ-IPOD regression, optionally with a penalty term

## Installation

@@ -68,6 +69,11 @@ For Regularized Least Squares with a penalty term, use `rlm`:

`m = rlm(X, y, L1Penalty(); method=:cgd)`

For Θ-IPOD regression with outlier detection and a penalty term, use `ipod`:

`m = ipod(X, y, L2Loss(), SquaredL2Penalty(); method=:auto)`

For robust versions of the `mean`, `std`, `var` and `sem` statistics, specify the estimator as the first argument.
Use the `dims` keyword for computing the statistics along specific dimensions.
The following functions are also implemented: `mean_and_std`, `mean_and_var` and `mean_and_sem`.
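The estimator-first calls described above can be sketched as follows. This is a hedged example: the `MEstimator{TukeyLoss}` constructor and the exact call signatures are assumptions based on the package's exported names, not verified against its documentation.

```julia
using RobustModels

# Hedged sketch: estimator construction and signatures are assumptions.
x = randn(1000)
est = MEstimator{TukeyLoss}()

m = mean(est, x)               # robust mean
s = std(est, x)                # robust standard deviation
m2, s2 = mean_and_std(est, x)  # both at once

# The `dims` keyword selects the dimension for matrix inputs
M = randn(100, 3)
col_means = mean(est, M; dims=1)
```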
@@ -128,6 +134,12 @@ refit!(m10; quantile=0.8)
## Penalized regression
m11 = rlm(form, data, SquaredL2Penalty(); method=:auto)

## Θ-IPOD regression with outlier detection
m12 = ipod(form, data, TukeyLoss(); method=:auto)

## Θ-IPOD regression with outlier detection and a penalty term
m13 = ipod(form, data, L2Loss(), L1Penalty(); method=:ama)

;

# output
@@ -235,6 +247,21 @@ With a penalty, the following solvers are available instead:
- `:ama`, Alternating Minimization Algorithm [4].
- `:admm`, Alternating Direction Method of Multipliers [5].

To use a robust loss function with a penalty, see Θ-IPOD regression.

### Θ-IPOD regression

_Θ-IPOD regression_ (Θ-thresholding-based Iterative Procedure for Outlier Detection) results from
minimizing the following objective function [6]:
`L = ½ Σᵢ |yᵢ - 𝒙ᵢ 𝜷 - γᵢ|² + P(𝜷) + Q(γ)`,
where `Q(γ)` is a sparsity-inducing penalty on the outliers `γ`, which keeps the problem from being
underdetermined.
The expression of `Q(γ)` does not need to be known explicitly; it is enough that it leads to a
thresholding rule associated with one of the loss functions used by M-estimators, which makes
Θ-IPOD equivalent to solving an M-estimator.
The problem is solved with an alternating minimization technique: the outliers `γ` are estimated by
thresholding the residuals, and, without a penalty `P(𝜷)`, the coefficients are updated at every
step using an _Ordinary Least Squares_ solver.

`P(𝜷)` is an optional (sparse) penalty on the coefficients.
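The alternating minimization described above can be sketched in a few lines of plain Julia. This is a self-contained illustration, not the package's implementation: the soft-thresholding step (which corresponds to an L1-type `Q(γ)`), the threshold `λ`, and all names are illustrative choices.

```julia
using LinearAlgebra

# Soft-thresholding of a residual r at level λ (L1-type outlier penalty).
soft_threshold(r, λ) = sign(r) * max(abs(r) - λ, 0)

# Sketch of Θ-IPOD without a coefficient penalty P(𝜷): alternate between
# thresholding the residuals (γ-step) and an OLS fit on the outlier-corrected
# response (𝜷-step). Other loss functions lead to other thresholding rules.
function ipod_sketch(X, y; λ=1.0, maxiter=100, tol=1e-8)
    n = size(X, 1)
    γ = zeros(n)        # outlier estimates
    F = qr(X)           # factorize once: every 𝜷-step is a plain OLS solve
    β = F \ y
    for _ in 1:maxiter
        # γ-step: threshold the current residuals
        γnew = soft_threshold.(y .- X * β, λ)
        # 𝜷-step: ordinary least squares on the corrected response
        β = F \ (y .- γnew)
        norm(γnew - γ) < tol && break
        γ = γnew
    end
    return β, γ
end
```

Observations with a nonzero entry in the returned `γ` are the detected outliers; with a penalty `P(𝜷)`, the OLS solve in the 𝜷-step would be replaced by a penalized solver such as AMA or ADMM.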

## Credits

@@ -253,3 +280,4 @@ for implementing the Iteratively Reweighted Least Squares algorithm.
[3] "A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems", 2009, A. Beck, M. Teboulle
[4] "Applications of a splitting algorithm to decomposition in convex programming and variational inequalities", 1991, P. Tseng
[5] "Fast Alternating Direction Optimization Methods", 2014, T. Goldstein, B. O'Donoghue, S. Setzer, R. Baraniuk
[6] "Outlier Detection Using Nonconvex Penalized Regression", 2011, Y. She, A.B. Owen
1 change: 1 addition & 0 deletions docs/src/api.md
@@ -23,6 +23,7 @@ RobustModels.FISTAPred
RobustModels.AMAPred
RobustModels.ADMMPred
QuantileRegression
IPODRegression
```

## Constructors for models
4 changes: 4 additions & 0 deletions src/RobustModels.jl
@@ -127,6 +127,9 @@ export LossFunction,
Estimator,
rlm,
quantreg,
IPODRegression,
ipod,
outliers,
penalty,
haspenalty,
loss,
@@ -201,6 +204,7 @@ include("linresp.jl")
include("robustlinearmodel.jl")
include("univariate.jl")
include("quantileregression.jl")
include("ipod.jl")
include("deprecated.jl")

end # module
