tags: python Machine Learning

Regularized Regression & MARS & PLS & kernel overview

statistics packages for Python

https://bradleyboehmke.github.io/HOML/mars.html

regularized regression

  1. Ridge

  2. Lasso

  • Lasso can be used to identify and extract the features with the largest (and most consistent) signal.

  3. Group Lasso

  4. Elastic Net

the difference between Lasso and Ridge

Ridge vs. Lasso

  • the Lasso penalty has a kink at zero, so coefficients can shrink to exactly 0
  • for Ridge regression, the larger lambda is, the closer the slope (coefficient) gets to 0, but it never reaches exactly 0; see the sketch below
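
A minimal scikit-learn sketch of this contrast (the toy data via `make_regression` is an assumption, not from the source):

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso
from sklearn.datasets import make_regression

# Toy data: only 10 of 50 features carry signal
X, y = make_regression(n_samples=200, n_features=50, n_informative=10,
                       noise=10.0, random_state=0)

for alpha in [0.1, 1.0, 10.0]:
    ridge = Ridge(alpha=alpha).fit(X, y)
    lasso = Lasso(alpha=alpha).fit(X, y)
    # Ridge shrinks coefficients toward 0 but almost never to exactly 0;
    # Lasso's kink at 0 sets many coefficients exactly to 0 (sparsity)
    print(alpha,
          np.sum(np.isclose(ridge.coef_, 0)),  # typically 0 zeroed coefficients
          np.sum(np.isclose(lasso.coef_, 0)))  # grows as alpha grows
```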

MARS (Multivariate Adaptive Regression Splines)
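
MARS is not in scikit-learn; below is a hedged sketch assuming the third-party py-earth package (`pyearth.Earth`; the HOML link above uses R's earth, and the toy data here is an assumption):

```python
import numpy as np
from pyearth import Earth  # assumption: py-earth is installed

rng = np.random.default_rng(0)
X = rng.uniform(-5, 5, size=(300, 2))
y = np.where(X[:, 0] > 0, 2 * X[:, 0], 0.0) + X[:, 1] ** 2 \
    + rng.normal(0, 0.5, 300)

# MARS places hinge functions max(0, x - knot) at data-driven knots
model = Earth(max_degree=2)
model.fit(X, y)
print(model.summary())  # lists the selected hinge basis functions
```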

PLS (partial least squares)

  • suitable for cases where the number of predictors is much larger than the number of observations (p >> n)
  • supervised learning that considers the projection to latent structures

loading plot for PLS (similar to PCA)

interpretation of PLS

sample code for PLS
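
A minimal sketch with scikit-learn's `PLSRegression` in a p >> n setting (the toy data is an assumption):

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
# p >> n: 20 observations, 100 predictors
X = rng.normal(size=(20, 100))
y = X[:, :5].sum(axis=1) + rng.normal(scale=0.1, size=20)

pls = PLSRegression(n_components=2).fit(X, y)
print(pls.score(X, y))        # training R^2
print(pls.x_loadings_.shape)  # (100, 2): feeds the loading plot above
```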

variable selection for PLS

Gaussian Process Regression vs. Kernel Ridge Regression

GPR vs. KRR

Different Kernel for GPR

kernel introduction
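
A minimal sketch contrasting the two in scikit-learn (kernel choices and hyperparameter values are illustrative assumptions): KRR returns point predictions and its hyperparameters are usually tuned by cross-validation, while GPR fits kernel hyperparameters by maximizing the marginal likelihood and also returns predictive uncertainty.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=50)

# KRR: point predictions only; alpha/gamma would be tuned by CV
krr = KernelRidge(kernel="rbf", alpha=0.1, gamma=0.5).fit(X, y)

# GPR: kernel hyperparameters fit by maximizing the marginal likelihood,
# and predictions come with uncertainty estimates
gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel()).fit(X, y)
mean, std = gpr.predict(X[:5], return_std=True)
```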

Residual (GLS influential points)

influential points for GLM models
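
A minimal sketch of influence diagnostics for a GLM, assuming a statsmodels version where `GLMResults.get_influence()` is available (the Poisson toy model is an assumption):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = sm.add_constant(rng.normal(size=(100, 2)))
y = rng.poisson(np.exp(X @ np.array([0.5, 0.3, -0.2])))

results = sm.GLM(y, X, family=sm.families.Poisson()).fit()

# Influence diagnostics (Cook's distance, leverage, ...) for the fitted GLM
infl = results.get_influence()
cooks_d = infl.cooks_distance[0]
print(np.argsort(cooks_d)[-5:])  # indices of the 5 most influential points
```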

SVR

Different SVR models

$\epsilon$-SVR

  • minimize the $\epsilon$-insensitive loss function along with the $\frac{1}{2}w^Tw$ regularization term, where $|y_i-f(x_i)|_\epsilon=\max(0, |y_i-f(x_i)|-\epsilon)$ is the $\epsilon$-insensitive loss function

$\min_{w, b, \kappa}\frac{1}{2}w^Tw+C\sum_{i=1}^l(\kappa_i+\kappa_i^*)$ $subject \ to$ $y_i-(A_iw+b)\le\epsilon+\kappa_i$, $(A_iw+b)-y_i\le\epsilon+\kappa_i^*, \ \ \kappa_i, \kappa_i^*\ge 0$
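
A minimal $\epsilon$-SVR sketch in scikit-learn (hyperparameter values and toy data are illustrative assumptions):

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(100, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=100)

# epsilon sets the width of the insensitive tube: residuals below epsilon
# incur no loss, so a wider tube leaves fewer support vectors
svr = SVR(kernel="rbf", C=1.0, epsilon=0.1).fit(X, y)
print(len(svr.support_))  # number of support vectors
```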

Least Squares SVR

  • minimize the quadratic loss function along with the $\frac{1}{2}w^Tw$ regularization term

$\min_{w, b}\frac{1}{2}w^Tw+C\sum_{i=1}^l(y_i-f(x_i))^2$

$\min_{w, b, \xi}\frac{1}{2}\|w\|^2+C_1\sum_{i=1}^l\xi_i^2$ $subject \ to$ $y_i-(A_iw+b)=\xi_i, \ \ i=1, 2, \ldots, l$

Huber SVR

  • uses the Huber loss (improves the robustness of the squared loss function to outliers); see the sketch after this list
  • SGDClassifier / SGDRegressor
    • classification loss functions (SGDClassifier): hinge, log (log_loss in recent scikit-learn), modified_huber, squared_hinge
    • regression loss functions (SGDRegressor): squared_error, huber, epsilon_insensitive
      • log loss: gives logistic regression
      • modified_huber: smooth loss that brings tolerance to outliers as well as probability estimates
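
A minimal sketch of where these losses live in scikit-learn (parameter values are illustrative assumptions):

```python
from sklearn.linear_model import SGDClassifier, SGDRegressor

# Classification losses belong to SGDClassifier ...
clf = SGDClassifier(loss="modified_huber")  # smooth, outlier-tolerant,
                                            # supports predict_proba

# ... while the regression losses (including Huber) belong to SGDRegressor
reg = SGDRegressor(loss="huber", epsilon=1.35)  # epsilon: switch point
                                                # between squared and linear loss
```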

Twin SVR (TSVR)

  • TSVR estimates two non-parallel hyperplanes by solving two quadratic programming problems (QPPs)

Focal loss

  • down-weights easy examples, because focal loss aims to spend the training effort on hard examples; see the sketch below
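
A minimal NumPy sketch of binary focal loss (the defaults $\gamma=2$, $\alpha=0.25$ follow the original focal loss paper; the implementation itself is an assumption):

```python
import numpy as np

def focal_loss(y_true, p_pred, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss: FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t)."""
    p_pred = np.clip(p_pred, eps, 1 - eps)
    # p_t: predicted probability assigned to the true class
    p_t = np.where(y_true == 1, p_pred, 1 - p_pred)
    alpha_t = np.where(y_true == 1, alpha, 1 - alpha)
    # (1 - p_t)^gamma down-weights easy examples (p_t close to 1)
    return np.mean(-alpha_t * (1 - p_t) ** gamma * np.log(p_t))

# An easy example (p_t = 0.95) contributes far less than a hard one (p_t = 0.3)
print(focal_loss(np.array([1]), np.array([0.95])))
print(focal_loss(np.array([1]), np.array([0.30])))
```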