
software for best-subset selection in polynomial time


Jiang-Kangkang/abess

 
 


abess: R & Python Software for Best-Subset Selection in Polynomial Time


abess (Adaptive BEst Subset Selection) aims to find a small subset of predictors such that the resulting linear model is expected to have the best prediction accuracy. This project implements a polynomial-time algorithm proposed to solve this problem. It supports:

  • linear regression
  • classification (binary or multi-class)
  • counting-response modeling
  • censored-response modeling
  • multi-response modeling (multi-task learning)
  • group best subset selection
  • nuisance penalized regression
  • sure independence screening
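
To make the underlying problem concrete, the sketch below solves best-subset selection for linear regression by exhaustive search on a tiny toy dataset. It is not the abess algorithm: it enumerates every subset of a given size, which is exponential in the number of predictors, and is shown only to illustrate the objective that abess targets in polynomial time. The function names (`solve_linear_system`, `rss_for_subset`, `best_subset`) and the data are hypothetical and not part of the abess package.

```python
from itertools import combinations


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def rss_for_subset(X, y, subset):
    """Least-squares fit (no intercept) on the chosen columns.

    Returns (residual sum of squares, coefficients).
    """
    cols = [[row[j] for row in X] for j in subset]
    gram = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(a * b for a, b in zip(ci, y)) for ci in cols]
    beta = solve_linear_system(gram, xty)  # normal equations
    rss = 0.0
    for row, target in zip(X, y):
        pred = sum(b * row[j] for b, j in zip(beta, subset))
        rss += (target - pred) ** 2
    return rss, beta


def best_subset(X, y, support_size):
    """Exhaustively search all subsets of the given size for the smallest RSS."""
    best = None
    for subset in combinations(range(len(X[0])), support_size):
        rss, beta = rss_for_subset(X, y, subset)
        if best is None or rss < best[0]:
            best = (rss, subset, beta)
    return best


# Toy data (made up for illustration): y = 2 * x0 - 3 * x2 exactly.
X = [
    [1, 0, 2, 1],
    [0, 1, 1, 3],
    [2, 1, 0, 1],
    [1, 3, 1, 0],
    [3, 0, 1, 2],
    [0, 2, 3, 1],
]
y = [2 * row[0] - 3 * row[2] for row in X]

rss, subset, beta = best_subset(X, y, support_size=2)
print("selected columns:", subset)
print("coefficients:", beta)
```

With 4 predictors and a support size of 2 the search visits only 6 subsets and selects columns 0 and 2, but the count grows combinatorially with the number of predictors, which is why a polynomial-time method matters.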

Installation

The abess software provides both Python and R interfaces.

Python package

Install the stable version of the Python package from PyPI with:

pip install abess
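
After installing, it can be useful to confirm that the package is visible to the interpreter you intend to use. The following is a minimal sketch that relies only on the Python standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_version` is hypothetical and not part of abess.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("abess")
if version is None:
    print("abess is not installed in this environment; run `pip install abess`")
else:
    print(f"abess {version} is installed")
```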

R package

Install the stable version of the R package from CRAN with:

install.packages("abess")

Performance

To show the computational efficiency of abess, we compare the abess R package with two popular R libraries, glmnet and ncvreg, for linear and logistic regression. CPU execution times are recorded in seconds and averaged over 100 replications on a sequence of 100 regularization parameters.

source("R-package/example/timing.R")

All experiments are run on an Intel(R) Core(TM) i9-9940X CPU @ 3.30GHz under R version 3.6.1, with 100 replications each.
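
The measurement protocol described above (CPU time in seconds, averaged over a fixed number of replications) can be sketched in Python as follows. This is only an illustration of the averaging scheme, not a port of `R-package/example/timing.R`; the helper `mean_cpu_seconds` and the placeholder workload `toy_task` are hypothetical names introduced here.

```python
import time


def mean_cpu_seconds(task, replications=100):
    """Average CPU time (in seconds) of calling task() over several replications."""
    total = 0.0
    for _ in range(replications):
        start = time.process_time()  # CPU time of this process, not wall-clock time
        task()
        total += time.process_time() - start
    return total / replications


def toy_task():
    # Placeholder workload standing in for a model fit.
    return sum(i * i for i in range(10_000))


avg = mean_cpu_seconds(toy_task, replications=100)
print(f"mean CPU time over 100 replications: {avg:.6f} s")
```

Using `time.process_time()` rather than a wall-clock timer keeps the measurement closer to "CPU execution" time, since it excludes time spent sleeping or waiting on other processes.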

Results are presented in the following figure. As a package that solves the best-subset selection problem, abess is highly efficient, especially in linear regression, where it is the fastest of the compared packages.

Figure 1. Running time for different packages

Reference

Zhu, J., Wen, C., Zhu, J., Zhang, H. and Wang, X. (2020). A polynomial algorithm for best-subset selection problem. Proceedings of the National Academy of Sciences, 117(52): 33117-33123. https://doi.org/10.1073/pnas.2014241117
Fan, J. and Lv, J. (2008). Sure independence screening for ultrahigh dimensional feature space. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 70: 849-911. https://doi.org/10.1111/j.1467-9868.2008.00674.x
Sun, Q. and Zhang, H. (2020). Targeted inference involving high-dimensional data using nuisance penalized regression. Journal of the American Statistical Association. https://doi.org/10.1080/01621459.2020.1737079
Zhang, Y., Zhu, J., Zhu, J. and Wang, X. (2021). Certifiably polynomial algorithm for best group subset selection. arXiv preprint arXiv:2104.12576.
