
Preliminary work to produce a one-hot encoding tool, now integrated in tick


SimonBussy/one-hot-encoding


one-hot-encoding

This is preliminary work towards a scikit-learn transformer that maps an input matrix of shape (n_samples, n_features) to a binary matrix of shape (n_samples, n_new_features). Continuous features are discretized and expanded into binary features, using either linearly spaced or inter-quantile bins. Discrete features are binary encoded with K columns, where K is the number of modalities. All other features (neither continuous nor discrete) are left unchanged.
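The encoding described above can be illustrated with a minimal pure-Python sketch. It is not the code from this repository or from tick: the function names (`bin_edges`, `binarize_continuous`, `binarize_discrete`, `encode`), the `kinds` argument, and the exact way quantile cut points are chosen are all assumptions made for illustration.

```python
from bisect import bisect_right


def bin_edges(values, n_bins, method="quantile"):
    """Return the n_bins - 1 interior cut points for a continuous feature."""
    ordered = sorted(values)
    if method == "linspace":
        # Linearly spaced cut points between the min and the max.
        lo, hi = ordered[0], ordered[-1]
        step = (hi - lo) / n_bins
        return [lo + k * step for k in range(1, n_bins)]
    # "quantile": cut points taken at evenly spaced ranks of the sorted sample
    # (a simple convention chosen for this sketch).
    n = len(ordered)
    return [ordered[(k * n) // n_bins] for k in range(1, n_bins)]


def binarize_continuous(values, n_bins, method="quantile"):
    """One-hot encode a continuous feature: one column per bin."""
    edges = bin_edges(values, n_bins, method)
    rows = []
    for v in values:
        idx = bisect_right(edges, v)  # index of the bin containing v
        rows.append([1 if j == idx else 0 for j in range(n_bins)])
    return rows


def binarize_discrete(values):
    """One-hot encode a discrete feature with K columns (K = number of modalities)."""
    modalities = sorted(set(values))
    return [[1 if v == m else 0 for m in modalities] for v in values]


def encode(X, kinds, n_bins=3, method="quantile"):
    """Encode X column by column; `kinds` labels each column (hypothetical argument)."""
    blocks = []
    for j, kind in enumerate(kinds):
        column = [row[j] for row in X]
        if kind == "continuous":
            blocks.append(binarize_continuous(column, n_bins, method))
        elif kind == "discrete":
            blocks.append(binarize_discrete(column))
        else:
            # Other features are left unchanged.
            blocks.append([[v] for v in column])
    # Concatenate the per-feature blocks row by row.
    return [sum((block[i] for block in blocks), []) for i in range(len(X))]


X = [[0.0, "a", 7], [1.0, "b", 8], [2.0, "a", 9], [3.0, "c", 10]]
encoded = encode(X, ["continuous", "discrete", "other"], n_bins=2)
# encoded[0] == [1, 0, 1, 0, 0, 7]
```

In this sketch the continuous column yields `n_bins` columns, the discrete column yields one column per modality, and the last column passes through untouched, so n_new_features is 2 + 3 + 1 = 6 for the example input.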

This work has since been updated and integrated into the tick module as a preprocessing tool (here). It is used in the paper "Binarsity: a penalization for one-hot encoded features", available here and accepted for publication in JMLR.
