
Mldb 2095 boosting and stump #785

Open · wants to merge 6 commits into master
Conversation

@guyd (Contributor) commented Dec 20, 2016

I will wait until this is carefully reviewed before merging since it impacts the logic of all classifiers.

The main problem was in Dense_Feature_Set, which did not force its features to be sorted. This led to features being mismatched during prediction. It happened only when a classifier returned false from predict_is_optimized() (the default), which is in particular the case for Stump.
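A minimal sketch of the failure mode, assuming a simplified stand-in for Dense_Feature_Set (the real MLDB class is richer): lookup code that binary-searches the feature list silently relies on sorted order, so an unsorted set makes the predictor read the wrong values.

```cpp
#include <algorithm>
#include <cassert>
#include <iostream>
#include <utility>
#include <vector>

// Hypothetical stand-in for MLDB's Dense_Feature_Set: a flat list of
// (feature id, value) pairs. Prediction code that looks features up by
// binary search silently relies on this list being sorted by feature id.
using FeatureSet = std::vector<std::pair<int, float>>;

// Lookup as a predictor might do it: binary search assumes sorted input.
float value_of(const FeatureSet & fs, int feature) {
    auto it = std::lower_bound(fs.begin(), fs.end(), feature,
                               [] (const auto & p, int f) { return p.first < f; });
    assert(it != fs.end() && it->first == feature);
    return it->second;
}

int main() {
    // Features inserted out of order, as the unsorted set allowed.
    FeatureSet fs = { {7, 0.5f}, {2, 1.0f}, {5, -3.0f} };

    // Without this sort, value_of() would consult the wrong entries.
    std::sort(fs.begin(), fs.end());

    std::cout << "feature 5 = " << value_of(fs, 5) << "\n";  // prints -3
}
```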

I also cleaned up some of the logic around optimization. The original intent seems to have been that all optimized calls work transparently even when a classifier does not support optimization. This condition was tracked in two ways: by calling predict_is_optimized() and by checking the Optimization_Info::initialized member. I kept only the former. It is therefore now an error to call an optimized predict with an uninitialized Optimization_Info.
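A hypothetical sketch of the resulting contract; the names Classifier, predict, and optimized_predict_impl are placeholders and the real MLDB/JML interfaces differ. The point is that the uninitialized-info case now fails loudly instead of being tracked through a second flag.

```cpp
#include <stdexcept>
#include <vector>

// Simplified stand-in for the real Optimization_Info.
struct Optimization_Info {
    bool initialized = false;
    // ... feature-mapping data would live here ...
};

struct Classifier {
    // Default: classifier does not provide an optimized path (as for Stump).
    virtual bool predict_is_optimized() const { return false; }

    // After this change, calling the optimized path with an uninitialized
    // Optimization_Info is a hard error instead of a silent fallback.
    float predict(const std::vector<float> & features,
                  const Optimization_Info & info) const
    {
        if (!info.initialized)
            throw std::logic_error("optimized predict called with "
                                   "uninitialized Optimization_Info");
        return optimized_predict_impl(features, info);
    }

    virtual float optimized_predict_impl(const std::vector<float> &,
                                         const Optimization_Info &) const = 0;
    virtual ~Classifier() = default;
};
```

Dropping the initialized flag as a control path means callers must check predict_is_optimized() up front, which is the single source of truth this PR keeps.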

I also removed the optimization_supported call. It was only used in the creation of the Optimization_Info, and since the optimized predict methods can now be called in all cases, it was no longer necessary to keep it.

Lastly, the all_features() method does not return all the features the classifier was trained on. An example is a Committee of Stumps, where only the features that were actually selected during training are persisted. This might not be an issue, but I thought it was at first. For this reason, the optimized predict uses Optimization_Info::from_features instead of Optimization_Info::to_features, though this might not be required or even a good idea.
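A speculative illustration of the from_features / to_features distinction as described above (both member names come from the PR text; everything else here is invented for the sketch): selecting the model's inputs through the full input-side feature list keeps indexing consistent even when all_features() omits features that were never selected during training.

```cpp
#include <vector>

// Hypothetical, simplified Optimization_Info for illustration only.
struct Optimization_Info {
    bool initialized = false;
    std::vector<int> from_features;  // every feature in the input space
    std::vector<int> to_features;    // only the features the model uses
};

// Selecting inputs via from_features keeps indexing aligned with the full
// input vector even when the trained model persisted only a subset.
std::vector<float>
select_inputs(const std::vector<float> & dense_input,
              const Optimization_Info & info)
{
    std::vector<float> out;
    out.reserve(info.from_features.size());
    for (int idx : info.from_features)
        out.push_back(dense_input.at(idx));
    return out;
}
```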

@jeremybarnes (Contributor)

all_features() should only return used features, not trained features.

@guyd (Contributor, Author) commented Jan 18, 2017

@jeremybarnes Understood regarding all_features(). I will update this PR since it has gone stale, and submit it for review in the coming days.
