Assigned leaf node in decision_path() #84
-
Hi Isak, I have a question about the RandomShapeletForest method "decision_path": I was wondering if it's possible to modify the boolean 2D matrix in order to keep track of the assigned leaf node for each sample. Is there another way to recover this information? Suppose that there are at least two leaves with the same class.
Replies: 2 comments
-
Hi! Thanks for your question! You can get the leaf node index using the apply method of ShapeletTreeClassifier. For example:

import numpy as np
from wildboar.ensemble import ShapeletForestClassifier
from wildboar.datasets import load_gun_point
X, y = load_gun_point()
f = ShapeletForestClassifier()
f.fit(X, y)
# Get the index of the node where each sample ends up
# eg leaves[0] contains the leaf node index for the first sample
leaves = f.estimators_[0].apply(X)
# count which leaf has the most samples in it
np.unique(leaves, return_counts=True)

You can also request it for the forest:

leaves = f.apply(X)
leaves[0, 0] # the index of the leaf of the first sample in the first tree
leaves[:, 0] # the index of the leaves for all samples in the first tree

While writing this, I realise this might not be what you are asking for. Are you asking for information about the majority class in each leaf? You can get the probability distribution over labels in a tree from f.estimators_[0].tree_.value. You'll note that some of these values are ... To keep only the rows that belong to leaf nodes:

tree = f.estimators_[0]
leaf_proba = tree.tree_.value[tree.tree_.left == -1]

This will give us a matrix with one row per leaf and one column per class label. To get the class name that a leaf will assign, we take the labels at the index of the maximal column from f.classes_:

leaf_nodes = tree.tree_.left == -1
labels = f.classes_.take(np.argmax(tree.tree_.value[leaf_nodes], axis=1))

We can use zip to pair each leaf's node index with its label:

list(zip(np.nonzero(leaf_nodes)[0], labels))

This would return something like:

[(3, 2.0),
(5, 2.0),
(7, 1.0),
(8, 2.0),
(10, 1.0),
(12, 2.0),
(17, 2.0),
(19, 2.0),
(20, 1.0),
(21, 2.0),
(22, 1.0),
(25, 1.0),
(27, 2.0),
(30, 1.0),
(31, 2.0),
(32, 2.0),
(34, 1.0),
(37, 1.0),
(38, 2.0),
(39, 2.0),
(41, 2.0),
(43, 1.0),
(44, 2.0)]

Hope this helps! I will close the issue (since it's not an issue :)). We can continue the discussion under "Discussions".

Cheers,
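
To connect this back to the original question, here is a minimal sketch of how the per-sample leaf index can be tied to the label of that leaf, and how the same leaf index could be read off a decision-path matrix. It uses small hand-written NumPy arrays as stand-ins for tree.tree_.left, tree.tree_.value, tree.apply(X) and the output of decision_path, so every value is hypothetical. It also assumes that decision_path gives a dense boolean matrix with one column per node of a single tree; that layout is an assumption, not something confirmed in this thread.

```python
import numpy as np

# Toy stand-ins (hypothetical values) for the arrays used above:
#   left   ~ tree.tree_.left    (-1 marks a leaf node)
#   value  ~ tree.tree_.value   (one row per node, one column per class)
#   leaves ~ tree.apply(X)      (leaf node index assigned to each sample)
classes = np.array([1.0, 2.0])
left = np.array([1, -1, 3, -1, -1])
value = np.array([
    [0.5, 0.5],  # node 0: internal node
    [0.9, 0.1],  # node 1: leaf
    [0.3, 0.7],  # node 2: internal node
    [0.2, 0.8],  # node 3: leaf
    [0.6, 0.4],  # node 4: leaf
])
leaves = np.array([1, 3, 4, 3])

is_leaf = left == -1

# Label per node; only the entries at leaf positions are meaningful.
node_labels = classes.take(np.argmax(value, axis=1))

# For each sample, look up the label of the leaf it was assigned to.
sample_leaf_labels = node_labels[leaves]

# Assumed layout of a decision path: dense boolean, samples x nodes,
# True for every node a sample passes through (toy values again).
path = np.array([
    [1, 1, 0, 0, 0],
    [1, 0, 1, 1, 0],
    [1, 0, 1, 0, 1],
    [1, 0, 1, 1, 0],
], dtype=bool)

# Every path ends in exactly one leaf, so masking with is_leaf leaves a
# single True per row, and argmax returns that leaf's node index.
leaf_from_path = np.argmax(path & is_leaf, axis=1)

print(list(zip(leaves.tolist(), sample_leaf_labels.tolist())))
print(leaf_from_path.tolist())
```

Because node_labels is indexed by node index, two leaves that predict the same class (nodes 1 and 4 in the toy data) stay distinguishable through leaves.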
-
Thank you very much!