updates for new tree splitting chunk
ludwigbothmann committed Sep 23, 2024
1 parent 85fa599 commit 93bb478
Showing 2 changed files with 0 additions and 68 deletions.
@@ -231,9 +231,6 @@






% \vspace*{-0.2cm}

% \begin{eqnarray*}
65 changes: 0 additions & 65 deletions slides/advriskmin/slides-advriskmin-classification-brier.tex
@@ -144,71 +144,6 @@

\end{vbframe}

\begin{vbframe}{Brier score minimization = Gini splitting}

When fitting a tree, we minimize the empirical risk within each node $\Np$ and predict the optimal constant for that node. Another approach that is common in the literature is to minimize the average node impurity $\text{Imp}(\Np)$.

\vspace*{0.2cm}

\textbf{Claim:} Splitting by Gini impurity $\text{Imp}(\Np) = \sum_{k=1}^g \pikN \left(1-\pikN \right)$ is equivalent to Brier score minimization.

\begin{footnotesize}
Here $\pikN := \frac{1}{n_{\Np}} \sum\limits_{(\xv,y) \in \Np} [y = k]$ denotes the relative frequency of class $k$ in node $\Np$.
\end{footnotesize}

\vspace*{0.2cm}

\begin{footnotesize}

\textbf{Proof: } We show that the risk of a subset of observations $\Np \subseteq \D$ satisfies


$$
\risk(\Np) = n_\Np \text{Imp}(\Np),
$$

where $\text{Imp}$ is the Gini impurity and $\risk(\Np)$ is calculated w.r.t. the (multiclass) Brier score


$$
L(y, \pix) = \sum_{k = 1}^g \left([y = k] - \pi_k(\xv)\right)^2.
$$

\framebreak

\vspace*{-0.5cm}
\begin{eqnarray*}
\risk(\Np) &=& \sum_{\xy \in \Np} \sum_{k = 1}^g \left([y = k] - \pi_k(\xv)\right)^2
= \sum_{k = 1}^g \sum_{\xy \in \Np} \left([y = k] - \frac{n_{\Np,k}}{n_{\Np }}\right)^2,
\end{eqnarray*}

by plugging in the optimal constant prediction w.r.t. the Brier score (where $n_{\Np,k}$ denotes the number of class-$k$ observations in node $\Np$):
$$\hat \pi_k(\xv)= \pikN = \frac{1}{n_{\Np}} \sum\limits_{(\xv,y) \in \Np} [y = k] = \frac{n_{\Np,k}}{n_{\Np }}. $$

We split the inner sum according to whether $y = k$ holds:

\begin{eqnarray*}
&=& \sum_{k = 1}^{g} \left(\sum_{\xy \in \Np: ~ y = k} \left(1 - \frac{n_{\Np,k}}{n_{\Np }}\right)^2 + \sum_{\xy \in \Np: ~ y \ne k} \left(0 - \frac{n_{\Np,k}}{n_{\Np }}\right)^2\right) \\
&=& \sum_{k = 1}^g \left( n_{\Np,k}\left(1 - \frac{n_{\Np,k}}{n_{\Np }}\right)^2 + (n_{\Np } - n_{\Np,k})\left(\frac{n_{\Np,k}}{n_{\Np }}\right)^2 \right),
\end{eqnarray*}

since for $n_{\Np,k}$ observations the condition $y = k$ is met, and for the remaining $(n_\Np - n_{\Np,k})$ observations it is not.


We further simplify the expression to

% \begin{footnotesize}
\begin{eqnarray*}
\risk(\Np) &=& \sum_{k = 1}^g \left( n_{\Np,k}\left(\frac{n_{\Np } - n_{\Np,k}}{n_{\Np }}\right)^2 + (n_{\Np } - n_{\Np,k})\left(\frac{n_{\Np,k}}{n_{\Np }}\right)^2 \right) \\
&=& \sum_{k = 1}^g \frac{n_{\Np,k}}{n_{\Np }} \frac{n_{\Np } - n_{\Np,k}}{n_{\Np }} \left(n_{\Np } - n_{\Np,k } + n_{\Np,k}\right) \\
&=& n_{\Np } \sum_{k = 1}^g \pikN \cdot \left(1 - \pikN \right) = n_\Np \text{Imp}(\Np).
\end{eqnarray*}
% \end{footnotesize}
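
\framebreak

\textbf{Illustrative check} (the numbers are hypothetical and chosen only to make the identity concrete): take $g = 2$ classes and a node with $n_{\Np} = 10$, $n_{\Np,1} = 7$ and $n_{\Np,2} = 3$, so that the class proportions are $0.7$ and $0.3$. Evaluating both sides of the identity separately gives

\begin{eqnarray*}
\risk(\Np) &=& \left(7 \cdot 0.3^2 + 3 \cdot 0.7^2\right) + \left(3 \cdot 0.7^2 + 7 \cdot 0.3^2\right) = 2.1 + 2.1 = 4.2, \\
n_\Np \text{Imp}(\Np) &=& 10 \cdot \left(0.7 \cdot 0.3 + 0.3 \cdot 0.7\right) = 10 \cdot 0.42 = 4.2.
\end{eqnarray*}

Both quantities agree in this small case, as the identity predicts.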

\end{footnotesize}

\end{vbframe}


\endlecture
