4. K-Means Clustering of Teams Based on Performance
For each league, k-means clustering was used to group all teams into four clusters based on points per game (PPG) and win proportion, each averaged across all five seasons for each season-end position.
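The clustering step described above can be illustrated with a minimal sketch of Lloyd's algorithm in plain Python. This is not the code used for the analysis: the function name `kmeans`, the deterministic initialisation, and the eight (PPG, win proportion) points are all assumptions made for illustration, and the points are made-up values rather than league data.

```python
from math import dist


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm on a list of tuples; returns (labels, centroids)."""
    # Deterministic initialisation (an assumption of this sketch, k >= 2):
    # sort the points and take k evenly spaced ones as starting centroids.
    ordered = sorted(points)
    step = (len(ordered) - 1) / (k - 1)
    centroids = [ordered[round(i * step)] for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids


# Hypothetical (PPG, win proportion) pairs, one per season-end position.
points = [
    (2.30, 0.72), (2.20, 0.68),
    (1.65, 0.47), (1.55, 0.44),
    (1.25, 0.33), (1.15, 0.30),
    (0.85, 0.21), (0.75, 0.18),
]
labels, centroids = kmeans(points, k=4)
for point, label in zip(points, labels):
    print(point, "-> cluster", label)
```

Because PPG spans a wider numeric range than win proportion, unscaled Euclidean distance is driven mostly by PPG; standardising both variables before clustering is a common choice, though the source does not say whether that was done here.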
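Bundesliga

| Cluster ID | Size | Mean PPG | Mean win proportion |
|---|---|---|---|
| 1 | 6 | 1.60 | 0.45 |
| 2 | 7 | 1.18 | 0.30 |
| 3 | 3 | 0.83 | 0.20 |
| 4 | 2 | 2.26 | 0.69 |

La Liga

| Cluster ID | Size | Mean PPG | Mean win proportion |
|---|---|---|---|
| 1 | 4 | 1.62 | 0.46 |
| 2 | 10 | 1.21 | 0.31 |
| 3 | 3 | 0.78 | 0.18 |
| 4 | 3 | 2.19 | 0.66 |

Major League Soccer (MLS)

| Cluster ID | Size | Mean PPG | Mean win proportion |
|---|---|---|---|
| 1 | 5 | 0.88 | 0.22 |
| 2 | 3 | 1.81 | 0.53 |
| 3 | 10 | 1.49 | 0.41 |
| 4 | 6 | 1.19 | 0.32 |

Premier League (EPL)

| Cluster ID | Size | Mean PPG | Mean win proportion |
|---|---|---|---|
| 1 | 5 | 1.76 | 0.50 |
| 2 | 9 | 1.21 | 0.32 |
| 3 | 4 | 0.82 | 0.20 |
| 4 | 2 | 2.33 | 0.73 |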
In each of the European leagues, the "best" cluster (i.e., the one with the highest mean PPG and win proportion) is far separated from the other clusters: its mean PPG exceeds that of the second-best cluster by 0.57 to 0.66. This cluster has only two members in the Bundesliga and the EPL and three in La Liga, which points to a large gap between the top two or three teams and the rest of each league. In the EPL, there is also a large gap of about 0.55 PPG between the "second-best" cluster (denoted by black circles) and the "third-best" one (denoted by red triangles), further demonstrating a lack of competitive balance in that league. In the Bundesliga and La Liga, the gaps among the "non-best" clusters are smaller (roughly 0.35 to 0.43 PPG) and similar to one another. By comparison, the MLS clusters are evenly separated, about 0.3 PPG apart, pointing toward a high level of competitive balance.
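The separation between clusters can be checked directly from the mean PPG values in the tables above. The short sketch below ranks each league's clusters from best to worst and computes the gap between neighbours; the names `mean_ppg` and `ppg_gaps` are illustrative choices, not part of the original analysis.

```python
# Mean PPG per cluster (cluster IDs 1 to 4), copied from the tables above.
mean_ppg = {
    "Bundesliga": [1.60, 1.18, 0.83, 2.26],
    "La Liga": [1.62, 1.21, 0.78, 2.19],
    "MLS": [0.88, 1.81, 1.49, 1.19],
    "EPL": [1.76, 1.21, 0.82, 2.33],
}


def ppg_gaps(values):
    """Gaps in mean PPG between adjacent clusters, ranked best to worst."""
    ranked = sorted(values, reverse=True)
    return [round(a - b, 2) for a, b in zip(ranked, ranked[1:])]


gaps = {league: ppg_gaps(values) for league, values in mean_ppg.items()}
for league, league_gaps in gaps.items():
    print(f"{league:<11} {league_gaps}")
```

The first entry for each league is the gap between the best and second-best clusters. The three MLS gaps are nearly equal, whereas in each European league the first gap is the largest.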