L110_SupportVectorMachines_Template.Rmd
---
title: "Support Vector Machines"
author: "Bert Gollnick"
output:
  html_document:
    toc: true
    toc_float: true
    number_sections: true
    code_folding: hide
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)
```
```{r}
library(dplyr)
library(ggplot2)
library(keras)
library(e1071)
source("./functions/train_val_test.R")
```
# Data Preparation
## Data Creation
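We simulate the data ourselves: 2000 points drawn uniformly from the square [-10, 10] x [-10, 10], labelled by their squared distance from the origin.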
```{r}
set.seed(123)
df <- dplyr::tibble(x = runif(n = 2000, min = -10, max = 10),
                    y = runif(n = 2000, min = -10, max = 10)) %>%
  dplyr::mutate(z = x^2 + y^2) %>%
  dplyr::mutate(class = ifelse(z < 50, "A", "B")) %>%
  dplyr::mutate(class = as.factor(class))
```
The data form a circle inside a square grid. Points with `x^2 + y^2 < 50`, i.e. inside a circle of radius sqrt(50) (about 7.07), belong to class A; points outside it belong to class B. The task is to predict the classes correctly.
## Train / Validation Split
The data is split into 80 % training and 20 % validation data; no test set is held out (`test_ratio = 0`).
```{r}
c(train, val, test) %<-% train_val_test_split(df = df, train_ratio = 0.8, val_ratio = 0.2, test_ratio = 0)
```
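The helper `train_val_test_split()` is sourced from `./functions/train_val_test.R`, which is not shown here. As a rough idea of what such a split does, the following sketch shuffles the row indices and cuts them by ratio. The function name `split_sketch` and its body are hypothetical; the real helper may work differently.

```r
# Hypothetical stand-in for train_val_test_split(); the real helper lives in
# ./functions/train_val_test.R and may be implemented differently.
split_sketch <- function(df, train_ratio = 0.8, val_ratio = 0.2, test_ratio = 0) {
  # The three ratios are expected to sum to 1.
  stopifnot(abs(train_ratio + val_ratio + test_ratio - 1) < 1e-8)
  n <- nrow(df)
  idx <- sample(n)                       # random permutation of the row indices
  n_train <- floor(train_ratio * n)
  n_val <- floor(val_ratio * n)
  pos <- seq_len(n)                      # positions within the permutation
  list(
    train = df[idx[pos <= n_train], , drop = FALSE],
    val   = df[idx[pos > n_train & pos <= n_train + n_val], , drop = FALSE],
    test  = df[idx[pos > n_train + n_val], , drop = FALSE]
  )
}

set.seed(123)
toy <- data.frame(x = runif(100), y = runif(100))
parts <- split_sketch(toy)
sapply(parts, nrow)                      # number of rows per subset
```

Returning a list of three data frames is what makes the multi-assignment `c(train, val, test) %<-% ...` work.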
## Visualisation
The training data is visualised. The two classes are not linearly separable in the two input dimensions, but an SVM with a non-linear kernel implicitly maps the data into a higher-dimensional representation in which they are, so it should predict the classes nearly perfectly.
```{r}
g <- ggplot(train, aes(x, y, col = class))
g <- g + geom_point()
g <- g + theme_bw()
g
```
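To illustrate the idea of a higher-dimensional representation, the sketch below applies an explicit feature map using the same `z = x^2 + y^2` column that was used to create the labels. This is only an illustration: a kernel SVM performs its mapping implicitly, and a radial kernel corresponds to a different feature space than this hand-made one.

```r
set.seed(123)
x <- runif(2000, min = -10, max = 10)
y <- runif(2000, min = -10, max = 10)
class <- factor(ifelse(x^2 + y^2 < 50, "A", "B"))

# Explicit feature map: lift each point (x, y) to (x, y, z) with z = x^2 + y^2.
z <- x^2 + y^2

# In the lifted space the flat plane z = 50 separates the two classes:
# every class-A point lies below it, every class-B point on or above it.
max(z[class == "A"])
min(z[class == "B"])
```

A circular boundary in two dimensions thus becomes a linear boundary once the third coordinate is added.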
# Model
We create a support vector machine (SVM) model.
```{r}
# code here
```
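One possible way to fill in the chunk above is sketched here with `e1071::svm()`. The radial kernel and the formula `class ~ x + y` are assumptions, not necessarily the intended solution. The helper column `z` is left out on purpose, because the class is derived directly from it. The data setup is repeated in base R so that the snippet runs on its own, and a simple index-based split stands in for `train_val_test_split()`.

```r
library(e1071)

set.seed(123)
df <- data.frame(x = runif(2000, min = -10, max = 10),
                 y = runif(2000, min = -10, max = 10))
df$class <- factor(ifelse(df$x^2 + df$y^2 < 50, "A", "B"))

# Simple 80/20 split (assumed stand-in for train_val_test_split()).
train_idx <- sample(nrow(df), size = 0.8 * nrow(df))
train <- df[train_idx, ]
val <- df[-train_idx, ]

# Fit a classification SVM; kernel choice is an assumption for this sketch.
model_svm <- svm(class ~ x + y, data = train, kernel = "radial")
print(model_svm)
```

Printing the model shows the kernel, the cost, and the number of support vectors.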
The fitted model can also be plotted with base R graphics.
```{r}
# code here
```
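A sketch of such a plot uses the `plot()` method that `e1071` provides for `svm` objects. The formula `y ~ x` selects the two dimensions to draw; the model settings are the same assumptions as in any minimal radial-kernel fit and are repeated so the snippet runs on its own.

```r
library(e1071)

set.seed(123)
df <- data.frame(x = runif(2000, min = -10, max = 10),
                 y = runif(2000, min = -10, max = 10))
df$class <- factor(ifelse(df$x^2 + df$y^2 < 50, "A", "B"))
train <- df[sample(nrow(df), size = 0.8 * nrow(df)), ]

model_svm <- svm(class ~ x + y, data = train, kernel = "radial")

# Draw the predicted class regions with the training points on top;
# support vectors and ordinary points get different plotting symbols.
plot(model_svm, data = train, formula = y ~ x)
```

The filled regions show where the model predicts each class, so the learned boundary should appear as a roughly circular area around the origin.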
# Predictions
The classes of the validation data are predicted.
```{r}
# code here
```
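Predictions come from the generic `predict()` function applied to the fitted model. The sketch below is self-contained and again assumes a radial-kernel model on `x` and `y`; the column name `class_pred` follows the naming used later in this document.

```r
library(e1071)

set.seed(123)
df <- data.frame(x = runif(2000, min = -10, max = 10),
                 y = runif(2000, min = -10, max = 10))
df$class <- factor(ifelse(df$x^2 + df$y^2 < 50, "A", "B"))
train_idx <- sample(nrow(df), size = 0.8 * nrow(df))
train <- df[train_idx, ]
val <- df[-train_idx, ]

model_svm <- svm(class ~ x + y, data = train, kernel = "radial")

# Predict a class label for every validation row and store it next to the truth.
val$class_pred <- predict(object = model_svm, newdata = val)
head(val)

# Share of validation points whose predicted class matches the actual class.
mean(val$class_pred == val$class)
```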
# Model Performance
## Confusion Matrix
A confusion matrix is created for both the training and the validation data.
Training:
```{r}
# code here
```
Validation:
```{r}
# code here
```
```{r}
# code here
```
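The confusion matrices can be sketched with base R's `table()`, which cross-tabulates predicted against actual classes. The model setup is an assumption repeated from a minimal radial-kernel fit, and the names `conf_mat_train` and `conf_mat_val` are illustrative.

```r
library(e1071)

set.seed(123)
df <- data.frame(x = runif(2000, min = -10, max = 10),
                 y = runif(2000, min = -10, max = 10))
df$class <- factor(ifelse(df$x^2 + df$y^2 < 50, "A", "B"))
train_idx <- sample(nrow(df), size = 0.8 * nrow(df))
train <- df[train_idx, ]
val <- df[-train_idx, ]

model_svm <- svm(class ~ x + y, data = train, kernel = "radial")

# Rows: predicted class, columns: actual class.
conf_mat_train <- table(predicted = predict(model_svm, newdata = train),
                        actual = train$class)
conf_mat_val <- table(predicted = predict(model_svm, newdata = val),
                      actual = val$class)
conf_mat_train
conf_mat_val

# Accuracy is the share of counts on the diagonal.
acc_train <- sum(diag(conf_mat_train)) / sum(conf_mat_train)
acc_val <- sum(diag(conf_mat_val)) / sum(conf_mat_val)
c(train = acc_train, validation = acc_val)
```

Comparing the two accuracies gives a first hint of overfitting: a training accuracy far above the validation accuracy would suggest the model is too flexible.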
## Hyperparameter Tuning
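You can modify the following hyperparameters of `svm()`:

- `kernel`: the kernel function, e.g. linear, polynomial, radial, or sigmoid
- `cost`: the cost of constraint violations; larger values penalise misclassified training points more strongly
- `gamma`: the kernel coefficient, used by all kernels except the linear one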
```{r}
# code here
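# create predictions for the validation data
# (model_svm is the model fitted in the Model section)
val$class_pred <- predict(object = model_svm,
                          newdata = val)
# confusion matrix: predicted vs. actual classes
conf_mat_val <- table(predicted = val$class_pred, actual = val$class)
caret::confusionMatrix(conf_mat_val)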
```
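Instead of changing the hyperparameters by hand, a small grid search can be sketched with `e1071::tune.svm()`, which by default evaluates each combination with 10-fold cross-validation on the data it is given. The candidate values for `cost` and `gamma` below are arbitrary assumptions chosen for illustration.

```r
library(e1071)

set.seed(123)
df <- data.frame(x = runif(2000, min = -10, max = 10),
                 y = runif(2000, min = -10, max = 10))
df$class <- factor(ifelse(df$x^2 + df$y^2 < 50, "A", "B"))
train_idx <- sample(nrow(df), size = 0.8 * nrow(df))
train <- df[train_idx, ]
val <- df[-train_idx, ]

# Grid search over cost and gamma; the grid values are illustrative only.
tuned <- tune.svm(class ~ x + y, data = train, kernel = "radial",
                  cost = c(0.1, 1, 10), gamma = c(0.1, 0.5, 1))

tuned$best.parameters            # combination with the lowest CV error
best_model <- tuned$best.model   # model refitted with that combination

# Check the selected model on the held-out validation data.
mean(predict(best_model, newdata = val) == val$class)
```

Tuning on the training data and checking on the validation data keeps the validation set out of the selection step.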