
adding workers fails on Wireguard configuration #329

Open
abarthol opened this issue Jun 12, 2020 · 17 comments

@abarthol

abarthol commented Jun 12, 2020

When adding a worker the install process stops at Wireguard configuration.

```
command: systemctl enable wg-quick@wg0 && systemctl restart wg-quick@wg0 && systemctl enable overlay-route.service && systemctl restart overlay-route.service

stdout: Created symlink /etc/systemd/system/multi-user.target.wants/wg-quick@wg0.service → /lib/systemd/system/wg-quick@.service.
Job for wg-quick@wg0.service failed because the control process exited with error code.
See "systemctl status wg-quick@wg0.service" and "journalctl -xe" for details.
```

This seems to be related to: adrianmihalko/raspberrypiwireguard#11
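The error output itself points at the next debugging step. A minimal sketch for pulling the actual failure reason out of the affected worker (assuming SSH/root access to a systemd host):

```shell
# Show why the WireGuard interface unit failed to start
systemctl status wg-quick@wg0.service --no-pager
# Last 50 log entries for the unit, to see the underlying error
journalctl -u wg-quick@wg0.service --no-pager -n 50
```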

@drallgood
Contributor

Yep, ran into the same problem.

Ubuntu borked the wireguard module.

The solution is apparently to install HWE (i.e. a newer kernel):
https://wiki.ubuntu.com/Kernel/LTSEnablementStack

It works, but I don't know how to do that for a node.
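For reference, installing the HWE stack manually on a running 18.04 node would look roughly like this (a sketch, assuming root on the node; the node must be rebooted to pick up the new kernel):

```shell
# Install the 18.04 hardware enablement (HWE) kernel stack
apt-get update
apt-get install -y --install-recommends linux-generic-hwe-18.04
# The new kernel only takes effect after a reboot
reboot
```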

@drallgood
Contributor

So, the easiest fix: patch hetzner-kube to use Ubuntu 20.04 LTS.
I actually upgraded a few of my existing nodes that broke as well (which is actually why I wanted to recreate one in the first place).

@abarthol
Author

abarthol commented Jun 15, 2020

I've tried to use Ubuntu 20.04 LTS, but I get another error:

```
command: for i in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4; do modprobe $i; done && kubeadm reset -f && kubeadm join 10.0.1.1:6443 --token q5nor9.i7r02nwpwgl1cimm --discovery-token-ca-cert-hash sha256:4e2d467803467a8aab9c484fa24b0c45db0c865a68c528fdd985b56879afa6c9

stdout: modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/5.4.0-28-generic
```
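This failure matches a known kernel module rename: starting with kernel 4.19, `nf_conntrack_ipv4` was merged into `nf_conntrack`, so the hard-coded module list fails on 20.04's 5.4 kernel. A hedged workaround (a sketch, not what hetzner-kube actually runs) would be to fall back to the new module name:

```shell
# Load the IPVS modules as before
for i in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh; do modprobe $i; done
# nf_conntrack_ipv4 was folded into nf_conntrack in kernel 4.19+;
# try the old name first, fall back to the merged module
modprobe nf_conntrack_ipv4 2>/dev/null || modprobe nf_conntrack
```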

@drallgood
Contributor

Strange... I just installed multiple new nodes and it worked just fine.
Anyhow, you need the corresponding kernel module.

@tmemenga

@abarthol have a look at https://github.com/tmemenga/hetzner-kube/tree/ubuntu-20-04, I was able to get past that error. But I still need to check whether the cluster is really operational.

@xetys
Owner

xetys commented Jun 15, 2020

If this works, I'd love to see a PR if you don't mind

@abarthol
Author

Thanks @tmemenga. Your branch works for creating a new cluster. Please make a pull request to add this to the main project. However, I have not tested adding a new node to an existing (Ubuntu 16.04 LTS or 18.04 LTS) cluster.

@abarthol
Author

abarthol commented Jun 15, 2020

After a successful cluster setup with Ubuntu 20.04 LTS I noticed a problem with canal: the pods did not start up correctly. The error message was:

```
[FATAL][581] int_dataplane.go 1032: Kernel's RPF check is set to 'loose' ...
```

I had to set this to make it work:

```
kubectl -n kube-system set env daemonset/canal FELIX_IGNORELOOSERPF=true
```

@drallgood
Contributor

> Thanks @tmemenga. Your branch works for creating a new cluster. Please make a pull request to add this to the main project. Although I have not tested to add a new node to an existing (Ubuntu 16.04 LTS or 18.04 LTS) cluster.

I'm running a mixed cluster right now, without any issues (control plane is 18.04 and 2 out of 6 nodes are 20.04)

@eugenpro

Is it possible to manually add a node to a cluster that has been created with the hetzner-kube utility?

@tmemenga

I also had to issue

```
kubectl -n kube-system set env daemonset/canal FELIX_IGNORELOOSERPF=true
```

to stop canal from constantly restarting.

But it seems this is something you should not do on systems other than DEV?

https://alexbrand.dev/post/creating-a-kind-cluster-with-calico-networking/

> Relax Calico's RPF Check Configuration
>
> By default, Calico pods fail if the Kernel's Reverse Path Filtering (RPF) check is not enforced. This is a security measure to prevent endpoints from spoofing their IP address.
>
> The RPF check is not enforced in Kind nodes. Thus, we need to disable the Calico check by setting an environment variable in the calico-node DaemonSet:
>
> `kubectl -n kube-system set env daemonset/calico-node FELIX_IGNORELOOSERPF=true`
>
> Note: I am disabling this check because this is a dev environment. You probably do not want to do this otherwise.

@max-software-net

max-software-net commented Jun 22, 2020

> After successful cluster setup with Ubuntu 20.04 LTS I recognized a problem with canal. The pods did not startup correctly. The error message was:
>
> `[FATAL][581] int_dataplane.go 1032: Kernel's RPF check is set to 'loose' ...`

I got it working by changing /etc/sysctl.d/10-network-security.conf as follows:

```
net.ipv4.conf.default.rp_filter=1
net.ipv4.conf.all.rp_filter=1
```
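To apply that change without rebooting, the file can be rewritten and the sysctl configuration reloaded in one go. A sketch, assuming root on the node (note this overwrites the existing file):

```shell
# Enforce strict reverse-path filtering so Calico's RPF check passes
cat > /etc/sysctl.d/10-network-security.conf <<'EOF'
net.ipv4.conf.default.rp_filter=1
net.ipv4.conf.all.rp_filter=1
EOF
# Reload all sysctl drop-in files immediately
sysctl --system
```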

@krzko

krzko commented Jun 30, 2020

Looks like wireguard is borked in the 18.04 distro. Here's a cloud-init script that should bootstrap your cluster successfully.

my-k8s-cluster-cloud-init:

```yaml
#cloud-config

package_update: true

runcmd:
  - add-apt-repository ppa:wireguard/wireguard
  - apt-get update
  - apt-get install -y --install-recommends linux-generic-hwe-18.04
  - apt-get install -y wireguard wireguard-dkms wireguard-tools
  - modprobe wireguard
  - lsmod | grep wireguard
```

Can be invoked via:

```
hetzner-kube cluster create --name my-k8s-cluster --ssh-key my-ssh-key --cloud-init ./my-k8s-cluster-cloud-init
```

@ulfw

ulfw commented Jul 6, 2020

No, sorry, that cloud-init is not a working fix.

@Antauri

Antauri commented Jul 17, 2020

I got it working with:

```yaml
#cloud-config

users:
  - name: your-sudo-user
    groups: users, admin
    sudo: ALL=(ALL) NOPASSWD:ALL
    shell: /bin/bash
    ssh_authorized_keys:
      - YOUR_KEY_HERE

package_update: true
package_upgrade: true

packages:
  - your
  - list
  - of
  - packages

runcmd:
  - add-apt-repository ppa:wireguard/wireguard
  - apt-get update
  - apt-get install -y --install-recommends linux-generic-hwe-18.04
  - apt-get install -y wireguard wireguard-dkms wireguard-tools
  - modprobe wireguard
  - lsmod | grep wireguard
  - reboot
```

And the command:

```
hetzner-kube cluster create --name kubernetes -k YOUR-SSH-KEY --master-server-type cx21 -m 3 --worker-server-type cx21 --node-cidr a.b.c.d/16 -w 5 --ha-enabled --cloud-init /path/to/cloud-init.yml
```

wethinkagile added a commit to wethinkagile/hetzner-kube that referenced this issue Oct 18, 2020
Incorporated the cloud-config from xetys#329 (comment)
@mashkovd

```
hetzner-master-01    : installing transport tools         11.5% [--------------]
hetzner-worker-01    : prepare packages                   23.5% [=>------------]
run failed
command:add-apt-repository ppa:wireguard/wireguard -y
stdout:Cannot add PPA: 'ppa:~wireguard/ubuntu/wireguard'.
The team named '~wireguard' has no PPA named 'ubuntu/wireguard'
Please choose from the following available PPAs:
```

This command didn't work:

```
hetzner-kube cluster create --name hetzner --ssh-key mctl --cloud-init ./my-k8s-cluster-cloud-init
```

(using the my-k8s-cluster-cloud-init above)
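The PPA failure is expected: the wireguard PPA was retired after WireGuard landed in the official Ubuntu repositories (backported to 18.04 via bionic-updates, and shipped in 20.04). An untested sketch of a minimal cloud-init that drops the PPA entirely:

```yaml
#cloud-config

package_update: true

packages:
  - wireguard   # from the official repos; the PPA is no longer needed

runcmd:
  - modprobe wireguard
```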

@eugene-chernyshenko

I fixed this issue in #339
