Investigate and implement a means of migrating from K3s to RKE2 #7627
Replies: 15 comments
-
The trickier migration path that will need to be solved first is RKE -> RKE2, which is currently being planned out.
-
I would like to migrate from K3s to RKE2 and could possibly also write the documentation for it. Is this just snapshotting?
-
No, there are several other hurdles: it isn't really supported to change the cloud provider or CNI on an existing cluster; the packaged components (ingress, coredns, etc.) would need to be removed and their replacements installed, and so on.
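To make that concrete, here is a rough, hypothetical sketch (not an endorsed migration path) of the kind of RKE2 settings that would have to be reconciled with an existing K3s setup. The keys are standard /etc/rancher/rke2/config.yaml options, but the values are placeholders that depend entirely on the cluster in question:

```bash
# Hypothetical illustration only (not an endorsed migration path):
# settings that have to be reconciled when a typical K3s setup
# (Traefik ingress, flannel CNI) meets RKE2's defaults
# (ingress-nginx, canal CNI).
cat <<'EOF' > /etc/rancher/rke2/config.yaml
# Either accept RKE2's packaged ingress or disable it and install your own:
disable:
  - rke2-ingress-nginx
# RKE2 defaults to canal; swapping the CNI on a live cluster is not
# really supported, which is one of the hurdles mentioned above.
cni: canal
EOF
```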
-
Some raw notes from upgrading two k3s clusters to RKE2

I upgraded two k3s clusters earlier this year (while subscribing to updates to this issue...). In total we had about two hours of downtime in production due to issues we had not encountered while testing. Writing this to help others take the plunge 😁

List of some of our steps, findings and tricks. Cheers /Jörgen

Troubleshooting and comparing tokens when RKE2 refused to start on the first RKE2 control-plane node:

cat /var/lib/rancher/rke2/server/token
K10<DIFFERENT-K10-SHA256-HASH-ON-RKE2>::server:<ORIGINAL-SHA256-HASH-TOKEN>

cat /var/lib/rancher/k3s/server/token
K10<ORIGINAL-K10-SHA256-HASH-FROM-K3S>::server:<ORIGINAL-SHA256-HASH-TOKEN>

Deleting rke2 passwd and restarting:

rm /var/lib/rancher/rke2/server/cred/passwd
systemctl restart rke2-server.service
systemctl status rke2-server.service
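Not part of the original notes, but a small sketch of the same check in script form: the K10<...> prefix of the token encodes a hash of the cluster CA, so before deleting the passwd file one can confirm that only that prefix, and not the secret itself, differs between the two files:

```bash
# Sketch only: compare the secret part (after the last ':') of the two
# server tokens. If the secrets match, the difference is just the
# K10<hash> CA-hash prefix, as in the output above.
k3s_token=$(cat /var/lib/rancher/k3s/server/token)
rke2_token=$(cat /var/lib/rancher/rke2/server/token)

if [ "${k3s_token##*:}" = "${rke2_token##*:}" ]; then
  echo "token secrets match; only the CA hash prefix differs"
else
  echo "token secrets differ; do not blindly delete the passwd file"
fi
```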
The first RKE2 control-plane node starts:

/var/lib/rancher/rke2/bin/kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml get nodes
NAME STATUS ROLES AGE VERSION
g8 NotReady <none> 536d v1.21.13+k3s1
vm-ceph-1 NotReady <none> 123d v1.21.13+k3s1
vm-ceph-2 NotReady <none> 123d v1.21.13+k3s1
vm-ceph-3 NotReady <none> 123d v1.21.13+k3s1
vm-ceph-4 NotReady <none> 123d v1.21.13+k3s1
vm-cpl-1 NotReady control-plane,etcd,master 31d v1.21.13+rke2r2 <--- YES
vm-cpl-2 NotReady control-plane,etcd,master 31d v1.21.13+k3s1
vm-cpl-3 NotReady control-plane,etcd,master 32d v1.21.13+k3s1
vm-neo-1 NotReady <none> 129d v1.21.13+k3s1
vm-neo-2 NotReady <none> 129d v1.21.13+k3s1
vm-neo-3 NotReady <none> 129d v1.21.13+k3s1
vm-worker-1 NotReady <none> 31d v1.21.13+k3s1
vm-worker-2 NotReady <none> 31d v1.21.13+k3s1

And now all of a sudden the token files are identical after rm /var/lib/rancher/rke2/server/cred/passwd and restart:

cat /var/lib/rancher/rke2/server/cred/passwd
<ORIGINAL-SHA256-HASH-TOKEN>,node,node,rke2:agent
<ORIGINAL-SHA256-HASH-TOKEN>,server,server,rke2:server

cat /var/lib/rancher/rke2/server/token
K10<ORIGINAL-K10-SHA256-HASH-FROM-K3S>::server:<ORIGINAL-SHA256-HASH-TOKEN>

cat /var/lib/rancher/k3s/server/token
K10<ORIGINAL-K10-SHA256-HASH-FROM-K3S>::server:<ORIGINAL-SHA256-HASH-TOKEN>

Some errors encountered and workarounds

CoreDNS
Deleting old k3s coredns:

kubectl scale -n kube-system deployment coredns --replicas 0
kubectl delete svc -n kube-system kube-dns
kubectl delete pod -n kube-system coredns-574bcc6c46-kkq9t --force

Rancher 2.6 webhook failed everything (since it could not start). Deleted Rancher mutating webhook (reinstalled later):

kubectl delete -n cattle-system mutatingwebhookconfigurations.admissionregistration.k8s.io rancher.cattle.io

Logs showing conflict between old and new metrics server. Deleting apiservice for k3s metrics:

$ kubectl get apiservices.apiregistration.k8s.io
NAME SERVICE AVAILABLE AGE
[...]
v1beta1.metrics.k8s.io kube-system/metrics-server False (ServiceNotFound) 603d
[...]
$ kubectl delete apiservices.apiregistration.k8s.io v1beta1.metrics.k8s.io
apiservice.apiregistration.k8s.io "v1beta1.metrics.k8s.io" deleted
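A follow-up check that is not in the original notes, but may help confirm the cleanup worked; the exact deployment names depend on the RKE2 version and its bundled Helm charts:

```bash
# Follow-up check (not in the original notes): after removing the stale
# k3s objects, confirm that the RKE2-packaged replacements are healthy.
kubectl -n kube-system get deployments | grep -E 'coredns|metrics-server'
kubectl get apiservices.apiregistration.k8s.io v1beta1.metrics.k8s.io
# AVAILABLE should eventually report True once the new metrics-server
# is serving, replacing the ServiceNotFound error shown above.
```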
-
@wthrbtn - thanks so much for sharing this process and experience!! Could you give some background on the business case?
-
Early 2020, I think, the choice was between RKE (v1, with Docker) and k3s. We could have stayed with k3s, but there were voices demanding a solution with commercial support if/when needed. Air-gapped RKE2 is still not the same joy to patch as k3s, for example. We have now learned to trick RKE2 into thinking it has Internet access, instead of feeding it gzip or zstd archives, to avoid stupid pod evictions (node NotReady for 0-2 seconds vs 1-2 minutes).

About 10 of our nodes host a Ceph cluster (using Rook) and provide that storage as both a StorageClass and S3 to other resources in the cluster. We had to get everything up and running really fast, and then we could rebuild nodes one at a time without impacting anything.
-
@wthrbtn - thanks for sharing these details! Helps to understand! ;-) Regarding air-gapped RKE2 - what are you missing when comparing K3s to RKE2?
-
I suspect that they're referring to the delay in importing the airgap images for core pods. K3s has everything in the binary and unpacks the few external components quickly. RKE2 takes longer because it has to import all the images from the tarball before things can come up all the way. Loading things into a local registry is definitely better than using airgap tarballs on every node. I am curious why you switched for support though; as far as I know you can get K3s support from us (SUSE) under the same terms as RKE2.
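To make the registry route concrete, a minimal sketch under assumed names (the registry hostname and port are placeholders, not anyone's actual setup):

```bash
# Placeholder sketch: mirror docker.io (where the rancher/* core images
# live) to an internal registry so nodes pull images instead of
# importing the airgap tarball on every start.
cat <<'EOF' > /etc/rancher/rke2/registries.yaml
mirrors:
  docker.io:
    endpoint:
      - "https://registry.example.internal:5000"
EOF
systemctl restart rke2-server.service   # rke2-agent.service on agent nodes
```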
-
Registry is a good way to go - but maybe also do it like we do in Harvester, where we pre-load the new images before restarting (a sketch follows below). The other request is still "do not re-import already imported images during every RKE2 restart" ;-))
Me too..
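For the pre-load idea mentioned above, a rough sketch of how it could be done manually on a plain RKE2 node; the paths are RKE2's defaults and this is an assumption about the approach, not how Harvester actually implements it:

```bash
# Pre-import the new release's airgap images into RKE2's embedded
# containerd *before* restarting the service, so the node spends less
# time NotReady. Assumes the default containerd socket path and an
# uncompressed image tarball.
IMAGES_TAR=/tmp/rke2-images.linux-amd64.tar   # placeholder path

/var/lib/rancher/rke2/bin/ctr \
  --address /run/k3s/containerd/containerd.sock \
  --namespace k8s.io \
  images import "${IMAGES_TAR}"

# Only then restart/upgrade RKE2:
systemctl restart rke2-server.service
```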
-
Correct. We DO use several registries, but everything still has to be able to recover from a COMPLETE blackout without registries. (Imagine running clusters on a submarine or similar; not that we do, but you can only bring so many backup systems...)

Politics. Someone who does not know Kubernetes heard we ran k3s and googled it. All they remembered was "lightweight, IoT & Edge". So to end the FUD about running a "toy" Kubernetes, I took the plunge even though k3s performed perfectly!!! Happy Holidays to you all!
-
Actually ONE thing I really miss is
-
I've been following the advice from #881 (comment) and it's been very helpful. However, I'm now in a situation where every time I restart my cluster I must remove Is there a way to prevent that?
-
@sdemura I had the same issue after migrating a cluster from K3s to RKE2. I was able to fix it by running
-
Since @Martin-Weiss asked for some background on the business case, I would like to contribute ours. Actually, our background is mostly the same as the one described by @wthrbtn: our migration from K3s to RKE2 is motivated by a mix of political and technical reasons.

Let's start with the political reasons. While we plan to continue using K3s for the majority of our clusters (especially those running on-premise on our customers' systems), parts of our business are actually subject to the German KRITIS regulations. While I don't believe it's strictly necessary to migrate to RKE2 to conform to KRITIS, it surely is easier to reason about, as RKE2 is advertised as suitable for the security requirements of governments. It's simply easier for us to use a Kubernetes distribution that advertises itself as being as secure as possible out of the box and that guarantees CIS Benchmark conformance just by configuring a flag.

But there are quite a few technical reasons for us to prefer RKE2 as well. I think it was around 2020/2021 when we migrated from Docker Swarm to K3s, which was just the perfect alternative for us at the time. We had very little experience with Kubernetes, and RKE2 wasn't even a thing back then. But as our clusters and requirements grew, we saw ourselves replacing more and more of K3s' bundled components. Just off the top of my head:
I understand that K3s can do everything that RKE2 can do, but the default setup of RKE2 is a way better fit for our requirements.
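As an illustration of the "single flag" mentioned above, a minimal sketch; hedged, since the accepted values vary by RKE2 release (e.g. cis-1.6, cis-1.23, or simply cis on recent versions) and CIS mode has additional prerequisites:

```bash
# Turn on RKE2's CIS hardening profile via the standard config file and
# restart. (CIS mode also expects e.g. an etcd user and certain kernel
# parameters to be present; see the RKE2 hardening guide.)
cat <<'EOF' >> /etc/rancher/rke2/config.yaml
profile: "cis"
EOF
systemctl restart rke2-server.service
```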
-
Request from Rancher Federal: provide a means to migrate from K3s to RKE2. This can likely be done without much fuss today with snapshotting.
We will need to test and identify any issues with this, and it will be necessary to document the method to migrate from K3s to RKE2.
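A very rough sketch of what the snapshot route could look like. This is untested speculation based on the existing k3s etcd-snapshot and rke2 cluster-reset commands; validating whether an RKE2 server can actually restore a K3s snapshot is exactly the work this issue asks for:

```bash
# Untested sketch of the snapshot idea; assumes K3s runs with embedded etcd.
k3s etcd-snapshot save --name pre-rke2-migration

# Install RKE2 on the node, then attempt to restore that snapshot.
# The file name below is a placeholder; k3s appends the node name and a
# timestamp to the snapshot name.
rke2 server \
  --cluster-reset \
  --cluster-reset-restore-path=/var/lib/rancher/k3s/server/db/snapshots/pre-rke2-migration-<node>-<timestamp>
```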