Support Image Cache when netbooting #10212

Open
trevex opened this issue Jan 23, 2025 · 8 comments

Comments

@trevex

trevex commented Jan 23, 2025

Feature Request

Description

Currently, when netbooting via iPXE, neither the ISO nor the raw image detects its image cache.
Booting the same ISO from a USB drive works and has access to the image cache.

iPXE uses sanboot to load and boot the ISO (or raw image), e.g.:

#!ipxe

sanboot --no-describe http://10.0.1.250/metal-amd64-secureboot.iso || goto failed
goto start

It would be great to be able to use the image cache when netbooting as well.
An alternative would be the ability to create an image cache on a running Talos node, or to populate it from a remote image cache OCI image.

@rothgar
Member

rothgar commented Jan 23, 2025

What does talosctl get discoveredvolumes show on the node that was network booted?

We have a way to store the cache as an OCI image and push it to a registry but currently there's no way to reference that as a local cache (because it's not local).

@smira
Member

smira commented Jan 23, 2025

The ISO volume in this case is never present, as it's all in-memory before boot, so regular methods won't work here.

There are still ways to achieve this, but it will be different than what we have today.

@trevex
Author

trevex commented Jan 23, 2025

For completeness, both the raw image and the ISO show the same discovered volumes:

$ talosctl get discoveredvolumes -n 10.0.1.11 -e 10.0.1.11 --insecure
NODE   NAMESPACE   TYPE               ID      VERSION   TYPE   SIZE     DISCOVERED   LABEL   PARTITIONLABEL
       runtime     DiscoveredVolume   loop0   1         disk   74 MB    squashfs
       runtime     DiscoveredVolume   sda     1         disk   8.0 TB   gpt
       runtime     DiscoveredVolume   sdb     1         disk   8.0 TB
       runtime     DiscoveredVolume   sdc     1         disk   8.0 TB
       runtime     DiscoveredVolume   sdd     1         disk   8.0 TB
       runtime     DiscoveredVolume   sde     1         disk   480 GB

Referencing an OCI image cache from a registry would also be a valid option in our case.

@smira
Member

smira commented Jan 23, 2025

If you have a registry, you can simply use it as a registry mirror and skip image cache completely.

Since you're netbooting anyway, you already have network infrastructure available to handle this.
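
As a rough illustration of the registry mirror approach, the sketch below writes a machine-config patch that redirects pulls for a few common registries to a single mirror. The mirror address (http://10.0.1.250:5000), the list of registries, and the patch file name are assumptions made for this example. The machine.registries.mirrors layout reflects the Talos machine config around the time of this thread, so check it against the documentation for your Talos version.

```shell
#!/bin/sh
# Sketch: write a Talos machine-config patch that points image pulls at a
# registry mirror. The mirror address below is a placeholder (assumption).
set -eu

MIRROR_URL="${MIRROR_URL:-http://10.0.1.250:5000}"   # hypothetical registry endpoint
PATCH_FILE="${PATCH_FILE:-registry-mirror-patch.yaml}"

# Each upstream registry listed here is redirected to the same mirror.
cat > "$PATCH_FILE" <<EOF
machine:
  registries:
    mirrors:
      docker.io:
        endpoints:
          - ${MIRROR_URL}
      ghcr.io:
        endpoints:
          - ${MIRROR_URL}
      registry.k8s.io:
        endpoints:
          - ${MIRROR_URL}
EOF

echo "wrote ${PATCH_FILE}"
# The patch can then be applied when generating configs, for example:
#   talosctl gen config <cluster-name> <endpoint> --config-patch @registry-mirror-patch.yaml
```

Note that a mirror only helps while the registry is reachable; it does not persist images on the node the way the image cache does.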

The only way to do this is to bundle the image cache into the initramfs itself, which would make it huge, and all of it would be kept in memory.

It's technically implementable, but doesn't sound like a good solution.

@rothgar
Member

rothgar commented Jan 23, 2025

You could also plug in a USB drive or virtually mount an ISO to the machine with the image cache. That isn't nearly as convenient as net booting but an option.

@trevex
Author

trevex commented Jan 23, 2025

Let me give you both some context on what we are trying to do in the lab:

We want to automatically provision a set of air-gapped servers. The netboot setup will only be available temporarily, so the cluster should be self-sufficient once it has been "seeded". We are looking to create a Talos-based K8s cluster that is resilient to power outages while remaining air-gapped, without any dependency on additional services.

Our current workaround is to mount the ISO as virtual media via Redfish, which works well enough.
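
For readers unfamiliar with this workaround, here is a minimal sketch of attaching an ISO through the standard Redfish VirtualMedia.InsertMedia action. The BMC host name, manager ID, and virtual media ID are placeholders (assumptions), since the exact resource path is vendor-specific. By default the script only prints the request instead of sending it.

```shell
#!/bin/sh
# Sketch: ask a BMC to attach an ISO as virtual media via the Redfish
# VirtualMedia.InsertMedia action. All IDs and addresses are placeholders.
set -eu

BMC_HOST="${BMC_HOST:-bmc.example.internal}"   # hypothetical BMC address
MANAGER_ID="${MANAGER_ID:-1}"                  # vendor-specific (assumption)
MEDIA_ID="${MEDIA_ID:-CD}"                     # vendor-specific (assumption)
ISO_URL="${ISO_URL:-http://10.0.1.250/metal-amd64-secureboot.iso}"

ACTION_URL="https://${BMC_HOST}/redfish/v1/Managers/${MANAGER_ID}/VirtualMedia/${MEDIA_ID}/Actions/VirtualMedia.InsertMedia"
PAYLOAD="{\"Image\": \"${ISO_URL}\", \"Inserted\": true}"

if [ "${DRY_RUN:-1}" = "1" ]; then
  # Default: only print what would be sent, so the sketch is safe to run.
  echo "POST ${ACTION_URL}"
  echo "${PAYLOAD}"
else
  # BMC_USER and BMC_PASS must be set by the caller in this mode.
  curl --fail --silent --show-error \
    --user "${BMC_USER}:${BMC_PASS}" \
    --header "Content-Type: application/json" \
    --request POST --data "${PAYLOAD}" \
    "${ACTION_URL}"
fi
```

Running it with DRY_RUN=0 sends the request; the resource path should first be confirmed by browsing the BMC's /redfish/v1/Managers collection.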

Making a registry temporarily available as part of the netboot setup is not a problem, but we need to persist the images.

Thanks for taking the time to look into this issue. I suspect, though, that mounting the ISO via BMC/OOB is the best approach for now.

@rothgar
Member

rothgar commented Jan 23, 2025

Mounting an ISO is what is supported today. Are the clusters going to have access to a container registry within the air-gapped environment? How do you plan to update the cluster (Talos and workloads)?

We have future plans to allow for an in-cluster, distributed registry cache, but it's not a high priority and not yet fully planned. The idea is to allow nodes to "seed" each other with images, mostly for caching and bandwidth reduction, but it could also be used in air-gapped environments.

@trevex
Author

trevex commented Jan 24, 2025

Currently we have a minimal set of "core" services we include in the image cache. They include a CNI, a MinIO setup with local path provisioner and Zot as a container registry (as well as relevant Talos/K8s images).

This allows us to have an air-gapped registry that persists through power failures. The Zot registry is then used for additional "platform" services such as Ceph and more.

As mentioned earlier, this is a laboratory setup, so we haven't tackled day-2 operations yet, but we are planning to investigate the ability to update the image cache.

3 participants