Commit

Simplify proposal to focus only on network device handling
IP and network configuration behaviors are left out of the proposal and
delegated to other processes, to be managed by higher-level abstractions.
The runtimes will only handle migrating network interfaces between
namespaces and the optional renaming.

Signed-off-by: Antonio Ojea <aojea@google.com>
aojea committed Jan 31, 2025
1 parent 1fac043 commit ebe9192
Showing 5 changed files with 7 additions and 107 deletions.
63 changes: 4 additions & 59 deletions config-linux.md
@@ -193,7 +193,7 @@ In addition to any devices configured with this setting, the runtime MUST also s

Linux network devices are entities that send and receive data packets.
Unlike block devices, they are not represented as files in the /dev directory; network devices are represented with the [`net_device`][net_device] data structure in the Linux kernel.
Network devices have their own network namespace and a set of operations distinct from regular file operations. Examples of network devices include Ethernet cards, loopback devices, and virtual devices like bridges, VLANs, and MACVLANs.
Network devices can belong to only one network namespace and use a set of operations distinct from regular file operations. Examples of network devices include Ethernet cards, loopback devices, and virtual devices like bridges, VLANs, and MACVLANs.

This schema focuses solely on moving existing network devices identified by name from the host network namespace into the container network namespace. It does not cover the complexities of network device creation or network configuration, such as IP address assignment, routing, and DNS setup.

@@ -203,23 +203,13 @@ The runtime MUST check that is possible to move the network interface to the con

The runtime MUST set the network device state to "up" after moving it to the network namespace to allow the container to send and receive network traffic through that device.

Notice that after deleting a network namespace, all its migratable network devices are moved to the default network namespace, virtual devices (veth, macvlan, ...) are destroyed.
The runtime MUST move back the network device before the network namespace is deleted.
The runtime MUST set the network device state to "down" before moving it back to ensure that the interface is no longer active and won't interfere with other network operations or cause IP address conflicts.
For proper container termination, the runtime must first set the device's state to "down" and then move it out of the namespace before the namespace is deleted. This ensures the device is inactive and avoids conflicts. If the container terminates abnormally and the runtime does not participate in the termination process, these steps might be skipped and the kernel handles the cleanup, as described in [network_namespaces(7)][net_namespaces.7]: "When a network namespace is freed (i.e., when the last process in the namespace terminates), its physical network devices are moved back to the initial network namespace". Note that after a network namespace is deleted, all its migratable network devices are moved to the default network namespace, but virtual devices (veth, macvlan, ...) are destroyed.

The name of the network device is the entry key.
Entry values are objects with the following properties:

* **`name`** *(string, OPTIONAL)* - the name of the network device inside the container namespace. If not specified, the host name is used. Network device names are unique per network namespace; if a network device with the same name already exists, the rename operation will fail. The runtime MAY check that the name is unique before the rename operation.
The runtime MUST revert back the original name to guarantee the idempotence of operations, so a container that moves an interface and renames it can be created and destroyed multiple times with the same result.
* **`addresses`** *(array of strings, OPTIONAL)* - the IP addresses, IPv4 and or IPv6, of the device within the container in CIDR format (IP address / Prefix). All IPv4 addresses SHOULD be expressed in their decimal format, consisting of four decimal numbers separated by periods. Each number ranges from 0 to 255 and represents an octet of the address. IPv6 addresses SHOULD be represented in their canonical form as defined in RFC 5952.
The runtime MAY limit the number of addresses allowed.
The runtime MAY revert back the original addresses, keep the existing ones or completely
remove them, since the interface MUST be in down state can not present a problem.
* **`hardwareAddress`** *(string, OPTIONAL)* - represents the hardware address (e.g. MAC Address) of the device's network interface, represented as an IEEE 802 MAC-48, EUI-48, EUI-64, or a 20-octet IP over InfiniBand link-layer address.
The runtime MAY decide to revert back the original hardware address.
* **`mtu`** *(uint32, OPTIONAL)* - the MTU (Maximum Transmission Unit) size for the device.
The runtime MAY decide to revert back the original MTU value.
When participating in container termination, the runtime must restore the original name to guarantee that operations are idempotent, so that a container that moves an interface and renames it can be created and destroyed multiple times with the same result.
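
The termination steps described above can be sketched in Go as follows. This is only an illustration: `LinkOps`, `releaseNetDevice`, and `recorder` are hypothetical names that belong to neither the spec nor any real netlink library, and performing the rename between the "down" and "move" steps is an assumption, since the text only requires "down" before the move and restoring the original name.

```go
package main

import "fmt"

// LinkOps is a hypothetical abstraction over the link operations a runtime
// would issue; it is not part of the runtime-spec or of any real library.
type LinkOps interface {
	SetDown(name string) error
	Rename(oldName, newName string) error
	MoveToHostNamespace(name string) error
}

// releaseNetDevice returns a device to the host: set it down, restore the
// original (host) name if it was renamed, then move it out of the namespace.
func releaseNetDevice(ops LinkOps, hostName, containerName string) error {
	current := containerName
	if current == "" {
		current = hostName // no rename was requested at creation time
	}
	if err := ops.SetDown(current); err != nil {
		return fmt.Errorf("set %q down: %w", current, err)
	}
	if current != hostName {
		if err := ops.Rename(current, hostName); err != nil {
			return fmt.Errorf("restore name %q: %w", hostName, err)
		}
	}
	if err := ops.MoveToHostNamespace(hostName); err != nil {
		return fmt.Errorf("move %q to host namespace: %w", hostName, err)
	}
	return nil
}

// recorder is a fake LinkOps that only records the calls it receives.
type recorder struct{ calls []string }

func (r *recorder) SetDown(name string) error {
	r.calls = append(r.calls, "down "+name)
	return nil
}

func (r *recorder) Rename(oldName, newName string) error {
	r.calls = append(r.calls, "rename "+oldName+" "+newName)
	return nil
}

func (r *recorder) MoveToHostNamespace(name string) error {
	r.calls = append(r.calls, "move "+name)
	return nil
}

func main() {
	r := &recorder{}
	if err := releaseNetDevice(r, "eth0", "container_eth0"); err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range r.calls {
		fmt.Println(c)
	}
}
```

Hiding the link operations behind an interface keeps the ordering logic testable without root privileges; a real runtime would back it with netlink calls.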

### Example

@@ -235,52 +225,6 @@ The runtime MAY decide to revert back the original MTU value.

This configuration will move the device named "eth0" from the host into the container's network namespace. Inside the container, the device will be named "container_eth0".
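
As a rough illustration of how a consumer might read this configuration, the Go sketch below decodes a `netDevices` object and resolves the name each device gets inside the container, falling back to the host name when `name` is omitted. The `containerNames` helper and the wrapping of the fragment in an outer JSON object are assumptions made for the example; only the `LinuxNetDevice` struct shape comes from `specs-go/config.go`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// LinuxNetDevice mirrors the struct in specs-go/config.go after this commit.
type LinuxNetDevice struct {
	// Name of the device in the container namespace
	Name string `json:"name,omitempty"`
}

// containerNames (hypothetical helper) maps each host device name to the name
// it will have inside the container. An empty "name" means the host name is kept.
func containerNames(raw []byte) (map[string]string, error) {
	var cfg struct {
		NetDevices map[string]LinuxNetDevice `json:"netDevices"`
	}
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, err
	}
	names := make(map[string]string, len(cfg.NetDevices))
	for hostName, dev := range cfg.NetDevices {
		if dev.Name != "" {
			names[hostName] = dev.Name
		} else {
			names[hostName] = hostName
		}
	}
	return names, nil
}

func main() {
	// The fragment is wrapped in braces so it is a valid standalone document.
	raw := []byte(`{"netDevices": {"eth0": {"name": "container_eth0"}}}`)
	names, err := containerNames(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("eth0 ->", names["eth0"]) // prints "eth0 -> container_eth0"
}
```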

#### Moving a device with a specific IP address and MTU inside the container:

IPv4 address

```json
"netDevices": {
"ens4": {
"addresses": [
"10.0.0.10/24"
],
"hardwareAddress": "32:ba:1c:b1:eb:63",
"mtu": 9000
}
}
```

IPv6 address

```json
"netDevices": {
"ens4": {
"addresses": [
"2001:db8:1:2::a/64"
],
"hardwareAddress": "32:ba:1c:b1:eb:63",
"mtu": 9000
}
}
```

Dual Stack

```json
"netDevices": {
"ens4": {
"addresses": [
"10.0.0.10/24",
"2001:db8:1:2::a/64"
],
"hardwareAddress": "32:ba:1c:b1:eb:63",
"mtu": 9000
}
}
```


## <a name="configLinuxControlGroups" />Control groups

Also known as cgroups, they are used to restrict resource usage for a container and handle device access.
@@ -1076,6 +1020,7 @@ subset of the available options.
[mknod.2]: https://man7.org/linux/man-pages/man2/mknod.2.html
[namespaces.7_2]: https://man7.org/linux/man-pages/man7/namespaces.7.html
[net_device]: https://docs.kernel.org/networking/netdevices.html
[net_namespaces.7]: https://man7.org/linux/man-pages/man7/network_namespaces.7.html
[null.4]: https://man7.org/linux/man-pages/man4/null.4.html
[personality.2]: https://man7.org/linux/man-pages/man2/personality.2.html
[pts.4]: https://man7.org/linux/man-pages/man4/pts.4.html
12 changes: 0 additions & 12 deletions schema/defs-linux.json
@@ -194,18 +194,6 @@
"properties": {
"name": {
"type": "string"
},
"addresses": {
"type": "array",
"items": {
"type": "string"
}
},
"hardwareAddress": {
"type": "string"
},
"mtu": {
"$ref": "defs.json#/definitions/uint32"
}
}
},
3 changes: 1 addition & 2 deletions schema/test/config/bad/linux-netdevice.json
@@ -6,8 +6,7 @@
"linux": {
"netDevices": {
"eth0": {
"name": "container_eth0",
"mtu": "not_an_int"
"name": 23
}
}
}
30 changes: 2 additions & 28 deletions schema/test/config/good/linux-netdevice.json
@@ -8,34 +8,8 @@
"eth0": {
"name": "container_eth0"
},
"ens4": {
"addresses": [
"10.0.0.10/24"
],
"hardwareAddress": "32:ba:1c:b1:eb:63",
"mtu": 9000
},
"ens5": {
"addresses": [
"2001:db8:1:2::4/64"
],
"mtu": 1500
},
"ens6": {
"addresses": [
"10.0.0.10/24",
"2001:db8:1:2::4/64"
],
"mtu": 1500
},
"ens7": {
"addresses": [
"10.0.0.10/24",
"2001:db8:1:2::4/64",
"fd00:1::af/48"
],
"mtu": 1500
}
"ens4": {},
"ens5": {}
}
}
}
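
The intent of the two fixtures can be approximated in Go without a JSON Schema validator. The sketch below is not the schema test itself: it relies on `encoding/json` rejecting a number for a `string` field, which happens to match what the schema's `"type": "string"` constraint enforces for `name`. The `decodeNetDevices` helper and the trimmed fixture documents, which keep only the `linux.netDevices` part, are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// LinuxNetDevice mirrors the struct in specs-go/config.go after this commit.
type LinuxNetDevice struct {
	Name string `json:"name,omitempty"`
}

// decodeNetDevices (hypothetical helper) extracts linux.netDevices from a config document.
func decodeNetDevices(raw []byte) (map[string]LinuxNetDevice, error) {
	var cfg struct {
		Linux struct {
			NetDevices map[string]LinuxNetDevice `json:"netDevices"`
		} `json:"linux"`
	}
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, err
	}
	return cfg.Linux.NetDevices, nil
}

func main() {
	// Trimmed versions of the good and bad fixtures; other top-level fields are omitted.
	good := []byte(`{"linux":{"netDevices":{"eth0":{"name":"container_eth0"},"ens4":{},"ens5":{}}}}`)
	bad := []byte(`{"linux":{"netDevices":{"eth0":{"name":23}}}}`)

	devs, err := decodeNetDevices(good)
	fmt.Println("good:", len(devs), "devices, err =", err)

	_, err = decodeNetDevices(bad)
	var typeErr *json.UnmarshalTypeError
	if errors.As(err, &typeErr) {
		// A numeric "name" cannot be stored in a Go string field.
		fmt.Println("bad: rejected field", typeErr.Field)
	}
}
```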
6 changes: 0 additions & 6 deletions specs-go/config.go
@@ -497,12 +497,6 @@ type LinuxDevice struct {
type LinuxNetDevice struct {
// Name of the device in the container namespace
Name string `json:"name,omitempty"`
// Addresses is the list of IP addresses, IPv4 or IPv6, in CIDR format in the container namespace
Addresses []string `json:"addresses,omitempty"`
// HardwareAddress represents the hardware address (e.g. MAC Address) of the device's network interface
HardwareAddress string `json:"hardwareAddress,omitempty"`
// MTU Maximum Transfer Unit of the network device in the container namespace
MTU uint32 `json:"mtu,omitempty"`
}

// LinuxDeviceCgroup represents a device rule for the devices specified to
