Merge pull request #722 from jayeshh123/vnic_pr
MROT related changes
rajan-mis authored Aug 24, 2023
2 parents e6261e5 + b6f383b commit 5dcea25
Showing 10 changed files with 795 additions and 0 deletions.
143 changes: 143 additions & 0 deletions roles/mrot_config/tasks/README.md
@@ -0,0 +1,143 @@
**What is MROT?**

IBM Storage Scale 5.1.5 introduces the Multi-Rail over TCP (MROT) feature. This feature enables the concurrent use of multiple subnets to communicate with a specified destination, and now allows the concurrent use of multiple physical network interfaces without requiring bonding to be configured.

With this MROT configuration code, MROT can be configured when both the storage and compute cluster VSIs have two vNICs up and running.
For the compute cluster, the primary interface must be in the compute cluster subnet and the secondary interface must be in the storage cluster subnet.
For the storage cluster, both interfaces must be in the storage cluster subnet.
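
For example, assuming the default interface names used by this role (`eth0` primary, `eth1` secondary, as defined in `vars/main.yaml`), a quick way to confirm that both vNICs are up on a node is:
```
ip -brief addr show eth0
ip -brief addr show eth1
```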

**Note:** This MROT configuration code supports only the M*N connection model. In this model, Scale is installed on the secondary interface for the compute cluster and on the primary interface for the storage cluster.

For more information, see https://www.ibm.com/docs/en/storage-scale/5.1.7?topic=configuring-multi-rail-over-tcp-mrot

**How to check whether MROT and the logical subnet are configured**

To check whether MROT and the logical subnet are configured, use the `mmdiag --network` and `mmlsconfig` commands.
For more information, see:
https://www.ibm.com/docs/en/storage-scale/5.0.5?topic=reference-mmdiag-command
https://www.ibm.com/docs/en/storage-scale/5.0.4?topic=reference-mmlsconfig-command

**Example - mmdiag --network for the Compute cluster**

The logical subnet can be seen under `my addr list`. In the result below, the `hostname` and `destination` columns show the destination hostnames and destination IPs of the peer nodes.

For the compute cluster, Scale is configured only on the secondary IPs, so only secondary IPs appear in the result. Per-node details can be found under `Connection details`, where the `IpPair Table` shows the source and destination IPs.
```
[root@scale-cluster-compute-1 ~]# mmdiag --network
=== mmdiag: network ===
Pending messages:
(none)
Inter-node communication configuration:
tscConnMode mrot
tscTcpPort 1191
my address 10.241.1.22/24 (eth1) <c0n0>
my addr list 10.241.1.22/24 (eth1)/scale-cluster.compscale.com;scale-cluster.strgscale.com
my subnet list 10.241.1.0/24
my node number 1
TCP Connections between nodes:
hostname node idx destination status err sock sent(MB) recvd(MB) ostype
scale-cluster-compute-3-sec <c0n1> 0 10.241.1.21 connected 0 124 0 0 Linux/L
scale-cluster-compute-3-sec <c0n1> 1 10.241.1.21 connected 0 127 0 0 Linux/L
scale-cluster-compute-2-sec <c0n2> 0 10.241.1.19 connected 0 125 0 0 Linux/L
scale-cluster-compute-2-sec <c0n2> 1 10.241.1.19 connected 0 128 0 0 Linux/L
scale-cluster-compute-4-sec <c0n3> 0 10.241.1.20 connected 0 126 0 0 Linux/L
scale-cluster-compute-4-sec <c0n3> 1 10.241.1.20 connected 0 108 0 0 Linux/L
scale-cluster-storage-1 <c1n0> 0 10.241.1.26 connected 0 134 0 0 Linux/L
scale-cluster-storage-1 <c1n0> 1 10.241.1.23 connected 0 135 0 0 Linux/L
scale-cluster-storage-3 <c1n1> 0 10.241.1.24 connected 0 137 0 0 Linux/L
scale-cluster-storage-3 <c1n1> 1 10.241.1.25 connected 0 138 0 0 Linux/L
scale-cluster-storage-2 <c1n2> 0 10.241.1.30 connected 0 136 0 0 Linux/L
scale-cluster-storage-2 <c1n2> 1 10.241.1.27 connected 0 140 0 0 Linux/L
scale-cluster-storage-4 <c1n3> 0 10.241.1.29 connected 0 133 0 0 Linux/L
scale-cluster-storage-4 <c1n3> 1 10.241.1.28 connected 0 117 0 0 Linux/L
Connection details:
<c0n1> 10.241.1.21/0 (scale-cluster-compute-3-sec)
status connected was_broken 0 err 0 reconnEnabled 1 delayedAckEnabled 1
connMode mrot shutting 0 handlerCount 0 need_notify 0 leaseSentOn 1
nMaxTcpConns 2 (2) nActiveCount 2 nActiveState 0x3 (1100000000000000)
nInuseTcpConns 0 currTcpConnIndex 0 availableTcpConns (1111111111111111)
nReservedSmallMsgTcpConns 0 currSmallMsgTcpConnIndex 0 currLargeMsgTcpConnIndex 0
reconnectTcpConns (0000000000000000) disconnectTcpConns (0000000000000000)
Inuse owner:
[ 0]:0 [ 1]:0 [ 2]:0 [ 3]:0
[ 4]:0 [ 5]:0 [ 6]:0 [ 7]:0
[ 8]:0 [ 9]:0 [10]:0 [11]:0
[12]:0 [13]:0 [14]:0 [15]:0
IpPair Table (offset 0 [555/0/1]):
idx iface status ping_cnt source destination subnet
0 eth1 up 0 10.241.1.22 10.241.1.21 10.241.1.0/24
```
**Example - mmlsconfig for the Compute cluster**

In the `mmlsconfig` output, the `subnets` parameter is found in the list of configuration parameters.
```
subnets 10.241.1.0/scale-cluster.compscale.com;scale-cluster.strgscale.com
```
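
This value is set by the role's `mmchconfig subnets=...` task in `logical_subnet_config.yaml`. For the example compute cluster above, the equivalent manual command would be:
```
mmchconfig subnets='10.241.1.0/scale-cluster.compscale.com;scale-cluster.strgscale.com'
```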

**Example - mmdiag --network for the Storage cluster**

The logical subnet can be seen under `my addr list`. In the result below, the `hostname` and `destination` columns show the destination hostnames and destination IPs of the peer nodes.

For the storage cluster, Scale is configured on both IPs, so both primary and secondary IPs appear in the result. Per-node details can be found under `Connection details`, where the `IpPair Table` shows the source and destination IPs.

```
[root@scale-cluster-storage-1 ~]# mmdiag --network
=== mmdiag: network ===
Pending messages:
(none)
Inter-node communication configuration:
tscConnMode mrot
tscTcpPort 1191
my address 10.241.1.23/24 (eth0) <c0n0>
my addr list 10.241.1.23/24 (eth0)/scale-cluster.strgscale.com;scale-cluster.compscale.com 10.241.1.26/24 (eth1)/scale-cluster.strgscale.com;scale-cluster.compscale.com
my subnet list 10.241.1.0/24
my node number 1
TCP Connections between nodes:
hostname node idx destination status err sock sent(MB) recvd(MB) ostype
scale-cluster-storage-3 <c0n1> 0 10.241.1.25 connected 0 126 0 0 Linux/L
scale-cluster-storage-3 <c0n1> 1 10.241.1.24 connected 0 130 0 0 Linux/L
scale-cluster-storage-2 <c0n2> 0 10.241.1.27 connected 0 127 0 0 Linux/L
scale-cluster-storage-2 <c0n2> 1 10.241.1.30 connected 0 131 0 0 Linux/L
scale-cluster-storage-4 <c0n3> 0 10.241.1.28 connected 0 124 0 0 Linux/L
scale-cluster-storage-4 <c0n3> 1 10.241.1.29 connected 0 133 0 0 Linux/L
scale-cluster-compute-1-sec <c0n4> 0 10.241.1.22 connected 0 128 0 0 Linux/L
scale-cluster-compute-1-sec <c0n4> 1 10.241.1.22 connected 0 137 0 0 Linux/L
scale-cluster-compute-4-sec <c0n5> 0 10.241.1.20 connected 0 138 0 0 Linux/L
scale-cluster-compute-4-sec <c0n5> 1 10.241.1.20 connected 0 141 0 0 Linux/L
scale-cluster-compute-3-sec <c0n6> 0 10.241.1.21 connected 0 139 0 0 Linux/L
scale-cluster-compute-3-sec <c0n6> 1 10.241.1.21 connected 0 143 0 0 Linux/L
scale-cluster-compute-2-sec <c0n7> 0 10.241.1.19 connected 0 140 0 0 Linux/L
scale-cluster-compute-2-sec <c0n7> 1 10.241.1.19 connected 0 142 0 0 Linux/L
Connection details:
<c0n1> 10.241.1.24/0 (scale-cluster-storage-3)
status connected was_broken 0 err 0 reconnEnabled 1 delayedAckEnabled 1
connMode mrot shutting 0 handlerCount 0 need_notify 0 leaseSentOn 1
nMaxTcpConns 2 (2) nActiveCount 2 nActiveState 0x3 (1100000000000000)
nInuseTcpConns 0 currTcpConnIndex 1 availableTcpConns (1111111111111111)
nReservedSmallMsgTcpConns 0 currSmallMsgTcpConnIndex 0 currLargeMsgTcpConnIndex 0
reconnectTcpConns (0000000000000000) disconnectTcpConns (0000000000000000)
Inuse owner:
[ 0]:0 [ 1]:0 [ 2]:0 [ 3]:0
[ 4]:0 [ 5]:0 [ 6]:0 [ 7]:0
[ 8]:0 [ 9]:0 [10]:0 [11]:0
[12]:0 [13]:0 [14]:0 [15]:0
IpPair Table (offset 1 [559/0/2]):
idx iface status ping_cnt source destination subnet
0 eth0 up 0 10.241.1.23 10.241.1.24 10.241.1.0/24
1 eth1 up 0 10.241.1.26 10.241.1.25 10.241.1.0/24
```
**Example - mmlsconfig for the Storage cluster**

In the `mmlsconfig` output, the `subnets` parameter is found in the list of configuration parameters.

```
subnets 10.241.1.0/scale-cluster.strgscale.com;scale-cluster.compscale.com
```
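
As a quick spot check that MROT is active on a node, the `tscConnMode` line of the `mmdiag` output can be inspected directly (assuming the standard Storage Scale installation path):
```
/usr/lpp/mmfs/bin/mmdiag --network | grep tscConnMode
```
On a correctly configured node this reports `mrot`, as in the examples above.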

78 changes: 78 additions & 0 deletions roles/mrot_config/tasks/common.yaml
@@ -0,0 +1,78 @@
---

# Common tasks that will run on all nodes.

# To check and set arp_filter on cluster nodes.

- name: check | Check whether arp_filter is set
shell: sysctl net.ipv4.conf.default.arp_filter net.ipv4.conf.all.arp_filter
register: arp_filter_status
changed_when: false

- name: cluster | Set arp_filter
shell: |
sysctl -w net.ipv4.conf.default.arp_filter=1
sysctl -w net.ipv4.conf.all.arp_filter=1
when: arp_filter_status.stdout_lines[0] == "net.ipv4.conf.default.arp_filter = 0" and arp_filter_status.stdout_lines[1] == "net.ipv4.conf.all.arp_filter = 0"
ignore_errors: yes
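
# Note: sysctl -w does not persist across reboots. A hypothetical follow-up
# task (a sketch, not part of this role) could persist the setting with a
# drop-in file under /etc/sysctl.d, for example:
#
# - name: cluster | Persist arp_filter across reboots (sketch)
#   copy:
#     dest: /etc/sysctl.d/90-arp-filter.conf
#     content: |
#       net.ipv4.conf.default.arp_filter = 1
#       net.ipv4.conf.all.arp_filter = 1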

# To get the IP addresses of the primary and secondary interfaces on cluster nodes.

- name: cluster | Get primary IP address
  shell: ip addr show {{ scale_pri_interface_name }} | awk '$1 == "inet" {gsub(/\/.*$/, "", $2); print $2}'
  register: primary_ip
  changed_when: false

- name: cluster | Get secondary IP address
  shell: ip addr show {{ scale_sec_interface_name }} | awk '$1 == "inet" {gsub(/\/.*$/, "", $2); print $2}'
  register: secondary_ip
  changed_when: false

# To extract the network addresses of the storage and compute clusters.

- name: cluster | Extract compute network address
  shell: echo "{{ compute_subnet_cidr }}" | awk -F'/' '{print $1}'
  register: compute_network_addr
  changed_when: false

- name: cluster | Extract storage network address
  shell: echo "{{ storage_subnet_cidr }}" | awk -F'/' '{print $1}'
  register: storage_network_addr
  changed_when: false

# To check and install NetworkManager-dispatcher-routing-rules on cluster nodes, then enable and start it.

- name: Get RHEL version
  shell: cat /etc/redhat-release | grep -oE '[0-9]+\.[0-9]+' | head -1
  register: rhel_version_output
  changed_when: false

- debug:
var: rhel_version_output.stdout_lines[0]

- name: Parse RHEL version
set_fact:
rhel_version: "{{ rhel_version_output.stdout_lines[0] }}"

- name: Install tasks block for RHEL
block:
- name: Check | Check if NetworkManager-dispatcher-routing-rules is installed
shell: rpm -q NetworkManager-dispatcher-routing-rules
register: nm_dispatcher_installed
ignore_errors: yes
failed_when: nm_dispatcher_installed.rc == 2

- name: Install | Install NetworkManager-dispatcher-routing-rules if not installed
yum:
name: NetworkManager-dispatcher-routing-rules
state: present
register: nmd_installed
when: nm_dispatcher_installed.rc != 0

- name: Install | Enable NetworkManager-dispatcher service
service:
name: NetworkManager-dispatcher
enabled: yes
when: nmd_installed.changed == true

- name: Install | Start NetworkManager-dispatcher service
service:
name: NetworkManager-dispatcher
state: started
when: nmd_installed.changed == true
when: rhel_version in ["7.9", "8.6"] and 'RedHat' in ansible_facts.distribution
45 changes: 45 additions & 0 deletions roles/mrot_config/tasks/logical_subnet_config.yaml
@@ -0,0 +1,45 @@
---

# Check if Logical subnet config already exists

- name: configure | Check if logical subnet is already configured
shell: |
/usr/lpp/mmfs/bin/mmlsconfig -Y | grep -q subnets
register: logical_subnets_exist
ignore_errors: yes
failed_when: logical_subnets_exist.rc == 2

- debug:
var: logical_subnets_exist.cmd

# Configure logical subnet on cluster

- name: configure | Logical subnet using mmchconfig
command: mmchconfig subnets='{{ storage_network_addr.stdout_lines[0] }}/{{ scale_cluster_clustername }};{{ opposit_cluster_clustername }}'
register: logical_subnet_configured
when: logical_subnets_exist.rc != 0

# Shut down the GPFS cluster

- name: cluster | Shutdown gpfs cluster
command: mmshutdown -a
register: shutdown_gpfs_cluster
when: logical_subnet_configured.changed == true

- name: Wait for 10 seconds
pause:
seconds: 10
when: logical_subnet_configured.changed == true

- name: cluster | Startup gpfs cluster
command: mmstartup -a
register: started_gpfs_cluster
when: shutdown_gpfs_cluster.changed == true

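# mmhealth cluster show -Y emits colon-delimited machine-readable output;
# the next task polls field 12 of the FILESYSTEM row until it reaches 1.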
- name: Wait until FILESYSTEM comes up
shell: "mmhealth cluster show -Y | grep FILESYSTEM | cut -d ':' -f 12"
register: filesystem_started
until: filesystem_started.stdout == "1"
retries: 60
delay: 60
when: is_admin_node | default(false) == true and scale_cluster_type == 'storage' and started_gpfs_cluster.changed == true
34 changes: 34 additions & 0 deletions roles/mrot_config/tasks/main.yaml
@@ -0,0 +1,34 @@
---
# Enabling MROT for IBM Storage Scale.

# Tasks to be executed on all nodes of the cluster
- name: common | Executing common tasks on cluster nodes
block:
- import_tasks: common.yaml

# Tasks to be executed only on compute nodes to configure MROT
- name: configure | Executing specific tasks on the compute cluster to check and configure MROT.
vars:
variable_sets:
- { network_addr_1: "{{ compute_network_addr.stdout_lines[0] }}", network_addr_2: "{{ storage_network_addr.stdout_lines[0] }}", subnet_cidr_1: "{{ compute_subnet_cidr }}", subnet_cidr_2: "{{ storage_subnet_cidr }}", primary_ip: "{{ primary_ip.stdout }}", secondary_ip: "{{ secondary_ip.stdout }}"}
block:
- import_tasks: mrot_config.yaml
when: scale_cluster_type == "compute"

# Tasks to be executed only on storage nodes to configure MROT
- name: configure | Executing specific tasks on the storage cluster to check and configure MROT.
vars:
variable_sets:
- { network_addr_1: "{{ storage_network_addr.stdout_lines[0] }}", network_addr_2: "{{ storage_network_addr.stdout_lines[0] }}", subnet_cidr_1: "{{ storage_subnet_cidr }}", subnet_cidr_2: "{{ storage_subnet_cidr }}", primary_ip: "{{ primary_ip.stdout }}", secondary_ip: "{{ secondary_ip.stdout }}"}
block:
- import_tasks: mrot_config.yaml
when: scale_cluster_type == "storage"

# Tasks to be executed only on the admin node to configure the logical subnet
- name: configure | Check and configure logical subnet
block:
- import_tasks: logical_subnet_config.yaml
when:
- (is_admin_node | default(false) == true and scale_cluster_type == 'compute') or
(is_admin_node | default(false) == true and scale_cluster_type == 'storage')

68 changes: 68 additions & 0 deletions roles/mrot_config/tasks/mrot_config.yaml
@@ -0,0 +1,68 @@
---

# Task to check and add custom routing tables on cluster nodes.

- name: check | Check if custom routing tables exist
shell: grep -q 'subnet_{{ item.network_addr_1 }}_{{ scale_pri_interface_name }}' /etc/iproute2/rt_tables && grep -q 'subnet_{{ item.network_addr_2 }}_{{ scale_sec_interface_name }}' /etc/iproute2/rt_tables
register: routing_table_exists
ignore_errors: yes
with_items: "{{ variable_sets }}"
failed_when: routing_table_exists.rc == 2

- debug:
var: routing_table_exists.results[0].cmd

- name: configure | Custom routing tables
shell: |
echo "200 subnet_{{ item.network_addr_1 }}_{{ scale_pri_interface_name }}" >> /etc/iproute2/rt_tables
echo "201 subnet_{{ item.network_addr_2 }}_{{ scale_sec_interface_name }}" >> /etc/iproute2/rt_tables
with_items: "{{ variable_sets }}"
when: routing_table_exists.results[0].rc != 0
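
# For illustration: with the storage-cluster example values from the README
# (network address 10.241.1.0, interfaces eth0/eth1), the two lines appended
# to /etc/iproute2/rt_tables would be:
#   200 subnet_10.241.1.0_eth0
#   201 subnet_10.241.1.0_eth1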

# Task to check and add custom IP rules on cluster nodes.

- name: check | Check if custom IP rules exist
shell: ip rule show | grep -q "lookup subnet_{{ item.network_addr_1 }}_{{ scale_pri_interface_name }}" && ip rule show | grep -q "lookup subnet_{{ item.network_addr_2 }}_{{ scale_sec_interface_name }}"
register: custom_ip_rules_exist
ignore_errors: yes
with_items: "{{ variable_sets }}"
failed_when: custom_ip_rules_exist.rc == 2

- debug:
var: custom_ip_rules_exist.results[0].cmd

- name: configure | Custom IP rules
shell: |
echo "from {{ item.primary_ip }}/32 table subnet_{{ item.network_addr_1 }}_{{ scale_pri_interface_name }}" >> /etc/sysconfig/network-scripts/rule-{{ scale_pri_interface_name }}
echo "from {{ item.secondary_ip }}/32 table subnet_{{ item.network_addr_2 }}_{{ scale_sec_interface_name }}" >> /etc/sysconfig/network-scripts/rule-{{ scale_sec_interface_name }}
with_items: "{{ variable_sets }}"
when: custom_ip_rules_exist.results[0].rc != 0
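
# For illustration: with the storage-cluster example values from the README
# (primary IP 10.241.1.23 on eth0, secondary IP 10.241.1.26 on eth1):
#   rule-eth0 gains: from 10.241.1.23/32 table subnet_10.241.1.0_eth0
#   rule-eth1 gains: from 10.241.1.26/32 table subnet_10.241.1.0_eth1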

# Task to check and add custom IP routes on cluster nodes.

- name: check | Check if custom IP routes exist
shell: |
ip route show table subnet_{{ item.network_addr_1 }}_{{ scale_pri_interface_name }} | grep -q "{{ item.subnet_cidr_1 }}"
ip route show table subnet_{{ item.network_addr_2 }}_{{ scale_sec_interface_name }} | grep -q "{{ item.subnet_cidr_2 }}"
register: custom_ip_routes_exist
ignore_errors: yes
with_items: "{{ variable_sets }}"
failed_when: custom_ip_routes_exist.rc == 2

- debug:
var: custom_ip_routes_exist.results[0].cmd

- name: configure | Custom IP routes
shell: |
echo "{{ item.subnet_cidr_1 }} dev {{ scale_pri_interface_name }} table subnet_{{ item.network_addr_1 }}_{{ scale_pri_interface_name }}" >> /etc/sysconfig/network-scripts/route-{{ scale_pri_interface_name }}
echo "{{ item.subnet_cidr_2 }} dev {{ scale_sec_interface_name }} table subnet_{{ item.network_addr_2 }}_{{ scale_sec_interface_name }}" >> /etc/sysconfig/network-scripts/route-{{ scale_sec_interface_name }}
with_items: "{{ variable_sets }}"
when: custom_ip_routes_exist.results[0].rc != 0
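
# For illustration: with the storage-cluster example values from the README
# (subnet 10.241.1.0/24):
#   route-eth0 gains: 10.241.1.0/24 dev eth0 table subnet_10.241.1.0_eth0
#   route-eth1 gains: 10.241.1.0/24 dev eth1 table subnet_10.241.1.0_eth1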

# Task to check and restart the NetworkManager service on cluster nodes.

- name: configure | Restart NetworkManager service
service:
name: NetworkManager
state: restarted
when: routing_table_exists.results[0].rc != 0 or custom_ip_rules_exist.results[0].rc != 0 or custom_ip_routes_exist.results[0].rc != 0
5 changes: 5 additions & 0 deletions roles/mrot_config/vars/main.yaml
@@ -0,0 +1,5 @@
# Static Variables for MROT

# Interface names
scale_pri_interface_name: "eth0"
scale_sec_interface_name: "eth1"