-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
flightcontroller: ensure more security group rules #601
base: master
Are you sure you want to change the base?
Conversation
e7a7e6b
to
30ed716
Compare
func createLoadbalancerSecurityGroup(kluster *v1.Kluster, client *gophercloud.ServiceClient) (*securitygroups.SecGroup, error) { | ||
sg, err := securitygroups.Create(client, securitygroups.CreateOpts{ | ||
Name: fmt.Sprintf(LoadBalancerSecurityGroupName, kluster.Spec.Name), | ||
Description: fmt.Sprintf(`Kubernikus: SecurityGroup containing LoadBalancer Self-IPs for cluster "%s"`, kluster.Spec.Name), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Description: fmt.Sprintf(`Kubernikus: SecurityGroup containing LoadBalancer Self-IPs for cluster "%s"`, kluster.Spec.Name), | |
Description: fmt.Sprintf(`Kubernikus: SecurityGroup containing LoadBalancer VIP and Self-IPs for cluster "%s"`, kluster.Spec.Name), |
}, | ||
} | ||
|
||
if lbGroup != nil && lbGroup.ID != "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this defensive programming intentional? Because it seems like we ensure the lbGroup is not nil on top (e.g. we error out if it is). So I would be fine in simplifying this and moving it in the "static" wantedRules
above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be nil, eg. when there is not enough quota in the project for creating an additional security group. This ensures the other rules even if that's the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, good point but it looks to me like at the moment if creating the security group fails there is an early return (with en error) and no other security groups are enforced anyways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is it returning? I think I'm ignoring the error while getting the security group.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this only exiting EnsureLoadbalancerSecurityGroup()
? To me it looks like EnsureKubernikusRulesInSecurityGroup()
runs anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah yes, sorry I misread the code. Nvm
} | ||
allLbPages, err := loadbalancers.List(c.NetworkClient, loadbalancers.ListOpts{}).AllPages() | ||
if err != nil { | ||
return false, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return false, err | |
return false, fmt.Errorf("Failed to list load balancers: %w", err) |
} | ||
allLoadbalancers, err := loadbalancers.ExtractLoadBalancers(allLbPages) | ||
if err != nil { | ||
return false, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return false, err | |
return false, fmt.Errorf("Failed to extract load balancers: %w", err) |
} | ||
allPortPages, err := ports.List(c.NetworkClient, portListOpts{ProjectID: kluster.Labels["account"], SecurityGroups: lbGroup.ID}).AllPages() | ||
if err != nil { | ||
return false, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return false, err | |
return false, fmt.Errorf("Failed to list ports: %w", err) |
} | ||
allPorts, err := ports.ExtractPorts(allPortPages) | ||
if err != nil { | ||
return false, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return false, err | |
return false, fmt.Errorf("Failed to extract ports: %w", err) |
if !found { | ||
_, err := ports.Update(c.NetworkClient, lb.VipPortID, ports.UpdateOpts{SecurityGroups: &sg}).Extract() | ||
if err != nil { | ||
return false, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return false, err | |
return false, fmt.Errorf("Failed to add security group %s to port %s: %w", err, lbGroup.ID, lb.VipPortID, err) |
sg := []string{lbGroup.ID} | ||
for _, lb := range allLoadbalancers { | ||
if strings.HasPrefix(lb.Name, fmt.Sprintf("kube_service_%s", kluster.Spec.Name)) { | ||
found := false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to replace the inner loop with
if !portsOfSecurityGroup.Has(lb.VipPortID) {
//update port
}
And to this before the whole loop construct:
portsOfSecurityGroup := sets.NewString()
for _, p := range allPorts {
portsOfSecurityGroup.Insert(p.ID)
}
continue | ||
if lbGroup == nil { | ||
lbGroup, err = createLoadbalancerSecurityGroup(kluster, c.NetworkClient) | ||
if err != nil || lbGroup == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure the || lbGroup == nil
is required. Looking at createLoadbalancerSecurityGroup it returns an error when the create call fails. I don't see how err
and lbGroup
can be nil at the same time.
return q.String(), err | ||
} | ||
|
||
func getLoadbalancerSecurityGroup(kluster *v1.Kluster, client *gophercloud.ServiceClient) (*securitygroups.SecGroup, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we are getting two different security groups in this controller maybe its worthwhile to have a single getSecurityGroupByName(name string, client *gophercloud.ServiceClient)
function instead if this special purpose function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM now. Would be awesome now also change the e2e test to actually test that all this works:
- use an empty security group for e2e cluster, cleanup during teardown
- add loadbalancer happy path url check to test loadbalancer <-> nodes connectivity is working.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. One small nitpick but looks good otherwise
test/e2e/setup_test.go
Outdated
@@ -37,6 +39,12 @@ func (s *SetupTests) Run(t *testing.T) { | |||
require.True(t, running, "The Kluster must be Running") | |||
} | |||
|
|||
func (s *SetupTests) CreateSecurityGroup(t *testing.T) { | |||
sg, err := securitygroups.Create(s.OpenStack.Network, securitygroups.CreateOpts{Name: "kubernikus"}).Extract() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would create a dedicated security group that is named after the current soak test cluster.
055b65c
to
8b1d66d
Compare
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
8b1d66d
to
48fa697
Compare
4b8c704
to
b7c6e22
Compare
I haven't checked the code but only your comment but IIRC for UDP you want a port 123 Ingress rule for NTP and port 53 Ingress for DNS as well (because it's stateless so the firewall can't track the state). Also for DNS you might want a TCP port 53 Egress rule as well as DNS now can use both. |
Are you sure that's the case? Because allowing udp egress 53 and 123 definitlty fixes dns issues and time sync issues. So it seems to me nsx-t is doing some "connection tracking" for udp packets as well (like iptables does, e.g. state
That's a valid point |
At least for NTP a customer had this issue today (not in a kubernikus environment) so I assume that the ingress rules are needed as well, but I haven't tested it myself. Here is the discussion: https://convergedcloud.slack.com/archives/C374AQJ3W/p1656505078940829 |
0de9f29
to
e416930
Compare
* add metrics for security group reconciliation errors * add kluster event for securitu group reconcilicaiton errors * e2e: * create empty security group for cluster * run kube-detective * call loadbalancer url
836ceaf
to
4ddfb44
Compare
6cf0471
to
835adac
Compare
835adac
to
94e35de
Compare
This PR refactors the
EnsureKubernikusRuleInSecurityGroup
to support reconciling more then one rule.It now also creates the rule lazily only if no other broader rules are already covering the wanted rules.
In addition to the podCIDR rule we had, it now also added egress rules for ntp and the cluster apiserver.
This PR create the following rules lazily if there isn't already a rule allowing the traffic:
egress:
ingress:
TODO: