compute: attempt to destroy the VM when network setup fails.
When the Nomad client calls start task and an error is returned,
it will not call stop task as there is, from its view, no task to
stop.

If a partial failure occurs while starting a task, the driver needs
to attempt to roll back any work it has already done. In this case
it is the network setup that fails, so we must try to destroy the
already running VM. Without this, we leak VMs in partial failure
modes.
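
The rollback described above can be illustrated in isolation. The following is a minimal sketch, not the driver's actual code: the `virtualizer` and `networkBuilder` interfaces, the `fakeVirt` and `failingNet` types, and the `startTask` helper are all hypothetical stand-ins, and only the shape of the error handling mirrors the change.

```go
package main

import (
	"errors"
	"fmt"
)

// virtualizer and networkBuilder are hypothetical stand-ins for the
// driver's dependencies; they are not the real plugin interfaces.
type virtualizer interface {
	CreateDomain(name string) error
	DestroyDomain(name string) error
}

type networkBuilder interface {
	Build(name string) error
}

// fakeVirt records running domains so a leaked VM is observable.
type fakeVirt struct{ running map[string]bool }

func (f *fakeVirt) CreateDomain(name string) error  { f.running[name] = true; return nil }
func (f *fakeVirt) DestroyDomain(name string) error { delete(f.running, name); return nil }

// failingNet always fails, simulating a network setup error.
type failingNet struct{}

func (failingNet) Build(string) error { return errors.New("no bridge available") }

// startTask starts the VM and then builds its network. If the network
// step fails, it attempts to destroy the VM before returning, because
// the caller will not issue a stop for a task that never started.
func startTask(v virtualizer, n networkBuilder, name string) error {
	if err := v.CreateDomain(name); err != nil {
		return fmt.Errorf("failed to create domain: %w", err)
	}
	if err := n.Build(name); err != nil {
		buildErr := fmt.Errorf("failed to build task network: %w", err)
		if destroyErr := v.DestroyDomain(name); destroyErr != nil {
			// Rollback is best effort; surface both failures.
			return errors.Join(buildErr,
				fmt.Errorf("rollback failed, manual cleanup needed: %w", destroyErr))
		}
		return buildErr
	}
	return nil
}

func main() {
	v := &fakeVirt{running: map[string]bool{}}
	err := startTask(v, failingNet{}, "example-vm")
	fmt.Println("error:", err)
	fmt.Println("leaked VMs:", len(v.running))
}
```

In this sketch the network error is still returned to the caller, while the rollback leaves no running domain behind. The actual change logs a failed destroy rather than returning it; joining the two errors here is only one possible choice.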
jrasell committed Oct 4, 2024
1 parent cff8c93 commit 6901e32
Showing 1 changed file with 13 additions and 1 deletion.
14 changes: 13 additions & 1 deletion virt/driver.go
@@ -665,14 +665,26 @@ func (d *VirtDriverPlugin) StartTask(cfg *drivers.TaskConfig) (*drivers.TaskHand
Resources: cfg.Resources,
}

// Build out the network now that the VM has been started.
//
// In the event of an error, we need to try and destroy the already running
// VM. Nomad will not do this, as technically the task has not been started
// from its perspective.
//
// In the future, we may want to add some retry logic when destroying a VM,
// however, at least attempting it is a good start.
netBuildResp, err := d.networkController.VMStartedBuild(&netBuildReq)
if err != nil {
if destroyDomainErr := d.virtualizer.DestroyDomain(taskName); destroyDomainErr != nil {
d.logger.Error("virt: failed to destroy domain, manual cleanup needed",
"task_name", taskName, "error", destroyDomainErr)
}
return nil, nil, fmt.Errorf("virt: failed to build task network: %w", err)
}

h.netTeardown = netBuildResp.TeardownSpec

- d.logger.Info("task started successfully", "taskName", taskName)
+ d.logger.Info("task started successfully", "task_name", taskName)

// Generate our driver state and send this to Nomad. It stores critical
// information the driver will need to recover from failure and reattach