How Otter calls Nova

Otter uses Nova to launch and delete servers. It also makes a lot of GET serverId requests for each server it is launching. This page describes in detail how otter calls Nova. otter is written with Twisted, an async framework for doing IO in Python, so there is only one thread doing all the networking in a non-blocking manner. All the code that launches and deletes servers lives in https://github.com/rackerlabs/otter/blob/master/otter/worker/launch_server_v1.py

How does otter authenticate?

Before launching/deleting servers/CLB nodes, otter needs to get an auth token for the tenant it is launching servers for. All the code lives here. It does this by impersonating the tenant, using the following procedure (a code sketch follows the list):

  1. Gets the first admin username for the tenant using the /v1.1/mosso/tenant_id API with its identity admin credentials
  2. Authenticates itself and gets a token by sending its admin identity credentials to tokens. It caches this token until it receives a 401 on any further call.
  3. Impersonates the user and gets the user's token by calling /RAX-AUTH/impersonation-tokens with a 3-hour expiry
  4. Caches the tenant's token for 5 minutes so that any further call for this tenant uses the cached token.
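
Below is a simplified, synchronous sketch of steps 1, 3 and 4 (the real otter code is asynchronous, built on Twisted). The identity base URL, helper names, request/response shapes and caching details are assumptions for illustration; only the endpoint paths, the 3-hour expiry and the 5-minute cache come from the procedure above. The admin token from step 2 is taken as a parameter.

```python
import time
import requests

IDENTITY_URL = "https://identity.example.com"   # assumed base URL
_tenant_token_cache = {}                        # tenant_id -> (token, cached_at)

def impersonated_token(tenant_id, admin_token, cache_ttl=300):
    """Return an impersonation token for tenant_id (steps 1, 3 and 4).

    admin_token is the cached identity-admin token from step 2.
    """
    # Step 4: reuse the tenant's token if it was cached in the last 5 minutes.
    cached = _tenant_token_cache.get(tenant_id)
    if cached and time.time() - cached[1] < cache_ttl:
        return cached[0]

    headers = {"X-Auth-Token": admin_token}

    # Step 1: get the first admin username for the tenant.
    users = requests.get(
        "{0}/v1.1/mosso/{1}".format(IDENTITY_URL, tenant_id),
        headers=headers).json()
    username = users["user"]["id"]              # assumed response shape

    # Step 3: impersonate that user with a 3-hour (10800 s) expiry.
    resp = requests.post(
        IDENTITY_URL + "/v2.0/RAX-AUTH/impersonation-tokens",
        json={"RAX-AUTH:impersonation": {
            "user": {"username": username},
            "expire-in-seconds": 10800}},       # assumed request shape
        headers=headers).json()
    token = resp["access"]["token"]["id"]       # assumed response shape

    _tenant_token_cache[tenant_id] = (token, time.time())
    return token
```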

Please note that launching servers, described below, is a long process, and all the calls made to Nova/CLB use the same cached token obtained from the above procedure. It is possible for the launching process to fail with a 401 if the token expires during the process. However, this shouldn't normally happen, since each token is valid for 3 hours and the process below shouldn't take that long. One case where it can happen: a server builds for 1:55 hours and then goes to ERROR; it is replaced with another server whose build takes more than 1:05 hours, at which point the 3-hour token limit is crossed, causing a 401 error.

When launching servers:

When a policy to launch, say, 50 servers is executed, otter sends 2 POST servers requests at the same time. When the response to either of those requests comes back, it immediately sends the 3rd POST servers request, and so on until all 50 POST servers requests have been sent. Hence, there is a concurrency limit of 2 on one node. Please note that this limit is independent of the number of policies being executed at that time. For example, if 10 policies, each launching 50 servers from 10 different groups of 10 different tenants, are executed at the same time, the concurrency limit of 2 still applies, i.e. 500 POST servers requests are sent with a sliding window of 2 at a time. This is true for one API node, and we have 3 API nodes per region. Hence, the overall concurrency limit is 6 per region.
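
The sliding-window behaviour can be approximated with Twisted's DeferredSemaphore; a minimal sketch is below. `create_server`, a helper that returns a Deferred firing when one POST servers call completes, is hypothetical.

```python
from twisted.internet.defer import DeferredSemaphore, gatherResults

def launch_servers(create_server, count, concurrency=2):
    # At most `concurrency` POST servers requests are in flight; as soon as
    # one fires its Deferred, the next request is sent.
    sem = DeferredSemaphore(concurrency)
    return gatherResults([sem.run(create_server) for _ in range(count)])
```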

Sometimes, the create server request (POST servers) fails with a 500 but the server gets created anyway. In those cases otter waits for 5 seconds after the error, then GETs the particular server it tried to create to see if it was created anyway. If it wasn't, it retries the creation cycle up to 3 times, eventually giving up after that.
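
A rough synchronous sketch of that check, assuming hypothetical `create_server` and `find_server` helpers (the real code is async):

```python
import time

class NovaServerError(Exception):
    """Placeholder for a 500 response from the create call (assumed name)."""

def create_with_retries(create_server, find_server, attempts=3, delay=5):
    # create_server() issues POST servers; find_server() returns the server
    # if it was created anyway, else None. Both are hypothetical helpers.
    for _ in range(attempts):
        try:
            return create_server()
        except NovaServerError:
            time.sleep(delay)                 # wait 5 seconds after the error
            server = find_server()            # did Nova create it anyway?
            if server is not None:
                return server
    raise RuntimeError("create failed after {0} attempts".format(attempts))
```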

After each successful create request, otter calls GET ../serverID on that server until the server becomes ACTIVE. It does this at 20-second intervals and times out after 2 hours. It does not wait for all servers to be created; it starts polling immediately after a server create request succeeds.
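
A simplified, synchronous version of that polling loop might look like the following; `get_server` is a hypothetical wrapper around GET ../serverID.

```python
import time

def wait_until_active(get_server, server_id, interval=20, timeout=2 * 60 * 60):
    # Poll every 20 seconds until the server is ACTIVE; give up after 2 hours.
    # (The ERROR case is handled separately, as described in the next paragraph.)
    deadline = time.time() + timeout
    while time.time() < deadline:
        server = get_server(server_id)
        if server["status"] == "ACTIVE":
            return server
        time.sleep(interval)
    raise RuntimeError(
        "server {0} did not become ACTIVE within 2 hours".format(server_id))
```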

Sometimes, while building, a server goes into the ERROR state, at which point otter deletes the ERRORed server and creates a new one after 15 seconds. This continues for a maximum of 4 times, after which otter gives up and logs an error. The deletion of the ERRORed server is done using the delete->get cycle described below.

When deleting servers:

When a policy to delete, say, 50 servers is executed, otter sends 50 DELETE requests at the same time. There is no concurrency limit when deleting servers. For each server delete, after a successful DELETE, otter does a GET serverID and checks whether "OS-EXT-STS:task_state" is "deleting" for that server. If not, it retries the delete->get cycle after 2 seconds and continues doing so with exponential backoff, giving up after 10 attempts.
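
A sketch of that delete->get cycle with exponential backoff, again with hypothetical synchronous helpers:

```python
import time

def delete_and_verify(delete_server, get_server, server_id,
                      attempts=10, initial_delay=2):
    # delete_server() issues the DELETE; get_server() returns the server's
    # details or None once it is gone. Both are hypothetical helpers.
    delay = initial_delay
    for _ in range(attempts):
        delete_server(server_id)
        server = get_server(server_id)
        if server is None or server.get("OS-EXT-STS:task_state") == "deleting":
            return
        time.sleep(delay)
        delay *= 2                            # exponential backoff
    raise RuntimeError("could not delete server {0}".format(server_id))
```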

How Autoscale is planning to use Nova:

The code described earlier for launching and deleting is going to be replaced by a feature we are working on called "convergence". It is being developed here. With convergence, instead of polling each server's status after creating it, otter will fetch all the servers in the group after some time and decide what to do (launch or delete servers) based on the servers returned from Nova. This helps in multiple scenarios: a new server is created if the user deletes a server in a group; a server is deleted and a new one created if it goes into ERROR. You can find more information at https://github.com/rackerlabs/otter/wiki/Convergence-Specification and https://github.com/rackerlabs/otter/wiki/User-facing-convergence-changes.
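
As a toy illustration of the convergence idea (not otter's actual planner), the decision step boils down to comparing the desired capacity with what Nova returns. Field names and the (action, server_id) plan format below are assumptions.

```python
def plan(desired_capacity, nova_servers):
    # Decide what to launch or delete from a single listing of the group's servers.
    active = [s for s in nova_servers if s["status"] == "ACTIVE"]
    errored = [s for s in nova_servers if s["status"] == "ERROR"]
    steps = [("delete", s["id"]) for s in errored]      # remove ERRORed servers
    missing = desired_capacity - len(active)
    if missing > 0:
        steps += [("create", None)] * missing           # scale up / replace
    elif missing < 0:
        steps += [("delete", s["id"]) for s in active[:-missing]]  # scale down
    return steps
```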

The important thing to note about this implementation is that there will be many GET servers/details?limit=100 calls instead of a separate GET serverId call for each server. We have not yet decided on the frequency of this call. We are definitely planning to throttle it per group, ensuring that this call is made at most once every (say) 30 seconds for each group. We think this should be better for Nova, as otter will make 1 GET servers call instead of 50 parallel GET serverId calls. However, this can increase load on a single tenant if many groups in a tenant are converging in similar time ranges. We might throttle it on a per-tenant basis to ease the load, i.e. fetch servers for a tenant every 30 seconds. Each GET servers call that fails will be retried with exponential backoff.
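
An illustrative per-group throttle for those listing calls, assuming a hypothetical `list_servers` helper and an in-memory cache:

```python
import time

_last_listing = {}    # group_id -> (timestamp, servers)

def servers_for_group(group_id, list_servers, min_interval=30):
    # Make at most one GET servers/details?limit=100 listing per group every
    # 30 seconds; otherwise return the cached result.
    now = time.time()
    cached = _last_listing.get(group_id)
    if cached and now - cached[0] < min_interval:
        return cached[1]
    servers = list_servers()
    _last_listing[group_id] = (now, servers)
    return servers
```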

One of the nice things about this design is that execution is separate from deciding what to execute. This allows us to have flexible throttling policies when launching and deleting servers. We are internally discussing the code/architecture changes that will allow us to achieve this. Our intention is for this to be a flexible component that can change as Nova's reliability continues to increase. For example, one throttling change would be that when deleting, instead of retrying 10 times, otter might retry only once and converge again. By "converge", we mean it will fetch all servers and decide which servers to delete depending on the servers returned. There have been some discussions here. However, it is being tracked here.

Metrics collection:

Apart from the normal otter workflow of creating/deleting servers, we plan to collect metrics by doing GET servers/details?limit=100 on each tenant that is an Autoscale user in each region. This would be 300-400 tenants in ORD and fewer in other regions. This server collection across all tenants will be done every minute, i.e. roughly 300 GET calls will be made with a concurrency limit of 10 every minute. Each call will be retried with exponential backoff starting at 2 seconds, a maximum of 5 times. We were doing this but stopped after there were some issues.
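
A sketch of that schedule using Twisted's DeferredSemaphore and LoopingCall; `fetch_tenant_servers` (which would carry the exponential-backoff retries) is a hypothetical helper.

```python
from twisted.internet import task
from twisted.internet.defer import DeferredSemaphore, gatherResults

def collect_metrics(tenant_ids, fetch_tenant_servers, concurrency=10):
    # List servers for every Autoscale tenant, at most 10 requests in flight.
    sem = DeferredSemaphore(concurrency)
    return gatherResults(
        [sem.run(fetch_tenant_servers, tenant_id) for tenant_id in tenant_ids])

# Scheduled once a minute, e.g.:
#   task.LoopingCall(collect_metrics, tenant_ids, fetch_tenant_servers).start(60)
```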