Arktos Scalability 430 2021 Tracks
Ying Huang edited this page Mar 31, 2021
- Multiple resource partitions for pod scheduling (2+ for 430) - primary goal
- A tenant's pods can be physically located in 2 different RPs after scheduling
- The scheduler in a tenant partition should be able to listen to multiple API servers, one belonging to each RP
- Performance test for 50K hollow nodes. (2 TP, 2~3 RP)
- Performance test runs with SSL enabled
- QPS optimization
- Daemon set handling in RP - non-primary goal
- Remove from TP
- Support daemon set in RP
- Load test
- Dynamic add/delete TP/RP design - TBD
- For design purposes only, not for implementation - the aim is to avoid hardcoding the partition topology in many places
- Quality bar only
- Dynamically discover new tenant partitions based on CRD objects in its resource manager
- System partition pod handling - TBD
- API gateway
- 10Kx2 cluster: 1 resource partition supporting 20K hosts; 2 tenant partitions, each supporting 300K pods
- Density test passed
- 10Kx1 cluster: 1 resource partition supporting 10K hosts; 1 tenant partition supporting 300K pods
- Density test passed
- Load test completed (with known failures)
- Single cluster
- 8K cluster passed density test, load test completed (with known failures)
- 10K cluster density test completed with etcd "too many requests" errors, load test completed (with known failures)
- Code change for 2TPx1RP mostly completed and merged into master (v0.7.0)
- Enable SSL in performance test - WIP (Yunwen)
- Use insecure mode in local cluster setting for POC (Agreed on 3/1/2021)
- Kubelet
- Use a dedicated kube-client to talk to the resource manager.
- Use multiple kube-clients to connect to multiple tenant partitions.
- Track the mapping between tenant ID and kube-clients.
- Use the right kube-client to do CRUD for all objects (To verify)
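The kubelet changes above boil down to a tenant-ID-to-client lookup plus a dedicated RP client. A minimal Go sketch of that mapping, where `KubeClient` and the server URLs are hypothetical stand-ins for the real client-go clientsets:

```go
package main

import "fmt"

// KubeClient stands in for a kube-client bound to one API server; the real
// kubelet would hold client-go clientsets here (hypothetical type).
type KubeClient struct {
	Server string
}

// ClientManager tracks the mapping between tenant ID and kube-client, plus
// the dedicated client for the resource manager.
type ClientManager struct {
	rpClient  *KubeClient            // dedicated client for the resource manager
	tpClients map[string]*KubeClient // tenant ID -> kube-client for that TP
}

func NewClientManager(rpServer string, tpServers map[string]string) *ClientManager {
	m := &ClientManager{
		rpClient:  &KubeClient{Server: rpServer},
		tpClients: make(map[string]*KubeClient),
	}
	for tenant, server := range tpServers {
		m.tpClients[tenant] = &KubeClient{Server: server}
	}
	return m
}

// ForTenant picks the right kube-client for CRUD on a tenant-owned object.
func (m *ClientManager) ForTenant(tenantID string) (*KubeClient, error) {
	c, ok := m.tpClients[tenantID]
	if !ok {
		return nil, fmt.Errorf("no tenant partition registered for tenant %q", tenantID)
	}
	return c, nil
}

// ForNodes returns the dedicated client used for node objects in the RP.
func (m *ClientManager) ForNodes() *KubeClient { return m.rpClient }

func main() {
	m := NewClientManager("https://rp-1", map[string]string{
		"tenant-a": "https://tp-1",
		"tenant-b": "https://tp-2",
	})
	c, _ := m.ForTenant("tenant-b")
	fmt.Println(c.Server)            // https://tp-2
	fmt.Println(m.ForNodes().Server) // https://rp-1
}
```

The "To verify" item above would then amount to checking that every object CRUD path routes through `ForTenant` (or `ForNodes`) rather than a single default client.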
- Controllers
- Node controllers (in resource partition)
- Use a dedicated kube-client to talk to the resource manager.
- Use multiple kube-clients to talk to multiple tenant partitions.
- Other controllers (in tenant partition)
- If the controller list/watches node objects, it needs to use multiple kube-clients to access multiple resource managers.
- DaemonSet controller (Service/PV/AttachDetach/Garbage)
- [ ] Move TTL/DaemonSet controllers to RP
- Disable in TP, enable in RP
- Identify resources that belong to the RP only
- Further perf and scalability improvements (TBD, currently non goal)
- Decide whether to partition the node-object cache or keep all node objects cached in a single process.
- Node controllers (in resource partition)
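As a rough illustration of the node controller (in the RP) fanning a node update out to every tenant partition, here is a minimal Go sketch; the `TPClient` type and `ReportNode` method are invented placeholders for real client-go calls against each TP's API server:

```go
package main

import "fmt"

// TPClient stands in for a kube-client to one tenant partition (hypothetical).
type TPClient struct {
	Name string
}

// ReportNode simulates pushing a node update to one tenant partition; the
// real node controller would call the TP's API server via client-go.
func (c *TPClient) ReportNode(node string) string {
	return fmt.Sprintf("%s <- %s", c.Name, node)
}

// broadcastNode fans a node update out to every tenant partition, since each
// TP needs a view of the nodes hosted in this resource partition.
func broadcastNode(tps []*TPClient, node string) []string {
	results := make([]string, 0, len(tps))
	for _, tp := range tps {
		results = append(results, tp.ReportNode(node))
	}
	return results
}

func main() {
	tps := []*TPClient{{Name: "tp-1"}, {Name: "tp-2"}}
	for _, r := range broadcastNode(tps, "node-42") {
		fmt.Println(r)
	}
}
```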
- Scheduler
- Use a dedicated kube-client to talk to its tenant partition.
- Use multiple kube-clients to connect to multiple resource managers, list/watching nodes from all resource managers.
- [?] Use the right kube-client to update node objects.
- Further perf and scalability improvements (TBD)
- Improve scheduling algorithm to reduce the possibility of scheduling conflicts.
- Improve scheduler sampling algorithm to reduce scheduling time.
- API server - TBD
- No areas needing change have been identified so far
- Proxy
- Working on a design that will evaluate proxy vs. code changes in each component (TBD)
- Performance test tools
- Cluster loader
- How to talk to node in perf test (Hongwei)
- Kubemark
- Support 2 TP scale out cluster set up, insecure (0.7)
- Support 2 TP scale out cluster set up, secure mode
- Support 2 TP, 2 RP scale out cluster set up, secure mode
- Kube-up
- Support for scale out (currently only kubemark supports scale out)
- Performance test
- Single RP capacity test (>= 25K, preparing for 25Kx2 goal)
- QPS optimization (x2, x3, x4, etc. in density test)
- Regular density test for 10K single cluster, 10Kx2. Each will be done after 500 node test
- 2TP (10K), 1RP (20K), 20K density test, secure mode
- Dev tools
- One box setup script for 2 TP, 1 RP (Peng, Ying)
- One box setup script for 2 TP, 2 RP (Ying)
- 1.18 Changes
- Complete golang 1.13.9 migration (Sonya)
- Metrics platform migration (YingH)
- Migrated from metrics server to Prometheus
- Get correct API responsiveness data
- Support multiple RPs
- Density test in 500 nodes (General guidelines): 2TP/2RP, scale up - 3/27
- 2TP/2RP 2x5K density test (Sonya) - 3/30
- Scale up 500 density test (Sonya) - 3/30
- 2TP/2RP 2x10K density test (Sonya) - started on 3/31
- 2TP/2RP 2x10K density test with double RS QPS (Sonya) - planned for 4/1
- Perf test for SSL mode
- 1TP/1RP 500 nodes (Sonya) - done
- 2TP/1RP 500 nodes (Sonya, Yunwen) - 3/30
- 1TP/1RP 10K nodes (Sonya) - parked due to 1.4
- 1TP/1RP limit test (Sonya, YingH)
- 15K density test in SSL mode
- QPS tuning (YingH, Sonya)
- Increase replicaset controller QPS - test
- Issue tracker
- Check node authorizer in secure mode (Yunwen, YingH)
- haproxy SSL check causes api server "TLS handshake error" - Yunwen master PR 1060 Issue 1048
- Kubelet failed to upload events due to authorization error - Yunwen Issue 1046
- KCM (deployment controller) on TP cluster failed to sync up deployment with its token - Yunwen master Issue 1039
- KCM on TP cluster didn't get nodes in RP cluster(s) - Yunwen master PR 1040 Issue 1038
- Failed to change ttl annotation for hollow-node - Yunwen Issue 1054
- TP2: Unable to authenticate the request due to an error: invalid bearer token - Yunwen Issue 1055
- RP server failed to collect pprof files - Sonya PR 1058 Issue 1057
- Change scheduler PVC binder code to support multiple RPs - Hongwei Issue 1059
- Kubeup/Kubemark improvement
- Reduce 2TP/2RP cluster start up time (Sonya, Hongwei)
- Multiple resource partition design - decided to continue with multiple client connections in all components for multiple RPs for now. Will re-design if issues are encountered with the current approach. (2/17)
- Setup local cluster for multiple TPs, RPs (Done - 2/24)
- Component code changes
- TP components connect to RP directly (Done - 3/15)
- RP components connect to TP directly (Done - 3/12)
- Disable/Enable controllers in TP/RP
- Support multiple RPs in kube-up (Done)
- Enable SSL in performance test
- Code change (3/12) RP 1001
- 1TP/1RP 500 nodes perf test (3/12)
- Perf test code changes (Done)
- Perf test changes needed for multiple RPs (3/18)
- Disable DaemonSet test in load (3/25) PR 1050
- Performance test (WIP)
- Test single RP limit
- 1TP/1RP achieved 40K hollow nodes (3/3). RP CPU ~44%
- Get more resources in GCP (80K US central 3/8)
- 10K density test insecure mode - benchmark (3/18)
- Multiple TPs/RPs density test
- 2TP/2RP 2x500 passed (3/27)
- Test single RP limit
- QPS tuning
- Complete golang 1.13.9 migration (Done - 3/12)
- Kube-openapi upgrade Issue 923
- Add and verify import-alias (2/10) PR 965
- Add hack/arktos_cherrypick.sh (2/19) PR 990
- Promote admission webhook API to v1. Arktos only supports v1beta1 now (2/20) PR 981
- Promote AdmissionReview to v1. Arktos only supports v1beta1 now (2/25) PR 998
- Promote CRD to v1 - (3/3) PR 1004
- Bump kube-openapi to 20200410 version and SMD to V3 (3/12) PR 1010
- Regression fix
- Failed to collect profiling of ETCD (3/11) Issue 1008 PR 1009
- Static pods being recycled on TP cluster Issue 1006 (Yunwen/Verifying)
- ETCD object counts issue in 3/10 run (3/16) PR 1027 Issue 1023
- Metrics platform migration (YingH)
- Regression fix
- 500 nodes load run finished with error: DaemonSets timeout Issue 1007
- System partition pod - how to handle when HA proxy is removed (TBD)
- Density test should be OK
- Issues
- GC controller queries its own master node's lease info and causes a 404 error in haproxy Issue 1047 - appears to be in master only. Fixed in POC. Parked until POC changes are ported back to master.
- [Scale out POC] pod scheduler reported bound successfully but not appear in local Issue 1049 - related to system tenant design. Post 430
- [Scale out POC] secret not found in kubelet Issue 1052 - related to system tenant design. Post 430
- Tenant zeta request was not redirected to TP2 master correctly Issue 1056 - current proxy limitation
- Static pods being recycled on TP cluster (fixed in POC) PR 1044 Issue 1006
- Controllers on TP should union the nodes from RP cluster and local cluster - fixed in POC PR 1044 Issue 1042
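The last fix above (Issue 1042) amounts to a de-duplicating union of node lists from the local TP cluster and the RP clusters. A minimal Go sketch, with node names and list shapes chosen purely for illustration:

```go
package main

import (
	"fmt"
	"sort"
)

// unionNodes merges node-name lists from the local (TP) cluster and the RP
// clusters, dropping duplicates, so TP controllers see every node exactly once.
func unionNodes(lists ...[]string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, list := range lists {
		for _, n := range list {
			if !seen[n] {
				seen[n] = true
				out = append(out, n)
			}
		}
	}
	sort.Strings(out) // deterministic order for comparison/logging
	return out
}

func main() {
	local := []string{"tp-master"}
	rp1 := []string{"node-1", "node-2"}
	rp2 := []string{"node-2", "node-3"}
	fmt.Println(unionNodes(local, rp1, rp2)) // [node-1 node-2 node-3 tp-master]
}
```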