Data Serving Results


Mark Sutherland edited this page Oct 5, 2018 · 3 revisions

Experiments History

All numbers were collected on the Huawei ARM server. Clients and servers were pinned to separate NUMA nodes, with DVFS disabled.
The dataset is 16 GB (all scripts can be found in this repo).

Results for 16 Server Cores

10GB server memory

20GB server memory

40GB server memory

Outstanding Questions

What are the avg. and 99th percentile latencies for a fully-in-memory configuration at full CPU load? (Not a realistic operating point; use a different piece of software if you want in-memory.)
- From the 40GB graphs, estimated avg. latency: < 1ms; the tail is still not seen.
What are the avg. and 99th percentile latencies for a "bigger-than-memory" configuration at full CPU load? (Experiment with buffer cache size and other Cassandra parameters on the CS3 webpage.)
- From the 20GB graphs, estimated avg. latency: 25-30ms; tail latency: 300-500ms.
How do we get more load into the 16c configurations? Right now we saturate at about 60% server utilization.
- Need more client cores. Right now, we are bottlenecked by client-side queueing.
- Use more physical servers?
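
The avg. and 99th percentile latencies asked about above can be computed from raw per-request latencies along these lines. This is a minimal sketch, not the method used to produce the graphs: the function name `latency_summary` and the input format (a plain list of latencies in milliseconds) are assumptions, and the nearest-rank percentile definition is one common choice among several.

```python
def latency_summary(samples_ms):
    """Return (mean, p99) for a list of per-request latencies in milliseconds.

    Hypothetical helper: assumes the client log has already been parsed
    into a flat list of latency samples.
    """
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    n = len(ordered)
    mean = sum(ordered) / n
    # Nearest-rank percentile: the smallest sample such that at least
    # 99% of all samples are less than or equal to it.
    # Integer arithmetic computes ceil(0.99 * n) without rounding error.
    rank = (99 * n + 99) // 100
    p99 = ordered[rank - 1]
    return mean, p99
```

With few samples the 99th percentile is dominated by one or two requests, so long runs are needed before the tail estimate is stable.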

Results for 1 Server Core

1 Client Thread

15 Client Threads

30 Client Threads

Outstanding Questions

(All questions are based on the 30-thread graph, since that is where we actually begin to see server-side queueing, which is representative behaviour.)

What are the avg. and 99th percentile latencies for a "bigger-than-memory" configuration at full CPU load, and what is the attained throughput? (Experiment with buffer cache size and other Cassandra parameters on the CS3 webpage.)
- Rerun the 30-client-thread experiment with a scaled-down memory/core ratio (use AWS or Google Compute Engine as a baseline).
How many server CPU cores need to be added before 30 client threads cannot reach 100% utilization?
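
A rough way to reason about the last question before running it is a closed-loop bound. The sketch below is a back-of-the-envelope model under stated assumptions, not a measured result: each client thread is assumed to keep exactly one request outstanding, each request is assumed to need `service_ms` of CPU time on a single core, and `client_overhead_ms` (a hypothetical parameter) stands for everything the request spends off the server CPUs, such as network time and client-side processing. Both values would have to be measured; the function names are illustrative.

```python
def max_server_utilization(client_threads, server_cores, service_ms, client_overhead_ms):
    """Upper bound on server CPU utilization for closed-loop clients.

    Best case for the server: no server-side queueing, so every request
    cycle takes exactly service_ms + client_overhead_ms.
    """
    # Highest request rate the clients can generate (requests per ms).
    max_throughput = client_threads / (service_ms + client_overhead_ms)
    # Utilization law: U = throughput * per-request CPU demand / cores.
    utilization = max_throughput * service_ms / server_cores
    return min(1.0, utilization)


def cores_until_unsaturated(client_threads, service_ms, client_overhead_ms):
    """Smallest core count at which the bound drops below 100%."""
    cores = 1
    while max_server_utilization(
        client_threads, cores, service_ms, client_overhead_ms
    ) >= 1.0:
        cores += 1
    return cores
```

Because this is only an upper bound, the real crossover may occur at fewer cores than the model suggests; it gives a starting point for choosing which core counts to test.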