Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

posix: aws_system_info_processor_count over-estimates the number of CPUs on k8s / in container #893

Closed
grrtrr opened this issue Mar 19, 2022 · 3 comments
Labels
feature-request A feature should be added or improved. needs-review This issue or pull request needs review from a core team member. p3 This is a minor priority issue

Comments

@grrtrr
Copy link

grrtrr commented Mar 19, 2022

The posix aws_system_info_processor_count() function returns the count of physical processors via sysconf(_SC_NPROCESSORS_ONLN) where sysconf is available (on many Linux systems).

Within a container, k8s pod, or any other resource constrained environment (cpusets, cgroups), this causes the
number of available CPUs to be over-estimated.

For instance, on a k8s job with 4 requested CPUs on a 64-core node, the reported value is 64, not 4.

To avoid this over-estimation, additional queries are necessary, looking at for

  • CPU shares,
  • CPU period/quota (CFS values),
  • CPU sets.

Here is a good write-up on how this is estimated in the JDK, producing this formula that takes the above into account:

  total_available_cpus = min(cpuset_cpus, cpu_shares/1024, physical_number_of_cpus)
@bretambrose
Copy link
Contributor

I'm unsure this rises to the level of an actual issue, and if it did it would belong in aws-c-io relative to the event_loop_group_new_default usage of the API. Personally, I feel like no one knows the destination execution environment better than the developer, and so when the brain-dead default isn't appropriate (like the cases you list above), it may be best to defer pool size calculations to the library consumer.

On a side note, I'm not ignoring all of your aws-c-* PRs. Many of them are excellent diagnoses and updates; I just haven't had a chance time/priority-wise to massage/integrate/feedback them.

@jmklix jmklix added feature-request A feature should be added or improved. p3 This is a minor priority issue needs-review This issue or pull request needs review from a core team member. labels Sep 21, 2023
@jmklix
Copy link
Member

jmklix commented Sep 21, 2023

You should be able to set the pool size on you end as bretambrose discussed above. No changes needed to this repo, closing this issue

@jmklix jmklix closed this as completed Sep 21, 2023
@grrtrr
Copy link
Author

grrtrr commented Sep 21, 2023

It might be a good idea to warn the developer about the limitations of defaults. std::thread::hardware_concurrency() has the same problem (it looks as if it uses sysconf underneath).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request A feature should be added or improved. needs-review This issue or pull request needs review from a core team member. p3 This is a minor priority issue
Projects
None yet
Development

No branches or pull requests

3 participants