feat: add nvidia MIG Settings #63

Merged (1 commit) on Feb 5, 2025

@piyush-jena piyush-jena commented Nov 5, 2024

Issue #, if available:

Description of changes:

  • Settings SDK changes to support settings.kubelet-device-plugins.nvidia.device-partitioning-strategy and settings.kubelet-device-plugins.nvidia.mig.profile in Bottlerocket.

Testing:

  1. Model Default:
bash-5.1# apiclient get settings.kubelet-device-plugin
{
  "settings": {
    "kubelet-device-plugins": {
      "nvidia": {
        "device-id-strategy": "index",
        "device-list-strategy": "volume-mounts",
        "device-partitioning-strategy": "none",
        "device-sharing-strategy": "none",
        "pass-device-specs": true
      }
    }
  }
}
  2. Model Updates:
bash-5.1# apiclient set settings.kubelet-device-plugins.nvidia.device-partitioning-strategy="mig"
bash-5.1# apiclient apply <<EOF
> [settings.kubelet-device-plugins.nvidia.mig.profile]
> "a100.40gb"="1g.5gb"
> EOF
bash-5.1# apiclient get settings.kubelet-device-plugin
{
  "settings": {
    "kubelet-device-plugins": {
      "nvidia": {
        "device-id-strategy": "index",
        "device-list-strategy": "volume-mounts",
        "device-partitioning-strategy": "mig",
        "device-sharing-strategy": "none",
        "mig": {
          "profile": {
            "a100.40gb": "1g.5gb"
          }
        },
        "pass-device-specs": true
      }
    }
  }
}
  3. Check:
bash-5.1# apiclient apply <<EOF
> [settings.kubelet-device-plugins.nvidia.mig.profile]
> "hello"="1g.5gb"
> EOF
Failed to apply settings: Failed to PATCH settings from '-' to '/settings?tx=apiclient-apply-7NsnlaurtHEacSYL': Status 400 when PATCHing /settings?tx=apiclient-apply-7NsnlaurtHEacSYL: Json deserialize error: Unable to deserialize into NvidiaGPUModel: NVIDIA GPU Model must match '^([a-z])(\d+)\.(\d+)gb$', given: hello at line 1 column 62
bash-5.1# apiclient apply <<EOF
> [settings.kubelet-device-plugins.nvidia.mig.profile]
> "a100.40gb"="2"
> EOF
bash-5.1# apiclient apply <<EOF
> [settings.kubelet-device-plugins.nvidia.mig.profile]
> "a100.40gb"="5"
> EOF
Failed to apply settings: Failed to PATCH settings from '-' to '/settings?tx=apiclient-apply-GzUHB0axGlWNPzGw': Status 400 when PATCHing /settings?tx=apiclient-apply-GzUHB0axGlWNPzGw: Json deserialize error: Unable to deserialize into MIGProfile: MIG Profile must match '^[0-9]g\.\d+gb$', given: 5 at line 1 column 71
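
For readers who want to pre-check keys and values before calling apiclient, here is a minimal sketch that applies the two patterns quoted in the error messages above. It is not the SDK's implementation: the function names is_gpu_model and is_named_mig_profile are hypothetical, and the real validator evidently accepts more than these patterns, since the value "2" was applied without error above even though it does not match the MIG profile pattern.

```python
import re

# Patterns copied verbatim from the error messages in the test output.
GPU_MODEL_RE = re.compile(r"^([a-z])(\d+)\.(\d+)gb$")
MIG_PROFILE_RE = re.compile(r"^[0-9]g\.\d+gb$")


def is_gpu_model(key: str) -> bool:
    """Return True if key looks like a GPU model name such as 'a100.40gb'."""
    # Hypothetical helper, not part of the settings SDK.
    return GPU_MODEL_RE.fullmatch(key) is not None


def is_named_mig_profile(value: str) -> bool:
    """Return True if value looks like a named MIG profile such as '1g.5gb'."""
    # Hypothetical helper. It checks only the named-profile pattern and does
    # not cover other value forms the real validator may accept.
    return MIG_PROFILE_RE.fullmatch(value) is not None


if __name__ == "__main__":
    for key in ("a100.40gb", "hello"):
        print(f"GPU model {key!r}: {is_gpu_model(key)}")
    for value in ("1g.5gb", "5"):
        print(f"MIG profile {value!r}: {is_named_mig_profile(value)}")
```

The sketch uses fullmatch rather than match so that a trailing newline in the input is rejected.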

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@piyush-jena piyush-jena force-pushed the mig-settings branch 4 times, most recently from a5f78d1 to f716594, on November 11, 2024 07:35
@piyush-jena piyush-jena marked this pull request as draft on November 13, 2024 20:26
@piyush-jena piyush-jena force-pushed the mig-settings branch 2 times, most recently from ef6e3dc to 7b480a9, on January 15, 2025 20:26
@piyush-jena piyush-jena marked this pull request as ready for review on January 21, 2025 23:57

@bcressey bcressey left a comment

I haven't reviewed this in depth; I just flagged a couple of style issues that would be good to fix while reviewing bottlerocket-os/bottlerocket-core-kit#258.

@piyush-jena

Fixed comments in the last force push ^

@piyush-jena

Force push fixes above comments.


@cbgbt cbgbt left a comment

Other than the crate versioning comment, this lgtm!

@piyush-jena

Force push updates the settings-extension-kubelet-device-plugins to 0.2.0.

@piyush-jena

Missed Cargo.toml of bottlerocket-settings-sdk

@piyush-jena piyush-jena merged commit c9e5f98 into bottlerocket-os:develop Feb 5, 2025
2 checks passed