# Major Additions
+## Service account support for spark-infrastructure
+To provide a more flexible and secure way to authenticate with AWS services, the spark-infrastructure Helm chart now supports service accounts, which enables AWS IRSA (IAM Roles for Service Accounts) authentication. See _**How to Upgrade**_ for more information.
## Path to Production Alignment
To better align development processes with processes in CI/CD and higher environments, we no longer recommend using Tilt live-reloading. As such, upgrading projects should consider narrowing the scope of their Tiltfile. See _**How to Upgrade**_ for more information.
## Conditional Steps
+## AWS IRSA (IAM Roles for Service Accounts) Authentication
+This step is optional, but it is the recommended way to authenticate with AWS services.
+1. [Create an IAM OIDC provider for your cluster](https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html)
+2. Follow the [Assign IAM roles to Kubernetes service accounts](https://docs.aws.amazon.com/eks/latest/userguide/associate-service-account-role.html) document but **skip** the step that creates the service account
+3. In the spark-infrastructure chart template, add the service account configuration as shown below:
+    serviceAccount:
+      name: service-account-name
+      create: true
+      metadata:
+        annotations:
+          # Ref: IAM role ARN from step 2
+          eks.amazonaws.com/role-arn: arn:aws:iam::aws-id:role/iam-role-name
+    deployment:
+      # The service account name must match the name specified in the IAM role's trust relationship
+      serviceAccountName: service-account-name
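When these settings live in the spark-infrastructure values file, they would normally sit under the key of the subchart they configure. The following is a minimal sketch, assuming the subcharts are keyed by their chart names (`aissemble-spark-history-chart` and `aissemble-thrift-server-chart`) with no aliases and are deployed to the same namespace. The service account name, AWS account ID, and role name are placeholders. One subchart creates the annotated service account and the other reuses it by name:

```yaml
# Sketch only: assumes subchart values are keyed by chart name (no aliases).
aissemble-spark-history-chart:
  serviceAccount:
    # Creates the service account and annotates it with the IAM role from step 2
    create: true
    name: service-account-name
    metadata:
      annotations:
        eks.amazonaws.com/role-arn: arn:aws:iam::aws-id:role/iam-role-name

aissemble-thrift-server-chart:
  deployment:
    # Reuses the service account created by the history chart above
    serviceAccountName: service-account-name
```

Creating the service account in only one chart avoids two charts trying to own a resource with the same name.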
## Final Steps - Required for All Projects
### Finalizing the Upgrade
1. Run `./mvnw org.technologybrewery.baton:baton-maven-plugin:baton-migrate` to apply the automatic migrations
# Properties
-| Property | Description | Default |
-| app.name | Sets label for app.kubernetes.io/name | Chart.Name (aissemble-spark-history-chart) |
-| enable | Enable or disable the entirety of the spark-history-server deployment. When false, equivalent to not installing the chart. | true |
-| deployment.annotations | Annotations to apply to the Spark History Server Deployment. | {} |
-| deployment.labels | Labels to apply to the Spark History Server Deployment. | {} |
-| deployment.replicas | Number of replicas for the Spark History Server Deployment. | 1 |
-| deployment.image.repository | Repository for the Spark History Server image. | "apache/spark" |
-| deployment.image.tag | Tag for the Spark History Server image. | "3.5.1" |
-| deployment.image.imagePullPolicy | Image pull policy for the Spark History Server image. | "IfNotPresent" |
-| deployment.command | Command to run in the container. | `["/opt/spark/sbin/start-spark-history-server.sh"]` |
-| deployment.env | Environment variables to set in the Spark History Server Deployment. | `SPARK_NO_DAEMONIZE: "true"` |
-| deployment.envFromSecret | Environment variables to pull from a Secret. Format:
`ENV_VAR.secretName: k8s_secret_name`
`ENV_VAR.key: k8s_secret_key` | {} |
-| deployment.volumes | Volumes to attach to the Spark History Server Deployment. | [] |
-| deployment.volumeMounts | Volume mounts to attach to the Spark History Server Deployment. | [] |
-| deployment.affinity | Node Affinity rule to constrain which nodes your Pod can be scheduled on based on node labels. | {} | |
-| deployment.tolerations | Tolerations rule to ensure that pods are not scheduled onto inappropriate nodes. | [] |
-| dependencies.packages | List of packages to install in the Spark History Server Deployment. | [] |
-| dependencies.jars | List of jars to install in the Spark History Server Deployment. | [] |
-| ingress.enabled | Enable or disable the Spark History Server Ingress. | true |
-| ingress.metadata.annotations | Annotations to apply to the Spark History Server Ingress. | {} |
-| ingress.ingressClassName | Ingress class to use for the Spark History Server Ingress. | "nginx" |
-| ingress.hosts | Hosts to apply to the Spark History Server Ingress. | `[paths: [{path: "/", pathType: "Prefix", backend: {service: {name: "spark-history-service", port: {number: 18080}}}}]]` |
-| service.annotations | Annotations to apply to the Spark History Server Service. | {} |
-| service.type | Type of service to create for the Spark History Server. | "LoadBalancer" |
-| service.port.name | Name of the service port. | "shs-http" |
-| service.port.port | Port number for the service. | 18080 |
-| service.ports.targetPort | The port that the exposed port should map to | 18080 |
-| eventVolume.enabled | Enable or disable the default Event Volume for the Spark History Server. | false |
-| eventVolume.name | Name of the Event Volume. | "spark-events" |
-| eventVolume.mountPath | Mount path for the Event Volume. | "/tmp/spark-events" |
-| eventVolume.storageType | Type of storage to use for the Event Volume. Legal values: `local`, `custom` | "local" |
-| eventVolume.size | Size of the Event Volume. | "1Gi" |
-| eventVolume.accessModes | Access modes for the Event Volume. | ["ReadWriteMany"] |
-| eventVolume.mountOptions | Mount options for the Event Volume. | ["allow-delete"] |
-| eventVolume.volumePathOnNode | Path on the underlying node to mount the Event Volume. | "/tmp" |
-| sparkConf | Configuration for the Spark History Server. | "" |
+| Property | Description | Default |
+| app.name | Sets label for app.kubernetes.io/name | Chart.Name (aissemble-spark-history-chart) |
+| enable | Enable or disable the entirety of the spark-history-server deployment. When false, equivalent to not installing the chart. | true |
+| deployment.annotations | Annotations to apply to the Spark History Server Deployment. | {} |
+| deployment.labels | Labels to apply to the Spark History Server Deployment. | {} |
+| deployment.replicas | Number of replicas for the Spark History Server Deployment. | 1 |
+| deployment.image.repository | Repository for the Spark History Server image. | "apache/spark" |
+| deployment.image.tag | Tag for the Spark History Server image. | "3.5.1" |
+| deployment.image.imagePullPolicy | Image pull policy for the Spark History Server image. | "IfNotPresent" |
+| deployment.command | Command to run in the container. | `["/opt/spark/sbin/start-spark-history-server.sh"]` |
+| deployment.env | Environment variables to set in the Spark History Server Deployment. | `SPARK_NO_DAEMONIZE: "true"` |
+| deployment.envFromSecret | Environment variables to pull from a Secret. Format: <br/>`ENV_VAR.secretName: k8s_secret_name`<br/>`ENV_VAR.key: k8s_secret_key` | {} |
+| deployment.volumes | Volumes to attach to the Spark History Server Deployment. | [] |
+| deployment.volumeMounts | Volume mounts to attach to the Spark History Server Deployment. | [] |
+| deployment.affinity | Node Affinity rule to constrain which nodes your Pod can be scheduled on based on node labels. | {} |
+| deployment.tolerations | Tolerations rule to ensure that pods are not scheduled onto inappropriate nodes. | [] |
+| deployment.serviceAccountName | Name of the service account to use for the deployment. Takes precedence over `serviceAccount.name`. | "" |
+| dependencies.packages | List of packages to install in the Spark History Server Deployment. | [] |
+| dependencies.jars | List of jars to install in the Spark History Server Deployment. | [] |
+| ingress.enabled | Enable or disable the Spark History Server Ingress. | true |
+| ingress.metadata.annotations | Annotations to apply to the Spark History Server Ingress. | {} |
+| ingress.ingressClassName | Ingress class to use for the Spark History Server Ingress. | "nginx" |
+| ingress.hosts | Hosts to apply to the Spark History Server Ingress. | `[paths: [{path: "/", pathType: "Prefix", backend: {service: {name: "spark-history-service", port: {number: 18080}}}}]]` |
+| service.annotations | Annotations to apply to the Spark History Server Service. | {} |
+| service.type | Type of service to create for the Spark History Server. | "LoadBalancer" |
+| service.port.name | Name of the service port. | "shs-http" |
+| service.port.port | Port number for the service. | 18080 |
+| service.ports.targetPort | The port that the exposed port should map to | 18080 |
+| eventVolume.enabled | Enable or disable the default Event Volume for the Spark History Server. | false |
+| eventVolume.name | Name of the Event Volume. | "spark-events" |
+| eventVolume.mountPath | Mount path for the Event Volume. | "/tmp/spark-events" |
+| eventVolume.storageType | Type of storage to use for the Event Volume. Legal values: `local`, `custom` | "local" |
+| eventVolume.size | Size of the Event Volume. | "1Gi" |
+| eventVolume.accessModes | Access modes for the Event Volume. | ["ReadWriteMany"] |
+| eventVolume.mountOptions | Mount options for the Event Volume. | ["allow-delete"] |
+| eventVolume.volumePathOnNode | Path on the underlying node to mount the Event Volume. | "/tmp" |
+| sparkConf | Configuration for the Spark History Server. | "" |
+| serviceAccount.create | When true, creates a service account named `serviceAccount.name` and assigns it to the deployment. If `deployment.serviceAccountName` is also set, that value takes precedence: the service account is still created, but it is not assigned to the deployment. | false |
+| serviceAccount.name | Service account name | aissemble-spark-history-chart-sa |
+| serviceAccount.metadata.annotations | Service account annotations | {} |
## Manually Creating a PersistentVolume
@@ -117,3 +121,27 @@ recommend removing them from your values file entirely.
## Property Removed
No properties removed.
+# Cloud Services Authentication
+This chart supports authentication with AWS via environment variables and [IRSA (IAM Roles for Service Accounts)](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html).
+IRSA is the recommended way to authenticate with AWS services.
+**Note 1**: To use IRSA authentication, first create the OIDC provider and the IAM role, then use this chart's `serviceAccount` configuration to create the service account,
+setting the IAM role ARN in `metadata.annotations`. For example:
+    serviceAccount:
+      name: replace-with-sa-name
+      create: true
+      metadata:
+        annotations:
+          eks.amazonaws.com/role-arn: arn:aws:iam::replace-with-aws-id:role/replace-with-iam-role-name
+**Note 2**: If a service account configured with an IAM role already exists, for example one created by another chart, this chart can use it directly. For example:
+    deployment:
+      serviceAccountName: replace-with-sa-name
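The two options can also be set together. As a sketch of the precedence rule described in the properties table (both names are placeholders), the values below still create `created-sa`, but the deployment runs as `existing-sa` because `deployment.serviceAccountName` wins:

```yaml
serviceAccount:
  create: true
  # Created by the chart, but not attached to the deployment in this case
  name: created-sa
deployment:
  # Takes precedence; expected to already exist in the release namespace
  serviceAccountName: existing-sa
```

Leaving `deployment.serviceAccountName` empty is the simpler choice when the chart should both create and use the service account.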
{{ toYaml .Values.deployment.labels }}
{{- end }}
+ {{- if .Values.deployment.serviceAccountName }}
+ serviceAccountName: {{ .Values.deployment.serviceAccountName }}
+ {{- else if .Values.serviceAccount.create }}
+ serviceAccountName: {{ .Values.serviceAccount.name | default .Chart.Name }}
+ {{- end }}
{{- if .Values.deployment.affinity }}
{{- toYaml .Values.deployment.affinity | nindent 8 }}
+{{ if .Values.serviceAccount.create }}
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: {{ .Values.serviceAccount.name | default .Chart.Name }}
+  {{ $otherdata := omit .Values.serviceAccount.metadata "name" "annotations" }}
+  {{- range $key, $value := $otherdata }}
+  {{ $key }}: {{ $value }}
+  {{- end }}
+  {{- with .Values.serviceAccount.metadata.annotations }}
+  annotations:
+    {{- toYaml . | nindent 4 }}
+  {{- end }}
+{{ end }}
\ No newline at end of file
+suite: Spark History Service Account Test
+templates:
+  - serviceaccount.yaml
+tests:
+  - it: ServiceAccount does not exist by default
+    asserts:
+      - hasDocuments:
+          count: 0
+  - it: ServiceAccount should include appropriate default values if created
+    set:
+      serviceAccount:
+        create: true
+    asserts:
+      - containsDocument:
+          kind: ServiceAccount
+          apiVersion: v1
+      - equal:
+          path: metadata.name
+          value: aissemble-spark-history-chart-sa
+      - notExists:
+          path: metadata.annotations
+  - it: Should set values appropriately for the service account
+    set:
+      serviceAccount:
+        create: true
+        name: test
+        metadata:
+          namespace: unit-test
+          annotations:
+            eks.amazonaws.com/role-arn: arn:aws:iam::111222333444:role/test-access-role
+    asserts:
+      - equal:
+          path: metadata.name
+          value: test
+      - equal:
+          path: metadata.namespace
+          value: unit-test
+      - equal:
+          path: metadata.annotations["eks.amazonaws.com/role-arn"]
+          value: arn:aws:iam::111222333444:role/test-access-role
+  - it: Service account name uses Chart name if not set
+    set:
+      serviceAccount:
+        create: true
+        name: ""
+    asserts:
+      - equal:
+          path: metadata.name
+          value: aissemble-spark-history-chart
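The suite above covers only the ServiceAccount template. A complementary check on the deployment side might look like the sketch below. It is hypothetical: it assumes the deployment template file is named `deployment.yaml`, that it renders a single Deployment document, and that `serviceAccountName` lands in the pod spec. None of these is shown in this change, and the service account names are placeholders.

```yaml
suite: Spark History Deployment Service Account Test (hypothetical)
templates:
  - deployment.yaml   # assumed file name
tests:
  - it: Uses deployment.serviceAccountName when both options are set
    set:
      serviceAccount:
        create: true
        name: created-sa
      deployment:
        serviceAccountName: existing-sa
    asserts:
      - equal:
          path: spec.template.spec.serviceAccountName
          value: existing-sa
  - it: Falls back to serviceAccount.name when only create is set
    set:
      serviceAccount:
        create: true
        name: created-sa
    asserts:
      - equal:
          path: spec.template.spec.serviceAccountName
          value: created-sa
```

Together the two cases exercise both branches of the `if`/`else if` block added to the deployment template.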
volumePathOnNode: /tmp
sparkConf: |-
+serviceAccount:
+  create: false
+  name: "aissemble-spark-history-chart-sa"
+  metadata:
+    annotations: {}
# Properties
-| Property | Description | Default |
-| app.name | Sets label for app.kubernetes.io/name | Chart.Name (aissemble-thrift-server-chart) |
-| enable | Enable or disable the entirety of the spark-thrift-server deployment. When false, equivalent to not installing the chart. | true |
-| deployment.annotations | Annotations to apply to the Spark Thrift Server Deployment. | {} |
-| deployment.labels | Labels to apply to the Spark Thrift Server Deployment. | {} |
-| deployment.replicas | Number of replicas for the Spark Thrift Server Deployment. | 1 |
-| deployment.image.repository | Repository for the Spark Thrift Server image. | "apache/spark" |
-| deployment.image.tag | Tag for the Spark Thrift Server image. | "3.5.1" |
-| deployment.image.imagePullPolicy | Image pull policy for the Spark Thrift Server image. | "IfNotPresent" |
-| deployment.command | Command to run in the container. | `["/opt/spark/sbin/start-thriftserver.sh"]` |
-| deployment.env | Environment variables to set in the Spark Thrift Server Deployment. | `SPARK_NO_DAEMONIZE: "true"` |
-| deployment.envFromSecret | Environment variables to pull from a Secret. Format:
`ENV_VAR.secretName: k8s_secret_name`
`ENV_VAR.key: k8s_secret_key` | {} |
-| deployment.volumes | Volumes to attach to the Spark Thrift Server Deployment. | [] |
-| deployment.volumeMounts | Volume mounts to attach to the Spark Thrift Server Deployment. | [] |
-| dependencies.packages | List of packages to install in the Spark Thrift Server Deployment. | [] |
-| dependencies.jars | List of jars to install in the Spark Thrift Server Deployment. | [] |
-| ingress.enabled | Enable or disable the Spark Thrift Server Ingress. | false |
-| ingress.metadata.annotations | Annotations to apply to the Spark Thrift Server Ingress. | {} |
-| ingress.ingressClassName | Ingress class to use for the Spark Thrift Server Ingress. | "nginx" |
-| ingress.hosts | Hosts to apply to the Spark Thrift Server Ingress. | `[paths: []]` |
-| service.annotations | Annotations to apply to the Spark Thrift Server Service. | {} |
-| service.type | Type of service to create for the Spark Thrift Server. | "ClusterIP" |
-| service.ports | Name of the service port. | `[{name: "thrift", port: 10000}, {name: "thrift-http", port: 10001}]` |
-| sparkConf | Configuration for the Spark Thrift Server. | "" |
-| hiveSite | Configuration for the Hive Site. | "" |
+| Property | Description | Default |
+| app.name | Sets label for app.kubernetes.io/name | Chart.Name (aissemble-thrift-server-chart) |
+| enable | Enable or disable the entirety of the spark-thrift-server deployment. When false, equivalent to not installing the chart. | true |
+| deployment.annotations | Annotations to apply to the Spark Thrift Server Deployment. | {} |
+| deployment.labels | Labels to apply to the Spark Thrift Server Deployment. | {} |
+| deployment.replicas | Number of replicas for the Spark Thrift Server Deployment. | 1 |
+| deployment.image.repository | Repository for the Spark Thrift Server image. | "apache/spark" |
+| deployment.image.tag | Tag for the Spark Thrift Server image. | "3.5.1" |
+| deployment.image.imagePullPolicy | Image pull policy for the Spark Thrift Server image. | "IfNotPresent" |
+| deployment.command | Command to run in the container. | `["/opt/spark/sbin/start-thriftserver.sh"]` |
+| deployment.env | Environment variables to set in the Spark Thrift Server Deployment. | `SPARK_NO_DAEMONIZE: "true"` |
+| deployment.envFromSecret | Environment variables to pull from a Secret. Format: <br/>`ENV_VAR.secretName: k8s_secret_name`<br/>`ENV_VAR.key: k8s_secret_key` | {} |
+| deployment.volumes | Volumes to attach to the Spark Thrift Server Deployment. | [] |
+| deployment.volumeMounts | Volume mounts to attach to the Spark Thrift Server Deployment. | [] |
+| deployment.serviceAccountName | Name of the service account to use for the deployment. Takes precedence over `serviceAccount.name`. | "" |
+| dependencies.packages | List of packages to install in the Spark Thrift Server Deployment. | [] |
+| dependencies.jars | List of jars to install in the Spark Thrift Server Deployment. | [] |
+| ingress.enabled | Enable or disable the Spark Thrift Server Ingress. | false |
+| ingress.metadata.annotations | Annotations to apply to the Spark Thrift Server Ingress. | {} |
+| ingress.ingressClassName | Ingress class to use for the Spark Thrift Server Ingress. | "nginx" |
+| ingress.hosts | Hosts to apply to the Spark Thrift Server Ingress. | `[paths: []]` |
+| service.annotations | Annotations to apply to the Spark Thrift Server Service. | {} |
+| service.type | Type of service to create for the Spark Thrift Server. | "ClusterIP" |
+| service.ports | Name of the service port. | `[{name: "thrift", port: 10000}, {name: "thrift-http", port: 10001}]` |
+| sparkConf | Configuration for the Spark Thrift Server. | "" |
+| hiveSite | Configuration for the Hive Site. | "" |
+| serviceAccount.create | When true, creates a service account named `serviceAccount.name` and assigns it to the deployment. If `deployment.serviceAccountName` is also set, that value takes precedence: the service account is still created, but it is not assigned to the deployment. | false |
+| serviceAccount.name | Service account name | aissemble-thrift-server-chart-sa |
+| serviceAccount.metadata.annotations | Service account annotations | {} |
# Migration from aiSSEMBLE v1 Helm Charts
@@ -76,3 +80,29 @@ The following properties no longer exist.
| Property | Reason |
| status | This property was ignored in the original chart by default |
+# Cloud Services Authentication
+This chart supports authentication with AWS via environment variables and [IRSA (IAM Roles for Service Accounts)](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html).
+IRSA is the recommended way to authenticate with AWS services.
+**Note 1**: To use IRSA authentication, first create the OIDC provider and the IAM role, then use this chart's `serviceAccount` configuration to create the service account,
+setting the IAM role ARN in `metadata.annotations`. For example:
+    serviceAccount:
+      name: replace-with-sa-name
+      create: true
+      metadata:
+        annotations:
+          eks.amazonaws.com/role-arn: arn:aws:iam::replace-with-aws-id:role/replace-with-iam-role-name
+**Note 2**: If a service account configured with an IAM role already exists, for example one created by another chart, this chart can use it directly. For example:
+    deployment:
+      serviceAccountName: replace-with-sa-name
{{ toYaml .Values.deployment.labels }}
{{- end }}
+ {{- if .Values.deployment.serviceAccountName }}
+ serviceAccountName: {{ .Values.deployment.serviceAccountName }}
+ {{- else if .Values.serviceAccount.create }}
+ serviceAccountName: {{ .Values.serviceAccount.name | default .Chart.Name }}
+ {{- end }}
{{- if or (not (empty .Values.dependencies.packages)) (not (empty .Values.dependencies.jars)) }}
- name: "populate-thrift-service-jar-volume"
+{{ if .Values.serviceAccount.create }}
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: {{ .Values.serviceAccount.name | default .Chart.Name }}
+  {{ $otherdata := omit .Values.serviceAccount.metadata "name" "annotations" }}
+  {{- range $key, $value := $otherdata }}
+  {{ $key }}: {{ $value }}
+  {{- end }}
+  {{- with .Values.serviceAccount.metadata.annotations }}
+  annotations:
+    {{- toYaml . | nindent 4 }}
+  {{- end }}
+{{ end }}
\ No newline at end of file
+suite: Thrift Server Service Account Test
+templates:
+  - serviceaccount.yaml
+tests:
+  - it: ServiceAccount does not exist by default
+    asserts:
+      - hasDocuments:
+          count: 0
+  - it: ServiceAccount should include appropriate default values if created
+    set:
+      serviceAccount:
+        create: true
+    asserts:
+      - containsDocument:
+          kind: ServiceAccount
+          apiVersion: v1
+      - equal:
+          path: metadata.name
+          value: aissemble-thrift-server-chart-sa
+      - notExists:
+          path: metadata.annotations
+  - it: Should set values appropriately for the service account
+    set:
+      serviceAccount:
+        create: true
+        name: test
+        metadata:
+          namespace: unit-test
+          annotations:
+            eks.amazonaws.com/role-arn: arn:aws:iam::111222333444:role/test-access-role
+    asserts:
+      - equal:
+          path: metadata.name
+          value: test
+      - equal:
+          path: metadata.namespace
+          value: unit-test
+      - equal:
+          path: metadata.annotations["eks.amazonaws.com/role-arn"]
+          value: arn:aws:iam::111222333444:role/test-access-role
+  - it: Service account name uses Chart name if not set
+    set:
+      serviceAccount:
+        create: true
+        name: ""
+    asserts:
+      - equal:
+          path: metadata.name
+          value: aissemble-thrift-server-chart
sparkConf: |-
hiveSite: |-
+serviceAccount:
+  create: false
+  name: "aissemble-thrift-server-chart-sa"
+  metadata:
+    annotations: {}
- org.apache.hadoop:hadoop-aws:3.3.4
- deployment:
- envFromSecret:
- secretName: remote-auth-config
- secretName: remote-auth-config
sparkConf: |
- spark.hadoop.fs.s3a.access.key=#[[${env:AWS_ACCESS_KEY_ID}]]#
- spark.hadoop.fs.s3a.secret.key=#[[${env:AWS_SECRET_ACCESS_KEY}]]#
@@ -55,18 +44,6 @@ aissemble-hive-metastore-service-chart:
replicationPassword: hive
password: hive
- deployment:
- env:
- valueFrom:
- secretKeyRef:
- name: remote-auth-config
- valueFrom:
- secretKeyRef:
- name: remote-auth-config