Commit

Merge pull request #45 from data-platform-hq/cluster-support-single-node
feat: support single node clusters
owlleg6 authored Mar 11, 2024
2 parents 95ec206 + 2936d1d commit c2361a6
Showing 4 changed files with 32 additions and 15 deletions.
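
For orientation, here is a minimal sketch of how a consumer might opt into the new behavior once this change lands (the module source and all values are hypothetical; `single_node_enable` and `custom_tags` are the attributes this PR introduces):

```hcl
module "databricks_clusters" {
  # Hypothetical source; point this at your copy of the module.
  source = "./modules/databricks-runtime"

  clusters = [
    {
      cluster_name       = "dev-single-node"
      single_node_enable = true              # new in this PR; defaults to false
      custom_tags        = { "env" = "dev" } # new in this PR; merged with ResourceClass = "SingleNode"
    }
  ]
}
```

With `single_node_enable = true`, the module omits the `autoscale` block and injects the single node Spark configuration shown in the `cluster.tf` diff below.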
6 changes: 3 additions & 3 deletions README.md
@@ -164,14 +164,14 @@ module "metastore_assignment" {
 |------|---------|
 | <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >=1.0.0 |
 | <a name="requirement_azurerm"></a> [azurerm](#requirement\_azurerm) | >=3.40.0 |
-| <a name="requirement_databricks"></a> [databricks](#requirement\_databricks) | >=1.14.2 |
+| <a name="requirement_databricks"></a> [databricks](#requirement\_databricks) | >=1.30.0 |
 
 ## Providers
 
 | Name | Version |
 |------|---------|
 | <a name="provider_azurerm"></a> [azurerm](#provider\_azurerm) | >=3.40.0 |
-| <a name="provider_databricks"></a> [databricks](#provider\_databricks) | >=1.14.2 |
+| <a name="provider_databricks"></a> [databricks](#provider\_databricks) | >=1.30.0 |
 
 ## Modules
 
@@ -213,7 +213,7 @@ No modules.
 
 | Name | Description | Type | Default | Required |
 |------|-------------|------|---------|:--------:|
-| <a name="input_clusters"></a> [clusters](#input\_clusters) | Set of objects with parameters to configure Databricks clusters and assign permissions to it for certain custom groups | <pre>set(object({<br> cluster_name = string<br> spark_version = optional(string, "13.3.x-scala2.12")<br> spark_conf = optional(map(any), {})<br> cluster_conf_passthrought = optional(bool, false)<br> spark_env_vars = optional(map(any), {})<br> data_security_mode = optional(string, "USER_ISOLATION")<br> node_type_id = optional(string, "Standard_D3_v2")<br> autotermination_minutes = optional(number, 30)<br> min_workers = optional(number, 1)<br> max_workers = optional(number, 2)<br> availability = optional(string, "ON_DEMAND_AZURE")<br> first_on_demand = optional(number, 0)<br> spot_bid_max_price = optional(number, 1)<br> cluster_log_conf_destination = optional(string, null)<br> init_scripts_workspace = optional(set(string), [])<br> init_scripts_volumes = optional(set(string), [])<br> init_scripts_dbfs = optional(set(string), [])<br> init_scripts_abfss = optional(set(string), [])<br> single_user_name = optional(string, null)<br> permissions = optional(set(object({<br> group_name = string<br> permission_level = string<br> })), [])<br> pypi_library_repository = optional(set(string), [])<br> maven_library_repository = optional(set(object({<br> coordinates = string<br> exclusions = set(string)<br> })), [])<br> }))</pre> | `[]` | no |
+| <a name="input_clusters"></a> [clusters](#input\_clusters) | Set of objects with parameters to configure Databricks clusters and assign permissions to it for certain custom groups | <pre>set(object({<br> cluster_name = string<br> spark_version = optional(string, "13.3.x-scala2.12")<br> spark_conf = optional(map(any), {})<br> cluster_conf_passthrought = optional(bool, false)<br> spark_env_vars = optional(map(any), {})<br> data_security_mode = optional(string, "USER_ISOLATION")<br> node_type_id = optional(string, "Standard_D3_v2")<br> autotermination_minutes = optional(number, 30)<br> min_workers = optional(number, 1)<br> max_workers = optional(number, 2)<br> availability = optional(string, "ON_DEMAND_AZURE")<br> first_on_demand = optional(number, 0)<br> spot_bid_max_price = optional(number, 1)<br> cluster_log_conf_destination = optional(string, null)<br> init_scripts_workspace = optional(set(string), [])<br> init_scripts_volumes = optional(set(string), [])<br> init_scripts_dbfs = optional(set(string), [])<br> init_scripts_abfss = optional(set(string), [])<br> single_user_name = optional(string, null)<br> single_node_enable = optional(bool, false)<br> custom_tags = optional(map(string), {})<br> permissions = optional(set(object({<br> group_name = string<br> permission_level = string<br> })), [])<br> pypi_library_repository = optional(set(string), [])<br> maven_library_repository = optional(set(object({<br> coordinates = string<br> exclusions = set(string)<br> })), [])<br> }))</pre> | `[]` | no |
 | <a name="input_create_databricks_access_policy_to_key_vault"></a> [create\_databricks\_access\_policy\_to\_key\_vault](#input\_create\_databricks\_access\_policy\_to\_key\_vault) | Boolean flag to enable creation of Key Vault Access Policy for Databricks Global Service Principal. | `bool` | `true` | no |
 | <a name="input_custom_cluster_policies"></a> [custom\_cluster\_policies](#input\_custom\_cluster\_policies) | Provides an ability to create custom cluster policy, assign it to cluster and grant CAN\_USE permissions on it to certain custom groups<br>name - name of custom cluster policy to create<br>can\_use - list of string, where values are custom group names, there groups have to be created with Terraform;<br>definition - JSON document expressed in Databricks Policy Definition Language. No need to call 'jsonencode()' function on it when providing a value; | <pre>list(object({<br> name = string<br> can_use = list(string)<br> definition = any<br> }))</pre> | <pre>[<br> {<br> "can_use": null,<br> "definition": null,<br> "name": null<br> }<br>]</pre> | no |
 | <a name="input_global_databricks_sp_object_id"></a> [global\_databricks\_sp\_object\_id](#input\_global\_databricks\_sp\_object\_id) | Global 'AzureDatabricks' SP object id. Used to create Key Vault Access Policy for Secret Scope | `string` | `"9b38785a-6e08-4087-a0c4-20634343f21f"` | no |
37 changes: 26 additions & 11 deletions cluster.tf
@@ -1,31 +1,46 @@
+locals {
+  spark_conf_single_node = {
+    "spark.master"                     = "local[*]"
+    "spark.databricks.cluster.profile" = "singleNode"
+  }
+  conf_passthrought = {
+    "spark.databricks.cluster.profile" : "serverless",
+    "spark.databricks.repl.allowedLanguages" : "python,sql",
+    "spark.databricks.passthrough.enabled" : "true",
+    "spark.databricks.pyspark.enableProcessIsolation" : "true"
+  }
+}
+
 resource "databricks_cluster" "cluster" {
   for_each = { for cluster in var.clusters : cluster.cluster_name => cluster }
 
   cluster_name  = each.value.cluster_name
   spark_version = each.value.spark_version
-  spark_conf = each.value.cluster_conf_passthrought ? merge({
-    "spark.databricks.cluster.profile" : "serverless",
-    "spark.databricks.repl.allowedLanguages" : "python,sql",
-    "spark.databricks.passthrough.enabled" : "true",
-    "spark.databricks.pyspark.enableProcessIsolation" : "true"
-  }, each.value.spark_conf) : each.value.spark_conf
+  spark_conf = merge(
+    each.value.cluster_conf_passthrought ? local.conf_passthrought : {},
+    each.value.single_node_enable == true ? local.spark_conf_single_node : {},
+    each.value.spark_conf)
   spark_env_vars          = each.value.spark_env_vars
   data_security_mode      = each.value.cluster_conf_passthrought ? null : each.value.data_security_mode
   node_type_id            = each.value.node_type_id
   autotermination_minutes = each.value.autotermination_minutes
   single_user_name        = each.value.single_user_name
 
-  autoscale {
-    min_workers = each.value.min_workers
-    max_workers = each.value.max_workers
-  }
+  custom_tags = merge(each.value.single_node_enable ? { "ResourceClass" = "SingleNode" } : {}, each.value.custom_tags)
 
   azure_attributes {
     availability       = each.value.availability
     first_on_demand    = each.value.first_on_demand
     spot_bid_max_price = each.value.spot_bid_max_price
   }
 
+  dynamic "autoscale" {
+    for_each = each.value.single_node_enable ? [] : [1]
+    content {
+      min_workers = each.value.min_workers
+      max_workers = each.value.max_workers
+    }
+  }
 
   dynamic "cluster_log_conf" {
     for_each = each.value.cluster_log_conf_destination != null ? [each.value.cluster_log_conf_destination] : []
     content {
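
A note on the `merge()` calls above: later arguments take precedence, so a caller's `spark_conf` (and likewise `custom_tags`) can still override the keys the module injects. A sketch of what the expression evaluates to for a single node cluster with passthrough disabled (the `spark.executor.memory` entry is a made-up user value):

```hcl
locals {
  # Mirrors the spark_conf expression above for:
  #   cluster_conf_passthrought = false, single_node_enable = true
  effective_spark_conf = merge(
    {}, # passthrough disabled -> contributes nothing
    {   # local.spark_conf_single_node
      "spark.master"                     = "local[*]"
      "spark.databricks.cluster.profile" = "singleNode"
    },
    { "spark.executor.memory" = "4g" } # hypothetical user-supplied spark_conf
  )
  # Result: all three keys, with the user-supplied map winning any conflicts.
}
```

The `dynamic "autoscale"` block iterates over an empty list when `single_node_enable` is true, so no `autoscale` block is rendered at all; together with the `singleNode` profile, `local[*]` master, and the `ResourceClass = "SingleNode"` tag, this matches what Databricks expects for a cluster with no workers.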
2 changes: 2 additions & 0 deletions variables.tf
@@ -166,6 +166,8 @@ variable "clusters" {
     init_scripts_dbfs            = optional(set(string), [])
     init_scripts_abfss           = optional(set(string), [])
     single_user_name             = optional(string, null)
+    single_node_enable           = optional(bool, false)
+    custom_tags                  = optional(map(string), {})
     permissions = optional(set(object({
       group_name       = string
       permission_level = string
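
Both new attributes are optional with backward-compatible defaults, so existing `clusters` definitions plan cleanly without changes. An illustrative tfvars entry exercising them (all values are examples):

```hcl
clusters = [
  {
    cluster_name       = "etl-single-node"
    node_type_id       = "Standard_D4s_v5" # example VM size
    single_node_enable = true              # omit or set false for a classic autoscaling cluster
    custom_tags        = { "CostCenter" = "1234" }
  }
]
```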
2 changes: 1 addition & 1 deletion versions.tf
@@ -8,7 +8,7 @@ terraform {
     }
     databricks = {
       source  = "databricks/databricks"
-      version = ">=1.14.2"
+      version = ">=1.30.0"
     }
   }
 }
