# Terraform module for Databricks Workspace Management (Part 2)
❗️ Important
👉 This module assumes you already have a Databricks workspace (AWS or Azure) deployed.
👉 Workspace URL
👉 DAPI Token
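
For reference, a minimal provider configuration using the workspace URL and DAPI token might look like the following (a sketch; the host and token values are placeholders):

```hcl
# Minimal Databricks provider configuration (values are placeholders)
provider "databricks" {
  host  = "https://<workspace_url>"
  token = "<dapi_token>"
}
```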
- Module tested with Terraform 1.0.1.
- `databricks/databricks` provider version 1.3.1 and AWS provider version 4.14.
- `main` branch: provider versions are not pinned, to keep up with Terraform releases.
- `tags` releases: tags are pinned to provider versions (use the latest tag).
- This is where you would normally start if you have just deployed your Databricks workspace.
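
As a sketch of the pinning guidance above, a module call against a tagged release might look like this (the source URL, tag, and input values are placeholders, not the module's published address):

```hcl
module "databricks_workspace_management" {
  # Pin to a tagged release rather than the main branch
  source = "git::https://github.com/<org>/<repo>.git?ref=<tag>"

  # Required inputs (see the Inputs table below)
  teamid = "demo-team"
  prjid  = "demo-project"
}
```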
Two cluster modes are supported by this module:

- `Single Node` mode: to deploy a cluster in Single Node mode, set `fixed_value` to `0`:

  ```hcl
  fixed_value = 0
  ```

- `Standard` mode: to deploy a cluster in Standard mode, two options are available:

  ```hcl
  fixed_value = 1 # or more
  ```

  OR

  ```hcl
  auto_scaling = [1, 3]
  ```
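
Putting this together, a Standard-mode autoscaling cluster might be configured as follows (a sketch using inputs from the table below; all values are illustrative):

```hcl
deploy_cluster                  = true
cluster_name                    = "demo-cluster"
auto_scaling                    = [1, 3] # min and max workers
cluster_autotermination_minutes = 30
```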
Cluster can have one of these permissions: `CAN_ATTACH_TO`, `CAN_RESTART`, and `CAN_MANAGE`.

```hcl
cluster_access_control = [
  {
    group_name       = "<group_name>"
    permission_level = "CAN_RESTART"
  },
  {
    user_name        = "<user_name>"
    permission_level = "CAN_RESTART"
  }
]
```
- To build a cluster with a new cluster policy, use:

```hcl
deploy_cluster_policy = true
policy_overrides = {
  "dbus_per_hour" : {
    "type" : "range",
    "maxValue" : 10
  },
  "autotermination_minutes" : {
    "type" : "fixed",
    "value" : 30,
    "hidden" : true
  }
}
```
- To use an existing cluster policy, specify the existing policy id:

```hcl
cluster_policy_id = "E0123456789"
```

To get the existing policy id, use:

```sh
curl -X GET --header "Authorization: Bearer $DAPI_TOKEN" https://<workspace_name>/api/2.0/policies/clusters/list \
  --data '{ "sort_order": "DESC", "sort_column": "POLICY_CREATION_TIME" }'
```
```hcl
policy_access_control = [
  {
    group_name       = "<group_name>"
    permission_level = "CAN_USE"
  },
  {
    user_name        = "<user_name>"
    permission_level = "CAN_USE"
  }
]
```
Note: To configure an Instance Pool, add the below configuration:

```hcl
deploy_worker_instance_pool           = true
min_idle_instances                    = 1
max_capacity                          = 5
idle_instance_autotermination_minutes = 30
```
Instance pool can have one of these permissions: `CAN_ATTACH_TO` and `CAN_MANAGE`.

```hcl
instance_pool_access_control = [
  {
    group_name       = "<group_name>"
    permission_level = "CAN_ATTACH_TO"
  },
  {
    user_name        = "<user_name>"
    permission_level = "CAN_ATTACH_TO"
  },
]
```
❗️ Important

If `deploy_worker_instance_pool` is set to `true` and `auto_scaling` is enabled, ensure that the `max_capacity` of the Cluster Instance Pool is greater than the `auto_scaling` max value for the Cluster.
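
For example (a sketch; values are illustrative), the following keeps the cluster's autoscaling maximum within the pool's capacity:

```hcl
deploy_worker_instance_pool = true
min_idle_instances          = 1
max_capacity                = 5      # pool capacity
auto_scaling                = [1, 3] # cluster max (3) stays below max_capacity (5)
```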
Two options are available to deploy a job:

- Deploy the job to an existing cluster (see the sketch after these lists).
- Deploy a new cluster and then deploy the job.

Two options are available to attach notebooks to a job:

- Attach an existing notebook to the job.
- Create a new notebook and attach it to the job.
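
As an illustration of the first option in each list (a sketch; the shape of the `remote_notebooks` entry is assumed, not taken from the module docs — see the Inputs table):

```hcl
deploy_jobs = true
cluster_id  = "<existing_cluster_id>" # attach the job to an existing cluster

# Hypothetical entry pointing at a notebook already in the workspace
remote_notebooks = [
  {
    path = "/Shared/demo/sample1.py"
  }
]
```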
Job can have one of these permissions: `CAN_VIEW`, `CAN_MANAGE_RUN`, `IS_OWNER`, and `CAN_MANAGE`.

Admins have `CAN_MANAGE` permission by default, and they can assign that permission to non-admin users and service principals.

The job creator has `IS_OWNER` permission. Destroying the `databricks_permissions` resource for a job would revert ownership to the creator.

Note:

- A job must have exactly one owner. If the resource is changed and no owner is specified, the currently authenticated principal becomes the new owner of the job.
- A job cannot have a group as an owner.
- Jobs triggered through Run Now assume the permissions of the job owner, not of the user or service principal who issued Run Now.
```hcl
jobs_access_control = [
  {
    group_name       = "<group_name>"
    permission_level = "CAN_MANAGE_RUN"
  },
  {
    user_name        = "<user_name>"
    permission_level = "CAN_MANAGE_RUN"
  }
]
```
Add an instance profile at cluster creation time. It can control which data a given cluster can access through cloud-native controls.

```hcl
add_instance_profile_to_workspace = true # default: false
aws_attributes = {
  instance_profile_arn = "arn:aws:iam::123456789012:instance-profile/aws-instance-role"
}
```

Note: Set `add_instance_profile_to_workspace` to `true` to add the instance profile to the Databricks workspace. To use an existing instance profile, set it to `false`.
Put notebooks in the notebooks folder and provide the below information:

```hcl
notebooks = [
  {
    name       = "demo_notebook1"
    language   = "PYTHON"
    local_path = "notebooks/sample1.py"
    path       = "/Shared/demo/sample1.py"
  },
  {
    name       = "demo_notebook2"
    local_path = "notebooks/sample2.py"
  }
]
```
Notebook can have one of these permissions: `CAN_READ`, `CAN_RUN`, `CAN_EDIT`, and `CAN_MANAGE`.

```hcl
notebooks_access_control = [
  {
    group_name       = "<group_name>"
    permission_level = "CAN_MANAGE"
  },
  {
    user_name        = "<user_name>"
    permission_level = "CAN_MANAGE"
  }
]
```
- Try this if you want to test what resources get deployed:

```sh
terraform init
terraform plan -var='teamid=tryme' -var='prjid=project'
terraform apply -var='teamid=tryme' -var='prjid=project'
terraform destroy -var='teamid=tryme' -var='prjid=project'
```

Note: With this option, you need to take care of remote state storage yourself (see the backend sketch below).
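
For instance, a standard S3 backend block could handle the remote state (a generic Terraform sketch; bucket, key, and region are placeholders):

```hcl
terraform {
  backend "s3" {
    bucket = "<state-bucket>"
    key    = "databricks/terraform.tfstate"
    region = "us-east-1"
  }
}
```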
- Create a Python 3.8+ virtual environment:

  ```sh
  python3 -m venv <venv name>
  ```

- Install the package:

  ```sh
  pip install tfremote
  ```

- Set the below environment variables based on your cloud provider (see the example after this list).

- Update the `examples` directory with the required values.
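
For AWS, the environment variables look roughly like the following (variable names as I recall them from the tfremote documentation; verify against its README before use):

```sh
export TF_AWS_BUCKET=<remote state bucket name>
export TF_AWS_BUCKET_REGION=us-west-2
export TF_AWS_PROFILE=default
```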
NOTE:

- Read more on tfremote.

Please refer to the examples directory for references.
- Databricks Sync: tool for multi-cloud migrations and DR sync of workspaces. It uses Terraform in the backend. Run it from the command line or from a notebook.
- Databricks Migrate: tool to migrate a workspace (one-time tool).
- Databricks CICD Templates
If you see the error messages below, try running the same command again.

```
Error: Failed to delete token in Scope <scope name>
Error: Scope <scope name> does not exist!
```
## Requirements

| Name | Version |
|------|---------|
| terraform | >= 1.0.1 |
| aws | >= 4.14 |
| databricks | >= 0.5.7 |

## Providers

| Name | Version |
|------|---------|
| databricks | >= 0.5.7 |

## Modules

No modules.
## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| add_instance_profile_to_workspace | Existing AWS instance profile ARN | bool | false | no |
| allow_cluster_create | This is a field to allow the group to have cluster create privileges. More fine-grained permissions could be assigned with databricks_permissions and the cluster_id argument. Everyone without the allow_cluster_create argument set, but with permission to use a Cluster Policy, would be able to create clusters, but within the boundaries of that specific policy. | bool | true | no |
| allow_instance_pool_create | This is a field to allow the group to have instance pool create privileges. More fine-grained permissions could be assigned with databricks_permissions and the instance_pool_id argument. | bool | true | no |
| always_running | Whenever the job is always running, like a Spark Streaming application, restart the current active run on every update, or start it again if it is not running. False by default. | bool | false | no |
| auto_scaling | Number of min and max workers in auto scale. | list(any) | null | no |
| aws_attributes | Optional configuration block containing attributes related to clusters running on AWS. | any | null | no |
| azure_attributes | Optional configuration block containing attributes related to clusters running on Azure. | any | null | no |
| category | Node category, which can be one of: General purpose, Memory optimized, Storage optimized, Compute optimized, GPU | string | "General purpose" | no |
| cluster_access_control | Cluster access control | any | null | no |
| cluster_autotermination_minutes | Cluster auto-termination duration | number | 30 | no |
| cluster_id | Existing cluster id | string | null | no |
| cluster_name | Cluster name | string | null | no |
| cluster_policy_id | Existing cluster policy id | string | null | no |
| create_group | Create a new group; if the group already exists, the deployment will fail. | bool | false | no |
| create_user | Create a new user; if the user already exists, the deployment will fail. | bool | false | no |
| custom_tags | Extra custom tags | any | null | no |
| data_security_mode | Access mode | string | "NONE" | no |
| databricks_username | User allowed to access the platform. | string | "" | no |
| deploy_cluster | Feature flag, true or false | bool | false | no |
| deploy_cluster_policy | Feature flag, true or false | bool | false | no |
| deploy_driver_instance_pool | Driver instance pool | bool | false | no |
| deploy_job_cluster | Feature flag, true or false | bool | false | no |
| deploy_jobs | Feature flag, true or false | bool | false | no |
| deploy_worker_instance_pool | Worker instance pool | bool | false | no |
| driver_node_type_id | The node type of the Spark driver. This field is optional; if unset, the API will set the driver node type to the same value as node_type_id. | string | null | no |
| email_notifications | Email notification block. | any | null | no |
| fixed_value | Number of nodes in the cluster. | number | 0 | no |
| gb_per_core | Number of gigabytes per core available on instance. Conflicts with min_memory_gb. Defaults to 0. | string | 0 | no |
| gcp_attributes | Optional configuration block containing attributes related to clusters running on GCP. | any | null | no |
| gpu | GPU required or not. | bool | false | no |
| idle_instance_autotermination_minutes | Idle instance auto-termination duration | number | 20 | no |
| instance_pool_access_control | Instance pool access control | any | null | no |
| jobs_access_control | Jobs access control | any | null | no |
| libraries | Installs a library on databricks_cluster | map(any) | {} | no |
| local_disk | Pick only nodes with local storage. Defaults to false. | string | true | no |
| local_notebooks | Local path to the notebook(s) that will be used by the job | any | [] | no |
| max_capacity | Instance pool maximum capacity | number | 3 | no |
| max_concurrent_runs | An optional maximum allowed number of concurrent runs of the job. | number | null | no |
| max_retries | An optional maximum number of times to retry an unsuccessful run. A run is considered to be unsuccessful if it completes with a FAILED result_state or INTERNAL_ERROR life_cycle_state. The value -1 means to retry indefinitely and the value 0 means to never retry. The default behavior is to never retry. | number | 0 | no |
| min_cores | Minimum number of CPU cores available on instance. Defaults to 0. | string | 0 | no |
| min_gpus | Minimum number of GPUs attached to instance. Defaults to 0. | string | 0 | no |
| min_idle_instances | Instance pool minimum idle instances | number | 1 | no |
| min_memory_gb | Minimum amount of memory per node in gigabytes. Defaults to 0. | string | 0 | no |
| min_retry_interval_millis | An optional minimal interval in milliseconds between the start of the failed run and the subsequent retry run. The default behavior is that unsuccessful runs are immediately retried. | number | null | no |
| ml | ML required or not. | bool | false | no |
| notebooks | Local path to the notebook(s) that will be deployed | any | [] | no |
| notebooks_access_control | Notebook access control | any | null | no |
| policy_access_control | Policy access control | any | null | no |
| policy_overrides | Cluster policy overrides | any | null | no |
| prjid | (Required) Name of the project/stack, e.g.: mystack, nifieks, demoaci. Should not be changed after running 'tf apply' | string | n/a | yes |
| remote_notebooks | Path to notebook(s) in the Databricks workspace that will be used by the job | any | [] | no |
| retry_on_timeout | An optional policy to specify whether to retry a job when it times out. The default behavior is to not retry on timeout. | bool | false | no |
| schedule | Job schedule configuration. | map(any) | null | no |
| spark_conf | Map with key-value pairs to fine-tune Spark clusters, where you can provide custom Spark configuration properties in a cluster configuration. | any | null | no |
| spark_env_vars | Map with environment variable key-value pairs to fine-tune Spark clusters. Key-value pairs of the form (X,Y) are exported (i.e., X='Y') while launching the driver and workers. | any | null | no |
| spark_version | Runtime version of the cluster. Any supported databricks_spark_version id. We advise using Cluster Policies to restrict the list of versions for simplicity while maintaining enough control. | string | null | no |
| task_parameters | Base parameters to be used for each run of this job. | map(any) | {} | no |
| teamid | (Required) Name of the team/group, e.g. devops, dataengineering. Should not be changed after running 'tf apply' | string | n/a | yes |
| timeout | An optional timeout applied to each run of this job. The default behavior is to have no timeout. | number | null | no |
| worker_node_type_id | The node type of the Spark worker. | string | null | no |
## Outputs

| Name | Description |
|------|-------------|
| cluster_id | databricks cluster id |
| cluster_name | databricks cluster name |
| cluster_policy_id | databricks cluster policy permissions |
| databricks_group | databricks group name |
| databricks_group_member | databricks group members |
| databricks_secret_acl | databricks secret acl |
| databricks_user | databricks user name |
| databricks_user_id | databricks user id |
| existing_cluster_new_job_existing_notebooks_id | databricks new cluster job id |
| existing_cluster_new_job_existing_notebooks_job | databricks new cluster job url |
| existing_cluster_new_job_new_notebooks_id | databricks new cluster job id |
| existing_cluster_new_job_new_notebooks_job | databricks new cluster job url |
| instance_profile | databricks instance profile ARN |
| new_cluster_new_job_existing_notebooks_id | databricks job id |
| new_cluster_new_job_existing_notebooks_job | databricks job url |
| new_cluster_new_job_new_notebooks_id | databricks job id |
| new_cluster_new_job_new_notebooks_job | databricks job url |
| notebook_url | databricks notebook url |
| notebook_url_standalone | databricks notebook url standalone |