An example of creating or updating a Google Cloud Data Catalog tag on BigQuery tables or views with dbt Cloud metadata, using a Python Cloud Function.
Data Catalog tag: the dbt Run Metadata tag is attached to a BigQuery table or view and contains information from the dbt run that created or updated it: run durations and date, dbt Project and Model, Cloud job, Cloud project, and approximate size and row count.
To enable, learn about, and use Cloud Data Catalog, go to https://cloud.google.com/data-catalog and https://console.cloud.google.com/datacatalog.
This repository contains the Cloud Function Python code to create or update the Data Catalog tag.
This Cloud Function uses:
In your Cloud Function, you need the following five files:
- main.py
- config.py, where you set your GCP project name (the project where the dbt Tag Template is created) and the dbt Auth Token (used to call the dbt Cloud API). You can also change the tag template ID if needed.
- datacatalog_functions.py
- dbt_metadata.py
- requirements.txt
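As an illustration of the settings described for config.py, here is a minimal sketch that reads them from environment variables instead of hard-coding the token. The names Config, load_config, GCP_PROJECT, DBT_AUTH_TOKEN and TAG_TEMPLATE_ID, as well as the default template ID, are hypothetical and are not taken from the repository's config.py.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Config:
    gcp_project: str          # project where the dbt tag template is created
    dbt_auth_token: str       # token used to call the dbt Cloud API
    tag_template_id: str      # ID of the dbt Run Metadata tag template


def load_config(env: Mapping[str, str] = os.environ) -> Config:
    """Build a Config from a mapping of settings (hypothetical variable names)."""
    return Config(
        gcp_project=env["GCP_PROJECT"],
        dbt_auth_token=env["DBT_AUTH_TOKEN"],
        # Hypothetical default; use the ID of the template you actually created.
        tag_template_id=env.get("TAG_TEMPLATE_ID", "dbt_run_metadata"),
    )


if __name__ == "__main__":
    demo = load_config({"GCP_PROJECT": "my-project", "DBT_AUTH_TOKEN": "secret"})
    print(demo.gcp_project, demo.tag_template_id)
```

Keeping the token out of the source file is a design choice for this sketch only; the repository's config.py may simply define the values as constants.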
Before running the Cloud Function (and creating or updating tags), you need to create the Data Catalog Tag Template for dbt Run Metadata.
You can use:
- Cloud Console, where you can manage your Tag Templates.
- gcloud, with the command gcloud data-catalog tag-templates create. The full command lines are in gcloud_dbt_tag_template_create.sh; more details with an example and reference. Be aware that with the gcloud command line you cannot control the order of the tag template fields: they will be in alphabetical order.
- REST API, with the tag template JSON file dbt_metadata_tag_template.json; more details with an example and reference.
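To make the REST option more concrete, the sketch below builds the URL and JSON body for a Data Catalog tagTemplates create request, setting an explicit order on each field (which the gcloud route does not let you control). The function name tag_template_request, the location, and the field IDs are hypothetical; the real field definitions are in dbt_metadata_tag_template.json. It assumes that fields with a higher order value are listed first, and it only prepares the request without sending it.

```python
import json
from urllib.parse import urlencode


def tag_template_request(project, location, template_id, display_name, fields):
    """Return (url, body) for a tag template create call.

    fields is a list of (field_id, display_name, primitive_type) tuples,
    given in the order they should be displayed.
    """
    url = (
        f"https://datacatalog.googleapis.com/v1/projects/{project}"
        f"/locations/{location}/tagTemplates?"
        + urlencode({"tagTemplateId": template_id})
    )
    count = len(fields)
    body = {
        "displayName": display_name,
        "fields": {
            field_id: {
                "displayName": name,
                "type": {"primitiveType": primitive_type},
                # Assumption: a higher "order" value is shown earlier.
                "order": count - index,
            }
            for index, (field_id, name, primitive_type) in enumerate(fields)
        },
    }
    return url, body


if __name__ == "__main__":
    # Hypothetical field IDs, for illustration only.
    url, body = tag_template_request(
        "my-project", "us-central1", "dbt_run_metadata", "dbt Run Metadata",
        [("dbt_project", "dbt Project", "STRING"),
         ("run_date", "Run date", "TIMESTAMP")],
    )
    print(url)
    print(json.dumps(body, indent=2))
```

The request would then be sent as an authenticated POST with the body serialized as JSON.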
To use the Cloud Function, pass the dbt Cloud Run ID and the dbt Cloud Account ID in JSON format, for example {"dbt_run_id":"13161733","dbt_account_id":"11442"}.
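As a rough illustration of how such a payload might be handled, the sketch below validates the two keys and builds the dbt Cloud API v2 URL and headers for fetching that run. The helper names parse_payload, run_url and auth_headers are hypothetical, and the host cloud.getdbt.com is an assumption (your account may use a different one); this is not the repository's main.py.

```python
import json
from typing import Dict, Tuple

DBT_CLOUD_API = "https://cloud.getdbt.com/api/v2"  # assumed dbt Cloud host


def parse_payload(raw: str) -> Tuple[str, str]:
    """Return (account_id, run_id) from the JSON request body."""
    data = json.loads(raw)
    missing = [key for key in ("dbt_run_id", "dbt_account_id") if not data.get(key)]
    if missing:
        raise ValueError("missing required key(s): " + ", ".join(missing))
    return str(data["dbt_account_id"]), str(data["dbt_run_id"])


def run_url(account_id: str, run_id: str) -> str:
    """URL of the run resource in the dbt Cloud API v2."""
    return f"{DBT_CLOUD_API}/accounts/{account_id}/runs/{run_id}/"


def auth_headers(token: str) -> Dict[str, str]:
    """Headers carrying the dbt auth token."""
    return {"Authorization": f"Token {token}", "Accept": "application/json"}


if __name__ == "__main__":
    payload = '{"dbt_run_id":"13161733","dbt_account_id":"11442"}'
    account_id, run_id = parse_payload(payload)
    print(run_url(account_id, run_id))
```

Rejecting a payload with a missing key up front avoids calling the dbt Cloud API with an incomplete URL.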
Once the Data Catalog tag template exists and a tag has been created or updated on BigQuery tables or views, you can see the results at https://console.cloud.google.com/datacatalog.
Finally, you can also search Cloud Data Catalog for BigQuery tables or views that carry a dbt tag from your own application, such as https://github.com/dbt-content/dbt-datacatalog-explorer
Happy tagging!