diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/_model-versions.md b/docs/modeling-your-data/modeling-your-data-with-dbt/_model-versions.md index 225442c0cb..2622a7c07d 100644 --- a/docs/modeling-your-data/modeling-your-data-with-dbt/_model-versions.md +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/_model-versions.md @@ -29,8 +29,8 @@ import {versions} from '@site/src/componentVersions'; | snowplow-web version | dbt versions | BigQuery | Databricks | Redshift | Snowflake | Postgres | | -------------------------- | ------------------- | :------: | :--------: | :------: | :-------: | :------: | | ${versions.dbtSnowplowWeb} | >=1.5.0 to <2.0.0 | ✅ | ✅ | ✅ | ✅ | ✅ | -| 0.15.2 | >=1.4.0 to <2.0.0 | ✅ | ✅ | ✅ | ✅ | ✅ | -| 0.13.3* | >=1.3.0 to <2.0.0 | ✅ | ✅ | ✅ | ✅ | ✅ | +| 0.15.2 | >=1.4.0 to <2.0.0 | ✅ | ✅ | ✅ | ✅ | ✅* | +| 0.13.3** | >=1.3.0 to <2.0.0 | ✅ | ✅ | ✅ | ✅ | ✅ | | 0.11.0 | >=1.0.0 to <1.3.0 | ✅ | ✅ | ✅ | ✅ | ✅ | | 0.5.1 | >=0.20.0 to <1.0.0 | ✅ | ❌ | ✅ | ✅ | ✅ | | 0.4.1 | >=0.18.0 to <0.20.0 | ✅ | ❌ | ✅ | ✅ | ❌ | @@ -38,9 +38,9 @@ import {versions} from '@site/src/componentVersions'; -^ Since version 0.15.0 of `snowplow_web` at least version 15.0 of Postgres is required, otherwise you will need to [overwrite](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-operation/macros-and-keys/index.md#overriding-macros) the `default_channel_group` macro to not use the `regexp_like` function. +\* Since version 0.15.0 of `snowplow_web`, at least version 15.0 of Postgres is required; otherwise you will need to [override](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-operation/macros-and-keys/index.md#overriding-macros) the `default_channel_group` macro so that it does not use the `regexp_like` function.
-\* From version v0.13.0 onwards we use the `load_tstamp` field so you must be using [RDB Loader](/docs/pipeline-components-and-applications/loaders-storage-targets/snowplow-rdb-loader/index.md) v4.0.0 and above, or [BigQuery Loader](/docs/pipeline-components-and-applications/loaders-storage-targets/snowplow-rdb-loader/index.md) v1.0.0 and above. If you do not have this field because you are not using these versions, or you are using the Postgres loader, you will need to set `snowplow__enable_load_tstamp` to `false` in your `dbt_project.yml` and will not be able to use the consent models. +** From version v0.13.0 onwards we use the `load_tstamp` field so you must be using [RDB Loader](/docs/pipeline-components-and-applications/loaders-storage-targets/snowplow-rdb-loader/index.md) v4.0.0 and above, or [BigQuery Loader](/docs/pipeline-components-and-applications/loaders-storage-targets/snowplow-rdb-loader/index.md) v1.0.0 and above. If you do not have this field because you are not using these versions, or you are using the Postgres loader, you will need to set `snowplow__enable_load_tstamp` to `false` in your `dbt_project.yml` and will not be able to use the consent models. 
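The footnote above amounts to a one-line variable override. As a sketch, a pipeline without the `load_tstamp` field would set the following in `dbt_project.yml` (the `snowplow_web` scoping shown here follows dbt's standard per-package `vars:` convention):

```yml
vars:
  snowplow_web:
    # No load_tstamp field available (e.g. Postgres loader, or RDB Loader < 4.0.0)
    snowplow__enable_load_tstamp: false
```

With this set, the consent models remain unavailable, as noted above.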
diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-incremental-logic/index.md b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-incremental-logic/index.md index 5859da586b..5c786c9a6f 100644 --- a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-incremental-logic/index.md +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-incremental-logic/index.md @@ -92,7 +92,7 @@ In all states the `upper_limit` is limited by the `snowplow__backfill_limit_days If there are no enabled models already in the manifest then we process from the start date up to the backfill limit or now, whichever is older: -**`lower_limit`**: `snowplow__start_date` +**`lower_limit`**: `snowplow__start_date` **`upper_limit`**: `least(current_tstamp, snowplow__start_date + snowplow__backfill_limit_days)` ```mermaid @@ -118,7 +118,7 @@ gantt If there are enabled models that aren't in the manifest table then a new model tagged with `snowplow__incremental` has been added since the last run; this can happen with a new custom model, or you have enabled some previously disabled custom modules. In this case the package will replay all previously processed events in order to back-fill the new model. -**`lower_limit`**: `snowplow__start_date` +**`lower_limit`**: `snowplow__start_date` **`upper_limit`**: `least(max_last_success, snowplow__start_date + snowplow__backfill_limit_days)` ```mermaid @@ -145,7 +145,7 @@ gantt If the `min_last_success` is less than the `max_last_success` it means the tagged models are out of sync, for example due to a particular model failing to execute successfully during the previous run or as part of catching up on a new model. The package will attempt to sync all models as far as your backfill limit will allow. 
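The first two run states above share one windowing pattern: a fixed lower bound, and an upper bound taken as the `least()` of a cap and the start date plus the backfill limit. A rough Python sketch of that calculation (illustrative only — `run_window` is a hypothetical helper; the package implements this logic in SQL/Jinja):

```python
from datetime import datetime, timedelta

def run_window(start_date, current_tstamp, backfill_limit_days, max_last_success=None):
    """Processing window for a first run (no models in the manifest)
    or a new-model run (a model was added since the last run)."""
    lower = start_date
    # A new model must first catch up to the other models' last success;
    # on a true first run the cap is simply the current timestamp.
    cap = max_last_success if max_last_success is not None else current_tstamp
    upper = min(cap, start_date + timedelta(days=backfill_limit_days))
    return lower, upper

# First run: a 30-day backfill window starting at snowplow__start_date
print(run_window(datetime(2023, 1, 1), datetime(2023, 3, 1), 30))
# -> window from 2023-01-01 to 2023-01-31
```

When a new model is catching up, passing `max_last_success` caps the window at the other models' last success, so that all models end the run in sync.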
-**`lower_limit`**: `min_last_success - snowplow__lookback_window_hours` +**`lower_limit`**: `min_last_success - snowplow__lookback_window_hours` **`upper_limit`**: `least(max_last_success, min_last_success + snowplow__backfill_limit_days)` ```mermaid @@ -173,7 +173,7 @@ gantt If none of the above criteria are met, then we consider it a 'standard run' where all models are in sync and we carry on from the last processed event. -**`lower_limit`**: `max_last_success - snowplow__lookback_window_hours` +**`lower_limit`**: `max_last_success - snowplow__lookback_window_hours` **`upper_limit`**: `least(current_tstamp, max_last_success + snowplow__backfill_limit_days)` diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/images/session_stitching_dark_mobile.drawio.png b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/images/session_stitching_dark_mobile.drawio.png new file mode 100644 index 0000000000..3bbef95f13 Binary files /dev/null and b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/images/session_stitching_dark_mobile.drawio.png differ diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/images/session_stitching_dark_unified.drawio.png b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/images/session_stitching_dark_unified.drawio.png new file mode 100644 index 0000000000..8f01d7f491 Binary files /dev/null and b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/images/session_stitching_dark_unified.drawio.png differ diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/images/session_stitching_dark_web.drawio.png b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/images/session_stitching_dark_web.drawio.png new file mode 100644 index 
0000000000..c38bf835f0 Binary files /dev/null and b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/images/session_stitching_dark_web.drawio.png differ diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/images/session_stitching_light_mobile.drawio.png b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/images/session_stitching_light_mobile.drawio.png new file mode 100644 index 0000000000..398dd0e756 Binary files /dev/null and b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/images/session_stitching_light_mobile.drawio.png differ diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/images/session_stitching_light_unified.drawio.png b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/images/session_stitching_light_unified.drawio.png new file mode 100644 index 0000000000..8867620ca5 Binary files /dev/null and b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/images/session_stitching_light_unified.drawio.png differ diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/images/session_stitching_light_web.drawio.png b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/images/session_stitching_light_web.drawio.png new file mode 100644 index 0000000000..1088f4ac00 Binary files /dev/null and b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/images/session_stitching_light_web.drawio.png differ diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/images/stitching_scenarios.drawio.png b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/images/stitching_scenarios.drawio.png new file 
mode 100644 index 0000000000..5c15135970 Binary files /dev/null and b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/images/stitching_scenarios.drawio.png differ diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/index.md b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/index.md index 9b4a4cc9bb..808f2926f5 100644 --- a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/index.md +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-user-mapping/index.md @@ -6,6 +6,7 @@ sidebar_position: 30 ```mdx-code-block import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; +import ThemedImage from '@theme/ThemedImage'; ``` :::tip @@ -17,7 +18,7 @@ On this page, `` can be one of: `web`, `mobile`, `unified` Stitching users together is not an easy task: depending on the typical user journey, the complexity can range from individually identified (logged-in) users, who require no extra modelling, to never-identified users sharing a common public device (e.g. at a school or library), where user stitching is technically impossible. Because stitching is an iterative process that needs to be refreshed after each incremental run, ideally over a large range of data, compute power, extra expense and time constraints may dictate the best course of action. -**Session stitching** +#### Session stitching For the out-of-the-box user stitching we opted for the sweet-spot method: applying logic that the majority of our users will benefit from, while avoiding compute-heavy calculations but still reaping most of the benefits.
@@ -27,6 +28,49 @@ The `domain_userid`/`device_user_id` is cookie/device based and therefore expire This mapping is applied to the sessions table by a post-hook which updates the `stitched_user_id` column with the latest mapping. If no mapping is present, the default value for `stitched_user_id` is the `domain_userid`/`device_user_id`. This process is known as session stitching, and effectively allows you to attribute logged-in and non-logged-in sessions back to a single user. + + + + + +

+[Session stitching diagrams (ThemedImage, light/dark variants) for the web, mobile and unified packages]
+ If required, this update operation can be disabled by setting the following in your `dbt_project.yml` file (selecting one of web/mobile, or both, as appropriate): ```yml title="dbt_project.yml" @@ -37,10 +81,30 @@ vars: In the unified package and also in the web package, since version 0.16.0, it is also possible to stitch onto the page views table by setting the value of `snowplow__page_view_stitching` to `true`. It may be enough to apply this with less frequency than on sessions to keep costs down, by only enabling this at runtime (on the command line) on only some of the runs. -**Cross platform stitching** +#### Cross platform stitching Since the arrival of the `snowplow_unified` package, all the user data is modelled in one place. This makes it easy to perform cross-platform stitching effectively: as soon as a user identifies themselves by logging in as the same user on separate platforms, all the user data will be found within one package, making it really convenient to perform further analysis. -**Custom solutions** +#### Custom solutions User mapping is typically not a 'one size fits all' exercise. Depending on your tracking implementation, business needs and desired level of sophistication, you may want to write bespoke logic. Please refer to this [blog post](https://snowplow.io/blog/developing-a-single-customer-view-with-snowplow/) for ideas. In addition, the web and unified packages offer the possibility to change what field is used as your stitched user id, so instead of `user_id` you can use any field you wish (note that it will still be called `user_id` in your mapping table), and by taking advantage of the [custom sessionization and users](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-web-data-model/custom-sessionization-and-users/index.md) you can also change the field used as the `domain_user_id` (for the web model) or `user_identifier` (unified model).
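Conceptually, the session stitching described above is a keyed lookup-and-update: keep the latest non-null `user_id` observed for each `domain_userid`/`device_user_id`, and fall back to the device-level identifier when no mapping exists. A small Python sketch of the idea (hypothetical data shapes — the packages implement this as a SQL post-hook on the sessions table):

```python
def stitch_sessions(sessions, user_mapping):
    # user_mapping: latest non-null user_id seen for each domain_userid.
    # Fall back to the domain_userid itself when no mapping exists.
    return [
        {**s, "stitched_user_id": user_mapping.get(s["domain_userid"], s["domain_userid"])}
        for s in sessions
    ]

sessions = [
    {"session_id": "s1", "domain_userid": "d1"},
    {"session_id": "s2", "domain_userid": "d2"},
]
mapping = {"d1": "alice@example.com"}  # d1 later logged in; d2 never did
stitched = stitch_sessions(sessions, mapping)
print(stitched[0]["stitched_user_id"])  # alice@example.com
print(stitched[1]["stitched_user_id"])  # d2
```

Because the mapping can change on every incremental run, re-applying this update after each run is what keeps previously processed sessions attributed to the right user.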
We plan to add support for these features to the mobile package in the future. + +#### Overview + +

+[Overview diagram: user stitching scenarios across web and mobile]

+(1) It is most convenient to use the unified package, so that all of these events will be modelled into the same derived tables regardless of platform. + +(2) If it is the same mobile/web device and the user identifies by logging in at a later stage while still retaining the same domain_userid/device_user_id, the model will update the stitched_user_id during session stitching. + +(3) If it is the same mobile/web device and the user identifies by logging in while still retaining the same domain_userid/device_user_id, the model will update the stitched_user_id during session stitching. + +(4) If it is the same mobile device, cross-navigation tracking and stitching can be applied (coming soon!) diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/ecommerce/index.md b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/ecommerce/index.md index 3a69326eab..ee62261ec4 100644 --- a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/ecommerce/index.md +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/ecommerce/index.md @@ -10,7 +10,7 @@ import TabItem from '@theme/TabItem'; ## Package Configuration Variables -This package utilizes a set of variables that are configured to recommended values for optimal performance of the models. Depending on your use case, you might want to override these values by adding to your `dbt_project.yml` file. +This package utilizes a set of variables that are configured to recommended values for optimal performance of the models. Depending on your use case, you might want to override these values by adding to your `dbt_project.yml` file. We have provided a [tool](#config-generator) below to help you with that.
:::caution @@ -204,7 +204,11 @@ export const printYamlVariables = (data) => { export const Template = ObjectFieldTemplateGroupsGenerator(GROUPS); ``` -## Config Generator -You can use the below inputs to generate the code that you need to place into your `dbt_project.yml` file to configure the package as you require. Any values not specified will use their default values from the package. +## Config Generator +```mdx-code-block +import ConfigGenerator from "@site/docs/reusable/data-modeling/config-generator/_index.md" + + +``` diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/fractribution/index.md b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/fractribution/index.md index 0de976394f..e4ed5da09b 100644 --- a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/fractribution/index.md +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/fractribution/index.md @@ -11,7 +11,7 @@ import TabItem from '@theme/TabItem'; ## Package Configuration Variables -This package utilizes a set of variables that are configured to recommended values for optimal performance of the models. Depending on your use case, you might want to override these values by adding to your `dbt_project.yml` file. +This package utilizes a set of variables that are configured to recommended values for optimal performance of the models. Depending on your use case, you might want to override these values by adding to your `dbt_project.yml` file. We have provided a [tool](#config-generator) below to help you with that. :::caution @@ -124,6 +124,12 @@ export const Template = ObjectFieldTemplateGroupsGenerator(GROUPS); ``` ## Config Generator -You can use the below inputs to generate the code that you need to place into your `dbt_project.yml` file to configure the package as you require. Any values not specified will use their default values from the package. 
+ +```mdx-code-block +import ConfigGenerator from "@site/docs/reusable/data-modeling/config-generator/_index.md" + + +``` + diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/media-player/index.md b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/media-player/index.md index 37de8e0d0d..97768b312c 100644 --- a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/media-player/index.md +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/media-player/index.md @@ -10,7 +10,7 @@ import TabItem from '@theme/TabItem'; ## Package Configuration Variables -This package utilizes a set of variables that are configured to recommended values for optimal performance of the models. Depending on your use case, you might want to override these values by adding to your `dbt_project.yml` file. +This package utilizes a set of variables that are configured to recommended values for optimal performance of the models. Depending on your use case, you might want to override these values by adding to your `dbt_project.yml` file. We have provided a [tool](#config-generator) below to help you with that. :::caution @@ -170,6 +170,11 @@ export const Template = ObjectFieldTemplateGroupsGenerator(GROUPS); ``` ## Config Generator -You can use the below inputs to generate the code that you need to place into your `dbt_project.yml` file to configure the package as you require. Any values not specified will use their default values from the package. 
+```mdx-code-block +import ConfigGenerator from "@site/docs/reusable/data-modeling/config-generator/_index.md" + + +``` + diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/mobile/index.md b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/mobile/index.md index 2acbc1f6d3..64d2799dfc 100644 --- a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/mobile/index.md +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/mobile/index.md @@ -10,7 +10,7 @@ import TabItem from '@theme/TabItem'; ## Package Configuration Variables -This package utilizes a set of variables that are configured to recommended values for optimal performance of the models. Depending on your use case, you might want to override these values by adding to your `dbt_project.yml` file. +This package utilizes a set of variables that are configured to recommended values for optimal performance of the models. Depending on your use case, you might want to override these values by adding to your `dbt_project.yml` file. We have provided a [tool](#config-generator) below to help you with that. :::caution @@ -194,6 +194,10 @@ export const Template = ObjectFieldTemplateGroupsGenerator(GROUPS); ``` ## Config Generator -You can use the below inputs to generate the code that you need to place into your `dbt_project.yml` file to configure the package as you require. Any values not specified will use their default values from the package. 
+```mdx-code-block +import ConfigGenerator from "@site/docs/reusable/data-modeling/config-generator/_index.md" + + +``` diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/normalize/index.md b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/normalize/index.md index 30d20b4313..f61b160636 100644 --- a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/normalize/index.md +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/normalize/index.md @@ -10,7 +10,7 @@ import TabItem from '@theme/TabItem'; ## Package Configuration Variables -This package utilizes a set of variables that are configured to recommended values for optimal performance of the models. Depending on your use case, you might want to override these values by adding to your `dbt_project.yml` file. +This package utilizes a set of variables that are configured to recommended values for optimal performance of the models. Depending on your use case, you might want to override these values by adding to your `dbt_project.yml` file. We have provided a [tool](#config-generator) below to help you with that. :::caution @@ -134,6 +134,11 @@ export const Template = ObjectFieldTemplateGroupsGenerator(GROUPS); ``` ## Config Generator -You can use the below inputs to generate the code that you need to place into your `dbt_project.yml` file to configure the package as you require. Any values not specified will use their default values from the package. 
+```mdx-code-block +import ConfigGenerator from "@site/docs/reusable/data-modeling/config-generator/_index.md" + + +``` + diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/unified/index.md b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/unified/index.md index a6b3bd047a..ae231383c2 100644 --- a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/unified/index.md +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/unified/index.md @@ -1,6 +1,6 @@ --- -title: "Web" -sidebar_position: 100 +title: "Unified" +sidebar_position: 50 --- ```mdx-code-block @@ -8,15 +8,9 @@ import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; ``` -:::info - -Some variables are only available in the latest version of our package, or might have changed format from older versions. If you are unable to use the latest version, check the `dbt_project.yml` file of our package for the version you are using to see what options are available to you. - -::: - ## Package Configuration Variables -This package utilizes a set of variables that are configured to recommended values for optimal performance of the models. Depending on your use case, you might want to override these values by adding to your `dbt_project.yml` file. +This package utilizes a set of variables that are configured to recommended values for optimal performance of the models. Depending on your use case, you might want to override these values by adding to your `dbt_project.yml` file. We have provided a [tool](#config-generator) below to help you with that. :::note @@ -34,8 +28,8 @@ All variables in Snowplow packages start with `snowplow__` but we have removed t | `events_table` | The name of the table that contains your atomic events. | `events` | | `ga4_categories_seed` | Name of the model for the GA4 category mapping seed table, either a seed or a model (if you want to use a source, create a model to select from it).
| `snowplow_unified_dim_ga4_source_categories` | | `geo_mapping_seed` | Name of the model for the Geo mapping seed table, either a seed or a model (if you want to use a source, create a model to select from it). | `snowplow_unified_dim_geo_country_mapping` | -| `heartbeat` | Page ping heartbeat time as defined in your [tracker configuration](/docs/collecting-data/collecting-from-own-applications/javascript-trackers/javascript-tracker/javascript-tracker-v3/tracking-events/index.md#activity-tracking-page-pings). | `10` | -| `min_visit_length` | Minimum visit length as defined in your [tracker configuration](/docs/collecting-data/collecting-from-own-applications/javascript-trackers/javascript-tracker/javascript-tracker-v3/tracking-events/index.md#activity-tracking-page-pings). | `5` | +| `heartbeat` | Page ping heartbeat time as defined in your [tracker configuration](/docs/collecting-data/collecting-from-own-applications/javascript-trackers/web-tracker/tracking-events/index.md#activity-tracking-page-pings). | `10` | +| `min_visit_length` | Minimum visit length as defined in your [tracker configuration](/docs/collecting-data/collecting-from-own-applications/javascript-trackers/web-tracker/tracking-events/index.md#activity-tracking-page-pings). | `5` | | `rfc_5646_seed` | Name of the model for the RFC 5646 (language) mapping seed table, either a seed or a model (if you want to use a source, create a model to select from it). | `snowplow_unified_dim_rfc_5646_language_mapping` | | `sessions_table` | The users module requires data from the derived sessions table. If you choose to disable the standard sessions table in favor of your own custom table, set this to reference your new table e.g. `{{ ref('snowplow_unified_sessions_custom') }}`. 
@@ -43,7 +37,7 @@ All variables in Snowplow packages start with `snowplow__` but we have removed t | Variable Name | Description | Default | | ----------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------- | | `allow_refresh` | Used as the default value to return from the `allow_refresh()` macro. This macro determines whether the manifest tables can be refreshed or not, depending on your environment. See the [Manifest Tables](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-operation/index.md#manifest-tables) section for more details. | `false` | -| `backfill_limit_days` | The maximum numbers of days of new data to be processed since the latest event processed. Please refer to the [incremental logic](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-incremental-logic/index.md#identification-of-events-to-process) section for more details. 
| `30` | +| `backfill_limit_days` | The maximum number of days of new data to be processed since the latest event processed. Please refer to the [incremental logic](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-incremental-logic/index.md#package-state) section for more details. | `30` | | `conversion_events` | (Version 0.15.0+) A list of dictionaries that define a conversion event for your modeling, to add the relevant columns to the sessions table. The dictionary keys are `name` (required), `condition` (required), `value`, `default_value`, and `list_events`. For more information see the [package documentation](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/conversions/index.md). | | | `cwv_days_to_measure` | The number of days to use for web vital measurements (if enabled). | `28` | | `cwv_percentile` | The percentile used for the web vitals measurements that are produced for all page views (if enabled). | `75` | @@ -78,7 +72,7 @@ All variables in Snowplow packages start with `snowplow__` but we have removed t | `enable_browser_context` | Flag to include browser context data in the models. | `false` | | `enable_mobile_context` | Flag to include mobile context data in the models. | `false` | | `enable_geolocation_context` | Flag to include the geolocation data in the models. | `false` | -| `enable_app_context` | Flag to include the app context data in the models. | `false` | +| `enable_application_context` | Flag to include the app context data in the models. | `false` | | `enable_screen_context` | Flag to include the mobile screen data in the models. | `false` | | `enable_app_error_event` | Flag to include the mobile app error data in the models. | `false` | | `enable_deep_link_context` | Flag to include the deep link context data in the models.
| `false` | @@ -111,7 +105,7 @@ Redshift and Postgres use a [shredded](/docs/pipeline-components-and-application | `ua_parser_context` | `com_snowplowanalytics_snowplow_ua_parser_context_1` | | `yauaa_context` | `nl_basjes_yauaa_context_1` | | `consent_cmp_visible` | `com_snowplowanalytics_snowplow_cmp_visible_1` | -| `consent_preferences` | `com_snowplowanalytics_snowplow_consent_preferences_1` | +| `consent_preferences_events` | `com_snowplowanalytics_snowplow_consent_preferences_1` | | `consent_cmp_visible` |`com_snowplowanalytics_snowplow_cmp_visible_1` | | `browser_context` | `com_snowplowanalytics_snowplow_browser_context_1` | | `session_context` | `com_snowplowanalytics_snowplow_client_session_1` | @@ -119,9 +113,10 @@ Redshift and Postgres use a [shredded](/docs/pipeline-components-and-application | `geolocation_context` | `com_snowplowanalytics_snowplow_geolocation_context_1` | | `application_context` | `com_snowplowanalytics_mobile_application_1` | | `screen_context` | `com_snowplowanalytics_mobile_screen_1` | -| `app_errors_table` | `com_snowplowanalytics_snowplow_application_error_1` | +| `application_error_events` | `com_snowplowanalytics_snowplow_application_error_1` | | `screen_view_events` | `com_snowplowanalytics_mobile_screen_view_1` | -| `deep_link_context` | `contexts_com_snowplowanalytics_mobile_deep_link_1` | +| `deep_link_context` | `com_snowplowanalytics_mobile_deep_link_1` | +| `cwv_events` | `com_snowplowanalytics_snowplow_web_vitals_1` | | Variable Name | Description | Default | | -------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- | @@ -149,11 +144,11 @@ 
Redshift and Postgres use a [shredded](/docs/pipeline-components-and-application ```mdx-code-block import DbtSchemas from "@site/docs/reusable/dbt-schemas/_index.md"; import CodeBlock from '@theme/CodeBlock'; -import { SchemaSetter } from '@site/src/components/DbtSchemaSelector'; +import { SchemaSetterWSeeds } from '@site/src/components/DbtSchemaSelector'; -export const printSchemaVariables = (manifestSchema, scratchSchema, derivedSchema) => { +export const printSchemaVariables = (manifestSchema, scratchSchema, derivedSchema, seedSchema) => { return( <> @@ -177,14 +172,17 @@ export const printSchemaVariables = (manifestSchema, scratchSchema, derivedSchem page_views: +schema: ${derivedSchema} scratch: - +schema: ${scratchSchema}`} + +schema: ${scratchSchema} +seeds: + snowplow_unified: + +schema: ${seedSchema}`} ) } ``` - + ```mdx-code-block import { dump } from 'js-yaml'; @@ -244,10 +242,13 @@ export const GROUPS = [ "snowplow__iab_context", "snowplow__ua_parser_context", "snowplow__yauaa_context", - "snowplow__consent_cmp_visible", - "snowplow__consent_preferences", - "snowplow__browser_context","snowplow__session_context","snowplow__mobile_context","snowplow__geolocation_context","snowplow__application_context","snowplow__screen_context","snowplow__app_errors_table","snowplow__screen_view_events","snowplow__deep_link_context", + "snowplow__cmp_visible_events", + "snowplow__consent_preferences_events", + "snowplow__browser_context", + "snowplow__session_context", + "snowplow__mobile_context","snowplow__geolocation_context","snowplow__application_context","snowplow__screen_context","snowplow__application_error_events","snowplow__screen_view_events","snowplow__deep_link_context", "snowplow__enable_load_tstamp", + "snowplow__cwv_events", "snowplow__derived_tstamp_partitioned"] } ]; @@ -262,8 +263,11 @@ export const printYamlVariables = (data) => { export const Template = ObjectFieldTemplateGroupsGenerator(GROUPS); ``` - ## Config Generator -You can use the below 
inputs to generate the code that you need to place into your `dbt_project.yml` file to configure the package as you require. Any values not specified will use their default values from the package. +```mdx-code-block +import ConfigGenerator from "@site/docs/reusable/data-modeling/config-generator/_index.md" + + +``` diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/utils/index.md b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/utils/index.md index dea9156255..6d82104fff 100644 --- a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/utils/index.md +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/utils/index.md @@ -14,7 +14,7 @@ The models, functionality, and variables described below are only available from ## Package Configuration Variables -This package utilizes a set of variables that are configured to recommended values for optimal performance of the models. Depending on your use case, you might want to override these values by adding to your `dbt_project.yml` file. +This package utilizes a set of variables that are configured to recommended values for optimal performance of the models. Depending on your use case, you might want to override these values by adding to your `dbt_project.yml` file. We have provided a [tool](#config-generator) below to help you with that. 
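As a sketch of what such an override looks like — the package key follows the doc's existing convention, but the variable name and value here are placeholders rather than ones this package necessarily exposes:

```yml title="dbt_project.yml"
vars:
  snowplow_utils:
    # Placeholder: set any supported package variable under the package's key
    snowplow__start_date: '2023-01-01'
```

Values set this way take precedence over the defaults shipped with the package.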
:::note diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/web/index.md b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/web/index.md index 539600ecd5..6107d9b5b3 100644 --- a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/web/index.md +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-configuration/web/index.md @@ -16,7 +16,7 @@ Some variables are only available in the latest version of our package, or might ## Package Configuration Variables -This package utilizes a set of variables that are configured to recommended values for optimal performance of the models. Depending on your use case, you might want to override these values by adding to your `dbt_project.yml` file. +This package utilizes a set of variables that are configured to recommended values for optimal performance of the models. Depending on your use case, you might want to override these values by adding to your `dbt_project.yml` file. We have provided a [tool](#config-generator) below to help you with that. :::note diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-mobile-data-model/index.md b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-mobile-data-model/index.md index 9d082896c9..6b8854ac54 100644 --- a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-mobile-data-model/index.md +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-mobile-data-model/index.md @@ -53,12 +53,8 @@ By default they are disabled. They can be enabled by configuring the `dbt_projec Currently the App Errors module, used for crash reporting, is the only optional module. More will be added in the future as the tracker's functionality expands. 
-### App Errors - -Assuming your tracker is capturing `application_error` events, the module can be enabled by configuring the `dbt_project.yml` file: +```mdx-code-block +import Apperrors from "@site/docs/reusable/data-modeling/app-errors/_index.md" -```yml title="dbt_project.yml" -vars: - snowplow_mobile: - snowplow__enable_app_errors_module: true + ``` diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/app-errors-module/index.md b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/app-errors-module/index.md new file mode 100644 index 0000000000..8a777b7acd --- /dev/null +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/app-errors-module/index.md @@ -0,0 +1,16 @@ +--- +title: "App Errors Module" +sidebar_position: 50 +hide_title: true +--- + +```mdx-code-block +import Badges from '@site/src/components/Badges'; +``` + + +```mdx-code-block +import Apperrors from "@site/docs/reusable/data-modeling/app-errors/_index.md" + + +``` diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/consent-module/index.md b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/consent-module/index.md index b867e3dc22..6cf92dce9b 100644 --- a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/consent-module/index.md +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/consent-module/index.md @@ -7,54 +7,10 @@ hide_title: true ```mdx-code-block import Badges from '@site/src/components/Badges'; ``` - + +```mdx-code-block +import Consent from "@site/docs/reusable/data-modeling/consent/_index.md" -# Consent Tracking Custom Module - -This custom module is built as an extension of the [dbt-snowplow-unified package](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/index.md), it 
transforms raw `consent_preferences` and `cmp_visible` event data into derived tables for easier querying. These events are generated by the **Enhanced Consent plugin** of the [JavaScript tracker](/docs/collecting-data/collecting-from-own-applications/javascript-trackers/index.md). - -:::info Important -For the incremental logic to work within the module you **must** use at least `RDB Loader v4.0.0`, as the custom module relies on the additional `load_tstamp` field for dbt native incrementalisation. - -Whenever a new consent version is added to be tracked, the model expects an `allow_all` event in order to attribute the events to the full list of latest consent scopes. It is advisable to send a test event of that kind straight after deployment so that the model can process the data accurately. -::: - -To enable this optional module, the web package must be correctly configured. Please refer to the [snowplow-unified dbt quickstart guide](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/web/index.md) for a full breakdown of how to set it up. 
- -## Overview - -This custom module consists of a series of dbt models which produce the following aggregated models from the raw consent tracking events: - - - `snowplow_unified_consent_log`: Snowplow incremental table showing the audit trail of consent and Consent Management Platform (cmp) events - - - `snowplow_unified_consent_users`: Incremental table of user consent tracking stats - - - `snowplow_unified_consent_totals`: Summary of the latest consent status, per consent version - - - `snowplow_unified_consent_scope_status`: Aggregate of current number of users consented to each consent scope - - - `snowplow_unified_cmp_stats`: Used for modeling cmp_visible events and related metrics - - - `snowplow_unified_consent_versions`: Incremental table used to keep track of each consent version and its validity - - -## Operation - -It is assumed that the dbt_snowplow_unified package is already installed and configured as per the [Quick Start](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/index.md) instructions. - -### Enable the module - -You can enable the custom module through the `snowplow__enable_consent` variable most conveniently set in your dbt_project.yml file: - -```yml title="dbt_project.yml" - -vars: - snowplow_unified: - snowplow__enable_consent: true + ``` - -### Run the module -If you have previously run the unified model without this optional module enabled, you can simply enable the module and run `dbt run --selector snowplow_unified` as many times as needed for this module to catch up with your other data. If you only wish to process this from a specific date, be sure to change your `snowplow__start_date`, or refer to the [Custom module](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-custom-models/index.md) section for a detailed guide on how to achieve this the most efficient way. 
- -If you haven't run the web package before, then you can run it using `dbt run --selector snowplow_unified` either through your CLI, within dbt Cloud, or for Enterprise customers you can use the BDP console. In this situation, all models will start in-sync as no events have been processed. diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/conversions/index.md b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/conversions/index.md index 96b1544a1d..7a9839deb2 100644 --- a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/conversions/index.md +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/conversions/index.md @@ -49,7 +49,7 @@ Because we know that each user may have a different concept of what a conversion :::caution -Because this is part of the sessions table within the web package, we still expect your sessions to contain at least one `page_view` or `page_ping` event, and the events must all have a `session_identifier` to be included in the `base_events_this_run_table`. Without a `session_identifier` the event will not be visible to the model, and without a `page_view` or `page_ping` in the session there will be no session record for the model to attach the conversions to. +Because this is part of the sessions table within the unified package, we still expect your sessions to contain at least one `page_view` or `page_ping` event, and the events must all have a `session_identifier` to be included in the `base_events_this_run_table`. Without a `session_identifier` the event will not be visible to the model, and without a `page_view` or `page_ping` in the session there will be no session record for the model to attach the conversions to. 
::: @@ -139,7 +139,7 @@ For some self-describing event with a name of `sign_up`, where we do not want to Using a self-describing event and a context name -Using our [Snowplow e-commerce tracking](/docs/collecting-data/collecting-from-own-applications/javascript-trackers/browser-tracker/browser-tracker-v3-reference/plugins/snowplow-ecommerce/index.md): +Using our [Snowplow e-commerce tracking](/docs/collecting-data/collecting-from-own-applications/javascript-trackers/web-tracker/plugins/snowplow-ecommerce/index.md): diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/core-web-vitals-module/index.md b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/core-web-vitals-module/index.md index db293998cc..8bb7a727ea 100644 --- a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/core-web-vitals-module/index.md +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/core-web-vitals-module/index.md @@ -7,7 +7,7 @@ hide_title: true ```mdx-code-block import Badges from '@site/src/components/Badges'; ``` - + ```mdx-code-block import { Accelerator } from "@site/src/components/AcceleratorAdmonitions"; ``` -# Core Web Vitals Custom Module - -This custom module is built as an extension of the [dbt-snowplow-unified package](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/index.md), it transforms raw `web vitals` event data into derived tables for easier querying. These events are generated by the **Snowplow Web Vitals plugin** (@snowplow/browser-plugin-web-vitals) of the [JavaScript tracker](/docs/collecting-data/collecting-from-own-applications/javascript-trackers/index.md). - -To enable this optional module, the web package must be correctly configured.
Please refer to the [snowplow-unified dbt quickstart guide](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/web/index.md) for a full breakdown of how to set it up. - -## Prerequisites - -In order to use this module you would need to: - -1. Track core web vitals using the web `Snowplow Web Vitals plugin`, which populates the column/table `unstruct_event_com_snowplowanalytics_snowplow_web_vitals_1` -2. Have the [yauaa enrichment](/docs/enriching-your-data/available-enrichments/yauaa-enrichment/index.md) enabled (for device type and device class), which populates `contexts_com_snowplowanalytics_snowplow_yauaa_context_1` -3. (Ideally, but not necessarily) Have the [spiders and bots](/docs/enriching-your-data/available-enrichments/iab-enrichment/index.md) enrichment enabled, which populates `contexts_com_iab_snowplow_spiders_and_robots_1` - -## Overview - -This custom module consists of a series of dbt models which produce the following aggregated models from the raw web vitals events: - -- `snowplow_web_vitals`: Incremental table used as a base for storing core web vital events (first event per page view). - -- `snowplow_web_vital_measurements`: Drop and recompute table to use for visualizations that takes core web vital measurements at the user specified percentile point (defaulted to 75). - - -## Operation - -It is assumed that the dbt_snowplow_unified package is already installed and configured as per the [Quick Start](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/index.md) instructions. 
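For intuition only — this is not the package's actual model SQL — the percentile aggregation behind `snowplow_web_vital_measurements` can be pictured roughly as follows (table and column names assumed; the doc only names the `lcp`/`fid`/`cls` result fields):

```sql
-- Sketch: 75th-percentile core web vitals across stored web vital events
select
  percentile_cont(0.75) within group (order by lcp) as lcp_p75,
  percentile_cont(0.75) within group (order by fid) as fid_p75,
  percentile_cont(0.75) within group (order by cls) as cls_p75
from derived.snowplow_web_vitals
```

Changing the user-specified percentile point (default 75) simply moves the `0.75` argument.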
- - -### Enable the module - -You can enable the custom module through the `snowplow__enable_cwv` variable in your `dbt_project.yml` file: - -```yml title="dbt_project.yml" - -vars: - snowplow_unified: - snowplow__enable_cwv: true -``` - -### Override the module specific macros - -:::tip - -For information about overriding our macros, see [here](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-operation/macros-and-keys/index.md#overriding-macros) - -::: - -- The `core_web_vital_page_groups()`[source](https://github.com/snowplow/dbt-snowplow-unified/blob/main/macros/core_web_vital_page_groups.sql) macro is used to let the user classify their urls to specific page groups. It returns the sql to provide the classification (expected in the form of case when statements). - -- The `core_web_vital_results_query()`[source](https://github.com/snowplow/dbt-snowplow-unified/blob/main/macros/core_web_vital_results_query.sql) macro is used to let the user classify the tresholds to be applied for the measurements. It returns the sql to provide the logic for the evaluation based on user defined tresholds (expected in the form of case when statements). - -Please make sure you set the results you would like the measurements to pass to **`good`** or align it with the `macro_core_web_vital_pass_query()` macro. - -- The `core_web_vital_pass_query()`[source](https://github.com/snowplow/dbt-snowplow-unified/blob/main/macros/core_web_vital_pass_query.sql) +```mdx-code-block +import CoreWebVitals from "@site/docs/reusable/data-modeling/core-web-vitals/_index.md" -```sql -case when lcp_result = 'good' and fid_result = 'good' and cls_result = 'good' then 1 else 0 end passed + ``` - -### Run the module -If you have previously run the unified model without this optional module enabled, you can simply enable the module and run `dbt run --selector snowplow_unified` as many times as needed for this module to catch up with your other data. 
If you only wish to process this from a specific date, be sure to change your `snowplow__start_date`, or refer to the [Custom module](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-custom-models/index.md) section for a detailed guide on how to achieve this the most efficient way. - -If you haven't run the web package before, then you can run it using `dbt run --selector snowplow_unified` either through your CLI, within dbt Cloud, or for Enterprise customers you can use the BDP console. In this situation, all models will start in-sync as no events have been processed. diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/images/unified-process-dark.drawio.png b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/images/unified-process-dark.drawio.png index 511cd906b2..f6fd9482d2 100644 Binary files a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/images/unified-process-dark.drawio.png and b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/images/unified-process-dark.drawio.png differ diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/images/unified-process-light.drawio.png b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/images/unified-process-light.drawio.png index 82af9ae37a..a984648322 100644 Binary files a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/images/unified-process-light.drawio.png and b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/images/unified-process-light.drawio.png differ diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/images/web-process-dark.drawio.png 
b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/images/web-process-dark.drawio.png new file mode 100644 index 0000000000..ae553e3f49 Binary files /dev/null and b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/images/web-process-dark.drawio.png differ diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/images/web-process-light.drawio.png b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/images/web-process-light.drawio.png new file mode 100644 index 0000000000..337fb8ed1e Binary files /dev/null and b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/images/web-process-light.drawio.png differ diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/index.md b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/index.md index 4e6e9d16ff..c344c0bfad 100644 --- a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/index.md +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/index.md @@ -10,6 +10,11 @@ import ThemedImage from '@theme/ThemedImage'; import DocCardList from '@theme/DocCardList'; ``` + +:::danger +The unified package is currently in pre-release state. +::: + # Snowplow Unified Package @@ -83,7 +88,7 @@ dark: require('./images/engaged_time_dark.drawio.png').default At a session level, this calculation is slightly more involved, as it needs to happen per page view and account for [stray page pings](#stray-page-pings), but the underlying idea is the same. -## Stray Page Pings +## Stray Page Pings (Web only!) Stray Page Pings are pings within a session that do not have a corresponding `page_view` event within **the same session**. 
The most common cause of these is someone returning to a tab after their session has timed out but not refreshing the page. The `page_view` event exists in some other session, but there is no guarantee that both these sessions will be processed in the same run, which could lead to different results. Depending on your site content and user behavior the prevalence of sessions with stray page pings could vary greatly. For example with long-form content we have seen around 10% of all sessions contain only stray page pings (i.e. no `page_view` events). We take different approaches to adjust for these stray pings at the page view and sessions levels, which can lead to differences between the two tables, but each is as accurate as we can currently make it. diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-web-data-model/consent-module/index.md b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-web-data-model/consent-module/index.md index 9e95502aa9..cc98c84063 100644 --- a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-web-data-model/consent-module/index.md +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-web-data-model/consent-module/index.md @@ -9,52 +9,8 @@ import Badges from '@site/src/components/Badges'; ``` +```mdx-code-block +import Consent from "@site/docs/reusable/data-modeling/consent/_index.md" -# Consent Tracking Custom Module - -This custom module is built as an extension of the [dbt-snowplow-web package](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-web-data-model/index.md), it transforms raw `consent_preferences` and `cmp_visible` event data into derived tables for easier querying. These events are generated by the **Enhanced Consent plugin** of the [JavaScript tracker](/docs/collecting-data/collecting-from-own-applications/javascript-trackers/index.md). 
- -:::info Important -For the incremental logic to work within the module you **must** use at least `RDB Loader v4.0.0`, as the custom module relies on the additional `load_tstamp` field for dbt native incrementalisation. - -Whenever a new consent version is added to be tracked, the model expects an `allow_all` event in order to attribute the events to the full list of latest consent scopes. It is advisable to send a test event of that kind straight after deployment so that the model can process the data accurately. -::: - -To enable this optional module, the web package must be correctly configured. Please refer to the [snowplow-web dbt quickstart guide](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/web/index.md) for a full breakdown of how to set it up. - -## Overview - -This custom module consists of a series of dbt models which produce the following aggregated models from the raw consent tracking events: - - - `snowplow_web_consent_log`: Snowplow incremental table showing the audit trail of consent and Consent Management Platform (cmp) events - - - `snowplow_web_consent_users`: Incremental table of user consent tracking stats - - - `snowplow_web_consent_totals`: Summary of the latest consent status, per consent version - - - `snowplow_web_consent_scope_status`: Aggregate of current number of users consented to each consent scope - - - `snowplow_web_cmp_stats`: Used for modeling cmp_visible events and related metrics - - - `snowplow_web_consent_versions`: Incremental table used to keep track of each consent version and its validity - - -## Operation - -It is assumed that the dbt_snowplow_web package is already installed and configured as per the [Quick Start](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/index.md) instructions. 
- -### Enable the module - -You can enable the custom module through the `snowplow__enable_consent` variable most conveniently set in your dbt_project.yml file: - -```yml title="dbt_project.yml" - -vars: - snowplow_web: - snowplow__enable_consent: true + ``` - -### Run the module -If you have previously run the web model without this optional module enabled, you can simply enable the module and run `dbt run --selector snowplow_web` as many times as needed for this module to catch up with your other data. If you only wish to process this from a specific date, be sure to change your `snowplow__start_date`, or refer to the [Custom module](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-custom-models/index.md) section for a detailed guide on how to achieve this the most efficient way. - -If you haven't run the web package before, then you can run it using `dbt run --selector snowplow_web` either through your CLI, within dbt Cloud, or for Enterprise customers you can use the BDP console. In this situation, all models will start in-sync as no events have been processed. diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-web-data-model/core-web-vitals-module/index.md b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-web-data-model/core-web-vitals-module/index.md index be406c288d..49f3836f32 100644 --- a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-web-data-model/core-web-vitals-module/index.md +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-web-data-model/core-web-vitals-module/index.md @@ -15,66 +15,8 @@ import { Accelerator } from "@site/src/components/AcceleratorAdmonitions"; ``` -# Core Web Vitals Custom Module - -This custom module is built as an extension of the [dbt-snowplow-web package](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-web-data-model/index.md), it transforms raw `web vitals` event data into derived tables for easier querying. 
These events are generated by the **Snowplow Web Vitals plugin** (@snowplow/browser-plugin-web-vitals) of the [JavaScript tracker](/docs/collecting-data/collecting-from-own-applications/javascript-trackers/index.md). - -To enable this optional module, the web package must be correctly configured. Please refer to the [snowplow-web dbt quickstart guide](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/web/index.md) for a full breakdown of how to set it up. - -## Prerequisites - -In order to use this module you would need to: - -1. Track core web vitals using the web `Snowplow Web Vitals plugin`, which populates the column/table `unstruct_event_com_snowplowanalytics_snowplow_web_vitals_1` -2. Have the [yauaa enrichment](/docs/enriching-your-data/available-enrichments/yauaa-enrichment/index.md) enabled (for device type and device class), which populates `contexts_com_snowplowanalytics_snowplow_yauaa_context_1` -3. (Ideally, but not necessarily) Have the [spiders and bots](/docs/enriching-your-data/available-enrichments/iab-enrichment/index.md) enrichment enabled, which populates `contexts_com_iab_snowplow_spiders_and_robots_1` - -## Overview - -This custom module consists of a series of dbt models which produce the following aggregated models from the raw web vitals events: - -- `snowplow_web_vitals`: Incremental table used as a base for storing core web vital events (first event per page view). - -- `snowplow_web_vital_measurements`: Drop and recompute table to use for visualizations that takes core web vital measurements at the user specified percentile point (defaulted to 75). - - -## Operation - -It is assumed that the dbt_snowplow_web package is already installed and configured as per the [Quick Start](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/index.md) instructions. 
- - -### Enable the module - -You can enable the custom module through the `snowplow__enable_cwv` variable in your `dbt_project.yml` file: - -```yml title="dbt_project.yml" - -vars: - snowplow_web: - snowplow__enable_cwv: true -``` - -### Override the module specific macros - -:::tip - -For information about overriding our macros, see [here](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-operation/macros-and-keys/index.md#overriding-macros) - -::: - -- The `core_web_vital_page_groups()`[source](https://github.com/snowplow/dbt-snowplow-web/blob/main/macros/core_web_vital_page_groups.sql) macro is used to let the user classify their urls to specific page groups. It returns the sql to provide the classification (expected in the form of case when statements). - -- The `core_web_vital_results_query()`[source](https://github.com/snowplow/dbt-snowplow-web/blob/main/macros/core_web_vital_results_query.sql) macro is used to let the user classify the tresholds to be applied for the measurements. It returns the sql to provide the logic for the evaluation based on user defined tresholds (expected in the form of case when statements). - -Please make sure you set the results you would like the measurements to pass to **`good`** or align it with the `macro_core_web_vital_pass_query()` macro. - -- The `core_web_vital_pass_query()`[source](https://github.com/snowplow/dbt-snowplow-web/blob/main/macros/core_web_vital_pass_query.sql) +```mdx-code-block +import CoreWebVitals from "@site/docs/reusable/data-modeling/core-web-vitals/_index.md" -```sql -case when lcp_result = 'good' and fid_result = 'good' and cls_result = 'good' then 1 else 0 end passed + ``` - -### Run the module -If you have previously run the web model without this optional module enabled, you can simply enable the module and run `dbt run --selector snowplow_web` as many times as needed for this module to catch up with your other data. 
If you only wish to process this from a specific date, be sure to change your `snowplow__start_date`, or refer to the [Custom module](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-custom-models/index.md) section for a detailed guide on how to achieve this the most efficient way. - -If you haven't run the web package before, then you can run it using `dbt run --selector snowplow_web` either through your CLI, within dbt Cloud, or for Enterprise customers you can use the BDP console. In this situation, all models will start in-sync as no events have been processed. diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/normalize/index.md b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/normalize/index.md index 0ee865e28f..deba733954 100644 --- a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/normalize/index.md +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/normalize/index.md @@ -36,7 +36,7 @@ If you do not do this the package will still work, but the incremental upserts w ### 2. Adding the `selectors.yml` file -Within the packages we have provided a suite of suggested selectors to run and test the models within the package together with the web model. This leverages dbt's [selector flag](https://docs.getdbt.com/reference/node-selection/syntax). You can find out more about each selector in the [YAML Selectors](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-operation/index.md#yaml-selectors) section. +Within the packages we have provided a suite of suggested selectors to run and test the models within the package together with the normalize model. This leverages dbt's [selector flag](https://docs.getdbt.com/reference/node-selection/syntax). You can find out more about each selector in the [YAML Selectors](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-operation/index.md#yaml-selectors) section. 
These are defined in the `selectors.yml` file ([source](https://github.com/snowplow/dbt-snowplow-normalize/blob/main/selectors.yml)) within the package, however in order to use these selections you will need to copy this file into your own dbt project directory. This is a top-level file and therefore should sit alongside your `dbt_project.yml` file. If you are using multiple packages in your project you will need to combine the contents of these into a single file. diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/unified/index.md b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/unified/index.md index 0d2da346fc..b9c81ac9fc 100644 --- a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/unified/index.md +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/unified/index.md @@ -4,6 +4,10 @@ sidebar_position: 100 title: "Unified Quickstart" --- +:::danger +The unified package is currently in pre-release state. +::: + ## Requirements In addition to [dbt](https://github.com/dbt-labs/dbt) being installed: @@ -12,12 +16,12 @@ To model web events: - web events dataset being available in your database - [Snowplow Javascript tracker](/docs/collecting-data/collecting-from-own-applications/javascript-trackers/index.md) version 2 or later implemented. -- Web Page context [enabled](/docs/collecting-data/collecting-from-own-applications/javascript-trackers/javascript-tracker/javascript-tracker-v2/tracker-setup/initializing-a-tracker-2/index.md#webPage_context) (enabled by default in [v3+](/docs/collecting-data/collecting-from-own-applications/javascript-trackers/javascript-tracker/javascript-tracker-v3/tracker-setup/initialization-options/index.md#webPage_context)). -- [Page view events](/docs/collecting-data/collecting-from-own-applications/javascript-trackers/javascript-tracker/javascript-tracker-v3/tracking-events/index.md#page-views) implemented. 
+- Web Page context [enabled](/docs/collecting-data/collecting-from-own-applications/javascript-trackers/web-tracker/tracker-setup/initialization-options/index.md#webpage-context) (enabled by default in [v3+](/docs/collecting-data/collecting-from-own-applications/javascript-trackers/web-tracker/tracker-setup/initialization-options/index.md#webpage-context)). +- [Page view events](/docs/collecting-data/collecting-from-own-applications/javascript-trackers/web-tracker/tracking-events/index.md#page-views) implemented. To model mobile events: - mobile events dataset being available in your database -- Snowplow [Android](/docs/collecting-data/collecting-from-own-applications/mobile-trackers/previous-versions/android-tracker/index.md) or [iOS](/docs/collecting-data/collecting-from-own-applications/mobile-trackers/previous-versions/objective-c-tracker/index.md) mobile tracker version 1.1.0 or later implemented. +- Snowplow [Android](/docs/collecting-data/collecting-from-own-applications/mobile-trackers/previous-versions/android-tracker/index.md), [iOS](/docs/collecting-data/collecting-from-own-applications/mobile-trackers/previous-versions/objective-c-tracker/index.md) mobile tracker version 1.1.0 (or later) or [React Native tracker](https://docs.snowplow.io/docs/collecting-data/collecting-from-own-applications/react-native-tracker/) implemented - Mobile session context enabled ([ios](/docs/collecting-data/collecting-from-own-applications/mobile-trackers/previous-versions/objective-c-tracker/ios-tracker-1-7-0/index.md#session-context) or [android](/docs/collecting-data/collecting-from-own-applications/mobile-trackers/previous-versions/android-tracker/android-1-7-0/index.md#session-tracking)). 
- Screen view events enabled ([ios](/docs/collecting-data/collecting-from-own-applications/mobile-trackers/previous-versions/objective-c-tracker/ios-tracker-1-7-0/index.md#tracking-features) or [android](/docs/collecting-data/collecting-from-own-applications/mobile-trackers/previous-versions/android-tracker/android-1-7-0/index.md#tracking-features)). @@ -48,7 +52,7 @@ If you do not do this the package will still work, but the incremental upserts w ### 2. Adding the `selectors.yml` file -Within the packages we have provided a suite of suggested selectors to run and test the models within the package together with the web model. This leverages dbt's [selector flag](https://docs.getdbt.com/reference/node-selection/syntax). You can find out more about each selector in the [YAML Selectors](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-operation/index.md#yaml-selectors) section. +Within the packages we have provided a suite of suggested selectors to run and test the models within the package together with the unified model. This leverages dbt's [selector flag](https://docs.getdbt.com/reference/node-selection/syntax). You can find out more about each selector in the [YAML Selectors](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-operation/index.md#yaml-selectors) section. These are defined in the `selectors.yml` file ([source](https://github.com/snowplow/dbt-snowplow-web/blob/main/selectors.yml)) within the package, however in order to use these selections you will need to copy this file into your own dbt project directory. This is a top-level file and therefore should sit alongside your `dbt_project.yml` file. If you are using multiple packages in your project you will need to combine the contents of these into a single file. @@ -70,7 +74,7 @@ Please note that your `target.database` is NULL if using Databricks. In Databric ### 4. 
Enabled desired contexts -The unified package has the option to join in data from the following Snowplow enrichments and out-of-the-box contexts: +The unified package has the option to join in data from the following Snowplow enrichments and out-of-the-box context entities: - [IAB enrichment](/docs/enriching-your-data/available-enrichments/iab-enrichment/index.md) - [UA Parser enrichment](/docs/enriching-your-data/available-enrichments/ua-parser-enrichment/index.md) @@ -80,8 +84,10 @@ The unified package has the option to join in data from the following Snowplow e - Geolocation context - App context - Screen context -- App Error event - Deep Link context +- App Error context +- Core Web Vitals +- Consent (Preferences & cmp visible) By default these are **all disabled** in the unified package. Assuming you have the enrichments turned on in your Snowplow pipeline, to enable the contexts within the package please add the following to your `dbt_project.yml` file: @@ -91,13 +97,15 @@ vars: snowplow__enable_iab: true snowplow__enable_ua: true snowplow__enable_yauaa: true - snowplow__enable_browser_context: false - snowplow__enable_mobile_context: false - snowplow__enable_geolocation_context: false - snowplow__enable_app_context: false - snowplow__enable_screen_context: false - snowplow__enable_app_error_event: false - snowplow__enable_deep_link_context: false + snowplow__enable_browser_context: true + snowplow__enable_mobile_context: true + snowplow__enable_geolocation_context: true + snowplow__enable_application_context: true + snowplow__enable_screen_context: true + snowplow__enable_deep_link_context: true + snowplow__enable_consent: true + snowplow__enable_cwv: true + snowplow__enable_app_errors: true ``` ### 5. Filter your data set @@ -114,7 +122,7 @@ vars: ### 6. Verify page ping variables -The unified package processes page ping events to calculate web page engagement times. 
If your [tracker configuration](/docs/collecting-data/collecting-from-own-applications/javascript-trackers/javascript-tracker/javascript-tracker-v3/tracking-events/index.md#activity-tracking-page-pings) for `min_visit_length` (default 5) and `heartbeat` (default 10) differs from the defaults provided in this package, you can override by adding to your `dbt_project.yml`: +The unified package processes page ping events to calculate web page engagement times. If your [tracker configuration](/docs/collecting-data/collecting-from-own-applications/javascript-trackers/web-tracker/tracking-events/index.md#activity-tracking-page-pings) for `min_visit_length` (default 5) and `heartbeat` (default 10) differs from the defaults provided in this package, you can override by adding to your `dbt_project.yml`: ```yml title="dbt_project.yml" vars: diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/utils/index.md b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/utils/index.md index e02c45a000..d6cd24bd75 100644 --- a/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/utils/index.md +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-quickstart/utils/index.md @@ -132,7 +132,7 @@ Be sure to specify your `PACKAGE_NAME` when calling the `get_enabled_snowplow_mo ::: ```jinja2 -{{ +{{ config( post_hook=["{{snowplow_utils.print_run_limits(this)}}"] ) @@ -329,13 +329,9 @@ on-run-end: The `snowplow_delete_from_manifest` macro is called to remove models from manifest if specified using the `models_to_remove` variable, in case of a partial or full refresh. The `snowplow_incremental_post_hook` is used to update the manifest table with the timestamp of the last event consumed successfully for each Snowplow incremental model - make sure to change the `base_events_this_run_table_name` if you used a different table name. 
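As an illustration of the `models_to_remove` variable mentioned above (the model name here is hypothetical), it can be supplied at run time so the affected model is dropped from the manifest and replays from the start of its data on the next run:

```yml
# dbt_project.yml — hypothetical model name; alternatively pass on the CLI:
#   dbt run --vars '{models_to_remove: my_package_page_views}'
vars:
  models_to_remove: ['my_package_page_views']
```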
-:::tip +:::tip -<<<<<<< HEAD -The `package_name` variable here is not necessarily the name of your project (although it keeps things simple to make it the same), instead it is what is used to identify your tagged incremental models as they should be tagged with `_incremental`. -======= -The `package_name` variable here is not the name of your project, instead it is what is used to identify your tagged incremental models as they should be tagged with `_incremental`. ->>>>>>> db4c1fb3 (Update utils dbt docs (#622)) +The `package_name` variable here is not necessarily the name of your project (although it keeps things simple to make it the same), instead it is what is used to identify your tagged incremental models as they should be tagged with `_incremental`. ::: diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/images/dbt_packages-dark.drawio.png b/docs/modeling-your-data/modeling-your-data-with-dbt/images/dbt_packages-dark.drawio.png index 640777285a..e2fa80e72e 100644 Binary files a/docs/modeling-your-data/modeling-your-data-with-dbt/images/dbt_packages-dark.drawio.png and b/docs/modeling-your-data/modeling-your-data-with-dbt/images/dbt_packages-dark.drawio.png differ diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/images/dbt_packages-light.drawio.png b/docs/modeling-your-data/modeling-your-data-with-dbt/images/dbt_packages-light.drawio.png index 7468c43a8a..540e52404f 100644 Binary files a/docs/modeling-your-data/modeling-your-data-with-dbt/images/dbt_packages-light.drawio.png and b/docs/modeling-your-data/modeling-your-data-with-dbt/images/dbt_packages-light.drawio.png differ diff --git a/docs/modeling-your-data/modeling-your-data-with-dbt/index.md b/docs/modeling-your-data/modeling-your-data-with-dbt/index.md index 238d1b6185..d78f81c39a 100644 --- a/docs/modeling-your-data/modeling-your-data-with-dbt/index.md +++ b/docs/modeling-your-data/modeling-your-data-with-dbt/index.md @@ -36,7 +36,7 @@ For Snowplow BDP customers, dbt 
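To make the tagging convention concrete, a hypothetical incremental model in a project whose `package_name` is `my_pkg` would carry the matching `my_pkg_incremental` tag (all names below are illustrative, not part of the package):

```jinja2
-- models/my_pkg_sessions.sql (hypothetical model; names are illustrative)
-- The 'my_pkg_incremental' tag is what the package_name variable matches on.
{{ config(
    materialized='incremental',
    unique_key='session_id',
    tags=['my_pkg_incremental']
) }}

select * from {{ ref('my_pkg_base_events_this_run') }}
```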
projects can be configured and scheduled in the Our dbt packages come with powerful built-in features such as an [optimization to the incremental materialization](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-incremental-materialization/index.md) to save you cost on warehouse compute resources compared to the standard method, a custom [incremental logic](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-advanced-usage/dbt-incremental-logic/index.md) to ensure we process just the required data for each run and keep your models in sync, plus the ability to build your own custom models using both of these! -There are 5 core snowplow dbt packages: +There are 6 core snowplow dbt packages: - [Snowplow Unified](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-unified-data-model/index.md) ([dbt model docs](https://snowplow.github.io/dbt-snowplow-web/#!/overview/snowplow_unified)): for modeling your web and mobile data for page and screen views, sessions, users, and consent - [Snowplow Web](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-web-data-model/index.md) ([dbt model docs](https://snowplow.github.io/dbt-snowplow-web/#!/overview/snowplow_web)): for modeling your web data for page views, sessions, users, and consent - [Snowplow Mobile](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-models/dbt-mobile-data-model/index.md) ([dbt model docs](https://snowplow.github.io/dbt-snowplow-mobile/#!/overview/snowplow_mobile)): for modeling your mobile app data for screen views, sessions, users, and crashes diff --git a/docs/reusable/data-modeling/app-errors/_index.md b/docs/reusable/data-modeling/app-errors/_index.md new file mode 100644 index 0000000000..93f2e48a0b --- /dev/null +++ b/docs/reusable/data-modeling/app-errors/_index.md @@ -0,0 +1,15 @@ +```mdx-code-block +import ReactMarkdown from 'react-markdown'; +import CodeBlock from '@theme/CodeBlock'; +``` + +### App Errors + +Assuming your 
tracker is capturing `application_error` events, the module can be enabled by configuring the `dbt_project.yml` file: + +{` +vars: + snowplow_${props.packageName}: + ${props.variable}: true + `} + diff --git a/docs/reusable/data-modeling/config-generator/_index.md b/docs/reusable/data-modeling/config-generator/_index.md new file mode 100644 index 0000000000..59749fe54f --- /dev/null +++ b/docs/reusable/data-modeling/config-generator/_index.md @@ -0,0 +1,3 @@ +Over time you may need to customise the package further. To help you generate the project variables code block you should put in your `dbt_project.yml` to overwrite the package defaults, we have provided a list of input fields with explanations in the expandable text blocks below. Based on the inputs given you can see the code being generated gradually. + +Any values not specified will use their default values from the package, therefore it is enough to modify the variables that fit your business needs. If your selections do not generate anything it means you specified default values in which case there is no need to overwrite them on your project level. diff --git a/docs/reusable/data-modeling/consent/_index.md b/docs/reusable/data-modeling/consent/_index.md new file mode 100644 index 0000000000..12553d8f03 --- /dev/null +++ b/docs/reusable/data-modeling/consent/_index.md @@ -0,0 +1,51 @@ +```mdx-code-block +import ReactMarkdown from 'react-markdown'; +import CodeBlock from '@theme/CodeBlock'; +``` + +# Consent Tracking Custom Module + + + +It transforms raw `consent_preferences` and `cmp_visible` event data into derived tables for easier querying. These events are generated by the **Enhanced Consent plugin** of the [JavaScript tracker](/docs/collecting-data/collecting-from-own-applications/javascript-trackers/index.md). 
+ 

:::info Important
For the incremental logic to work within the module you **must** use at least `RDB Loader v4.0.0`, as the custom module relies on the additional `load_tstamp` field for dbt native incrementalisation.

Whenever a new consent version is added to be tracked, the model expects an `allow_all` event in order to attribute the events to the full list of latest consent scopes. It is advisable to send a test event of that kind straight after deployment so that the model can process the data accurately.
:::

## Overview

This custom module consists of a series of dbt models which produce the following aggregated models from the raw consent tracking events:



## Operation



### Enable the module

You can enable the custom module through the `snowplow__enable_consent` variable, most conveniently set in your `dbt_project.yml` file:

{`
vars:
  snowplow_${props.packageName}:
    snowplow__enable_consent: true
  `}


### Run the module



diff --git a/docs/reusable/data-modeling/core-web-vitals/_index.md b/docs/reusable/data-modeling/core-web-vitals/_index.md
new file mode 100644
index 0000000000..8b14a2cd11
--- /dev/null
+++ b/docs/reusable/data-modeling/core-web-vitals/_index.md
@@ -0,0 +1,69 @@
+```mdx-code-block
+import ReactMarkdown from 'react-markdown';
+import CodeBlock from '@theme/CodeBlock';
+```
# Core Web Vitals Custom Module




## Prerequisites

In order to use this module you need to:

1. Track core web vitals using the web `Snowplow Web Vitals plugin`, which populates the column/table `unstruct_event_com_snowplowanalytics_snowplow_web_vitals_1`
2. Have the [yauaa enrichment](/docs/enriching-your-data/available-enrichments/yauaa-enrichment/index.md) enabled (for device type and device class), which populates `contexts_com_snowplowanalytics_snowplow_yauaa_context_1`
3.
(Ideally, but not necessarily) Have the [spiders and bots](/docs/enriching-your-data/available-enrichments/iab-enrichment/index.md) enrichment enabled, which populates `contexts_com_iab_snowplow_spiders_and_robots_1`

## Overview

This custom module consists of a series of dbt models which produce the following aggregated models from the raw web vitals events:



## Operation




### Enable the module

You can enable the custom module through the `snowplow__enable_cwv` variable in your `dbt_project.yml` file:

{`
vars:
  snowplow_${props.packageName}:
    snowplow__enable_cwv: true
  `}


### Override the module specific macros

:::tip

For information about overriding our macros, see [here](/docs/modeling-your-data/modeling-your-data-with-dbt/dbt-operation/macros-and-keys/index.md#overriding-macros)

:::

- The `core_web_vital_page_groups()` (source) macro is used to let the user classify their URLs into specific page groups. It returns the SQL that provides the classification (expected in the form of case when statements).

- The `core_web_vital_results_query()` (source) macro is used to let the user define the thresholds to be applied to the measurements. It returns the SQL that provides the evaluation logic based on user-defined thresholds (expected in the form of case when statements).

Please make sure you set the results you would like the measurements to pass to **`good`**, or align them with the `core_web_vital_pass_query()` macro. 
+ +- The `core_web_vital_pass_query()` (source) + +```sql +case when lcp_result = 'good' and fid_result = 'good' and cls_result = 'good' then 1 else 0 end passed +``` + +### Run the module + + + diff --git a/src/components/JsonSchemaValidator/dbtUnified.js b/src/components/JsonSchemaValidator/dbtUnified.js index 3ab12069ba..0e489a7240 100644 --- a/src/components/JsonSchemaValidator/dbtUnified.js +++ b/src/components/JsonSchemaValidator/dbtUnified.js @@ -225,7 +225,7 @@ export const dbtSnowplowUnifiedConfigSchema = { type: 'boolean', title: 'Enable Geolocation Context', }, - snowplow__enable_app_context: { + snowplow__enable_application_context: { type: 'boolean', title: 'Enable App Context', }, @@ -233,7 +233,7 @@ export const dbtSnowplowUnifiedConfigSchema = { type: 'boolean', title: 'Enable Screen Context', }, - snowplow__enable_app_error_event: { + snowplow__enable_app_errors: { type: 'boolean', title: 'Enable App Error Context', }, @@ -274,11 +274,11 @@ export const dbtSnowplowUnifiedConfigSchema = { type: 'string', title: '(Redshift) YAUAA Context Table', }, - snowplow__consent_cmp_visible: { + snowplow__cmp_visible_events: { type: 'string', title: '(Redshift) CMP Visible Context Table', }, - snowplow__consent_preferences: { + snowplow__consent_preferences_events: { type: 'string', title: '(Redshift) Consent Preferences Context Table', }, @@ -316,8 +316,9 @@ export const dbtSnowplowUnifiedConfigSchema = { type: 'object', title: "Identifier", properties: { - schema: { type: 'string' }, // TODO: add regex here to make valid context/unstruct or atomic? - field: { type: 'string' } // TODO: add regex here to make valid SQL name? + schema: { type: 'string', description: 'The schema name of your events table, atomic in most use cases, alternatively for sdes/contexts this should instead be the name of the field itself' }, // TODO: add regex here to make valid context/unstruct or atomic? 
+ field: { type: 'string', description: 'The name of the field to use as session identifier, alternatively, in case of sdes/contexts it is the name of the element that refers to the field to be extracted' } // TODO: add regex here to make valid SQL name? + }, }, required: ['schema', 'field'], additionalProperties: false @@ -330,7 +331,8 @@ export const dbtSnowplowUnifiedConfigSchema = { }, snowplow__session_timestamp: { type: 'string', - title: 'Timestamp used for incremental processing, should be your partition field', + title: 'Session Timestamp', + description: 'Timestamp used for incremental processing, should be your partition field' }, snowplow__user_identifiers: { type: 'string', @@ -342,8 +344,8 @@ export const dbtSnowplowUnifiedConfigSchema = { type: 'object', title: "Identifier", properties: { - schema: { type: 'string' }, // TODO: add regex here to make valid context/unstruct or atomic? - field: { type: 'string' } // TODO: add regex here to make valid SQL name? + schema: { type: 'string', description: 'The schema name of your events table, atomic in most use cases, alternatively for sdes/contexts this should instead be the name of the field itself' }, // TODO: add regex here to make valid context/unstruct or atomic? + field: { type: 'string', description: 'The name of the field to use as user identifier, alternatively, in case of sdes/contexts it is the name of the element that refers to the field to be extracted' } // TODO: add regex here to make valid SQL name? 
}, required: ['schema', 'field'], additionalProperties: false @@ -353,11 +355,13 @@ export const dbtSnowplowUnifiedConfigSchema = { snowplow__user_sql: { type: 'string', - title: 'SQL for your user identifier', + title: 'User sql', + description: 'SQL for your user identifier' }, snowplow__user_stitching_id: { type: 'string', - title: 'Field used when stitching together users', + title: 'User Stitching Id', + description: 'Field used when stitching together users' }, diff --git a/src/components/JsonSchemaValidator/dbtWeb.js b/src/components/JsonSchemaValidator/dbtWeb.js index 5dddd67b18..6c7f7da029 100644 --- a/src/components/JsonSchemaValidator/dbtWeb.js +++ b/src/components/JsonSchemaValidator/dbtWeb.js @@ -292,8 +292,8 @@ export const dbtSnowplowWebConfigSchema = { type: 'object', title: "Identifier", properties: { - schema: { type: 'string' }, // TODO: add regex here to make valid context/unstruct or atomic? - field: { type: 'string' } // TODO: add regex here to make valid SQL name? + schema: { type: 'string', description: 'The schema name of your events table, atomic in most use cases, alternatively for sdes/contexts this should instead be the name of the field itself' }, // TODO: add regex here to make valid context/unstruct or atomic? + field: { type: 'string', description: 'The name of the field to use as session identifier, alternatively, in case of sdes/contexts it is the name of the element that refers to the field to be extracted' } // TODO: add regex here to make valid SQL name? 
}, required: ['schema', 'field'], additionalProperties: false @@ -306,7 +306,8 @@ export const dbtSnowplowWebConfigSchema = { }, snowplow__session_timestamp: { type: 'string', - title: 'Timestamp used for incremental processing, should be your partition field', + title: 'Session Timestamp', + description: 'Timestamp used for incremental processing, should be your partition field' }, snowplow__user_identifiers: { type: 'string', @@ -318,8 +319,8 @@ export const dbtSnowplowWebConfigSchema = { type: 'object', title: "Identifier", properties: { - schema: { type: 'string' }, // TODO: add regex here to make valid context/unstruct or atomic? - field: { type: 'string' } // TODO: add regex here to make valid SQL name? + schema: { type: 'string', description: 'The schema name of your events table, atomic in most use cases, alternatively for sdes/contexts this should instead be the name of the field itself' }, // TODO: add regex here to make valid context/unstruct or atomic? + field: { type: 'string', description: 'The name of the field to use as user identifier, alternatively, in case of sdes/contexts it is the name of the element that refers to the field to be extracted' } // TODO: add regex here to make valid SQL name? }, required: ['schema', 'field'], additionalProperties: false @@ -329,15 +330,16 @@ export const dbtSnowplowWebConfigSchema = { snowplow__user_sql: { type: 'string', - title: 'SQL for your user identifier', + title: 'User sql', + description: 'SQL for your user identifier' }, snowplow__user_stitching_id: { type: 'string', - title: 'Field used when stitching together users', + title: 'User Stitching Id', + description: 'Field used when stitching together users' }, - snowplow__page_view_passthroughs: { title: 'Page View Passthroughs', $ref: '#/definitions/passthrough_vars'