Let WorkflowLinter.refresh_report lint jobs from JobsCrawler #3732

Open
wants to merge 1 commit into
base: main

JCZuurmond
Member

@JCZuurmond JCZuurmond commented Feb 21, 2025

Changes

Let WorkflowLinter.refresh_report lint the jobs provided by JobsCrawler, so that only jobs within scope are linted.

Linked issues

Resolves #3662
Progresses #3722

Functionality

  • modified workflow linting code
  • modified existing workflow: assessment

Tests

  • modified unit tests
  • modified integration tests

@JCZuurmond JCZuurmond added step/assessment go/uc/upgrade - Assessment Step migrate/code Abstract Syntax Trees and other dark magic labels Feb 21, 2025
@JCZuurmond JCZuurmond self-assigned this Feb 21, 2025
@JCZuurmond JCZuurmond requested a review from a team as a code owner February 21, 2025 10:06

❌ 84/85 passed, 1 failed, 7 skipped, 54m1s total

❌ test_running_real_assessment_job: AssertionError: assert 'UNKNOWN' != 'UNKNOWN' (15m4.057s)
AssertionError: assert 'UNKNOWN' != 'UNKNOWN'
[gw8] linux -- Python 3.10.16 /home/runner/work/ucx/ucx/.venv/bin/python
10:08 INFO [tests.integration.conftest] Dashboard Created ucx_DdSj2_ra78b362a0: https://DATABRICKS_HOST/sql/dashboards/2b16e50d-1ce3-45f5-a972-14d91c1c3ec1
10:08 INFO [tests.integration.conftest] Dashboard Created ucx_Do9bX_ra78b362a0: https://DATABRICKS_HOST/sql/dashboards/db2cd9e2-a96d-4da9-a1f7-7c32a6bccaf1
10:08 DEBUG [databricks.labs.ucx.install] Cannot find previous installation: Path (/Users/0a330eb5-dd51-4d97-b6e4-c474356b1d5d/.jhTi/config.yml) doesn't exist.
10:08 INFO [databricks.labs.ucx.install] Please answer a couple of questions to configure Unity Catalog migration
10:08 INFO [databricks.labs.ucx.installer.hms_lineage] HMS Lineage feature creates one system table named system.hms_to_uc_migration.table_access and helps in your migration process from HMS to UC by allowing you to programmatically query HMS lineage data.
10:08 INFO [databricks.labs.ucx.install] Fetching installations...
10:08 INFO [databricks.labs.ucx.installer.policy] Creating UCX cluster policy.
10:08 DEBUG [tests.integration.conftest] Waiting for clusters to start...
10:08 DEBUG [tests.integration.conftest] Waiting for clusters to start...
10:08 INFO [databricks.labs.ucx.install] Installing UCX v0.55.1+4320250221100831
10:08 INFO [databricks.labs.ucx.install] Creating ucx schemas...
10:08 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migrate-groups-legacy
10:08 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migration-progress-experimental
10:08 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=assessment
10:08 INFO [databricks.labs.ucx.install] Creating dashboards...
10:08 DEBUG [databricks.labs.ucx.install] Reading step folder /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/progress...
10:08 DEBUG [databricks.labs.ucx.install] Reading step folder /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/assessment...
10:08 DEBUG [databricks.labs.ucx.install] Reading step folder /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/migration...
10:08 DEBUG [databricks.labs.ucx.install] Reading step folder /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/views...
10:08 INFO [databricks.labs.ucx.install] Creating dashboard in /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/progress/main...
10:08 INFO [databricks.labs.ucx.install] Creating dashboard in /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/assessment/estimates...
10:08 INFO [databricks.labs.ucx.install] Creating dashboard in /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/assessment/main...
10:08 INFO [databricks.labs.ucx.install] Creating dashboard in /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/assessment/CLOUD_ENV...
10:08 INFO [databricks.labs.ucx.install] Creating dashboard in /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/assessment/interactive...
10:08 INFO [databricks.labs.ucx.install] Creating dashboard in /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/migration/main...
10:08 INFO [databricks.labs.ucx.install] Creating dashboard in /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/migration/groups...
10:08 INFO [databricks.labs.ucx.installer.mixins] Fetching warehouse_id from a config
10:08 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migrate-tables-in-mounts-experimental
10:08 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migrate-tables
10:08 INFO [databricks.labs.ucx.installer.mixins] Fetching warehouse_id from a config
10:08 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migrate-data-reconciliation
10:08 INFO [databricks.labs.ucx.installer.mixins] Fetching warehouse_id from a config
10:08 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=scan-tables-in-mounts-experimental
10:08 INFO [databricks.labs.ucx.installer.mixins] Fetching warehouse_id from a config
10:08 INFO [databricks.labs.ucx.installer.mixins] Fetching warehouse_id from a config
10:08 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=failing
10:08 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=validate-groups-permissions
10:08 INFO [databricks.labs.ucx.installer.mixins] Fetching warehouse_id from a config
10:08 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=remove-workspace-local-backup-groups
10:08 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migrate-external-hiveserde-tables-in-place-experimental
10:08 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migrate-external-tables-ctas
10:08 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migrate-groups
10:08 INFO [databricks.labs.ucx.installer.mixins] Fetching warehouse_id from a config
10:08 INFO [databricks.labs.ucx.install] Installation completed successfully! Please refer to the https://DATABRICKS_HOST/#workspace/Users/0a330eb5-dd51-4d97-b6e4-c474356b1d5d/.jhTi/README for the next steps.
10:08 DEBUG [databricks.labs.ucx.installer.workflows] starting assessment job: https://DATABRICKS_HOST#job/395348751966813
10:08 INFO [databricks.labs.ucx.installer.workflows] Started assessment job: https://DATABRICKS_HOST#job/395348751966813/runs/201792256994719
10:08 DEBUG [databricks.labs.ucx.installer.workflows] Validating assessment workflow: https://DATABRICKS_HOST#job/395348751966813
10:08 INFO [databricks.labs.ucx.installer.workflows] Identified a run in progress waiting for run completion
10:22 DEBUG [databricks.labs.ucx.framework.crawlers] [hive_metastore.dummy_svhnf.tables] fetching tables inventory
10:22 INFO [databricks.labs.ucx.install] Deleting UCX v0.55.1+4320250221100831 from https://DATABRICKS_HOST
10:22 INFO [databricks.labs.ucx.install] Deleting inventory database dummy_svhnf
10:22 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=813304251852716, as it is no longer needed
10:22 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=1093899626074216, as it is no longer needed
10:22 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=395348751966813, as it is no longer needed
10:22 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=1058771482971452, as it is no longer needed
10:22 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=392887168575074, as it is no longer needed
10:22 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=1109542128367416, as it is no longer needed
10:22 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=526062605547614, as it is no longer needed
10:22 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=805843899301511, as it is no longer needed
10:22 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=251514237580400, as it is no longer needed
10:22 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=308868154697366, as it is no longer needed
10:22 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=992671664035079, as it is no longer needed
10:22 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=391343978401208, as it is no longer needed
10:22 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=958346974562793, as it is no longer needed
10:22 INFO [databricks.labs.ucx.install] Deleting cluster policy
10:22 INFO [databricks.labs.ucx.install] Deleting secret scope
10:22 INFO [databricks.labs.ucx.install] UnInstalling UCX complete

Running from acceptance #8364


def refresh_report(self, sql_backend: SqlBackend, inventory_database: str) -> None:
    tasks = []
    items_listed = 0
    for job in self._ws.jobs.list():
        if self._include_job_ids is not None and job.job_id not in self._include_job_ids:
Contributor

Where do we replicate the filtering capability with the Job Crawler?
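
One possible shape for this, as a rough sketch only: the crawler owns the include-list filter, and the linter iterates whatever the crawler yields. Everything below (JobInfo, FakeJobsCrawler, SketchWorkflowLinter, snapshot, jobs_to_lint) is a hypothetical stand-in written for illustration; it is not the actual ucx JobsCrawler or WorkflowLinter API, and the real filtering may live elsewhere.

```python
from __future__ import annotations

from collections.abc import Iterable, Iterator
from dataclasses import dataclass


@dataclass(frozen=True)
class JobInfo:
    """Hypothetical stand-in for a crawled job record."""

    job_id: int
    name: str


class FakeJobsCrawler:
    """Hypothetical crawler that owns the scope filter (assumption, not ucx code)."""

    def __init__(self, jobs: Iterable[JobInfo], include_job_ids: list[int] | None = None):
        self._jobs = list(jobs)
        self._include_job_ids = include_job_ids

    def snapshot(self) -> Iterator[JobInfo]:
        # Same condition as in the diff above, but applied once, inside the crawler.
        for job in self._jobs:
            if self._include_job_ids is not None and job.job_id not in self._include_job_ids:
                continue
            yield job


class SketchWorkflowLinter:
    """Hypothetical linter that no longer lists or filters jobs itself."""

    def __init__(self, jobs_crawler: FakeJobsCrawler):
        self._jobs_crawler = jobs_crawler

    def jobs_to_lint(self) -> list[int]:
        # Whatever the crawler yields is, by construction, within scope.
        return [job.job_id for job in self._jobs_crawler.snapshot()]


if __name__ == "__main__":
    crawled = [JobInfo(1, "a"), JobInfo(2, "b"), JobInfo(3, "c")]
    linter = SketchWorkflowLinter(FakeJobsCrawler(crawled, include_job_ids=[1, 3]))
    print(linter.jobs_to_lint())
```

With the filter in the crawler, every consumer of the crawled jobs sees the same scope, and the linter does not need its own include_job_ids check.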

Labels
  • migrate/code: Abstract Syntax Trees and other dark magic
  • step/assessment: go/uc/upgrade - Assessment Step
Projects
Status: Ready for Review
Development

Successfully merging this pull request may close these issues.

[TECH DEBT]: Refactor WorkflowLinter to use the JobsCrawler instead of crawling the jobs itself
2 participants