Update lockfile + fixes for latest package versions #2514

Merged
ADBond merged 8 commits into master from maint/deps
Nov 21, 2024

ADBond
Contributor

@ADBond ADBond commented Nov 20, 2024

Upgrade packages in lockfiles, and add relevant fixes.

Splink changes:

  • Change the default duckdb timestamp format to assume UTC rather than an arbitrary time zone, as the latter only worked by accident (duckdb doesn't accept 'Z' as a timezone specifier for UTC, so the existing timestamp formats we used in tests did not work). This shouldn't affect functionality for timestamps written in the ISO-8601 UTC form, but anyone using other time zones may need to specify explicit timestamp formats rather than relying on the defaults.
  • More direct inheritance in the custom Spark sqlglot dialect, which prevents the size function from being incorrectly transpiled in more recent package versions
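
As a rough illustration of the timestamp point above, here is a minimal sketch using Python's standard-library `datetime` rather than duckdb itself. The format string and the helper names (`UTC_LITERAL_Z_FORMAT`, `parse_utc_timestamp`, `needs_explicit_format`) are hypothetical and are not taken from Splink; the sketch only shows the idea of matching 'Z' as a literal character and assuming UTC, which means strings carrying any other offset no longer match the default.

```python
from datetime import datetime, timezone

# Hypothetical format string for illustration only (not Splink's default).
# The trailing 'Z' is matched as a plain character, not as a timezone directive.
UTC_LITERAL_Z_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_utc_timestamp(value: str) -> datetime:
    """Parse a string ending in a literal 'Z' and label the result as UTC."""
    naive = datetime.strptime(value, UTC_LITERAL_Z_FORMAT)
    return naive.replace(tzinfo=timezone.utc)


def needs_explicit_format(value: str) -> bool:
    """Return True when the value does not match the literal-'Z' default."""
    try:
        parse_utc_timestamp(value)
    except ValueError:
        return True
    return False
```

Under this scheme a value such as `2024-11-20T12:34:56Z` parses with the default, while `2024-11-20T12:34:56+01:00` does not and would need a format supplied by the caller.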

Test-only changes:

  • In SQL transformation tests, use substring instead of substr, as sqlglot now seems to standardise to the former
  • When we mock DatabaseAPI._sql_to_splink_dataframe(...), explicitly register the dummy frame in the connection. Previously duckdb could still access the frame because it was defined in the same module, but something has changed in how duckdb looks up Python objects, so we now register it explicitly so that the backend can find it. We need to do this on every call, because actions in the test can drop the physical table (the nature of the mock means that Splink thinks it created this table, even if it is pre-registered)
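
The "register on every call" pattern described in the last bullet can be sketched with `unittest.mock` alone. Everything below is a hypothetical stand-in: `FakeConnection`, `FakeDatabaseAPI`, `DUMMY_FRAME` and `registering_side_effect` are invented for illustration and are not Splink's or duckdb's classes. Only the method name `_sql_to_splink_dataframe` comes from the description above, and its signature here is an assumption.

```python
from unittest.mock import patch


class FakeConnection:
    """Hypothetical stand-in for a database connection with a table registry."""

    def __init__(self):
        self.tables = {}

    def register(self, name, frame):
        self.tables[name] = frame

    def drop(self, name):
        self.tables.pop(name, None)


class FakeDatabaseAPI:
    """Hypothetical stand-in for a backend API object (not Splink's DatabaseAPI)."""

    def __init__(self):
        self.con = FakeConnection()

    def _sql_to_splink_dataframe(self, sql, table_name):
        # Assumed signature; the mock below replaces this method entirely.
        raise RuntimeError("real SQL execution is not expected in this test")


DUMMY_FRAME = [{"id": 1}, {"id": 2}]


def registering_side_effect(db_api, frame):
    """Build a side effect that registers the dummy frame on every mocked call."""

    def _side_effect(sql, table_name):
        # Re-register each time: an earlier step may have dropped the table.
        db_api.con.register(table_name, frame)
        return table_name

    return _side_effect


def demo():
    db_api = FakeDatabaseAPI()
    side_effect = registering_side_effect(db_api, DUMMY_FRAME)
    with patch.object(
        db_api, "_sql_to_splink_dataframe", side_effect=side_effect
    ) as mocked:
        name = db_api._sql_to_splink_dataframe("select 1", "__dummy")
        db_api.con.drop(name)  # code under test believes it owns the table
        dropped = name not in db_api.con.tables
        db_api._sql_to_splink_dataframe("select 1", "__dummy")
        restored = db_api.con.tables.get("__dummy") is DUMMY_FRAME
    return dropped, restored, mocked.call_count
```

Registering once up front would leave the second call without a table after the drop, which is why the registration lives inside the side effect rather than in test setup.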

Closes #2491.
Closes #2511.

For extra constraints in dev dependencies see #2518.

duckdb changed how it deals with timezones, and doesn't recognise 'Z' (the ISO-8601 UTC zero-offset specifier) as a timezone. So instead let's just treat it as a literal for default purposes
For the sake of consistency, let's pin the dialect to duckdb as well
This ensures the dummy frame is accessible to the db_api connection, which is no longer guaranteed in duckdb
@ADBond ADBond added the dependencies (Pull requests that update a dependency file), testing, and maintenance labels Nov 20, 2024
To deal with the dependency resolver returning inconsistent pandas + numpy versions for python >= 3.10
@ADBond ADBond merged commit 9161712 into master Nov 21, 2024
25 checks passed
@ADBond ADBond deleted the maint/deps branch November 21, 2024 11:43
Successfully merging this pull request may close these issues.

  • New sqlglot breaks custom Spark dialect
  • Tests fail on latest packages