Merge pull request #30 from TidierOrg/add-snowflake-support
- Add snowflake support
- Truncates HTTP messages to take out the stack trace and show the relevant issue
- Adds docs for snowflake use and best practices
drizk1 authored Jun 20, 2024
2 parents 7b03379 + cbfa527 commit cb2c96a
Showing 11 changed files with 371 additions and 8 deletions.
5 changes: 5 additions & 0 deletions NEWS.md
@@ -1,5 +1,10 @@
# TidierDB.jl updates

## v0.1.8 - 2024-06-17
- Adds support for Snowflake SQL Rest API using OAuth token connection
- Adds Snowflake support for `connect()`
- Adds docs for Snowflake use

## v0.1.7 - 2024-06-17
- Adds support for Oracle backend via ODBC.jl connection

10 changes: 7 additions & 3 deletions Project.toml
@@ -1,7 +1,7 @@
name = "TidierDB"
uuid = "86993f9b-bbba-4084-97c5-ee15961ad48b"
authors = ["Daniel Rizk <rizk.daniel.12@gmail.com> and contributors"]
version = "0.1.7"
version = "0.1.8"

[deps]
AWS = "fbe9abb3-538b-5e4e-ba9e-bc94f4f92ebc"
@@ -11,7 +11,9 @@ ClickHouse = "82f2e89e-b495-11e9-1d9d-fb40d7cf2130"
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
DuckDB = "d2f5444f-75bc-4fdf-ac35-56f514c445e1"
GZip = "92fee26a-97fe-5a0c-ad85-20a5f3185b63"
GoogleCloud = "55e21f81-8b0a-565e-b5ad-6816892a5ee7"
HTTP = "cd3eb016-35fb-5094-929b-558a96fad6f3"
JSON3 = "0f8b85d8-7281-11e9-16c2-39a750bddbf1"
LibPQ = "194296ae-ab2e-5f79-8cd4-7183a0a5a0d1"
MacroTools = "1914dd2f-81c6-5fcd-8719-6d5c9610ff09"
@@ -28,9 +30,11 @@ ClickHouse = "0.2"
DataFrames = "1.5"
Documenter = "0.27, 1"
DuckDB = "0.10"
LibPQ = "1.17"
JSON3 = "1.1"
GoogleCloud = "0.11"
HTTP = "1.1"
JSON3 = "1.1"
GZip = "0.6"
LibPQ = "1.17"
MacroTools = "0.5"
MySQL = "1.4"
ODBC = "1.1"
1 change: 1 addition & 0 deletions README.md
@@ -21,6 +21,7 @@ The main goal of TidierDB.jl is to bring the syntax of Tidier.jl to multiple SQL
- MSSQL `set_sql_mode(:mssql)`
- Postgres `set_sql_mode(:postgres)`
- Athena `set_sql_mode(:athena)`
- Snowflake `set_sql_mode(:snowflake)`
- Google Big Query `set_sql_mode(:gbq)`
- Oracle `set_sql_mode(:oracle)`

47 changes: 47 additions & 0 deletions docs/examples/UserGuide/Snowflake.jl
@@ -0,0 +1,47 @@
# Establishing a connection with the Snowflake SQL REST API requires an OAuth token specific to the role the user will query tables with.

# ## Connecting
# A connection is established with the `connect` function as shown below. The connection requires five items, passed as strings:
# - account identifier
# - OAuth token
# - Database Name
# - Schema Name
# - Compute Warehouse name

# Three things to note:
# - Your OAuth token may expire frequently, which may require you to rerun your connection line.
# - For the time being, to properly track columns in the local metadata, you must write column names in ALL CAPS. This will likely be addressed in the future.
# - Each time `db_table` runs, it queries the database to pull the metadata. You may choose to run `db_table` once, save the result, and reuse it with `from_query()`. This will:
#     - Reduce the number of queries to your database.
#     - Allow you to build a SQL query and `@show_query` even if the OAuth token has expired. To `@collect`, you will have to reconnect and rerun `db_table` if your OAuth token has expired.

# ```julia
# ac_id = "string_id"
# token = "OAuth_token_string"
# con = connect(:snowflake, ac_id, token, "DEMODB", "PUBLIC", "COMPUTE_WH")
# # After the connection is established, you may begin querying.
# stable_table_metadata = db_table(con, "MTCARS")
# @chain from_query(stable_table_metadata) begin
#     @select(WT)
#     @mutate(TEST = WT * 2)
#     #@aside @show_query _
#     @collect
# end
# ```
# ```
# 32×2 DataFrame
#  Row │ WT       TEST
#      │ Float64  Float64
# ─────┼──────────────────
#    1 │   2.62      5.24
#    2 │   2.875     5.75
#    3 │   2.32      4.64
#    4 │   3.215     6.43
#   ⋮  │    ⋮        ⋮
#   29 │   3.17      6.34
#   30 │   2.77      5.54
#   31 │   3.57      7.14
#   32 │   2.78      5.56
#          24 rows omitted
# ```

1 change: 1 addition & 0 deletions docs/mkdocs.yml
@@ -120,5 +120,6 @@ nav:
- "Getting Started" : "examples/generated/UserGuide/getting_started.md"
- "Reusing Part of a Query" : "examples/generated/UserGuide/from_queryex.md"
- "Using Athena" : "examples/generated/UserGuide/athena.md"
- "Using Snowflake" : "examples/generated/UserGuide/Snowflake.md"
- "Writing Functions/Macros with TidierDB Chains" : "examples/generated/UserGuide/functions_pass_to_DB.md"
- "Reference" : "reference.md"
1 change: 1 addition & 0 deletions docs/src/index.md
@@ -15,6 +15,7 @@ The main goal of TidierDB.jl is to bring the syntax of Tidier.jl to multiple SQL
- MSSQL `set_sql_mode(:mssql)`
- Postgres `set_sql_mode(:postgres)`
- Athena `set_sql_mode(:athena)`
- Snowflake `set_sql_mode(:snowflake)`
- Google Big Query `set_sql_mode(:gbq)`
- Oracle `set_sql_mode(:oracle)`

4 changes: 3 additions & 1 deletion src/TBD_macros.jl
@@ -665,7 +665,9 @@ macro collect(sqlquery)
selected_columns_order = sq.metadata[sq.metadata.current_selxn .== 1, :name]
df_result = df_result[:, selected_columns_order]
elseif db isa GoogleSession{JSONCredentials}
df_result = collect_gbq(sq.db, final_query)
df_result = collect_gbq(sq.db, final_query)
elseif current_sql_mode[] == :snowflake
df_result = execute_snowflake(db, final_query)
elseif current_sql_mode[] == :athena
exe_query = Athena.start_query_execution(final_query, sq.athena_params; aws_config = db)
status = "RUNNING"
17 changes: 15 additions & 2 deletions src/TidierDB.jl
@@ -14,6 +14,9 @@ using Arrow
using AWS
using JSON3
using GoogleCloud
using HTTP
using JSON3
using GZip

@reexport using DataFrames: DataFrame
@reexport using Chain
@@ -39,6 +42,7 @@ include("parsing_mssql.jl")
include("parsing_clickhouse.jl")
include("parsing_athena.jl")
include("parsing_gbq.jl")
include("parsing_snowflake.jl")
include("parsing_oracle.jl")
include("joins_sq.jl")
include("slices_sq.jl")
@@ -71,6 +75,8 @@ function expr_to_sql(expr, sq; from_summarize::Bool = false)
return expr_to_sql_gbq(expr, sq; from_summarize=from_summarize)
elseif current_sql_mode[] == :oracle
return expr_to_sql_oracle(expr, sq; from_summarize=from_summarize)
elseif current_sql_mode[] == :snowflake
return expr_to_sql_snowflake(expr, sq; from_summarize=from_summarize)
else
error("Unsupported SQL mode: $(current_sql_mode[])")
end
@@ -249,14 +255,21 @@ function db_table(db, table, athena_params::Any=nothing)
table_name = string(table)
metadata = if current_sql_mode[] == :lite
get_table_metadata(db, table_name)
elseif current_sql_mode[] == :postgres || current_sql_mode[] == :duckdb || current_sql_mode[] == :mysql || current_sql_mode[] == :mssql || current_sql_mode[] == :clickhouse || current_sql_mode[] == :gbq || current_sql_mode[] == :oracle
elseif current_sql_mode[] in [:postgres, :duckdb, :mysql, :mssql, :clickhouse, :gbq, :oracle]
get_table_metadata(db, table_name)
elseif current_sql_mode[] == :athena
get_table_metadata_athena(db, table_name, athena_params)
elseif current_sql_mode[] == :snowflake
get_table_metadata(db, table_name)
else
error("Unsupported SQL mode: $(current_sql_mode[])")
end
return SQLQuery(from=table_name, metadata=metadata, db=db, athena_params=athena_params)
formatted_table_name = if current_sql_mode[] == :snowflake
"$(db.database).$(db.schema).$table_name"
else
table_name
end
return SQLQuery(from=formatted_table_name, metadata=metadata, db=db, athena_params=athena_params)
end

"""
2 changes: 2 additions & 0 deletions src/docstrings.jl
@@ -984,6 +984,8 @@ This function establishes a database connection based on the specified backend a
# conn = connect(:lite)
# Connect to Google Big Query
# conn = connect(:gbq, "json_user_key_path", "project_id")
# Connect to Snowflake
# conn = connect(:snowflake, "ac_id", "token", "Database_name", "Schema_name", "warehouse_name")
# Connect to DuckDB
julia> db = connect(:duckdb)
DuckDB.Connection(":memory:")