Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix materialized view creation #194

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

csalvato
Copy link

@csalvato csalvato commented Feb 5, 2025

This PR fixes an issue with the creation of materialized views in the ClickHouse adapter, specifically addressing the order of SQL clauses and engine specification.

Motivation

When using rails db:schema:load:clickhouse on a fresh DB, materialized views are not created correctly because the schema dumper and loader don't properly handle the full ClickHouse materialized view syntax.

Example of what happens:

  1. We create a materialized view in a migration and the schema dumper correctly captures the full SQL in a comment:
CREATE MATERIALIZED VIEW dev.aggregated_meter_events_mv TO dev.aggregated_meter_events ( `meter_id` String, `date` Date, `total_quantity` UInt64, `event_count` UInt64 ) AS SELECT meter_id, created_at AS date, sum(quantity) AS total_quantity, count() AS event_count FROM dev.meter_events GROUP BY meter_id, created_at
  1. But when loading the schema, it generates incorrect SQL:
CREATE MATERIALIZED VIEW aggregated_meter_events_mv AS SELECT meter_id, created_at AS date, sum(quantity) AS total_quantity, count() AS event_count FROM dev.meter_events GROUP BY meter_id, created_at

Note, the TO statement is just missing 😲

Changes

  1. Restructured the visit_TableDefinition method to properly handle different types of table/view creation:

    • Regular tables with column definitions
    • Materialized views with TO clause
    • Materialized views without TO clause
    • Regular views
  2. Fixed the order of SQL clauses for materialized views:

    • For materialized views without a TO clause, the ENGINE clause is now correctly placed before the AS clause
    • Engine is now set to Memory() for materialized views without TO clause
    • Removed engine handling from regular views
    • Simplified table options to only apply to regular tables
  3. Improved code organization and readability by:

    • Grouping related operations together
    • Adding clear comments for each section
    • Separating the logic for different types of tables/views

The changes ensure that materialized views are created with the correct SQL syntax, fixing issues where view creation would fail due to incorrect clause ordering. All tests are now passing, including specific tests for materialized view creation and dropping.

Testing

  • All 78 existing test examples are passing + added another test for this case.
  • Verified proper SQL generation for different view types

These changes make the adapter more robust and reliable when working with ClickHouse materialized views.

@csalvato csalvato force-pushed the fix-materialized-view-schema-load branch 2 times, most recently from b6bbc05 to 605fb2d Compare February 5, 2025 22:45
@csalvato csalvato force-pushed the fix-materialized-view-schema-load branch from 605fb2d to b05d2a4 Compare February 6, 2025 00:29
@@ -82,6 +82,19 @@ def assign_database_to_subquery!(subquery)
"#{current_database}.#{match[:table_name].sub('.', '')}"
end

def add_materialized_to_clause!(create_sql, options)
if !options.to
create_sql << " ENGINE = Memory()"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it stores this in the RAM?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. If the create_table definition does not include a to value but the view: materialized k/v pair is set so that we need to make a to clauses, we will default to storing the data in memory (since we don't know what tables we can use to store it, and Null() engine wouldn't make sense).

@csalvato
Copy link
Author

csalvato commented Feb 6, 2025

@PNixx What's the process for reviewing and merging here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants