Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

view takes 20 times as long with annotations #1

Open
miguelcleon opened this issue Jan 31, 2017 · 11 comments
Open

view takes 20 times as long with annotations #1

miguelcleon opened this issue Jan 31, 2017 · 11 comments

Comments

@miguelcleon
Copy link
Member

This view https://github.com/ODM2/ODM2-performance-optimization/blob/master/timeseriesresultvaluesextwannotations.sql

takes over 20 times longer then this view https://github.com/ODM2/ODM2-performance-optimization/blob/master/timeseriesresultvaluesext.sql

I'm trying to create a view for exporting data including time series result annotations for data quality but the view with annotations is taking over 20 times as long as the one without. Any suggestions on a better approach? @smrgeoinfo maybe?

The only thing I can think to do right now is not use the left joins and add time series values with annotations as a seperate file, but I don't really want to do that.

@miguelcleon
Copy link
Member Author

miguelcleon commented Jan 31, 2017

Nevermind the long running view has a bug, I see it now. This is fixed now, for the one sample query I tried, the view with annotations only took a few dozen milliseconds longer then the other view.

@miguelcleon
Copy link
Member Author

I'm thinking that I can further improve these views by making these materialized views.

@miguelcleon
Copy link
Member Author

@emiliom @valentinedwv @horsburgh Do you guys think materialized views might be a good approach to denormalizing ODM2 for improved performance? One draw back is that the materialized views need to be recreated when there are changes to the underlaying data.

@miguelcleon
Copy link
Member Author

I tried it out with the view in timeseriesresultvaluesextwannotations.sql a test query went from 3.4 seconds to 2.1 seconds so it does seem to help.

@emiliom
Copy link
Member

emiliom commented Feb 1, 2017

@miguelcleon, I have some, but very limited experience with materialized views in postgresql, with ODM2. It was promising, but I don't remember doing any formal performance comparisons.

But more broadly, any and all optimization strategies are worth exploring, specially if they have limited side effects or are not hard to implement. Materialized views should be explored, I think. Of course, tools like odm2api probably can't take advantage of them directly if they operate on the ODM2 "raw" schema.

One caveat to keep in mind, possibly, is that it's a feature that may not exist in all 4 RDBMS we're trying to support with ODM2. I'm not sure they exist in MySQL (doubt it, but don't know), and I'd be surprised if they exist in SQLite. Then again, the topic of optimization is likely to come up to system-specific issues and solutions pretty quickly.

@valentinedwv
Copy link
Member

valentinedwv commented Feb 1, 2017 via email

@smrgeoinfo
Copy link

smrgeoinfo commented Feb 1, 2017 via email

@miguelcleon
Copy link
Member Author

miguelcleon commented Feb 1, 2017

@valentinedwv then it's a question how that table gets created, would odm2api create it? That is something that would be interesting to explore with SQLAlchemy, to see if it would work in all 4 RDBMS. Otherwise it seems like it would be a headache to come up with another standard way.

The materialized view is really just postgreSQL creating a a table from a view. So it's easy to create a flattened table once you know what your view should be. However, then the question of the other RDBMS' come into play as @emiliom suggests.

@miguelcleon
Copy link
Member Author

miguelcleon commented Feb 1, 2017

Yeah, so this is two questions, What is the definition of the flattened tables? It would be best if we can all agree on this.

Two, how do you create the flattened tables? This is probably RDBMS dependent, they would probably just be views in SQLite, for example.

@emiliom
Copy link
Member

emiliom commented Feb 2, 2017

I'm not sure how productive it is to have generic discussions about flattened tables or materialized views, in the abstract. Specific ones are probably more constructive. And for that, opening specific issues would be more helpful.

@miguelcleon
Copy link
Member Author

Ok, I've added a new issue here #5
I've also posted an additional view and changed the file structure for this repo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants