Intake Take2 #153
Comments
@charles-turner-1 I think it would be good to start looking into this.
Agreed. There's some discussion regarding v2 on intake-esm here, but it looks like the plan was just to pin the version for the time being. I'll do a bit of poking into this and see what we might break if we upgrade intake to v2. At the very least we'll get an improved understanding of how hard the upgrade might be.
It looks at first glance like we might be able to upgrade from
❯ pip list | rg 'intake'
access_nri_intake 0.1.4+47.g303f786.dirty /Users/u1166368/catalog/access-nri-intake-catalog
intake 2.0.7
intake_dataframe_catalog 0.2.4+0.g725e9f3.dirty /Users/u1166368/catalog/intake-dataframe-catalog
intake-esm 2024.2.6.post16+g6ba67e1.d20241021 /Users/u1166368/catalog/intake-esm
❯ python
Python 3.12.7 | packaged by conda-forge | (main, Oct 4 2024, 15:57:01) [Clang 17.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
N.B. I'm on 3.12.7 on my local machine - I haven't tested 3.9..3.11 yet.
I'm not convinced that these tests will be comprehensive - I still want to go through and double check that we can build the catalog as before, but it's definitely a very good sign.
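For anyone repeating the check on another Python version, the `pip list | rg 'intake'` step above could be scripted along these lines. This is only a rough sketch using the standard library's `importlib.metadata`; the helper names `installed_matching` and `major_version` are made up for illustration and are not part of any of the packages listed.

```python
import re
from importlib.metadata import distributions


def installed_matching(pattern: str) -> dict[str, str]:
    """Return {distribution name: version} for installed packages whose name matches pattern."""
    regex = re.compile(pattern, re.IGNORECASE)
    found = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        # Skip distributions with missing or broken metadata.
        if name and regex.search(name):
            found[name] = dist.version
    return found


def major_version(version: str) -> int:
    """Extract the leading integer of a version string such as '2.0.7'."""
    match = re.match(r"\d+", version)
    if match is None:
        raise ValueError(f"no leading integer in version {version!r}")
    return int(match.group())


if __name__ == "__main__":
    # Roughly equivalent to: pip list | rg 'intake'
    for name, version in sorted(installed_matching("intake").items()):
        print(f"{name:30s} {version}")
```

Something like `major_version(installed_matching("^intake$")["intake"]) >= 2` could then be used to skip or select tests depending on which intake is installed, assuming intake is present in the environment.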
Further investigation shows that upgrading I'm planning to use tox to run our test suite against
@charles-turner-1 it's worth finding a way to test the catalog build process as well - the tests have pretty good coverage, but making sure everything locks together well is another matter. Then we should test that we can actually read back what's been built.
How long does a full catalogue build take? I haven't actually run a full build myself - is it going to be prohibitively expensive to implement a full e2e integration test?
The build_all.sh script suggests a full catalogue build is quite expensive. However, I can imagine a scenario where we only build a small subset of the full catalogue just to make sure things work right. Ultimately, you can't really test the full catalogue build without just doing the full catalogue build, but in the case of a major upgrade like going to Intake Take2, I think it's worth the extra step before plunging into a 3 hr, 48 CPU job.
That's great! Then there's the question of whether/how to use any of the cool new functionality in v2.
It took about 1.5 hours (on 48 cores) before I went on leave...
Yeah, I agree - running a full build is also more of a moving target than building a subset. Maybe we should think about a smoke test where we build & run some queries against a fixed & hopefully representative subset of the full catalogue then.
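To make the smoke test idea a bit more concrete, here is a minimal sketch of the shape it might take: build from a fixed subset, run a few queries, and compare the number of hits against known values. Everything in it is a stand-in. `build_subset`, `search`, the record fields and the expected counts are hypothetical and use plain dicts rather than the real access-nri-intake or intake-esm APIs; a real test would call the actual builders and catalog search instead.

```python
Record = dict[str, str]


def build_subset(sources: list[Record]) -> list[Record]:
    """Stand-in for a real subset build: here it only copies the records."""
    return [dict(src) for src in sources]


def search(catalog: list[Record], **query: str) -> list[Record]:
    """Return catalog entries whose fields equal every key/value in query."""
    return [row for row in catalog if all(row.get(k) == v for k, v in query.items())]


def smoke_test(catalog: list[Record], expectations: list[tuple[Record, int]]) -> list[str]:
    """Check each (query, expected count) pair; return a list of failure messages."""
    failures = []
    for query, expected in expectations:
        got = len(search(catalog, **query))
        if got != expected:
            failures.append(f"{query}: expected {expected} entries, got {got}")
    return failures


# Hypothetical fixed subset; a real one would point at a few small experiments on disk.
FIXED_SUBSET = [
    {"name": "experiment_a", "model": "model_x", "variable": "temp"},
    {"name": "experiment_b", "model": "model_x", "variable": "salt"},
    {"name": "experiment_c", "model": "model_y", "variable": "temp"},
]

# Queries paired with the number of entries we expect back from the fixed subset.
EXPECTATIONS = [
    ({"model": "model_x"}, 2),
    ({"variable": "temp"}, 2),
    ({"model": "model_y", "variable": "salt"}, 0),
]

catalog = build_subset(FIXED_SUBSET)
failures = smoke_test(catalog, EXPECTATIONS)
```

Because the subset is fixed, the expected counts stay stable between runs, which is the main advantage over checking a full build whose contents keep changing.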
Intake Take2 is currently under development. It is a complete rewrite of Intake that aims "to be largely backward compatible with pre-V2 Intake sources and catalogs." However, Intake-ESM, which the access-nri-intake-catalog is built on, is a somewhat unusual application of Intake. At this point, it's not clear that there will be backwards compatibility for our application. This said, some of the new features promised by Intake Take2 may allow for a newer and better Intake-ESM, but this will obviously be a lot of work.
The strategy for now is to pin to Intake v1 and keep an eye on Intake Take2 developments.