Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

153-End to End build test. #266

Merged
merged 30 commits into from
Nov 26, 2024
Merged

153-End to End build test. #266

merged 30 commits into from
Nov 26, 2024

Conversation

charles-turner-1
Copy link
Collaborator

Closes #153.

This contains an end to end build test, where a subset of the catalog is built, and queries executed against it.

  • All queries on the subset catalog have been tested for correctness against the production catalog.
  • I've used CMIP5 & a single OM2 experiment to test against.
  • I've added a few minor refactors to the clip.py module to make testing easier.

@rbeucher can you link me to the Gadi SSH workflows you showed me the other day & I'll create a trigger for this based off those.

Copy link

codecov bot commented Nov 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.61%. Comparing base (6092573) to head (6d543fd).
Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #266      +/-   ##
==========================================
+ Coverage   97.90%   98.61%   +0.71%     
==========================================
  Files          11       11              
  Lines         811     1010     +199     
==========================================
+ Hits          794      996     +202     
+ Misses         17       14       -3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -87,18 +94,19 @@ def _check_build_args(args_list):

if len(names) != len(set(names)):
seen = set()
dupes = [name for name in names if name in seen or seen.add(name)]
dupes = [name for name in names if name in seen or seen.add(name)] # type: ignore
# seen.add(name) returns None & so is always Falsey - so what is it doing?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please tell me Falsey is a British programming idiom :P

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seriously though, I think the or is being used as a sneaky way to avoid an if else statement to add things to the "I've now seen this name" set (seen).

It's going in order through the names list. The or is evaluated left to right, so:

  • If the name has been seen earlier in the list (i.e., name in seen == True), then it gets added to dupes. The seen.add(name) is never reached.
  • If name in seen == False, then it moves on to the second part of the or statement, which adds name to the seen set. You're right that it's always False, but that's the point - it's False on this pass, but now that it's been added to the seen set, if the same name recurs on a future pass, it is now known to be a duplicate.

(Interestingly, False or None evaluates to None, not False, but for this list comprehension I'm assuming anything not True is close enough to False to not care.)

In [13]: (False or None) is None
Out[13]: True

In [14]: None or False
Out[14]: False

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if a duplicated is repeated N times, then it will appear (N-1) times in the error message below - not sure if that's a bug or feature.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Funnily enough, I think Falsey is pretty common worldwide - just googled it and it looks like it mostly crops up in javascript land though.

More importantly though, I had completely forgotten that the or will get executed left -> right & that seen.add() will still evaluate, even though it returns None.

I think the False or None == None and None or False == False thing is the result of the same trick - None and False are both falsey, so we wind up getting the second value in either case. Makes perfect sense, but completely counterintuitive.

Either way, I'll remove the unnecessary comment & see if I can strip it down to 1 error message, not N-1.

e2e/conftest.py Outdated
Comment on lines 16 to 18
Get the XFAILS environment variable. We use a default of 1, indicating we expect
to add xfail marker to `test_parse_access_ncfile[AccessOm2Builder-access-om2/output000/ocean/ocean_grid.nc-expected0-True]`
unless specified.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably should be updated to reference which test a value of 1 will point to.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good spot

e2e/conftest.py Outdated
"""
This function is called by pytest to modify the items collected during test
collection. We use it here to mark the xfail tests in
test_builders::test_parse_access_ncfile when we check the file contents & to
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above

e2e/conftest.py Outdated
if (
item.name
in (
"test_parse_access_ncfile[AccessOm2Builder-access-om2/output000/ocean/ocean_grid.nc-expected0-True]",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above

Return the current catalog as an intake catalog.
"""
metacat = intake.cat.access_nri
yield metacat
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the idea behind yielding here instead of returning? (Not an issue, just for my own edification.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, I'm not entirely sure. Pytest recommend using yield for fixtures, in particular if you might want some teardown code.

I originally wrote this using the tempfile module - I'd forgotten pytest contains its own temporary filesystem handlers - so I thought I might have needed to include some teardown code.

Given there is no teardown, I think it makes no practical difference.

e2e/test_end_to_end.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@marc-white marc-white left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the risk of causing the build time to blow out, is it worth making sure each Builder is represented in the E2E test?

@marc-white
Copy link
Collaborator

A couple of other things:

  • I presume the e2e directory is outside the tests directory so that it can have a separate conftest.py. However, I think that should still work as expected if e2e is a subdirectory of tests. If that's not the case, could e2e be renamed to something like tests_e2e so we have better visibility that it's a test suite, not code?
  • I think it would be nice if the standard pytest call doesn't try to execute the E2E tests. I think there's ways that a skipif can be used to make this happen, e.g.: https://stackoverflow.com/questions/33084190/default-skip-test-unless-command-line-parameter-present-in-py-test

@rbeucher
Copy link
Member

Hi @charles-turner-1,

Regarding the Gadi deployment actions, the workflow file can be found here.

This workflow triggers the build.sh PBS script located in the admin/access-nri-intake-catalog/bin folder.

I’ve deployed the current main branch to the conda/access-med-0.10 environment. Fingers crossed, it works as expected!

@rbeucher
Copy link
Member

Just tested with a subset that only includes CMIP5 and it worked fine:
https://github.com/ACCESS-NRI/access-nri-intake-catalog/actions/runs/11945531317
We should have a way to trigger a build test from GitHub.

@charles-turner-1
Copy link
Collaborator Author

A couple of other things:

  • I presume the e2e directory is outside the tests directory so that it can have a separate conftest.py. However, I think that should still work as expected if e2e is a subdirectory of tests. If that's not the case, could e2e be renamed to something like tests_e2e so we have better visibility that it's a test suite, not code?
  • I think it would be nice if the standard pytest call doesn't try to execute the E2E tests. I think there's ways that a skipif can be used to make this happen, e.g.: https://stackoverflow.com/questions/33084190/default-skip-test-unless-command-line-parameter-present-in-py-test

I virtually always run pytest using pytest tests - hence sticking it in a separate E2E folder & adding that to the .github_worfkows/ci.yml. Funnily enough, I originally called it e2e, not test_e2e or similar to avoid pytest discovering & running it erroneously. Obviously, that didn't work (see 1.)

I think both your suggestions would make this much cleaner.

@charles-turner-1
Copy link
Collaborator Author

I've done a bit more looking to this - frustratingly, looks like there might be some complications into the pytest skipif decorator not working on fixtures. Hopefully this should have little/no impact, but it's gonna take a touch more digging to make sure we can merge the confests & not cause any issues.

@charles-turner-1 charles-turner-1 marked this pull request as ready for review November 24, 2024 22:57
Copy link
Collaborator

@marc-white marc-white left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're good to go now, merge when ready (and if you're satisfied the E2E testing is all set up correctly on the GitHub end).

steps:
- name: Checkout repository
### Latest at time of writing
uses: actions/checkout@v4.2.2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just assume we have a copy on Gadi for now. You can delete the checkout and the rsync.

@charles-turner-1 charles-turner-1 merged commit 39bea88 into main Nov 26, 2024
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

Intake Take2
3 participants