Model edits and bug fixes #17922

Merged
merged 15 commits into from
Apr 12, 2024

@jdavcs
Member

jdavcs commented Apr 5, 2024

Builds on #17897

To do:

  • The added type hints in ee06700 exposed a bug that needs to be fixed by adding an implementation of the _serialize method to the WorkflowInvocationMessage class definition (ref). UPDATE: type-ignore for now.

Misc. post-SA20 model edits and bug fixes.
Remove pre-SA20 code (autocommit arg in session, future arg in session and engine)

A note on nullability in the database schema versus optionality in the Python app:

The nullability of a field in the database schema is derived from the type hint. So:
foo: Mapped[int] = mapped_column() will add NOT NULL to the table field definition, whereas
foo: Mapped[Optional[int]] = mapped_column() will not. Furthermore, an explicit nullable argument to mapped_column() takes precedence over this derivation. Thus, it is possible to have a mapped attribute that can contain None values in Python but requires a value when saved to the database, and vice versa. Here's an example (comments indicate nullability in the db definition):

In SQLAlchemy 1.4:

from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Foo(Base):
    __tablename__ = "foo"

    id = Column(Integer, primary_key=True)  # NOT NULL (pkey)
    data1 = Column(String)                  # null (because nullable defaults to True)
    data2 = Column(String, nullable=True)   # null
    data3 = Column(String, nullable=False)  # NOT NULL

In SQLAlchemy 2.0:

from typing import Optional
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

class Base(DeclarativeBase):
    pass

class Foo(Base):
    __tablename__ = "foo"

    id: Mapped[int] = mapped_column(primary_key=True)             # NOT NULL (pkey)
    data1: Mapped[str]                                            # NOT NULL (derived from type hint)
    data2: Mapped[Optional[str]]                                  # null (derived from type hint)
    data3: Mapped[str] = mapped_column(nullable=True)             # null
    data4: Mapped[Optional[str]] = mapped_column(nullable=True)   # null
    data5: Mapped[str] = mapped_column(nullable=False)            # NOT NULL
    data6: Mapped[Optional[str]] = mapped_column(nullable=False)  # NOT NULL

https://docs.sqlalchemy.org/en/20/orm/declarative_tables.html#mapped-column-derives-the-datatype-and-nullability-from-the-mapped-annotation
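
The precedence rule described above can be written down in a few lines of plain Python. This is only a sketch that mimics the documented behaviour for the non-key columns data1 to data6; it is not SQLAlchemy's implementation, and derive_nullable is a hypothetical helper name introduced here for illustration.

```python
from typing import Optional, get_args


def derive_nullable(annotation, nullable=None):
    """Return True if the column would allow NULL under the rule above.

    Hypothetical helper, not part of SQLAlchemy: `annotation` stands for the
    type inside Mapped[...], `nullable` for the explicit mapped_column() argument.
    """
    if nullable is not None:
        # An explicit nullable= argument takes precedence over the type hint.
        return nullable
    # Optional[X] is Union[X, None], so NoneType shows up among its arguments.
    return type(None) in get_args(annotation)


# (annotation, explicit nullable) pairs mirroring data1..data6 in the example.
columns = {
    "data1": (str, None),
    "data2": (Optional[str], None),
    "data3": (str, True),
    "data4": (Optional[str], True),
    "data5": (str, False),
    "data6": (Optional[str], False),
}

results = {name: derive_nullable(*spec) for name, spec in columns.items()}
for name, is_nullable in results.items():
    print(name, "null" if is_nullable else "NOT NULL")
```

Note that data3 and data6 are the two cases where the Python-side optionality and the database-side nullability disagree.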

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

@jdavcs added the kind/bug, kind/refactoring, and area/database labels Apr 5, 2024
@jdavcs added this to the 24.1 milestone Apr 5, 2024
@jdavcs force-pushed the dev_model_edits1 branch from 8449d13 to e4cf654 April 5, 2024 20:56
@jdavcs marked this pull request as ready for review April 5, 2024 21:15
@jdavcs mentioned this pull request Apr 10, 2024
@jdavcs requested a review from a team April 11, 2024 13:06
@jmchilton
Member

This is hard to manually inspect - any chance you generated a schema for our model before and after and checked the diff was empty or made sense where there are differences?

@@ -100,7 +100,7 @@ def _set_previous_progress(self, outputs):

workflow_invocation_step_state = model.WorkflowRequestStepState()
workflow_invocation_step_state.workflow_step_id = step_id
workflow_invocation_step_state.value = True
Member

We have to fix these column definitions. This is just a wild line of code.

Member

Please merge #17902 and follow up with proper type definitions for that column. The truth is it can be any JSON-serializable type; it is sort of up to the workflow module to process it. So maybe Any, or maybe start with bool and add a comment somewhere that it will need to be unioned with other types if we add more workflow module types. I'm not sure if conditionals use this or not. It's probably worth looking into, but maybe the typing alone would tell us.

Member Author

Will do.

@jdavcs
Member Author

jdavcs commented Apr 11, 2024

> This is hard to manually inspect - any chance you generated a schema for our model before and after and checked the diff was empty or made sense where there are differences?

It is a pain to review indeed! (and thanks!) I intentionally reviewed these model edits multiple times. I didn't run a script, but I can do that; it won't be a full model comparison (I tried doing that when moving to declarative - it's next to impossible to cover all model details), but I can compare field types and nullability - that's what's most relevant here. I'll do that before merging, to be on the safe side.

@jdavcs
Member Author

jdavcs commented Apr 12, 2024

I've compared the model definitions on dev and this branch programmatically, comparing column name, type and nullability. There is no difference.

from galaxy.model import mapping
for t in mapping.metadata.tables.values():
    for c in t.columns:
        print(f"{c.name} {c.type} {c.nullable}")
# run the above on both branches, then diff the output.
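
The "diff the output" step can also be done in Python instead of a command-line diff. The following is a minimal sketch under stated assumptions: diff_dumps is a hypothetical helper, and the sample lines are made up to match the "name type nullable" format printed by the loop above; they are not taken from the Galaxy schema.

```python
import difflib


def diff_dumps(before, after):
    """Return unified-diff lines between two schema dumps (lists of lines)."""
    return list(
        difflib.unified_diff(before, after, fromfile="dev", tofile="branch", lineterm="")
    )


# Hypothetical dumps; in practice each list would hold the lines printed
# by the loop above on one branch.
dev_dump = ["id INTEGER False", "value JSON True"]
same_dump = list(dev_dump)
changed_dump = ["id INTEGER False", "value JSON False"]

# Identical dumps produce no diff lines at all.
print(diff_dumps(dev_dump, same_dump))

# A changed nullability shows up as a removed and an added line.
for line in diff_dumps(dev_dump, changed_dump):
    print(line)
```

An empty result corresponds to the "There is no difference" outcome reported above.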

@jdavcs merged commit bf62b25 into galaxyproject:dev Apr 12, 2024
55 checks passed