New features coming soon to LineaPy #715
lionsardesai started this conversation in Show and tell
More Efficient Code Refactoring in Pipeline Building
Code refactoring under the current `to_pipeline()` API is suboptimal because it repeats the same block of code for related artifacts. For instance, if two artifacts, `trained_model` and `model_predictions`, are created in the same Jupyter session, the code for `trained_model` is repeated in the pipeline node responsible for `model_predictions`. This behavior (i.e., re-training the model whenever we make new predictions) is undesirable because we often want to make different predictions with the same model, so there is no need to re-train it. It becomes an even bigger problem if the repeated code block (model training, in this case) involves long, expensive computation.

Hence, we have started working on a better refactoring mechanism that can identify these "common" operations and factor them out as their own pipeline nodes. This way, we can further modularize the code, giving data engineers finer control over different pipeline components and their execution frequencies. Put differently, we are trying to provide engineers with more efficient "lego blocks" with which they can architect their systems.
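To make the idea concrete, here is a minimal, self-contained sketch of factoring shared code out of per-artifact code slices. It is only an illustration under stated assumptions, not LineaPy's actual implementation: the function `factor_shared_code`, the representation of a slice as a list of statement strings, and the example statements are all hypothetical.

```python
from typing import Dict, List


def factor_shared_code(slices: Dict[str, List[str]]) -> Dict[str, dict]:
    """Hypothetical helper: turn per-artifact code slices into pipeline nodes.

    If artifact A's slice is fully contained in artifact B's (longer) slice,
    B's node drops those statements and records a dependency on A instead.
    """
    nodes: Dict[str, dict] = {}
    for name, stmts in slices.items():
        remaining = list(stmts)
        depends_on: List[str] = []
        for other, other_stmts in slices.items():
            if other == name or not other_stmts:
                continue
            # "other" is shared code if it is strictly smaller and fully contained.
            if len(other_stmts) < len(stmts) and all(s in stmts for s in other_stmts):
                depends_on.append(other)
                remaining = [s for s in remaining if s not in other_stmts]
        nodes[name] = {"code": remaining, "depends_on": depends_on}
    return nodes


# Assumed example: both artifacts come from the same session, so the slice for
# model_predictions also contains the (expensive) training code.
slices = {
    "trained_model": [
        "data = load_data()",
        "model = train(data)",
    ],
    "model_predictions": [
        "data = load_data()",
        "model = train(data)",
        "predictions = model.predict(data)",
    ],
}

nodes = factor_shared_code(slices)
for name, node in nodes.items():
    print(name, node)
```

In this sketch the `model_predictions` node keeps only the prediction statement and depends on `trained_model`, so training would not need to run again for each new prediction. A real refactoring mechanism would also have to pass upstream variables (here `data` and `model`) into downstream nodes, which this sketch leaves out.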