New features coming soon to LineaPy #715

lionsardesai · 2022-07-05T21:18:52Z


More Efficient Code Refactoring in Pipeline Building

Code refactoring under the current to_pipeline() API is suboptimal because it repeats the same block of code for related artifacts. For instance, if two artifacts, trained_model and model_predictions, are created in the same Jupyter session, trained_model's code is repeated in the pipeline node responsible for model_predictions. This behavior (i.e., re-training the model whenever new predictions are made) is undesirable because we often want to make different predictions using the same model, so there is no need to re-train it. The problem gets worse when the repeated code block (model training, in this case) involves long, expensive computation.

Hence, we have started working on a better refactoring mechanism that identifies these "common" operations and factors them out into their own pipeline nodes. This further modularizes the code, giving data engineers finer control over individual pipeline components and how often each one runs. Put differently, we are trying to provide engineers with more efficient "lego blocks" with which they can architect their systems.
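
To make the idea concrete, here is a minimal sketch of factoring shared code out of per-artifact nodes. It is not LineaPy's implementation: the line-number sets standing in for each artifact's code and the factor_nodes helper are hypothetical, chosen only to illustrate how trained_model's code could live in one node that model_predictions depends on instead of being copied into both.

```python
from typing import Dict, List, Set

# Hypothetical stand-in for each artifact's code: the session line numbers
# it needs. Lines 1-3 represent data loading and model training, and
# lines 4-5 represent making predictions.
slices: Dict[str, Set[int]] = {
    "trained_model": {1, 2, 3},
    "model_predictions": {1, 2, 3, 4, 5},
}


def factor_nodes(slices: Dict[str, Set[int]]) -> Dict[str, Dict[str, List]]:
    """Remove code already covered by another artifact and record a dependency.

    If artifact B's lines are a proper subset of artifact A's lines, A's node
    keeps only its remaining lines and lists B as an upstream dependency.
    """
    nodes: Dict[str, Dict[str, List]] = {}
    for name, lines in slices.items():
        upstream = [
            other
            for other, other_lines in slices.items()
            if other != name and other_lines < lines  # proper subset
        ]
        own = set(lines)
        for other in upstream:
            own -= slices[other]
        nodes[name] = {"lines": sorted(own), "depends_on": sorted(upstream)}
    return nodes


nodes = factor_nodes(slices)
for name, node in nodes.items():
    print(name, node)
```

In this sketch the model_predictions node keeps only lines 4-5 and declares a dependency on trained_model, so the training lines appear once. A scheduler could then re-run the prediction node without re-running training.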
