fix(backend): parallelFor resolve upstream inputs. Fixes #11520 #11627

Merged
merged 1 commit into from
Feb 21, 2025

Conversation

zazulam
Contributor

@zazulam zazulam commented Feb 13, 2025

Description of your changes:

cc: @droctothorpe

Checklist:


Hi @zazulam. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@droctothorpe
Contributor

/ok-to-test

@zazulam
Contributor Author

zazulam commented Feb 14, 2025

/retest

@zazulam
Contributor Author

zazulam commented Feb 14, 2025

Seems like this PR is also being affected by the flaky tests. I ran the failing pipelines locally and all passed.

image

@droctothorpe
Contributor

cc @HumairAK this resolves #11520.

@zazulam
Contributor Author

zazulam commented Feb 19, 2025

Adding some details on the .after() scenario: running argo lint on the workflow file generated from the IR shows the actual error from the Argo Workflows API that was raised in this comment.

image

After removing that -loop:
image

The only diff in the workflow:
image

The example used was from the comment and also added as the test case in the samples/v2 suite:

from kfp import dsl


@dsl.component
def print_op(message: str) -> str:
    print(message)
    return message

@dsl.component
def reduce_op(message: str) -> str:
    print(message)
    return message[0]


@dsl.pipeline()
def my_pipeline():
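    # Fan out: run print_op once per item in the list.
    with dsl.ParallelFor([1, 2, 3]):
        one = print_op(message='foo')
    # Task outside the loop that depends on a task inside it; this is the
    # .after() scenario discussed above.
    two = print_op(message='bar').after(one)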
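
To make the lint failure above more concrete, here is a minimal, stdlib-only sketch of the kind of check involved: every dependency that a DAG task names must refer to a task defined in the same DAG. The task names and the `find_missing_dependencies` helper are hypothetical. It is also an assumption that the reported error came from a dependency name carrying a stray `-loop` suffix, since the screenshots are not reproduced here. This is not how argo lint or the KFP backend is actually implemented.

```python
from typing import Dict, List


def find_missing_dependencies(dag: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return, per task, the dependencies that name no task in the DAG."""
    missing: Dict[str, List[str]] = {}
    for task, deps in dag.items():
        unknown = [dep for dep in deps if dep not in dag]
        if unknown:
            missing[task] = unknown
    return missing


# Hypothetical task names, for illustration only.
broken_dag = {
    "for-loop-2": [],
    # Assumed failure mode: the dependency carries an extra "-loop" suffix,
    # so it matches no task defined in the DAG.
    "print-op-2": ["for-loop-2-loop"],
}
fixed_dag = {
    "for-loop-2": [],
    "print-op-2": ["for-loop-2"],
}

print(find_missing_dependencies(broken_dag))
print(find_missing_dependencies(fixed_dag))
```

In this sketch the first DAG reports one unresolved dependency and the second reports none, which mirrors the single-line diff described above.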

@zazulam zazulam force-pushed the parallelfor-upstream branch 2 times, most recently from 3adfc00 to bb48a9a Compare February 20, 2025 19:26
@zazulam zazulam changed the title from "fix(backend): parallelFor consume upstream inputs. Fixes #11520" to "fix(backend): parallelFor resolve upstream inputs. Fixes #11520" Feb 20, 2025
@HumairAK
Collaborator

/lgtm
/approve

Thanks for the quick turnaround on this, folks!

@HumairAK
Collaborator

@zazulam can you rebase? There are conflicts.

@HumairAK HumairAK removed the approved label Feb 21, 2025
@zazulam
Contributor Author

zazulam commented Feb 21, 2025

@HumairAK I actually started to work on implementing the backend support for dsl.Collected 😅
I just need to add some tests for artifacts and parameters, then this PR should be good to fully resolve #10050.

Signed-off-by: zazulam <m.zazula@gmail.com>
@zazulam zazulam force-pushed the parallelfor-upstream branch from bb48a9a to c8a49fc Compare February 21, 2025 15:13
@google-oss-prow google-oss-prow bot added approved and removed lgtm labels Feb 21, 2025
@zazulam
Contributor Author

zazulam commented Feb 21, 2025

@HumairAK I actually started to work on implementing the backend support for dsl.Collected 😅
I just need to add some tests for artifacts and parameters, then this PR should be good to fully resolve #10050.

I'm going to save dsl.Collected for a separate PR, as I learned from #6161 that certain Iterator classes exist in the pipelinespec (I'm not aware at the moment whether they are being used or set anywhere in the backend). My current solution does not leverage those classes to fan in the outputs, but I would rather not hold back this PR.

@HumairAK
Collaborator

HumairAK commented Feb 21, 2025

@zazulam I think a separate PR makes sense. If we can keep this one lightweight, I might be able to cherry-pick it for the 2.4.1 patch release I'll make next week, so we can address the regression for Kubeflow 1.10. Feel free to hit me up on Slack once the PR is ready, or if you get hit with flaky tests.

@HumairAK
Collaborator

/approve


[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: droctothorpe, HumairAK

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@HumairAK
Collaborator

/lgtm

@google-oss-prow google-oss-prow bot added the lgtm label Feb 21, 2025
@google-oss-prow google-oss-prow bot merged commit f7c0616 into kubeflow:master Feb 21, 2025
34 of 35 checks passed