High provision time on Vertex AI pipelines #127
Replies: 3 comments 2 replies
-
We have been experiencing the same behaviour with Vertex AI recently. There is not much we can do on our side, as it's a cloud provider issue 🤷🏻‍♂️
-
Yeah, I also thought about grouping nodes, but that would be quite limiting. There are persistent resources (https://cloud.google.com/vertex-ai/docs/training/persistent-resource-train), which supposedly do the job, provided you create the resources manually and delete them after the pipeline run. The beta CLI exposes this as a parameter on CustomJob, which is the same type of job created from a PipelineJob (I could be wrong), and the beta SDK has the same on the CustomJob class, so maybe there is a way to pass that parameter down.
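For reference, a rough sketch of what passing that down could look like with the Python SDK. All names here (project, image, resource id) are placeholders, and whether `run()` actually accepts `persistent_resource_id` depends on the beta SDK version, so treat this as an assumption rather than a confirmed API:

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Placeholder worker pool spec for a single Kedro node running in a container.
worker_pool_specs = [
    {
        "machine_spec": {"machine_type": "n1-standard-4"},
        "replica_count": 1,
        "container_spec": {
            "image_uri": "gcr.io/my-project/kedro-image:latest",
            "command": ["kedro", "run", "--nodes", "preprocess_node"],
        },
    }
]

job = aiplatform.CustomJob(
    display_name="kedro-node-on-persistent-resource",
    worker_pool_specs=worker_pool_specs,
)

# Assumes the persistent resource was created beforehand (e.g. with
# `gcloud beta ai persistent-resources create ...`) and is deleted after the run.
# Whether run() takes persistent_resource_id depends on the installed (beta) SDK.
job.run(persistent_resource_id="my-persistent-resource")
```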
-
Then a node shouldn't be a simple function, right? In the Kedro documentation, the basic example for a node is:
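(filling in the snippet from memory, so the exact example in the docs may differ slightly:)

```python
from kedro.pipeline import node


def return_greeting():
    return "Hello"


return_greeting_node = node(func=return_greeting, inputs=None, outputs="my_salutation")
```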
Shouldn't the docs separate the work into more comprehensive steps? Maybe all data preprocessing in a single node and training in another, as in the sketch below. Would that improve performance?
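Something along these lines, where node boundaries follow coarse stages rather than individual functions (the dataset names and toy logic are just placeholders):

```python
import pandas as pd

from kedro.pipeline import node, pipeline


def preprocess(raw: pd.DataFrame) -> pd.DataFrame:
    # All cleaning / feature steps collapsed into one node, so Vertex AI
    # only provisions one pod for the whole preprocessing stage.
    df = raw.dropna().reset_index(drop=True)
    df["row_id"] = range(len(df))  # stand-in for real feature engineering
    return df


def train(features: pd.DataFrame) -> dict:
    # Stand-in for model training; returns something the catalog can persist.
    return {"n_rows": len(features)}


grouped_pipeline = pipeline(
    [
        node(preprocess, inputs="raw_data", outputs="features", name="preprocess"),
        node(train, inputs="features", outputs="model_metrics", name="train"),
    ]
)
```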
-
Hi,
I have a pipeline that contains about 40 nodes. Before deploying it from Kedro, I tried a basic pipeline and found that a pod takes about 2 minutes to be provisioned. So 40 × 2 min = 80 minutes of waiting for the pipeline, assuming no parallelization.
That's a bit too much for a task that takes 5 minutes in total when run locally.
Is there a way to reuse the same provisioned pod that I can configure, or are you using some other Kedro-related workaround to avoid this provisioning-time disaster?
Thanks