
[BUG] Does spark/rapids really have the GPU-aware task scheduling ability? #5391

Answered by tgravescs
JustPlay asked this question in General

This is a Spark scheduling question, not a spark-rapids plugin question.
I'm assuming you are allowing more than one task to run on the GPU (i.e., spark.task.resource.gpu.amount = 1/24). If so, Spark is doing just fine.
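For reference, sharing one GPU among many concurrent tasks is configured through Spark's resource scheduling settings. Here is a minimal sketch; the config keys and the RAPIDS plugin class are standard, but the app name and the exact amounts are illustrative:

```scala
import org.apache.spark.sql.SparkSession

// Sketch: one GPU per executor, shared by up to 24 concurrent tasks.
// Spark derives GPU slots from executor amount / task amount, so a
// task amount of roughly 1/24 lets 24 tasks share the same GPU.
val spark = SparkSession.builder()
  .appName("gpu-sharing-sketch")
  .config("spark.executor.resource.gpu.amount", "1")
  .config("spark.task.resource.gpu.amount", "0.0416") // ~= 1/24
  .config("spark.plugins", "com.nvidia.spark.SQLPlugin") // RAPIDS Accelerator
  .getOrCreate()
```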

More than likely the reason is locality. Spark uses locality to decide where to place tasks, based on where it thinks execution will be most efficient: on the node where most of the data is local, so you don't have to transfer as much over the network.
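As background on that behavior, the scheduler's locality preference is governed by Spark's spark.locality.wait setting, which is how long it holds a task for a data-local slot before falling back to a less-local one. A minimal sketch, with an illustrative value:

```scala
import org.apache.spark.sql.SparkSession

// Sketch: lowering spark.locality.wait (default 3s) makes the scheduler
// fall back to any free executor sooner instead of queuing tasks on the
// node that holds the data; tasks spread out, at the cost of more
// network traffic.
val spark = SparkSession.builder()
  .appName("locality-sketch")
  .config("spark.locality.wait", "0s")
  .getOrCreate()
```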

We were actually doing a little experimenting with adding an option to Spark to force it to spread tasks, but in that particular case the performance was the same or worse. You may get some efficiencies from spreading them by having more resources there, but at t…

Answer selected by sameerz
Labels: question (Further information is requested)
This discussion was converted from issue #678 on April 28, 2022 23:29.