[QST] RAPIDS doesn't work if I use more than one executor per server in YARN mode. What can I do? #5393
-
What is your question?

I have two servers; each server's software configuration is as follows: Spark 3.0 + rapids-0.2 + Hadoop 3.2.1. The command I run my job with, and some of the logs, are below:

(2) Part of the driver's log:

20/09/01 17:46:50 [dispatcher-BlockManagerMaster] INFO BlockManagerMasterEndpoint: Registering block manager yq01-sys-hic-v100-box-a223-0117.yq01.baidu.com:39805 with 36.9 GiB RAM, BlockManagerId(4, yq01-sys-hic-v100-box-a223-0117.yq01.baidu.com, 39805, None)
[2020-09-01 17:46:50.957] Container exited with a non-zero exit code 1. Error file: prelaunch.err.

(3) Part of the executor's log: (several error screenshots were attached here and are not preserved)

I found the driver's error message "Spark GPU Plugin only supports 1 gpu per executor" in the rapids source code.

When I run getGpusResources.sh, the output is:

{"name": "gpu", "addresses": ["0","1","2","3","4","5","6","7"]}

My question: if I have a server with many GPU cards, will that source code always log the error "Spark GPU Plugin only supports 1 gpu per executor"?
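For context on the error above: a Spark GPU discovery script must print a single JSON object listing the addresses an executor may use, and the script shown returns all eight GPUs, which trips the one-GPU-per-executor check. A minimal sketch of a wrapper that hands out only one address is below; the `python3` filtering is an illustration, not the official spark-rapids script, and a real wrapper would have to pick a *free* GPU rather than always the first one.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper sketch: take the full discovery output and keep
# only the first GPU address, so each executor sees exactly one GPU.
FULL='{"name": "gpu", "addresses": ["0","1","2","3","4","5","6","7"]}'
echo "$FULL" | python3 -c 'import sys, json
d = json.load(sys.stdin)
d["addresses"] = d["addresses"][:1]   # report a single address
print(json.dumps(d))'
```

Running this prints {"name": "gpu", "addresses": ["0"]}, the shape Spark expects from `spark.executor.resource.gpu.discoveryScript`.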
Replies: 4 comments
-
It depends on how you are running YARN. It looks like you are not running with isolation, so you need a different method of making sure a GPU is only assigned to a single executor. See our instructions here: https://nvidia.github.io/spark-rapids/docs/get-started/getting-started.html#yarn-without-isolation Or, if you can run YARN with Docker and cgroups enabled to get isolation, that would also work. |
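To make the answer above concrete, here is a hedged sketch of how such a launch might look on YARN without isolation. The jar variables, paths, and the discovery-script name are assumptions for illustration; the `spark.plugins`, `spark.executor.resource.gpu.*`, and `spark.task.resource.gpu.amount` settings are standard Spark/spark-rapids configuration, but check the linked getting-started guide for the exact values for your versions.

```shell
# Sketch of a spark-shell launch on YARN without GPU isolation.
# getGpusResources.sh here stands in for a script that reports a
# single free GPU to each executor (see the linked docs).
$SPARK_HOME/bin/spark-shell \
  --master yarn \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.executor.resource.gpu.amount=1 \
  --conf spark.task.resource.gpu.amount=1 \
  --conf spark.executor.resource.gpu.discoveryScript=./getGpusResources.sh \
  --files ./getGpusResources.sh \
  --jars ${SPARK_RAPIDS_PLUGIN_JAR},${SPARK_CUDF_JAR}
```

The key point is `spark.executor.resource.gpu.amount=1`: the plugin in this version supports exactly one GPU per executor, so without cgroup isolation the discovery script is what keeps two executors from claiming the same device.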
-
Hi @tgravescs, it works, but I found a new question: RAPIDS works only on the master node and doesn't work on the worker nodes. My command is as follows: $SPARK_HOME/bin/spark-shell |
-
Hi @tgravescs |
-
By your last comment I think you have figured it out, so I'm going to close this. If not, please reopen and let me know what issue you are seeing. Spark should run executors on all of the YARN NodeManager nodes. |