Is InferenceSession.Run thread-safe when using DirectML provider? #9441
I'm using OnnxRuntime (Microsoft.ML.OnnxRuntime 1.9) for inference with a YOLOv3 model. Since my graphics card is only about 40% loaded, I tried calling inference asynchronously to improve throughput. But when I call `InferenceSession.Run` inside `Task.Run`, this error is thrown: "Attempted to read or write protected memory. This is often an indication that other memory is corrupt." So I guess `InferenceSession.Run` is not thread-safe with the DirectML provider. If that's true, will it become thread-safe in the future? And in the meantime, is there any other way to fully utilize the graphics card? Can anyone please help? Thanks very much.
Replies: 1 comment
I found this document: https://onnxruntime.ai/docs/execution-providers/DirectML-ExecutionProvider.html. It says:
> Additionally, as the DirectML execution provider does not support parallel execution, it does not support multi-threaded calls to Run on the same inference session. That is, if an inference session is using the DirectML execution provider, only one thread may call Run at a time. Multiple threads are permitted to call Run simultaneously if they operate on different inference session objects.
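So one workaround is to keep a single session but serialize the `Run` calls yourself, so that async callers queue up instead of entering `Run` concurrently. A minimal sketch using `SemaphoreSlim` (the model path and input name here are placeholders, and `AppendExecutionProvider_DML` assumes you reference the Microsoft.ML.OnnxRuntime.DirectML package):

```csharp
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

class SerializedDmlInference
{
    // One shared session; Run calls are serialized because the DirectML EP
    // does not allow concurrent Run on the same session.
    private readonly InferenceSession _session;
    private readonly SemaphoreSlim _runLock = new SemaphoreSlim(1, 1);

    public SerializedDmlInference(string modelPath)
    {
        var options = new SessionOptions();
        options.AppendExecutionProvider_DML(0); // DirectML on device 0
        _session = new InferenceSession(modelPath, options);
    }

    public async Task<float[]> RunAsync(string inputName, DenseTensor<float> input)
    {
        await _runLock.WaitAsync(); // only one Run in flight at a time
        try
        {
            using var results = _session.Run(new[]
            {
                NamedOnnxValue.CreateFromTensor(inputName, input)
            });
            return results.First().AsEnumerable<float>().ToArray();
        }
        finally
        {
            _runLock.Release();
        }
    }
}
```

Note this won't raise GPU utilization by itself; it only makes concurrent callers safe. The pre/post-processing of each request can still overlap with another request's `Run`, which is usually where the async version gains time.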
> **Performance Tuning**
> The DirectML execution provider works most efficiently when tensor shapes are known at the time a session is created. This provides a few performance benefits: 1) Because constant foldi…
>
> However, if a model input contains a free dimension (such as for batch size), steps must be taken to retain the above performance benefits. In this case, there are three options: Edit the model to replace an input's free dimension (specified through ONNX using "dim_param") with a fixed size (specified through ONNX using "dim_value").
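The other thing the quoted docs allow is true parallelism with *separate* session objects: multiple threads may call `Run` simultaneously as long as each uses its own `InferenceSession`. A sketch of that pattern (model filename is a placeholder, and each extra session duplicates the model's memory footprint on the device):

```csharp
using System.Linq;
using System.Threading.Tasks;
using Microsoft.ML.OnnxRuntime;

class PerWorkerSessions
{
    static void Main()
    {
        const int workers = 2;

        // One session per worker: the DirectML EP permits concurrent Run
        // calls only across distinct InferenceSession objects.
        var tasks = Enumerable.Range(0, workers).Select(i => Task.Run(() =>
        {
            using var options = new SessionOptions();
            options.AppendExecutionProvider_DML(0); // same GPU, separate session
            using var session = new InferenceSession("yolov3.onnx", options);

            // ... feed this worker's share of frames through session.Run(...)
        })).ToArray();

        Task.WaitAll(tasks);
    }
}
```

Whether this actually helps depends on the model: if a single `Run` already saturates the GPU, a second session just contends for the same hardware.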