ONNX model outputs an array of null #90

sidharthg-couture · 2025-01-22T22:23:08Z

I have configured my ONNX model (based on stella_400M) to have a fixed batch size for inputs and outputs. It runs on the server, and the request returns HTTP 200.

My issue is that the entire output is an array of null values of size 1024 (the size itself is expected, since 1024 is the dimension of the model output). When I use the same ONNX model with the Python ONNX Runtime library, it works as expected.

Output from Python ONNX-Runtime

array([ 0.10638401,  0.09311324, -0.555768  , ...,  0.99843943,
        0.2404    ,  0.164792  ], shape=(1024,), dtype=float32)

Output from ONNX-Runtime on onnxruntime-server

Response Body: {'sentence_embedding': [[None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, ... None, None]
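
A null inside a JSON array of floats often stands in for a value the encoder could not represent: JSON has no literal for NaN or infinity, and some encoders write null in their place. Whether that is what happens here is only an assumption and is not confirmed anywhere in this thread. A minimal Python sketch for summarizing a parsed response follows; the helper name `summarize_embedding` and the three-value sample body are hypothetical.

```python
import json
import math


def summarize_embedding(values):
    """Count nulls, non-finite floats, and finite numbers in a flat list."""
    counts = {"null": 0, "non_finite": 0, "finite": 0}
    for v in values:
        if v is None:
            counts["null"] += 1
        elif isinstance(v, float) and not math.isfinite(v):
            counts["non_finite"] += 1
        else:
            counts["finite"] += 1
    return counts


# Hypothetical body shaped like the server response, cut down to 3 values.
body = json.loads('{"sentence_embedding": [[null, null, null]]}')
first_row = body["sentence_embedding"][0]
print(summarize_embedding(first_row))
```

Running the same helper over the Python ONNX Runtime output (converted with `.tolist()`) would show whether non-finite values appear there as well.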

I converted the Stella-400M model to ONNX with the following configuration:

torch.onnx.export(
    wrapper_model,
    (input_data['input_ids'][:1], input_data['attention_mask'][:1], input_data['token_type_ids'][:1]),
    os.path.join(base_model_path, 'model_static.onnx'),
    export_params=True,
    do_constant_folding=True,
    input_names=['input_ids', 'attention_mask', 'token_type_ids'],
    output_names=['sentence_embedding'],
    dynamic_axes={
        'input_ids': {1: 'sequence_length'},
        'attention_mask': {1: 'sequence_length'},
        'token_type_ids': {1: 'sequence_length'},
    },
    opset_version=18,
)

I am running the kibaes/onnxruntime-server:1.20.1b-linux-cpu docker image.

Would you have any idea why this would be happening? Or any possible fixes?
Thanks!


kibae · 2025-02-01T18:13:55Z

Hello. @sidharthg-couture :)

Could you please display the ONNX session information? It can be queried via HTTP GET /api/sessions or HTTP GET /api/sessions/{model}/{version}. It would be great if you could show the response from this API.
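
The two endpoints named above can be queried with a few lines of Python. This is only a sketch: the base URL `http://localhost:8080` is a placeholder for wherever the server's HTTP backend actually listens, and `session_url` and `fetch_sessions` are hypothetical helper names.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "http://localhost:8080"  # assumption: replace with your server's address


def session_url(base_url, model=None, version=None):
    """Build /api/sessions, or /api/sessions/{model}/{version} if both are given."""
    url = base_url.rstrip("/") + "/api/sessions"
    if model is not None and version is not None:
        url += "/" + quote(model, safe="") + "/" + quote(version, safe="")
    return url


def fetch_sessions(base_url=BASE_URL, model=None, version=None, timeout=10):
    """Send the GET request and decode the JSON body."""
    url = session_url(base_url, model, version)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Only the URLs are printed here; call fetch_sessions() against a running server.
print(session_url(BASE_URL))
print(session_url(BASE_URL, "my-model", "1"))
```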

sidharthg-couture · 2025-02-04T09:52:54Z

Hi, here is the response from the API:

[
  {
    "created_at": 1738662696,
    "execution_count": 0,
    "inputs": {
      "attention_mask": "int64[1,-1]",
      "input_ids": "int64[1,-1]",
      "token_type_ids": "int64[1,-1]"
    },
    "last_executed_at": 0,
    "model": "stella-v3",
    "option": {
      "cuda": false
    },
    "outputs": {
      "sentence_embedding": "float32[-1,1024]"
    },
    "version": "model"
  }
]
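
The type strings in this response pack the element type and the shape together, with -1 marking a dimension that is not fixed. A small sketch for splitting them apart, assuming every entry follows the `dtype[d0,d1,...]` pattern seen here; `parse_tensor_spec` is a hypothetical helper and not part of onnxruntime-server.

```python
import re

# Matches strings such as "float32[-1,1024]" or "int64[1,-1]".
_SPEC = re.compile(r"^(\w+)\[([-\d,\s]*)\]$")


def parse_tensor_spec(spec):
    """Split 'dtype[d0,d1,...]' into (dtype, shape tuple)."""
    match = _SPEC.match(spec.strip())
    if match is None:
        raise ValueError(f"unrecognized tensor spec: {spec!r}")
    dtype, dims = match.groups()
    shape = tuple(int(d) for d in dims.split(",") if d.strip())
    return dtype, shape


# Values copied from the session response above.
session = {
    "inputs": {
        "attention_mask": "int64[1,-1]",
        "input_ids": "int64[1,-1]",
        "token_type_ids": "int64[1,-1]",
    },
    "outputs": {"sentence_embedding": "float32[-1,1024]"},
}

for kind in ("inputs", "outputs"):
    for name, spec in session[kind].items():
        print(kind, name, parse_tensor_spec(spec))
```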

kibae · 2025-02-04T10:39:25Z

Hello, @sidharthg-couture
Thank you for providing the information. :)

According to the response from the Python code, the shape appears to be (1024, -1).

Output from Python ONNX-Runtime

array([ 0.10638401,  0.09311324, -0.555768  , ...,  0.99843943,
        0.2404    ,  0.164792  ], shape=(1024,), dtype=float32)

However, the response from the ONNX session API indicates that the shape of sentence_embedding is (-1, 1024).

    "outputs": {
      "sentence_embedding": "float32[-1,1024]"
    },

How about modifying the dynamic_axes values during the ONNX export so that the shape becomes (-1, 1024)?

    dynamic_axes={
        'input_ids': {1: 'sequence_length'},
        'attention_mask': {1: 'sequence_length'},
        'token_type_ids': {1: 'sequence_length'},
        'sentence_embedding': {0: 'sequence_length'},
    },

sidharthg-couture · 2025-02-04T10:45:28Z

Hello, @sidharthg-couture Thank you for providing the information. :)

According to the response from the Python code, the shape appears to be (1024, -1).

Output from Python ONNX-Runtime

How about modifying the dynamic_axes values during the ONNX export so that the shape becomes (-1, 1024)?
    dynamic_axes={
        'input_ids': {1: 'sequence_length'},
        'attention_mask': {1: 'sequence_length'},
        'token_type_ids': {1: 'sequence_length'},
        'sentence_embedding': {0: 'sequence_length'},
    },

Do you mean (1024, -1) here, in the last part about the ONNX export? If not, isn't the shape already (-1, 1024)?

kibae · 2025-02-05T09:11:58Z

In your Python code, the shape is specified as (1024,), which implies a shape of (1024, -1).
However, the session information from onnxruntime-server shows the sentence_embedding output as float32[-1, 1024]. Therefore, when exporting to ONNX, you should add 'sentence_embedding': {0: 'sequence_length'} so that onnxruntime-server's session information displays it as float32[1024, -1].
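
Part of the exchange above turns on notation: as NumPy prints it, (1024,) is a one-dimensional shape, while (-1, 1024) describes a two-dimensional tensor whose first dimension is free. A sketch of the usual convention, in which -1 in a declared shape accepts any size; `shape_matches` is a hypothetical helper and not an onnxruntime API. A (1024,) array would also result from taking row 0 of a (1, 1024) batch, but that is an assumption about the Python snippet, which is not shown in the thread.

```python
def shape_matches(declared, actual):
    """True if `actual` fits `declared`, where -1 in `declared` means any size."""
    if len(declared) != len(actual):
        return False  # different rank can never match
    return all(d == -1 or d == a for d, a in zip(declared, actual))


declared = (-1, 1024)  # from the session info: float32[-1,1024]
print(shape_matches(declared, (1, 1024)))  # True
print(shape_matches(declared, (1024,)))    # False
print(shape_matches(declared, (1024, 1)))  # False
```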

sidharthg-couture · 2025-02-14T13:34:11Z

To make axis 1 dynamic, I would have to use 'sentence_embedding': {1: 'sequence_length'}, right? Otherwise it would be

"outputs": {
  "sentence_embedding": "float32[-1,1024]"
},

This was the output after trying the suggested dynamic_axes configuration.
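
In the `dynamic_axes` argument of `torch.onnx.export`, each tensor name maps axis indices to symbolic names, so `{0: ...}` frees the first axis and `{1: ...}` frees the second. The sketch below models only that mapping; it assumes a static output shape of (1, 1024) at export time, does not reproduce any shape inference done by the exporter, and uses a hypothetical helper name, `exported_shape`.

```python
def exported_shape(static_shape, dynamic_axes):
    """Replace each axis listed in `dynamic_axes` with -1 (a free dimension).

    `dynamic_axes` maps axis index -> symbolic name, the per-tensor form
    used in torch.onnx.export's dynamic_axes argument.
    """
    return tuple(-1 if i in dynamic_axes else d for i, d in enumerate(static_shape))


static = (1, 1024)  # assumption: batch of 1 at export time, 1024-dim embedding
print(exported_shape(static, {0: "sequence_length"}))  # (-1, 1024)
print(exported_shape(static, {1: "sequence_length"}))  # (1, -1)
print(exported_shape(static, {}))                      # (1, 1024)
```

Under these assumptions, `{0: 'sequence_length'}` on the output is consistent with the float32[-1,1024] reported after the suggested change.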

kibae self-assigned this Feb 1, 2025

kibae added the "help wanted" (Extra attention is needed) label Feb 1, 2025
