ONNX model outputs an array of null #90

Open
sidharthg-couture opened this issue Jan 22, 2025 · 6 comments
Assignees
Labels
help wanted Extra attention is needed

Comments

@sidharthg-couture

sidharthg-couture commented Jan 22, 2025

I have configured my ONNX model (based on stella_400M) with a fixed batch size for inputs and outputs. The server runs it and responds with HTTP 200.

The problem is that the entire result is an array of 1024 nulls. The length is expected, since 1024 is the model's output dimension, but the values are not. When I run the same ONNX model with the Python ONNX Runtime library, it works as expected.

  • Output from Python ONNX-Runtime
array([ 0.10638401,  0.09311324, -0.555768  , ...,  0.99843943,
        0.2404    ,  0.164792  ], shape=(1024,), dtype=float32)
  • Output from ONNX-Runtime on onnxruntime-server
Response Body: {'sentence_embedding': [[None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, ... None, None]
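
To quantify the symptom, a small helper can count how many entries in the parsed response are null. This is only a sketch: `count_nulls` is a hypothetical name, and the sample body is a shortened, made-up stand-in for the real 1024-element response.

```python
import json

def count_nulls(value):
    """Return (null_count, total_count) over all scalar leaves in nested lists."""
    if isinstance(value, list):
        nulls = total = 0
        for item in value:
            n, t = count_nulls(item)
            nulls += n
            total += t
        return nulls, total
    return (1 if value is None else 0), 1

# Shortened, made-up stand-in for the server response body.
body = json.loads('{"sentence_embedding": [[null, null, null, null]]}')
nulls, total = count_nulls(body["sentence_embedding"])
print(f"{nulls} of {total} values are null")
```

JSON has no literal for NaN, and some serializers (JavaScript's JSON.stringify, for example) write NaN as null, so it may be worth checking whether the server-side values are NaN. Whether onnxruntime-server behaves this way is not confirmed in this thread.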

I converted a Stella-400M model to ONNX with the following export call:

import os
import torch

# wrapper_model, input_data, and base_model_path are defined elsewhere.
torch.onnx.export(
    wrapper_model,
    # Example inputs, sliced to a single row (batch of 1) for tracing.
    (input_data['input_ids'][:1], input_data['attention_mask'][:1], input_data['token_type_ids'][:1]),
    os.path.join(base_model_path, 'model_static.onnx'),
    export_params=True,
    do_constant_folding=True,
    input_names=['input_ids', 'attention_mask', 'token_type_ids'],
    output_names=['sentence_embedding'],
    # Only axis 1 (sequence length) of each input is marked dynamic.
    dynamic_axes={
        'input_ids': {1: 'sequence_length'},
        'attention_mask': {1: 'sequence_length'},
        'token_type_ids': {1: 'sequence_length'},
    },
    opset_version=18,
)

I am running the kibaes/onnxruntime-server:1.20.1b-linux-cpu docker image.

Would you have any idea why this would be happening? Or any possible fixes?
Thanks!

@kibae kibae self-assigned this Feb 1, 2025
@kibae
Owner

kibae commented Feb 1, 2025

Hello, @sidharthg-couture :)

Could you please share the ONNX session information? It can be queried via HTTP GET /api/sessions or HTTP GET /api/sessions/{model}/{version}. It would be great if you could post the response from this API.
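
For reference, a minimal sketch of querying those two endpoints from Python with only the standard library. The helper names are hypothetical, and the base URL `http://localhost:8080` is an assumption; substitute the address your server actually listens on.

```python
import json
import urllib.request

def sessions_url(base_url, model=None, version=None):
    """Build the URL for listing all sessions or for one model/version."""
    base = base_url.rstrip("/")
    if model is None or version is None:
        return f"{base}/api/sessions"
    return f"{base}/api/sessions/{model}/{version}"

def fetch_sessions(base_url, model=None, version=None, timeout=10):
    """GET the session info and decode the JSON body."""
    url = sessions_url(base_url, model, version)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)

# Assumed address; replace with wherever the server is listening.
print(sessions_url("http://localhost:8080"))
print(sessions_url("http://localhost:8080", "stella-v3", "model"))
```

Calling `fetch_sessions("http://localhost:8080")` against a running server would then return the decoded session list.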

@kibae kibae added the help wanted Extra attention is needed label Feb 1, 2025
@sidharthg-couture
Author

sidharthg-couture commented Feb 4, 2025

Hi, here is the response from the API:

[
  {
    "created_at": 1738662696,
    "execution_count": 0,
    "inputs": {
      "attention_mask": "int64[1,-1]",
      "input_ids": "int64[1,-1]",
      "token_type_ids": "int64[1,-1]"
    },
    "last_executed_at": 0,
    "model": "stella-v3",
    "option": {
      "cuda": false
    },
    "outputs": {
      "sentence_embedding": "float32[-1,1024]"
    },
    "version": "model"
  }
]
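
The shape strings in this response can be read mechanically. The sketch below parses specs such as `float32[-1,1024]` and lists which axes are reported as -1. It assumes that -1 denotes a dynamic axis, which matches how the inputs' `sequence_length` axis appears here; the function names are hypothetical.

```python
import re

def parse_tensor_spec(spec):
    """Split a string such as 'float32[-1,1024]' into (dtype, dims)."""
    match = re.fullmatch(r"\s*(\w+)\s*\[(.*)\]\s*", spec)
    if match is None:
        raise ValueError(f"unrecognized tensor spec: {spec!r}")
    dtype, dims_text = match.groups()
    dims = [int(d) for d in dims_text.split(",") if d.strip()]
    return dtype, dims

def dynamic_axes_of(dims):
    """Indices of axes reported as -1 (assumed to mean dynamic)."""
    return [i for i, d in enumerate(dims) if d == -1]

# Values copied from the session response above.
session = {
    "inputs": {
        "attention_mask": "int64[1,-1]",
        "input_ids": "int64[1,-1]",
        "token_type_ids": "int64[1,-1]",
    },
    "outputs": {"sentence_embedding": "float32[-1,1024]"},
}
for kind in ("inputs", "outputs"):
    for name, spec in session[kind].items():
        dtype, dims = parse_tensor_spec(spec)
        print(kind, name, dtype, dims, "dynamic axes:", dynamic_axes_of(dims))
```

Read this way, the inputs have a fixed first axis of 1 and a dynamic second axis, while the output has a dynamic first axis and a fixed second axis of 1024.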

@kibae
Owner

kibae commented Feb 4, 2025

Hello, @sidharthg-couture
Thank you for providing the information. :)

According to the response from the Python code, the shape appears to be (1024, -1).

  • Output from Python ONNX-Runtime
array([ 0.10638401,  0.09311324, -0.555768  , ...,  0.99843943,
        0.2404    ,  0.164792  ], shape=(1024,), dtype=float32)

However, the response from the ONNX session API indicates that the shape of sentence_embedding is (-1, 1024).

    "outputs": {
      "sentence_embedding": "float32[-1,1024]"
    },

How about modifying the dynamic_axes values during the ONNX export so that the shape becomes (-1, 1024)?

    dynamic_axes={
        'input_ids': {1: 'sequence_length'},
        'attention_mask': {1: 'sequence_length'},
        'token_type_ids': {1: 'sequence_length'},
        'sentence_embedding': {0: 'sequence_length'},
    },

@sidharthg-couture
Author

sidharthg-couture commented Feb 4, 2025

> Hello, @sidharthg-couture Thank you for providing the information. :)
>
> According to the response from the Python code, the shape appears to be (1024, -1).
>
>   • Output from Python ONNX-Runtime
>
> How about modifying the dynamic_axes values during the ONNX export so that the shape becomes (-1, 1024)?
>
>     dynamic_axes={
>         'input_ids': {1: 'sequence_length'},
>         'attention_mask': {1: 'sequence_length'},
>         'token_type_ids': {1: 'sequence_length'},
>         'sentence_embedding': {0: 'sequence_length'},
>     },

Do you mean (1024, -1) here, in the last part about the ONNX export? If not, isn't the shape already (-1, 1024)?

@kibae
Owner

kibae commented Feb 5, 2025

In your Python code, the shape is specified as (1024,), which implies a shape of (1024, -1).
However, the session information from onnxruntime-server shows the sentence_embedding output as float32[-1, 1024]. Therefore, when exporting to ONNX, you should add 'sentence_embedding': {0: 'sequence_length'} so that onnxruntime-server's session information displays it as float32[1024, -1].
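
As a sketch of how a declared shape can be compared with a concrete array shape, the hypothetical helper below treats -1 in the declaration as "any size" (an assumption about the session info notation) and requires the number of axes to be equal.

```python
def shape_matches(declared, actual):
    """True if `actual` fits `declared`, where -1 in `declared` matches any size."""
    if len(declared) != len(actual):
        return False  # a different number of axes can never match
    return all(d == -1 or d == a for d, a in zip(declared, actual))

declared = [-1, 1024]                      # from the session info
print(shape_matches(declared, (1, 1024)))  # one embedding with a leading batch axis
print(shape_matches(declared, (1024,)))    # the 1-D array printed by Python
print(shape_matches(declared, (1024, 1)))  # transposed layout
```

Under this reading, only (1, 1024) fits the declared [-1, 1024]. The thread does not say whether the Python output was a single row taken from a batched result, so that remains an open question.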

@sidharthg-couture
Author

To make axis 1 dynamic, I would have to use 'sentence_embedding': {1: 'sequence_length'}, right? Otherwise it would be

    "outputs": {
      "sentence_embedding": "float32[-1,1024]"
    },

This was the output after trying the suggested dynamic_axes configuration.
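
The point about axis indices can be illustrated with a small sketch. It is a simplified model, not the exporter's actual logic: it assumes the output is traced with static shape (1, 1024) and that each key in a `dynamic_axes` entry replaces that axis with -1 in the displayed shape. The `displayed_shape` helper is hypothetical.

```python
def displayed_shape(static_shape, dynamic_axes):
    """Replace each axis index listed in `dynamic_axes` with -1."""
    return [
        -1 if axis in dynamic_axes else size
        for axis, size in enumerate(static_shape)
    ]

traced = (1, 1024)  # assumed static shape of sentence_embedding during export

print(displayed_shape(traced, {0: 'sequence_length'}))  # axis 0 dynamic
print(displayed_shape(traced, {1: 'sequence_length'}))  # axis 1 dynamic
print(displayed_shape(traced, {}))                      # nothing dynamic
```

In this model, `{0: ...}` yields a shape with -1 in the first position, which is consistent with the output reported above, while `{1: ...}` would move the -1 to the second position.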
