One thing that would be cool is if we could get some basic post-inference work carried out in the same session. The most obvious candidate I can think of is calculating the softmax values of the logits array. At the moment I apply softmax to the returned logit array myself in my Python app, but it would make sense to have that work done directly in the C++ session instead.
I was thinking there could be an option flag, e.g. softmax=true or softmax=false.
e.g. model1:v1(cuda=false, softmax=true)
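For the simple single-array case, the post-inference step itself is small. A minimal sketch of what the C++ side might compute when the flag is set (the function name and signature here are just an illustration, not part of the project's API):

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Numerically stable softmax: subtract the max logit before
// exponentiating so std::exp cannot overflow for large logits.
std::vector<float> softmax(const std::vector<float>& logits) {
    std::vector<float> probs(logits.size());
    const float max_logit = *std::max_element(logits.begin(), logits.end());
    float sum = 0.0f;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp(logits[i] - max_logit);
        sum += probs[i];
    }
    for (float& p : probs) {
        p /= sum;  // normalize so the probabilities sum to 1
    }
    return probs;
}
```

The max-subtraction step matters in practice: raw logits from a model can easily be large enough that `std::exp` overflows to infinity without it.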
The output format of ONNX models can vary significantly. If the output were always a simple array like [0.x, 0.x, 0.x], applying a softmax operation directly would make sense. However, some models might return more complex outputs, such as {"x": [...], "y": [...]}. In such cases, it's unclear whether the softmax should be applied to x, y, or perhaps another part of the output. This variability makes implementing a universal softmax=true/false flag challenging.
One potential approach could be to add a simple Python binding to the C++ program. This way, users could define custom output filters or transformations, like applying softmax, using Python code. While I haven't personally implemented this kind of functionality before, it seems like an interesting area to explore when I have some time to dive deeper into it.
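Short of full Python bindings, one middle-ground idea would be to let the flag name the output the softmax should apply to (e.g. something like softmax=x), sidestepping the ambiguity for map-shaped outputs. A sketch of that, assuming the outputs are held as a name-to-vector map (the type alias and function here are hypothetical, not existing project code):

```cpp
#include <algorithm>
#include <cmath>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical model output shape: named tensors, e.g. {"x": [...], "y": [...]}.
using Outputs = std::map<std::string, std::vector<float>>;

// Apply an in-place, numerically stable softmax only to the output the
// caller names, leaving every other output untouched.
void softmax_output(Outputs& outputs, const std::string& name) {
    auto it = outputs.find(name);
    if (it == outputs.end()) {
        throw std::invalid_argument("no output named " + name);
    }
    std::vector<float>& v = it->second;
    const float max_logit = *std::max_element(v.begin(), v.end());
    float sum = 0.0f;
    for (float& x : v) {
        x = std::exp(x - max_logit);
        sum += x;
    }
    for (float& x : v) {
        x /= sum;
    }
}
```

This keeps the common case declarative while failing loudly when the named output does not exist, rather than silently guessing which tensor to transform.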