
Adding llama models support #33

Closed
petric3 opened this issue May 1, 2024 · 3 comments
petric3 commented May 1, 2024

Thank you for your program.

I was trying to do some sentiment analysis. I took your example and tried to switch the models, simply swapping hfapigo.RecommendedTextClassificationModel for meta-llama/Meta-Llama-3-8B, but no response is ever returned (it waits indefinitely). I also tried to make the llama models work with the other examples, but got no response there either. Could you add an example of how to use the llama models via the Hugging Face API?


Kardbord commented May 2, 2024

Hi @petric3, thanks for the issue! I think the problem you're seeing is caused by a couple of things.

The first is that meta-llama/Meta-Llama-3-8B is unfortunately not available for free (serverless) access via the Inference API. From the model page:

Model is too large to load in Inference API (serverless). To try the model, launch it on Inference Endpoints (dedicated) instead.

The second issue is that the llama models seem to all be set up for text-generation rather than text-classification. You might try one of the other text-classification models hosted on HF: https://huggingface.co/models?pipeline_tag=text-classification&sort=downloads
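
For sentiment analysis, any of those text-classification models should work with the list-style inputs hfapigo sends. As a rough illustration of the response shape only (a stdlib-only sketch, not hfapigo's own API; the sample scores are made up, not real model output), the serverless Inference API returns one list of label/score pairs per input string:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Label is one candidate class in a text-classification response.
type Label struct {
	Label string  `json:"label"`
	Score float64 `json:"score"`
}

// parseClassification decodes a serverless Inference API response:
// one []Label per input string that was sent.
func parseClassification(body []byte) ([][]Label, error) {
	var out [][]Label
	err := json.Unmarshal(body, &out)
	return out, err
}

func main() {
	// Body shaped like a typical sentiment-model response
	// (scores here are illustrative, not real output).
	sample := []byte(`[[{"label":"POSITIVE","score":0.9991},
	                    {"label":"NEGATIVE","score":0.0009}]]`)
	labels, err := parseClassification(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(labels[0][0].Label, labels[0][0].Score) // POSITIVE 0.9991
}
```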

Thanks for letting me know that the example hangs indefinitely when given a llama model. I've opened #34 to track that issue. Unfortunately, it may be a while before I'm able to get to it. Life is busy right now with my day job and an upcoming house move. :)


Kardbord commented May 2, 2024

As for why the llama models don't work with the text-generation example: it seems they accept only a single input string, rather than a list of inputs.

# 1) Does not work
curl https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B-Instruct  \
    -X POST \
    -d '{"inputs": ["The quick brown fox", "jumps over the lazy dog"]}'  \
    -H "Authorization: Bearer ${HUGGING_FACE_TOKEN}" \
    -H 'Content-Type: application/json'

# 2) Works
curl https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B-Instruct  \
    -X POST \
    -d '{"inputs": "The quick brown fox"}'  \
    -H "Authorization: Bearer ${HUGGING_FACE_TOKEN}" \
    -H 'Content-Type: application/json'

# 3) Works
curl https://api-inference.huggingface.co/models/gpt2 \
    -X POST  \
    -d '{"inputs": ["The quick brown fox", "jumps over the lazy dog"]}'  \
    -H "Authorization: Bearer ${HUGGING_FACE_TOKEN}" \
    -H 'Content-Type: application/json'

Currently, hfapigo treats all inputs as lists, because the docs specify that lists are supported:

Return value is either a dict or a list of dicts if you sent a list of inputs

Clearly that isn't the case for all models though. I've opened #35 to track that issue.
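
One possible client-side workaround (a sketch only, built around a hypothetical buildPayload helper that is not part of hfapigo today) would be to fall back to a bare string when there is exactly one input:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

// buildPayload marshals "inputs" as a bare string when there is a single
// input (which the llama models appear to require) and as a list otherwise.
// Hypothetical helper; hfapigo currently always sends a list.
func buildPayload(inputs ...string) ([]byte, error) {
	if len(inputs) == 1 {
		return json.Marshal(map[string]any{"inputs": inputs[0]})
	}
	return json.Marshal(map[string]any{"inputs": inputs})
}

func main() {
	single, _ := buildPayload("The quick brown fox")
	many, _ := buildPayload("The quick brown fox", "jumps over the lazy dog")
	fmt.Println(string(single)) // {"inputs":"The quick brown fox"}
	fmt.Println(string(many))

	// Only hit the API when a token is actually configured.
	if tok := os.Getenv("HUGGING_FACE_TOKEN"); tok != "" {
		req, err := http.NewRequest(http.MethodPost,
			"https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B-Instruct",
			bytes.NewReader(single))
		if err != nil {
			panic(err)
		}
		req.Header.Set("Authorization", "Bearer "+tok)
		req.Header.Set("Content-Type", "application/json")
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			panic(err)
		}
		defer resp.Body.Close()
		fmt.Println(resp.Status)
	}
}
```

This mirrors the difference between curl commands 1 and 2 above: same endpoint and headers, but a single JSON string instead of a JSON array under "inputs".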


petric3 commented May 2, 2024

@Kardbord Thank you for the explanation, it makes sense. Also, thanks for the link to the dedicated endpoints; it may come in handy. Best wishes with your house move 🌞

@petric3 petric3 closed this as completed May 2, 2024