
Adding llama models support #33

Closed
petric3 opened this issue May 1, 2024 · 3 comments
petric3 commented May 1, 2024

Thank you for your program.

I was trying to do some sentiment analysis. I took your example and tried to switch the models, simply swapping hfapigo.RecommendedTextClassificationModel for meta-llama/Meta-Llama-3-8B, but no response is ever returned (it waits indefinitely). I also tried to make the llama models work with the other examples, but got no response there either. Could you add an example of how to use the llama models via the Hugging Face API?


Kardbord commented May 2, 2024

Hi @petric3, thanks for the issue! I think the problem you're seeing is caused by a couple of things.

The first is that meta-llama/Meta-Llama-3-8B is unfortunately not available for free (serverless) access via the Inference API. From the model page:

Model is too large to load in Inference API (serverless). To try the model, launch it on Inference Endpoints (dedicated) instead.

The second issue is that the llama models seem to all be set up for text-generation rather than text-classification. You might try one of the other text-classification models hosted on HF: https://huggingface.co/models?pipeline_tag=text-classification&sort=downloads
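
For sentiment analysis, any of those text-classification models should work with the list-style inputs hfapigo sends. As a rough illustration of the response shape only (a stdlib-only sketch, not hfapigo's own API; the sample scores are made up, not real model output), the serverless Inference API returns one list of label/score pairs per input string:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Label is one candidate class in a text-classification response.
type Label struct {
	Label string  `json:"label"`
	Score float64 `json:"score"`
}

// parseClassification decodes a serverless Inference API response:
// one []Label per input string that was sent.
func parseClassification(body []byte) ([][]Label, error) {
	var out [][]Label
	err := json.Unmarshal(body, &out)
	return out, err
}

func main() {
	// Body shaped like a typical sentiment-model response
	// (scores here are illustrative, not real output).
	sample := []byte(`[[{"label":"POSITIVE","score":0.9991},
	                    {"label":"NEGATIVE","score":0.0009}]]`)
	labels, err := parseClassification(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(labels[0][0].Label, labels[0][0].Score) // POSITIVE 0.9991
}
```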

Thanks for letting me know that the example hangs indefinitely when given a llama model. I've opened #34 to track that issue. Unfortunately, it may be a while before I'm able to get to it. Life is busy right now with my day job and an upcoming house move. :)


Kardbord commented May 2, 2024

As for why the llama models don't work with the text-generation example: it seems they accept only a single input string, rather than a list of inputs.

# 1) Does not work
curl https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B-Instruct  \
    -X POST \
    -d '{"inputs": ["The quick brown fox", "jumps over the lazy dog"]}'  \
    -H "Authorization: Bearer ${HUGGING_FACE_TOKEN}" \
    -H 'Content-Type: application/json'

# 2) Works
curl https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B-Instruct  \
    -X POST \
    -d '{"inputs": "The quick brown fox"}'  \
    -H "Authorization: Bearer ${HUGGING_FACE_TOKEN}" \
    -H 'Content-Type: application/json'

# 3) Works
curl https://api-inference.huggingface.co/models/gpt2 \
    -X POST  \
    -d '{"inputs": ["The quick brown fox", "jumps over the lazy dog"]}'  \
    -H "Authorization: Bearer ${HUGGING_FACE_TOKEN}" \
    -H 'Content-Type: application/json'

Currently, hfapigo treats all inputs as lists, because the docs specify that lists are supported:

Return value is either a dict or a list of dicts if you sent a list of inputs

Clearly that isn't the case for all models though. I've opened #35 to track that issue.
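
One possible client-side workaround (a sketch only, built around a hypothetical buildPayload helper that is not part of hfapigo today) would be to fall back to a bare string when there is exactly one input:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

// buildPayload marshals "inputs" as a bare string when there is a single
// input (which the llama models appear to require) and as a list otherwise.
// Hypothetical helper; hfapigo currently always sends a list.
func buildPayload(inputs ...string) ([]byte, error) {
	if len(inputs) == 1 {
		return json.Marshal(map[string]any{"inputs": inputs[0]})
	}
	return json.Marshal(map[string]any{"inputs": inputs})
}

func main() {
	single, _ := buildPayload("The quick brown fox")
	many, _ := buildPayload("The quick brown fox", "jumps over the lazy dog")
	fmt.Println(string(single)) // {"inputs":"The quick brown fox"}
	fmt.Println(string(many))

	// Only hit the API when a token is actually configured.
	if tok := os.Getenv("HUGGING_FACE_TOKEN"); tok != "" {
		req, err := http.NewRequest(http.MethodPost,
			"https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B-Instruct",
			bytes.NewReader(single))
		if err != nil {
			panic(err)
		}
		req.Header.Set("Authorization", "Bearer "+tok)
		req.Header.Set("Content-Type", "application/json")
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			panic(err)
		}
		defer resp.Body.Close()
		fmt.Println(resp.Status)
	}
}
```

This mirrors the difference between curl commands 1 and 2 above: same endpoint and headers, but a single JSON string instead of a JSON array under "inputs".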


petric3 commented May 2, 2024

@Kardbord Thank you for the explanation, it makes sense. Also, thanks for the link to the dedicated endpoints; it may come in handy. Best wishes with your house move 🌞

@petric3 petric3 closed this as completed May 2, 2024