
Endpoint configuration for Llama2/Ollama running on runpod #53

Open
nstuhrmann opened this issue Dec 10, 2024 · 1 comment


nstuhrmann commented Dec 10, 2024

Hi,

I have Ollama running on a RunPod pod, set up by following this tutorial:
https://docs.runpod.io/tutorials/pods/run-ollama

I can connect to the API using curl, e.g.:

curl -X POST https://{POD_ID}-11434.proxy.runpod.net/api/generate -d '{
  "model": "mistral",
  "prompt":"Here is a story about llamas eating grass"
 }' 

When I try to generate a title for a document from paperless-gpt's web interface, I get:
"Failed to generate suggestions."
The logs show:

time="2024-12-10T08:07:30Z" level=info msg="Processing Document ID 2288..."
time="2024-12-10T08:07:30Z" level=debug msg="Title suggestion prompt: I will provide you with the content of a document that has been partially read by OCR (so it may contain errors).\nYour task is to find a suitable document title that I can use as the title in the paperless-ngx program.\nRespond only with the title, without any additional information. The content is likely in English.\nAGAIN: ONLY RESPOND WITH A TITLE, nothing else, no intro.\n\n\nContent:\n[...]\n"
time="2024-12-10T08:07:30Z" level=error msg="Error processing document 2288: error getting response from LLM: invalid character 'p' after top-level value"
time="2024-12-10T08:07:30Z" level=error msg="Error processing documents: Document 2288: error getting response from LLM: invalid character 'p' after top-level value"
[GIN] 2024/12/10 - 08:07:30 | 500 |  401.095454ms |   192.168.2.220 | POST     "/api/generate-suggestions"

I looked at the TCP traffic and saw that paperless-gpt seems to use an API endpoint that is not available in my setup:
Request:

GET /api/chat HTTP/1.1
Host: 100.65.23.55:60201
User-Agent: langchaingo (amd64 linux) Go/go1.22.10
Accept: application/x-ndjson
Accept-Encoding: gzip, br
Cdn-Loop: cloudflare; loops=1
Cf-Connecting-Ip: 92.218.117.29
Cf-Ipcountry: DE
Cf-Ray: 8efbbdde8d583660-FRA
Cf-Visitor: {"scheme":"https"}
Content-Type: application/json
Referer: http://...-11434.proxy.runpod.net/api/chat
X-Forwarded-For: 92.218.117.29, 162.158.110.130
X-Forwarded-Host: ...-11434.proxy.runpod.net
X-Forwarded-Proto: https

Response:

HTTP/1.1 404 Not Found
Content-Type: text/plain
Date: Tue, 10 Dec 2024 08:07:30 GMT
Content-Length: 18

Am I doing something wrong? Can I configure the endpoints?

nstuhrmann (Author) commented:
My docker compose service is:

  paperless-gpt:
    image: paperless-gpt #icereed/paperless-gpt:latest
    environment:
      PAPERLESS_BASE_URL: 'http://192.168.x.x:8000'
      PAPERLESS_API_TOKEN: '...'
      LLM_PROVIDER: 'ollama' # 'openai' or 'ollama'
      LLM_MODEL: 'llama2'
      OPENAI_API_KEY: 'your_openai_api_key' # Required if using OpenAI
      LLM_LANGUAGE: 'English' # Optional, default is 'English'
      OLLAMA_HOST: 'http://...-11434.proxy.runpod.net' # If using Ollama
      VISION_LLM_PROVIDER: 'ollama' # Optional (for OCR): 'ollama' or 'openai'
      VISION_LLM_MODEL: 'minicpm-v' # Optional (for OCR): e.g. 'minicpm-v' for ollama, 'gpt-4o' for openai
      LOG_LEVEL: 'debug' # Optional: 'debug', 'warn', 'error'
    volumes:
      - ./prompts:/app/prompts # Mount the prompts directory
    ports:
      - '8080:8080'
    depends_on:
      - webserver
