
Endpoint configuration for Llama2/Ollama running on runpod #53

Open
nstuhrmann opened this issue Dec 10, 2024 · 1 comment


nstuhrmann commented Dec 10, 2024

Hi,

I have Ollama running on a RunPod pod, set up by following this tutorial:
https://docs.runpod.io/tutorials/pods/run-ollama

I can connect to the API using curl, e.g.:

curl -X POST https://{POD_ID}-11434.proxy.runpod.net/api/generate -d '{
  "model": "mistral",
  "prompt":"Here is a story about llamas eating grass"
 }' 

When I try to generate a title for a document from paperless-gpt's web interface, I get:
"Failed to generate suggestions."
The logs show:

time="2024-12-10T08:07:30Z" level=info msg="Processing Document ID 2288..."
time="2024-12-10T08:07:30Z" level=debug msg="Title suggestion prompt: I will provide you with the content of a document that has been partially read by OCR (so it may contain errors).\nYour task is to find a suitable document title that I can use as the title in the paperless-ngx program.\nRespond only with the title, without any additional information. The content is likely in English.\nAGAIN: ONLY RESPOND WITH A TITLE, nothing else, no intro.\n\n\nContent:\n[...]\n"
time="2024-12-10T08:07:30Z" level=error msg="Error processing document 2288: error getting response from LLM: invalid character 'p' after top-level value"
time="2024-12-10T08:07:30Z" level=error msg="Error processing documents: Document 2288: error getting response from LLM: invalid character 'p' after top-level value"
[GIN] 2024/12/10 - 08:07:30 | 500 |  401.095454ms |   192.168.2.220 | POST     "/api/generate-suggestions"

I looked at the TCP traffic and saw that paperless-gpt seems to use an API endpoint that is not available in my setup:
Request:

GET /api/chat HTTP/1.1
Host: 100.65.23.55:60201
User-Agent: langchaingo (amd64 linux) Go/go1.22.10
Accept: application/x-ndjson
Accept-Encoding: gzip, br
Cdn-Loop: cloudflare; loops=1
Cf-Connecting-Ip: 92.218.117.29
Cf-Ipcountry: DE
Cf-Ray: 8efbbdde8d583660-FRA
Cf-Visitor: {"scheme":"https"}
Content-Type: application/json
Referer: http://...-11434.proxy.runpod.net/api/chat
X-Forwarded-For: 92.218.117.29, 162.158.110.130
X-Forwarded-Host: ...-11434.proxy.runpod.net
X-Forwarded-Proto: https

Response:

HTTP/1.1 404 Not Found
Content-Type: text/plain
Date: Tue, 10 Dec 2024 08:07:30 GMT
Content-Length: 18

Am I doing something wrong? Can I configure the endpoints?

nstuhrmann (Author) commented:
My docker compose service is:

  paperless-gpt:
    image: paperless-gpt #icereed/paperless-gpt:latest
    environment:
      PAPERLESS_BASE_URL: 'http://192.168.x.x:8000'
      PAPERLESS_API_TOKEN: '...'
      LLM_PROVIDER: 'ollama' # 'openai' or 'ollama'
      LLM_MODEL: 'llama2'
      OPENAI_API_KEY: 'your_openai_api_key' # Required if using OpenAI
      LLM_LANGUAGE: 'English' # Optional, default is 'English'
      OLLAMA_HOST: 'http://...-11434.proxy.runpod.net' # If using Ollama
      VISION_LLM_PROVIDER: 'ollama' # Optional (for OCR): 'ollama' or 'openai'
      VISION_LLM_MODEL: 'minicpm-v' # Optional (for OCR): e.g. 'minicpm-v' for ollama, 'gpt-4o' for openai
      LOG_LEVEL: 'debug' # Optional: 'debug', 'warn', 'error'
    volumes:
      - ./prompts:/app/prompts # Mount the prompts directory
    ports:
      - '8080:8080'
    depends_on:
      - webserver
