I'm not sure whether this is an issue you can do anything about...
I often see automatic OCR go into some weird kind of endless loop. It seems the call to ollama never returns, and the GPU of the ollama host stays very busy the whole time.
If I forcefully restart ollama, paperless-gpt stops the process, and in the log I can see that some random (?) text, partly extracted from the document, is repeated indefinitely.
Like this:
paperless-gpt-1 | Bescheibung nach 1893 Absatz der Gewerbeordnung.
paperless-gpt-1 | Steuergestellstandteil: Brutto
paperless-gpt-1 | Gesetzliche Abzüge:
paperless-gpt-1 | - Lohnsteuer, Ird.: 0.00 / Rentensteuer, Ird.: No
paperless-gpt-1 |
paperless-gpt-1 | Bescheibung nach 1893 Absatz der Gewerbeordnung.
paperless-gpt-1 | Steuergestellstandteil: Brutto
paperless-gpt-1 | Gesetzliche Abzüge:
paperless-gpt-1 | - Lohnsteuer, Ird.: 0.00 / Rentensteuer, Ird.: No
paperless-gpt-1 |
paperless-gpt-1 | Bescheibung nach 1893 Absatz der Gewerbeordnung.
paperless-gpt-1 | Steuergestellstandteil: Brutto
paperless-gpt-1 | Gesetzliche Abzüge:
paperless-gpt-1 | - Lohnsteuer, Ird.: 0.00 / Rentensteuer, Ird.: No
paperless-gpt-1 |
paperless-gpt-1 | Bescheibung nach 1893 Absatz der Gewerbeordnung.
paperless-gpt-1 | Steuergestellstandteil: Brutto
paperless-gpt-1 | Gesetzliche Abzüge:
paperless-gpt-1 | - Lohnsteuer, Ird.: 0.00 / Rentensteuer, Ird.: No
paperless-gpt-1 |
paperless-gpt-1 | Bescheibung nach 1893 Absatz der Gewerbeordnung.
paperless-gpt-1 | Steuergestellstandteil: Brutto
paperless-gpt-1 | Gesetzliche Abzüge:
paperless-gpt-1 | - Lohnsteuer, Ird.: 0.00 / Rentensteuer, Ird.: No
paperless-gpt-1 |
paperless-gpt-1 | Bescheibung nach 1893 Absatz der Gewerbeordnung.
paperless-gpt-1 | Steuergestellstandteil: Brutto
paperless-gpt-1 | Gesetzliche Abzüge:
paperless-gpt-1 | - Lohnsteuer, Ird.: 0.00 / Rentensteuer, Ird.: No
paperless-gpt-1 |
paperless-gpt-1 | Bescheibung nach 1893 Absatz der Gewerbeordnung.
paperless-gpt-1 | Steuergestellstandteil: Brutto
paperless-gpt-1 | Gesetzliche Abzüge:
paperless-gpt-1 | - Lohnsteuer, Ird.: 0.00 / Rentensteuer, Ird.: No
paperless-gpt-1 |
paperless-gpt-1 | Bescheibung nach 1893 Absatz der Gewerbeordnung.
paperless-gpt-1 | Steuergestellstandteil: Brutto
paperless-gpt-1 | Gesetzliche Abzüge:
paperless-gpt-1 | - Lohnsteuer, Ird.: 0.00 / Rentensteuer, Ird.: No
paperless-gpt-1 |
The model used for OCR is minicpm-v.
I am not sure how much control you have over the OCR stuff... maybe you have some hints on how to investigate this further?
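One way to investigate is to confirm mechanically that the output has fallen into a loop. The following is a minimal sketch, not part of paperless-gpt or ollama: the helper name `trailing_repeats` and the `max_period` limit are made up for illustration. It takes the log lines (with the `paperless-gpt-1 | ` prefix stripped) and reports how long the repeating block at the end is and how many times it repeats back to back.

```python
def trailing_repeats(lines, max_period=20):
    """Return (period, count): the length in lines of the block that
    repeats at the end of `lines`, and how many consecutive times it
    occurs. Returns (0, 1) when nothing repeats.

    Hypothetical helper for illustration only.
    """
    best = (0, 1)
    n = len(lines)
    for period in range(1, min(max_period, n // 2) + 1):
        block = lines[n - period:]
        count = 1
        # Walk backwards one block at a time while the blocks match.
        while True:
            start = n - (count + 1) * period
            if start < 0 or lines[start:start + period] != block:
                break
            count += 1
        # Strictly greater: ties keep the shortest period.
        if count > best[1]:
            best = (period, count)
    return best


# The block from the log above, as it appears eight times in a row.
block = [
    "Bescheibung nach 1893 Absatz der Gewerbeordnung.",
    "Steuergestellstandteil: Brutto",
    "Gesetzliche Abzüge:",
    "- Lohnsteuer, Ird.: 0.00 / Rentensteuer, Ird.: No",
    "",
]
period, count = trailing_repeats(block * 8)
print(period, count)
```

If such a check were run on streamed output, a caller could abort the request once `count` passes some threshold instead of waiting indefinitely. Whether paperless-gpt streams the response in a way that allows this is an assumption.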
We are discussing this, and possible ways out, on our Discord channel. The thing is: it might be an issue with the combination of prompt and LLM. The paperless-gpt code has very little influence on that.
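If the loop comes from the model itself, one possible mitigation is to bound the generation on the ollama side. The sketch below is an assumption-laden example, not what paperless-gpt does: it calls ollama's `/api/generate` endpoint directly, caps the output length with the `num_predict` option, and sets `repeat_penalty`. The function names, the values `2048` and `1.2`, the timeout, and the base URL are illustrative guesses, and whether paperless-gpt exposes these options is not confirmed here.

```python
import base64
import json
import urllib.request


def build_ocr_request(image_bytes, prompt, model="minicpm-v",
                      max_tokens=2048, repeat_penalty=1.2):
    """Build a JSON payload for ollama's /api/generate endpoint.

    max_tokens and repeat_penalty are illustrative values, not tuned.
    """
    return {
        "model": model,
        "prompt": prompt,
        # ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
        "options": {
            # Hard cap on generated tokens, so a loop cannot run forever.
            "num_predict": max_tokens,
            # Values above 1.0 penalize repeated tokens.
            "repeat_penalty": repeat_penalty,
        },
    }


def run_ocr(payload, base_url="http://localhost:11434", timeout_s=300):
    """Send the payload and return the generated text.

    base_url and timeout_s are assumptions; adjust to the actual host.
    The timeout only bounds how long the client waits.
    """
    req = urllib.request.Request(
        base_url + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return json.loads(resp.read())["response"]
```

The same two parameters can also be set in an ollama Modelfile via `PARAMETER num_predict` and `PARAMETER repeat_penalty`, which may be the easier route when the calling application cannot be changed. A cap only limits the damage; it does not explain why the model starts repeating in the first place.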
I am also having this same issue with ollama never responding and being stuck at high resource usage, with the same kind of setup (ollama docker, using minicpm-v as the OCR model).
I'm attaching my ollama logs in case they help, though I fully acknowledge @icereed's comment above that this seems like an ollama/model problem, not a paperless-gpt problem.
paperless-gpt-ollama.log
(There was no further output for more than 15 minutes after the end of the logs in that file, though ollama was still maxing out 6 CPU threads.)