Move Tags Cache time into a parameter, handling of 50k+ documents #371
Replies: 3 comments 2 replies
-
Yeah ... that is a huge number of documents. We Germans love our papers and order 😆 I will try some things out to reduce the amount of data that is transferred on each request.
-
Wouldn’t it be possible to fetch the tag cache only once at the beginning of the scan and then work with this “copy”? For me, the time between
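A minimal sketch of that idea, assuming a Python-like codebase: the `TagCache` class, the `fetch_tags` callable, and the `ttl_seconds` parameter are all hypothetical names for illustration, not this project's actual API. The point is to take one snapshot of the tags before the scan loop and reuse it, instead of re-fetching every few seconds.

```python
import time


class TagCache:
    """Hypothetical tag cache with a configurable time-to-live (TTL)."""

    def __init__(self, fetch_tags, ttl_seconds=3.0):
        self._fetch_tags = fetch_tags  # callable returning the full tag list
        self._ttl = ttl_seconds
        self._tags = None
        self._fetched_at = 0.0

    def get(self):
        # Re-fetch only when the cached list is missing or older than the TTL.
        now = time.monotonic()
        if self._tags is None or now - self._fetched_at > self._ttl:
            self._tags = self._fetch_tags()
            self._fetched_at = now
        return self._tags

    def snapshot(self):
        """Fetch once and return a copy to reuse for a whole scan run."""
        return list(self.get())


def scan_documents(documents, cache):
    # One snapshot up front: the scan loop never hits the backend again,
    # at the cost of not seeing tags created mid-scan.
    tags = cache.snapshot()
    return [[t for t in tags if t in doc] for doc in documents]
```

The trade-off is exactly the one raised below: tags created while the scan is running would not be matched until the next run.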
-
So in my case, the chances are higher that someone creates a new tag during the scanning process than during normal operation. And the tags your system generates itself, you already know ;-) In any case, this is quite a performance killer. I would be super happy if you could find a solution for this in the future. And of course, a big thumbs up from me for your great work!
-
Hey, I really love the work you've done. I stumbled upon this project while in the middle of building a similar solution myself to wrangle the library of scanned documents I've built up over the last 10+ years. Maybe it's a German thing? ;)
One thing that I noticed is that the system slows down in processing once the number of tags grows. I allowed for custom tags to be generated and am now stuck with 6000 custom tags. Re-scanning the list of tags every three seconds seems to slow down the processing significantly.
I am currently experimenting with upping the tag cache time a little bit and would really appreciate some experience around that and how to speed up bulk processing.
Thanks a lot for everybody's input on this.
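One way to experiment with the cache lifetime without patching the code each time is to read it from a configuration parameter. This is a hypothetical sketch: the `TAG_CACHE_SECONDS` environment variable and the 3-second default are illustrative assumptions, not settings this project actually exposes.

```python
import os

# Assumed hard-coded default of 3 seconds, matching the interval
# described above; purely illustrative.
DEFAULT_TAG_CACHE_SECONDS = 3.0


def tag_cache_seconds():
    """Read the tag-cache lifetime from a (hypothetical) env var.

    Falls back to the default on missing or unparsable values and
    clamps the result to a sane range, so a typo cannot disable
    caching entirely or freeze the tag list for days.
    """
    raw = os.environ.get("TAG_CACHE_SECONDS")
    if raw is None:
        return DEFAULT_TAG_CACHE_SECONDS
    try:
        value = float(raw)
    except ValueError:
        return DEFAULT_TAG_CACHE_SECONDS
    return min(max(value, 0.0), 3600.0)
```

For bulk imports with thousands of custom tags, a value of a few minutes might be a reasonable starting point, since new tags created mid-run would simply be picked up on the next refresh.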