Hi @bostjangec, thanks for your comment. I'm assuming this is about how pyHanko supports writing Unicode text. Unfortunately, it's going to be a bit more complicated than just writing UTF-8 to the content stream.

PDF's text display features are older than Unicode, and displaying non-Latin text properly requires some effort. While there are a number of very simple "standard" fonts that (virtually) all PDF readers will offer, those all work with the Latin character set (oversimplifying a little bit). That works fine for very simple things, but (as you have discovered) it doesn't really generalise well. This is also why pyHanko uses […]

Now, in your case, what you want to do is choose a font of your liking (one that supports the characters you need) and embed a subset of it. PyHanko implements that using a font engine called […]. Under the hood, pyHanko will invoke HarfBuzz to handle shaping, and use that to translate your Unicode strings to PDF display operators ("regular" character encodings don't really enter into the equation). The font is then subsetted using […].

TL;DR: Text handling in PDF is complicated, and the output of […]

EDIT: moved to discussion since this isn't a bug.
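To make the point about UTF-8 concrete, here is a minimal sketch using only the Python standard library. It does not call any pyHanko API. It assumes, as a simplification, that a simple "standard" font looks up one glyph per byte, and it uses Latin-1 decoding as a stand-in for that lookup; the sample word is just an illustration.

```python
# Illustration only: why swapping 'latin1' for 'utf-8' does not help when the
# font maps one byte to one glyph. Standard library only; no pyHanko calls.

text = "šola"  # sample word containing a character outside Latin-1

# 1. Latin-1 has no code for "š", so encoding fails outright.
try:
    text.encode("latin1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# 2. UTF-8 can encode the string, but "š" becomes two bytes.
utf8_bytes = text.encode("utf-8")

# 3. A byte-per-glyph font would show each of those bytes as its own glyph.
#    Decoding as Latin-1 stands in for that lookup (an assumption made for
#    illustration, not how any particular PDF viewer is implemented).
as_rendered = utf8_bytes.decode("latin1")

print(latin1_ok)                   # whether Latin-1 could encode the text
print(len(text), len(utf8_bytes))  # character count vs. byte count
print(repr(as_rendered))           # what a byte-per-glyph lookup would yield
```

So the UTF-8 variant would avoid the exception, but the text on the page would come out garbled. That is why embedding a font that actually covers the characters is the way to go.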
Describe the bug
Signing with text that contains non-Latin-1 characters causes an error. I recommend using 'utf-8' instead of 'latin1' in fonts/basic.py.