project.py: Specify the manifest file to use UTF-8 encoding #710
Could you please file a bug with some reproduction steps? Encoding issues are never as simple as they seem...
src/west/app/project.py
@@ -321,7 +321,7 @@ def bootstrap(self, args) -> Path:
         # Parse the manifest to get "self: path:", if it declares one.
         # Otherwise, use the URL. Ignore imports -- all we really
         # want to know is if there's a "self: path:" or not.
-        manifest = Manifest.from_data(temp_manifest.read_text(),
+        manifest = Manifest.from_data(temp_manifest.read_text('utf-8'),
locale.getencoding() is used by default. If it does not already return utf8 on a system then why should west
ignore the system's default and hardcode it?
https://docs.python.org/3/library/functions.html#open
Please try this interactively on your system and share the output:
python
import locale
locale.getencoding()
Just noticed 3.8 was different :-( https://docs.python.org/3.8/library/functions.html#open
Please also run this:
python
import locale
locale.getpreferredencoding()
locale.getdefaultlocale()
locale.getencoding()
>>> import locale
>>> locale.getencoding()
'cp936'
>>>
>>> import locale
>>> locale.getpreferredencoding()
'cp936'
>>> locale.getdefaultlocale()
<stdin>:1: DeprecationWarning: Use setlocale(), getencoding() and getlocale() instead
('zh_CN', 'cp936')
>>> locale.getencoding()
'cp936'
>>>
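The checks requested above can be collected into one small script. This is a minimal sketch: the helper name `describe_text_defaults` is made up for illustration, and `locale.getencoding()` is guarded because it only exists on Python 3.11 and later.

```python
import locale
import sys


def describe_text_defaults() -> dict:
    """Collect the settings that decide how text files are decoded by default."""
    info = {
        # Encoding used by open() and Path.read_text() when none is given
        # (outside UTF-8 mode).
        "preferred": locale.getpreferredencoding(False),
        # True when Python runs in UTF-8 mode (PYTHONUTF8=1 or -X utf8).
        "utf8_mode": bool(sys.flags.utf8_mode),
        "filesystem": sys.getfilesystemencoding(),
    }
    # locale.getencoding() was added in Python 3.11.
    if hasattr(locale, "getencoding"):
        info["locale"] = locale.getencoding()
    return info


for key, value in describe_text_defaults().items():
    print(f"{key}: {value}")
```

On the system reported in this thread, the locale-related entries would be expected to show 'cp936'; other systems will differ.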
How to reproduce:
west init -m URL --mr main --mf west.yml myworkspace
The west.yml file is UTF-8 encoded and contains Chinese comments.
Error output:
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "C:\Python311\Scripts\west.exe\__main__.py", line 7, in <module>
File "C:\Python311\Lib\site-packages\west\app\main.py", line 1085, in main
app.run(argv or sys.argv[1:])
File "C:\Python311\Lib\site-packages\west\app\main.py", line 244, in run
self.run_command(argv, early_args)
File "C:\Python311\Lib\site-packages\west\app\main.py", line 503, in run_command
self.run_builtin(args, unknown)
File "C:\Python311\Lib\site-packages\west\app\main.py", line 611, in run_builtin
self.cmd.run(args, unknown, self.topdir,
File "C:\Python311\Lib\site-packages\west\commands.py", line 194, in run
self.do_run(args, unknown)
File "C:\Python311\Lib\site-packages\west\app\project.py", line 224, in do_run
topdir = self.bootstrap(args)
^^^^^^^^^^^^^^^^^^^^
File "C:\Python311\Lib\site-packages\west\app\project.py", line 313, in bootstrap
manifest = Manifest.from_data(temp_manifest.read_text(),
^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Python311\Lib\pathlib.py", line 1059, in read_text
return f.read()
^^^^^^^^
UnicodeDecodeError: 'gbk' codec can't decode byte 0xaf in position 151: illegal multibyte sequence
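The failure can be reproduced without west or a Chinese Windows locale. The sketch below writes a tiny UTF-8 file containing one Chinese character in a comment, then decodes it once as 'gbk' (standing in for the cp936 locale default) and once as 'utf-8'. The file contents and the helper `try_decode` are made up for illustration; they are not the reporter's actual manifest.

```python
import tempfile
from pathlib import Path


def try_decode(path: Path, encoding: str) -> str:
    """Return the decoded text, or a short description of the decode error."""
    try:
        return path.read_text(encoding=encoding)
    except UnicodeDecodeError as exc:
        return f"{type(exc).__name__}: {exc.reason}"


with tempfile.TemporaryDirectory() as tmp:
    manifest = Path(tmp) / "west.yml"
    # One Chinese character is three bytes in UTF-8. A GBK decoder consumes
    # the first two as a pair, then fails on the third byte plus the newline.
    manifest.write_text("# 中\nmanifest:\n  projects: []\n", encoding="utf-8")
    as_gbk = try_decode(manifest, "gbk")
    as_utf8 = try_decode(manifest, "utf-8")

print(as_gbk)
print(as_utf8.splitlines()[0])
```

Whether a real manifest fails this way depends on its exact bytes: some UTF-8 sequences happen to form valid GBK pairs and decode silently into the wrong characters instead of raising.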
Can you try $Env:PYTHONUTF8 = 1 (in PowerShell) and then try west again?
It succeeds when I set $Env:PYTHONUTF8 = 1 and then run west init -m URL --mr main --mf west.yml myworkspace
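The effect of the PYTHONUTF8 variable can be checked from Python itself by starting a child interpreter with the variable set and reading back `sys.flags.utf8_mode`. This is a sketch; the helper name `utf8_mode_in_child` is an assumption made for illustration.

```python
import os
import subprocess
import sys


def utf8_mode_in_child(env_value: str) -> str:
    """Start a child Python with PYTHONUTF8 set and report its utf8_mode flag."""
    env = dict(os.environ, PYTHONUTF8=env_value)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.flags.utf8_mode)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print(utf8_mode_in_child("1"))  # UTF-8 mode explicitly enabled
print(utf8_mode_in_child("0"))  # UTF-8 mode explicitly disabled
```

In UTF-8 mode, open() and Path.read_text() default to UTF-8 regardless of the locale, which is why the workaround makes west init succeed.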
Thanks for testing! So you don't need this PR, correct?
If you prefer to switch your entire Windows system to UTF-8, you can do it like this:
https://stackoverflow.com/questions/57131654/using-utf-8-encoding-chcp-65001-in-command-prompt-windows-powershell-window
Then you won't have to decide whether to set $Env:PYTHONUTF8 = 1 for every application and west project.
That page has a lot of other useful information; it's not just about PowerShell.
Also note the Python default will change in Python 3.15: https://peps.python.org/pep-0686/
Other places (manifest.py) specify the encoding when they call read_text, and I think it's necessary here as well. 😊
> When other places(manifest.py) call the read_text interface, they specify the encoding method,
Good point. Can you please fetch and test #711 instead?
Fixes issue reported in PR zephyrproject-rtos#710 where most places are hardcoded to 'utf-8' while this one is (Windows) locale-dependent. In the future, we may want to make this more flexible but the most urgent fix is consistency: with this commit, manifest decoding should be hardcoded to 'utf-8' everywhere. Signed-off-by: Marc Herbert <marc.herbert@intel.com>
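One way to keep manifest decoding consistent is to route every read and write through a single constant. The sketch below illustrates that idea only; `MANIFEST_ENCODING`, `read_manifest_text` and `write_manifest_text` are hypothetical names, not west's actual API and not necessarily how #711 implements the fix.

```python
import tempfile
from pathlib import Path

# Hypothetical constant: the one place where the manifest encoding is decided.
MANIFEST_ENCODING = "utf-8"


def read_manifest_text(path: Path) -> str:
    """Read a manifest file, ignoring the locale's default encoding."""
    return path.read_text(encoding=MANIFEST_ENCODING)


def write_manifest_text(path: Path, text: str) -> None:
    """Write a manifest file with the same fixed encoding."""
    path.write_text(text, encoding=MANIFEST_ENCODING)


with tempfile.TemporaryDirectory() as tmp:
    manifest = Path(tmp) / "west.yml"
    original = "# 注释\nmanifest:\n  self:\n    path: app\n"
    write_manifest_text(manifest, original)
    round_tripped = read_manifest_text(manifest)

print(round_tripped == original)
```

If the encoding ever needs to become configurable, as the commit message suggests it might, only the constant and these two helpers would have to change.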
Superseded by larger #711
Specify the manifest file to use UTF-8 encoding