UnicodeEncodeError: 'gbk' codec can't encode character '\xa0' #155

Open
choppin opened this issue Dec 22, 2024 · 1 comment

Comments

choppin commented Dec 22, 2024

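Please search Issues for the same problem first, and open a new issue only if there is none.

An issue must include the following information:
Operating system version: Windows 7
Python version: 3.8.10
Problem description: Downloading the note content fails with an error and is interrupted; retrying gives the same result.
I am not sure whether this is a problem with my environment or with the tool itself. The file names and paths shown in the log are garbled, but judging from the path contents, the note files being downloaded should be these: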
1111
Directory: D:\ydnote_source\notes\1_算法相关\10堂算法入门课

2024/12/22 22:59 .
2024/12/22 22:59 ..
2024/12/22 22:59 images
2024/12/22 22:59 7,755 【最简单易懂的10堂算法入门课——7个最常用的编程技巧 - 今日头条】.md
2024/12/22 22:59 8,561 【最简单易懂的10堂算法入门课——初阶数据结构 - 今日头条】.md
2024/12/22 22:59 1,811 【最简单易懂的10堂算法入门课——算法思想之二:分治算法 - 今日头条】.md
2024/12/22 22:59 6,169 【最简单易懂的10堂算法入门课——算法思想之四:穷举搜索 - 今日头条】.md
2024/12/22 22:58 11,063 最简单易懂的10堂算法入门课——动态规划.md
2024/12/22 22:59 6,680 最简单易懂的10堂算法入门课——数据结构与数学模型.md
2024/12/22 22:58 7,313 最简单易懂的10堂算法入门课——程序结构.md
2024/12/22 22:58 7,091 最简单易懂的10堂算法入门课——算法思想之一:贪心算法.md
2024/12/22 22:58 6,307 最简单易懂的10堂算法入门课——算法是什么.md
2024/12/22 22:58 4,885 最简单易懂的10堂算法入门课——高阶数据结构.md

Below is the error log for one of them:
2024/12/23 00:03:32 INFO MainProcess-MainThread-7664 pull.py:189 get_file_action : ▒▒▒ļ▒▒▒D:/ydnote_source/notes/▒▒▒▒▒▒β▒.md▒▒▒▒▒▒▒£▒▒▒▒▒
--- Logging error ---
Traceback (most recent call last):
File "D:\dev_tools\Python\Python38\lib\logging\__init__.py", line 1088, in emit
stream.write(msg + self.terminator)
UnicodeEncodeError: 'gbk' codec can't encode character '\xa0' in position 151: illegal multibyte sequence
Call stack:
File "pull.py", line 357, in <module>
youdaonote_pull.pull_dir_by_id_recursively(
File "pull.py", line 233, in pull_dir_by_id_recursively
self.pull_dir_by_id_recursively(id, sub_dir)
File "pull.py", line 233, in pull_dir_by_id_recursively
self.pull_dir_by_id_recursively(id, sub_dir)
File "pull.py", line 237, in pull_dir_by_id_recursively
self._add_or_update_file(id, name, local_dir, modify_time, create_time)
File "pull.py", line 274, in _add_or_update_file
file_action = self._get_file_action(local_file_path, modify_time)
File "pull.py", line 189, in _get_file_action
logging.info("▒▒▒ļ▒▒▒%s▒▒▒▒▒▒▒£▒▒▒▒▒", local_file_path)
Message: '▒▒▒ļ▒▒▒%s▒▒▒▒▒▒▒£▒▒▒▒▒'
Arguments: ('D:/ydnote_source/notes/1_▒㷨▒▒▒/10▒▒▒㷨▒▒▒ſ▒/▒▒▒▒▒▒׶▒▒▒10▒▒▒㷨▒▒▒ſΡ▒▒▒▒▒▒▒▒▒▒ݽṹ\xa0-\xa0▒▒▒▒ͷ▒▒▒▒.md',)
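
The failure in the traceback above can be reproduced outside the tool: the `gbk` codec has no mapping for `'\xa0'` (U+00A0, no-break space), so a log stream that encodes with `gbk` raises as soon as a message contains it. The following is a minimal sketch, not the project's code. `can_encode` and `make_tolerant_handler` are hypothetical helper names, and the sketch assumes it is acceptable for the log to show an escaped `\xa0` instead of the original character.

```python
import io
import logging

NBSP = "\xa0"  # U+00A0 NO-BREAK SPACE, the character named in the traceback


def can_encode(text: str, encoding: str = "gbk") -> bool:
    """Return True if every character of `text` can be encoded with `encoding`."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


def make_tolerant_handler(raw_stream, encoding: str = "gbk") -> logging.StreamHandler:
    """Hypothetical helper: build a handler that escapes unencodable characters.

    `raw_stream` is any binary stream, e.g. io.BytesIO or sys.stderr.buffer.
    """
    text_stream = io.TextIOWrapper(
        raw_stream,
        encoding=encoding,
        errors="backslashreplace",  # write a literal \xa0 escape instead of raising
        line_buffering=True,
    )
    return logging.StreamHandler(text_stream)


if __name__ == "__main__":
    # A plain space encodes with gbk; a no-break space does not.
    print(can_encode("a b"))
    print(can_encode("a" + NBSP + "b"))
```

A handler built this way could be attached with `logging.getLogger().addHandler(make_tolerant_handler(sys.stderr.buffer))`. On Python 3.7 and later, `sys.stderr.reconfigure(errors="backslashreplace")` may achieve a similar effect when `sys.stderr` is a regular text stream.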

choppin commented Dec 24, 2024

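The garbled Chinese file names in the log output were a problem with my environment: I had been running the tool directly in the git-gui command window. After switching to the Windows cmd window, the Chinese text is no longer garbled. The remaining problem appears to be that a space is parsed as \xa0, and encoding it back with gbk then fails.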
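
One way to sidestep the `\xa0` problem described above is to replace no-break spaces with ordinary spaces before a name is logged or used as a file name. This is a sketch under assumptions, not a fix taken from the tool: `normalize_nbsp` is a hypothetical helper, and the thread does not establish whether the no-break space comes from the note title itself or is introduced elsewhere.

```python
NBSP = "\xa0"  # U+00A0 NO-BREAK SPACE


def normalize_nbsp(name: str) -> str:
    """Hypothetical helper: replace every no-break space with a plain space."""
    return name.replace(NBSP, " ")


if __name__ == "__main__":
    # Title fragment shaped like the one in the traceback arguments.
    title = "初阶数据结构" + NBSP + "-" + NBSP + "今日头条.md"
    cleaned = normalize_nbsp(title)
    # After normalization the name encodes with gbk without raising.
    cleaned.encode("gbk")
    print(cleaned)
```

Normalizing changes the local file name relative to the original title, so a tool that maps local files back to remote notes by name would need to apply the same normalization consistently.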

File link: 【有道云笔记】【最简单易懂的10堂算法入门课——初阶数据结构 - 今日头条】
https://note.youdao.com/s/9qDom1eY

The error looks like this:
[Screenshot: 捕获11]
