Merge pull request #4 from sudoskys/dev
Fix underline support, fix `\(`, add some intro
sudoskys authored May 25, 2024
2 parents 425514b + d3e1611 commit 67dd16d
Showing 11 changed files with 235 additions and 102 deletions.
Binary file added .github/result.png
75 changes: 48 additions & 27 deletions README.md
@@ -25,6 +25,30 @@ or if you use `pdm`:
pdm add telegramify-markdown
```

## Supported Features

- [x] Headings (1-6)
- [x] Links [text](url)
- [x] Images ![alt]
- [x] Lists (Ordered, Unordered)
- [x] Tables |-|-|
- [x] Horizontal Rule ----
- [x] *Text* **Styles**
- [x] __Underline__
- [x] Code Blocks
- [x] `Inline Code`
- [x] Block Quotes
- [x] ~~Strikethrough~~
- [ ] Task Lists
- [ ] ~Strikethrough~
- [ ] ||Spoiler||
- [ ] Telegram custom emoji
- [ ] Telegram user mentions

> [!NOTE]
> Because mistletoe does not parse task lists or spoilers, they cannot be converted.
> Single-tilde `~Strikethrough~` is not supported, even though it appears in the official Telegram documentation; use `~~Strikethrough~~` instead.
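
If existing input already uses the single-tilde form, one possible workaround is to normalize it before conversion. The sketch below is not part of telegramify-markdown: `normalize_strikethrough` is a hypothetical helper, and its regular expression is a simplification that does not skip tildes inside inline code or code blocks.

```python
import re

# Hypothetical helper, not part of telegramify-markdown.
# Matches ~text~ where neither tilde sits next to another tilde.
_SINGLE_TILDE = re.compile(r"(?<!~)~(?!~)([^~\n]+)~(?!~)")


def normalize_strikethrough(text: str) -> str:
    """Rewrite ~text~ as ~~text~~, leaving ~~text~~ untouched."""
    return _SINGLE_TILDE.sub(r"~~\1~~", text)


print(normalize_strikethrough("~old~ and ~~kept~~"))  # ~~old~~ and ~~kept~~
```

The result could then be passed to `telegramify_markdown.convert` as usual.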

## Use case

````python3
@@ -34,37 +58,34 @@ from telegramify_markdown.customize import markdown_symbol
markdown_symbol.head_level_1 = "📌"  # Optional: customize the level-1 heading symbol
markdown_symbol.link = "🔗"  # Optional: customize the link symbol
md = r"""
# 一级标题 `c!ode` # 一级标题 `code`
[Link!AA](https://www.example.com)
[key!]: https://www.google.com "a title!"
[这是!链接2][asd!asd](https://www.example.com)
[rttt]()
![PIC](https://www.example.com/image.jpg)
1. Order!ed
1. Order!ed sub
- Unord*-.ered
'\_', '\*', '\[', '\]', '\(', '\)', '\~', '\`', '\>', '\#', '\+', '\-', '\=', '\|', '\{', '\}', '\.', '\!'
_ , * , [ , ] , ( , ) , ~ , ` , > , # , + , - , = , | , { , } , . , !
**bold text**
*bold text*
_italic text_
__underline__
~no valid strikethrough~
~~strikethrough~~
||spoiler||
*bold _italic bold ~~italic bold strikethrough ||italic bold strikethrough spoiler||~~ __underline italic bold___ bold*
__underline italic bold__
[link](https://www.google.com)
- [ ] Uncompleted task list item
- [x] Completed task list item
> Quote
```python
print("Hello, World!")
```
This is `inline code`
1. First ordered list item
2. Another item
- Unordered sub-list.
1. Actual numbers don't matter, just that it's a number
"""
converted = telegramify_markdown.convert(md)
print(converted)
````

The output is as follows:

```markdown
*📌 一级标题 `c\!ode` \# 一级标题 `code`*
[Link\!AA](https://www\.example\.com)

🔗[a title\!](https://www\.google\.com)

\[这是\!链接2\][asd\!asd](https://www\.example\.com)
[rttt]()
🖼[PIC](https://www\.example\.com/image\.jpg)
1\. Order\!ed
1\. Order\!ed sub
⦁ Unord\*\-\.ered
```

> Note: The Telegram server processes a doubled backslash (`\\`) again, even after escaping, which is beyond our control.

![.github/result.png](.github/result.png)
2 changes: 1 addition & 1 deletion pdm.lock

85 changes: 55 additions & 30 deletions playground/exp1.md
@@ -2,36 +2,40 @@
key: value
---

# 一级标题 `c!ode` # 一级标题 `code`
\(c!ode\)

## 二级标题
\# Heading Level 1 `c!ode`

### 三级标题
# Heading Level 1 `c!ode`

## Heading Level 2

### Heading Level 3

Header
======

included as literal
1231asdasd

**这是粗!体文本**
*这是斜!体文本*
~~这是删!除线文本~~
**Bold text**
*Italic text*
~~Strikethrough text~~

> 这是引用!文本
> Blockquote text
`这是内联!代码\\\\`
`Inline code`

\\\/\111`sad`
\\/\\111`sad`

```
这是代码块!
Code block
```

```python
# 这是带有语言指定的代码块
# Code block with specified language
print("Hello, World!")
```

@@ -42,29 +46,50 @@ print("Hello, World!")

1. numbered item

[key!]: https://www.google.com "a title!"

<p>some text</p>
[some text](https://www.example.com)

[这是链!接](https://www.example.com)

[这是!链接2][asd!asd](https://www.example.com)
[some text2][asd!asd](https://www.example.com)
[rttt]()
[这是链接3][asdasd2]
[some text3][asdasd2]

![这是图片](https://www.example.com/image.jpg)
![Image](https://www.example.com/image.jpg)

<https://www.google.com>

---
这是水平线

内置的 **加粗***斜体* 文本

| 表头 | 表头 |
|-----|-----|
| 单元格 | 单元格 |
| 单元格 | 单元格 |

- [ ] 这是未完成的任务列表项
- [x] 这是已完成的任务列表项
Horizontal Rule

**Bold** and *Italic* text

| Header | Header |
|--------|--------|
| Cell | Cell |
| Cell | Cell |

- [ ] Uncompleted task list item
- [x] Completed task list item

In all other places characters '_', '*', '[', ']', '(', ')', '~', '`', '>', '#', '+', '-', '=', '|', '{', '}', '.', '!' must be escaped with the preceding character '\'.
In all other places characters '\_', '\*', '\[', '\]', '\(', '\)', '\~', '\`', '\>', '\#', '\+', '\-', '\=', '\|', '\{', '\}', '\.', '\!' must be escaped with the preceding character '\'.

*bold \*text*
_italic \*text_
__underline__
~strikethrough~
||spoiler||
*bold _italic bold ~italic bold strikethrough ||italic bold strikethrough spoiler||~ __underline italic bold___ bold*
[inline URL](http://www.example.com/)
[inline mention of a user](tg://user?id=123456789)
![👍](tg://emoji?id=5368324170671202286)
`inline fixed-width code`
```
pre-formatted fixed-width code block
```
```lua
pre-formatted fixed-width code block written in the Python programming language
```
>Block quotation started
>Block quotation continued
>The last line of the block quotation**
>The second block quotation started right after the previous\r
>The third block quotation started right after the previous
43 changes: 43 additions & 0 deletions playground/show_send.py
@@ -0,0 +1,43 @@
import os

from dotenv import load_dotenv
from telebot import TeleBot

import telegramify_markdown

md = r"""
'\_', '\*', '\[', '\]', '\(', '\)', '\~', '\`', '\>', '\#', '\+', '\-', '\=', '\|', '\{', '\}', '\.', '\!'
_ , * , [ , ] , ( , ) , ~ , ` , > , # , + , - , = , | , { , } , . , !
**bold text**
*bold text*
_italic text_
__underline__
~no valid strikethrough~
~~strikethrough~~
||spoiler||
*bold _italic bold ~~italic bold strikethrough ||italic bold strikethrough spoiler||~~ __underline italic bold___ bold*
__underline italic bold__
[link](https://www.google.com)
- [ ] Uncompleted task list item
- [x] Completed task list item
> Quote
```python
print("Hello, World!")
```
This is `inline code`
1. First ordered list item
2. Another item
- Unordered sub-list.
1. Actual numbers don't matter, just that it's a number
"""
converted = telegramify_markdown.convert(md)
print(converted)
load_dotenv()
telegram_bot_token = os.getenv("TELEGRAM_BOT_TOKEN", None)
chat_id = os.getenv("TELEGRAM_CHAT_ID", None)
bot = TeleBot(telegram_bot_token)
bot.send_message(
    chat_id,
    converted,
    parse_mode="MarkdownV2"
)
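
The script above hands whatever `os.getenv` returns to `TeleBot`, so an unset variable is passed along as `None`. One option is to check both values up front. This is a minimal sketch assuming the same two variable names; `load_telegram_settings` is a hypothetical helper, not part of this commit.

```python
import os
from typing import Mapping, Tuple


def load_telegram_settings(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (bot_token, chat_id), raising if either variable is unset or empty."""
    names = ("TELEGRAM_BOT_TOKEN", "TELEGRAM_CHAT_ID")
    missing = [name for name in names if not env.get(name)]
    if missing:
        # Fail early with a message that names the missing variables.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env[names[0]], env[names[1]]
```

Calling it after `load_dotenv()` and before constructing `TeleBot` would leave the rest of the script unchanged.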
3 changes: 3 additions & 0 deletions playground/telegram_exp.py
@@ -2,6 +2,7 @@


def ignore(a):
    print(a)
    pass


@@ -29,3 +30,5 @@ def ignore(a):
""">Hello, World\!"""
ignore(formatting.escape_markdown("Hello, World!"))
"""Hello, World\!"""
ignore(formatting.escape_markdown("\(Hello, World!)"))
"""Hello, World\!"""
13 changes: 2 additions & 11 deletions playground/use_case.py
@@ -4,17 +4,8 @@
markdown_symbol.head_level_1 = "📌"  # Optional: customize the level-1 heading symbol
markdown_symbol.link = "🔗"  # Optional: customize the link symbol
md = """
# 一级标题 `c!ode` # 一级标题 `code`
[Link!AA](https://www.example.com)
[key!]: https://www.google.com "a title!"
[这是!链接2][asd!asd](https://www.example.com)
[rttt]()
![PIC](https://www.example.com/image.jpg)
1. Order!ed
1. Order!ed sub
- Unord*-.ered
*bold _italic bold ~italic bold strikethrough ||italic bold strikethrough spoiler||~ __underline italic bold___ bold*
~strikethrough~
"""
converted = telegramify_markdown.convert(md)
print(converted)
4 changes: 2 additions & 2 deletions pyproject.toml
@@ -1,12 +1,12 @@
[project]
name = "telegramify-markdown"
version = "0.1.2"
version = "0.1.3"
description = "Convert Markdown to a format usable by Telegram."
authors = [
{ name = "sudoskys", email = "coldlando@hotmail.com" },
]
dependencies = [
"mistletoe>=1.3.0",
"mistletoe==1.3.0",
"pytelegrambotapi>=4.16.1",
"emoji>=2.10.1",
]
11 changes: 9 additions & 2 deletions src/telegramify_markdown/__init__.py
@@ -1,4 +1,3 @@
import os
from typing import Union

import mistletoe
@@ -10,16 +9,24 @@
from .render import TelegramMarkdownRenderer


def markdownify(text: str):
    # '_', '*', '[', ']', '(', ')', '~', '`', '>', '#', '+', '-', '=', '|', '{', '}', '.', '!'
    # if text in ["_", "*", "[", "]", "(", ")", "~", "`", ">", "#", "+", "-", "=", "|", "{", "}", ".", "!"]:
    #     return text
    return formatting.escape_markdown(text)
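
`markdownify` delegates to `formatting.escape_markdown` from pyTelegramBotAPI. As a rough illustration of the MarkdownV2 rule behind it (each of the 18 reserved characters listed in the comment is prefixed with a backslash), here is a standalone sketch. `escape_markdown_v2` is a hypothetical name and this is not the library's implementation; for instance, it does nothing special with backslashes already present in the input.

```python
# Characters that Telegram's MarkdownV2 requires to be escaped in plain text.
RESERVED = "_*[]()~`>#+-=|{}.!"

# Translation table mapping each reserved character to its escaped form.
_ESCAPES = str.maketrans({char: "\\" + char for char in RESERVED})


def escape_markdown_v2(text: str) -> str:
    """Prefix every reserved character with a backslash."""
    return text.translate(_ESCAPES)


print(escape_markdown_v2("Hello, World!"))  # Hello, World\!
```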


def _update_text(token: Union[SpanToken, BlockToken]):
    """Update the text contents of a span token and its children.
    `InlineCode` tokens are left unchanged."""
    if isinstance(token, ThematicBreak):
        token.line = formatting.escape_markdown("————————")
        pass
    elif isinstance(token, LinkReferenceDefinition):
        pass
    else:
        assert hasattr(token, "content"), f"Token {token} has no content attribute"
        token.content = formatting.escape_markdown(token.content)
        token.content = markdownify(token.content)


def _update_block(token: BlockToken):
15 changes: 13 additions & 2 deletions src/telegramify_markdown/render.py
@@ -3,6 +3,7 @@
from mistletoe import block_token, span_token
from mistletoe.markdown_renderer import MarkdownRenderer, LinkReferenceDefinition, Fragment
from telebot import formatting

from .customize import markdown_symbol


@@ -64,11 +65,14 @@ def render_setext_heading(
        yield formatting.escape_markdown("——" * 5)

    def render_emphasis(self, token: span_token.Emphasis) -> Iterable[Fragment]:
        token.delimiter = "_"
        return super().render_emphasis(token)

    def render_strong(self, token: span_token.Strong) -> Iterable[Fragment]:
        return self.embed_span(Fragment(token.delimiter * 1), token.children)
        # Telegram bold is *text*, while __text__ is underline, so we need to check the delimiter
        if token.delimiter == "*":
            return self.embed_span(Fragment(token.delimiter * 1), token.children)
        # Otherwise keep the doubled delimiter (__text__)
        return self.embed_span(Fragment(token.delimiter * 2), token.children)
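
The delimiter handling in `render_strong` can be illustrated without mistletoe. The sketch below is a standalone approximation, assuming the parser reports `*` as the delimiter for `**text**` and `_` for `__text__`; `wrap_strong` is a hypothetical name, not a function in this repository.

```python
def wrap_strong(delimiter: str, inner: str) -> str:
    """Wrap already-escaped text the way the delimiter check in render_strong does."""
    if delimiter == "*":
        # Markdown **bold** becomes Telegram *bold* (single asterisk).
        marker = delimiter
    else:
        # Markdown __text__ keeps its doubled delimiter, which Telegram reads as underline.
        marker = delimiter * 2
    return f"{marker}{inner}{marker}"


print(wrap_strong("*", "bold text"))  # *bold text*
print(wrap_strong("_", "underline"))  # __underline__
```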

    def render_strikethrough(
        self, token: span_token.Strikethrough
@@ -129,6 +133,13 @@ def render_link_or_image(
    def render_auto_link(self, token: span_token.AutoLink) -> Iterable[Fragment]:
        yield Fragment(formatting.escape_markdown("<") + token.children[0].content + formatting.escape_markdown(">"))

    def render_escape_sequence(
        self, token: span_token.EscapeSequence
    ) -> Iterable[Fragment]:
        # Render the escape sequence.
        # Because escape_markdown has already been applied in the parser, we can skip it here.
        yield Fragment("" + token.children[0].content)

    def render_table(
        self, token: block_token.Table, max_line_length: int
    ) -> Iterable[str]: