Split to two languages
haydenwong7bm committed Feb 4, 2023
1 parent 5c3f16c commit f6ff4cf
Showing 2 changed files with 46 additions and 1 deletion.
3 changes: 2 additions & 1 deletion README.md
@@ -1,4 +1,5 @@

* [For English, please click here.](https://github.com/haydenwong7bm/inherited-glyphs-converter/blob/master/README_en.md)

# 傳承字形轉換器 (Inherited Glyphs Converter)
Converts Chinese text to [inherited glyphs](https://zh.wikipedia.org/wiki/%E8%88%8A%E5%AD%97%E5%BD%A2) (broadly following the [_List of Recommended Inherited Glyph Components_](https://github.com/ichitenfont/inheritedglyphs) standard), eliminating [xin zixing](https://zh.wikipedia.org/wiki/%E6%96%B0%E5%AD%97%E5%BD%A2), [Hong Kong](https://zh.wikipedia.org/wiki/%E5%B8%B8%E7%94%A8%E5%AD%97%E5%AD%97%E5%BD%A2%E8%A1%A8) and [Taiwan](https://zh.wikipedia.org/wiki/%E5%9C%8B%E5%AD%97%E6%A8%99%E6%BA%96%E5%AD%97%E9%AB%94) standard variant characters where the variant is [encoded separately](https://zh.wikipedia.org/wiki/%E4%B8%AD%E6%97%A5%E9%9F%93%E7%B5%B1%E4%B8%80%E8%A1%A8%E6%84%8F%E6%96%87%E5%AD%97#%E8%AA%8D%E5%90%8C%E5%8E%9F%E5%89%87%E8%88%87%E5%8E%9F%E5%AD%97%E9%9B%86%E5%88%86%E9%9B%A2%E5%8E%9F%E5%89%87) in Unicode.

44 changes: 44 additions & 0 deletions README_en.md
@@ -0,0 +1,44 @@
* [Click here for the Chinese version.](https://github.com/haydenwong7bm/inherited-glyphs-converter/)

# inherited-glyphs-converter
Converts CJK text to its [inherited glyph](https://en.wikipedia.org/wiki/Jiu_zixing) forms (mostly following the [_List of Recommended Inherited Glyph Components_](https://github.com/ichitenfont/inheritedglyphs)), eliminating the [xin zixing](https://en.wikipedia.org/wiki/Xin_zixing), [Hong Kong](https://en.wikipedia.org/wiki/List_of_Graphemes_of_Commonly-Used_Chinese_Characters) and [Taiwan](https://en.wikipedia.org/wiki/Standard_Form_of_National_Characters) standard variants where the variant is [encoded separately](https://en.wikipedia.org/wiki/CJK_Unified_Ideographs#CJK_Unified_Ideographs) in Unicode.

The converter will keep [Shinjitai](https://en.wikipedia.org/wiki/Shinjitai) and [simplified Chinese characters](https://en.wikipedia.org/wiki/Simplified_Chinese_characters) as much as possible.

## Usage

### Command line

python . <file name>

Command line arguments:

| **Options** | **Usage** | **Default value if `-o` not provided** |
|---|---|---|
| `-o` | Enable custom options. When provided, the flags below take effect; otherwise the default values in the last column apply. | |
| `-j` | Use Japanese [compatibility ideographs](https://en.wikipedia.org/wiki/CJK_Compatibility_Ideographs). | `True` |
| `-k` | Use Korean compatibility ideographs. | `True` |
| `-t` | Use [CNS 11643 compatibility ideographs](https://en.wikipedia.org/wiki/CJK_Compatibility_Ideographs_Supplement). | `True` |
| `-s <value>` | If `value` is `c`: Use only [UnihanCore2020](https://www.unicode.org/L2/L2019/19388-unihan-core-2020.pdf) characters on supplementary planes<br>If `value` is `*`: Use all characters on supplementary planes. | `c` |
| `-i` | Convert other inherited variants (e.g. 秘 → 祕, 裡 → 裏). | `True` |
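The relationship between these flags and the `convert()` keyword arguments can be sketched as follows. This is not the project's actual command-line code: `build_parser` and `to_convert_kwargs` are hypothetical names, and the sketch assumes that, once `-o` is given, any flag left out is switched off and an omitted `-s` maps to `use_supp=False`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the option table above."""
    parser = argparse.ArgumentParser(prog="python .")
    parser.add_argument("file", help="text file to convert")
    parser.add_argument("-o", action="store_true", help="enable custom options")
    parser.add_argument("-j", action="store_true", help="Japanese compatibility ideographs")
    parser.add_argument("-k", action="store_true", help="Korean compatibility ideographs")
    parser.add_argument("-t", action="store_true", help="CNS 11643 compatibility ideographs")
    parser.add_argument("-s", choices=["c", "*"], default=None, help="supplementary planes")
    parser.add_argument("-i", action="store_true", help="convert other inherited variants")
    return parser


def to_convert_kwargs(args: argparse.Namespace) -> dict:
    """Translate parsed flags into keyword arguments for convert()."""
    if not args.o:
        # Without -o, the documented defaults apply.
        return {"use_compatibility": "jkt", "convert_inherited": True, "use_supp": "c"}
    # Assumption: with -o, only the flags that were passed are enabled.
    compat = "".join(flag for flag in "jkt" if getattr(args, flag))
    return {
        "use_compatibility": compat,
        "convert_inherited": args.i,
        "use_supp": args.s if args.s is not None else False,
    }


if __name__ == "__main__":
    print(to_convert_kwargs(build_parser().parse_args()))
```

Under these assumptions, `python . -o -j -s "*" input.txt` would request Japanese compatibility ideographs only, all supplementary-plane characters, and no conversion of other inherited variants.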

### Import module
The `inheritedglyphs` module provides a single function, `convert()`, which converts a string to its inherited glyph form.

Function arguments:

| **Arguments** | **Usage** | **Default value** |
|---|---|---|
| `use_compatibility` | An iterable that contains `'j'`, `'k'`, and/or `'t'`.<br>`'j'`: Use Japanese [compatibility ideographs](https://en.wikipedia.org/wiki/CJK_Compatibility_Ideographs).<br>`'k'`: Use Korean compatibility ideographs.<br>`'t'`: Use [CNS 11643 compatibility ideographs](https://en.wikipedia.org/wiki/CJK_Compatibility_Ideographs_Supplement). | `'jkt'` |
| `convert_inherited` | If `True`, also convert other inherited variants (e.g. 秘 → 祕, 裡 → 裏). | `True` |
| `use_supp` | One of `False`, `'c'`, or `'*'`.<br>`'c'`: in supplementary planes, use only [UnihanCore2020](https://www.unicode.org/L2/L2019/19388-unihan-core-2020.pdf) characters.<br>`'*'`: in supplementary planes, use all characters. | `'c'` |

>>> from inheritedglyphs import convert
>>> string = '教育及青年發展局是澳門特區政府社會文化司成立的公共部門。'
>>> print(convert(string))
敎育及靑年發展局是澳門特區政府社會文化司成立的公共部門。
>>> print(convert(string, use_compatibility='j')) # don't use Korean and CNS compatibility ideographs
敎育及靑年發展局是澳門特區政府社會文化司成立的公共部門。
>>> string = '李白(唐‧五言絶句)《靜夜思》:「床前明月光,疑是地上霜,舉頭望明月,低頭思故鄉。」'
>>> print(convert(string, convert_inherited=False))
李白(唐‧五言絕句)《靜夜思》:「床前明月光,疑是地上霜,擧頭望明月,低頭思故鄕。」
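
Compatibility ideographs have canonical decompositions in Unicode, so NFC or NFD normalization applied to converted text later on replaces them with their unified counterparts. The sketch below is not part of `inheritedglyphs`; the helper names are hypothetical, and it uses only the standard `unicodedata` module to report which characters in a string are compatibility ideographs or supplementary-plane characters, and whether the string survives NFC unchanged.

```python
import unicodedata

# CJK Compatibility Ideographs and CJK Compatibility Ideographs Supplement blocks.
COMPAT_BLOCKS = ((0xF900, 0xFAFF), (0x2F800, 0x2FA1F))


def is_compatibility_ideograph(ch: str) -> bool:
    """True for characters in the compatibility blocks that decompose.

    A few code points in U+F900..U+FAFF are ordinary unified ideographs
    with no decomposition, so the block range alone is not sufficient.
    """
    cp = ord(ch)
    in_block = any(lo <= cp <= hi for lo, hi in COMPAT_BLOCKS)
    return in_block and unicodedata.decomposition(ch) != ""


def is_supplementary(ch: str) -> bool:
    """True for characters outside the Basic Multilingual Plane."""
    return ord(ch) > 0xFFFF


def summarize(text: str) -> dict:
    """Report characters that may need special handling downstream."""
    return {
        "compatibility": [c for c in text if is_compatibility_ideograph(c)],
        "supplementary": [c for c in text if is_supplementary(c)],
        "survives_nfc": unicodedata.normalize("NFC", text) == text,
    }
```

Running `summarize()` on the output of `convert()` gives a quick indication of whether `use_compatibility` or `use_supp` should be narrowed for a pipeline that normalizes text or cannot render supplementary-plane characters.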
