Add content src/site/notes/OpenSource/正则表达式.md
1 parent 3e235d5, commit e225f93
Showing 1 changed file with 59 additions and 0 deletions.
@@ -0,0 +1,59 @@
---
{"dg-publish":true,"permalink":"/OpenSource/正则表达式/","noteIcon":"3"}
---

#regex
### 1. Word Boundary

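A **word boundary** is a regular expression concept that marks the position separating a word from other characters. It is a zero-width logical position, not an actual character. A word boundary occurs:

1. Between a **word character** (letter, digit, or underscore) and a **non-word character**.
2. Between a **word character** and the start or end of the string.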
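
The two rules above reduce to a single test: a position is a boundary when exactly one of its two neighbouring characters is a word character, with the start and end of the string counting as non-word. Below is a minimal Python sketch of that test, compared against the `re` module. The helper names `is_word_char` and `boundary_positions` are illustrative only, and the ASCII-only notion of a word character is an assumption taken from this note.

```python
import re

def is_word_char(ch):
    # ASCII-only definition assumed in this note: letters, digits, underscore.
    return ch.isascii() and (ch.isalnum() or ch == "_")

def boundary_positions(text):
    """Return every index i where a word boundary sits just before text[i]."""
    positions = []
    for i in range(len(text) + 1):
        # Anything outside the string counts as "not a word character".
        before = i > 0 and is_word_char(text[i - 1])
        after = i < len(text) and is_word_char(text[i])
        if before != after:
            positions.append(i)
    return positions

sample = "foo_1 bar"
manual = boundary_positions(sample)
# re.ASCII restricts \w (and therefore \b) to the same ASCII definition.
via_regex = [m.start() for m in re.finditer(r"\b", sample, flags=re.ASCII)]
print(manual)               # [0, 5, 6, 9]
print(manual == via_regex)  # True
```

Note that `foo_1` counts as one word here, because the underscore and the digit are both word characters.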

**Definition of a word character**

A word character usually includes:

• Letters: a-z and A-Z

• Digits: 0-9

• Underscore: _

A non-word character is any character outside these ranges, such as a space, punctuation (e.g. `,`, `.`, `!`), or a newline.
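
To see which characters fall into each class, a short Python sketch can test them one by one against `\w`. One assumption to flag: Python's `re` treats `\w` as Unicode-aware for `str` patterns by default, so `re.ASCII` is passed here to get the ASCII-only definition listed above. Other regex engines have their own defaults. The function name `classify` is illustrative.

```python
import re

# \w limited to a-z, A-Z, 0-9 and underscore via the re.ASCII flag.
WORD_CHAR = re.compile(r"\w", flags=re.ASCII)

def classify(ch):
    """Label a single character as 'word' or 'non-word'."""
    return "word" if WORD_CHAR.fullmatch(ch) else "non-word"

for ch in ["a", "Z", "7", "_", " ", ",", ".", "!", "\n"]:
    print(repr(ch), classify(ch))
```

The first four characters are reported as `word` and the rest as `non-word`.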
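
**Matching rules for word boundaries**

Common cases where a word boundary matches:

• At the start of a word: the position just before the first word character (preceded by a non-word character or the start of the string).

• At the end of a word: the position just after the last word character (followed by a non-word character or the end of the string).
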
**Example:**

Given the string:
"hello world, hi!"

1. **Word boundary positions**

In this string, word boundaries occur at the following places (each marked with a vertical bar `|`):
`|hello| |world|, |hi|!`
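
The marked-up string can be reproduced by substituting a bar at every `\b` match. This is a small Python sketch using the standard `re` module; the variable names are illustrative.

```python
import re

text = "hello world, hi!"

# \b matches an empty string at each boundary, so substituting inserts a bar there.
marked = re.sub(r"\b", "|", text)
# The same boundaries as zero-based offsets into the string.
positions = [m.start() for m in re.finditer(r"\b", text)]

print(marked)     # |hello| |world|, |hi|!
print(positions)  # [0, 5, 6, 11, 13, 15]
```

There is no boundary after the final `!`, because neither `!` nor the end of the string is a word character.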

2. **Common matching results**

• `\bhello\b` matches the complete word hello; it does not match inside helloworld or chello.

• `\bhi\b` matches only a standalone hi; it does not match the hi inside this.
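
These results can be checked with a few lines of Python. The helper `has_whole_word` is a hypothetical name introduced for this sketch; it wraps the word in `\b...\b` and reports whether a match exists.

```python
import re

def has_whole_word(word, text):
    """Return True if `word` occurs in `text` as a complete word."""
    # re.escape keeps any regex metacharacters in `word` literal.
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text) is not None

print(has_whole_word("hello", "hello world, hi!"))  # True
print(has_whole_word("hello", "helloworld"))        # False
print(has_whole_word("hello", "chello"))            # False
print(has_whole_word("hi", "hello world, hi!"))     # True
print(has_whole_word("hi", "this"))                 # False
```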

**Uses of word boundaries**
• Restrict the match: ensure that only whole words match, not parts of words.
• Avoid false matches: for example, match cat but not category or concatenate.
**Example:**
1. Match a standalone word:
`echo "cat dog category" | sed -E 's/\bcat\b/CAT/g'`

Output:
`CAT dog category`

(Only the standalone cat is replaced; category is left unchanged.)
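
The same substitution can be cross-checked outside the shell, which helps when the installed `sed` may not support `\b` (GNU sed does). This is a minimal Python sketch, not a replacement for the one-liner; the variable names and the third input string are made up for illustration.

```python
import re

line = "cat dog category"

# With boundaries: only the standalone word is replaced.
bounded = re.sub(r"\bcat\b", "CAT", line)
# Without boundaries: the "cat" inside "category" is replaced too.
unbounded = re.sub(r"cat", "CAT", line)
# Punctuation also counts as a non-word character next to a word.
mixed = re.sub(r"\bcat\b", "CAT", "cat, concatenate; cat!")

print(bounded)    # CAT dog category
print(unbounded)  # CAT dog CATegory
print(mixed)      # CAT, concatenate; CAT!
```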