Add support for the POSIX standard. #17

Merged
merged 1 commit into from
Jan 14, 2019
4 changes: 2 additions & 2 deletions CONTRIBUTING.md
@@ -12,7 +12,7 @@ We gladly accept any PR's assuming they are well written, documented ( if necess

If you're unsure if we'll accept a new feature please open an issue requesting it and we can have a discussion before you code and submit a PR.

If you update the regular expression, you must update its link.
If you update the regular expression, you must update its link and the corresponding regular expression in [POSIX.md](/POSIX.md).

If you change the list of test numbers for a regular expression, you must update the test numbers for all regular expressions to make sure they are consistent. This means you have to regenerate a new link for each regular expression and update it.

@@ -39,7 +39,7 @@ Please do not be offended if we close your issue and reference this document. If

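If you are unsure whether we will accept a new feature, please open an issue to ask, and we can discuss it before you write code and submit a PR.

If you update a regular expression, you must update its link.
If you update a regular expression, you must update its link and the corresponding regular expression in [POSIX-CN.md](/POSIX-CN.md).

If you change the list of test numbers for one regular expression, you must update the test numbers for all regular expressions so that they stay consistent. This means you have to regenerate and update the link for every regular expression.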

258 changes: 258 additions & 0 deletions POSIX-CN.md
@@ -0,0 +1,258 @@
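# [POSIX] Standard Regular Expressions

There are currently two main regular expression standards, [PCRE] and [POSIX]. [POSIX] is further divided into several flavors: BRE (grep, sed, etc.), GNU BRE (GNU grep, GNU sed, etc.), ERE (egrep, awk, etc.), GNU ERE (grep -E, GNU awk, etc.), and the deprecated SRE. Their syntaxes all differ from one another.

Because the [PCRE] standard is widely supported by common programming languages, all regular expressions in this project are written in [PCRE]. The [POSIX] standard is mainly supported by the built-in commands of Unix-like systems, such as `awk` and `sed` on Linux.

Unix-like systems are numerous and have a long history. A single command may have many implementations (for example, `awk` is available as `awk`, `gawk`, `mawk`, `nawk`, and others), and different commands on the same system may use different flavors, so it is difficult to stay compatible with every flavor and command. The regular expressions listed in this document are therefore compatible only with the following two flavors:

- GNU BRE (gsed - GNU sed)
- GNU ERE (gawk - GNU awk, ggrep - GNU grep)

To learn more, read [Wikipedia - Regular expression](https://en.wikipedia.org/wiki/Regular_expression).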
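To make the difference between the two supported flavors concrete, the sketch below checks the same numbers against one short expression written both ways. The pattern is the China Telecom MVNO expression listed later in this document; the helper names `matches_bre` and `matches_ere` and the sample numbers are illustrative assumptions. In the ERE variant the leading plus is written as `\+`, because POSIX leaves a bare `+` directly after `(` undefined in an ERE; that escape is an assumption of this sketch, not the form used in the tables.

```shell
#!/bin/sh
# GNU BRE: grouping, intervals and the optional operator are backslash-escaped.
bre='^\(+\?86\)\?170[0-2][0-9]\{7\}$'
# ERE: the same operators are written bare (assumption: literal plus escaped as \+).
ere='^(\+?86)?170[0-2][0-9]{7}$'

# Hypothetical helpers: succeed when the argument matches the expression.
matches_bre() { printf '%s\n' "$1" | grep -q -e "$bre"; }
matches_ere() { printf '%s\n' "$1" | grep -E -q -e "$ere"; }

for number in 17001234567 +8617001234567 17091234567; do
  if matches_bre "$number"; then b=match; else b=no-match; fi
  if matches_ere "$number"; then e=match; else e=no-match; fi
  printf '%s BRE=%s ERE=%s\n' "$number" "$b" "$e"
done
```

With GNU grep, both variants should agree on every sample: the first two numbers match and `17091234567` does not, because `9` falls outside `[0-2]`.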

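## Regular expressions

### Match all numbers (phone number + IoT number + data-only number)

<!--
Please note:
GitHub Markdown requires a '|' character inside a table to be escaped, so:

- In every GNU ERE regex, the '\' before a '|' is there only to escape the '|'.
- In every GNU BRE regex, the '\\' before a '|' is there only to escape the '\|'.

These escape characters are not part of the regular expression. Please take care when you modify a regular expression.
To guard against rendering errors, every regular expression must also be written in its unescaped form in a comment.
See https://help.github.com/articles/organizing-information-with-tables/#formatting-content-within-your-table
-->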
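When an expression is copied from the raw Markdown source instead of the rendered page, the table cells still carry one extra backslash in front of every `|`. A minimal sketch of removing it with GNU sed follows; the helper name `unescape_table_cell` is hypothetical.

```shell
#!/bin/sh
# Remove the one backslash that Markdown needs in front of each '|'.
#   ERE cell:  a\|b   becomes  a|b
#   BRE cell:  a\\|b  becomes  a\|b
unescape_table_cell() {
  sed 's/\\|/|/g'
}

# ERE cell as it appears in the raw table source.
printf '%s\n' '^(70[4789]\|71[0-9]\|67[0-9])$' | unescape_table_cell
# BRE cell as it appears in the raw table source.
printf '%s\n' '^\(78\\|98\)$' | unescape_table_cell
```

The same substitution serves both flavors, since in each case exactly one backslash directly before the `|` belongs to Markdown and not to the expression.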

<!--
| GNU ERE | `^(+?86)?1(3[0-9]{3}|5[012356789][0-9]{2}|8[0-9]{3}|7([01356789][0-9]{2}|4(0[0-9]|1[0-2]|9[0-9]))|9[189][0-9]{2}|6[567][0-9]{2}|4([14]0[0-9]{3}|[68][0-9]{4}|[579][0-9]{2}))[0-9]{6}$` |
| GNU BRE | `^\(+\?86\)\?1\(3[0-9]\{3\}\|5[012356789][0-9]\{2\}\|8[0-9]\{3\}\|7\([01356789][0-9]\{2\}\|4\(0[0-9]\|1[0-2]\|9[0-9]\)\)\|9[189][0-9]\{2\}\|6[567][0-9]\{2\}\|4\([14]0[0-9]\{3\}\|[68][0-9]\{4\}\|[579][0-9]\{2\}\)\)[0-9]\{6\}$` |
-->

| Flavor | Regular expression |
| --- | --- |
| GNU ERE | `^(+?86)?1(3[0-9]{3}\|5[012356789][0-9]{2}\|8[0-9]{3}\|7([01356789][0-9]{2}\|4(0[0-9]\|1[0-2]\|9[0-9]))\|9[189][0-9]{2}\|6[567][0-9]{2}\|4([14]0[0-9]{3}\|[68][0-9]{4}\|[579][0-9]{2}))[0-9]{6}$` |
| GNU BRE | `^\(+\?86\)\?1\(3[0-9]\{3\}\\|5[012356789][0-9]\{2\}\\|8[0-9]\{3\}\\|7\([01356789][0-9]\{2\}\\|4\(0[0-9]\\|1[0-2]\\|9[0-9]\)\)\\|9[189][0-9]\{2\}\\|6[567][0-9]\{2\}\\|4\([14]0[0-9]\{3\}\\|[68][0-9]\{4\}\\|[579][0-9]\{2\}\)\)[0-9]\{6\}$` |
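As a usage sketch, the GNU ERE above can filter a list of candidate numbers with `grep -E`. The helper name `filter_cn_numbers` and the sample numbers are illustrative assumptions, and the literal plus is written as `\+` here because POSIX leaves a bare `+` directly after `(` undefined in an ERE.

```shell
#!/bin/sh
# GNU ERE from the table above, with the Markdown escapes removed.
# Assumption: the literal plus is written as \+ instead of a bare +.
all_numbers='^(\+?86)?1(3[0-9]{3}|5[012356789][0-9]{2}|8[0-9]{3}|7([01356789][0-9]{2}|4(0[0-9]|1[0-2]|9[0-9]))|9[189][0-9]{2}|6[567][0-9]{2}|4([14]0[0-9]{3}|[68][0-9]{4}|[579][0-9]{2}))[0-9]{6}$'

# Hypothetical helper: read candidate numbers on stdin, print the valid ones.
filter_cn_numbers() { grep -E -e "$all_numbers"; }

printf '%s\n' 13800138000 +8613800138000 1440012345678 12345678901 15412345678 |
  filter_cn_numbers
```

The first three samples pass the filter. `12345678901` and `15412345678` are dropped because no alternative in the expression starts with `12` or `154`.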

### Match all numbers with SMS support (phone number + data-only number)

<!--
| GNU ERE | `^(+?86)?1(3[0-9]{3}|5[012356789][0-9]{2}|8[0-9]{3}|7([01356789][0-9]{2}|4(0[0-9]|1[0-2]|9[0-9]))|9[189][0-9]{2}|6[567][0-9]{2}|4[579][0-9]{2})[0-9]{6}$` |
| GNU BRE | `^\(+\?86\)\?1\(3[0-9]\{3\}\|5[012356789][0-9]\{2\}\|8[0-9]\{3\}\|7\([01356789][0-9]\{2\}\|4\(0[0-9]\|1[0-2]\|9[0-9]\)\)\|9[189][0-9]\{2\}\|6[567][0-9]\{2\}\|4[579][0-9]\{2\}\)[0-9]\{6\}$` |
-->

| Flavor | Regular expression |
| --- | --- |
| GNU ERE | `^(+?86)?1(3[0-9]{3}\|5[012356789][0-9]{2}\|8[0-9]{3}\|7([01356789][0-9]{2}\|4(0[0-9]\|1[0-2]\|9[0-9]))\|9[189][0-9]{2}\|6[567][0-9]{2}\|4[579][0-9]{2})[0-9]{6}$` |
| GNU BRE | `^\(+\?86\)\?1\(3[0-9]\{3\}\\|5[012356789][0-9]\{2\}\\|8[0-9]\{3\}\\|7\([01356789][0-9]\{2\}\\|4\(0[0-9]\\|1[0-2]\\|9[0-9]\)\)\\|9[189][0-9]\{2\}\\|6[567][0-9]\{2\}\\|4[579][0-9]\{2\}\)[0-9]\{6\}$` |

### Mobile phone number

#### Match all

<!--
| GNU ERE | `^(+?86)?1(3[0-9]{3}|5[012356789][0-9]{2}|8[0-9]{3}|7([35678][0-9]{2}|4(0[0-9]|1[0-2]|9[0-9]))|9[189][0-9]{2}|66[0-9]{2})[0-9]{6}$` |
| GNU BRE | `^\(+\?86\)\?1\(3[0-9]\{3\}\|5[012356789][0-9]\{2\}\|8[0-9]\{3\}\|7\([35678][0-9]\{2\}\|4\(0[0-9]\|1[0-2]\|9[0-9]\)\)\|9[189][0-9]\{2\}\|66[0-9]\{2\}\)[0-9]\{6\}$` |
-->

| Flavor | Regular expression |
| --- | --- |
| GNU ERE | `^(+?86)?1(3[0-9]{3}\|5[012356789][0-9]{2}\|8[0-9]{3}\|7([35678][0-9]{2}\|4(0[0-9]\|1[0-2]\|9[0-9]))\|9[189][0-9]{2}\|66[0-9]{2})[0-9]{6}$` |
| GNU BRE | `^\(+\?86\)\?1\(3[0-9]\{3\}\\|5[012356789][0-9]\{2\}\\|8[0-9]\{3\}\\|7\([35678][0-9]\{2\}\\|4\(0[0-9]\\|1[0-2]\\|9[0-9]\)\)\\|9[189][0-9]\{2\}\\|66[0-9]\{2\}\)[0-9]\{6\}$` |

#### Match China Mobile

<!--
| GNU ERE | `^(+?86)?1(3(4[0-8]|[5-9][0-9])|5[012789][0-9]|8[23478][0-9]|(78|98)[0-9])[0-9]{7}$` |
| GNU BRE | `^\(+\?86\)\?1\(3\(4[0-8]\|[5-9][0-9]\)\|5[012789][0-9]\|8[23478][0-9]\|\(78\|98\)[0-9]\)[0-9]\{7\}$` |
-->

| Flavor | Regular expression |
| --- | --- |
| GNU ERE | `^(+?86)?1(3(4[0-8]\|[5-9][0-9])\|5[012789][0-9]\|8[23478][0-9]\|(78\|98)[0-9])[0-9]{7}$` |
| GNU BRE | `^\(+\?86\)\?1\(3\(4[0-8]\\|[5-9][0-9]\)\\|5[012789][0-9]\\|8[23478][0-9]\\|\(78\\|98\)[0-9]\)[0-9]\{7\}$` |
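The GNU BRE form is the one to use with `sed`. The sketch below prints only the input lines that are China Mobile numbers; it assumes GNU sed (the `\|` and `\?` operators are GNU extensions), and the helper name `only_china_mobile` and the sample numbers are hypothetical.

```shell
#!/bin/sh
# GNU BRE for China Mobile, copied from the table with Markdown escapes removed.
cmcc_bre='^\(+\?86\)\?1\(3\(4[0-8]\|[5-9][0-9]\)\|5[012789][0-9]\|8[23478][0-9]\|\(78\|98\)[0-9]\)[0-9]\{7\}$'

# Hypothetical helper: print only the input lines that are China Mobile numbers.
only_china_mobile() { sed -n "/$cmcc_bre/p"; }

printf '%s\n' 13912345678 +8618812345678 13012345678 13491234567 |
  only_china_mobile
```

Only the first two samples are printed. `130` is covered by the China Unicom expression and `1349` by the China Telecom expression, so neither matches here.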

#### Match China Unicom

<!--
| GNU ERE | `^(+?86)?1(3[0-2]|[578][56]|66)[0-9]{8}$` |
| GNU BRE | `^\(+\?86\)\?1\(3[0-2]\|[578][56]\|66\)[0-9]\{8\}$` |
-->

| Flavor | Regular expression |
| --- | --- |
| GNU ERE | `^(+?86)?1(3[0-2]\|[578][56]\|66)[0-9]{8}$` |
| GNU BRE | `^\(+\?86\)\?1\(3[0-2]\\|[578][56]\\|66\)[0-9]\{8\}$` |

#### Match China Telecom

<!--
| GNU ERE | `^(+?86)?1(3(3[0-9]|49)[0-9]|53[0-9]{2}|8[019][0-9]{2}|7([37][0-9]{2}|40[0-5])|9[19][0-9]{2})[0-9]{6}$` |
| GNU BRE | `^\(+\?86\)\?1\(3\(3[0-9]\|49\)[0-9]\|53[0-9]\{2\}\|8[019][0-9]\{2\}\|7\([37][0-9]\{2\}\|40[0-5]\)\|9[19][0-9]\{2\}\)[0-9]\{6\}$` |
-->

| Flavor | Regular expression |
| --- | --- |
| GNU ERE | `^(+?86)?1(3(3[0-9]\|49)[0-9]\|53[0-9]{2}\|8[019][0-9]{2}\|7([37][0-9]{2}\|40[0-5])\|9[19][0-9]{2})[0-9]{6}$` |
| GNU BRE | `^\(+\?86\)\?1\(3\(3[0-9]\\|49\)[0-9]\\|53[0-9]\{2\}\\|8[019][0-9]\{2\}\\|7\([37][0-9]\{2\}\\|40[0-5]\)\\|9[19][0-9]\{2\}\)[0-9]\{6\}$` |
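The China Mobile, China Unicom and China Telecom expressions in this section can be combined into a small carrier lookup. This is a minimal sketch using `grep -E`; the helper name `carrier_of` and the sample numbers are assumptions, and the literal plus is written as `\+` because POSIX leaves a bare `+` directly after `(` undefined in an ERE.

```shell
#!/bin/sh
# GNU EREs for the three carriers, Markdown escapes removed.
# Assumption: the literal plus is written as \+ instead of a bare +.
mobile='^(\+?86)?1(3(4[0-8]|[5-9][0-9])|5[012789][0-9]|8[23478][0-9]|(78|98)[0-9])[0-9]{7}$'
unicom='^(\+?86)?1(3[0-2]|[578][56]|66)[0-9]{8}$'
telecom='^(\+?86)?1(3(3[0-9]|49)[0-9]|53[0-9]{2}|8[019][0-9]{2}|7([37][0-9]{2}|40[0-5])|9[19][0-9]{2})[0-9]{6}$'

# Hypothetical helper: print the carrier name for one phone number.
carrier_of() {
  if printf '%s\n' "$1" | grep -E -q -e "$mobile"; then echo 'China Mobile'
  elif printf '%s\n' "$1" | grep -E -q -e "$unicom"; then echo 'China Unicom'
  elif printf '%s\n' "$1" | grep -E -q -e "$telecom"; then echo 'China Telecom'
  else echo 'unknown'
  fi
}

for number in 13912345678 18612345678 13312345678 12345678901; do
  printf '%s: %s\n' "$number" "$(carrier_of "$number")"
done
```

The order of the checks only matters for a number that more than one expression would accept; none of the sample numbers is such a case.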

#### Match Beijing Marine Communication and Navigation Company (maritime satellite communications)

<!--
| GNU ERE | `^(+?86)?1749[0-9]{7}$` |
| GNU BRE | `^\(+\?86\)\?1749[0-9]\{7\}$` |
-->

| Flavor | Regular expression |
| --- | --- |
| GNU ERE | `^(+?86)?1749[0-9]{7}$` |
| GNU BRE | `^\(+\?86\)\?1749[0-9]\{7\}$` |

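#### Match the Emergency Communication Support Center of the Ministry of Industry and Information Technology (emergency communications)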

<!--
| GNU ERE | `^(+?86)?174(0[6-9]|1[0-2])[0-9]{6}$` |
| GNU BRE | `^\(+\?86\)\?174\(0[6-9]\|1[0-2]\)[0-9]\{6\}$` |
-->

| Flavor | Regular expression |
| --- | --- |
| GNU ERE | `^(+?86)?174(0[6-9]\|1[0-2])[0-9]{6}$` |
| GNU BRE | `^\(+\?86\)\?174\(0[6-9]\\|1[0-2]\)[0-9]\{6\}$` |

### Mobile virtual network operators (MVNO)

#### Match all

<!--
| GNU ERE | `^(+?86)?1(7[01]|6[57])[0-9]{8}$` |
| GNU BRE | `^\(+\?86\)\?1\(7[01]\|6[57]\)[0-9]\{8\}$` |
-->

| Flavor | Regular expression |
| --- | --- |
| GNU ERE | `^(+?86)?1(7[01]\|6[57])[0-9]{8}$` |
| GNU BRE | `^\(+\?86\)\?1\(7[01]\\|6[57]\)[0-9]\{8\}$` |

#### Match China Mobile

<!--
| GNU ERE | `^(+?86)?1(65[0-9]|70[356])[0-9]{7}$` |
| GNU BRE | `^\(+\?86\)\?1\(65[0-9]\|70[356]\)[0-9]\{7\}$` |
-->

| Flavor | Regular expression |
| --- | --- |
| GNU ERE | `^(+?86)?1(65[0-9]\|70[356])[0-9]{7}$` |
| GNU BRE | `^\(+\?86\)\?1\(65[0-9]\\|70[356]\)[0-9]\{7\}$` |

#### Match China Unicom

<!--
| GNU ERE | `^(+?86)?1(70[4789]|71[0-9]|67[0-9])[0-9]{7}$` |
| GNU BRE | `^\(+\?86\)\?1\(70[4789]\|71[0-9]\|67[0-9]\)[0-9]\{7\}$` |
-->

| Flavor | Regular expression |
| --- | --- |
| GNU ERE | `^(+?86)?1(70[4789]\|71[0-9]\|67[0-9])[0-9]{7}$` |
| GNU BRE | `^\(+\?86\)\?1\(70[4789]\\|71[0-9]\\|67[0-9]\)[0-9]\{7\}$` |

#### Match China Telecom

<!--
| GNU ERE | `^(+?86)?170[0-2][0-9]{7}$` |
| GNU BRE | `^\(+\?86\)\?170[0-2][0-9]\{7\}$` |
-->

| Flavor | Regular expression |
| --- | --- |
| GNU ERE | `^(+?86)?170[0-2][0-9]{7}$` |
| GNU BRE | `^\(+\?86\)\?170[0-2][0-9]\{7\}$` |

### IoT number

#### Match all

<!--
| GNU ERE | `^(+?86)?14([14]0|[68][0-9])[0-9]{9}$` |
| GNU BRE | `^\(+\?86\)\?14\([14]0\|[68][0-9]\)[0-9]\{9\}$` |
-->

| Flavor | Regular expression |
| --- | --- |
| GNU ERE | `^(+?86)?14([14]0\|[68][0-9])[0-9]{9}$` |
| GNU BRE | `^\(+\?86\)\?14\([14]0\\|[68][0-9]\)[0-9]\{9\}$` |
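Unlike the 11-digit numbers in the other sections, the IoT expression accepts 13 digits: `14`, a two-digit segment, then nine more digits. The sketch below illustrates this with GNU grep in BRE mode; the helper name `is_iot_number` and the sample numbers are hypothetical.

```shell
#!/bin/sh
# GNU BRE for all IoT numbers, copied from the table with Markdown escapes removed.
iot_bre='^\(+\?86\)\?14\([14]0\|[68][0-9]\)[0-9]\{9\}$'

# Hypothetical helper: succeed when the argument is an IoT number.
is_iot_number() { printf '%s\n' "$1" | grep -q -e "$iot_bre"; }

for number in 1440012345678 1461234567890 14712345678; do
  if is_iot_number "$number"; then
    printf '%s (%s digits): IoT\n' "$number" "${#number}"
  else
    printf '%s (%s digits): not IoT\n' "$number" "${#number}"
  fi
done
```

The two 13-digit samples are reported as IoT numbers. The 11-digit `14712345678` is not, since `147` belongs to the data-only range.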

#### Match China Mobile

<!--
| GNU ERE | `^(+?86)?14(40|8[0-9])[0-9]{9}$` |
| GNU BRE | `^\(+\?86\)\?14\(40\|8[0-9]\)[0-9]\{9\}$` |
-->

| Flavor | Regular expression |
| --- | --- |
| GNU ERE | `^(+?86)?14(40\|8[0-9])[0-9]{9}$` |
| GNU BRE | `^\(+\?86\)\?14\(40\\|8[0-9]\)[0-9]\{9\}$` |

#### Match China Unicom

| Flavor | Regular expression |
| --- | --- |
| GNU ERE | `^(+?86)?146[0-9]{10}$` |
| GNU BRE | `^\(+\?86\)\?146[0-9]\{10\}$` |

#### Match China Telecom

| Flavor | Regular expression |
| --- | --- |
| GNU ERE | `^(+?86)?1410[0-9]{9}$` |
| GNU BRE | `^\(+\?86\)\?1410[0-9]\{9\}$` |

### Data-only number

#### Match all

| Flavor | Regular expression |
| --- | --- |
| GNU ERE | `^(+?86)?14[579][0-9]{8}$` |
| GNU BRE | `^\(+\?86\)\?14[579][0-9]\{8\}$` |
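Because the BRE form works in `sed` substitutions, it can also normalize numbers. The sketch below keeps only data-only numbers and drops an optional `+86` or `86` prefix. It is a variation on the table's expression, not a copy: the national number is wrapped in a second group so that `\2` can refer to it. The helper name `strip_prefix` is hypothetical, and GNU sed is assumed.

```shell
#!/bin/sh
# Hypothetical helper: keep data-only numbers and drop an optional +86 / 86 prefix.
# The pattern is the GNU BRE from the table with the national number wrapped in
# a second group, so that \2 refers to the number without the prefix.
strip_prefix() {
  sed -n 's/^\(+\?86\)\?\(14[579][0-9]\{8\}\)$/\2/p'
}

printf '%s\n' +8614712345678 8614512345678 14912345678 14612345678 |
  strip_prefix
```

The first three samples are printed as plain 11-digit numbers. `14612345678` produces no output because `146` is outside `14[579]`.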

#### Match China Mobile

| Flavor | Regular expression |
| --- | --- |
| GNU ERE | `^(+?86)?147[0-9]{8}$` |
| GNU BRE | `^\(+\?86\)\?147[0-9]\{8\}$` |

#### Match China Unicom

| Flavor | Regular expression |
| --- | --- |
| GNU ERE | `^(+?86)?145[0-9]{8}$` |
| GNU BRE | `^\(+\?86\)\?145[0-9]\{8\}$` |

#### Match China Telecom

| Flavor | Regular expression |
| --- | --- |
| GNU ERE | `^(+?86)?149[0-9]{8}$` |
| GNU BRE | `^\(+\?86\)\?149[0-9]\{8\}$` |


## Changelog

#### 2019.01.12
- Released the first version compatible with GNU BRE and GNU ERE.


[PCRE]: https://en.wikipedia.org/wiki/Perl_Compatible_Regular_Expressions

[POSIX]: https://en.wikipedia.org/wiki/Regular_expression#Standards

[维基百科 - Regular expression]: https://en.wikipedia.org/wiki/Regular_expression
