Commit 33b700b

bpo-37723: Fix performance regression on regular expression parsing. (GH-15030)
Improve performance of sre_parse._uniq function.

(cherry picked from commit 9f55551)

Co-authored-by: yannvgn <hi@yannvgn.io>
1 parent 3533061 commit 33b700b

3 files changed, +4 -7 lines changed

Lib/sre_parse.py

+1 -7
@@ -406,13 +406,7 @@ def _escape(source, escape, state):
     raise source.error("bad escape %s" % escape, len(escape))
 
 def _uniq(items):
-    if len(set(items)) == len(items):
-        return items
-    newitems = []
-    for item in items:
-        if item not in newitems:
-            newitems.append(item)
-    return newitems
+    return list(dict.fromkeys(items))
 
 def _parse_sub(source, state, verbose, nested):
     # parse an alternation: a|b|c

Misc/ACKS

+1
@@ -1672,6 +1672,7 @@ Michael Urman
 Hector Urtubia
 Lukas Vacek
 Ville Vainio
+Yann Vaginay
 Andi Vajda
 Case Van Horsen
 John Mark Vandenberg

@@ -0,0 +1,2 @@
+Fix performance regression on regular expression parsing with huge
+character sets. Patch by Yann Vaginay.
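
Review note: the following is a minimal sketch, not part of the commit, that places the removed and added bodies of `_uniq` side by side to illustrate why the one-line version is expected to be faster on large inputs. The names `uniq_old` and `uniq_new` and the sample data are hypothetical and used only for illustration. The sketch assumes Python 3.7 or later, where `dict` is guaranteed to preserve insertion order.

```python
def uniq_old(items):
    """Logic from the removed lines of the diff."""
    if len(set(items)) == len(items):
        return items  # no duplicates: the input list itself is returned
    newitems = []
    for item in items:
        # 'in' on a list is a linear scan, so this loop is quadratic
        # in the number of distinct items when duplicates are present.
        if item not in newitems:
            newitems.append(item)
    return newitems


def uniq_new(items):
    """Logic from the added line of the diff."""
    # Dict keys are unique and keep first-seen order, and each
    # insertion is a hash lookup rather than a list scan.
    return list(dict.fromkeys(items))


# Hypothetical sample: hashable entries with repeats, loosely shaped
# like parsed character-set members.
sample = [
    ("LITERAL", 97),
    ("LITERAL", 98),
    ("LITERAL", 97),
    ("RANGE", (48, 57)),
    ("LITERAL", 98),
]

# Both versions keep the first occurrence of each item, in order.
assert uniq_old(sample) == uniq_new(sample)
```

One behavioural difference visible from the diff alone: when the input has no duplicates, the old version returns the same list object, while the new version always builds a new list. Both versions require the items to be hashable, since the old one also calls `set(items)`.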
