Don't parse a character property containing a backslash #301
Add backslash to the list of characters we don't consider valid for a character property name (previous rules implemented in #269). This means that if we see a `\` while attempting to lex a POSIX character property, we'll bail and lex a custom character class instead. This allows e.g. `[:\Q :] \E]` to be lexed as a custom character class. For `\p{...}` this just means we'll emit a truncated invalid property error, which is arguably more in line with what the user was expecting.

I noticed this when digging through the ICU source code. It will bail out of parsing a POSIX character property if it encounters one of its known escape sequences (e.g. `\a`, `\e`, `\f`, ...). Interestingly, this doesn't cover character property escapes, e.g. `\d`, but it's not clear that this is intentional. Given that backslash is not a valid character property character anyway, it seems reasonable to broaden this behavior to bail on any backslash.
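
To make the bail-out rule concrete, here is a rough sketch in Python rather than the project's own code. The function name `try_lex_posix_property` and the constant `INVALID_PROPERTY_CHARS` are hypothetical, and only the backslash entry comes from this change; the fuller rule set from #269 is not reproduced here.

```python
# Hypothetical set of characters that cannot appear in a property name.
# Only the backslash is taken from this PR; the rules from #269 are omitted.
INVALID_PROPERTY_CHARS = {"\\"}


def try_lex_posix_property(source, start=0):
    """Try to lex a POSIX character property `[:name:]` at `start`.

    Returns (name, end_index) on success. Returns None to signal a bail,
    in which case the caller would lex a custom character class from the
    same position instead.
    """
    if not source.startswith("[:", start):
        return None
    i = start + 2
    while i < len(source):
        if source.startswith(":]", i):
            # Found the closing delimiter: everything in between is the name.
            return (source[start + 2:i], i + 2)
        if source[i] in INVALID_PROPERTY_CHARS:
            # A backslash can never be part of a property name, so bail.
            return None
        i += 1
    # Ran off the end without finding `:]`.
    return None
```

With this sketch, `try_lex_posix_property("[:alpha:]")` yields the name `alpha`, while `try_lex_posix_property(r"[:\Q :] \E]")` returns `None` as soon as it reaches the backslash, leaving the input to be lexed as a custom character class.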