-
Notifications
You must be signed in to change notification settings - Fork 49
Parse matching options #91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
case asciiOnlyPOSIXProps // P | ||
case asciiOnlySpace // S | ||
case asciiOnlyWord // W | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@natecook1000 Should we include the X
, u
& b
options from #64 here?
Avoid crashing on '(?' by emitting a new diagnostic.
Parse both explicitly scoped `(?i:...)`, `(?^i:...)`, `(?i-m:...)` and implicitly scoped `(?i)`, `(?^i)`, `(?i-m)` matching option specifiers, with support for, PCRE, ICU, and Oniguruma options.
009bb36
to
12275bb
Compare
@swift-ci please test Linux |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just some minor questions.
// If hasImplicitScope is true, it was written as e.g (?i), and implicitly | ||
// forms a group containing all the following elements of the current | ||
// group. | ||
case changeMatchingOptions(MatchingOptionSequence, hasImplicitScope: Bool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just isolated: Bool
, or is there something important about the scope concept?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently it's used to check if we need to eat a closing )
for a group, so the implicitness of the scope is relevant there, but I'm also fine with renaming it to isolated
as we'd still have the hasImplicitScope
property. I think at the time I just couldn't think of a good descriptive name for the syntax :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, so it’s purely a syntactic detail. Yeah, I’d prefer “isolated” as that’s what it is from a syntactic perspective.
return false | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be an example use of this information?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's used to determine whether we need to eat the closing )
for the group when parsing https://github.com/apple/swift-experimental-string-processing/blob/32b66851651561fb95b6de3dd784c18ba50f27d1/Sources/_MatchingEngine/Regex/Parse/Parse.swift#L171-L178
return opt | ||
|
||
default: | ||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a known terminator (:
or )
), so it seems like we'd want to gather options until we hit the terminator (and error if we encounter anything not an option) and error if we reach end-of-input as well. I think that could also simplify some of this logic, as it's fine to consume a non-terminating character up front.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm it's tricky because there's also other valid group syntax that starts with (?
(even containing letter characters for e.g (?P<
, unless we guaranteed we would have already tried lexing them before trying to lex matching options? Or otherwise make sure to exclude the other valid group syntax here?
I think it's best to merge this and follow up after on details. |
Parse both explicitly scoped
(?i:...)
,(?^i:...)
,(?i-m:...)
and implicitly scoped(?i)
,(?^i)
,(?i-m)
matching option specifiers, with support for, PCRE, ICU, and Oniguruma options.