Update regex literal lexing and emission #40595
@rintaro Mind taking a look at the lexer/parser changes?
LGTM
1b6fdb4 to 8e6be65
lib/Parse/Lexer.cpp
if (ErrStr) {
  diagnose(TokStart, diag::regex_literal_parsing_error, ErrStr);
  formToken(tok::unknown, TokStart);
There might be some cases where we emit an error but still want to form a regex literal token.
String literal lexing has a FIXME: even if the end delimiter is missing, we should form a string literal and type check it, so that we can get additional semantic diagnostics and, more importantly, code completion even inside malformed string interpolations. For example, consider "Name: \(user.#HERE# ". In this case the end delimiter is missing because it is consumed as part of the string interpolation, since the ) is missing. But we still want code completion for user.
How about changing regexLiteralLexingFn like:
const char *Ptr = TokStart;
const char *ErrStr = nullptr;
// Success: status = SUCCESS, advances `Ptr`, ErrStr == nullptr.
// Error, but parsable: status = SUCCESS, advances `Ptr` and populates `ErrStr`.
// Error, non parsable: status = FAILED, advances `Ptr` and populates `ErrStr`.
// Total failure: status = FAILED, Ptr == TokStart, ErrStr == nullptr.
bool status = regexLiteralLexingFn(&Ptr, BufferEnd, &ErrStr);
if (ErrStr) {
  diagnose(TokStart, diag::regex_literal_parsing_error, ErrStr);
}
if (Ptr == TokStart) {
  return false;
}
assert(Ptr > TokStart);
// A successful status (including "error, but parsable") forms a regex literal.
formToken(status ? tok::regex_literal : tok::unknown, TokStart);
return true;
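To make the four outcomes in the suggestion concrete, here is a minimal standalone sketch. Everything in it is hypothetical and only for illustration: `TokKind`, `LexResult`, `lexRegex`, and `toyLex` are not the compiler's or the regex library's real API, and the toy function only recognizes the `'/.../'` delimiter form on a single line. It assumes a `true` status means "parsable".

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-ins for the compiler's token kinds.
enum class TokKind { Unknown, RegexLiteral };

// Hypothetical summary of one lexing attempt.
struct LexResult {
  bool Formed;        // false only on total failure (nothing consumed)
  TokKind Kind;       // kind of the token that was formed, if any
  std::size_t Length; // number of bytes consumed
  bool Diagnosed;     // true if an error string was reported
};

// Hypothetical callback shape mirroring the one proposed above.
using RegexLexFn = bool (*)(const char **Ptr, const char *BufferEnd,
                            const char **ErrStr);

// Toy lexing function (an assumption, not the library's implementation).
bool toyLex(const char **Ptr, const char *End, const char **ErrStr) {
  const char *P = *Ptr;
  if (End - P < 2 || P[0] != '\'' || P[1] != '/')
    return false; // Total failure: nothing consumed, no error.
  P += 2;
  if (P == End || *P == '\n') {
    *ErrStr = "expected regex body";
    *Ptr = P;
    return false; // Error, non parsable: consumed input, error populated.
  }
  for (; P != End && *P != '\n'; ++P) {
    if (P + 1 != End && P[0] == '/' && P[1] == '\'') {
      *Ptr = P + 2;
      return true; // Success: consumed through the closing delimiter.
    }
  }
  *ErrStr = "unterminated regex literal";
  *Ptr = P;
  return true; // Error, but parsable: still usable as a literal.
}

// Caller logic following the suggestion: report any error, bail out if
// nothing was consumed, otherwise form a token whose kind depends on status.
LexResult lexRegex(const char *TokStart, const char *BufferEnd, RegexLexFn Fn) {
  const char *Ptr = TokStart;
  const char *ErrStr = nullptr;
  bool Status = Fn(&Ptr, BufferEnd, &ErrStr);
  bool Diagnosed = ErrStr != nullptr;
  if (Ptr == TokStart)
    return {false, TokKind::Unknown, 0, Diagnosed};
  assert(Ptr > TokStart);
  return {true, Status ? TokKind::RegexLiteral : TokKind::Unknown,
          static_cast<std::size_t>(Ptr - TokStart), Diagnosed};
}
```

With this shape, an unterminated literal still yields a `RegexLiteral` token alongside a diagnostic, which is the property that would keep later phases such as type checking and code completion working on malformed input.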
@rintaro How does this look?
I updated my PR to fix some build script issues, so you might need to rebase this. Hopefully both PRs can be merged today.
…d regex.
- Checkout apple/swift-experimental-string-processing using a tag.
- Build `_MatchingEngine` as part of libswift (`ExperimentalRegex`) using sources from the package.
- Parse regex literals using the parser from `_MatchingEngine`.
- Build both `_MatchingEngine` and `_StringProcessing` as part of core libs using sources from the package.
- Use `Regex<DynamicCaptures>` as the default regex type until we finalize swiftlang/swift-experimental-string-processing#68.
@swift-ci please test macOS platform
Update the lexing implementation to defer to the regex library, which will pass back the pointer from which to resume lexing, and update the emission to call the new `Regex(_regexString:version:)` overload, which will accept the regex string with delimiters. Because this uses the library's lexing implementation, the delimiters are now `'/.../'` and `'|...|'` instead of plain `'...'`.
8e6be65 to 128f5d4
@swift-ci please test
Build failed
@swift-ci please test macOS platform
Lexer/Parser changes LGTM. Thank you!
…les" This reverts commit a67a043, reversing changes made to 9965df7. This commit or the earlier commit this commit is based on (swiftlang#40531) broke the incremental bot.
Looks like this or the commit this is based on broke the incremental bot.
…ing_and_emission Revert "Merge pull request #40595 from hamishknight/straw-bales"
Based on top of #40531, only the last commit is relevant.
Update the regex lexing implementation to defer to the regex library, which will pass back the pointer from which to resume lexing, and update the emission to call the new Regex(_regexString:version:) overload, that will accept the regex string with delimiters.
Because this uses the library's lexing implementation, the delimiters are now '/.../' and '|...|' instead of plain '...'.