src: general C++ cleanup in node_url.cc#19598
src: general C++ cleanup in node_url.cc#19598addaleax wants to merge 2 commits intonodejs:masterfrom
Conversation
- Merge `domain` and `opaque` storage in URL parser: This just simplifies the code a bit, having multiple fields in an union with the same type is usually just overhead. - Add move variant of `URLHost::ToString()`: This helps avoid unnecessary string copy operations, especially since we control the lifetime of `URLHost` objects pretty well. - Use const refs in node_url.cc where appropriate - Remove or reduce overly generous `.reserve()` calls: These would otherwise keep a lot of unused memory lying around. - Return return values instead of unnecessary pointer arguments - Use more common/expressive variable names - Avoid macro use, reduce number of unnecessary JS strings: There’s no reason for `GET`, `GET_AND_SET` and `UTF8STRING` to be macros. Also, `GET` would previously create a JS string instance for each single call, even though the strings it was called with were compile-time constants. - Avoid unnecessary JS casts when the type of a value is known - Avoid (commonly unnecessary) copy for whitespace stripping
| CHECK_EQ(type_, HostType::H_FAILED); | ||
| std::string output; | ||
| output.reserve(length * 3); | ||
| output.reserve(length); |
There was a problem hiding this comment.
@TimothyGu i’m not sure whether the spec definition says that this must be ASCII even before percent encoding, but either way, I would personally expect it to be ASCII most or all of the time?
There was a problem hiding this comment.
As long as the tests still pass. :)
src/node_url.cc
Outdated
| bool uflag = false; | ||
| bool atflag = false; // Set when @ has been seen. | ||
| bool square_bracket_flag = false; // Set inside of [...] | ||
| bool inside_password_flag = false; // Set after a : after an username. |
There was a problem hiding this comment.
As a data point, the URL Standard calls it passwordTokenSeenFlag, but seems like you figured out what it meant on yourself :)
There was a problem hiding this comment.
@TimothyGu Idk, do you want me to change it? I think both of these are okay names?
There was a problem hiding this comment.
I'd slightly prefer the name in the spec for better correspondence for future readers.
Yes, and a pretty good one :) But I wanted to wait for the benchmark CI before I started putting numbers on this… ;) |
|
Benchmark CI finished. The results are as follows (omitting some files with no or statistically insignificant results): That's some quite substantial speedups!
The only performance decreases look fluky. |
|
Landed in ae70e2b |
- Merge `domain` and `opaque` storage in URL parser: This just simplifies the code a bit, having multiple fields in an union with the same type is usually just overhead. - Add move variant of `URLHost::ToString()`: This helps avoid unnecessary string copy operations, especially since we control the lifetime of `URLHost` objects pretty well. - Use const refs in node_url.cc where appropriate - Remove or reduce overly generous `.reserve()` calls: These would otherwise keep a lot of unused memory lying around. - Return return values instead of unnecessary pointer arguments - Use more common/expressive variable names - Avoid macro use, reduce number of unnecessary JS strings: There’s no reason for `GET`, `GET_AND_SET` and `UTF8STRING` to be macros. Also, `GET` would previously create a JS string instance for each single call, even though the strings it was called with were compile-time constants. - Avoid unnecessary JS casts when the type of a value is known - Avoid (commonly unnecessary) copy for whitespace stripping PR-URL: #19598 Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Daniel Bevenius <daniel.bevenius@gmail.com>
- Merge `domain` and `opaque` storage in URL parser: This just simplifies the code a bit, having multiple fields in an union with the same type is usually just overhead. - Add move variant of `URLHost::ToString()`: This helps avoid unnecessary string copy operations, especially since we control the lifetime of `URLHost` objects pretty well. - Use const refs in node_url.cc where appropriate - Remove or reduce overly generous `.reserve()` calls: These would otherwise keep a lot of unused memory lying around. - Return return values instead of unnecessary pointer arguments - Use more common/expressive variable names - Avoid macro use, reduce number of unnecessary JS strings: There’s no reason for `GET`, `GET_AND_SET` and `UTF8STRING` to be macros. Also, `GET` would previously create a JS string instance for each single call, even though the strings it was called with were compile-time constants. - Avoid unnecessary JS casts when the type of a value is known - Avoid (commonly unnecessary) copy for whitespace stripping PR-URL: #19598 Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Daniel Bevenius <daniel.bevenius@gmail.com>


Merge
domainandopaquestorage in URL parser:This just simplifies the code a bit, having multiple fields
in an union with the same type is usually just overhead.
Add move variant of
URLHost::ToString():This helps avoid unnecessary string copy operations, especially
since we control the lifetime of
URLHostobjects pretty well.Use const refs in node_url.cc where appropriate
Remove or reduce overly generous
.reserve()calls:These would otherwise keep a lot of unused memory lying around.
Return return values instead of unnecessary pointer arguments
Use more common/expressive variable names
Avoid macro use, reduce number of unnecessary JS strings:
There’s no reason for
GET,GET_AND_SETandUTF8STRINGto bemacros. Also,
GETwould previously create a JS string instancefor each single call, even though the strings it was called
with were compile-time constants.
Avoid unnecessary JS casts when the type of a value is known
Avoid (commonly unnecessary) copy for whitespace stripping
Checklist
make -j4 test(UNIX), orvcbuild test(Windows) passes