Conversation
|
It seems this is not improving things in all cases, so maybe there is more work to do on this one. |
BridgeAR left a comment
The upper bound is there for partitioning the reads. For further information check #25741 and #17054.
We should not remove that.
Please also check the comment right above the upper bound: https://github.com/nodejs/node/pull/44276/files#diff-5cd422a9a64bc2c1275d70ba5dcafbc5ea55274432fffc5e4159126007dc4894L137-L138
|
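For context, here is a minimal sketch of the partitioned-read pattern that the upper bound enables (simplified, not Node's actual internals; `kChunkSize` and `readFileChunked` are illustrative names). Each read is capped, so a single large file cannot monopolize a libuv threadpool worker for its whole duration, which is the blocking problem the linked issues describe:

```js
'use strict';
// Sketch only: read a file in bounded chunks so no single read() call
// occupies a threadpool worker for the entire file. The 512 KiB bound
// mirrors the upper bound discussed above.
const fs = require('fs');

const kChunkSize = 512 * 1024; // upper bound per read

function readFileChunked(path, callback) {
  fs.open(path, 'r', (err, fd) => {
    if (err) return callback(err);
    const chunks = [];
    const buffer = Buffer.allocUnsafe(kChunkSize);

    const readNext = () => {
      fs.read(fd, buffer, 0, kChunkSize, null, (err, bytesRead) => {
        if (err) return fs.close(fd, () => callback(err));
        if (bytesRead === 0) {
          // EOF: hand the accumulated data back in one piece.
          return fs.close(fd, () => callback(null, Buffer.concat(chunks)));
        }
        chunks.push(Buffer.from(buffer.subarray(0, bytesRead)));
        readNext(); // each iteration is another JS <-> C++ round trip
      });
    };
    readNext();
  });
}

readFileChunked(__filename, (err, data) => {
  if (err) throw err;
  console.log(`read ${data.length} bytes`);
});
```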
Thanks @BridgeAR for the links. They were really helpful for understanding the decisions that led to this discussion. So, removing the chunking blocks the I/O, and keeping it costs execution speed. Since we don't want to block the I/O, but we also have to reduce the number of calls between JS and C++, there are not many options here. If we keep the chunks at 512 KiB, the only optimization we can do at this point is to move the open, fstat, and close functionality inside the fs/read C++ implementation to reduce the communication. Even after that, the number of calls between JS and C++ will be directly proportional to file_size / 512 KiB. I'll be happy to dive into the right approach here if anyone is kind enough to help, but afaik if we want to improve performance, we need to pick our battles. Here's my review of the current flow for fs.readFile: https://github.com/nodejs/node/discussions/44239 |
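Rough model of the call counts behind the "directly proportional" point (illustrative arithmetic, not measured numbers; the "folded" case is hypothetical, not an existing API):

```js
'use strict';
// With a fixed 512 KiB chunk, the per-chunk reads dominate for large files,
// so folding open/fstat/close into the binding only shaves a constant.
const kChunkSize = 512 * 1024;

function jsToCppCalls(fileSize) {
  const reads = Math.max(1, Math.ceil(fileSize / kChunkSize));
  return {
    current: 3 + reads, // open + fstat + close + one call per chunk
    folded: reads,      // hypothetical: open/fstat/close handled inside the binding
  };
}

for (const size of [4 * 1024, 10 * 1024 * 1024, 1024 * 1024 * 1024]) {
  console.log(`${size} bytes:`, jsToCppCalls(size));
}
```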
|
I think the use case to optimize is reading
I think you can move the accumulation of bytes in C++ and avoid calling JS at all. |
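A userland sketch of the accumulate-in-one-place idea (the actual change would live in the C++ binding; `readFilePreallocated` is an illustrative name, not a proposed API): the destination buffer is sized once from fstat and chunks are read directly into it, so the caller only ever sees the finished buffer. Doing the same loop inside the binding would also remove the per-chunk callbacks on the JS side.

```js
'use strict';
// Sketch: size the buffer once, read chunks straight into it at increasing
// offsets, and never surface intermediate chunks to the caller.
const fs = require('fs');
const kChunkSize = 512 * 1024;

async function readFilePreallocated(path) {
  const handle = await fs.promises.open(path, 'r');
  try {
    const { size } = await handle.stat();
    const data = Buffer.allocUnsafe(size);
    let offset = 0;
    while (offset < size) {
      const length = Math.min(kChunkSize, size - offset);
      const { bytesRead } = await handle.read(data, offset, length, offset);
      if (bytesRead === 0) break; // file shrank while reading
      offset += bytesRead;
    }
    return data.subarray(0, offset);
  } finally {
    await handle.close();
  }
}

readFilePreallocated(__filename).then((data) => {
  console.log(`read ${data.length} bytes`);
});
```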
I'm still investigating why the performance is down by 20% in certain cases, but a 68% improvement looks like a good start.
I left the chunk-by-chunk implementation in place to support edge cases where the file size is unknown before reading, i.e. where fstat reports a size of 0.
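A minimal sketch of that edge case (using a Linux procfs file as the assumed example): fstat reports a size of 0, so the only safe strategy is to keep reading fixed-size chunks until a read returns 0 bytes.

```js
'use strict';
// Sketch: some files (procfs entries, FIFOs, character devices) report
// st_size === 0, so the size cannot be known before reading.
const fs = require('fs');
const kChunkSize = 512 * 1024;

function readUnknownSize(path) {
  const fd = fs.openSync(path, 'r');
  try {
    const { size } = fs.fstatSync(fd);
    console.log(`fstat size: ${size}`); // 0 for e.g. /proc/self/status on Linux
    const chunks = [];
    const chunk = Buffer.allocUnsafe(kChunkSize);
    let bytesRead;
    while ((bytesRead = fs.readSync(fd, chunk, 0, kChunkSize, null)) > 0) {
      chunks.push(Buffer.from(chunk.subarray(0, bytesRead)));
    }
    return Buffer.concat(chunks);
  } finally {
    fs.closeSync(fd);
  }
}

const target = process.platform === 'linux' ? '/proc/self/status' : __filename;
console.log(readUnknownSize(target).length, 'bytes read');
```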