Conversation
|
It seems this is not improving things in all cases, so maybe there is more work to do on this one. |
BridgeAR left a comment
The upper bound is there for partitioning the reads. For further information check #25741 and #17054.
We should not remove that.
Please also check the comment right above the upper bound: https://github.com/nodejs/node/pull/44276/files#diff-5cd422a9a64bc2c1275d70ba5dcafbc5ea55274432fffc5e4159126007dc4894L137-L138
|
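For context, here is a minimal sketch of the partitioned-read pattern that the upper bound enables (simplified, not Node's actual internals; `kChunkSize` and `readFileChunked` are illustrative names). Each read is capped, so a single large file cannot monopolize a libuv threadpool worker for its whole duration, which is the blocking problem the linked issues describe:

```js
'use strict';
// Sketch only: read a file in bounded chunks so no single read() call
// occupies a threadpool worker for the entire file. The 512 KiB bound
// mirrors the upper bound discussed above.
const fs = require('fs');

const kChunkSize = 512 * 1024; // upper bound per read

function readFileChunked(path, callback) {
  fs.open(path, 'r', (err, fd) => {
    if (err) return callback(err);
    const chunks = [];
    const buffer = Buffer.allocUnsafe(kChunkSize);

    const readNext = () => {
      fs.read(fd, buffer, 0, kChunkSize, null, (err, bytesRead) => {
        if (err) return fs.close(fd, () => callback(err));
        if (bytesRead === 0) {
          // EOF: hand the accumulated data back in one piece.
          return fs.close(fd, () => callback(null, Buffer.concat(chunks)));
        }
        chunks.push(Buffer.from(buffer.subarray(0, bytesRead)));
        readNext(); // each iteration is another JS <-> C++ round trip
      });
    };
    readNext();
  });
}

readFileChunked(__filename, (err, data) => {
  if (err) throw err;
  console.log(`read ${data.length} bytes`);
});
```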
Thanks @BridgeAR for the links. They were really helpful for understanding the decisions that led to this discussion. So, removing the chunking blocks the I/O, and keeping it costs execution speed. Since we don't want to block the I/O, but we also have to reduce the number of calls between JS and C++, there are not many options here. If we keep the chunks at 512 KiB, the only optimization we can do at this point is to move the open, fstat, and close functionality inside the fs/read C++ implementation to reduce the communication. Even after that, the number of calls between JS and C++ will be directly proportional to file_size / 512 KiB. I'll be happy to dive into the right approach here if anyone is kind enough to help, but afaik if we want to improve performance, we need to pick our battles. Here's my review of the current flow for fs.readFile: https://github.com/nodejs/node/discussions/44239 |
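Rough model of the call counts behind the "directly proportional" point (illustrative arithmetic, not measured numbers; the "folded" case is hypothetical, not an existing API):

```js
'use strict';
// With a fixed 512 KiB chunk, the per-chunk reads dominate for large files,
// so folding open/fstat/close into the binding only shaves a constant.
const kChunkSize = 512 * 1024;

function jsToCppCalls(fileSize) {
  const reads = Math.max(1, Math.ceil(fileSize / kChunkSize));
  return {
    current: 3 + reads, // open + fstat + close + one call per chunk
    folded: reads,      // hypothetical: open/fstat/close handled inside the binding
  };
}

for (const size of [4 * 1024, 10 * 1024 * 1024, 1024 * 1024 * 1024]) {
  console.log(`${size} bytes:`, jsToCppCalls(size));
}
```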
|
I think the use case to optimize is reading
I think you can move the accumulation of bytes in C++ and avoid calling JS at all. |
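A userland sketch of the accumulate-in-one-place idea (the actual change would live in the C++ binding; `readFilePreallocated` is an illustrative name, not a proposed API): the destination buffer is sized once from fstat and chunks are read directly into it, so the caller only ever sees the finished buffer. Doing the same loop inside the binding would also remove the per-chunk callbacks on the JS side.

```js
'use strict';
// Sketch: size the buffer once, read chunks straight into it at increasing
// offsets, and never surface intermediate chunks to the caller.
const fs = require('fs');
const kChunkSize = 512 * 1024;

async function readFilePreallocated(path) {
  const handle = await fs.promises.open(path, 'r');
  try {
    const { size } = await handle.stat();
    const data = Buffer.allocUnsafe(size);
    let offset = 0;
    while (offset < size) {
      const length = Math.min(kChunkSize, size - offset);
      const { bytesRead } = await handle.read(data, offset, length, offset);
      if (bytesRead === 0) break; // file shrank while reading
      offset += bytesRead;
    }
    return data.subarray(0, offset);
  } finally {
    await handle.close();
  }
}

readFilePreallocated(__filename).then((data) => {
  console.log(`read ${data.length} bytes`);
});
```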
I'm still investigating why the performance is down by 20% in certain cases, but a 68% improvement looks like a good start.
I left the chunk-by-chunk implementation in place to support edge cases where the file size is unknown before reading, i.e. where fstat reports a size of 0.
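A minimal sketch of that edge case (using a Linux procfs file as the assumed example): fstat reports a size of 0, so the only safe strategy is to keep reading fixed-size chunks until a read returns 0 bytes.

```js
'use strict';
// Sketch: some files (procfs entries, FIFOs, character devices) report
// st_size === 0, so the size cannot be known before reading.
const fs = require('fs');
const kChunkSize = 512 * 1024;

function readUnknownSize(path) {
  const fd = fs.openSync(path, 'r');
  try {
    const { size } = fs.fstatSync(fd);
    console.log(`fstat size: ${size}`); // 0 for e.g. /proc/self/status on Linux
    const chunks = [];
    const chunk = Buffer.allocUnsafe(kChunkSize);
    let bytesRead;
    while ((bytesRead = fs.readSync(fd, chunk, 0, kChunkSize, null)) > 0) {
      chunks.push(Buffer.from(chunk.subarray(0, bytesRead)));
    }
    return Buffer.concat(chunks);
  } finally {
    fs.closeSync(fd);
  }
}

const target = process.platform === 'linux' ? '/proc/self/status' : __filename;
console.log(readUnknownSize(target).length, 'bytes read');
```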