JSONSerialization: Improve parsing of numbers #1657
Merged
Should we try to preserve the UTF-8 behavior here of walking a pointer along, rather than appending a single character at a time here? (Given that the vast majority of JSON provided is given to us in UTF-8, it'd be nice to maintain the performance there.)
I did think about this, but on x86_64/ARM64 strings of up to 15 ASCII characters should fit into the small-string optimization (SSO) and so avoid a memory allocation, which probably covers most numbers to be parsed.
The other issue is that strings passed to `Int64()`, `UInt64()` and `Double()` can't have any invalid trailing characters, so this avoids creating a string of all the available characters using `String(bytesNoCopy:)`, which I believe still gets validated according to the encoding. That could end up reading through the whole of the rest of the JSON document and then scanning through it to determine the new, shorter count.
As a performance enhancement, when validating the characters it's possible to count the digits and look for `.eE`, and then jump directly to parsing as a `Double()`.
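
To make the idea above concrete, here is a minimal sketch, not the code from this pull request. `ParsedNumber` and `parseNumber(in:from:)` are hypothetical names, and the character check is deliberately looser than the JSON grammar. It builds a short string one ASCII character at a time, notes whether `.`, `e` or `E` was seen, and skips the integer attempt when it was.

```swift
// Hypothetical helper types and names, for illustration only.
enum ParsedNumber: Equatable {
    case integer(Int64)
    case double(Double)
}

/// Scans the number that starts at `start` and returns the parsed value
/// together with the index of the first byte after it.
func parseNumber(in bytes: [UInt8], from start: Int) -> (value: ParsedNumber, end: Int)? {
    var index = start
    var looksLikeDouble = false
    var text = ""  // number literals are usually short ASCII strings

    scan: while index < bytes.count {
        let byte = bytes[index]
        switch byte {
        case UInt8(ascii: "0")...UInt8(ascii: "9"), UInt8(ascii: "-"), UInt8(ascii: "+"):
            break  // leaves the switch only; the byte is appended below
        case UInt8(ascii: "."), UInt8(ascii: "e"), UInt8(ascii: "E"):
            looksLikeDouble = true
        default:
            break scan  // first byte that cannot belong to a number
        }
        text.append(Character(Unicode.Scalar(byte)))
        index += 1
    }

    guard index > start else { return nil }

    // Only try the integer initializer when no '.', 'e' or 'E' was seen.
    if !looksLikeDouble, let value = Int64(text) {
        return (.integer(value), index)
    }
    // Also reached when an integer literal overflows Int64.
    if let value = Double(text) {
        return (.double(value), index)
    }
    return nil
}

let document = Array("[42, 1.5e3]".utf8)
let first = parseNumber(in: document, from: 1)   // value .integer(42), end 3
let second = parseNumber(in: document, from: 5)  // value .double(1500.0), end 10
```

Stopping the scan at the first non-number byte matters because `Int64("42,")` returns `nil`: the standard initializers reject any trailing characters.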
I was thinking more along the lines of `String.init(bytesNoCopy:length:encoding:freeWhenDone:)`, which would allow us to avoid reading to the end of the document and wouldn't necessitate doing any further validation. Might be worth doing some small perf tests, just to see. (Or is this not available in s-cl-f?)
As for looking for `[.eE]`, we do just this on Darwin: as soon as we encounter one of those characters we avoid parsing as an integer unnecessarily, which does save some time in common situations.
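
The bounded-length idea can be sketched with the standard library alone. This is an illustration under assumptions, not the initializer discussed above: `numberText(in:offset:length:)` is a hypothetical helper, and `String(decoding:as:)` copies the bytes rather than borrowing them. What it shows is that passing an explicit length keeps the decoder from touching anything past the number.

```swift
/// Hypothetical helper: decodes exactly `length` bytes starting at `offset`.
/// Nothing beyond that range is read or validated.
func numberText(in document: [UInt8], offset: Int, length: Int) -> String {
    return document.withUnsafeBufferPointer { buffer in
        // Rebase the slice so the decoder sees only the number's bytes.
        let slice = UnsafeBufferPointer(rebasing: buffer[offset..<(offset + length)])
        return String(decoding: slice, as: UTF8.self)
    }
}

let json = Array("[12345, 7]".utf8)
let text = numberText(in: json, offset: 1, length: 5)
let value = Int64(text)
```

The caller still has to find the end of the number first, so this pairs with a scan over the bytes rather than replacing it.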
Unfortunately from https://github.com/apple/swift-corelibs-foundation/blob/a2b40951e8365da696d5105fd57a19c1f1c220ef/Foundation/NSString.swift#L1237:
So I don't think it's that useful at the moment. I will look into bypassing the integer parsing where possible.