Iterator on JSON objects #9

Merged Mar 20, 2025 (3 commits)

Conversation

Neradoc
Contributor

@Neradoc Neradoc commented Mar 18, 2025

This adds 2 ways to iterate on an object's items following the APIs of dict().

  • The main iterator is for key in obj: it iterates over the keys without consuming or loading the values. The current key's value can then be accessed with obj[key].
  • The TransientObject.items() method iterates over (key, value) tuples, as in for key, value in obj.items(). Because it fetches every value (be it a basic type or a Transient instance), it uses more memory.

So this works, although note that you can only read the value for a key once, unless that value is itself a container.

for key in stream["some_dict"]:
    if key == "needle":
        print(stream["some_dict"][key])

The new example json_stream_local_file_advanced.py uses the iterator to find keys that match a list of expected keys and returns a python dict that contains only those items. It's an example of how to retrieve a set of keys when the order is unknown.
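The approach described above can be sketched roughly as follows. This is not the code from json_stream_local_file_advanced.py: the function name pick_keys and the sample data are hypothetical, and a plain dict stands in for the transient object so the snippet runs without the library. It relies only on the two operations this PR adds, for key in obj and obj[key] on the current key, and it assumes the wanted values are simple ones (a nested container would presumably need to be consumed before moving on to the next key).

```python
def pick_keys(obj, wanted):
    """Collect the values of the wanted keys in a single forward pass.

    obj is anything that supports `for key in obj` and `obj[key]` for the
    key currently being visited (hypothetical helper, not a library API).
    """
    remaining = set(wanted)
    found = {}
    for key in obj:
        if key in remaining:
            # Read the value right away, while this key is still current.
            found[key] = obj[key]
            remaining.discard(key)
            if not remaining:
                # Everything was found: no need to scan the rest.
                break
    return found


# A plain dict stands in for the streamed object (assumption).
sample = {"id": 7, "noise": "x" * 10, "name": "sensor", "unit": "V"}
result = pick_keys(sample, ["name", "id", "missing"])
print(result)
```

Keys that never appear in the object (here "missing") are simply absent from the result, so the caller can compare against the expected list if it needs to detect them.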

I am not sure how best to document the key iterator, apart from the items() method.

Neradoc added 3 commits March 17, 2025 22:01
…f the current key.

Use common active_key, fix finish(), etc.
Rename example and use the key iteration.
Additional tests.
@justmobilize
Contributor

@Neradoc will this also make it easy to implement:

for key, value in stream["some_dict"].items():

@Neradoc
Contributor Author

Neradoc commented Mar 19, 2025

items() is implemented using the new _next_key() method and self[key].
Since it retrieves every value, however, I advise using the key iteration form when looking for a specific key.
It's still better memory-wise than loading the whole dictionary, of course.
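To illustrate the trade-off, here is a toy single-pass object. It is only a sketch, not the library's implementation: the class ToyTransientObject, its loader callables and the loaded list are all hypothetical, and the names _next_key and _active_key merely echo the ones mentioned in this PR. Each value is produced by a loader that runs only when the value is actually read, which makes it visible that key iteration loads one value while items() loads all of them.

```python
class ToyTransientObject:
    """Toy single-pass object (illustration only, not the library's code)."""

    def __init__(self, pairs):
        # pairs yields (key, loader); calling loader() "parses" the value.
        self._pairs = iter(pairs)
        self._active_key = None
        self._active_loader = None

    def _next_key(self):
        # Advance to the next key without loading its value.
        key, loader = next(self._pairs)
        self._active_key, self._active_loader = key, loader
        return key

    def __iter__(self):
        return self

    def __next__(self):
        return self._next_key()

    def __getitem__(self, key):
        # Only the current key can be read, and only once.
        if key != self._active_key or self._active_loader is None:
            raise KeyError(key)
        loader, self._active_loader = self._active_loader, None
        return loader()

    def items(self):
        # Built on top of key iteration: every value gets loaded.
        while True:
            try:
                key = self._next_key()
            except StopIteration:
                return
            yield key, self[key]


loaded = []  # records which values were actually loaded


def make_obj():
    def loader_for(value):
        def load():
            loaded.append(value)
            return value
        return load

    data = [("a", 1), ("needle", 2), ("c", 3)]
    return ToyTransientObject((k, loader_for(v)) for k, v in data)


obj = make_obj()
found = [obj[key] for key in obj if key == "needle"]
loaded_by_key_search = list(loaded)  # only the needle's value was loaded

loaded.clear()
all_items = list(make_obj().items())  # loads every value
```

In this toy, searching by key touches a single value, whereas items() touches all three, which mirrors the advice above.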

@tannewt
Member

tannewt commented Mar 19, 2025

Does the original library do this? https://github.com/daggaz/json-stream (Do we care if it doesn't?)

@Neradoc
Contributor Author

Neradoc commented Mar 19, 2025

The json_stream library has iterators, but doesn't do obj[key] like in my example.
This works in both:

import io
import json_stream
source = io.BytesIO(b'{"a":1,"b":2,"c":3}')
stream = json_stream.load(source)

for key, value in stream.items():
    print(key, value)

So does:

for key in stream:
    print(key)

This only works in this version; with the json_stream library it raises an exception:

source = io.BytesIO(b'{"a":1,"b":2,"c":3}')
stream = json_stream.load(source)

for key in stream:
    print(key, stream[key])
json_stream.base.TransientAccessException: a not found in transient JSON stream or already passed in this stream

I would argue that being able to loop through keys and access the value only when needed is a good feature to have. Granted, in most cases the values will be numbers, small strings or Transient subclasses, but a value could also be a big string.


Another difference introduced in #7 is the ability to access stream["dict"] multiple times as long as we are still "in that dict". This fails with the json_stream library:

source = io.BytesIO(b'{"a":{"x":10,"y":20},"b":2,"c":3}')
stream = json_stream.load(source)
print(stream["a"]["x"])
print(stream["a"]["y"])
   stream["a"]["y"]
    ~~~~~~^^^^^
[...]
json_stream.base.TransientAccessException: a not found in transient JSON stream or already passed in this stream

That one is mostly a convenience, since we can always save stream["a"] in a variable rather than repeat the lookup.
It also only applies if the value (stream["a"]) is another Transient container (list or dict). That makes sense: we need to keep a (small) container object in memory (at this point in the stream) to be able to skip to the end of it, but we don't want to cache literal values since, again, they could be big strings.
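The variable-saving alternative mentioned above might look like the sketch below. The helper name read_xy is hypothetical, and a plain dict stands in for the stream so the snippet runs without either library; with a real transient stream, the point is that the container is looked up once and its keys are then read in document order.

```python
def read_xy(stream):
    """Read two values from the nested object stored under "a".

    Hypothetical helper: the container is looked up once and kept in a
    variable, instead of writing stream["a"] twice.
    """
    a = stream["a"]
    # Keys are read in the order they appear in the document.
    return a["x"], a["y"]


# A plain dict stands in for the streamed document (assumption).
doc = {"a": {"x": 10, "y": 20}, "b": 2, "c": 3}
point = read_xy(doc)
print(point)
```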

@tannewt tannewt self-requested a review March 20, 2025 16:53
Member

@tannewt tannewt left a comment

Cool. Thank you!

@tannewt tannewt merged commit f9f0242 into adafruit:main Mar 20, 2025
1 check passed
adafruit-adabot added a commit to adafruit/Adafruit_CircuitPython_Bundle that referenced this pull request Apr 3, 2025
Updating https://github.com/adafruit/Adafruit_CircuitPython_INA228 to 1.0.1 from 1.0.0:
  > Update adafruit_ina228.py

Updating https://github.com/adafruit/Adafruit_CircuitPython_SSD1305 to 1.4.0 from 1.3.21:
  > Merge pull request adafruit/Adafruit_CircuitPython_SSD1305#16 from mikeysklar/ssd1305-white-module-col-offset

Updating https://github.com/adafruit/Adafruit_CircuitPython_TLV320 to 1.0.0 from 51c14aa:
  < Update README.rst

Updating https://github.com/adafruit/Adafruit_CircuitPython_Bitmap_Font to 2.3.0 from 2.2.0:
  > Merge pull request adafruit/Adafruit_CircuitPython_Bitmap_Font#70 from tannewt/cmap03

Updating https://github.com/adafruit/Adafruit_CircuitPython_Display_Text to 3.2.4 from 3.2.3:
  > Merge pull request adafruit/Adafruit_CircuitPython_Display_Text#219 from FoamyGuy/use_ruff

Updating https://github.com/adafruit/Adafruit_CircuitPython_JSON_Stream to 0.9.0 from 0.8.6:
  > Merge pull request adafruit/Adafruit_CircuitPython_JSON_Stream#9 from Neradoc/iterator-on-objects
  > Merge pull request adafruit/Adafruit_CircuitPython_JSON_Stream#8 from Neradoc/fix-string-in-string

Updating https://github.com/adafruit/Adafruit_CircuitPython_USB_Host_Descriptors to 0.2.1 from 0.1.4:
  > Merge pull request adafruit/Adafruit_CircuitPython_USB_Host_Descriptors#4 from FoamyGuy/two_mice_example
  > Merge pull request adafruit/Adafruit_CircuitPython_USB_Host_Descriptors#3 from FoamyGuy/find_mouse_helper

Updating https://github.com/adafruit/Adafruit_CircuitPython_Bundle/circuitpython_library_list.md to NA from NA:
  > Added the following libraries: Adafruit_CircuitPython_TLV320