feat: (Series|DataFrame).explode #556

chelsea-lin · 2024-04-01T17:55:08Z

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

- Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
- Ensure the tests and linter pass
- Code coverage does not decrease (if any source code was changed)
- Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> 🦕

bigframes/core/__init__.py

bigframes/core/blocks.py

bigframes/core/compile/compiled.py

TrevorBergeron · 2024-04-03T00:11:53Z

bigframes/core/compile/compiled.py

+        zip_array = (
+            table_w_offset[offset_array_id]
+            .zip(*[table_w_offset[column_id] for column_id in column_ids])
+            .name(zip_array_id)
+        )


Have you tried directly indexing using the offsets rather than zipping? I'm worried compiling this creates an extra correlated JOIN.

Thanks! The new logic performs better!
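
For readers following this thread, the two strategies discussed above can be sketched in plain Python. This only illustrates the idea and is not the Ibis expression code in `compiled.py`. The names `explode_by_zip` and `explode_by_offset` are hypothetical, and the sketch assumes that within each row every exploded column holds arrays of the same length, with no empty or NULL arrays.

```python
def explode_by_zip(rows, column_ids):
    """Zip the arrays of each row together, then emit one row per zipped tuple."""
    out = []
    for row in rows:
        for values in zip(*(row[c] for c in column_ids)):
            new_row = dict(row)
            new_row.update(zip(column_ids, values))
            out.append(new_row)
    return out


def explode_by_offset(rows, column_ids):
    """Generate the offsets 0..n-1 once, then index every array by offset."""
    out = []
    for row in rows:
        # Assumption: all exploded arrays in a row share the same length.
        length = len(row[column_ids[0]])
        for offset in range(length):
            new_row = dict(row)
            for c in column_ids:
                new_row[c] = row[c][offset]
            out.append(new_row)
    return out


rows = [
    {"id": 1, "a": [1, 2], "b": ["x", "y"]},
    {"id": 2, "a": [3], "b": ["z"]},
]
# Both strategies yield the same exploded rows; they differ only in how
# the per-element values are looked up.
assert explode_by_zip(rows, ["a", "b"]) == explode_by_offset(rows, ["a", "b"])
```

The offset version never builds an intermediate zipped structure, which is the property the review comment is after. Whether that removes a correlated JOIN in the generated SQL depends on the compiler and is not shown here.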

TrevorBergeron · 2024-04-03T00:14:46Z

tests/system/small/test_series.py

+        ),
+    ],
+)
+def test_series_explode(data):


I'm not sure if any of these tests use the unordered path. Maybe use to_pandas(ordered=False) for a test or two? Also, for unordered test cases, make sure not to ignore the index, as resetting the index will invoke the ordered path.

It looks like dtype validation will trigger the unordered path. I also added the aggregation and to_pandas(ordered=False) tests as you suggested. Thanks!
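
One way to check an unordered result without dropping the index is to compare the (index label, value) pairs as multisets. The sketch below is an assumption about how such a check could look, not the helper used in `test_series.py`. The name `same_rows_any_order` is hypothetical, and values are assumed to be hashable and non-NaN.

```python
from collections import Counter


def same_rows_any_order(actual, expected):
    """Compare two iterables of (index_label, value) pairs, ignoring row order.

    Index labels stay part of each pair, so a result whose index was reset
    or otherwise changed does not compare equal. Counter is used because an
    exploded result repeats index labels, so duplicates must be counted.
    """
    return Counter(actual) == Counter(expected)


# Exploding [[1, 2], [3]] repeats index label 0 for the first two elements.
expected = [(0, 1), (0, 2), (1, 3)]
shuffled = [(1, 3), (0, 2), (0, 1)]   # same rows, different order
reindexed = [(0, 1), (1, 2), (2, 3)]  # same values, index was reset

assert same_rows_any_order(shuffled, expected)
assert not same_rows_any_order(reindexed, expected)
```

With pandas objects, the pairs can be taken from `Series.items()`, for example `same_rows_any_order(actual.items(), expected.items())`.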

TrevorBergeron

LGTM

* feat: (Series|DataFrame).explode
* fixing schema and adding tests
* fixing multi-index tests
* add docs and fix tests

chelsea-lin requested a review from TrevorBergeron April 1, 2024 17:55

product-auto-label bot added size: m Pull request size is medium. api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. labels Apr 1, 2024

chelsea-lin force-pushed the main_chelsealin_explode branch from 475eeea to 9a56bb5 Compare April 2, 2024 04:38

product-auto-label bot added size: l Pull request size is large. and removed size: m Pull request size is medium. labels Apr 2, 2024

chelsea-lin marked this pull request as ready for review April 2, 2024 05:18

chelsea-lin requested review from a team as code owners April 2, 2024 05:18

TrevorBergeron reviewed Apr 3, 2024

chelsea-lin added 7 commits April 3, 2024 21:02

feat: (Series|DataFrame).explode

6db8ef3

fixing schema and adding tests

6dba89f

fixing multi-index tests

208f343

add docs and fix tests

38758a2

fix mypy

2743ca7

fixing tests

e0a90eb

address comments

9aef5a3

chelsea-lin force-pushed the main_chelsealin_explode branch from b648bdb to 9aef5a3 Compare April 3, 2024 21:02

chelsea-lin requested a review from TrevorBergeron April 3, 2024 22:19

fixing tests

ca8ea66

TrevorBergeron approved these changes Apr 4, 2024

chelsea-lin merged commit 9e32f57 into main Apr 4, 2024
16 checks passed

chelsea-lin deleted the main_chelsealin_explode branch April 4, 2024 16:42

release-please bot mentioned this pull request Apr 4, 2024

chore(main): release 1.1.0 #509

Merged

Genesis929 pushed a commit that referenced this pull request Apr 9, 2024

feat: (Series|DataFrame).explode (#556)

cc1d4e4

* feat: (Series|DataFrame).explode
* fixing schema and adding tests
* fixing multi-index tests
* add docs and fix tests
