Enh arrow json extension #61103

Merged: 4 commits, Mar 12, 2025

Conversation

asharmalik19
Contributor

Comment on lines 2268 to 2271
elif isinstance(pa_type, pa.ExtensionType):
    return type(self)(pa_type.storage_type).type
elif isinstance(pa_type, pa.JsonType):
    return str
Member

Suggested change
- elif isinstance(pa_type, pa.ExtensionType):
-     return type(self)(pa_type.storage_type).type
- elif isinstance(pa_type, pa.JsonType):
-     return str
+ elif isinstance(pa_type, pa.BaseExtensionType):
+     return type(self)(pa_type.storage_type).type
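
The idea behind this suggestion can be sketched without pyarrow. The classes below are stand-ins, not the real pyarrow types: they assume that `pa.ExtensionType` and `pa.JsonType` both derive from `pa.BaseExtensionType` and that neither derives from the other. The storage-to-Python mapping is hypothetical as well. Under those assumptions, one check against the base class covers both cases and resolves the scalar type from the storage type.

```python
class BaseExtensionType:
    """Stand-in for pa.BaseExtensionType: wraps an underlying storage type."""

    def __init__(self, storage_type: str) -> None:
        self.storage_type = storage_type


class ExtensionType(BaseExtensionType):
    """Stand-in for pa.ExtensionType (user-defined extension types)."""


class JsonType(BaseExtensionType):
    """Stand-in for pa.JsonType (assumed not to subclass ExtensionType)."""


# Hypothetical mapping from a storage type name to a Python scalar type.
STORAGE_TO_PYTHON = {"string": str, "int64": int}


def scalar_type(pa_type: object) -> type:
    """Resolve the Python scalar type for any extension type via its storage."""
    # A single base-class check replaces the two separate branches.
    if isinstance(pa_type, BaseExtensionType):
        return STORAGE_TO_PYTHON[pa_type.storage_type]
    raise NotImplementedError(pa_type)


json_scalar = scalar_type(JsonType("string"))
custom_scalar = scalar_type(ExtensionType("int64"))
```

In this sketch a JSON type backed by string storage resolves to `str` through its storage type, so the hard-coded `return str` branch is no longer needed.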

@@ -65,6 +65,7 @@ Other enhancements
- :class:`Rolling` and :class:`Expanding` now support ``nunique`` (:issue:`26958`)
- :class:`Rolling` and :class:`Expanding` now support aggregations ``first`` and ``last`` (:issue:`33155`)
- :func:`read_parquet` accepts ``to_pandas_kwargs`` which are forwarded to :meth:`pyarrow.Table.to_pandas` which enables passing additional keywords to customize the conversion to pandas, such as ``maps_as_pydicts`` to read the Parquet map data type as python dictionaries (:issue:`56842`)
- :meth: ``ArrowDtype.type`` now supports the pyarrow json data type (:issue:`60958`)
Member

Suggested change
- - :meth: ``ArrowDtype.type`` now supports the pyarrow json data type (:issue:`60958`)
+ - :class:`ArrowDtype` now supports ``pyarrowJsonType`` (:issue:`60958`)

@@ -61,6 +61,7 @@ Other enhancements
- :meth:`Series.cummin` and :meth:`Series.cummax` now supports :class:`CategoricalDtype` (:issue:`52335`)
- :meth:`Series.plot` now correctly handle the ``ylabel`` parameter for pie charts, allowing for explicit control over the y-axis label (:issue:`58239`)
- :meth:`DataFrame.plot.scatter` argument ``c`` now accepts a column of strings, where rows with the same string are colored identically (:issue:`16827` and :issue:`16485`)
- :class:`ArrowDtype` now supports ``pyarrowJsonType`` (:issue:`60958`)
Member

Suggested change
- - :class:`ArrowDtype` now supports ``pyarrowJsonType`` (:issue:`60958`)
+ - :class:`ArrowDtype` now supports ``pyarrow.JsonType`` (:issue:`60958`)

@@ -3553,3 +3553,10 @@ def test_categorical_from_arrow_dictionary():
        dtype="int64",
    )
    tm.assert_series_equal(result, expected)


def test_arrow_json_type():
Member

You'll need to add @pytest.mark.skipif(pa_version_under19p0, reason=...), since pa.json_ is new as of PyArrow 19.
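
One way a flag like `pa_version_under19p0` could be derived is sketched below, using only the standard library. The helper names (`major_version`, `is_under`, `installed_pyarrow_version`, `pa_under_19`) are hypothetical and are not pandas' own implementation. The sketch treats a missing pyarrow install the same as an outdated one.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def major_version(version_string: str) -> int:
    """Return the leading integer of a version string such as '19.0.1'."""
    return int(version_string.split(".")[0])


def is_under(installed: Optional[str], minimum_major: int) -> bool:
    """True when the package is missing or older than `minimum_major`."""
    if installed is None:
        return True
    return major_version(installed) < minimum_major


def installed_pyarrow_version() -> Optional[str]:
    """Look up the installed pyarrow version without importing pyarrow."""
    try:
        return version("pyarrow")
    except PackageNotFoundError:
        return None


# Hypothetical stand-in for the flag named in the review comment.
pa_under_19 = is_under(installed_pyarrow_version(), 19)
```

A flag computed this way would be passed as the first argument to `pytest.mark.skipif`, with a `reason` string stating the pyarrow requirement, so the new test is skipped rather than failing on older installs.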

@mroeschke mroeschke added the Arrow pyarrow functionality label Mar 12, 2025
@asharmalik19 asharmalik19 force-pushed the enh-arrow-json-extension branch from d2c4db1 to dced6fa Compare March 12, 2025 20:29
Improved extension type handling by using BaseExtensionType for consistent storage type resolution across all PyArrow extension types, including JSON.

Fixes pandas-dev#60958
@asharmalik19 asharmalik19 force-pushed the enh-arrow-json-extension branch from dced6fa to a945971 Compare March 12, 2025 20:58
@mroeschke mroeschke added this to the 3.0 milestone Mar 12, 2025
@mroeschke mroeschke merged commit 44c8f20 into pandas-dev:main Mar 12, 2025
42 checks passed
@mroeschke
Member

Thanks @asharmalik19

@asharmalik19
Contributor Author

Thanks @mroeschke for such quick feedback

@tswast
Contributor

tswast commented Mar 24, 2025

Thank you so much for your help, @asharmalik19 and @mroeschke !

Development

Successfully merging this pull request may close these issues.

ENH: Support pa.json_ in arrow extension type
3 participants