ENH: Remove ArrowStringArray and StringDtype("pyarrow") #48469
Comments
Noting that there was discussion to not immediately deprecate Also Nonetheless, agreed that |
As I mentioned in #47818 (review), I am personally -1 on moving away from |
Is the idea here to introduce storage parameter/option/... for other dtypes so that "int64" could alias to one of, say, "int64[pyarrow]", "int64[masked]", "int64[numpy]"? |
What if we used |
The real issue is
|
I'm going to close this issue since we are not going to "Remove ArrowStringArray and StringDtype("pyarrow")" anytime soon. On the contrary, it will effectively become the default, albeit as a variant using NaN semantics, in pandas 3.0. There may be a few points raised in this discussion that participants feel have not been addressed, but they can be opened as separate issues if needed. |
Feature Type
Problem Description
I wish I could use pandas to create pyarrow-backed Series for strings.
I wish there was a single data type and single extension array for strings (rather than 2).
Currently, we have 2 pyarrow data types & arrays for strings:
StringDtype("pyarrow"), backed by arrays.ArrowStringArray
ArrowDtype(pa.string()), backed by arrays.ArrowExtensionArray
I propose we use ArrowDtype(pa.string()) and ArrowExtensionArray.
Feature Description
Alternative Solutions
Additional Context