API (string dtype): implement hierarchy (NA > NaN, pyarrow > python) for consistent comparisons between different string dtypes #61138
base: main
expected = pd.array([None, None, None], dtype=expected_dtype)
tm.assert_extension_array_equal(result, expected)
# # with list
# other = [None, None, "c"]
Did you want to implement testing this in this PR?
Yes, this was already implemented; I just need to add this case back to the test. The original "array" test was actually testing with a list. I updated the test to use an actual array (parametrized over all the different dtypes, so that every combination of dtypes in both operands is covered), and added a separate test for the list case.
9a0c382 to 4ebd93b
result = getattr(a, op_name)(pd.NA)
expected = pd.array([None, None, None], dtype=expected_dtype)
tm.assert_extension_array_equal(result, expected)
For this case of comparing with NA, we already have a dedicated test just above, so I am removing it here.
Needs a whatsnew?
 using_infer_string
 and all_arithmetic_operators == "__radd__"
 and (
-(dtype.na_value is pd.NA) or (dtype.storage == "python" and HAS_PYARROW)
+dtype.na_value is pd.NA
+and not (not HAS_PYARROW and dtype.storage == "python")
 )
Nit: This would be a bit cleaner as
using_infer_string
and all_arithmetic_operators == "__radd__"
and dtype.na_value is pd.NA
and (HAS_PYARROW or dtype.storage == "pyarrow")
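
For what it's worth, the suggested form should behave the same as the condition in the diff. A quick way to check is to enumerate every input, as in the sketch below. It assumes `dtype.storage` can only be `"python"` or `"pyarrow"`; the function names and the plain booleans standing in for `dtype.na_value is pd.NA` and `HAS_PYARROW` are hypothetical, used only for this illustration.

```python
from itertools import product


def condition_in_diff(na_is_pd_na: bool, has_pyarrow: bool, storage: str) -> bool:
    # Mirrors the condition as written in the diff (plain values stand in
    # for dtype.na_value is pd.NA, HAS_PYARROW and dtype.storage).
    return na_is_pd_na and not (not has_pyarrow and storage == "python")


def condition_suggested(na_is_pd_na: bool, has_pyarrow: bool, storage: str) -> bool:
    # Mirrors the form suggested in the review comment.
    return na_is_pd_na and (has_pyarrow or storage == "pyarrow")


# Assumption: storage is always one of these two values.
cases = list(product([True, False], [True, False], ["python", "pyarrow"]))

# Collect every input on which the two forms disagree.
mismatches = [
    case for case in cases
    if condition_in_diff(*case) != condition_suggested(*case)
]
print(mismatches)  # → []
```

The equivalence is De Morgan's law plus the two-storage assumption: `not (not HAS_PYARROW and storage == "python")` becomes `HAS_PYARROW or storage != "python"`, and with only two storages `storage != "python"` means `storage == "pyarrow"`.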
Closes #60639
This does not yet handle the case of comparison to object dtype.
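
To illustrate the rule named in the title, here is a minimal pure-Python sketch of one way to rank the string dtype variants. It is not the pandas implementation: `StringVariant`, `priority` and `highest_priority` are hypothetical names, and the sketch assumes that the missing-value axis (NA > NaN) takes precedence over the storage axis (pyarrow > python) when the two disagree.

```python
from typing import NamedTuple


class StringVariant(NamedTuple):
    # Hypothetical stand-in for a string dtype; not a pandas class.
    storage: str  # "python" or "pyarrow"
    na: str       # "NaN" or "NA"


# Ranks for each axis: a higher number wins.
NA_RANK = {"NaN": 0, "NA": 1}
STORAGE_RANK = {"python": 0, "pyarrow": 1}


def priority(variant: StringVariant) -> tuple[int, int]:
    # Assumption: the NA axis is compared first, storage breaks ties.
    return (NA_RANK[variant.na], STORAGE_RANK[variant.storage])


def highest_priority(left: StringVariant, right: StringVariant) -> StringVariant:
    # The operand order does not matter; the higher-ranked variant wins.
    return max(left, right, key=priority)


python_nan = StringVariant("python", "NaN")
pyarrow_nan = StringVariant("pyarrow", "NaN")
python_na = StringVariant("python", "NA")
pyarrow_na = StringVariant("pyarrow", "NA")

print(highest_priority(python_nan, pyarrow_nan))  # pyarrow wins when NA semantics match
print(highest_priority(pyarrow_nan, python_na))   # NA wins over NaN under this assumption
```

Under this reading the variants form a single ordering, from lowest to highest: python/NaN, pyarrow/NaN, python/NA, pyarrow/NA.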