Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert to compatible NumPy dtype for MaskedArray to_numpy #55058

Merged
merged 18 commits into from
Dec 1, 2023

Conversation

phofl
Copy link
Member

@phofl phofl commented Sep 7, 2023

This still needs a whatsnew

@phofl phofl requested a review from WillAyd as a code owner September 7, 2023 20:53
@mroeschke mroeschke added the NA - MaskedArrays Related to pd.NA and nullable extension arrays label Sep 8, 2023
@WillAyd
Copy link
Member

WillAyd commented Sep 11, 2023

I think this is pretty reasonable. Is this part of a larger set of changes or is this self-contained?

Probably worth a whatsnew

@phofl
Copy link
Member Author

phofl commented Sep 11, 2023

This is something we wanted to clean up for a while now, but yes I have to add a whatsnew (didn't get to it yet because this needs more than a single line)

@phofl
Copy link
Member Author

phofl commented Sep 30, 2023

Added the whatsnew

@phofl phofl added this to the 2.2 milestone Sep 30, 2023
@lukemanley
Copy link
Member

Does this close #48891?

@phofl
Copy link
Member Author

phofl commented Sep 30, 2023

yep

phofl and others added 2 commits October 15, 2023 16:00
# Conflicts:
#	pandas/tests/base/test_conversion.py
#	pandas/tests/frame/test_repr.py
@phofl
Copy link
Member Author

phofl commented Nov 16, 2023

cc @mroeschke

I think this is ready for a review

@@ -1527,6 +1528,8 @@ def _format_strings(self) -> list[str]:
if isinstance(values, Categorical):
# Categorical is special for now, so that we can preserve tzinfo
array = values._internal_get_values()
elif isinstance(values, BaseMaskedArray):
array = values.to_numpy(na_value=NA)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep hat should work as well

@pytest.mark.parametrize("na_action", [None, "ignore"])
def test_map(self, data_missing, na_action):
result = data_missing.map(lambda x: x, na_action=na_action)
if data_missing.dtype == Float32Dtype():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there a solution to avoid this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not with roundtripping through object, maybe your other suggestion above solves this.

No if your suggestion doesn't work

Copy link
Member

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we do the same thing for ArrowExtensionArray.to_numpy?

@phofl
Copy link
Member Author

phofl commented Nov 29, 2023

Can we get this in if ci is green now? Just merged main to resolve whatsnew conflicts

phofl and others added 2 commits November 30, 2023 21:47
Co-authored-by: Matthew Roeschke <10647082+mroeschke@users.noreply.github.com>
@mroeschke mroeschke merged commit 2a0e253 into pandas-dev:main Dec 1, 2023
@mroeschke
Copy link
Member

Thanks @phofl

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NA - MaskedArrays Related to pd.NA and nullable extension arrays
Projects
None yet
4 participants