BUG: Series.setitem losing precision when enlarging #47342

phofl · 2022-06-14T10:13:35Z

xref BUG: assignment with enlargement gives object dtype with ExtensionArrays #32346
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

We have to preserve the dtype here. This only fixes the case for Series, not for DataFrames.

mroeschke · 2022-06-15T19:14:26Z

pandas/core/indexing.py

-            # this preserves dtype of the value
-            new_values = Series([value])._values
+            # this preserves dtype of the value and of the object
+            if isna(value == value):


Is value == value there to see if it's pd.NA? It's not exactly clear why value is being compared to itself.

Yep, is there a better way to achieve this?

I think isna(value) should just work, right?

In [1]: pd.isna(pd.NA)
Out[1]: True

In [2]: pd.isna(np.nan)
Out[2]: True

I don't want to enter that branch when I get nan, because nan does not fit into int64, for example.

I see. Well, if you only want to check for pd.NA, I think `value is pd.NA` is clearer IMO.
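For readers following the exchange above, here is a small sketch comparing the three checks that were discussed. It is only an illustration: the `describe` helper and the `candidates` mapping are hypothetical names, not part of the PR. It relies on `pd.NA == pd.NA` evaluating to `pd.NA`, whereas `np.nan == np.nan` and `pd.NaT == pd.NaT` evaluate to `False`.

```python
import numpy as np
import pandas as pd

# Hypothetical set of missing-value sentinels to compare (not from the PR).
candidates = {"pd.NA": pd.NA, "np.nan": np.nan, "pd.NaT": pd.NaT, "None": None}


def describe(value):
    """Evaluate the three checks discussed in the review for one scalar."""
    return {
        # Broad check: true for every missing-value sentinel.
        "isna(value)": bool(pd.isna(value)),
        # The check from the diff: only pd.NA compares to itself as NA.
        "isna(value == value)": bool(pd.isna(value == value)),
        # The identity check suggested in review.
        "value is pd.NA": value is pd.NA,
    }


checks = {name: describe(value) for name, value in candidates.items()}
for name, result in checks.items():
    print(name, result)
```

Under these assumptions, the self-comparison and the identity check agree for all four values, which is why the identity check reads as the clearer spelling of the same condition.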

The current code only checks for pd.NA, so my example above of pd.NaT isn't actually applicable right now. But in light of:

we should strive to have the result consistent regardless of enlargement or not.

We should maybe handle np.nan as well?

We currently treat np.nan as a "missing value" when setting into a nullable Series without enlargement. So shouldn't we also treat it as "missing" in the case of enlargement (and so preserve the nullable Int64 dtype, instead of converting to float64)?

I like your idea of checking nans more broadly. So currently, if we are setting nan into Int64 it gets converted to pd.NA?

Yes, for example:

In [7]: s = pd.Series([1, 2, 3], dtype="Int64")

In [8]: s[0] = np.nan

In [9]: s
Out[9]:
0    <NA>
1       2
2       3
dtype: Int64
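A minimal sketch of the consistency being asked for, assuming a pandas version that includes this fix (it was milestoned for 1.5). The variable names `in_place` and `enlarged` are illustrative only. Both assignments use np.nan; the first targets an existing label, the second a new one.

```python
import numpy as np
import pandas as pd

# In-place assignment: label 0 already exists in the index.
in_place = pd.Series([1, 2, 3], dtype="Int64")
in_place[0] = np.nan

# Enlargement: label 3 does not exist yet, so the Series grows by one row.
enlarged = pd.Series([1, 2, 3], dtype="Int64")
enlarged[3] = np.nan

# With the fix applied, both are expected to keep the nullable Int64 dtype
# and to store the missing value as pd.NA rather than a float nan.
print(in_place.dtype, in_place[0] is pd.NA)
print(enlarged.dtype, enlarged[3] is pd.NA)
```

On a version without the fix, the enlargement case may instead fall back to float64, which is the loss of precision named in the PR title.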

Wasn't aware of this. Will adjust accordingly. You are correct, this should be consistent.

Could you have another look @jorisvandenbossche? I tried to add relevant cases for enlargement and non-enlargement to ensure consistency. Let me know if there is something missing.

There is one open case:

ser = pd.Series([1, 2], dtype="Int64")
ser[1] = "a"

This raises, while enlargement casts to object:

ser = pd.Series([1, 2], dtype="Int64")
ser[2] = "a"

This is true for rhs="a" and rhs=pd.NaT. With non-EA dtypes we cast to object in both cases. Do we want to be consistent here, or is the difference intended?
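The open case above can be reproduced with a short script. This is a sketch only: `try_set` is a hypothetical helper, and the outcome of the enlargement branch may differ between pandas versions, so the script reports what happens rather than assuming a result.

```python
import pandas as pd


def try_set(label, value):
    """Set value at label on a fresh Int64 Series and report the outcome."""
    ser = pd.Series([1, 2], dtype="Int64")
    try:
        ser[label] = value
    except (TypeError, ValueError) as err:
        return f"raised {type(err).__name__}"
    return f"dtype {ser.dtype}"


# Label 1 exists, so this is a plain in-place assignment.
existing_label = try_set(1, "a")
# Label 2 does not exist, so this goes through the enlargement path.
new_label = try_set(2, "a")

print("existing label:", existing_label)
print("new label:", new_label)
```

If the two lines printed differ in kind (one raising, one reporting a dtype), that is the inconsistency the comment asks about.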

jreback · 2022-07-01T22:18:10Z

thanks @phofl

phofl · 2022-07-01T22:19:02Z

Thx, I'll open a follow up for the ea case with non matching dtypes

BUG: Series.setitem losing precision when enlarging

3886376

phofl added the Dtype Conversions and Indexing labels Jun 14, 2022

Fix nan case and add test

f986959

mroeschke reviewed Jun 15, 2022

mroeschke added this to the 1.5 milestone Jun 15, 2022

phofl added 2 commits June 24, 2022 13:21

Handle multiple na cases

1c5c0d4

Remove test

d83391a

mroeschke approved these changes Jun 24, 2022

jreback approved these changes Jun 30, 2022

jreback merged commit bd9a6f0 into pandas-dev:main Jul 1, 2022

phofl deleted the dtype_casting_enlargement branch July 1, 2022 22:19

yehoshuadimarsky pushed a commit to yehoshuadimarsky/pandas that referenced this pull request Jul 13, 2022

BUG: Series.setitem losing precision when enlarging (pandas-dev#47342)

92f0e02

CloseChoice mentioned this pull request Jul 16, 2022

REGR: setting numeric value in Categorical Series with enlargement raise internal error #47677

Closed

This was referenced Apr 21, 2023

BUG: assignment with enlargement gives object dtype with ExtensionArrays #32346

Open

BUG: fix setitem with enlargment with pyarrow Scalar #52833

Closed

phofl commented Jun 14, 2022

mroeschke Jun 15, 2022

phofl Jun 15, 2022

mroeschke Jun 15, 2022

phofl Jun 15, 2022

mroeschke Jun 15, 2022

jorisvandenbossche Jun 17, 2022

phofl Jun 17, 2022

jorisvandenbossche Jun 17, 2022

phofl Jun 17, 2022

phofl Jun 24, 2022

jreback commented Jul 1, 2022

phofl commented Jul 1, 2022
