Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Series construction sharing data with DTI #22113

Closed
wants to merge 6 commits into from

Conversation

makbigc
Copy link
Contributor

@makbigc makbigc commented Jul 29, 2018

add a type check for DatetimeIndex and then pass a deep copy to create a Series object

closes #21907

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs a test

@@ -434,7 +434,7 @@ Indexing
- Fixed ``DataFrame[np.nan]`` when columns are non-unique (:issue:`21428`)
- Bug when indexing :class:`DatetimeIndex` with nanosecond resolution dates and timezones (:issue:`11679`)
- Bug where indexing with a Numpy array containing negative values would mutate the indexer (:issue:`21867`)

- Bug when :class:`Series` was created from :class:`DatetimeIndex`, the series shares the same underneath data with that index (:issue:`21907`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't need the word underneath.

@@ -206,7 +206,11 @@ def __init__(self, data=None, index=None, dtype=None, name=None,
data = data.astype(dtype)
else:
# need to copy to avoid aliasing issues
data = data._values.copy()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't passing deep=True simply solve this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem only occurs with tz-aware cases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not a good solution here, to special case things. pass deep=True here; if this doesn't work, then the copy constructor is broken.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this bug is caused by the return value of _values. In datetimes.py, _values returns the DatetimeIndex itself for tz-aware and ndarray for tz-naive.

    @property
    def _values(self):
        # tz-naive -> ndarray
        # tz-aware -> DatetimeIndex
        if self.tz is not None:
            return self
        else:
            return self.values

That 's why this problem only occurs with tz-aware cases as jbrockmendel said.


In [2]: dti1 = pd.date_range('2016-01-01', periods=10, tz='US/Pacific')

In [3]: dti2 = pd.date_range('2016-01-01', periods=10)

In [4]: ser1 = pd.Series(dti1)

In [5]: ser2 = pd.Series(dti2)

In [6]: ser1[::3]=pd.NaT

In [7]: ser2[::3]=pd.NaT

In [8]: dti1
Out[8]: 
DatetimeIndex([                      'NaT', '2016-01-02 00:00:00-08:00',
               '2016-01-03 00:00:00-08:00',                       'NaT',
               '2016-01-05 00:00:00-08:00', '2016-01-06 00:00:00-08:00',
                                     'NaT', '2016-01-08 00:00:00-08:00',
               '2016-01-09 00:00:00-08:00',                       'NaT'],
              dtype='datetime64[ns, US/Pacific]', freq='D')

In [9]: dti2
Out[9]: 
DatetimeIndex(['2016-01-01', '2016-01-02', '2016-01-03', '2016-01-04',
               '2016-01-05', '2016-01-06', '2016-01-07', '2016-01-08',
               '2016-01-09', '2016-01-10'],
              dtype='datetime64[ns]', freq='D')

@jreback jreback changed the title Fix Bug #21907 BUG: Series construction sharing data with DTI Jul 29, 2018
@jreback
Copy link
Contributor

jreback commented Jul 29, 2018

also pls for all indices via the indices fixture, we DO test construction on all indices, so maybe you can just add onto these tests, look in pandas/tests/series/test_constructor.py

@jreback jreback added the Bug label Jul 29, 2018
@jreback
Copy link
Contributor

jreback commented Nov 4, 2018

can you rebase and update

@pep8speaks
Copy link

Hello @makbigc! Thanks for updating the PR.

@codecov
Copy link

codecov bot commented Nov 18, 2018

Codecov Report

Merging #22113 into master will increase coverage by 0.05%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #22113      +/-   ##
==========================================
+ Coverage   92.23%   92.28%   +0.05%     
==========================================
  Files         161      161              
  Lines       51408    51451      +43     
==========================================
+ Hits        47416    47483      +67     
+ Misses       3992     3968      -24
Flag Coverage Δ
#multiple 90.68% <100%> (+0.05%) ⬆️
#single 42.28% <100%> (-0.02%) ⬇️
Impacted Files Coverage Δ
pandas/core/series.py 93.69% <100%> (+0.01%) ⬆️
pandas/core/arrays/datetimelike.py 95.96% <0%> (-0.19%) ⬇️
pandas/io/formats/format.py 97.76% <0%> (-0.12%) ⬇️
pandas/core/arrays/categorical.py 95.35% <0%> (ø) ⬆️
pandas/core/ops.py 94.27% <0%> (+0.01%) ⬆️
pandas/core/generic.py 96.84% <0%> (+0.01%) ⬆️
pandas/io/parsers.py 95.57% <0%> (+0.01%) ⬆️
pandas/core/arrays/datetimes.py 98.51% <0%> (+0.03%) ⬆️
pandas/core/indexes/datetimes.py 96.2% <0%> (+0.05%) ⬆️
pandas/core/arrays/timedeltas.py 96.09% <0%> (+0.45%) ⬆️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1250500...aba2368. Read the comment docs.

@jreback
Copy link
Contributor

jreback commented Dec 4, 2018

closing as superseded by #24096

@jreback jreback closed this Dec 4, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pd.Series(dti) shares data with dti --> DatetimeIndex mutated
4 participants