ENH: add timedelta as valid type for interpolate with method='time' #14799

aileronajay · 2016-12-05T00:24:22Z

closes interpolate w/ method=time does not work with timedeltas #6424
tests added / passed
passes git diff upstream/master | flake8 --diff
whatsnew entry

aileronajay · 2016-12-05T00:26:59Z

This PR is currently a WIP; tests have not been added yet. This iteration is for initial review.

aileronajay · 2016-12-05T01:33:47Z

@TomAugspurger The change currently in this PR supports timedelta as the index type, so the case presented in the issue now works:

pd.DataFrame({'v':[1,np.nan,5]},index=pd.to_timedelta([1,2,3], unit="d")).interpolate(method="time")
          v
1 days  1.0
2 days  3.0
3 days  5.0

However, I think the broader expectation is that interpolate should also work when the data itself is of type datetime or timedelta (similar to how cut was enhanced), so something like the example below should work. Currently it leaves the missing value as NaT. Is this the expectation?

pd.DataFrame({'v':[np.datetime64('2012-01-01'),np.nan,np.datetime64('2012-01-03')]},index=[1,2,3]).interpolate()
           v
1 2012-01-01
2        NaT
3 2012-01-03
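For reference, a minimal, self-contained sketch of the timedelta-index case described above. It assumes a pandas build that includes this change; the variable names are illustrative, and unit="D" is used here in place of the unit="d" spelling from the comment.

```python
import numpy as np
import pandas as pd

# One missing value on a timedelta index, as in the issue report (GH 6424).
df = pd.DataFrame(
    {"v": [1, np.nan, 5]},
    index=pd.to_timedelta([1, 2, 3], unit="D"),
)

# method="time" weights the fill by the elapsed time between index entries.
result = df.interpolate(method="time")
print(result)
```

With evenly spaced index entries the missing value lands halfway between its neighbours, which matches the 3.0 shown in the output above.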

aileronajay · 2016-12-05T21:28:23Z

@jreback @jorisvandenbossche @TomAugspurger I need pointers on the expected functionality. I have added support for a timedelta index when interpolating. Should we also be able to interpolate when the data values themselves are time data?

TomAugspurger · 2016-12-05T22:31:31Z

@aileronajay IIRC the original issue #6424 was about interpolating when the index was timedeltas. We can support interpolating time delta data like in your second example as well. That's fine either in this PR or a separate one.

aileronajay · 2016-12-05T22:48:54Z

@TomAugspurger I will then add support for timedelta as the index in this PR and create a separate PR for interpolating time data.

codecov-io · 2016-12-06T04:35:07Z

Current coverage is 85.26% (diff: 100%)

Merging #14799 into master will not change coverage

@@             master     #14799   diff @@
==========================================
  Files           144        144          
  Lines         50979      50979          
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
  Hits          43469      43469          
  Misses         7510       7510          
  Partials          0          0

Powered by Codecov. Last update 36bb8af...fff59f5

aileronajay · 2016-12-07T01:04:44Z

Are there other changes required in this PR (such as more tests)? @sinhrks @jreback @TomAugspurger @jorisvandenbossche

jreback · 2016-12-07T01:17:42Z

pandas/core/missing.py

@@ -187,7 +187,8 @@ def _interp_limit(invalid, fw_limit, bw_limit):
        if method in ('values', 'index'):
            inds = np.asarray(xvalues)
            # hack for DatetimeIndex, #1646
-            if issubclass(inds.dtype.type, np.datetime64):
+            if (issubclass(inds.dtype.type, np.datetime64) or
+                    issubclass(inds.dtype.type, np.timedelta64)):


This should use the is_datetime64_dtype functions instead.

Or better yet, the needs_i8_conversion function.

@jreback I have made this change now
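To illustrate why a single check can cover both datetime and timedelta indexes, here is a rough NumPy-only sketch. The helper as_interp_positions is hypothetical and is not the pandas implementation; it uses the dtype kind as a stand-in for needs_i8_conversion, on the assumption that both kinds can be reinterpreted as 64-bit integers before linear interpolation.

```python
import numpy as np


def as_interp_positions(xvalues):
    """Hypothetical helper: return numeric x positions for interpolation."""
    inds = np.asarray(xvalues)
    # Kind "M" is datetime64 and kind "m" is timedelta64. Both are stored
    # as 64-bit integers, so they can be viewed as int64 without copying.
    if inds.dtype.kind in "mM":
        inds = inds.view(np.int64)
    return inds


# A timedelta index with non-uniform spacing and one missing value.
index = np.array([1, 2, 4], dtype="timedelta64[D]")
values = np.array([1.0, np.nan, 3.0])

x = as_interp_positions(index)
valid = ~np.isnan(values)

# Fill the missing entries by linear interpolation over the integer positions.
filled = values.copy()
filled[~valid] = np.interp(x[~valid], x[valid], values[valid])
print(filled)
```

The same helper would accept a datetime64 array unchanged, which is the point of checking for both kinds in one place rather than adding a second issubclass test.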

aileronajay · 2016-12-07T18:05:42Z

@jreback I have made the change requested in the review. Are there other changes that need to be taken care of?

jorisvandenbossche

Can you add a small notice in the v0.20.0.txt whatsnew file (in the 'other enhancements' section)?

jorisvandenbossche · 2016-12-07T22:37:22Z

pandas/tests/series/test_missing.py

@@ -891,6 +891,16 @@ def test_spline_error(self):
        with tm.assertRaises(ValueError):
            s.interpolate(method='spline', order=0)

+    def test_interp_timedelta64(self):
+        # GH 6424
+        tm._skip_if_no_scipy()


Is scipy needed for this interpolate method?

It shouldn't be.

The interpolate method used here is 'time', and I don't think it requires scipy. I had assumed that we skip the interpolation tests when scipy is absent, but on a more thorough check I found that is not the case.

I have removed the skip if no scipy statement

TomAugspurger · 2016-12-07T22:45:16Z

pandas/core/missing.py

@@ -187,7 +188,7 @@ def _interp_limit(invalid, fw_limit, bw_limit):
        if method in ('values', 'index'):
            inds = np.asarray(xvalues)
            # hack for DatetimeIndex, #1646
-            if issubclass(inds.dtype.type, np.datetime64):
+            if (needs_i8_conversion(inds.dtype.type)):


You can remove the redundant outer pair of parentheses.

@TomAugspurger I have addressed this review comment.

TomAugspurger · 2016-12-07T22:48:32Z

pandas/tests/series/test_missing.py

@@ -891,6 +891,16 @@ def test_spline_error(self):
        with tm.assertRaises(ValueError):
            s.interpolate(method='spline', order=0)

+    def test_interp_timedelta64(self):


Can you add another test with non-uniform spacing in the index, e.g. pd.to_timedelta([1, 2, 4]), and ensure that it gets interpolated to the correct value?

Sure. Would it be better to add it as one more case in the existing test or as a new test?

@TomAugspurger I have added non-uniform spacing as a test case now.
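A small sketch of what the non-uniform case checks, assuming a pandas build that includes this change. The unit="D" argument and the variable names are choices made for this example, not taken from the test in the PR.

```python
import numpy as np
import pandas as pd

# Index entries at 1, 2 and 4 days, so the gap is not centred between neighbours.
s = pd.Series([1, np.nan, 3], index=pd.to_timedelta([1, 2, 4], unit="D"))

# Time-weighted: 1 + (2 - 1) / (4 - 1) * (3 - 1), roughly 1.667.
by_time = s.interpolate(method="time")

# method="linear" ignores the index and treats the values as equally spaced.
by_position = s.interpolate(method="linear")

print(by_time)
print(by_position)
```

Uniform spacing cannot tell these two methods apart, which is why the extra test case is useful.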

aileronajay · 2016-12-09T17:32:42Z

@jreback I have made the suggested changes

jreback · 2016-12-09T19:31:14Z

doc/source/whatsnew/v0.20.0.txt

@@ -59,6 +59,7 @@ Other enhancements
 - ``Series`` provides a ``to_excel`` method to output Excel files (:issue:`8825`)
 - The ``usecols`` argument in ``pd.read_csv`` now accepts a callable function as a value  (:issue:`14154`)
 - ``pd.DataFrame.plot`` now prints a title above each subplot if ``suplots=True`` and ``title`` is a list of strings (:issue:`14753`)
+-  The ``pd.Series.interpolate`` now supports timedelta as index type(:issue:`6424`)


with method='time' (also add a space before (:issue:...

jreback · 2016-12-09T19:31:50Z

minor doc change. ping on green.

jreback · 2016-12-09T21:50:09Z

doc/source/whatsnew/v0.20.0.txt

@@ -59,6 +59,7 @@ Other enhancements
 - ``Series`` provides a ``to_excel`` method to output Excel files (:issue:`8825`)
 - The ``usecols`` argument in ``pd.read_csv`` now accepts a callable function as a value  (:issue:`14154`)
 - ``pd.DataFrame.plot`` now prints a title above each subplot if ``suplots=True`` and ``title`` is a list of strings (:issue:`14753`)
+-  The ``pd.Series.interpolate`` now supports timedelta as index type with method=`time` (:issue:`6424`)


Small edit, but I can do it on merge:

Remove the leading "The"; method='time' needs to be in double backticks.

timedelta as *an* index type (add "an")

@jreback Should I make these changes, or will you be able to do that during merge?

aileronajay · 2016-12-10T03:06:12Z

@jreback There was a clean run in Travis. AppVeyor failed after running for an hour; there appears to be a problem with AppVeyor.

jorisvandenbossche · 2016-12-10T11:19:57Z

@aileronajay Thanks!

aileronajay mentioned this pull request Dec 5, 2016

interpolate w/ method=time does not work with timedeltas #6424

Closed

sinhrks added the Dtype Conversions, Missing-data, and Timedelta labels Dec 6, 2016

jreback requested changes Dec 7, 2016

jorisvandenbossche reviewed Dec 7, 2016

TomAugspurger reviewed Dec 7, 2016

aileronajay force-pushed the interpolate_enhance branch from 90a8086 to 1cd8575 on December 7, 2016 23:04

aileronajay added 6 commits December 9, 2016 12:23

added timedelta as valid type for conversion to integer

d87273d

added test case for change

822f55e

updated test with correct issue number and timedelta as index

81dab58

checking if index is of time type using needs_i8_conversion

18b756c

code review changes

95722ff

rebase changes

579c4bb

aileronajay force-pushed the interpolate_enhance branch from 1cd8575 to 579c4bb on December 9, 2016 17:31

jreback reviewed Dec 9, 2016

jreback approved these changes Dec 9, 2016

whatsnew changes

b72ab91

jreback reviewed Dec 9, 2016

whatsnew message typos

fff59f5

edit whatsnew line

562a793

jorisvandenbossche changed the title ~~added timedelta as valid type for conversion to integer~~ ENH: add timedelta as valid type for interpolate with method='time' Dec 10, 2016

jorisvandenbossche merged commit 1dbc7be into pandas-dev:master Dec 10, 2016

jorisvandenbossche added this to the 0.20.0 milestone Dec 10, 2016

ischurov pushed a commit to ischurov/pandas that referenced this pull request Dec 19, 2016

ENH: add timedelta as valid type for interpolate with method='time' (p…

f7537eb

…andas-dev#14799)
