BUG: DataFrame with tz-aware data and max(axis=1) returns NaN #10390
Comments
pls show pd.show_versions() and df_datetime64.info()
Hello! Hijacking this issue as I've also verified this behaviour. (It took a while to discover after upgrading to 0.19.0 and finding some odd dropping of timezones; see #14524, which is a duplicate of #13905.)

This behaviour was previously masked in my program because pandas 0.18.1 dropped the timezones from all relevant columns before I tried to perform this step. After upgrading to 0.19.0, half of the operations I was performing stopped dropping timezones, leading to a mismatch between tz-aware and tz-naive timestamps, which I've been chasing down the rabbit hole for a couple of days now.

I've verified that this is present in pandas 0.18.1 and 0.19.0. From some stepping through of the code, this looks like a potential problem with the numpy implementations of

This issue has meant that I've been forced to roll back to 0.18.1 to use the drop timezone bug in order to make the

A small, complete example of the issue
import pandas as pd
df = pd.DataFrame(pd.date_range(start=pd.Timestamp('2016-01-01 00:00:00+00'), end=pd.Timestamp('2016-01-01 23:59:59+00'), freq='H'))
df.columns = ['a']
df['b'] = df.a.subtract(pd.Timedelta(seconds=60*60)) # if using pandas 0.19.0 to test, ensure that this is a series of timedeltas instead of a single - we want b and c to be tz-naive.
df[['a', 'b']].max() # This is fine, produces two numbers
df[['a', 'b']].max(axis=1) # This is not fine, produces a correctly sized series of NaN
df['c'] = df.a.subtract(pd.Timedelta(seconds=60)) # if using pandas 0.19.0 to test, ensure that this is a series of timedeltas instead of a single - we want b and c to be tz-naive.
df[['b', 'c']].max(axis=1) # This is fine, produces correctly sized series of valid timestamps without timezone
df[['a', 'b']].T.max() # produces an empty series
Expected Output
Calling
Output of
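One possible workaround, not proposed by anyone in this thread, is to bring every datetime column to the same tz-awareness before taking the row-wise maximum. The sketch below assumes that converting to UTC and then dropping the timezone is acceptable for the data at hand. The helper name `to_naive_utc` is hypothetical, and the three-row frame stands in for the 24-row frame in the example above.

```python
import pandas as pd

# Column 'a' is tz-aware (UTC); column 'b' is tz-naive, mirroring the report.
stamps = pd.to_datetime(
    ["2016-01-01 00:00", "2016-01-01 01:00", "2016-01-01 02:00"]
).tz_localize("UTC")
df = pd.DataFrame({"a": pd.Series(stamps)})
df["b"] = df["a"].dt.tz_localize(None) - pd.Timedelta(hours=1)


def to_naive_utc(col):
    """Hypothetical helper: convert a tz-aware column to UTC, then drop the tz."""
    if isinstance(col.dtype, pd.DatetimeTZDtype):
        return col.dt.tz_convert("UTC").dt.tz_localize(None)
    return col  # already tz-naive, leave untouched


# Rebuild the frame so both columns are tz-naive before the reduction.
normalized = pd.DataFrame({name: to_naive_utc(df[name]) for name in ["a", "b"]})
row_max = normalized.max(axis=1)
print(row_max)
```

Because `b` is always one hour behind `a` here, the row-wise maximum should equal the tz-naive version of `a`. The result is tz-naive; if tz-aware output is needed, `row_max.dt.tz_localize("UTC")` would restore it.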
I have a DataFrame that looks like this, and its column 2 is missing:

When I try to select the max date in each row, I get all NaN in return:

However, if the DataFrame's dtype is float64, the selection works as expected.
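The original frame and its output were not preserved here, so the following is only a rough reconstruction with hypothetical values. It assumes "missing" means column 2 contains a missing value (`NaT`). The tz-aware result is printed rather than asserted, since it depends on the pandas version in use; the float64 frame shows the behaviour described as working.

```python
import numpy as np
import pandas as pd

# Hypothetical tz-aware data; column 2 has a missing value (NaT).
dates = pd.DataFrame({
    1: pd.to_datetime(["2015-01-01", "2015-01-05"]).tz_localize("UTC"),
    2: pd.to_datetime(["2015-01-02", None]).tz_localize("UTC"),
    3: pd.to_datetime(["2015-01-03", "2015-01-02"]).tz_localize("UTC"),
})
date_max = dates.max(axis=1)
print(date_max)  # the report describes all-NaN output here on affected versions

# The same layout with float64 values; NaN is skipped (skipna=True by default).
floats = pd.DataFrame({
    1: [1.0, 5.0],
    2: [2.0, np.nan],
    3: [3.0, 2.0],
})
float_max = floats.max(axis=1)
print(float_max)
```

Comparing the two printed results on a given pandas installation is a quick way to check whether that version is affected.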