Concat Series[datetimetz] & Series[category] loses timezone #23816

Closed
TomAugspurger opened this issue Nov 20, 2018 · 6 comments
Labels
Bug, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode), Timezones (Timezone data dtype)

Comments

@TomAugspurger
Contributor

In [8]: a = pd.Series(pd.date_range('20170101', periods=4, tz='US/Pacific'))

In [9]: b = pd.Series(['a', 'b'], dtype='category')

In [10]: pd.concat([a, b], ignore_index=True)[0]
Out[10]: Timestamp('2017-01-01 08:00:00')

Out[10] should have a tz, as in the following:

In [11]: pd.concat([a, b.astype(object)], ignore_index=True)[0]
Out[11]: Timestamp('2017-01-01 00:00:00-0800', tz='US/Pacific', freq='D')
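
The expectation above can be written as a small regression-style check. This is a minimal sketch, not the test that accompanied the eventual fix; it assumes a pandas release that already includes the fix (the issue is milestoned for 0.24.0), and the variable names `result` and `first` are illustrative.

```python
import pandas as pd

# Same inputs as in the report: tz-aware datetimes and a categorical.
a = pd.Series(pd.date_range("20170101", periods=4, tz="US/Pacific"))
b = pd.Series(["a", "b"], dtype="category")

result = pd.concat([a, b], ignore_index=True)
first = result[0]

# The two inputs share no common dtype, so the result falls back to object.
print(result.dtype)
# On a fixed pandas, the first element still carries its timezone.
print(first.tz)
```

On an affected version, `first.tz` would be `None` and the value would show the UTC wall time, as in Out[10].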
TomAugspurger added the Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) and Timezones (Timezone data dtype) labels Nov 20, 2018
TomAugspurger added this to the Contributions Welcome milestone Nov 20, 2018
@evangelinehl
Contributor

Hi,

Can my team and I contribute to this issue?

Thanks.

@TomAugspurger
Contributor Author

Sure, you can give it a shot.

I'm not sure yet where the best place for the fix is. It's somewhere in core/dtypes/concat.py, but it'll be a bit tricky to find the right point at which to convert to object dtype.

@TomAugspurger
Contributor Author

That's the offset from the timezone ("US/Pacific").

Looking more closely, I think the issue is in _concat_asobject(to_concat).

We call np.asarray(x) on the non-categorical inputs, like our datetimes. I wonder if we could rewrite that as

to_concat = [np.asarray(x.astype(object)) for x in to_concat]

We know we need object dtype anyway, so there's no avoiding that. We need the asarray(...) for things like SparseArray, since SparseArray.astype(object) is still sparse.

Maybe give that change a shot and see if the tests still pass.
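
To make the difference between the two conversions concrete, here is a minimal sketch. `to_object_arrays` is a hypothetical helper written for illustration, not the pandas internal. Because the default result of `np.asarray` on tz-aware data has varied across pandas versions, the sketch requests `datetime64[ns]` explicitly to reproduce the tz-naive conversion.

```python
import numpy as np
import pandas as pd

a = pd.Series(pd.date_range("20170101", periods=4, tz="US/Pacific"))
b = pd.Series(["a", "b"], dtype="category")

# Converting straight to datetime64[ns] drops the timezone; what remains
# is the UTC wall time, which is where the 08:00 in Out[10] comes from.
naive = a.to_numpy(dtype="datetime64[ns]")
first_naive = pd.Timestamp(naive[0])
print(first_naive, first_naive.tz)


def to_object_arrays(to_concat):
    """Hypothetical helper: cast each input to object dtype first, then
    take a plain ndarray, mirroring the list comprehension suggested above."""
    return [np.asarray(x.astype(object)) for x in to_concat]


# Going through astype(object) keeps each element a tz-aware Timestamp.
combined = np.concatenate(to_object_arrays([a, b]))
print(combined.dtype, combined[0])
```

The outer `np.asarray` is a no-op for these inputs; it is kept only to match the suggested change, which needs it for array types whose `astype(object)` does not return a plain ndarray.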

@evangelinehl
Contributor

Okay I'll take a look at the suggested change. Thanks!

@evangelinehl
Contributor

This should be closed

@jreback
Contributor

jreback commented Dec 1, 2018

@evangelineliu things are closed with a merged pull request

gfyoung added the Bug label Dec 2, 2018
jreback modified the milestones: Contributions Welcome, 0.24.0 Dec 4, 2018