BUG: Fix concat series loss of timezone #24027

Merged 25 commits on Dec 5, 2018
Commits
d80afa0
BUG: Fix concat series loss of timezone
jakezimmer Nov 30, 2018
f4b751d
Merge branch 'master' of https://github.com/pandas-dev/pandas
evangelinehl Dec 1, 2018
159c4e6
Fixed naming error for is_datetimetz since this function is no longer…
evangelinehl Dec 1, 2018
2450097
Attempted to use _concat_compat to rectify the timezone bug
jakezimmer Dec 1, 2018
9cb20c4
Merge remote-tracking branch 'origin/master'
jakezimmer Dec 1, 2018
7f9dd52
Attempt to fix tz error with concat compat instead of union
jakezimmer Dec 1, 2018
6cb2022
changing behavior to be based on tz
jakezimmer Dec 1, 2018
a4da449
Attempting to fix differing dimensions bug
evangelinehl Dec 2, 2018
fe83e6d
Another attempt to fix dimensions bug
evangelinehl Dec 2, 2018
2cbb533
Just trying to test different versions here
evangelinehl Dec 2, 2018
f527dcc
Trying to fix dimensions bug now that Travis CI is passing but others…
evangelinehl Dec 2, 2018
583ce49
tests failed so changing it back to when travis ci succeeded
evangelinehl Dec 2, 2018
01a2c10
Changing it back because we're trying to figure out if concat_compat …
evangelinehl Dec 2, 2018
683dccf
Reverting back to version when all tests passed
evangelinehl Dec 3, 2018
857c6be
Restored blank lines
evangelinehl Dec 3, 2018
64da4c0
Added test case for the new tz output
evangelinehl Dec 3, 2018
9e699e4
Fixed style issues
evangelinehl Dec 3, 2018
64182c5
Fixed the whitespace issue in linting
jakezimmer Dec 3, 2018
b630d58
Merge branch 'master' into PR_TOOL_MERGE_PR_24027
jreback Dec 4, 2018
c7dcdb4
fix up
jreback Dec 4, 2018
165689e
updated whatsnew (v0.24.0) to reflect changes
jakezimmer Dec 4, 2018
43b2e2a
Merge branch 'master' into master
jakezimmer Dec 4, 2018
634c736
no changes since @jreback's fix up commit
jakezimmer Dec 5, 2018
0b86ef9
Update v0.24.0.rst
jakezimmer Dec 5, 2018
1867b3a
removed trailing whitespace
jakezimmer Dec 5, 2018
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.24.0.rst
@@ -1545,6 +1545,7 @@ Reshaping
 - Bug in :meth:`DataFrame.append` with a :class:`Series` with a dateutil timezone would raise a ``TypeError`` (:issue:`23682`)
 - Bug in ``Series`` construction when passing no data and ``dtype=str`` (:issue:`22477`)
 - Bug in :func:`cut` with ``bins`` as an overlapping ``IntervalIndex`` where multiple bins were returned per item instead of raising a ``ValueError`` (:issue:`23980`)
+- Bug in :func:`pandas.concat` when joining ``Series`` datetimetz with ``Series`` category would lose timezone (:issue:`23816`)
 - Bug in :meth:`DataFrame.join` when joining on partial MultiIndex would drop names (:issue:`20452`).

 .. _whatsnew_0240.bug_fixes.sparse:
18 changes: 8 additions & 10 deletions pandas/core/dtypes/concat.py
@@ -191,15 +191,6 @@ def _concat_categorical(to_concat, axis=0):
         A single array, preserving the combined dtypes
     """

-    def _concat_asobject(to_concat):
-        to_concat = [x.get_values() if is_categorical_dtype(x.dtype)
-                     else np.asarray(x).ravel() for x in to_concat]
-        res = _concat_compat(to_concat)
-        if axis == 1:
-            return res.reshape(1, len(res))
-        else:
-            return res
-
     # we could have object blocks and categoricals here
     # if we only have a single categoricals then combine everything
     # else its a non-compat categorical
@@ -214,7 +205,14 @@ def _concat_asobject(to_concat):
         if all(first.is_dtype_equal(other) for other in to_concat[1:]):
             return union_categoricals(categoricals)

-    return _concat_asobject(to_concat)
+    # extract the categoricals & coerce to object if needed
+    to_concat = [x.get_values() if is_categorical_dtype(x.dtype)
+                 else np.asarray(x).ravel() if not is_datetime64tz_dtype(x)
+                 else np.asarray(x.astype(object)) for x in to_concat]
+    result = _concat_compat(to_concat)
+    if axis == 1:
+        result = result.reshape(1, len(result))
+    return result


 def union_categoricals(to_union, sort_categories=False, ignore_order=False):
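The new comprehension in `_concat_categorical` branches three ways per input: categorical, tz-aware datetime, and everything else. The following is a minimal standalone sketch of that branching, assuming a recent pandas and using only public API. `coerce_for_concat` is a hypothetical name for illustration, `np.concatenate` stands in for the internal `_concat_compat`, and the `isinstance` checks stand in for `is_categorical_dtype` / `is_datetime64tz_dtype`; it is not the code pandas ships.

```python
import numpy as np
import pandas as pd


def coerce_for_concat(values):
    """Hypothetical helper (not part of pandas) mirroring the three branches."""
    if isinstance(values.dtype, pd.CategoricalDtype):
        # Categorical: materialise the underlying category values.
        return np.asarray(values)
    if isinstance(values.dtype, pd.DatetimeTZDtype):
        # tz-aware datetimes: cast to object first so every element
        # stays a tz-aware pd.Timestamp.
        return np.asarray(values.astype(object))
    # Everything else: a plain 1-D NumPy array.
    return np.asarray(values).ravel()


a = pd.Series(pd.date_range("2017-01-01", periods=2, tz="US/Pacific"))
b = pd.Series(["a", "b"], dtype="category")

# Stand-in for _concat_compat: both pieces are object arrays here.
combined = np.concatenate([coerce_for_concat(x) for x in (a, b)])

# For contrast: explicitly requesting datetime64 converts the values
# to UTC and drops the timezone, which is the kind of loss GH-23816 reports.
naive_utc = a.to_numpy(dtype="datetime64[ns]")
```

The contrast uses an explicit `to_numpy(dtype=...)` call rather than a bare `np.asarray(a)`, because what `np.asarray` returns for a tz-aware Series has differed between pandas versions.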
13 changes: 13 additions & 0 deletions pandas/tests/reshape/test_concat.py
@@ -2552,3 +2552,16 @@ def test_concat_series_name_npscalar_tuple(s1name, s2name):
     result = pd.concat([s1, s2])
     expected = pd.Series({'a': 1, 'b': 2, 'c': 5, 'd': 6})
     tm.assert_series_equal(result, expected)
+
+
+def test_concat_categorical_tz():
+    # GH-23816
+    a = pd.Series(pd.date_range('2017-01-01', periods=2, tz='US/Pacific'))
+    b = pd.Series(['a', 'b'], dtype='category')
+    result = pd.concat([a, b], ignore_index=True)
+    expected = pd.Series([
+        pd.Timestamp('2017-01-01', tz="US/Pacific"),
+        pd.Timestamp('2017-01-02', tz="US/Pacific"),
+        'a', 'b'
+    ])
+    tm.assert_series_equal(result, expected)
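The behaviour the test pins down can also be checked interactively. This is a short sketch, assuming a pandas version that includes this fix (0.24.0 or later) and using only the public `pd.concat` API; the variable names are illustrative.

```python
import pandas as pd

a = pd.Series(pd.date_range("2017-01-01", periods=2, tz="US/Pacific"))
b = pd.Series(["a", "b"], dtype="category")

result = pd.concat([a, b], ignore_index=True)

# The mixed result falls back to object dtype, and the datetime
# elements remain tz-aware Timestamps rather than naive values.
first = result.iloc[0]
print(result.dtype)
print(first, first.tzinfo)
```

Because the two inputs share no common dtype, the result is an object Series; the fix concerns what those objects are, not the dtype of the container.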