Commit cd8057b

TomAugspurger authored and peterpanmj committed
Rename categories with Series (pandas-dev#17982)
* PERF/API: Treat series as array-like for rename_categories

  HEAD:

  ```
  [ 50.00%] ··· Running categoricals.Categoricals3.time_rank_string_cat 6.63ms
  [ 50.00%] ·····
  [100.00%] ··· Running categoricals.Categoricals3.time_rank_string_cat_ordered 4.85ms
  ```

  Closes pandas-dev#17981

* Redo docstring
* Use list-like
* Warn
* Fix doc indent
* Doc cleanup
* More doc cleanup
* Fix API reference
* Typos

1 parent 29c548c commit cd8057b

3 files changed, +83 -9 lines changed

doc/source/whatsnew/v0.21.0.txt

+30 -1
@@ -239,6 +239,36 @@ Now, to find prices per store/product, we can simply do:
      .pipe(lambda grp: grp.Revenue.sum()/grp.Quantity.sum())
      .unstack().round(2))
 
+
+.. _whatsnew_0210.enhancements.reanme_categories:
+
+``Categorical.rename_categories`` accepts a dict-like
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+:meth:`~Series.cat.rename_categories` now accepts a dict-like argument for
+``new_categories``. The previous categories are looked up in the dictionary's
+keys and replaced if found. The behavior of missing and extra keys is the same
+as in :meth:`DataFrame.rename`.
+
+.. ipython:: python
+
+   c = pd.Categorical(['a', 'a', 'b'])
+   c.rename_categories({"a": "eh", "b": "bee"})
+
+.. warning::
+
+    To assist with upgrading pandas, ``rename_categories`` treats ``Series`` as
+    list-like. Typically, they are considered to be dict-like, and in a future
+    version of pandas ``rename_categories`` will change to treat them as
+    dict-like.
+
+    .. ipython:: python
+        :okwarning:
+
+        c.rename_categories(pd.Series([0, 1], index=['a', 'c']))
+
+    Follow the warning message's recommendations.
+
 See the :ref:`documentation <groupby.pipe>` for more.
 
 .. _whatsnew_0210.enhancements.other:
@@ -267,7 +297,6 @@ Other Enhancements
 - :func:`DataFrame.items` and :func:`Series.items` are now present in both Python 2 and 3 and is lazy in all cases. (:issue:`13918`, :issue:`17213`)
 - :func:`Styler.where` has been implemented as a convenience for :func:`Styler.applymap`. (:issue:`17474`)
 - :func:`MultiIndex.is_monotonic_decreasing` has been implemented. Previously returned ``False`` in all cases. (:issue:`16554`)
-- :func:`Categorical.rename_categories` now accepts a dict-like argument as ``new_categories`` and only updates the categories found in that dict. (:issue:`17336`)
 - :func:`read_excel` raises ``ImportError`` with a better message if ``xlrd`` is not installed. (:issue:`17613`)
 - :func:`read_json` now accepts a ``chunksize`` parameter that can be used when ``lines=True``. If ``chunksize`` is passed, read_json now returns an iterator which reads in ``chunksize`` lines with each iteration. (:issue:`17048`)
 - :meth:`DataFrame.assign` will preserve the original order of ``**kwargs`` for Python 3.6+ users instead of sorting the column names. (:issue:`14207`)
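
The dict-like behaviour described in the new whatsnew section (categories missing from the mapping pass through, extra keys are ignored) can be modelled without pandas. The following is a minimal pure-Python sketch; `rename_with_mapping` is a hypothetical helper name used only for illustration and is not part of the pandas API.

```python
def rename_with_mapping(categories, mapping):
    """Return new category labels.

    Labels found among the mapping's keys are replaced; all other labels
    are kept unchanged. Keys that match no category have no effect.
    """
    return [mapping.get(item, item) for item in categories]


old_categories = ['a', 'b']

# 'a' is renamed, 'b' is not in the mapping and passes through,
# and the extra key 'c' is ignored.
new_categories = rename_with_mapping(old_categories, {'a': 'A', 'c': 'C'})
print(new_categories)
```

The lookup is the same `get(item, item)` pattern that appears in the `pandas/core/categorical.py` hunk of this commit, applied here to a plain list instead of a `Categorical`.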

pandas/core/categorical.py

+41 -8
@@ -866,11 +866,6 @@ def set_categories(self, new_categories, ordered=None, rename=False,
     def rename_categories(self, new_categories, inplace=False):
         """ Renames categories.
 
-        The new categories can be either a list-like dict-like object.
-        If it is list-like, all items must be unique and the number of items
-        in the new categories must be the same as the number of items in the
-        old categories.
-
         Raises
         ------
         ValueError
@@ -879,15 +874,30 @@ def rename_categories(self, new_categories, inplace=False):
 
         Parameters
         ----------
-        new_categories : Index-like or dict-like (>=0.21.0)
-           The renamed categories.
+        new_categories : list-like or dict-like
+
+           * list-like: all items must be unique and the number of items in
+             the new categories must match the existing number of categories.
+
+           * dict-like: specifies a mapping from
+             old categories to new. Categories not contained in the mapping
+             are passed through and extra categories in the mapping are
+             ignored. *New in version 0.21.0*.
+
+           .. warning::
+
+              Currently, Series are considered list like. In a future version
+              of pandas they'll be considered dict-like.
+
         inplace : boolean (default: False)
            Whether or not to rename the categories inplace or return a copy of
            this categorical with renamed categories.
 
         Returns
         -------
-        cat : Categorical with renamed categories added or None if inplace.
+        cat : Categorical or None
+           With ``inplace=False``, the new categorical is returned.
+           With ``inplace=True``, there is no return value.
 
         See also
         --------
@@ -896,10 +906,33 @@ def rename_categories(self, new_categories, inplace=False):
         remove_categories
         remove_unused_categories
         set_categories
+
+        Examples
+        --------
+        >>> c = Categorical(['a', 'a', 'b'])
+        >>> c.rename_categories([0, 1])
+        [0, 0, 1]
+        Categories (2, int64): [0, 1]
+
+        For dict-like ``new_categories``, extra keys are ignored and
+        categories not in the dictionary are passed through
+
+        >>> c.rename_categories({'a': 'A', 'c': 'C'})
+        [A, A, b]
+        Categories (2, object): [A, b]
         """
         inplace = validate_bool_kwarg(inplace, 'inplace')
         cat = self if inplace else self.copy()
 
+        if isinstance(new_categories, ABCSeries):
+            msg = ("Treating Series 'new_categories' as a list-like and using "
+                   "the values. In a future version, 'rename_categories' will "
+                   "treat Series like a dictionary.\n"
+                   "For dict-like, use 'new_categories.to_dict()'\n"
+                   "For list-like, use 'new_categories.values'.")
+            warn(msg, FutureWarning, stacklevel=2)
+            new_categories = list(new_categories)
+
         if is_dict_like(new_categories):
             cat.categories = [new_categories.get(item, item)
                               for item in cat.categories]
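
The order of the checks in this hunk matters: a Series would pass a duck-typed dict-like test, so the `isinstance` branch has to run first and convert it to a plain list of values. Below is a stand-alone sketch of that dispatch, assuming a hypothetical `FakeSeries` class in place of `pandas.Series` and a simplified `is_dict_like` that is not pandas's exact helper. The function operates on a list of labels rather than on a `Categorical`, and the `ValueError` messages are illustrative.

```python
import warnings


class FakeSeries:
    """Hypothetical stand-in for pandas.Series: values with index labels."""

    def __init__(self, values, index):
        self.values = list(values)
        self.index = list(index)

    def __iter__(self):
        # Like a Series, iteration yields the values, not the index labels.
        return iter(self.values)

    def keys(self):
        return self.index

    def __getitem__(self, key):
        return self.values[self.index.index(key)]


def is_dict_like(obj):
    # Simplified duck-typing check (an assumption for this sketch).
    return hasattr(obj, "keys") and hasattr(obj, "__getitem__")


def rename_categories(categories, new_categories):
    """Return renamed labels for a list of existing category labels."""
    if isinstance(new_categories, FakeSeries):
        # Checked first: otherwise the dict-like branch below would apply.
        warnings.warn(
            "Treating Series 'new_categories' as a list-like and using "
            "the values.",
            FutureWarning,
            stacklevel=2,
        )
        new_categories = list(new_categories)

    if is_dict_like(new_categories):
        return [new_categories.get(item, item) for item in categories]

    # List-like path: same length as the old categories, all items unique.
    new_categories = list(new_categories)
    if len(new_categories) != len(categories):
        raise ValueError("number of new categories must match the old ones")
    if len(set(new_categories)) != len(new_categories):
        raise ValueError("new categories must be unique")
    return new_categories


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # The index labels ['a', 'c'] are ignored; only the values are used.
    renamed = rename_categories(["a", "b"], FakeSeries([0, 1], index=["a", "c"]))

print(renamed)
print(len(caught))
```

With `stacklevel=2`, the warning is attributed to the line that called `rename_categories` rather than to the `warnings.warn` call inside it, which is usually the more helpful location for a deprecation notice.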

pandas/tests/test_categorical.py

+12
@@ -1203,6 +1203,18 @@ def test_rename_categories(self):
         with pytest.raises(ValueError):
             cat.rename_categories([1, 2])
 
+    def test_rename_categories_series(self):
+        # https://github.com/pandas-dev/pandas/issues/17981
+        c = pd.Categorical(['a', 'b'])
+        xpr = "Treating Series 'new_categories' as a list-like "
+        with tm.assert_produces_warning(FutureWarning) as rec:
+            result = c.rename_categories(pd.Series([0, 1]))
+
+        assert len(rec) == 1
+        assert xpr in str(rec[0].message)
+        expected = pd.Categorical([0, 1])
+        tm.assert_categorical_equal(result, expected)
+
     def test_rename_categories_dict(self):
         # GH 17336
         cat = pd.Categorical(['a', 'b', 'c', 'd'])
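
The new test asserts three things: exactly one warning is raised, it is a `FutureWarning`, and its message contains the expected prefix. The same pattern can be sketched with the standard library alone. Here `legacy_call` and `call_and_capture` are hypothetical names introduced for illustration, and `warnings.catch_warnings` stands in for the pandas helper `tm.assert_produces_warning` used in the test.

```python
import warnings


def legacy_call():
    """Hypothetical function that warns in the same way as the commit."""
    warnings.warn(
        "Treating Series 'new_categories' as a list-like and using "
        "the values.",
        FutureWarning,
        stacklevel=2,
    )
    return [0, 1]


def call_and_capture(func):
    """Call func and return its result together with the recorded warnings."""
    with warnings.catch_warnings(record=True) as recorded:
        # Make sure the warning is not suppressed by an existing filter.
        warnings.simplefilter("always")
        result = func()
    return result, recorded


result, recorded = call_and_capture(legacy_call)

expected_prefix = "Treating Series 'new_categories' as a list-like "
assert len(recorded) == 1
assert issubclass(recorded[0].category, FutureWarning)
assert expected_prefix in str(recorded[0].message)
assert result == [0, 1]
```

Checking the message text as well as the category guards against the test passing on an unrelated `FutureWarning` raised elsewhere in the call.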
