-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Let MultiIndex.set_levels accept any iterable (#23273) #23291
Conversation
Hello @imankulov! Thanks for submitting the PR.
|
Codecov Report
@@ Coverage Diff @@
## master #23291 +/- ##
==========================================
+ Coverage 92.22% 92.22% +<.01%
==========================================
Files 169 169
Lines 50911 50913 +2
==========================================
+ Hits 46954 46956 +2
Misses 3957 3957
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls add a whatsnew note in bug fixes / indexing
@@ -389,6 +390,9 @@ def set_levels(self, levels, level=None, inplace=False, | |||
labels=[[0, 0, 1, 1], [0, 1, 0, 1]], | |||
names=[u'foo', u'bar']) | |||
""" | |||
if is_list_like(levels) and not is_subscriptable(levels): | |||
levels = list(levels) | |||
|
|||
if level is not None and not is_list_like(level): | |||
if not is_list_like(levels): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can just add
elif is_list_like(levels):
levels = list(levels)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried this approach, but unfortunately, it breaks functionality where one passes to set_levels CategoricalIndex (or in general, something else which is more complex than a list).
Specifically, this turns
index.set_levels(CategoricalIndex(list("bac")), 0)
into
index.set_levels(list("bac"), 0)
and breaks test_set_levels_categorical
I can add change this with
if is_list_like(levels) and not isinstance(levels, Index):
but I'm not sure what's better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the 2nd would be fine, we don't need/want any more api checkers)
new_sizes = map(int, ['3', '2', '1']) | ||
index = pd.MultiIndex.from_arrays([sizes, colors], names=['size', 'color']) | ||
index2 = index.set_levels(new_sizes, level='size') | ||
assert_matching(index2.get_level_values('size'), [3, 2, 1]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
construct the appropriate MultiIndex and use tm.assert_index_equal
pandas/core/dtypes/inference.py
Outdated
Check if the object is subscriptable. | ||
String types are not included as sequences here. | ||
|
||
Parameters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not needed
Thanks for the review @jreback. Addressed all comments. Need advice on how to deal better with set_index(CategoricalIndex) |
@@ -389,6 +390,9 @@ def set_levels(self, levels, level=None, inplace=False, | |||
labels=[[0, 0, 1, 1], [0, 1, 0, 1]], | |||
names=[u'foo', u'bar']) | |||
""" | |||
if is_list_like(levels) and not is_subscriptable(levels): | |||
levels = list(levels) | |||
|
|||
if level is not None and not is_list_like(level): | |||
if not is_list_like(levels): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the 2nd would be fine, we don't need/want any more api checkers)
index = pd.MultiIndex.from_arrays([sizes, colors], names=['size', 'color']) | ||
|
||
new_sizes = map(int, ['3', '2', '1']) | ||
index = index.set_levels(new_sizes, level='size') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
result =
use expected =
thanks! |
…ndas * repo_org/master: (23 commits) DOC: Add docstring validations for "See Also" section (pandas-dev#23143) TST: Fix test assertion (pandas-dev#23357) BUG: Handle Period in combine (pandas-dev#23350) REF: SparseArray imports (pandas-dev#23329) CI: Migrate some CircleCI jobs to Azure (pandas-dev#22992) DOC: update the is_month_start/is_month_end docstring (pandas-dev#23051) Partialy fix issue pandas-dev#23334 - isort pandas/core/groupby directory (pandas-dev#23341) TST: Add base test for extensionarray setitem pandas-dev#23300 (pandas-dev#23304) API: Add sparse Acessor (pandas-dev#23183) PERF: speed up CategoricalIndex.get_loc (pandas-dev#23235) fix and test incorrect case in delta_to_nanoseconds (pandas-dev#23302) BUG: Handle Datetimelike data in DataFrame.combine (pandas-dev#23317) TST: re-enable gbq tests (pandas-dev#23303) Switched references of App veyor to azure pipelines in the contributing CI section (pandas-dev#23311) isort imports-io (pandas-dev#23332) DOC: Added a Multi Index example for the Series.sum method (pandas-dev#23279) REF: Make PeriodArray an ExtensionArray (pandas-dev#22862) DOC: Added Examples for Series max (pandas-dev#23298) API/ENH: tz_localize handling of nonexistent times: rename keyword + add shift option (pandas-dev#22644) BUG: Let MultiIndex.set_levels accept any iterable (pandas-dev#23273) (pandas-dev#23291) ...
…xamples * repo_org/master: (83 commits) DOC: Add docstring validations for "See Also" section (pandas-dev#23143) TST: Fix test assertion (pandas-dev#23357) BUG: Handle Period in combine (pandas-dev#23350) REF: SparseArray imports (pandas-dev#23329) CI: Migrate some CircleCI jobs to Azure (pandas-dev#22992) DOC: update the is_month_start/is_month_end docstring (pandas-dev#23051) Partialy fix issue pandas-dev#23334 - isort pandas/core/groupby directory (pandas-dev#23341) TST: Add base test for extensionarray setitem pandas-dev#23300 (pandas-dev#23304) API: Add sparse Acessor (pandas-dev#23183) PERF: speed up CategoricalIndex.get_loc (pandas-dev#23235) fix and test incorrect case in delta_to_nanoseconds (pandas-dev#23302) BUG: Handle Datetimelike data in DataFrame.combine (pandas-dev#23317) TST: re-enable gbq tests (pandas-dev#23303) Switched references of App veyor to azure pipelines in the contributing CI section (pandas-dev#23311) isort imports-io (pandas-dev#23332) DOC: Added a Multi Index example for the Series.sum method (pandas-dev#23279) REF: Make PeriodArray an ExtensionArray (pandas-dev#22862) DOC: Added Examples for Series max (pandas-dev#23298) API/ENH: tz_localize handling of nonexistent times: rename keyword + add shift option (pandas-dev#22644) BUG: Let MultiIndex.set_levels accept any iterable (pandas-dev#23273) (pandas-dev#23291) ...
git diff upstream/master -u -- "*.py" | flake8 --diff
Quite surprisingly,
is_list_like
accepts a wide range of iterables including non-subscriptable objects. Later in the codeset_levels
implicitly supposes that passed instance can return its 0'th element by index, which is not always the case, as provided in the example of the #23273.To address the issue, a new helper function
is_subscriptable
is added, and if passed levels look like a list, but are not subscriptable, they are explicltly converted to a list withlist(levels)