Are multiple callable/dict groupers allowed in groupby? #22278

Open
toobaz opened this issue Aug 11, 2018 · 1 comment
Labels
Deprecate (functionality to remove in pandas), Groupby

Comments

@toobaz
Member

toobaz commented Aug 11, 2018

Problem description

The docs for the DataFrame.groupby signature start with:

by : mapping, function, label, or list of labels
    Used to determine the groups for the groupby.

... but the code assumes that lists of mappings or functions can also be passed, and this is also tested, although with limited enthusiasm:

# this code path isn't used anywhere else

... and consistency (apparently that code path is used somewhere else):
grouped = wp.groupby([lambda x: x.month, lambda x: x.weekday()],
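For illustration, here is a minimal sketch of the undocumented behaviour, assuming a pandas version in which lists of mappings or functions are still accepted as `by`. The Series, the two dicts and the lambdas are made up for this example and are not taken from the pandas test suite.

```python
import pandas as pd

# Hypothetical data: a small Series with string labels as its index.
s = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

# Two mappings from index label to group key (hypothetical names).
parity = {"a": "odd", "b": "even", "c": "odd", "d": "even"}
half = {"a": "first", "b": "first", "c": "second", "d": "second"}

# A list of mappings: each dict contributes one level of grouping keys.
by_dicts = s.groupby([parity, half]).sum()

# A list of functions: each callable is applied to every index label.
by_funcs = s.groupby([lambda label: label in "ab",
                      lambda label: label.upper()]).sum()

print(by_dicts)
print(by_funcs)
```

In both cases the result is indexed by one level per element of the list, even though the documented signature only mentions a single mapping or function.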

Expected Output

Either we disable/deprecate the possibility of passing lists of mappings, or we document it.

I guess the latter is the desired outcome, since the code does not support the feature "by chance". Still, I wanted to double-check with @pandas-dev/pandas-core because

  • it is not a killer feature, as it is really easy to pass a single lambda that does the same job as a list of mappings (and more, like applying different mappings to specific levels of the index)
  • removing it would allow us to simplify the code quite a bit (e.g. #22257, "get_group(...) fails for groupby(...) based on a function", wouldn't have happened)
  • it is probably not much used
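The first point can be sketched as follows: a single callable that returns a tuple produces the same groups as a list of callables. This is only an illustration on made-up data (the Series and its date range are assumptions), and it compares the aggregated values rather than the exact type of the resulting index, which may differ between the two spellings.

```python
import pandas as pd

# Hypothetical data: 40 daily observations spanning two months.
s = pd.Series(range(40),
              index=pd.date_range("2018-01-01", periods=40, freq="D"))

# List of callables: one grouping key per function.
by_list = s.groupby([lambda ts: ts.month, lambda ts: ts.weekday()]).sum()

# Single callable returning a tuple that combines both keys.
by_tuple = s.groupby(lambda ts: (ts.month, ts.weekday())).sum()

# Compare group key -> sum, independently of the index type.
same_groups = by_list.to_dict() == by_tuple.to_dict()
print(same_groups)
```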

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-6-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.UTF-8
LOCALE: it_IT.UTF-8

pandas: 0.24.0.dev0+437.g33d70efb5
pytest: 3.5.0
pip: 9.0.1
setuptools: 39.2.0
Cython: 0.28.4
numpy: 1.14.3
scipy: 0.19.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.5.6
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: 1.2.0dev
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.2.2.post1634.dev0+ge8120cf6d
openpyxl: 2.3.0
xlrd: 1.0.0
xlwt: 1.3.0
xlsxwriter: 0.9.6
lxml: 4.1.1
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.2.1
gcsfs: None

toobaz added the Docs, API Design, and Needs Discussion (requires discussion from core team before further action) labels Aug 11, 2018
@WillAyd
Member

WillAyd commented Aug 11, 2018

I am +1 on disabling this entirely: I think the value of supporting it is relatively limited, with potential for a high cost in development complexity and edge-case coverage.

mroeschke added the Deprecate (functionality to remove in pandas) and Groupby labels, and removed the API Design, Docs, and Needs Discussion (requires discussion from core team before further action) labels Jun 21, 2021
rhshadrach self-assigned this Apr 16, 2022
rhshadrach removed their assignment Aug 27, 2022
4 participants