ENH: add masked algorithm for mean() #34754
Comments
Hey @jorisvandenbossche, can I work on this issue? I'm new to open source.
Sure!
take
Hey @jorisvandenbossche, please correct me if I'm wrong anywhere! I've gone through the numpy version of masked mean and implemented a mean function in the How do I test the time of this function relative to the older one?
@Akshatt the easiest is to open a PR now with what you've got (you can mark the PR as "draft" and e.g. put WIP in the title); that makes it easier to give feedback.
You can do it similarly to what I showed in the top post in this issue. I am using the
@jorisvandenbossche Okay, got it! I've created a draft pull request #34814.
Similarly as we now have masked implementations for sum, prod, min and max for the nullable integer array (first PR #30982, now lives at https://github.com/pandas-dev/pandas/blob/master/pandas/core/array_algos/masked_reductions.py), we can add one for the mean reduction as well.

Very rough check gives a nice speed-up:

The nanmean version lives here: https://github.com/pandas-dev/pandas/blob/master/pandas/core/nanops.py#L517

And as reference, numpy is also adding a version that accepts a mask: numpy/numpy#15852 (which could be used in the future, and as inspiration for the implementation now).
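
For readers following along, here is a minimal sketch of what a masked mean could look like. It is not the code from the linked PR or from masked_reductions.py. The names `masked_mean`, `MISSING` and `time_both` are hypothetical, `MISSING` stands in for the pandas NA value, and the mask is assumed to be a boolean array that is True where a value is missing. The timing helper only shows one way to compare against `np.nanmean` on an equivalent float array; the numbers will depend on the machine and array size.

```python
import timeit

import numpy as np

# Hypothetical sentinel standing in for the pandas NA value.
MISSING = None


def masked_mean(values, mask, skipna=True):
    """Mean of `values`, ignoring positions where `mask` is True.

    Illustrative sketch only; assumes `mask` is a boolean array of the
    same shape as `values`, with True marking a missing entry.
    """
    if not skipna and mask.any():
        # With skipna=False, any missing value makes the result missing.
        return MISSING
    count = values.size - mask.sum()
    if count == 0:
        # Nothing left to average over.
        return MISSING
    # `where=` restricts the sum to the valid (unmasked) positions.
    total = np.sum(values, where=~mask)
    return total / count


def time_both(n=1_000_000, repeat=5):
    """Return (masked, nanmean) best-of-`repeat` timings in seconds."""
    rng = np.random.default_rng(0)
    values = rng.integers(0, 100, size=n)
    mask = rng.random(n) < 0.1
    # Equivalent float array with NaN in the masked positions.
    floats = values.astype("float64")
    floats[mask] = np.nan
    t_masked = min(
        timeit.repeat(lambda: masked_mean(values, mask), number=1, repeat=repeat)
    )
    t_nan = min(timeit.repeat(lambda: np.nanmean(floats), number=1, repeat=repeat))
    return t_masked, t_nan
```

Calling `time_both()` returns the two timings side by side, which is roughly the kind of comparison asked about in the comments; in IPython the same comparison can be done interactively with `%timeit`.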