-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Suppport masks in median groupby algo #48387
Conversation
pandas/_libs/groupby.pyx
Outdated
@@ -93,6 +124,13 @@ cdef inline float64_t median_linear(float64_t* a, int n) nogil: | |||
a = tmp | |||
n -= na_count | |||
|
|||
return calc_median_linear(a, n, na_count) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you free a
here instead of in calc_median_linear
? While it does mean you'll have to free in both this method and the masked version, it makes calc_median_linear
have no side-effects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved it, my train of thought went the other way, if moved inside the function the caller would not forget it. But no real preference on my side
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Thanks @phofl |
* ENH: Suppport masks in median groupby algo * Avoid float cast * Out dtype * Add cast * Revert algos * Add whatsnew * Deduplicate * Fix * Add type hints * Move free
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.Only a small speed up (5-10%, depending on the number of NAs). In general, this is needed to streamline the code base in ops.py after everything supports masks.