CLN: rank_1d #40546

Merged
merged 18 commits into from
Mar 23, 2021
Add comments, whitespace
mzeitlin11 committed Mar 20, 2021

commit fe6495a289b97d21a5bb559b09813eebe2781213
13 changes: 8 additions & 5 deletions pandas/_libs/algos.pyx
@@ -985,8 +985,9 @@ def rank_1d(
 else:
     mask = np.zeros(shape=len(masked_vals), dtype=np.uint8)

-# If ascending is true and na_option == 'bottom',
-# fill with the largest so NaN
+# If ascending and na_option == 'bottom' or descending and
+# na_option == 'top' -> we want to rank NaN as the highest
+# so fill with the maximum value for the type
 if ascending ^ (na_option == 'top'):
     if rank_t is object:
         nan_fill_val = Infinity()
@@ -997,6 +998,8 @@ def rank_1d(
     else:
         nan_fill_val = np.inf
     order = (masked_vals, mask, labels)
+
+# Otherwise, fill with the lowest value of the type
 else:
     if rank_t is object:
         nan_fill_val = NegInfinity()
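
Note on the two hunks above: the rewritten comment describes the XOR condition, which the old comment only half covered. The following is a minimal sketch of that idea in plain Python/NumPy, not the Cython implementation. `fills_with_max` and `rank_positions` are hypothetical names used only for illustration, the sketch handles float input only, and it does no tie handling.

```python
import numpy as np


def fills_with_max(ascending: bool, na_option: str) -> bool:
    # Same boolean test as `ascending ^ (na_option == 'top')` in the diff:
    # True for (ascending, 'bottom') and for (descending, 'top').
    return ascending ^ (na_option == "top")


def rank_positions(values, ascending=True, na_option="bottom"):
    # Hypothetical helper: float input only, ties are not treated specially.
    vals = np.asarray(values, dtype=np.float64)
    mask = np.isnan(vals)
    # Fill NaN with the extreme value that pushes it to the requested end.
    fill = np.inf if fills_with_max(ascending, na_option) else -np.inf
    filled = np.where(mask, fill, vals)
    order = np.argsort(filled, kind="stable")
    if not ascending:
        order = order[::-1]
    ranks = np.empty(len(vals), dtype=np.int64)
    ranks[order] = np.arange(1, len(vals) + 1)
    return ranks.tolist()
```

With `[3.0, nan, 1.0]`, the NaN entry gets the last rank for `na_option="bottom"` and rank 1 for `na_option="top"`, in both sort directions, which is what the new comments describe.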
@@ -1036,8 +1039,8 @@ def rank_1d(
 dups += 1
 sum_ranks += i - grp_start + 1

-next_val_diff = at_end or (masked_vals[lexsort_indexer[i]] !=
-                           masked_vals[lexsort_indexer[i+1]])
+next_val_diff = at_end or are_diff(masked_vals[lexsort_indexer[i]],
+                                   masked_vals[lexsort_indexer[i+1]])

 # We'll need this check later anyway to determine group size, so just
 # compute it here since shortcircuiting won't help
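
Note on the hunk above: the diff does not show the body of `are_diff`. The sketch below assumes it is a tolerance-based comparison that falls back to `!=` for operands that cannot be subtracted. The tolerance `FP_ERR = 1e-13` and the helper `is_group_end` are assumptions made for illustration, not taken from this commit.

```python
import math

FP_ERR = 1e-13  # assumed tolerance, not shown in this diff


def are_diff(left, right):
    # Assumed behaviour: numeric values closer than FP_ERR count as equal;
    # operands that cannot be subtracted fall back to plain inequality.
    try:
        return math.fabs(left - right) > FP_ERR
    except TypeError:
        return left != right


def is_group_end(sorted_vals, i):
    # Hypothetical helper mirroring `at_end or are_diff(...)`: the `or`
    # short-circuits at the last index, so `i + 1` is never read out of bounds.
    at_end = i == len(sorted_vals) - 1
    return at_end or are_diff(sorted_vals[i], sorted_vals[i + 1])
```

Under this assumption, values that differ only by floating-point noise stay in the same tie group, where a plain `!=` would have split them.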
@@ -1058,7 +1061,7 @@ def rank_1d(
 set_as_na = keep_na and mask[lexsort_indexer[i]]

 # For all cases except TIEBREAK_FIRST when not setting
-# nulls, we set the same value at each index
+# nulls, we can set the same value at each index
 if set_as_na:
     computed_rank = NaN
     grp_na_count = dups
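
Note on the hunk above: a rough sketch of what "set the same value at each index" can look like for an average-style tiebreak. It assumes the input is already sorted and belongs to a single group. `average_ranks` is a hypothetical name, and averaging `sum_ranks / dups` is an assumption about the tiebreak branch, which this diff does not show.

```python
import math


def average_ranks(sorted_vals, sorted_mask, keep_na=True):
    # Hypothetical helper: `sorted_vals` is already sorted and `sorted_mask`
    # marks the positions that were originally missing.
    n = len(sorted_vals)
    out = [0.0] * n
    dups = 0
    sum_ranks = 0
    grp_start = 0  # single group, so ranks start counting at position 0
    for i in range(n):
        dups += 1
        sum_ranks += i - grp_start + 1
        at_end = i == n - 1
        next_val_diff = at_end or (
            sorted_vals[i] != sorted_vals[i + 1]
            or sorted_mask[i] != sorted_mask[i + 1]
        )
        if next_val_diff:
            set_as_na = keep_na and sorted_mask[i]
            # One value is computed per run of ties, then written to each index.
            computed_rank = math.nan if set_as_na else sum_ranks / dups
            for j in range(i - dups + 1, i + 1):
                out[j] = computed_rank
            dups = 0
            sum_ranks = 0
    return out
```

A first-occurrence tiebreak cannot reuse one value this way, since each tied index needs its own rank. That is the exception the comment calls out.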