ENH: Add a ratio() Method for Series and dataframes. Alternatively, consider removing similar methods at this functionality level (e.g. diff() and pct_change()) #60801

Open
2 of 3 tasks
DrorDr opened this issue Jan 27, 2025 · 0 comments
Labels: Enhancement, Needs Triage

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

We propose adding a ratio() method to pandas.
Alternatively, if this request is deemed unnecessary because it is a "one-liner," we suggest re-evaluating whether diff() and pct_change() (other one-liners) should remain in the library for consistency.

===
Currently, pandas provides the diff() method to calculate differences between consecutive elements in a Series or DataFrame. However, there is no equivalent method for calculating ratios.
This absence creates the following issues:

  1. Users must implement ratios manually with df / df.shift(), which is repetitive, error-prone, and less readable than a dedicated method.
  2. Users are often tempted to misuse pct_change() as a substitute for ratios, but pct_change() returns the fractional change (current / previous - 1) rather than the ratio itself, and its name is misleading.

Adding a ratio() method would simplify a fundamental operation, enhance code clarity, and align pandas' functionality with its philosophy of making common operations easy and intuitive.
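To make the second point concrete, here is a minimal sketch comparing the manual ratio with pct_change() on a small Series. The variable names are illustrative only; nothing here is proposed API.

```python
import pandas as pd

s = pd.Series([2.0, 4.0, 8.0, 16.0])

# Consecutive ratio via the manual workaround.
ratio = s / s.shift(1)

# pct_change() returns the fractional change, i.e. (current / previous - 1).
change = s.pct_change()

# Side by side: the ratio is 2.0 where pct_change() reports 1.0.
print(pd.DataFrame({"value": s, "ratio": ratio, "pct_change": change}))
```

Recovering the ratio from pct_change() therefore requires adding 1 back, which is easy to forget.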

Feature Description

The ratio() method would calculate the ratio of consecutive elements in a Series or DataFrame along a specified axis, similar to how diff() computes differences.

Behavior:

  • ratio() divides each element by the element periods positions before it along the chosen axis (periods defaults to 1).
  • Missing values (e.g., from shift()) would propagate as NaN, consistent with pandas behavior.
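The behavior described above could be sketched as a standalone function built on shift(). This is only an illustration under our own assumptions: the name ratio_sketch and its signature are hypothetical, mirror diff(periods, axis), and are not part of pandas.

```python
import pandas as pd


def ratio_sketch(obj, periods=1, axis=0):
    """Hypothetical helper: divide each element by the one `periods` steps earlier."""
    if isinstance(obj, pd.Series):
        # A Series has a single axis, so only `periods` matters.
        return obj / obj.shift(periods)
    # For a DataFrame, shift along the requested axis before dividing.
    return obj / obj.shift(periods, axis=axis)


df_demo = pd.DataFrame({"a": [2, 4, 8, 16], "b": [1, 3, 9, 27]})

by_rows = ratio_sketch(df_demo)                    # row i divided by row i-1
two_back = ratio_sketch(df_demo["a"], periods=2)   # row i divided by row i-2
```

Positions with no earlier element come out as NaN, just as the leading rows of diff() do.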

Example Usage:

import pandas as pd

# Input data
df = pd.DataFrame({"col": [2, 4, 8, 16]})

# Current workaround
df["ratios"] = df["col"] / df["col"].shift(1)

# Proposed functionality (ratio() does not exist in pandas yet, so this line is commented out)
# df["ratios"] = df["col"].ratio()

# Expected output
#     col  ratios
# 0     2     NaN
# 1     4     2.0
# 2     8     2.0
# 3    16     2.0

Alternative Solutions

  1. Existing Functionality (df / df.shift()):
    This is the current workaround but is repetitive, less readable, and more prone to user error.

  2. Misusing pct_change():
    While pct_change() appears similar, it computes (current / previous - 1), which is not a true ratio. This method is often misinterpreted and introduces unnecessary complexity.

  3. Third-Party Packages:
    For the basic functionality, the one-liner is simpler than adding a dependency.
    For optimized or more general functionality, we are not aware of any third-party package that provides this.

Additional Context

  1. Parity with Existing Methods:
    Pandas already provides diff() for additive differences and both cumsum() and cumprod() for cumulative operations.
    A ratio() method would align pandas’ API by addressing a gap for multiplicative differences.

  2. Confusion with pct_change():
    Many users misuse pct_change() because there is no direct alternative for calculating ratios. Adding a ratio() method would eliminate this source of confusion.

  3. Broader Use Case:
    Ratios are widely used in finance, analytics, and scientific computing, making this a valuable addition to pandas' core functionality.

  4. Implementation, testing and efficiency:
    We view ratio() as the natural geometric counterpart of the existing diff() in every respect, so its requirements and implementation considerations should follow the same lines, apart from division-specific caveats (such as division by zero), which would be handled as needed.
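As a rough illustration of the parity argument and of the division-related caveats, the sketch below uses only existing pandas operations. It assumes the manual s / s.shift(1) form stands in for the proposed method; the variable names are ours.

```python
import numpy as np
import pandas as pd

s = pd.Series([2.0, 4.0, 8.0, 16.0])

# Additive pair: cumsum() undoes diff() once the first value is restored.
d = s.diff()
d.iloc[0] = s.iloc[0]
additive_roundtrip = d.cumsum()

# Multiplicative pair: cumprod() undoes the consecutive ratio in the same way.
r = s / s.shift(1)
r.iloc[0] = s.iloc[0]
multiplicative_roundtrip = r.cumprod()

# Division caveats that diff() never meets: zeros in the denominator.
z = pd.Series([1.0, 0.0, 0.0, 5.0])
zr = z / z.shift(1)
# Position 1 is 0/1, position 2 is 0/0 (NaN), position 3 is 5/0 (inf).
print(zr)
```

How such inf and NaN results should be reported is one of the design questions a ratio() implementation would need to settle.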
