BUG: Fix scatter plot colors in groupby context to match line plot behavior (#59846) #61233

Open: wants to merge 3 commits into base: main
1 change: 1 addition & 0 deletions doc/source/whatsnew/v3.0.0.rst
@@ -762,6 +762,7 @@ Plotting
 - Bug in :meth:`DataFrame.plot.bar` with ``stacked=True`` where labels on stacked bars with zero-height segments were incorrectly positioned at the base instead of the label position of the previous segment (:issue:`59429`)
 - Bug in :meth:`DataFrame.plot.line` raising ``ValueError`` when set both color and a ``dict`` style (:issue:`59461`)
 - Bug in :meth:`DataFrame.plot` that causes a shift to the right when the frequency multiplier is greater than one. (:issue:`57587`)
+- Bug in :meth:`DataFrameGroupBy.plot` with scatter colors (:issue:`59846`)
 - Bug in :meth:`Series.plot` with ``kind="pie"`` with :class:`ArrowDtype` (:issue:`59192`)

 Groupby/resample/rolling
32 changes: 28 additions & 4 deletions pandas/core/groupby/groupby.py
@@ -431,11 +431,35 @@ def __init__(self, groupby: GroupBy) -> None:
         self._groupby = groupby

     def __call__(self, *args, **kwargs):
-        def f(self):
-            return self.plot(*args, **kwargs)
+        # Special case for scatter plots to enable auto colors like line plots
+        if kwargs.get("kind") == "scatter":
+            import matplotlib.pyplot as plt

-        f.__name__ = "plot"
-        return self._groupby._python_apply_general(f, self._groupby._selected_obj)
+            # Get colors from matplotlib's color cycle (similar to what LinePlot uses)
+            colors = plt.rcParams["axes.prop_cycle"].by_key()["color"]
+
+            # Determine the axis to plot on
+            if "ax" in kwargs:
+                ax = kwargs["ax"]
+            else:
+                _, ax = plt.subplots()
+
+            # Plot each group with a different color
+            results = {}
+            for i, (name, group) in enumerate(self._groupby):
+                group_kwargs = kwargs.copy()
+                group_kwargs["ax"] = ax
+                group_kwargs["color"] = colors[i % len(colors)]
+                results[name] = group.plot(*args, **group_kwargs)
+
+            return results
+        else:
+            # Original implementation for non-scatter plots
+            def f(self):
+                return self.plot(*args, **kwargs)
+
+            f.__name__ = "plot"
+            return self._groupby._python_apply_general(f, self._groupby._selected_obj)

     def __getattr__(self, name: str):
         def attr(*args, **kwargs):
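
Note on the color assignment in the scatter branch above: each group's color is chosen by indexing the cycle with `i % len(colors)`. The sketch below isolates that indexing step so its wraparound behavior is easy to see. It is only an illustration under stated assumptions: `assign_group_colors` and `DEFAULT_CYCLE` are hypothetical names that are not part of pandas or of this PR, and the palette is hard-coded to the hex values of matplotlib's default "tab10" cycle so the snippet runs without matplotlib installed.

```python
from typing import Dict, Hashable, Iterable, List

# Assumption: hex values of matplotlib's default "tab10" property cycle,
# hard-coded so this sketch has no third-party dependencies.
DEFAULT_CYCLE: List[str] = [
    "#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd",
    "#8c564b", "#e377c2", "#7f7f7f", "#bcbd22", "#17becf",
]


def assign_group_colors(
    group_names: Iterable[Hashable], palette: List[str] = DEFAULT_CYCLE
) -> Dict[Hashable, str]:
    """Map each group name to a palette color, wrapping around with modulo.

    Hypothetical helper for illustration only; it is not a pandas API.
    """
    if not palette:
        raise ValueError("palette must contain at least one color")
    return {
        name: palette[i % len(palette)]
        for i, name in enumerate(group_names)
    }


colors = assign_group_colors(["A", "B", "C"])
print(colors)  # {'A': '#1f77b4', 'B': '#ff7f0e', 'C': '#2ca02c'}
```

Because the index wraps, a groupby with more groups than palette entries reuses colors: with this ten-color palette, the eleventh group gets the same color as the first. Distinct colors per group therefore hold only up to the length of the active cycle.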
49 changes: 49 additions & 0 deletions pandas/tests/plotting/test_groupby.py
@@ -152,3 +152,52 @@ def test_groupby_hist_series_with_legend_raises(self):

         with pytest.raises(ValueError, match="Cannot use both legend and label"):
             g.hist(legend=True, label="d")
+
+    def test_groupby_scatter_colors(self):
+        # GH 59846 - Test that scatter plots use different colors for different groups
+        # similar to how line plots do
+        from matplotlib.collections import PathCollection
+        import matplotlib.pyplot as plt
+
+        # Create test data with distinct groups
+        df = DataFrame(
+            {
+                "x": [1, 2, 3, 4, 5, 6, 7, 8, 9],
+                "y": [1, 2, 3, 4, 5, 6, 7, 8, 9],
+                "group": ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
+            }
+        )
+
+        # Set up a figure with both line and scatter plots
+        fig, (ax1, ax2) = plt.subplots(1, 2)
+
+        # Plot line chart (known to use different colors for different groups)
+        df.groupby("group").plot(x="x", y="y", ax=ax1, kind="line")
+
+        # Plot scatter chart (should also use different colors for different groups)
+        df.groupby("group").plot(x="x", y="y", ax=ax2, kind="scatter")
+
+        # Get the colors used in the line plot and scatter plot
+        line_colors = [line.get_color() for line in ax1.get_lines()]
+
+        # Get scatter colors
+        scatter_colors = []
+        for collection in ax2.collections:
+            if isinstance(collection, PathCollection):  # This is a scatter plot
+                # Get the face colors (might be array of RGBA values)
+                face_colors = collection.get_facecolor()
+                # If multiple points with same color, we get the first one
+                if face_colors.ndim > 1:
+                    scatter_colors.append(tuple(face_colors[0]))
+                else:
+                    scatter_colors.append(tuple(face_colors))
+
+        # Assert that we have the right number of colors (one per group)
+        assert len(line_colors) == 3
+        assert len(scatter_colors) == 3
+
+        # Assert that the colors are all different (fixed behavior)
+        assert len(set(line_colors)) == 3
+        assert len(set(scatter_colors)) == 3
+
+        plt.close(fig)
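
Note on the assertions above: the test checks that the colors are distinct within each plot, but it never compares the line colors against the scatter colors. The two lists are not directly comparable, since `get_color()` on a line typically returns the cycle's hex string while `get_facecolor()` returns RGBA values. A minimal sketch of normalizing both to RGBA before comparing is shown below. `hex_to_rgba` and `same_color` are hypothetical stdlib-only helpers written for this illustration; in the actual test suite, `matplotlib.colors.to_rgba` would likely be the natural choice for the conversion.

```python
from typing import Sequence, Tuple

RGBA = Tuple[float, float, float, float]


def hex_to_rgba(hex_color: str, alpha: float = 1.0) -> RGBA:
    """Convert a '#rrggbb' string to an RGBA tuple of floats in [0, 1]."""
    value = hex_color.lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected '#rrggbb', got {hex_color!r}")
    r, g, b = (int(value[i : i + 2], 16) / 255 for i in (0, 2, 4))
    return (r, g, b, alpha)


def same_color(a: Sequence[float], b: Sequence[float], tol: float = 1e-6) -> bool:
    """Compare two RGBA sequences channel by channel within a tolerance."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))


# A hex color as a line might report it, and the same color as RGBA floats
# as a scatter collection might report it (values chosen for illustration).
line_color = "#1f77b4"
scatter_face = (31 / 255, 119 / 255, 180 / 255, 1.0)
print(same_color(hex_to_rgba(line_color), scatter_face))  # True
```

With a comparison like this, the test could assert that the i-th scatter color equals the i-th line color, which is closer to the "match line plot behavior" goal stated in the PR title than distinctness alone.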