Skip to content

PX ECDF #3330

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Aug 13, 2021
Merged

PX ECDF #3330

Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
docs sweep
  • Loading branch information
nicolaskruchten committed Aug 13, 2021
commit 542ed5aff78217b60b6d16ed6cc687d6d626b9bc
11 changes: 8 additions & 3 deletions doc/python/box-plots.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@ jupyter:
extension: .md
format_name: markdown
format_version: '1.2'
jupytext_version: 1.6.0
jupytext_version: 1.4.2
kernelspec:
display_name: Python 3
language: python
Expand All @@ -20,7 +20,7 @@ jupyter:
name: python
nbconvert_exporter: python
pygments_lexer: ipython3
version: 3.7.6
version: 3.7.7
plotly:
description: How to make Box Plots in Python with Plotly.
display_as: statistical
Expand All @@ -36,13 +36,18 @@ jupyter:
thumbnail: thumbnail/box.jpg
---

A [box plot](https://en.wikipedia.org/wiki/Box_plot) is a statistical representation of numerical data through their quartiles. The ends of the box represent the lower and upper quartiles, while the median (second quartile) is marked by a line inside the box. For other statistical representations of numerical data, see [other statistical charts](https://plotly.com/python/statistical-charts/).
<!-- #region -->
A [box plot](https://en.wikipedia.org/wiki/Box_plot) is a statistical representation of the distribution of a variable through its quartiles. The ends of the box represent the lower and upper quartiles, while the median (second quartile) is marked by a line inside the box. For other statistical representations of numerical data, see [other statistical charts](https://plotly.com/python/statistical-charts/).


Alternatives to box plots for visualizing distributions include [histograms](https://plotly.com/python/histograms/), [violin plots](https://plotly.com/python/violin/), [ECDF plots](https://plotly.com/python/ecdf-plots/) and [strip charts](https://plotly.com/python/strip-charts/).

## Box Plot with `plotly.express`

[Plotly Express](/python/plotly-express/) is the easy-to-use, high-level interface to Plotly, which [operates on a variety of types of data](/python/px-arguments/) and produces [easy-to-style figures](/python/styling-plotly-express/).

In a box plot created by `px.box`, the distribution of the column given as `y` argument is represented.
<!-- #endregion -->

```python
import plotly.express as px
Expand Down
45 changes: 41 additions & 4 deletions doc/python/ecdf-plots.md
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,9 @@ jupyter:

### Overview

[Empirical cumulative distribution function plots](https://en.wikipedia.org/wiki/Empirical_distribution_function) are a way to visualize the distribution of a variabl, and Plotly Express has a built-in function, `px.ecdf()` to generate such plots. [Plotly Express](/python/plotly-express/) is the easy-to-use, high-level interface to Plotly, which [operates on a variety of types of data](/python/px-arguments/) and produces [easy-to-style figures](/python/styling-plotly-express/).
[Empirical cumulative distribution function plots](https://en.wikipedia.org/wiki/Empirical_distribution_function) are a way to visualize the distribution of a variable, and Plotly Express has a built-in function, `px.ecdf()` to generate such plots. [Plotly Express](/python/plotly-express/) is the easy-to-use, high-level interface to Plotly, which [operates on a variety of types of data](/python/px-arguments/) and produces [easy-to-style figures](/python/styling-plotly-express/).

Alternatives to ECDF plots for visualizing distributions include [histograms](https://plotly.com/python/histograms/), [violin plots](https://plotly.com/python/violin/), [box plots](https://plotly.com/python/box-plots/) and [strip charts](https://plotly.com/python/strip-charts/).

### Simple ECDF Plots

Expand Down Expand Up @@ -90,24 +92,30 @@ fig.show()

By default, the Y value represents the fraction of the data that is *at or below* the value on on the X axis. Setting `ecdfmode` to `"reversed"` reverses this, with the Y axis representing the fraction of the data *at or above* the X value. Setting `ecdfmode` to `"complementary"` plots `1-ECDF`, meaning that the Y values represent the fraction of the data *above* the X value.

In `standard` mode (the default), the right-most point is at 1 (or the total count/sum, depending on `ecdfnorm`) and the right-most point is above 0.

```python
import plotly.express as px
fig = px.ecdf(df, x=[1,2,3,4], markers=True, ecdfmode="standard",
title="ecdfmode='standard' (at or below X value, the default)")
title="ecdfmode='standard' (Y=fraction at or below X value, this the default)")
fig.show()
```

In `reversed` mode, the right-most point is at 1 (or the total count/sum, depending on `ecdfnorm`) and the left-most point is above 0.

```python
import plotly.express as px
fig = px.ecdf(df, x=[1,2,3,4], markers=True, ecdfmode="reversed",
title="ecdfmode='reversed' (at or above X value)")
title="ecdfmode='reversed' (Y=fraction at or above X value)")
fig.show()
```

In `complementary` mode, the right-most point is at 0 and no points are at 1 (or the total count/sum) per the definition of the CCDF as 1-ECDF, which has no point at 0.

```python
import plotly.express as px
fig = px.ecdf(df, x=[1,2,3,4], markers=True, ecdfmode="complementary",
title="ecdfmode='complementary' (above X value)")
title="ecdfmode='complementary' (Y=fraction above X value)")
fig.show()
```

Expand Down Expand Up @@ -140,6 +148,35 @@ fig = px.ecdf(df, x="total_bill", color="sex", markers=True, lines=False)
fig.show()
```

### Marginal Plots

ECDF plots also support [marginal plots](https://plotly.com/python/marginal-plots/)

```python
import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill", color="sex", markers=True, lines=False, marginal="histogram")
fig.show()
```

```python
import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill", color="sex", marginal="rug")
fig.show()
```

### Facets

ECDF Plots also support [faceting](https://plotly.com/python/facet-plots/)

```python
import plotly.express as px
df = px.data.tips()
fig = px.ecdf(df, x="total_bill", color="sex", facet_row="time", facet_col="day")
fig.show()
```

```python

```
24 changes: 20 additions & 4 deletions doc/python/graph-objects.md
Original file line number Diff line number Diff line change
Expand Up @@ -56,13 +56,19 @@ Graph objects have several benefits compared to plain Python dictionaries:
5. Graph object constructors and update methods accept "magic underscores" (e.g. `go.Figure(layout_title_text="The Title")` rather than `dict(layout=dict(title=dict(text="The Title")))`) for more compact code.
6. Graph objects support attached rendering (`.show()`) and exporting functions (`.write_image()`) that automatically invoke the appropriate functions from [the `plotly.io` module](https://plotly.com/python-api-reference/plotly.io.html).

### When to use Graph Objects Directly
### When to use Graph Objects vs Plotly Express

The recommended way to create figures is using the [functions in the plotly.express module](https://plotly.com/python-api-reference/), [collectively known as Plotly Express](/python/plotly-express/), which all return instances of `plotly.graph_objects.Figure`, so every figure produced with the plotly library, actually uses graph objects under the hood, unless manually constructed out of dictionaries.
The recommended way to create figures is using the [functions in the plotly.express module](https://plotly.com/python-api-reference/), [collectively known as Plotly Express](/python/plotly-express/), which all return instances of `plotly.graph_objects.Figure`, so every figure produced with the `plotly` library actually uses graph objects under the hood, unless manually constructed out of dictionaries.

That said, certain kinds of figures are not yet possible to create with Plotly Express, such as figures that use certain 3D trace-types like [`mesh`](/python/3d-mesh/) or [`isosurface`](/python/3d-isosurface-plots/). In addition, certain figures are cumbersome to create by starting from a figure created with Plotly Express, for example figures with [subplots of different types](/python/mixed-subplots/), [dual-axis plots](/python/multiple-axes/), or [faceted plots](/python/facet-plots/) with multiple different types of traces. To construct such figures, it can be easier to start from an empty `plotly.graph_objects.Figure` object (or one configured with subplots via the [make_subplots() function](/python/subplots/)) and progressively add traces and update attributes as above. Every `plotly` documentation page lists the Plotly Express option at the top if a Plotly Express function exists to make the kind of chart in question, and then the graph objects version below.

Note that the figures produced by Plotly Express **in a single function-call** are [easy to customize at creation-time](/python/styling-plotly-express/), and to [manipulate after creation](/python/creating-and-updating-figures/) using the `update_*` and `add_*` methods. The figures produced by Plotly Express can always be built from the ground up using graph objects, but this approach typically takes **5-100 lines of code rather than 1**. Here is a simple example of how to produce the same figure object from the same data, once with Plotly Express and once without. The data in this example is in "long form" but [Plotly Express also accepts data in "wide form"](/python/wide-form/) and the line-count savings from Plotly Express over graph objects are comparable. More complex figures such as [sunbursts](/python/sunburst-charts/), [parallel coordinates](/python/parallel-coordinates-plot/), [facet plots](/python/facet-plots/) or [animations](/python/animations/) require many more lines of figure-specific graph objects code, whereas switching from one representation to another with Plotly Express usually involves changing just a few characters.
Note that the figures produced by Plotly Express **in a single function-call** are [easy to customize at creation-time](/python/styling-plotly-express/), and to [manipulate after creation](/python/creating-and-updating-figures/) using the `update_*` and `add_*` methods.

### Comparing Graph Objects and Plotly Express

The figures produced by Plotly Express can always be built from the ground up using graph objects, but this approach typically takes **5-100 lines of code rather than 1**.

Here is a simple example of how to produce the same figure object from the same data, once with Plotly Express and once without. The data in this example is in "long form" but [Plotly Express also accepts data in "wide form"](/python/wide-form/) and the line-count savings from Plotly Express over graph objects are comparable. More complex figures such as [sunbursts](/python/sunburst-charts/), [parallel coordinates](/python/parallel-coordinates-plot/), [facet plots](/python/facet-plots/) or [animations](/python/animations/) require many more lines of figure-specific graph objects code, whereas switching from one representation to another with Plotly Express usually involves changing just a few characters.

```python
import pandas as pd
Expand All @@ -73,11 +79,17 @@ df = pd.DataFrame({
"Number Eaten": [2, 1, 3, 1, 3, 2],
})


# Plotly Express

import plotly.express as px

fig = px.bar(df, x="Fruit", y="Number Eaten", color="Contestant", barmode="group")
fig.show()


# Graph Objects

import plotly.graph_objects as go

fig = go.Figure()
Expand All @@ -88,4 +100,8 @@ fig.update_layout(legend_title_text = "Contestant")
fig.update_xaxes(title_text="Fruit")
fig.update_yaxes(title_text="Number Eaten")
fig.show()
```
```

```python

```
13 changes: 9 additions & 4 deletions doc/python/histograms.md
Original file line number Diff line number Diff line change
Expand Up @@ -36,14 +36,19 @@ jupyter:
thumbnail: thumbnail/histogram.jpg
---

In statistics, a [histogram](https://en.wikipedia.org/wiki/Histogram) is representation of the distribution of numerical data, where the data are binned and the count for each bin is represented. More generally, in plotly a histogram is an aggregated bar chart, with several possible aggregation functions (e.g. sum, average, count...).
<!-- #region -->
In statistics, a [histogram](https://en.wikipedia.org/wiki/Histogram) is representation of the distribution of numerical data, where the data are binned and the count for each bin is represented. More generally, in Plotly a histogram is an aggregated bar chart, with several possible aggregation functions (e.g. sum, average, count...) which can be used to visualize data on categorical and date axes as well as linear axes.

If you're looking instead for bar charts, i.e. representing *raw, unaggregated* data with rectangular

Alternatives to violin plots for visualizing distributions include [violin plots](https://plotly.com/python/violin/), [box plots](https://plotly.com/python/box-plots/), [ECDF plots](https://plotly.com/python/ecdf-plots/) and [strip charts](https://plotly.com/python/strip-charts/).

> If you're looking instead for bar charts, i.e. representing *raw, unaggregated* data with rectangular
bar, go to the [Bar Chart tutorial](/python/bar-charts/).

## Histograms with Plotly Express

[Plotly Express](/python/plotly-express/) is the easy-to-use, high-level interface to Plotly, which [operates on a variety of types of data](/python/px-arguments/) and produces [easy-to-style figures](/python/styling-plotly-express/).
<!-- #endregion -->

```python
import plotly.express as px
Expand Down Expand Up @@ -160,7 +165,7 @@ fig = px.histogram(df, x="total_bill", color="sex")
fig.show()
```

#### Using histfunc
#### Aggregating with other functions than `count`

For each bin of `x`, one can compute a function of data using `histfunc`. The argument of `histfunc` is the dataframe column given as the `y` argument. Below the plot shows that the average tip increases with the total bill.

Expand Down Expand Up @@ -193,7 +198,7 @@ fig.show()

#### Visualizing the distribution

With the `marginal` keyword, a subplot is drawn alongside the histogram, visualizing the distribution. See [the distplot page](https://plotly.com/python/distplot/)for more examples of combined statistical representations.
With the `marginal` keyword, a [marginal](https://plotly.com/python/marginal-plots/) is drawn alongside the histogram, visualizing the distribution. See [the distplot page](https://plotly.com/python/distplot/) for more examples of combined statistical representations.

```python
import plotly.express as px
Expand Down
56 changes: 56 additions & 0 deletions doc/python/line-and-scatter.md
Original file line number Diff line number Diff line change
Expand Up @@ -67,6 +67,15 @@ fig = px.scatter(df, x="sepal_width", y="sepal_length", color="species",
fig.show()
```

Color can be [continuous](https://plotly.com/python/colorscales/) as follows, or [discrete/categorical](https://plotly.com/python/discrete-color/) as above.

```python
import plotly.express as px
df = px.data.iris()
fig = px.scatter(df, x="sepal_width", y="sepal_length", color='petal_length')
fig.show()
```

The `symbol` argument can be mapped to a column as well. A [wide variety of symbols](https://plotly.com/python/marker-style/) are available.

```python
Expand Down Expand Up @@ -104,6 +113,53 @@ fig.update_traces(marker_size=10)
fig.show()
```

### Error Bars

Scatter plots support [error bars](https://plotly.com/python/error-bars/).

```python
import plotly.express as px
df = px.data.iris()
df["e"] = df["sepal_width"]/100
fig = px.scatter(df, x="sepal_width", y="sepal_length", color="species",
error_x="e", error_y="e")
fig.show()
```

### Marginal Distribution Plots

Scatter plots support [marginal distribution plots](https://plotly.com/python/marginal-plots/)

```python
import plotly.express as px
df = px.data.iris()
fig = px.scatter(df, x="sepal_length", y="sepal_width", marginal_x="histogram", marginal_y="rug")
fig.show()
```

### Facetting

Scatter plots support [faceting](https://plotly.com/python/facet-plots/).

```python
import plotly.express as px
df = px.data.tips()
fig = px.scatter(df, x="total_bill", y="tip", color="smoker", facet_col="sex", facet_row="time")
fig.show()
```

### Linear Regression and Other Trendlines

Scatter plots support [linear and non-linear trendlines](https://plotly.com/python/linear-fits/).

```python
import plotly.express as px

df = px.data.tips()
fig = px.scatter(df, x="total_bill", y="tip", trendline="ols")
fig.show()
```

## Line plots with Plotly Express

```python
Expand Down
Loading