PERF: expand asv benchmark coverage #24214

qwhelan · 2018-12-10T22:03:08Z

Continuing on the approach in #23935, I've identified significant functions with no benchmark coverage in the asv suite - this time extending the line tracing analysis down to the Cython layer. Of interest:

merge_asof() is only benchmarked in the default direction, despite the other two directions having unique Cython codepaths
Reindex operations on a PeriodIndex are currently largely uncovered
CategoricalIndex is missing a lot of basic cases
There's a chunk of Cython code specifically for parsing Quarter strings that now has a benchmark
Several typos in the Timestamp benchmarks have been fixed
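
For context on the merge_asof() item: `pd.merge_asof` accepts `direction='backward'` (the default), `'forward'`, or `'nearest'`. A minimal sketch of an asv-style benchmark parametrized over all three directions might look like the following. The class name, frame sizes, and column names are illustrative assumptions, not the code added in this PR.

```python
import numpy as np
import pandas as pd


class MergeAsofByDirection:
    # Hypothetical benchmark: asv calls setup() and each time_* method
    # once per value listed in params.
    params = ['backward', 'forward', 'nearest']
    param_names = ['direction']

    def setup(self, direction):
        rng = np.random.RandomState(0)
        n = 10000  # illustrative size, not taken from the PR
        # merge_asof requires both frames to be sorted on the key.
        self.left = pd.DataFrame({
            'time': np.sort(rng.randint(0, 10 * n, n)),
            'left_val': rng.randn(n),
        })
        self.right = pd.DataFrame({
            'time': np.sort(rng.randint(0, 10 * n, n)),
            'right_val': rng.randn(n),
        })

    def time_on_int(self, direction):
        # Only this call is timed; setup() runs outside the timing loop.
        pd.merge_asof(self.left, self.right, on='time', direction=direction)
```

Outside asv, the class can be exercised by calling `setup` and `time_on_int` directly for each entry in `params`.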

I'll update this comment when the asv comparison runs complete.

closes #xxxx
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

pep8speaks · 2018-12-10T22:03:13Z

Hello @qwhelan! Thanks for updating the PR.

In the file asv_bench/benchmarks/join_merge.py, following are the PEP8 issues :

Line 284:80: E501 line too long (90 > 79 characters)
Line 332:80: E501 line too long (83 > 79 characters)

Comment last updated on January 08, 2019 at 21:23 Hours UTC

codecov · 2018-12-10T22:18:25Z

Codecov Report

Merging #24214 into master will decrease coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #24214      +/-   ##
==========================================
- Coverage   92.37%   92.37%   -0.01%     
==========================================
  Files         166      166              
  Lines       52315    52315              
==========================================
- Hits        48328    48327       -1     
- Misses       3987     3988       +1

Flag	Coverage Δ
#multiple	`90.79% <ø> (ø)`	⬆️
#single	`43.06% <ø> (ø)`	⬆️

Impacted Files	Coverage Δ
pandas/util/testing.py	`88% <0%> (-0.1%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1ae466c...014fea3. Read the comment docs.

codecov · 2018-12-10T22:18:25Z

Codecov Report

Merging #24214 into master will decrease coverage by 49.2%.
The diff coverage is n/a.

@@             Coverage Diff             @@
##           master   #24214       +/-   ##
===========================================
- Coverage   92.21%      43%   -49.21%     
===========================================
  Files         162      162               
  Lines       51763    51763               
===========================================
- Hits        47733    22261    -25472     
- Misses       4030    29502    +25472

Flag	Coverage Δ
#multiple	`?`
#single	`43% <ø> (ø)`	⬆️

Impacted Files	Coverage Δ
pandas/io/formats/latex.py	`0% <0%> (-100%)`	⬇️
pandas/core/categorical.py	`0% <0%> (-100%)`	⬇️
pandas/io/sas/sas_constants.py	`0% <0%> (-100%)`	⬇️
pandas/tseries/plotting.py	`0% <0%> (-100%)`	⬇️
pandas/tseries/converter.py	`0% <0%> (-100%)`	⬇️
pandas/io/formats/html.py	`0% <0%> (-98.62%)`	⬇️
pandas/core/groupby/categorical.py	`0% <0%> (-95.46%)`	⬇️
pandas/io/sas/sas7bdat.py	`0% <0%> (-91.17%)`	⬇️
pandas/io/sas/sas_xport.py	`0% <0%> (-90.15%)`	⬇️
pandas/core/tools/numeric.py	`10.44% <0%> (-89.56%)`	⬇️
... and 119 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 91802fb...5549d6b. Read the comment docs.

WillAyd

Here's a quick pass of things to look at. There are quite a few benchmarks in here, so I may come back with more later.

asv_bench/benchmarks/algorithms.py

asv_bench/benchmarks/categoricals.py

jreback · 2019-01-05T15:37:04Z

can you merge master and update

qwhelan · 2019-01-06T03:40:52Z

@WillAyd Rebased and incorporated your comments

jreback · 2019-01-06T16:33:19Z

some linting issues: https://dev.azure.com/pandas-dev/pandas/_build/results?buildId=6625

qwhelan · 2019-01-06T22:42:39Z

@jreback I've added a small asv suite fix - a recent change to pandas_vb_common.py means the suite will completely fail on init for any commits prior to ~July when the EA Dtypes were added.

WillAyd · 2019-01-07T17:19:27Z

asv_bench/benchmarks/algorithms.py

+    param_names = ['dtype']
+
+    def setup(self, dtype):
+        N = 10**5 * 5


What was the reason for changing this line?

Intent was to make Duplicated and DuplicatedUniqueIndex comparable in terms of N (the former is repeated 5x). I'll remove it as it's confusing.

WillAyd · 2019-01-07T17:20:27Z

asv_bench/benchmarks/algorithms.py

+class Quantile(object):
+    params = [[0, 0.5, 1],
+              ['linear', 'nearest', 'lower', 'higher', 'midpoint'],
+              ['float', 'int']]


Maybe add uint here as well?
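
To make the suggestion concrete, here is a hedged sketch of how a `uint` case could be added to the parameter grid. Only the `params` list (minus `'uint'`) appears in the diff above; the class name, `param_names`, data size, and method bodies are assumptions for illustration.

```python
import numpy as np
import pandas as pd


class QuantileSketch:
    # asv takes the cross product of these lists: 3 * 5 * 3 = 45 cases.
    params = [[0, 0.5, 1],
              ['linear', 'nearest', 'lower', 'higher', 'midpoint'],
              ['float', 'int', 'uint']]
    param_names = ['quantile', 'interpolation', 'dtype']

    def setup(self, quantile, interpolation, dtype):
        n = 10**4  # illustrative size, not taken from the PR
        data = {
            'int': np.arange(n),
            'uint': np.arange(n).astype(np.uint64),
            'float': np.random.randn(n),
        }
        self.s = pd.Series(data[dtype])

    def time_quantile(self, quantile, interpolation, dtype):
        self.s.quantile(quantile, interpolation=interpolation)
```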

jreback

lgtm. one comment.

jreback · 2019-01-08T13:10:06Z

asv_bench/benchmarks/algorithms.py

@@ -113,4 +132,21 @@ def time_series_dates(self, df):
        hashing.hash_pandas_object(df['dates'])


+class Quantile(object):


can remove the Match one; was removed in 0.23.0 i think.

jreback · 2019-01-09T12:19:46Z

lgtm. over to you @WillAyd

WillAyd · 2019-01-09T15:54:48Z

Thanks @qwhelan !

qwhelan force-pushed the missing_asv branch 3 times, most recently from 6105284 to 5549d6b December 10, 2018 22:18

WillAyd requested changes Dec 11, 2018

WillAyd added the Benchmark label (Performance (ASV) benchmarks) Dec 11, 2018

qwhelan force-pushed the missing_asv branch from 5549d6b to 56010b8 January 6, 2019 03:39

qwhelan force-pushed the missing_asv branch from 56010b8 to 99e9ad8 January 6, 2019 22:24

qwhelan force-pushed the missing_asv branch from c5dcc9f to 8649518 January 7, 2019 01:01

WillAyd reviewed Jan 7, 2019

qwhelan force-pushed the missing_asv branch from 8649518 to 014fea3 January 7, 2019 23:21

jreback requested changes Jan 8, 2019

qwhelan added 12 commits January 8, 2019 12:07

PERF: add asv benchmarks for quantile()

5bba279

PERF: add asv benchmark for plot(table=True)

2cec9de

PERF: add asv benchmark for pivot_table options

070c1a5

PERF: add asv benchmark for concat() with mixed ndims

cd6caa4

PERF: add asv benchmarks for crosstab()

d650151

PERF: benchmark CategoricalIndex operations

8cc56dd

PERF: benchmark removing categories

92bd49d

PERF: benchmark reindexing for CategoricalIndex

92e97a7

PERF: add coverage for non-default directions in merge_asof

bf30fe7

PERF: fix Timestamp asv benchmark typos and add missing ones

9cf1ce4

PERF: benchmark PeriodIndex pad/backfill

1a928d8

PERF: expand UInt64Index benchmark coverage

35ae1fa

qwhelan added 3 commits January 8, 2019 12:07

PERF: benchmark to_datetime quarter parsing

60f19cc

CLN: fix import preventing asv from running for old commits

2d121e2

CLN: remove Match asv benchmark

ce6bed2

qwhelan force-pushed the missing_asv branch from 014fea3 to ce6bed2 January 8, 2019 21:23

jreback added this to the 0.24.0 milestone Jan 9, 2019

jreback approved these changes Jan 9, 2019

WillAyd approved these changes Jan 9, 2019

WillAyd merged commit 46a31c9 into pandas-dev:master Jan 9, 2019

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

PERF: expand asv benchmark coverage (pandas-dev#24214)

525796c

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

PERF: expand asv benchmark coverage (pandas-dev#24214)

683f7a4
