PERF: expand asv benchmark coverage #24214

Merged: 15 commits on Jan 9, 2019

qwhelan
Contributor

@qwhelan qwhelan commented Dec 10, 2018

Continuing on the approach in #23935, I've identified significant functions with no benchmark coverage in the asv suite - this time extending the line tracing analysis down to the Cython layer. Of interest:

  • merge_asof() is only benchmarked in the default direction, despite the other two directions having unique Cython codepaths
  • Reindex operations on a PeriodIndex are currently largely uncovered
  • CategoricalIndex is missing a lot of basic cases
  • There's a chunk of Cython code specifically for parsing Quarter strings that now has a benchmark
  • Several typos in the Timestamp benchmarks
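As an illustration of the first point, a benchmark that exercises all three merge_asof() directions could be parametrized roughly as follows. This is a minimal sketch, not the code added in this PR: the class name, frame sizes, and column names are assumptions chosen for the example. It relies only on the asv convention that `params`/`param_names` are passed to `setup()` and to each `time_*` method.

```python
import numpy as np
import pandas as pd


class MergeAsofDirection:
    # Hypothetical benchmark class; asv passes each value in `params`
    # to setup() and to every time_* method.
    params = ['backward', 'forward', 'nearest']
    param_names = ['direction']

    def setup(self, direction):
        rng = np.random.default_rng(0)
        n = 10_000
        # merge_asof requires both frames to be sorted by the `on` key.
        self.left = pd.DataFrame({
            'time': np.sort(rng.integers(0, n * 10, size=n)),
            'left_value': rng.standard_normal(n),
        })
        self.right = pd.DataFrame({
            'time': np.sort(rng.integers(0, n * 10, size=n)),
            'right_value': rng.standard_normal(n),
        })

    def time_on_int(self, direction):
        # 'backward' is the default; the other two directions are the
        # ones described above as previously unbenchmarked.
        pd.merge_asof(self.left, self.right, on='time',
                      direction=direction)
```

Because asv benchmark classes are plain Python classes, the sketch can also be run by hand by calling `setup()` and then `time_on_int()` for each direction.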

I'll update this comment when the asv comparison runs complete.

  • closes #xxxx
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@pep8speaks
pep8speaks commented Dec 10, 2018

Hello @qwhelan! Thanks for updating the PR.

Line 284:80: E501 line too long (90 > 79 characters)
Line 332:80: E501 line too long (83 > 79 characters)

Comment last updated on January 08, 2019 at 21:23 Hours UTC

@qwhelan qwhelan force-pushed the missing_asv branch 3 times, most recently from 6105284 to 5549d6b Compare December 10, 2018 22:18
@codecov
codecov bot commented Dec 10, 2018

Codecov Report

Merging #24214 into master will decrease coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #24214      +/-   ##
==========================================
- Coverage   92.37%   92.37%   -0.01%     
==========================================
  Files         166      166              
  Lines       52315    52315              
==========================================
- Hits        48328    48327       -1     
- Misses       3987     3988       +1

Flag        Coverage Δ
#multiple   90.79% <ø> (ø) ⬆️
#single     43.06% <ø> (ø) ⬆️

Impacted Files           Coverage Δ
pandas/util/testing.py   88% <0%> (-0.1%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 1ae466c...014fea3.

@codecov
codecov bot commented Dec 10, 2018

Codecov Report

Merging #24214 into master will decrease coverage by 49.2%.
The diff coverage is n/a.

@@             Coverage Diff             @@
##           master   #24214       +/-   ##
===========================================
- Coverage   92.21%      43%   -49.21%     
===========================================
  Files         162      162               
  Lines       51763    51763               
===========================================
- Hits        47733    22261    -25472     
- Misses       4030    29502    +25472

Flag        Coverage Δ
#multiple   ?
#single     43% <ø> (ø) ⬆️

Impacted Files                       Coverage Δ
pandas/io/formats/latex.py           0% <0%> (-100%) ⬇️
pandas/core/categorical.py           0% <0%> (-100%) ⬇️
pandas/io/sas/sas_constants.py       0% <0%> (-100%) ⬇️
pandas/tseries/plotting.py           0% <0%> (-100%) ⬇️
pandas/tseries/converter.py          0% <0%> (-100%) ⬇️
pandas/io/formats/html.py            0% <0%> (-98.62%) ⬇️
pandas/core/groupby/categorical.py   0% <0%> (-95.46%) ⬇️
pandas/io/sas/sas7bdat.py            0% <0%> (-91.17%) ⬇️
pandas/io/sas/sas_xport.py           0% <0%> (-90.15%) ⬇️
pandas/core/tools/numeric.py         10.44% <0%> (-89.56%) ⬇️
... and 119 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 91802fb...5549d6b.

Member

@WillAyd WillAyd left a comment

Here's a quick pass of things to look at. There are quite a few benchmarks in here, so I may come back with more later.

@WillAyd WillAyd added the Benchmark Performance (ASV) benchmarks label Dec 11, 2018
@jreback
Contributor

jreback commented Jan 5, 2019

can you merge master and update

@qwhelan
Contributor Author

qwhelan commented Jan 6, 2019

@WillAyd Rebased and incorporated your comments

@jreback
Contributor

jreback commented Jan 6, 2019

@qwhelan
Contributor Author

qwhelan commented Jan 6, 2019

@jreback I've added a small asv suite fix - a recent change to pandas_vb_common.py means the suite will completely fail on init for any commits prior to ~July when the EA Dtypes were added.
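The thread does not show the actual change to pandas_vb_common.py. One way such a guard could look, assuming the failure comes from a module-level reference to extension dtypes that older pandas versions do not define, is sketched below. The list name `extension_dtypes` and its contents are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical module-level dtype list used to parametrize benchmarks.
# On pandas versions that predate the extension-array dtypes, attribute
# access such as pd.Int8Dtype raises AttributeError at import time,
# which would abort the whole benchmark suite.
try:
    extension_dtypes = [
        pd.Int8Dtype, pd.Int16Dtype, pd.Int32Dtype, pd.Int64Dtype,
        pd.UInt8Dtype, pd.UInt16Dtype, pd.UInt32Dtype, pd.UInt64Dtype,
    ]
except AttributeError:
    # Older pandas: fall back to an empty list so the remaining
    # benchmarks can still be imported and run.
    extension_dtypes = []
```

With a fallback like this, benchmarks parametrized over the list simply have nothing to run on old commits instead of failing during import.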

    param_names = ['dtype']

    def setup(self, dtype):
        N = 10**5 * 5
Member

What was the reason for changing this line?

Contributor Author

Intent was to make Duplicated and DuplicatedUniqueIndex comparable in terms of N (the former is repeated 5x). I'll remove it as it's confusing.

Contributor Author

Done

class Quantile(object):
    params = [[0, 0.5, 1],
              ['linear', 'nearest', 'lower', 'higher', 'midpoint'],
              ['float', 'int']]
Member

Maybe add uint here as well?

Contributor Author

Done
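For reference, a Quantile benchmark with the suggested uint case might look like the sketch below. Only the `params` list is taken from the snippet under review; the `param_names`, the data setup, and the timed method are assumptions and may differ from what was merged.

```python
import numpy as np
import pandas as pd


class Quantile:
    params = [[0, 0.5, 1],
              ['linear', 'nearest', 'lower', 'higher', 'midpoint'],
              ['float', 'int', 'uint']]
    # Assumed names; asv uses them to label the parameter grid.
    param_names = ['quantile', 'interpolation', 'dtype']

    def setup(self, quantile, interpolation, dtype):
        n = 10**5
        # One array per dtype label, so each case is timed on
        # comparable data.
        data = {
            'int': np.arange(n),
            'uint': np.arange(n).astype(np.uint64),
            'float': np.random.randn(n),
        }
        self.series = pd.Series(data[dtype])

    def time_quantile(self, quantile, interpolation, dtype):
        self.series.quantile(quantile, interpolation=interpolation)
```

asv runs the timed method once per combination of the three parameter lists, so adding 'uint' increases the grid from 30 to 45 cases.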

Contributor

@jreback jreback left a comment

lgtm. one comment.

@@ -113,4 +132,21 @@ def time_series_dates(self, df):
         hashing.hash_pandas_object(df['dates'])


 class Quantile(object):
Contributor

Can remove the Match one; it was removed in 0.23.0, I think.

Contributor Author

Done

@jreback jreback added this to the 0.24.0 milestone Jan 9, 2019
@jreback
Contributor

jreback commented Jan 9, 2019

lgtm. over to you @WillAyd

@WillAyd WillAyd merged commit 46a31c9 into pandas-dev:master Jan 9, 2019
@WillAyd
Member

WillAyd commented Jan 9, 2019

Thanks @qwhelan !

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019