CLN: ASV inference benchmark #18759

Merged (3 commits) on Dec 18, 2017

mroeschke
Member

  • Fixed flake8 warnings and removed star imports

  • Used params and setup_cache where possible
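
For readers unfamiliar with the two asv features named in the list above, here is a minimal sketch of how params and setup_cache fit together. asv runs setup_cache once rather than before every timing run, and passes its return value as the first argument to setup and to the timed methods, followed by one value from params. The class, method names, and data below are hypothetical stand-ins (plain Python, no pandas), not the code from this PR; the small driver at the bottom only imitates asv's call order so the sketch can run on its own.

```python
class ParseNumbersSketch(object):
    """Hypothetical asv-style benchmark; not taken from inference.py."""

    # asv runs every time_* method once per entry in params.
    params = ['int', 'float']
    param_names = ['kind']

    def setup_cache(self):
        # Expensive data is built once and handed to setup and the
        # time_* methods as their first argument.
        return [str(i) for i in range(1000)]

    def setup(self, strings, kind):
        # Cheap per-parameter preparation.
        self.convert = int if kind == 'int' else float

    def time_parse(self, strings, kind):
        return [self.convert(s) for s in strings]


def run_like_asv(bench_cls):
    """Rough imitation of asv's call order, for illustration only."""
    cache = bench_cls().setup_cache()
    results = {}
    for kind in bench_cls.params:
        bench = bench_cls()
        bench.setup(cache, kind)
        results[kind] = bench.time_parse(cache, kind)
    return results


results = run_like_asv(ParseNumbersSketch)
```

Note that setup_cache itself receives no parameter values, so anything that depends on the parameter still belongs in setup.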

asv dev -b ^inference
· Discovering benchmarks
· Running 13 total benchmarks (1 commits * 1 environments * 13 benchmarks)
[  0.00%] ·· Building for existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[  0.00%] ·· Benchmarking existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[  0.00%] ··· Setting up /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/inference.py:105
[  7.69%] ··· Running inference.MaybeConvertNumeric.time_convert                                                                    2.55s
[ 15.38%] ··· Running inference.NumericInferOps.time_add                                                                               ok
[ 15.38%] ···· 
               ========= ========
                 dtype           
               --------- --------
                 int64    7.27ms 
                 int32    4.09ms 
                 uint32   2.33ms 
                float32   3.98ms 
                float64   7.20ms 
               ========= ========

[ 23.08%] ··· Running inference.NumericInferOps.time_divide                                                                            ok
[ 23.08%] ···· 
               ========= ========
                 dtype           
               --------- --------
                 int64    9.22ms 
                 int32    7.77ms 
                 uint32   6.34ms 
                float32   4.00ms 
                float64   7.25ms 
               ========= ========

[ 30.77%] ··· Running inference.NumericInferOps.time_modulo                                                                            ok
[ 30.77%] ···· 
               ========= ========
                 dtype           
               --------- --------
                 int64    17.3ms 
                 int32    9.96ms 
                 uint32   10.3ms 
                float32   8.27ms 
                float64   8.37ms 
               ========= ========

[ 38.46%] ··· Running inference.NumericInferOps.time_multiply                                                                          ok
[ 38.46%] ···· 
               ========= ========
                 dtype           
               --------- --------
                 int64    7.18ms 
                 int32    4.12ms 
                 uint32   2.42ms 
                float32   4.08ms 
                float64   7.15ms 
               ========= ========

[ 46.15%] ··· Running inference.NumericInferOps.time_subtract                                                                          ok
[ 46.15%] ···· 
               ========= ========
                 dtype           
               --------- --------
                 int64    7.23ms 
                 int32    3.96ms 
                 uint32   2.36ms 
                float32   3.96ms 
                float64   7.12ms 
               ========= ========

[ 53.85%] ··· Running inference.ToNumeric.time_from_float                                                                              ok
[ 53.85%] ···· 
               ======== =======
                errors         
               -------- -------
                ignore   157μs 
                coerce   158μs 
               ======== =======

[ 61.54%] ··· Running inference.ToNumeric.time_from_numeric_str                                                                        ok
[ 61.54%] ···· 
               ======== ========
                errors          
               -------- --------
                ignore   8.08ms 
                coerce   8.05ms 
               ======== ========

[ 69.23%] ··· Running inference.ToNumeric.time_from_str                                                                                ok
[ 69.23%] ···· 
               ======== ========
                errors          
               -------- --------
                ignore   365μs  
                coerce   24.7ms 
               ======== ========

[ 76.92%] ··· Running inference.ToNumericDowncast.time_downcast                                                                        ok
[ 76.92%] ···· 
               ============== ======== ========= ======== ========== ========
               --                                 downcast                   
               -------------- -----------------------------------------------
                   dtype        None    integer   signed   unsigned   float  
               ============== ======== ========= ======== ========== ========
                string-float   264ms     270ms    272ms     267ms     268ms  
                 string-int    597ms     621ms    640ms     612ms     605ms  
                string-nint    613ms     622ms    631ms     597ms     610ms  
                 datetime64    5.17ms    72.0ms   72.3ms    74.3ms    8.87ms 
                  int-list     65.0ms    92.7ms   92.5ms    94.5ms    66.8ms 
                   int32       26.2μs    27.2ms   26.9ms    27.9ms    1.35ms 
               ============== ======== ========= ======== ========== ========

[ 76.92%] ··· Setting up /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/inference.py:40
[ 84.62%] ··· Running inference.DateInferOps.time_add_timedeltas                                                                   29.1ms
[ 92.31%] ··· Running inference.DateInferOps.time_subtract_datetimes                                                               23.6ms
[100.00%] ··· Running inference.DateInferOps.time_timedelta_plus_datetime                                                           157ms

@codecov

codecov bot commented Dec 13, 2017

Codecov Report

Merging #18759 into master will decrease coverage by 0.02%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #18759      +/-   ##
==========================================
- Coverage   91.61%   91.59%   -0.03%     
==========================================
  Files         153      153              
  Lines       51363    51363              
==========================================
- Hits        47058    47047      -11     
- Misses       4305     4316      +11
Flag         Coverage Δ
#multiple    89.46% <ø> (-0.01%) ⬇️
#single      40.75% <ø> (-0.12%) ⬇️

Impacted Files            Coverage Δ
pandas/io/gbq.py          25% <0%> (-58.34%) ⬇️
pandas/util/testing.py    82.32% <0%> (-0.2%) ⬇️
pandas/core/frame.py      97.81% <0%> (-0.1%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 9705a48...d9fc6f4.

@codecov

codecov bot commented Dec 13, 2017

Codecov Report

Merging #18759 into master will increase coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #18759      +/-   ##
==========================================
+ Coverage   91.61%   91.61%   +<.01%     
==========================================
  Files         153      154       +1     
  Lines       51363    51430      +67     
==========================================
+ Hits        47058    47120      +62     
- Misses       4305     4310       +5
Flag         Coverage Δ
#multiple    89.48% <ø> (+0.02%) ⬆️
#single      40.83% <ø> (-0.04%) ⬇️

Impacted Files                      Coverage Δ
pandas/io/gbq.py                    25% <0%> (-58.34%) ⬇️
pandas/core/frame.py                97.68% <0%> (-0.23%) ⬇️
pandas/core/internals.py            94.42% <0%> (-0.02%) ⬇️
pandas/core/generic.py              95.9% <0%> (ø) ⬆️
pandas/core/apply.py                99.42% <0%> (ø)
pandas/core/indexes/base.py         96.44% <0%> (ø) ⬆️
pandas/core/series.py               94.82% <0%> (ø) ⬆️
pandas/core/indexes/datetimes.py    95.71% <0%> (+0.01%) ⬆️
pandas/core/sparse/frame.py         94.8% <0%> (+0.02%) ⬆️
pandas/core/indexes/interval.py     93.83% <0%> (+0.03%) ⬆️
... and 3 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 9705a48...8c2f3f6.

@jreback added the "Benchmark" label (Performance (ASV) benchmarks) on Dec 13, 2017
import pandas as pd
import numpy as np
import pandas.util.testing as tm
import pandas._libs.lib as lib
Contributor

So this is not backward compatible with versions before 0.20, but I think that's OK for now.

Member Author

Do you recall what this import would be pre-0.20?

Contributor

from pandas import lib (this is in fact what pandas/lib.py does now, but that will get blown away in 0.22)

Member

You can import lib from pandas_vb_common.py; the backward compatibility is handled there.

# from GH 7332
goal_time = 0.2
params = ['int64', 'int32', 'uint32', 'float32', 'float64']
Contributor

Add uint64. You can add int16, int8, uint16, and uint8 as well to cover the bases.

Contributor

We may want to define these numeric dtypes elsewhere and import them here (so we are consistent across the asv benchmarks), and probably do the same for datetimelike dtypes as well.

@jreback added this to the 0.22.0 milestone on Dec 13, 2017

param_names = ['dtype', 'downcast']
params = [['string-float', 'string-int', 'string-nint', 'datetime64',
           'int-list', 'int32'],
          [None, 'integer', 'signed', 'unsigned', 'float']]

N = 500000
N2 = int(N / 2)
Member

This was there for a good reason: you can't multiply a list by a float (I find it a bit strange that this didn't fail for you).
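
The point about floats can be checked directly. Under Python 3 (or with from __future__ import division), N / 2 is a float, and repeating a list by a float raises TypeError; int(N / 2) or N // 2 restores an integer count. Under Python 2 without that import, N / 2 is already an int, which is a likely reason the change ran without error. The list contents below are made up for illustration.

```python
N = 500000

half_float = N / 2       # true division yields a float on Python 3
half_int = int(N / 2)    # the form kept in the benchmark

try:
    ['1'] * half_float
    float_repeat_failed = False
except TypeError:
    # Sequence repetition only accepts integers.
    float_repeat_failed = True

repeated = ['1'] * half_int
```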

@mroeschke
Member Author

Created a new numeric_dtypes variable in pandas_vb_common.py holding the NumPy numeric types, adjusted the import, and added back int(N / 2) for compatibility.

asv dev -b ^inference
· Discovering benchmarks
· Running 13 total benchmarks (1 commits * 1 environments * 13 benchmarks)
[  0.00%] ·· Building for existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[  0.00%] ·· Benchmarking existing-py_home_matt_anaconda_envs_pandas_dev_bin_python
[  0.00%] ··· Setting up /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/inference.py:43
[  7.69%] ··· Running inference.DateInferOps.time_add_timedeltas                                                        35.0ms
[ 15.38%] ··· Running inference.DateInferOps.time_subtract_datetimes                                                    23.7ms
[ 23.08%] ··· Running inference.DateInferOps.time_timedelta_plus_datetime                                                156ms
[ 23.08%] ··· Setting up /home/matt/Projects/pandas-mroeschke/asv_bench/benchmarks/inference.py:108
[ 30.77%] ··· Running inference.MaybeConvertNumeric.time_convert                                                         2.58s
[ 38.46%] ··· Running inference.NumericInferOps.time_add                                                                    ok
[ 38.46%] ···· 
               ======================== ========
                        dtype                   
               ------------------------ --------
                 <type 'numpy.int64'>    7.21ms 
                 <type 'numpy.int32'>    3.98ms 
                <type 'numpy.uint32'>    2.31ms 
                <type 'numpy.uint64'>    4.16ms 
                <type 'numpy.float32'>   3.89ms 
                <type 'numpy.float64'>   7.12ms 
                 <type 'numpy.int16'>    1.84ms 
                 <type 'numpy.int8'>     1.36ms 
                <type 'numpy.uint16'>    1.97ms 
                 <type 'numpy.uint8'>    1.54ms 
               ======================== ========

[ 46.15%] ··· Running inference.NumericInferOps.time_divide                                                                 ok
[ 46.15%] ···· 
               ======================== ========
                        dtype                   
               ------------------------ --------
                 <type 'numpy.int64'>    9.17ms 
                 <type 'numpy.int32'>    7.76ms 
                <type 'numpy.uint32'>    6.36ms 
                <type 'numpy.uint64'>    6.52ms 
                <type 'numpy.float32'>   4.02ms 
                <type 'numpy.float64'>   7.21ms 
                 <type 'numpy.int16'>    6.97ms 
                 <type 'numpy.int8'>     6.83ms 
                <type 'numpy.uint16'>    7.02ms 
                 <type 'numpy.uint8'>    6.98ms 
               ======================== ========

[ 53.85%] ··· Running inference.NumericInferOps.time_modulo                                                                 ok
[ 53.85%] ···· 
               ======================== ========
                        dtype                   
               ------------------------ --------
                 <type 'numpy.int64'>    17.6ms 
                 <type 'numpy.int32'>    10.1ms 
                <type 'numpy.uint32'>    9.58ms 
                <type 'numpy.uint64'>    13.9ms 
                <type 'numpy.float32'>   8.22ms 
                <type 'numpy.float64'>   8.31ms 
                 <type 'numpy.int16'>    8.61ms 
                 <type 'numpy.int8'>     15.1ms 
                <type 'numpy.uint16'>    8.19ms 
                 <type 'numpy.uint8'>    14.3ms 
               ======================== ========

[ 61.54%] ··· Running inference.NumericInferOps.time_multiply                                                               ok
[ 61.54%] ···· 
               ======================== ========
                        dtype                   
               ------------------------ --------
                 <type 'numpy.int64'>    7.25ms 
                 <type 'numpy.int32'>    4.12ms 
                <type 'numpy.uint32'>    2.39ms 
                <type 'numpy.uint64'>    4.11ms 
                <type 'numpy.float32'>   4.06ms 
                <type 'numpy.float64'>   7.08ms 
                 <type 'numpy.int16'>    1.51ms 
                 <type 'numpy.int8'>     1.09ms 
                <type 'numpy.uint16'>    1.52ms 
                 <type 'numpy.uint8'>    1.16ms 
               ======================== ========

[ 69.23%] ··· Running inference.NumericInferOps.time_subtract                                                               ok
[ 69.23%] ···· 
               ======================== ========
                        dtype                   
               ------------------------ --------
                 <type 'numpy.int64'>    7.20ms 
                 <type 'numpy.int32'>    4.05ms 
                <type 'numpy.uint32'>    2.46ms 
                <type 'numpy.uint64'>    4.06ms 
                <type 'numpy.float32'>   3.93ms 
                <type 'numpy.float64'>   7.31ms 
                 <type 'numpy.int16'>    1.62ms 
                 <type 'numpy.int8'>     1.12ms 
                <type 'numpy.uint16'>    1.84ms 
                 <type 'numpy.uint8'>    1.23ms 
               ======================== ========

[ 76.92%] ··· Running inference.ToNumeric.time_from_float                                                                   ok
[ 76.92%] ···· 
               ======== =======
                errors         
               -------- -------
                ignore   166μs 
                coerce   156μs 
               ======== =======

[ 84.62%] ··· Running inference.ToNumeric.time_from_numeric_str                                                             ok
[ 84.62%] ···· 
               ======== ========
                errors          
               -------- --------
                ignore   8.47ms 
                coerce   8.93ms 
               ======== ========

[ 92.31%] ··· Running inference.ToNumeric.time_from_str                                                                     ok
[ 92.31%] ···· 
               ======== ========
                errors          
               -------- --------
                ignore   357μs  
                coerce   25.6ms 
               ======== ========

[100.00%] ··· Running inference.ToNumericDowncast.time_downcast                                                             ok
[100.00%] ···· 
               ============== ======== ========= ======== ========== ========
               --                                 downcast                   
               -------------- -----------------------------------------------
                   dtype        None    integer   signed   unsigned   float  
               ============== ======== ========= ======== ========== ========
                string-float   270ms     268ms    264ms     271ms     283ms  
                 string-int    605ms     630ms    640ms     634ms     620ms  
                string-nint    622ms     643ms    686ms     650ms     611ms  
                 datetime64    5.38ms    76.7ms   72.3ms    85.6ms    9.01ms 
                  int-list     64.6ms    94.3ms   93.2ms    95.3ms    66.8ms 
                   int32       26.4μs    27.1ms   27.2ms    28.0ms    1.42ms 
               ============== ======== ========= ======== ========== ========
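
The dtype labels in this second run are class reprs such as <type 'numpy.int64'> rather than strings like int64, which is what one would expect if the benchmark params are now NumPy type objects. A minimal sketch of that arrangement follows. The list order is taken from the table rows in the run above; the class name, array contents, and size are assumptions for illustration, and the PR describes the list as living in pandas_vb_common.py rather than next to the benchmark.

```python
import numpy as np

# Assumed contents: order copied from the asv tables, not from the source file.
numeric_dtypes = [np.int64, np.int32, np.uint32, np.uint64,
                  np.float32, np.float64,
                  np.int16, np.int8, np.uint16, np.uint8]


class NumericOpsSketch(object):
    """Hypothetical asv-style benchmark parameterized over dtype objects."""

    params = numeric_dtypes
    param_names = ['dtype']

    def setup(self, dtype):
        n = 10000  # illustrative size only
        self.a = np.ones(n, dtype=dtype)
        self.b = np.ones(n, dtype=dtype)

    def time_add(self, dtype):
        return self.a + self.b


# asv labels each row with repr(param), hence the type reprs in the tables.
labels = [repr(dtype) for dtype in NumericOpsSketch.params]

bench = NumericOpsSketch()
bench.setup(np.int8)
total = bench.time_add(np.int8)
```

Passing strings such as 'int64' instead would keep the shorter labels seen in the first run, at the cost of converting them to dtypes inside setup.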

try:
    import pandas._libs.lib as lib
except ImportError:
    import pandas.lib as lib
Member

I would keep this in pandas_vb_common; you can just import lib from there, so it can be reused in multiple files.
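
One way to picture such a shared shim is a small helper that tries module paths in order and returns the first one that imports. This is a hypothetical generalization of the try/except shown above, not the actual contents of pandas_vb_common.py; import_first is an invented name. The last line substitutes standard-library modules so the sketch runs without pandas.

```python
import importlib


def import_first(candidates):
    """Return the first importable module from candidates (hypothetical helper)."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError('none of %r could be imported' % (candidates,))


# In a shared module one might write (assumed usage, requires pandas):
#     lib = import_first(['pandas._libs.lib', 'pandas.lib'])
# Stand-in using the standard library so this runs anywhere:
mod = import_first(['module_that_does_not_exist_xyz', 'json'])
```

Benchmark files would then import the resolved name from the shared module instead of repeating the fallback themselves.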

@mroeschke
Member Author

@jorisvandenbossche Changed the lib import to come from pandas_vb_common.py.

@jreback merged commit 7a0ee19 into pandas-dev:master on Dec 18, 2017
@jreback
Contributor

jreback commented Dec 18, 2017

thanks!

@mroeschke deleted the asv_clean_inference branch on December 18, 2017, 18:51