[pandas_profiling](https://github.com/pandas-profiling/pandas-profiling) - Descriptive statistics using `ProfileReport`.
[sklearn_pandas](https://github.com/scikit-learn-contrib/sklearn-pandas) - Helpful `DataFrameMapper` class.
[missingno](https://github.com/ResidentMario/missingno) - Missing data visualization.
+ [rainbow-csv](https://marketplace.visualstudio.com/items?itemName=mechatroner.rainbow-csv) - Plugin to display .csv files with nice colors.

#### Environment and Jupyter
[General tricks](https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/), [Clean Coding (video)](https://www.youtube.com/watch?v=yXGCKqo5cEY)
@@ -25,7 +26,9 @@ Python debugger (pdb) - [blog post](https://www.blog.pythonlibrary.org/2018/10/1
[pivottablejs](https://github.com/nicolaskruchten/jupyter_pivottablejs) - Drag n drop Pivot Tables and Charts for jupyter notebooks.
[itables](https://github.com/mwouts/itables) - Interactive tables in Jupyter.

- #### Pandas Additions
+ #### Pandas Alternatives and Additions
+ [modin](https://github.com/modin-project/modin) - Parallelization library for faster pandas `DataFrame`.
+ [vaex](https://github.com/vaexio/vaex) - Out-of-Core DataFrames.
[xarray](https://github.com/pydata/xarray/) - Extends pandas to n-dimensional arrays.
[swifter](https://github.com/jmcarpenter2/swifter) - Apply any function to a pandas dataframe faster.
[pandas_flavor](https://github.com/Zsailer/pandas_flavor) - Write custom accessors like `.str` and `.dt`.
@@ -49,7 +52,6 @@ Python debugger (pdb) - [blog post](https://www.blog.pythonlibrary.org/2018/10/1
[dask](https://github.com/dask/dask), [dask-ml](http://ml.dask.org/) - Pandas `DataFrame` for big data and machine learning library, [resources](https://matthewrocklin.com/blog//work/2018/07/17/dask-dev), [talk1](https://www.youtube.com/watch?v=ccfsbuqsjgI), [talk2](https://www.youtube.com/watch?v=RA_2qdipVng), [notebooks](https://github.com/dask/dask-ec2/tree/master/notebooks), [videos](https://www.youtube.com/user/mdrocklin).
[dask-gateway](https://github.com/jcrist/dask-gateway) - Managing dask clusters.
[turicreate](https://github.com/apple/turicreate) - Helpful `SFrame` class for out-of-memory dataframes.
- [modin](https://github.com/modin-project/modin) - Parallelization library for faster pandas `DataFrame`.
[h2o](https://github.com/h2oai/h2o-3) - Helpful `H2OFrame` class for out-of-memory dataframes.
[datatable](https://github.com/h2oai/datatable) - Data Table for big data support.
[cuDF](https://github.com/rapidsai/cudf) - GPU DataFrame Library.
@@ -58,7 +60,6 @@ Python debugger (pdb) - [blog post](https://www.blog.pythonlibrary.org/2018/10/1
[bottleneck](https://github.com/kwgoodman/bottleneck) - Fast NumPy array functions written in C.
[bcolz](https://github.com/Blosc/bcolz) - A columnar data container that can be compressed.
[cupy](https://github.com/cupy/cupy) - NumPy-like API accelerated with CUDA.
- [vaex](https://github.com/vaexio/vaex) - Out-of-Core DataFrames.
[petastorm](https://github.com/uber/petastorm) - Data access library for parquet files by Uber.
[zappy](https://github.com/lasersonlab/zappy) - Distributed numpy arrays.