@@ -69,7 +69,7 @@ Visualizations - [Null Hypothesis Significance Testing (NHST)](https://rpsycholo
[pdpipe](https://github.com/shaypal5/pdpipe) - Pipelines for DataFrames.
[few](https://github.com/lacava/few) - Feature engineering wrapper for sklearn.
[skoot](https://github.com/tgsmith61591/skoot) - Pipeline helper functions.
- [categorical-encoding](https://github.com/scikit-learn-contrib/categorical-encoding) - Categorical encoding of variables.
+ [categorical-encoding](https://github.com/scikit-learn-contrib/categorical-encoding) - Categorical encoding of variables, [vtreat (R package)](https://cran.r-project.org/web/packages/vtreat/vignettes/vtreat.html).
[dirty_cat](https://github.com/dirty-cat/dirty_cat) - Encoding dirty categorical variables.
[patsy](https://github.com/pydata/patsy/) - R-like syntax for statistical models.
[mlxtend](https://rasbt.github.io/mlxtend/user_guide/feature_extraction/LinearDiscriminantAnalysis/) - LDA.
@@ -88,8 +88,8 @@ Visualizations - [Null Hypothesis Significance Testing (NHST)](https://rpsycholo

#### Dimensionality Reduction
[prince](https://github.com/MaxHalford/prince) - Dimensionality reduction, factor analysis (PCA, MCA, CA, FAMD).
- [sklearn](https://scikit-learn.org/stable/modules/generated/sklearn.manifold.MDS.html) - Multidimensional scaling.
- [sklearn](https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html) - t-distributed Stochastic Neighbor Embedding. Faster implementations: [lvdmaaten](https://lvdmaaten.github.io/tsne/), [MulticoreTSNE](https://github.com/DmitryUlyanov/Multicore-TSNE).
+ [sklearn](https://scikit-learn.org/stable/modules/generated/sklearn.manifold.MDS.html) - Multidimensional scaling (MDS).
+ [sklearn](https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html) - t-distributed Stochastic Neighbor Embedding (t-SNE), [intro](https://distill.pub/2016/misread-tsne/). Faster implementations: [lvdmaaten](https://lvdmaaten.github.io/tsne/), [MulticoreTSNE](https://github.com/DmitryUlyanov/Multicore-TSNE).
[sklearn](http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.TruncatedSVD.html) - Truncated SVD (aka LSA).
[mdr](https://github.com/EpistasisLab/scikit-mdr) - Dimensionality reduction, multifactor dimensionality reduction (MDR).
[umap](https://github.com/lmcinnes/umap) - Uniform Manifold Approximation and Projection.
@@ -101,8 +101,10 @@ Visualizations - [Null Hypothesis Significance Testing (NHST)](https://rpsycholo
[physt](https://github.com/janpipek/physt) - Better histograms, [talk](https://www.youtube.com/watch?v=ZG-wH3-Up9Y).
[matplotlib_venn](https://github.com/konstantint/matplotlib-venn) - Venn diagrams.
[joypy](https://github.com/sbebo/joypy) - Draw stacked density plots.
+ [mosaic plots](https://www.statsmodels.org/dev/generated/statsmodels.graphics.mosaicplot.mosaic.html) - Categorical variable visualization, [example](https://sukhbinder.wordpress.com/2018/09/18/mosaic-plot-in-python/).
[yellowbrick](https://github.com/DistrictDataLabs/yellowbrick) - Wrapper for matplotlib for diagnostic ML plots.
[bokeh](https://bokeh.pydata.org/en/latest/) - Interactive visualization library, [Examples](https://bokeh.pydata.org/en/latest/docs/user_guide/server.html), [Examples](https://github.com/WillKoehrsen/Bokeh-Python-Visualization).
+ [plotnine](https://github.com/has2k1/plotnine) - ggplot for Python.
[altair](https://altair-viz.github.io/) - Declarative statistical visualization library.
[bqplot](https://github.com/bloomberg/bqplot) - Plotting library for IPython/Jupyter Notebooks.
[holoviews](http://holoviews.org/) - Visualization library.
@@ -144,6 +146,7 @@ Examples: [1](https://lazyprogrammer.me/tutorial-on-collaborative-filtering-and-
[spotlight](https://github.com/maciejkula/spotlight) - Deep recommender models using PyTorch.
[lightfm](https://github.com/lyst/lightfm) - Recommendation algorithms for both implicit and explicit feedback.
[funk-svd](https://github.com/gbolmier/funk-svd) - Fast SVD.
+ [pywFM](https://github.com/jfloff/pywFM) - Factorization machines (Python wrapper for libFM).

#### Decision Tree Models
[lightgbm](https://github.com/Microsoft/LightGBM) - Gradient boosting (GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, [doc](https://sites.google.com/view/lauraepp/parameters).
@@ -236,6 +239,7 @@ Feature Visualization: [Blog](https://distill.pub/2017/feature-visualization/),

##### Snippets
[Simple Keras models](https://gist.github.com/candlewill/552fa102352ccce42fd829ae26277d24)
+ [Entity Embeddings of Categorical Variables](https://arxiv.org/abs/1604.06737), [code](https://github.com/entron/entity-embedding-rossmann), [kaggle](https://www.kaggle.com/aquatic/entity-embedding-neural-net/code)

#### GPU
[cuML](https://github.com/rapidsai/cuml) - Run traditional tabular ML tasks on GPUs.
@@ -336,6 +340,7 @@ RandomSurvivalForests (R packages: randomForestSRC, ggRandomForests).
[edward](https://github.com/blei-lab/edward) - Probabilistic modeling, inference, and criticism, [Mixture Density Networks (MDNs)](http://edwardlib.org/tutorials/mixture-density-network), [MDN Explanation](https://towardsdatascience.com/a-hitchhikers-guide-to-mixture-density-networks-76b435826cca).

#### Stacking Models and Ensembles
+ [Model Stacking Blog Post](http://blog.kaggle.com/2017/06/15/stacking-made-easy-an-introduction-to-stacknet-by-competitions-grandmaster-marios-michailidis-kazanova/)
[mlxtend](https://github.com/rasbt/mlxtend) - `EnsembleVoteClassifier`, `StackingRegressor`, `StackingCVRegressor` for model stacking.
[vecstack](https://github.com/vecxoz/vecstack) - Stacking ML models.
[StackNet](https://github.com/kaz-Anova/StackNet) - Stacking ML models.
@@ -440,13 +445,10 @@ AlphaZero methodology - [1](https://github.com/AppliedDataSciencePartners/DeepRe
[dateparser](https://dateparser.readthedocs.io/en/latest/) - A better date parser.
[jellyfish](https://github.com/jamesturk/jellyfish) - Approximate string matching.

-
#### Blogs
[PocketCluster](https://blog.pocketcluster.io/) - Blog.
[Distill.pub](https://distill.pub/) - Blog.

-
-
#### Awesome Lists
[Awesome Adversarial Machine Learning](https://github.com/yenchenlin/awesome-adversarial-machine-learning)
[Awesome AI Bookmarks](https://github.com/goodrahstar/my-awesome-AI-bookmarks)