|
19 | 19 | ## Contents
|
20 | 20 | * [Machine Learning](#machine-learning)
|
21 | 21 | * [Deep Learning](#deep-learning)
|
| 22 | +* [Web Scraping](#web-scraping) |
22 | 23 | * [Data Manipulation](#data-manipulation)
|
23 | 24 | * [Feature Engineering](#feature-engineering)
|
24 | 25 | * [Visualization](#visualization)
|
|
187 | 188 | * [Caffe2](https://github.com/pytorch/pytorch/tree/master/caffe2) - A lightweight, modular, and scalable deep learning framework (now a part of PyTorch).
|
188 | 189 | * [hipCaffe](https://github.com/ROCmSoftwarePlatform/hipCaffe) - The HIP port of Caffe. <img height="20" src="img/amd_big.png" alt="Possible to run on AMD GPU">
|
189 | 190 |
|
| 191 | +## Web Scraping |
| 192 | +* [BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/) - A beginner-friendly library for scraping static websites. |
| 193 | +* [Scrapy](https://scrapy.org/) - Fast and extensible scraping library. You can write rules and create a customized scraper without touching the core. |
| 194 | +* [Selenium](https://selenium-python.readthedocs.io/installation.html#introduction) - Python API that gives intuitive access to all Selenium WebDriver functionality, driving the browser like a real user. |
| 195 | +* [Pattern](https://github.com/clips/pattern) - High-level scraping for well-established websites such as Google, Twitter, and Wikipedia. Also includes NLP, machine learning algorithms, and visualization. |
| 196 | +* [twitterscraper](https://github.com/taspinar/twitterscraper) - Efficient library for scraping Twitter. |
| 197 | + |
190 | 198 | ## Data Manipulation
|
191 | 199 |
|
192 | 200 | ### Data Containers
|
193 | 201 | * [pandas](https://pandas.pydata.org/pandas-docs/stable/) - Powerful Python data analysis toolkit.
|
| 202 | +* [pandas_profiling](https://github.com/pandas-profiling/pandas-profiling) - Create HTML profiling reports from pandas DataFrame objects |
194 | 203 | * [cuDF](https://github.com/rapidsai/cudf) - GPU DataFrame Library. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> <img height="20" src="img/gpu_big.png" alt="GPU accelerated">
|
195 | 204 | * [blaze](https://github.com/blaze/blaze) - NumPy and pandas interface to Big Data. <img height="20" src="img/pandas_big.png" alt="pandas compatible">
|
196 | 205 | * [pandasql](https://github.com/yhat/pandasql) - Allows you to query pandas DataFrames using SQL syntax. <img height="20" src="img/pandas_big.png" alt="pandas compatible">
|
|
217 | 226 | * [meza](https://github.com/reubano/meza) - A Python toolkit for processing tabular data.
|
218 | 227 | * [Prodmodel](https://github.com/prodmodel/prodmodel) - Build system for data science pipelines.
|
219 | 228 | * [dovpanda](https://github.com/dovpanda-dev/dovpanda) - Hints and tips for using pandas in an analysis environment. <img height="20" src="img/pandas_big.png" alt="pandas compatible">
|
| 229 | +* [CircleCI](https://circleci.com/) - Automates your software builds, tests, and deployments. |
220 | 230 |
|
221 | 231 | ## Feature Engineering
|
222 | 232 |
|
|
235 | 245 | * [scikit-rebate](https://github.com/EpistasisLab/scikit-rebate) - A scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for Machine Learning. <img height="20" src="img/sklearn_big.png" alt="sklearn">
|
236 | 246 |
|
237 | 247 | ## Visualization
|
| 248 | +### General Purposes |
238 | 249 | * [Matplotlib](https://github.com/matplotlib/matplotlib) - Plotting with Python.
|
239 | 250 | * [seaborn](https://github.com/mwaskom/seaborn) - Statistical data visualization using matplotlib.
|
240 |
| -* [Bokeh](https://github.com/bokeh/bokeh) - Interactive Web Plotting for Python. |
241 |
| -* [HoloViews](https://github.com/ioam/holoviews) - Stop plotting your data - annotate your data and let it visualize itself. |
242 | 251 | * [prettyplotlib](https://github.com/olgabot/prettyplotlib) - Painlessly create beautiful matplotlib plots.
|
243 | 252 | * [python-ternary](https://github.com/marcharper/python-ternary) - Ternary plotting library for python with matplotlib.
|
244 | 253 | * [missingno](https://github.com/ResidentMario/missingno) - Missing data visualization module for Python.
|
245 | 254 | * [chartify](https://github.com/spotify/chartify/) - Python library that makes it easy for data scientists to create charts.
|
246 | 255 | * [physt](https://github.com/janpipek/physt) - Improved histograms.
|
| 256 | +### Interactive plots |
247 | 257 | * [animatplot](https://github.com/t-makaro/animatplot) - A Python package for animating plots, built on matplotlib.
|
248 | 258 | * [plotly](https://plot.ly/python/) - A Python library that makes interactive and publication-quality graphs.
|
| 259 | +* [Bokeh](https://github.com/bokeh/bokeh) - Interactive Web Plotting for Python. |
| 260 | +* [Altair](https://altair-viz.github.io/) - Declarative statistical visualization library for Python. Makes it easy to apply many data transformations within the code that creates the graph. |
| 261 | +* [bqplot](https://github.com/bqplot/bqplot) - Plotting library for IPython/Jupyter notebooks |
| 262 | +### Map |
249 | 263 | * [folium](https://python-visualization.github.io/folium/quickstart.html#Getting-Started) - Makes it easy to visualize data on an interactive open street map
|
250 | 264 | * [geemap](https://github.com/giswqs/geemap) - Python package for interactive mapping with Google Earth Engine (GEE)
|
| 265 | +### Automatic Plotting |
| 266 | +* [HoloViews](https://github.com/ioam/holoviews) - Stop plotting your data - annotate your data and let it visualize itself. |
| 267 | +* [AutoViz](https://github.com/AutoViML/AutoViz) - Visualize data automatically with one line of code (ideal for machine learning). |
| 268 | +* [SweetViz](https://github.com/fbdesignpro/sweetviz) - Visualize and compare datasets, target values, and associations with one line of code. |
251 | 269 |
|
252 |
| - |
253 |
| -## Deployment |
254 |
| -* [datapane](https://datapane.com/) - A collection of APIs to turn scripts and notebooks into interactive reports. |
255 |
| -* [fastapi](https://fastapi.tiangolo.com/) - Modern, fast (high-performance), web framework for building APIs with Python |
256 |
| -* [streamlit](https://www.streamlit.io/) - Make it easy to deploy machine learning model |
| 270 | +### NLP |
| 271 | +* [pyLDAvis](https://github.com/bmabey/pyLDAvis) - Interactive topic model visualization. |
257 | 272 |
|
258 | 273 |
|
259 | 274 | ## Deployment
|
260 | 275 | * [datapane](https://datapane.com/) - A collection of APIs to turn scripts and notebooks into interactive reports.
|
| 276 | +* [binder](https://mybinder.org/) - Enables sharing and executing Jupyter notebooks. |
261 | 277 | * [fastapi](https://fastapi.tiangolo.com/) - Modern, fast (high-performance), web framework for building APIs with Python
|
262 | 278 | * [streamlit](https://www.streamlit.io/) - Makes it easy to deploy machine learning models.
|
263 | 279 |
|
| 280 | + |
264 | 281 | ## Model Explanation
|
265 | 282 | * [Alibi](https://github.com/SeldonIO/alibi) - Algorithms for monitoring and explaining machine learning models.
|
266 | 283 | * [anchor](https://github.com/marcotcr/anchor) - Code for "High-Precision Model-Agnostic Explanations" paper.
|
|
380 | 397 | * [flair](https://github.com/zalandoresearch/flair) - Very simple framework for state-of-the-art NLP.
|
381 | 398 | * [spaCy](https://spacy.io/) - Industrial-Strength Natural Language Processing.
|
382 | 399 |
|
| 400 | + |
383 | 401 | ## Computer Audition
|
384 | 402 | * [librosa](https://github.com/librosa/librosa) - Python library for audio and music analysis.
|
385 | 403 | * [Yaafe](https://github.com/Yaafe/Yaafe) - Audio features extraction.
|
|