Commit dd0f208

mierzejk and krzjoa authored
Data processing (krzjoa#41)
* Introduce xarray to Data Manipulation / Data Containers.
* Introduce Hamilton to Data Manipulation / Pipelines.
* Introduce NumExpr to Computations.
* Introduce Google OR-Tools to Optimization.

Co-authored-by: Krzysztof Joachimiak <joachimiak.krzysztof@gmail.com>
1 parent 072dc90 commit dd0f208

1 file changed: +5 -0 lines changed

README.md

@@ -192,8 +192,10 @@
 * [pandas_flavor](https://github.com/Zsailer/pandas_flavor) - A package that lets you write your own flavor of pandas easily.
 * [pandas-log](https://github.com/eyaltrabelsi/pandas-log) - A package that provides feedback about basic pandas operations and helps find both business logic and performance issues.
 * [vaex](https://github.com/vaexio/vaex) - Out-of-Core DataFrames for Python, ML, visualize and explore big tabular data at a billion rows per second.
+* [xarray](https://github.com/pydata/xarray) - Xarray combines the best features of NumPy and pandas for multidimensional data selection by supplementing numerical axis labels with named dimensions for more intuitive, concise, and less error-prone indexing routines.
 * [sk-transformers](https://github.com/chrislemke/sk-transformers) - A collection of various pandas & scikit-learn compatible transformers for all kinds of preprocessing and feature engineering steps <img height="20" src="img/pandas_big.png" alt="pandas compatible">
 
+
 ### Pipelines
 * [pdpipe](https://github.com/shaypal5/pdpipe) - Easy pipelines for pandas DataFrames.
 * [SSPipe](https://sspipe.github.io/) - Python pipe (|) operator with support for DataFrames and Numpy and Pytorch.
@@ -205,6 +207,7 @@
 * [meza](https://github.com/reubano/meza) - A Python toolkit for processing tabular data.
 * [Prodmodel](https://github.com/prodmodel/prodmodel) - Build system for data science pipelines.
 * [dovpanda](https://github.com/dovpanda-dev/dovpanda) - Hints and tips for using pandas in an analysis environment. <img height="20" src="img/pandas_big.png" alt="pandas compatible">
+* [Hamilton](https://github.com/stitchfix/hamilton) - A microframework for dataframe generation that applies Directed Acyclic Graphs specified by a flow of lazily evaluated Python functions.
 
 ### Data-centric AI
 * [cleanlab](https://github.com/cleanlab/cleanlab) - The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
@@ -357,6 +360,7 @@
 * [POT](https://github.com/rflamary/POT) - Python Optimal Transport library.
 * [Talos](https://github.com/autonomio/talos) - Hyperparameter Optimization for Keras Models.
 * [nlopt](https://github.com/stevengj/nlopt) - Library for nonlinear optimization (global and local, constrained or unconstrained).
+* [OR-Tools](https://developers.google.com/optimization) - An open source software suite for optimization by Google; provides a unified programming interface to a half dozen solvers: SCIP, GLPK, GLOP, CP-SAT, CPLEX, and Gurobi.
 
 ## Time Series
 * [sktime](https://github.com/alan-turing-institute/sktime) - A unified framework for machine learning with time series. <img height="20" src="img/sklearn_big.png" alt="sklearn">
@@ -449,6 +453,7 @@
 * [numdifftools](https://github.com/pbrod/numdifftools) - Solve automatic numerical differentiation problems in one or more variables.
 * [quaternion](https://github.com/moble/quaternion) - Add built-in support for quaternions to numpy.
 * [adaptive](https://github.com/python-adaptive/adaptive) - Tools for adaptive and parallel sampling of mathematical functions.
+* [NumExpr](https://github.com/pydata/numexpr) - A fast numerical expression evaluator for NumPy that comes with an integrated computing virtual machine to speed calculations up by avoiding memory allocation for intermediate results.
 
 ## Spatial Analysis
 * [GeoPandas](https://github.com/geopandas/geopandas) - Python tools for geographic data. <img height="20" src="img/pandas_big.png" alt="pandas compatible">
