From 175c4b56f20d55cc4411e0f33904c720f01c61a5 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Tue, 10 Jan 2023 18:08:27 +0100
Subject: [PATCH 001/152] Bleedthrough

---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index 42546b3..fb1f46c 100644
--- a/README.md
+++ b/README.md
@@ -243,6 +243,8 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal
 [allencell](https://www.allencell.org/segmenter.html) - Tools for the 3D segmentation of intracellular structures.
 [py-clesperanto](https://github.com/clesperanto/pyclesperanto_prototype/) - Tools for 3D microsopy analysis, [deskewing](https://github.com/clEsperanto/pyclesperanto_prototype/blob/master/demo/transforms/deskew.ipynb) and lots of other tutorials, interacts with napari.
 [ashlar](https://github.com/labsyspharm/ashlar) - Image stitching and registration.
+[cytoflow](https://github.com/cytoflow/cytoflow) - Flow cytometry. Includes Bleedthrough correction methods.
+Linear unmixing in Fiji for Bleedthrough Correction - [Youtube](https://www.youtube.com/watch?v=W90qs0J29v8).

 #### Domain Adaptation / Batch-Effect Correction
 [Tran - A benchmark of batch-effect correction methods for single-cell RNA sequencing data](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1850-9), [Code](https://github.com/JinmiaoChenLab/Batch-effect-removal-benchmarking).

From 879783c4e55e63d3c1ffff584db330a559a4d8b7 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Tue, 10 Jan 2023 18:40:25 +0100
Subject: [PATCH 002/152] Update README.md

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index fb1f46c..24fe58a 100644
--- a/README.md
+++ b/README.md
@@ -245,6 +245,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal
 [ashlar](https://github.com/labsyspharm/ashlar) - Image stitching and registration.
 [cytoflow](https://github.com/cytoflow/cytoflow) - Flow cytometry.
Includes Bleedthrough correction methods. Linear unmixing in Fiji for Bleedthrough Correction - [Youtube](https://www.youtube.com/watch?v=W90qs0J29v8). +Bleedthrough Correction using Lumos and Fiji - [Link](https://imagej.net/plugins/lumos-spectral-unmixing). #### Domain Adaptation / Batch-Effect Correction [Tran - A benchmark of batch-effect correction methods for single-cell RNA sequencing data](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1850-9), [Code](https://github.com/JinmiaoChenLab/Batch-effect-removal-benchmarking). From 20cbffbee5e02cc8c81ce2a4f473dd7ab278345d Mon Sep 17 00:00:00 2001 From: Darigov Research <30328618+darigovresearch@users.noreply.github.com> Date: Sun, 15 Jan 2023 23:27:34 +0000 Subject: [PATCH 003/152] refactor: Small copyediting changes See diff for details --- README.md | 94 +++++++++++++++++++++++++++---------------------------- 1 file changed, 47 insertions(+), 47 deletions(-) diff --git a/README.md b/README.md index 24fe58a..ff5ea0f 100644 --- a/README.md +++ b/README.md @@ -23,7 +23,7 @@ Python debugger (pdb) - [blog post](https://www.blog.pythonlibrary.org/2018/10/1 [nbdime](https://github.com/jupyter/nbdime) - Diff two notebook files, Alternative GitHub App: [ReviewNB](https://www.reviewnb.com/). [RISE](https://github.com/damianavila/RISE) - Turn Jupyter notebooks into presentations. [qgrid](https://github.com/quantopian/qgrid) - Pandas `DataFrame` sorting. -[pivottablejs](https://github.com/nicolaskruchten/jupyter_pivottablejs) - Drag n drop Pivot Tables and Charts for jupyter notebooks. +[pivottablejs](https://github.com/nicolaskruchten/jupyter_pivottablejs) - Drag n drop Pivot Tables and Charts for Jupyter notebooks. [itables](https://github.com/mwouts/itables) - Interactive tables in Jupyter. [jupyter-datatables](https://github.com/CermakM/jupyter-datatables) - Interactive tables in Jupyter. [debugger](https://blog.jupyter.org/a-visual-debugger-for-jupyter-914e61716559) - Visual debugger for Jupyter. 
@@ -42,11 +42,11 @@ Python debugger (pdb) - [blog post](https://www.blog.pythonlibrary.org/2018/10/1 [vaex](https://github.com/vaexio/vaex) - Out-of-Core DataFrames. [pandarallel](https://github.com/nalepae/pandarallel) - Parallelize pandas operations. [xarray](https://github.com/pydata/xarray/) - Extends pandas to n-dimensional arrays. -[swifter](https://github.com/jmcarpenter2/swifter) - Apply any function to a pandas dataframe faster. +[swifter](https://github.com/jmcarpenter2/swifter) - Apply any function to a pandas DataFrame faster. [pandas_flavor](https://github.com/Zsailer/pandas_flavor) - Write custom accessors like `.str` and `.dt`. [pandas-log](https://github.com/eyaltrabelsi/pandas-log) - Find business logic issues and performance issues in pandas. [pandapy](https://github.com/firmai/pandapy) - Additional features for pandas. -[lux](https://github.com/lux-org/lux) - Dataframe visualization within Jupyter. +[lux](https://github.com/lux-org/lux) - DataFrame visualization within Jupyter. [dtale](https://github.com/man-group/dtale) - View and analyze Pandas data structures, integrating with Jupyter. [polars](https://github.com/pola-rs/polars) - Multi-threaded alternative to pandas. [duckdb](https://github.com/duckdb/duckdb) - Efficiently run SQL queries on pandas DataFrame. @@ -82,8 +82,8 @@ Python debugger (pdb) - [blog post](https://www.blog.pythonlibrary.org/2018/10/1 [bolz](https://github.com/Blosc/bcolz) - A columnar data container that can be compressed. [cupy](https://github.com/cupy/cupy) - NumPy-like API accelerated with CUDA. [petastorm](https://github.com/uber/petastorm) - Data access library for parquet files by Uber. -[zarr](https://github.com/zarr-developers/zarr-python) - Distributed numpy arrays. -[NVTabular](https://github.com/NVIDIA/NVTabular) - Feature engineering and preprocessing library for tabular data by nvidia. +[zarr](https://github.com/zarr-developers/zarr-python) - Distributed NumPy arrays. 
+[NVTabular](https://github.com/NVIDIA/NVTabular) - Feature engineering and preprocessing library for tabular data by Nvidia. [tensorstore](https://github.com/google/tensorstore) - Reading and writing large multi-dimensional arrays (Google). #### Distributed Systems @@ -95,7 +95,7 @@ Python debugger (pdb) - [blog post](https://www.blog.pythonlibrary.org/2018/10/1 [xsv](https://github.com/BurntSushi/xsv) - Command line tool for indexing, slicing, analyzing, splitting and joining CSV files. [csvkit](https://csvkit.readthedocs.io/en/1.0.3/) - Another command line tool for CSV files. [csvsort](https://pypi.org/project/csvsort/) - Sort large csv files. -[tsv-utils](https://github.com/eBay/tsv-utils) - Tools for working with CSV files by ebay. +[tsv-utils](https://github.com/eBay/tsv-utils) - Tools for working with CSV files by eBay. [cheat](https://github.com/cheat/cheat) - Make cheatsheets for command line commands. #### Classical Statistics @@ -120,7 +120,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal [torch-two-sample](https://github.com/josipd/torch-two-sample) - Friedman-Rafsky Test: Compare two population based on a multivariate generalization of the Runstest. [Explanation](https://www.real-statistics.com/multivariate-statistics/multivariate-normal-distribution/friedman-rafsky-test/), [Application](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5014134/) ##### Interim Analyses / Sequential Analysis / Stopping -[Squential Analysis](https://en.wikipedia.org/wiki/Sequential_analysis) - Wikipedia. + [Sequential Analysis](https://en.wikipedia.org/wiki/Sequential_analysis) - Wikipedia. [Treatment Effects Monitoring](https://online.stat.psu.edu/stat509/node/75/) - Design and Analysis of Clinical Trials PennState. [sequential](https://cran.r-project.org/web/packages/Sequential/Sequential.pdf) - Exact Sequential Analysis for Poisson and Binomial Data (R package). 
[confseq](https://github.com/gostevehoward/confseq) - Uniform boundaries, confidence sequences, and always-valid p-values. @@ -179,7 +179,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal [littleballoffur](https://github.com/benedekrozemberczki/littleballoffur) - Sampling from graphs. #### Noisy Labels -[cleanlab](https://github.com/cleanlab/cleanlab) - Machine learning with noisy labels, finding mislabeled data, and uncertainty quantification. Also see awesome list below. +[cleanlab](https://github.com/cleanlab/cleanlab) - Machine learning with noisy labels, finding mislabelled data, and uncertainty quantification. Also see awesome list below. [doubtlab](https://github.com/koaning/doubtlab) - Find bad or noisy labels. #### Train / Test Split @@ -199,7 +199,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal [tsfresh](https://github.com/blue-yonder/tsfresh) - Time series feature engineering. [pypeln](https://github.com/cgarciae/pypeln) - Concurrent data pipelines. [feature_engine](https://github.com/solegalli/feature_engine) - Encoders, transformers, etc. -[NVTabular](https://github.com/NVIDIA/NVTabular) - Feature engineering and preprocessing library for tabular data by nvidia. +[NVTabular](https://github.com/NVIDIA/NVTabular) - Feature engineering and preprocessing library for tabular data by Nvidia. #### Computer Vision [Intro to Computer Vision](https://www.youtube.com/playlist?list=PLjMXczUzEYcHvw5YYSU92WrY8IwhTuq7p) @@ -241,7 +241,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal [nnUnet](https://github.com/MIC-DKFZ/nnUNet) - 3D biomedical image segmentation. [atomai](https://github.com/pycroscopy/atomai) - Deep and Machine Learning for Microscopy. [allencell](https://www.allencell.org/segmenter.html) - Tools for the 3D segmentation of intracellular structures. 
-[py-clesperanto](https://github.com/clesperanto/pyclesperanto_prototype/) - Tools for 3D microsopy analysis, [deskewing](https://github.com/clEsperanto/pyclesperanto_prototype/blob/master/demo/transforms/deskew.ipynb) and lots of other tutorials, interacts with napari. +[py-clesperanto](https://github.com/clesperanto/pyclesperanto_prototype/) - Tools for 3D microscopy analysis, [deskewing](https://github.com/clEsperanto/pyclesperanto_prototype/blob/master/demo/transforms/deskew.ipynb) and lots of other tutorials, interacts with napari. [ashlar](https://github.com/labsyspharm/ashlar) - Image stitching and registration. [cytoflow](https://github.com/cytoflow/cytoflow) - Flow cytometry. Includes Bleedthrough correction methods. Linear unmixing in Fiji for Bleedthrough Correction - [Youtube](https://www.youtube.com/watch?v=W90qs0J29v8). @@ -255,7 +255,7 @@ Bleedthrough Correction using Lumos and Fiji - [Link](https://imagej.net/plugins [nimfa](https://github.com/mims-harvard/nimfa) - Nonnegative matrix factorization. [scgen](https://github.com/theislab/scgen) - Batch removal. [Doc](https://scgen.readthedocs.io/en/stable/). [CORAL](https://github.com/google-research/google-research/tree/30e54523f08d963ced3fbb37c00e9225579d2e1d/correct_batch_effects_wdn) - Correcting for Batch Effects Using Wasserstein Distance, [Code](https://github.com/google-research/google-research/blob/30e54523f08d963ced3fbb37c00e9225579d2e1d/correct_batch_effects_wdn/transform.py#L152), [Paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7050548/). -[adapt](https://github.com/adapt-python/adapt) - Aweseome Domain Adaptation Python Toolbox. +[adapt](https://github.com/adapt-python/adapt) - Awesome Domain Adaptation Python Toolbox. [pytorch-adapt](https://github.com/KevinMusgrave/pytorch-adapt) - Various neural network models for domain adaptation. 
#### Feature Engineering Images @@ -389,7 +389,7 @@ Faster t-SNE implementations: [lvdmaaten](https://lvdmaaten.github.io/tsne/), [M [streamlit](https://github.com/streamlit/streamlit) - Dashboarding solution. [Resources](https://github.com/marcskovmadsen/awesome-streamlit), [Gallery](https://awesome-streamlit.org/) [Components](https://www.streamlit.io/components), [bokeh-events](https://github.com/ash2shukla/streamlit-bokeh-events). [mercury](https://github.com/mljar/mercury) - Convert Python notebook to web app, [Example](https://github.com/pplonski/dashboard-python-jupyter-notebook). [dash](https://dash.plot.ly/gallery) - Dashboarding solution by plot.ly. [Resources](https://github.com/ucg8j/awesome-dash). -[visdom](https://github.com/facebookresearch/visdom) - Dashboarding library by facebook. +[visdom](https://github.com/facebookresearch/visdom) - Dashboarding library by Facebook. [panel](https://panel.pyviz.org/index.html) - Dashboarding solution. [altair example](https://github.com/xhochy/altair-vue-vega-example) - [Video](https://www.youtube.com/watch?v=4L568emKOvs). [voila](https://github.com/QuantStack/voila) - Turn Jupyter notebooks into standalone web applications. @@ -510,7 +510,7 @@ See also Microscopy Section above. ##### Drug discovery [TDC](https://github.com/mims-harvard/TDC/tree/main) - Drug Discovery and Development. -[DeepPurpose](https://github.com/kexinhuang12345/DeepPurpose) - Deep Learning Based Molecular Modeling and Prediction Toolkit. +[DeepPurpose](https://github.com/kexinhuang12345/DeepPurpose) - Deep Learning Based Molecular Modelling and Prediction Toolkit. ##### Courses [mit6874](https://mit6874.github.io/) - Computational Systems Biology: Deep Learning in the Life Sciences. @@ -565,9 +565,9 @@ Feature Visualization: [Blog](https://distill.pub/2017/feature-visualization/), [keras-tuner](https://github.com/keras-team/keras-tuner) - Hyperparameter tuning for Keras. 
[hyperas](https://github.com/maxpumperla/hyperas) - Keras + Hyperopt: Convenient hyperparameter optimization wrapper. [elephas](https://github.com/maxpumperla/elephas) - Distributed Deep learning with Keras & Spark. -[tflearn](https://github.com/tflearn/tflearn) - Neural Networks on top of tensorflow. -[tensorlayer](https://github.com/tensorlayer/tensorlayer) - Neural Networks on top of tensorflow, [tricks](https://github.com/wagamamaz/tensorlayer-tricks). -[tensorforce](https://github.com/reinforceio/tensorforce) - Tensorflow for applied reinforcement learning. +[tflearn](https://github.com/tflearn/tflearn) - Neural Networks on top of TensorFlow. +[tensorlayer](https://github.com/tensorlayer/tensorlayer) - Neural Networks on top of TensorFlow, [tricks](https://github.com/wagamamaz/tensorlayer-tricks). +[tensorforce](https://github.com/reinforceio/tensorforce) - TensorFlow for applied reinforcement learning. [autokeras](https://github.com/jhfjhfj1/autokeras) - AutoML for deep learning. [PlotNeuralNet](https://github.com/HarisIqbal88/PlotNeuralNet) - Plot neural networks. [lucid](https://github.com/tensorflow/lucid) - Neural network interpretability, [Activation Maps](https://openai.com/blog/introducing-activation-atlases/). @@ -577,16 +577,16 @@ Feature Visualization: [Blog](https://distill.pub/2017/feature-visualization/), [hiddenlayer](https://github.com/waleedka/hiddenlayer) - Training metrics. [imgclsmob](https://github.com/osmr/imgclsmob) - Pretrained models. [netron](https://github.com/lutzroeder/netron) - Visualizer for deep learning and machine learning models. -[ffcv](https://github.com/libffcv/ffcv) - Fast dataloder. - -##### Libs Pytorch -[Good Pytorch Introduction](https://cs230.stanford.edu/blog/pytorch/) -[skorch](https://github.com/dnouri/skorch) - Scikit-learn compatible neural network library that wraps pytorch, [talk](https://www.youtube.com/watch?v=0J7FaLk0bmQ), [slides](https://github.com/thomasjpfan/skorch_talk). 
-[fastai](https://github.com/fastai/fastai) - Neural Networks in pytorch. -[timm](https://github.com/rwightman/pytorch-image-models) - Pytorch image models. -[ignite](https://github.com/pytorch/ignite) - Highlevel library for pytorch. +[ffcv](https://github.com/libffcv/ffcv) - Fast dataloader. + +##### Libs PyTorch +[Good PyTorch Introduction](https://cs230.stanford.edu/blog/pytorch/) +[skorch](https://github.com/dnouri/skorch) - Scikit-learn compatible neural network library that wraps PyTorch, [talk](https://www.youtube.com/watch?v=0J7FaLk0bmQ), [slides](https://github.com/thomasjpfan/skorch_talk). +[fastai](https://github.com/fastai/fastai) - Neural Networks in PyTorch. +[timm](https://github.com/rwightman/pytorch-image-models) - PyTorch image models. +[ignite](https://github.com/pytorch/ignite) - Highlevel library for PyTorch. [torchcv](https://github.com/donnyyou/torchcv) - Deep Learning in Computer Vision. -[pytorch-optimizer](https://github.com/jettify/pytorch-optimizer) - Collection of optimizers for pytorch. +[pytorch-optimizer](https://github.com/jettify/pytorch-optimizer) - Collection of optimizers for PyTorch. [pytorch-lightning](https://github.com/PyTorchLightning/PyTorch-lightning) - Wrapper around PyTorch. [lightly](https://github.com/lightly-ai/lightly) - MoCo, SimCLR, SimSiam, Barlow Twins, BYOL, NNCLR. [MONAI](https://github.com/project-monai/monai) - Deep learning in healthcare imaging. @@ -623,7 +623,7 @@ Feature Visualization: [Blog](https://distill.pub/2017/feature-visualization/), ##### Image Classification [nfnets](https://github.com/ypeleg/nfnets-keras) - Neural network. [efficientnet](https://github.com/lukemelas/EfficientNet-PyTorch) - Neural network. -[pycls](https://github.com/facebookresearch/pycls) - Pytorch image classification networks: ResNet, ResNeXt, EfficientNet, and RegNet (by Facebook). 
+[pycls](https://github.com/facebookresearch/pycls) - PyTorch image classification networks: ResNet, ResNeXt, EfficientNet, and RegNet (by Facebook). ##### Applications and Snippets [SPADE](https://github.com/nvlabs/spade) - Semantic Image Synthesis. @@ -642,10 +642,10 @@ Cell Segmentation - [Talk](https://www.youtube.com/watch?v=dVFZpodqJiI), Blog Po [Awesome GAN Applications](https://github.com/nashory/gans-awesome-applications) [The GAN Zoo](https://github.com/hindupuravinash/the-gan-zoo) - List of Generative Adversarial Networks. [CycleGAN and Pix2pix](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix) - Various image-to-image tasks. -[Tensorflow GAN implementations](https://github.com/hwalsuklee/tensorflow-generative-model-collections) -[Pytorch GAN implementations](https://github.com/znxlwm/pytorch-generative-model-collections) -[Pytorch GAN implementations](https://github.com/eriklindernoren/PyTorch-GAN#adversarial-autoencoder) -[StudioGAN](https://github.com/POSTECH-CVLab/PyTorch-StudioGAN) - Pytorch GAN implementations. +[TensorFlow GAN implementations](https://github.com/hwalsuklee/tensorflow-generative-model-collections) +[PyTorch GAN implementations](https://github.com/znxlwm/pytorch-generative-model-collections) +[PyTorch GAN implementations](https://github.com/eriklindernoren/PyTorch-GAN#adversarial-autoencoder) +[StudioGAN](https://github.com/POSTECH-CVLab/PyTorch-StudioGAN) - PyTorch GAN implementations. ##### Transformers [SegFormer](https://github.com/NVlabs/SegFormer) - Simple and Efficient Design for Semantic Segmentation with Transformers. @@ -664,7 +664,7 @@ Cell Segmentation - [Talk](https://www.youtube.com/watch?v=dVFZpodqJiI), Blog Po [cugraph](https://github.com/rapidsai/cugraph) - RAPIDS, Graph library on the GPU. [pytorch-geometric](https://github.com/rusty1s/pytorch_geometric) - Various methods for deep learning on graphs. [dgl](https://github.com/dmlc/dgl) - Deep Graph Library. 
-[graph_nets](https://github.com/deepmind/graph_nets) - Build graph networks in Tensorflow, by deepmind. +[graph_nets](https://github.com/deepmind/graph_nets) - Build graph networks in TensorFlow, by DeepMind. #### Model conversion [hummingbird](https://github.com/microsoft/hummingbird) - Compile trained ML models into tensor computations (by Microsoft). @@ -699,10 +699,10 @@ Understanding SVM Regression: [slides](https://cs.adelaide.edu.au/~chhshen/teach [Contrastive Representation Learning](https://lilianweng.github.io/lil-log/2021/05/31/contrastive-representation-learning.html) [metric-learn](https://github.com/scikit-learn-contrib/metric-learn) - Supervised and weakly-supervised metric learning algorithms. -[pytorch-metric-learning](https://github.com/KevinMusgrave/pytorch-metric-learning) - Pytorch metric learning. +[pytorch-metric-learning](https://github.com/KevinMusgrave/pytorch-metric-learning) - PyTorch metric learning. [deep_metric_learning](https://github.com/ronekko/deep_metric_learning) - Methods for deep metric learning. [ivis](https://bering-ivis.readthedocs.io/en/latest/supervised.html) - Metric learning using siamese neural networks. -[tensorflow similarity](https://github.com/tensorflow/similarity) - Metric learning. +[TensorFlow similarity](https://github.com/tensorflow/similarity) - Metric learning. #### Distance Functions [scipy.spatial](https://docs.scipy.org/doc/scipy/reference/spatial.distance.html) - All kinds of distance metrics. @@ -768,10 +768,10 @@ Other measures: #### Signal Processing and Filtering [Stanford Lecture Series on Fourier Transformation](https://see.stanford.edu/Course/EE261), [Youtube](https://www.youtube.com/watch?v=gZNm7L96pfY&list=PLB24BC7956EE040CD&index=1), [Lecture Notes](https://see.stanford.edu/materials/lsoftaee261/book-fall-07.pdf). -[Visual fourier explanation](https://dsego.github.io/demystifying-fourier/). +[Visual Fourier explanation](https://dsego.github.io/demystifying-fourier/). 
[The Scientist & Engineer's Guide to Digital Signal Processing (1999)](https://www.analog.com/en/education/education-library/scientist_engineers_guide.html). [Kalman Filter article](https://www.bzarg.com/p/how-a-kalman-filter-works-in-pictures). -[Kalman Filter book](https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python) - Focuses on intuition using Jupyter Notebooks. Includes Baysian and various Kalman filters. +[Kalman Filter book](https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python) - Focuses on intuition using Jupyter Notebooks. Includes Bayesian and various Kalman filters. [Interactive Tool](https://fiiir.com/) for FIR and IIR filters, [Examples](https://plot.ly/python/fft-filters/). [filterpy](https://github.com/rlabbe/filterpy) - Kalman filtering and optimal estimation library. @@ -782,7 +782,7 @@ Other measures: [statsmodels](https://www.statsmodels.org/dev/tsa.html) - Time series analysis, [seasonal decompose](https://www.statsmodels.org/dev/generated/statsmodels.tsa.seasonal.seasonal_decompose.html) [example](https://gist.github.com/balzer82/5cec6ad7adc1b550e7ee), [SARIMA](https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.sarimax.SARIMAX.html), [granger causality](http://www.statsmodels.org/dev/generated/statsmodels.tsa.stattools.grangercausalitytests.html). [kats](https://github.com/facebookresearch/kats) - Time series prediction library by Facebook. [prophet](https://github.com/facebook/prophet) - Time series prediction library by Facebook. -[neural_prophet](https://github.com/ourownstory/neural_prophet) - Time series prediction built on Pytorch. +[neural_prophet](https://github.com/ourownstory/neural_prophet) - Time series prediction built on PyTorch. [pyramid](https://github.com/tgsmith61591/pyramid), [pmdarima](https://github.com/tgsmith61591/pmdarima) - Wrapper for (Auto-) ARIMA. [modeltime](https://cran.r-project.org/web/packages/modeltime/index.html) - Time series forecasting framework (R package). 
[pyflux](https://github.com/RJT1990/pyflux) - Time series prediction algorithms (ARIMA, GARCH, GAS, Bayesian). @@ -803,7 +803,7 @@ https://machinelearningmastery.com/time-series-forecasting-long-short-term-memor [pastas](https://pastas.readthedocs.io/en/latest/examples.html) - Simulation of time series. [fastdtw](https://github.com/slaypni/fastdtw) - Dynamic Time Warp Distance. [fable](https://www.rdocumentation.org/packages/fable/versions/0.0.0.9000) - Time Series Forecasting (R package). -[pydlm](https://github.com/wwrechard/pydlm) - Bayesian time series modeling ([R package](https://cran.r-project.org/web/packages/bsts/index.html), [Blog post](http://www.unofficialgoogledatascience.com/2017/07/fitting-bayesian-structural-time-series.html)) +[pydlm](https://github.com/wwrechard/pydlm) - Bayesian time series modelling ([R package](https://cran.r-project.org/web/packages/bsts/index.html), [Blog post](http://www.unofficialgoogledatascience.com/2017/07/fitting-bayesian-structural-time-series.html)) [PyAF](https://github.com/antoinecarme/pyaf) - Automatic Time Series Forecasting. [luminol](https://github.com/linkedin/luminol) - Anomaly Detection and Correlation library from Linkedin. [matrixprofile-ts](https://github.com/target/matrixprofile-ts) - Detecting patterns and anomalies, [website](https://www.cs.ucr.edu/~eamonn/MatrixProfile.html), [ppt](https://www.cs.ucr.edu/~eamonn/Matrix_Profile_Tutorial_Part1.pdf), [alternative](https://github.com/matrix-profile-foundation/mass-ts). @@ -836,7 +836,7 @@ Tutorial on using cvxpy: [1](https://calmcode.io/cvxpy-one/the-stigler-diet.html [bt](https://github.com/pmorissette/bt) - Backtesting algorithms. [alpaca-trade-api-python](https://github.com/alpacahq/alpaca-trade-api-python) - Commission-free trading through API. [eiten](https://github.com/tradytics/eiten) - Eigen portfolios, minimum variance portfolios and other algorithmic investing strategies. 
-[tf-quant-finance](https://github.com/google/tf-quant-finance) - Quantitative finance tools in tensorflow, by Google. +[tf-quant-finance](https://github.com/google/tf-quant-finance) - Quantitative finance tools in TensorFlow, by Google. [quantstats](https://github.com/ranaroussi/quantstats) - Portfolio management. [Riskfolio-Lib](https://github.com/dcajasn/Riskfolio-Lib) - Portfolio optimization and strategic asset allocation. [OpenBBTerminal](https://github.com/OpenBB-finance/OpenBBTerminal) - Terminal. @@ -902,7 +902,7 @@ Distances for comparing histograms and detecting outliers - [Talk](https://www.y [Bours - Confounding](https://edisciplinas.usp.br/pluginfile.php/5625667/mod_resource/content/3/Nontechnicalexplanation-counterfactualdefinition-confounding.pdf) [Bours - Effect Modification and Interaction](https://www.sciencedirect.com/science/article/pii/S0895435621000330) -#### Probabilistic Modeling and Bayes +#### Probabilistic Modelling and Bayes [Intro](https://erikbern.com/2018/10/08/the-hackers-guide-to-uncertainty-estimates.html), [Guide](https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers) [PyMC3](https://docs.pymc.io/) - Bayesian modelling, [intro](https://docs.pymc.io/notebooks/getting_started) [numpyro](https://github.com/pyro-ppl/numpyro) - Probabilistic programming with numpy, built on [pyro](https://github.com/pyro-ppl/pyro). @@ -910,9 +910,9 @@ Distances for comparing histograms and detecting outliers - [Talk](https://www.y [pmlearn](https://github.com/pymc-learn/pymc-learn) - Probabilistic machine learning. [arviz](https://github.com/arviz-devs/arviz) - Exploratory analysis of Bayesian models. [zhusuan](https://github.com/thu-ml/zhusuan) - Bayesian deep learning, generative models. 
-[edward](https://github.com/blei-lab/edward) - Probabilistic modeling, inference, and criticism, [Mixture Density Networks (MNDs)](http://edwardlib.org/tutorials/mixture-density-network), [MDN Explanation](https://towardsdatascience.com/a-hitchhikers-guide-to-mixture-density-networks-76b435826cca). +[edward](https://github.com/blei-lab/edward) - Probabilistic modelling, inference, and criticism, [Mixture Density Networks (MNDs)](http://edwardlib.org/tutorials/mixture-density-network), [MDN Explanation](https://towardsdatascience.com/a-hitchhikers-guide-to-mixture-density-networks-76b435826cca). [Pyro](https://github.com/pyro-ppl/pyro) - Deep Universal Probabilistic Programming. -[tensorflow probability](https://github.com/tensorflow/probability) - Deep learning and probabilistic modelling, [talk1](https://www.youtube.com/watch?v=KJxmC5GCWe4), [notebook talk1](https://github.com/AlxndrMlk/PyDataGlobal2021/blob/main/00_PyData_Global_2021_nb_full.ipynb), [talk2](https://www.youtube.com/watch?v=BrwKURU-wpk), [example](https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter1_Introduction/Ch1_Introduction_TFP.ipynb). +[TensorFlow probability](https://github.com/tensorflow/probability) - Deep learning and probabilistic modelling, [talk1](https://www.youtube.com/watch?v=KJxmC5GCWe4), [notebook talk1](https://github.com/AlxndrMlk/PyDataGlobal2021/blob/main/00_PyData_Global_2021_nb_full.ipynb), [talk2](https://www.youtube.com/watch?v=BrwKURU-wpk), [example](https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter1_Introduction/Ch1_Introduction_TFP.ipynb). [bambi](https://github.com/bambinos/bambi) - High-level Bayesian model-building interface on top of PyMC3. [neural-tangents](https://github.com/google/neural-tangents) - Infinite Neural Networks. 
[bnlearn](https://github.com/erdogant/bnlearn) - Bayesian networks, parameter learning, inference and sampling methods. @@ -920,8 +920,8 @@ Distances for comparing histograms and detecting outliers - [Talk](https://www.y #### Gaussian Processes [Visualization](http://www.infinitecuriosity.org/vizgp/), [Article](https://distill.pub/2019/visual-exploration-gaussian-processes/) [GPyOpt](https://github.com/SheffieldML/GPyOpt) - Gaussian process optimization. -[GPflow](https://github.com/GPflow/GPflow) - Gaussian processes (Tensorflow). -[gpytorch](https://gpytorch.ai/) - Gaussian processes (Pytorch). +[GPflow](https://github.com/GPflow/GPflow) - Gaussian processes (TensorFlow). +[gpytorch](https://gpytorch.ai/) - Gaussian processes (PyTorch). #### Stacking Models and Ensembles [Model Stacking Blog Post](http://blog.kaggle.com/2017/06/15/stacking-made-easy-an-introduction-to-stacknet-by-competitions-grandmaster-marios-michailidis-kazanova/) @@ -974,7 +974,7 @@ Plotting learning curve: [link](http://www.ritchieng.com/machinelearning-learnin [captum](https://github.com/pytorch/captum) - Model interpretability and understanding for PyTorch. #### Automated Machine Learning -[AdaNet](https://github.com/tensorflow/adanet) - Automated machine learning based on tensorflow. +[AdaNet](https://github.com/tensorflow/adanet) - Automated machine learning based on TensorFlow. [tpot](https://github.com/EpistasisLab/tpot) - Automated machine learning tool, optimizes machine learning pipelines. [auto_ml](https://github.com/ClimbsRocks/auto_ml) - Automated machine learning for analytics & production. [autokeras](https://github.com/jhfjhfj1/autokeras) - AutoML for deep learning. @@ -986,11 +986,11 @@ Plotting learning curve: [link](http://www.ritchieng.com/machinelearning-learnin #### Graph Representation Learning [Karate Club](https://github.com/benedekrozemberczki/karateclub) - Unsupervised learning on graphs. 
-[Pytorch Geometric](https://github.com/rusty1s/pytorch_geometric) - Graph representation learning with PyTorch. +[PyTorch Geometric](https://github.com/rusty1s/pytorch_geometric) - Graph representation learning with PyTorch. [DLG](https://github.com/dmlc/dgl) - Graph representation learning with TensorFlow. #### Convex optimization -[cvxpy](https://github.com/cvxgrp/cvxpy) - Modeling language for convex optimization problems. Tutorial: [1](https://calmcode.io/cvxpy-one/the-stigler-diet.html), [2](https://calmcode.io/cvxpy-two/introduction.html) +[cvxpy](https://github.com/cvxgrp/cvxpy) - Modelling language for convex optimization problems. Tutorial: [1](https://calmcode.io/cvxpy-one/the-stigler-diet.html), [2](https://calmcode.io/cvxpy-two/introduction.html) #### Evolutionary Algorithms & Optimization [deap](https://github.com/DEAP/deap) - Evolutionary computation framework (Genetic Algorithm, Evolution strategies). @@ -1166,7 +1166,7 @@ Gilbert Strang - [Matrix Methods in Data Analysis, Signal Processing, and Machin [Awesome Visual Transformer](https://github.com/dk-liang/Awesome-Visual-Transformer) #### Lectures -[NYU Deep Learning SP21](https://www.youtube.com/playlist?list=PLLHTzKZzVU9e6xUfG10TkTWApKSZCzuBI) - Youtube Playlist. +[NYU Deep Learning SP21](https://www.youtube.com/playlist?list=PLLHTzKZzVU9e6xUfG10TkTWApKSZCzuBI) - YouTube Playlist. 
#### Things I google a lot [Color codes](https://github.com/d3/d3-3.x-api-reference/blob/master/Ordinal-Scales.md#categorical-colors) From 2c6556567ee869ed5a9b8c1515bd7ff5d2c2b59a Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Wed, 25 Jan 2023 20:17:21 +0100 Subject: [PATCH 004/152] py-shiny --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index ff5ea0f..966ab56 100644 --- a/README.md +++ b/README.md @@ -385,6 +385,7 @@ Faster t-SNE implementations: [lvdmaaten](https://lvdmaaten.github.io/tsne/), [M [Named Colors Wheel](https://arantius.github.io/web-color-wheel/) - Color wheel for all named HTML colors. #### Dashboards +[py-shiny](https://github.com/rstudio/py-shiny) - Shiny for Python, [talk](https://www.youtube.com/watch?v=ijRBbtT2tgc). [superset](https://github.com/apache/superset) - Dashboarding solution by Apache. [streamlit](https://github.com/streamlit/streamlit) - Dashboarding solution. [Resources](https://github.com/marcskovmadsen/awesome-streamlit), [Gallery](https://awesome-streamlit.org/) [Components](https://www.streamlit.io/components), [bokeh-events](https://github.com/ash2shukla/streamlit-bokeh-events). [mercury](https://github.com/mljar/mercury) - Convert Python notebook to web app, [Example](https://github.com/pplonski/dashboard-python-jupyter-notebook). From 436c2d4d75b4ecc5ec667ecb027b476f3f58bb5a Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Mon, 30 Jan 2023 10:16:18 +0100 Subject: [PATCH 005/152] Foundations of Data Science Book --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 966ab56..fb0eeb2 100644 --- a/README.md +++ b/README.md @@ -1117,6 +1117,7 @@ Gilbert Strang - [Matrix Methods in Data Analysis, Signal Processing, and Machin [datasharing](https://github.com/jtleek/datasharing) - Guide to data sharing. 
##### Books +[Blum - Foundations of Data Science](https://www.cs.cornell.edu/jeh/book.pdf?file=book.pdf) [Chan - Introduction to Probability for Data Science](https://probability4datascience.com/index.html) [Colonescu - Principles of Econometrics with R](https://bookdown.org/ccolonescu/RPoE4/) From 0a32af5d47547132fc0f1e9b56a68196febe8ce1 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Thu, 2 Feb 2023 16:00:23 +0100 Subject: [PATCH 006/152] Removed dead link. --- README.md | 1 - 1 file changed, 1 deletion(-) diff --git a/README.md b/README.md index fb0eeb2..99f14b8 100644 --- a/README.md +++ b/README.md @@ -1073,7 +1073,6 @@ AlphaZero methodology - [1](https://github.com/AppliedDataSciencePartners/DeepRe [m2cgen](https://github.com/BayesWitnesses/m2cgen) - Transpile trained ML models into other languages. [sklearn-porter](https://github.com/nok/sklearn-porter) - Transpile trained scikit-learn estimators to C, Java, JavaScript and others. [mlflow](https://mlflow.org/) - Manage the machine learning lifecycle, including experimentation, reproducibility and deployment. -[modelchimp](https://github.com/ModelChimp/modelchimp) - Experiment Tracking. [skll](https://github.com/EducationalTestingService/skll) - Command-line utilities to make it easier to run machine learning experiments. [BentoML](https://github.com/bentoml/BentoML) - Package and deploy machine learning models for serving in production. [dagster](https://github.com/dagster-io/dagster) - Tool with focus on dependency graphs. 
From 8b907053927d13a5e91e9286ca0f424019e13d9d Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Mon, 27 Feb 2023 17:08:20 +0100 Subject: [PATCH 007/152] Update README.md --- README.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 99f14b8..8e401b7 100644 --- a/README.md +++ b/README.md @@ -225,6 +225,11 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal [Haghighi](https://github.com/carpenterlab/2021_Haghighi_NatureMethods) - Gene Expression and Morphology Profiles. [broadinstitute/lincs-profiling-complementarity](https://github.com/broadinstitute/lincs-profiling-complementarity) - Cellpainting vs. L1000 assay. +#### Labsyspharm +[mcmicro](https://github.com/labsyspharm/mcmicro) - Multiple-choice microscopy pipeline, [Website](https://mcmicro.org/overview/), [Paper](https://www.nature.com/articles/s41592-021-01308-y). +[MCQuant](https://github.com/labsyspharm/quantification) - Quantification of cell features. +[cylinter](https://github.com/labsyspharm/cylinter) - Quality assurance for microscopy images, [Website](https://labsyspharm.github.io/cylinter/). + #### Packages [Awesome Cytodata](https://github.com/cytodata/awesome-cytodata) [BD Spectrum Viewer](https://www.bdbiosciences.com/en-us/resources/bd-spectrum-viewer) - Calculate spectral overlap, bleed through for fluorescence microscopy dyes. @@ -235,7 +240,6 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal [BaSiCPy](https://github.com/peng-lab/BaSiCPy) - Background and Shading Correction of Optical Microscopy Images, [BaSiC](https://github.com/marrlab/BaSiC). [ashlar](https://github.com/labsyspharm/ashlar) - Whole-slide microscopy image stitching and registration. [CSBDeep](https://github.com/CSBDeep/CSBDeep) - Image denoising, restoration and object detection, [Project page](https://csbdeep.bioimagecomputing.com/tools/). 
-[mcmicro](https://github.com/labsyspharm/mcmicro) - Multiple-choice microscopy pipeline, [Paper](https://www.nature.com/articles/s41592-021-01308-y). [UnMicst](https://github.com/HMS-IDAC/UnMicst) - Identifying Cells and Segmenting Tissue. [stardist](https://github.com/stardist/stardist) - Object Detection with Star-convex Shapes. [nnUnet](https://github.com/MIC-DKFZ/nnUNet) - 3D biomedical image segmentation. From c701a4e5286bf6bd483124d07afcdc3ca83d35a7 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Wed, 1 Mar 2023 13:29:31 +0100 Subject: [PATCH 008/152] Update README.md --- README.md | 1 - 1 file changed, 1 deletion(-) diff --git a/README.md b/README.md index 8e401b7..8ed249c 100644 --- a/README.md +++ b/README.md @@ -1065,7 +1065,6 @@ AlphaZero methodology - [1](https://github.com/AppliedDataSciencePartners/DeepRe ##### Data Versioning, Databases, Pipelines and Model Serving [dvc](https://github.com/iterative/dvc) - Version control for large files. -[hangar](https://github.com/tensorwerk/hangar-py) - Version control for tensor data. [kedro](https://github.com/quantumblacklabs/kedro) - Build data pipelines. [feast](https://github.com/feast-dev/feast) - Feature store. [Video](https://www.youtube.com/watch?v=_omcXenypmo). [pinecone](https://www.pinecone.io/) - Database for vector search applications. From 1cdd2ffe968e455bad8d7c3d443d1b817057faf2 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Wed, 1 Mar 2023 13:38:55 +0100 Subject: [PATCH 009/152] Removed unmaintained packages. --- README.md | 3 --- 1 file changed, 3 deletions(-) diff --git a/README.md b/README.md index 8ed249c..360d48c 100644 --- a/README.md +++ b/README.md @@ -1058,10 +1058,7 @@ AlphaZero methodology - [1](https://github.com/AppliedDataSciencePartners/DeepRe [cog](https://github.com/replicate/cog) - Facilitates building Docker images. ##### Dependency Management -[dephell](https://github.com/dephell/dephell) - Dependency management. 
[poetry](https://github.com/python-poetry/poetry) - Dependency management. -[pyup](https://github.com/pyupio/pyup) - Dependency management. -[pypi-timemachine](https://github.com/astrofrog/pypi-timemachine) - Install packages with pip as if you were in the past. ##### Data Versioning, Databases, Pipelines and Model Serving [dvc](https://github.com/iterative/dvc) - Version control for large files. From 955108cdb23156af081bda1c1e709f0b335c4c04 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Thu, 2 Mar 2023 16:25:59 +0100 Subject: [PATCH 010/152] DESeq2 --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 360d48c..fa2f708 100644 --- a/README.md +++ b/README.md @@ -494,6 +494,7 @@ Embeddings - [GloVe](https://nlp.stanford.edu/projects/glove/) ([[1](https://www ##### Sequencing [Single cell tutorial](https://github.com/theislab/single-cell-tutorial). +[DESeq2](http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html) - Analyzing RNA-seq data (R package). [cellxgene](https://github.com/chanzuckerberg/cellxgene) - Interactive explorer for single-cell transcriptomics data. [scanpy](https://github.com/theislab/scanpy) - Analyze single-cell gene expression data, [tutorial](https://github.com/theislab/single-cell-tutorial). [besca](https://github.com/bedapub/besca) - Beyond single-cell analysis. 
From 11ae6e4ab5f5da5d4aae0551e92ebdca66d4b5b9 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Mon, 6 Mar 2023 22:51:09 +0100 Subject: [PATCH 011/152] Awesome Single Cell --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index fa2f708..2583029 100644 --- a/README.md +++ b/README.md @@ -1160,6 +1160,7 @@ Gilbert Strang - [Matrix Methods in Data Analysis, Signal Processing, and Machin [Awesome Pytorch](https://github.com/bharathgs/Awesome-pytorch-list) [Awesome Quantitative Finance](https://github.com/wilsonfreitas/awesome-quant) [Awesome Recommender Systems](https://github.com/grahamjenson/list_of_recommender_systems) +[Awesome Single Cell](https://github.com/seandavi/awesome-single-cell) [Awesome Semantic Segmentation](https://github.com/mrgloom/awesome-semantic-segmentation) [Awesome Sentence Embedding](https://github.com/Separius/awesome-sentence-embedding) [Awesome Time Series](https://github.com/MaxBenChrist/awesome_time_series_in_python) From 23c0168f2bc6121671828222f63052019587d8f0 Mon Sep 17 00:00:00 2001 From: Andy Kipp Date: Thu, 9 Mar 2023 19:41:38 +0600 Subject: [PATCH 012/152] Added xonsh shell --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 2583029..b254f2b 100644 --- a/README.md +++ b/README.md @@ -12,6 +12,7 @@ [sklearn_pandas](https://github.com/scikit-learn-contrib/sklearn-pandas) - Helpful `DataFrameMapper` class. [missingno](https://github.com/ResidentMario/missingno) - Missing data visualization. [rainbow-csv](https://marketplace.visualstudio.com/items?itemName=mechatroner.rainbow-csv) - Plugin to display .csv files with nice colors. +[xonsh](https://xon.sh) - Python-powered shell as alternative to Bash for simplifying data science automations. 
#### Environment and Jupyter [General Jupyter Tricks](https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/) From b77a955ffb92fe5d440ac2e07639338bd72e646c Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Mon, 13 Mar 2023 11:03:01 +0100 Subject: [PATCH 013/152] seg-eval --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index b254f2b..b03d940 100644 --- a/README.md +++ b/README.md @@ -236,6 +236,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal [BD Spectrum Viewer](https://www.bdbiosciences.com/en-us/resources/bd-spectrum-viewer) - Calculate spectral overlap, bleed through for fluorescence microscopy dyes. [Tree of Microscopy](https://biomag-lab.github.io/microscopy-tree/) - Review of cell segmentation algorithms, [Paper](https://www.sciencedirect.com/science/article/abs/pii/S0962892421002518). [cellpose](https://github.com/mouseland/cellpose) - Cell segmentation. [Paper](https://www.biorxiv.org/content/10.1101/2020.02.02.931238v1), [Dataset](https://www.cellpose.org/dataset). +[seg-eval](https://github.com/lstrgar/seg-eval) - Cell segmentation performance evaluation without Ground Truth labels, [Paper](https://www.biorxiv.org/content/10.1101/2023.02.23.529809v1.full.pdf). [skimage](https://scikit-image.org/docs/dev/api/skimage.exposure.html#skimage.exposure.equalize_adapthist) - Illumination correction (CLAHE). [cidre](https://github.com/smithk/cidre) - Illumination correction method for optical microscopy. [BaSiCPy](https://github.com/peng-lab/BaSiCPy) - Background and Shading Correction of Optical Microscopy Images, [BaSiC](https://github.com/marrlab/BaSiC). 
From 4f146614c0998e13b4e0974247cba863ccebab38 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Fri, 7 Apr 2023 10:07:36 +0200 Subject: [PATCH 014/152] Cell-ACDC --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index b03d940..e7bec30 100644 --- a/README.md +++ b/README.md @@ -252,6 +252,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal [cytoflow](https://github.com/cytoflow/cytoflow) - Flow cytometry. Includes Bleedthrough correction methods. Linear unmixing in Fiji for Bleedthrough Correction - [Youtube](https://www.youtube.com/watch?v=W90qs0J29v8). Bleedthrough Correction using Lumos and Fiji - [Link](https://imagej.net/plugins/lumos-spectral-unmixing). +[Cell-ACDC](https://github.com/SchmollerLab/Cell_ACDC) - Cell segmentation and tracking. #### Domain Adaptation / Batch-Effect Correction [Tran - A benchmark of batch-effect correction methods for single-cell RNA sequencing data](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1850-9), [Code](https://github.com/JinmiaoChenLab/Batch-effect-removal-benchmarking). From 07e67ab9f6fd872949a6569aec0de6e1203dbae8 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sat, 29 Apr 2023 16:59:08 +0200 Subject: [PATCH 015/152] PlateEditor --- README.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/README.md b/README.md index e7bec30..64d9dc0 100644 --- a/README.md +++ b/README.md @@ -490,6 +490,9 @@ Embeddings - [GloVe](https://nlp.stanford.edu/projects/glove/) ([[1](https://www #### Biology / Bioinformatics +##### Assay +[PlateEditor](https://github.com/vindelorme/PlateEditor) - Drug Layout for plates, [app](https://plateeditor.sourceforge.io/), [zip](https://sourceforge.net/projects/plateeditor/), [paper](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0252488). 
+ ##### Biostatistics / Robust statistics [MinCovDet](https://scikit-learn.org/stable/modules/generated/sklearn.covariance.MinCovDet.html) - Robust estimator of covariance, RMPV, [Paper](https://wires.onlinelibrary.wiley.com/doi/full/10.1002/wics.1421), [App1](https://journals.sagepub.com/doi/10.1177/1087057112469257?url_ver=Z39.88-2003&rfr_id=ori%3Arid%3Acrossref.org&rfr_dat=cr_pub++0pubmed&), [App2](https://www.cell.com/cell-reports/pdf/S2211-1247(21)00694-X.pdf). [winsorize](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mstats.winsorize.html#scipy.stats.mstats.winsorize) - Simple adjustment of outliers. From bfd15d73967645d7926a1bb55d7112326d48de2d Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Tue, 2 May 2023 19:02:15 +0200 Subject: [PATCH 016/152] DESeq2 -> PyDESeq2 --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 64d9dc0..24df77d 100644 --- a/README.md +++ b/README.md @@ -500,7 +500,7 @@ Embeddings - [GloVe](https://nlp.stanford.edu/projects/glove/) ([[1](https://www ##### Sequencing [Single cell tutorial](https://github.com/theislab/single-cell-tutorial). -[DESeq2](http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html) - Analyzing RNA-seq data (R package). +[PyDESeq2](https://github.com/owkin/PyDESeq2) - Analyzing RNA-seq data. [cellxgene](https://github.com/chanzuckerberg/cellxgene) - Interactive explorer for single-cell transcriptomics data. [scanpy](https://github.com/theislab/scanpy) - Analyze single-cell gene expression data, [tutorial](https://github.com/theislab/single-cell-tutorial). [besca](https://github.com/bedapub/besca) - Beyond single-cell analysis. 
From 4064e7362d6df35d8ad3afee9d647e5c9d5642cd Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Fri, 5 May 2023 16:34:15 +0200 Subject: [PATCH 017/152] Update README.md --- README.md | 1 - 1 file changed, 1 deletion(-) diff --git a/README.md b/README.md index 24df77d..399ce4c 100644 --- a/README.md +++ b/README.md @@ -46,7 +46,6 @@ Python debugger (pdb) - [blog post](https://www.blog.pythonlibrary.org/2018/10/1 [swifter](https://github.com/jmcarpenter2/swifter) - Apply any function to a pandas DataFrame faster. [pandas_flavor](https://github.com/Zsailer/pandas_flavor) - Write custom accessors like `.str` and `.dt`. [pandas-log](https://github.com/eyaltrabelsi/pandas-log) - Find business logic issues and performance issues in pandas. -[pandapy](https://github.com/firmai/pandapy) - Additional features for pandas. [lux](https://github.com/lux-org/lux) - DataFrame visualization within Jupyter. [dtale](https://github.com/man-group/dtale) - View and analyze Pandas data structures, integrating with Jupyter. [polars](https://github.com/pola-rs/polars) - Multi-threaded alternative to pandas. From 177a4ad3f01e8ef8c048a9022a80796ffe24dc4f Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sun, 7 May 2023 20:04:18 +0200 Subject: [PATCH 018/152] CellSeg --- README.md | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/README.md b/README.md index 399ce4c..6d814e6 100644 --- a/README.md +++ b/README.md @@ -229,29 +229,31 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal [mcmicro](https://github.com/labsyspharm/mcmicro) - Multiple-choice microscopy pipeline, [Website](https://mcmicro.org/overview/), [Paper](https://www.nature.com/articles/s41592-021-01308-y). [MCQuant](https://github.com/labsyspharm/quantification) - Quantification of cell features. [cylinter](https://github.com/labsyspharm/cylinter) - Quality assurance for microscopy images, [Website](https://labsyspharm.github.io/cylinter/). 
+[ashlar](https://github.com/labsyspharm/ashlar) - Whole-slide microscopy image stitching and registration. + +#### Segmentation +[Overview](https://biomag-lab.github.io/microscopy-tree/) - Review of cell segmentation algorithms, [Paper](https://www.sciencedirect.com/science/article/abs/pii/S0962892421002518). +[CellSeg](https://github.com/michaellee1/CellSeg) - Cell segmentation. [Paper](https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-022-04570-9) +[cellpose](https://github.com/mouseland/cellpose) - Cell segmentation. [Paper](https://www.biorxiv.org/content/10.1101/2020.02.02.931238v1), [Dataset](https://www.cellpose.org/dataset). +[stardist](https://github.com/stardist/stardist) - Cell segmentation with Star-convex Shapes. +[UnMicst](https://github.com/HMS-IDAC/UnMicst) - Identifying Cells and Segmenting Tissue. +[nnUnet](https://github.com/MIC-DKFZ/nnUNet) - 3D biomedical image segmentation. +[allencell](https://www.allencell.org/segmenter.html) - Tools for 3D segmentation, classical and deep learning methods. +[Cell-ACDC](https://github.com/SchmollerLab/Cell_ACDC) - Python GUI for cell segmentation and tracking. #### Packages [Awesome Cytodata](https://github.com/cytodata/awesome-cytodata) [BD Spectrum Viewer](https://www.bdbiosciences.com/en-us/resources/bd-spectrum-viewer) - Calculate spectral overlap, bleed through for fluorescence microscopy dyes. -[Tree of Microscopy](https://biomag-lab.github.io/microscopy-tree/) - Review of cell segmentation algorithms, [Paper](https://www.sciencedirect.com/science/article/abs/pii/S0962892421002518). -[cellpose](https://github.com/mouseland/cellpose) - Cell segmentation. [Paper](https://www.biorxiv.org/content/10.1101/2020.02.02.931238v1), [Dataset](https://www.cellpose.org/dataset). [seg-eval](https://github.com/lstrgar/seg-eval) - Cell segmentation performance evaluation without Ground Truth labels, [Paper](https://www.biorxiv.org/content/10.1101/2023.02.23.529809v1.full.pdf). 
[skimage](https://scikit-image.org/docs/dev/api/skimage.exposure.html#skimage.exposure.equalize_adapthist) - Illumination correction (CLAHE). [cidre](https://github.com/smithk/cidre) - Illumination correction method for optical microscopy. [BaSiCPy](https://github.com/peng-lab/BaSiCPy) - Background and Shading Correction of Optical Microscopy Images, [BaSiC](https://github.com/marrlab/BaSiC). -[ashlar](https://github.com/labsyspharm/ashlar) - Whole-slide microscopy image stitching and registration. [CSBDeep](https://github.com/CSBDeep/CSBDeep) - Image denoising, restoration and object detection, [Project page](https://csbdeep.bioimagecomputing.com/tools/). -[UnMicst](https://github.com/HMS-IDAC/UnMicst) - Identifying Cells and Segmenting Tissue. -[stardist](https://github.com/stardist/stardist) - Object Detection with Star-convex Shapes. -[nnUnet](https://github.com/MIC-DKFZ/nnUNet) - 3D biomedical image segmentation. [atomai](https://github.com/pycroscopy/atomai) - Deep and Machine Learning for Microscopy. -[allencell](https://www.allencell.org/segmenter.html) - Tools for the 3D segmentation of intracellular structures. [py-clesperanto](https://github.com/clesperanto/pyclesperanto_prototype/) - Tools for 3D microscopy analysis, [deskewing](https://github.com/clEsperanto/pyclesperanto_prototype/blob/master/demo/transforms/deskew.ipynb) and lots of other tutorials, interacts with napari. -[ashlar](https://github.com/labsyspharm/ashlar) - Image stitching and registration. [cytoflow](https://github.com/cytoflow/cytoflow) - Flow cytometry. Includes Bleedthrough correction methods. Linear unmixing in Fiji for Bleedthrough Correction - [Youtube](https://www.youtube.com/watch?v=W90qs0J29v8). Bleedthrough Correction using Lumos and Fiji - [Link](https://imagej.net/plugins/lumos-spectral-unmixing). -[Cell-ACDC](https://github.com/SchmollerLab/Cell_ACDC) - Cell segmentation and tracking. 
#### Domain Adaptation / Batch-Effect Correction [Tran - A benchmark of batch-effect correction methods for single-cell RNA sequencing data](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1850-9), [Code](https://github.com/JinmiaoChenLab/Batch-effect-removal-benchmarking). From 00f9bdb5d529fbf2af090a90736b408976641f04 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sun, 7 May 2023 20:20:10 +0200 Subject: [PATCH 019/152] Update README.md --- README.md | 1 - 1 file changed, 1 deletion(-) diff --git a/README.md b/README.md index 6d814e6..b10648a 100644 --- a/README.md +++ b/README.md @@ -233,7 +233,6 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal #### Segmentation [Overview](https://biomag-lab.github.io/microscopy-tree/) - Review of cell segmentation algorithms, [Paper](https://www.sciencedirect.com/science/article/abs/pii/S0962892421002518). -[CellSeg](https://github.com/michaellee1/CellSeg) - Cell segmentation. [Paper](https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-022-04570-9) [cellpose](https://github.com/mouseland/cellpose) - Cell segmentation. [Paper](https://www.biorxiv.org/content/10.1101/2020.02.02.931238v1), [Dataset](https://www.cellpose.org/dataset). [stardist](https://github.com/stardist/stardist) - Cell segmentation with Star-convex Shapes. [UnMicst](https://github.com/HMS-IDAC/UnMicst) - Identifying Cells and Segmenting Tissue. 
From 6a27653c7cdc9b2213ef85279e98686117ba7786 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sun, 7 May 2023 21:08:10 +0200 Subject: [PATCH 020/152] zarr --- README.md | 24 ++++++++++++++++-------- 1 file changed, 16 insertions(+), 8 deletions(-) diff --git a/README.md b/README.md index b10648a..dc43da4 100644 --- a/README.md +++ b/README.md @@ -204,10 +204,12 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal #### Computer Vision [Intro to Computer Vision](https://www.youtube.com/playlist?list=PLjMXczUzEYcHvw5YYSU92WrY8IwhTuq7p) -#### Image Cleanup +#### Image Viewers [Fiji](https://fiji.sc/) - General purpose tool. Image viewer and image processing package. [napari](https://github.com/napari/napari) - Multi-dimensional image viewer. [fiftyone](https://github.com/voxel51/fiftyone) - Viewer and tool for building high-quality datasets and computer vision models. + +#### Image Cleanup [DivNoising](https://github.com/juglab/DivNoising) - Unsupervised denoising method. [aydin](https://github.com/royerlab/aydin) - Image denoising. [unprocessing](https://github.com/timothybrooks/unprocessing) - Image denoising by reverting the image processing pipeline. 
@@ -216,7 +218,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal ##### Tutorials [Bio-image Analysis Notebooks](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/intro.html) - Large collection of image processing workflows, including [point-spread-function estimation](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/18a_deconvolution/extract_psf.html) and [deconvolution](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/18a_deconvolution/introduction_deconvolution.html), [3D cell segmentation](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/20_image_segmentation/Segmentation_3D.html), [feature extraction](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/22_feature_extraction/statistics_with_pyclesperanto.html) using [pyclesperanto](https://github.com/clEsperanto/pyclesperanto_prototype) and others. - +[python_for_microscopists](https://github.com/bnsreenu/python_for_microscopists) - Notebooks and associated [youtube channel](https://www.youtube.com/channel/UC34rW-HtPJulxr5wp2Xa04w/videos) for a variety of image processing tasks. ##### Datasets [jump-cellpainting](https://github.com/jump-cellpainting/datasets) - Cellpainting dataset. [MedMNIST](https://github.com/MedMNIST/MedMNIST) - Datasets for 2D and 3D Biomedical Image Classification. @@ -225,13 +227,21 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal [Haghighi](https://github.com/carpenterlab/2021_Haghighi_NatureMethods) - Gene Expression and Morphology Profiles. [broadinstitute/lincs-profiling-complementarity](https://github.com/broadinstitute/lincs-profiling-complementarity) - Cellpainting vs. L1000 assay. -#### Labsyspharm +##### Data Formats and Converters +OME-Zarr - [Paper](https://www.biorxiv.org/content/10.1101/2023.02.17.528834v1.full), [Standard](https://ngff.openmicroscopy.org/latest/) +[bioformats2raw](https://github.com/glencoesoftware/bioformats2raw) - Various formats to zarr. 
+[raw2ometiff](https://github.com/glencoesoftware/raw2ometiff) - Zarr to tiff. +[BatchConvert](https://github.com/Euro-BioImaging/BatchConvert) - Wrapper for bioformats2raw to parallelize conversions with nextflow, [video](https://www.youtube.com/watch?v=DeCWV274l0c). +[napari](https://napari.org/stable) - Viewer for various image formats. +[vizarr](https://github.com/hms-dbmi/vizarr) - Viewer for zarr files. + +##### Labsyspharm [mcmicro](https://github.com/labsyspharm/mcmicro) - Multiple-choice microscopy pipeline, [Website](https://mcmicro.org/overview/), [Paper](https://www.nature.com/articles/s41592-021-01308-y). [MCQuant](https://github.com/labsyspharm/quantification) - Quantification of cell features. [cylinter](https://github.com/labsyspharm/cylinter) - Quality assurance for microscopy images, [Website](https://labsyspharm.github.io/cylinter/). [ashlar](https://github.com/labsyspharm/ashlar) - Whole-slide microscopy image stitching and registration. -#### Segmentation +##### Segmentation [Overview](https://biomag-lab.github.io/microscopy-tree/) - Review of cell segmentation algorithms, [Paper](https://www.sciencedirect.com/science/article/abs/pii/S0962892421002518). [cellpose](https://github.com/mouseland/cellpose) - Cell segmentation. [Paper](https://www.biorxiv.org/content/10.1101/2020.02.02.931238v1), [Dataset](https://www.cellpose.org/dataset). [stardist](https://github.com/stardist/stardist) - Cell segmentation with Star-convex Shapes. @@ -239,8 +249,9 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal [nnUnet](https://github.com/MIC-DKFZ/nnUNet) - 3D biomedical image segmentation. [allencell](https://www.allencell.org/segmenter.html) - Tools for 3D segmentation, classical and deep learning methods. [Cell-ACDC](https://github.com/SchmollerLab/Cell_ACDC) - Python GUI for cell segmentation and tracking. +[ZeroCostDL4Mic](https://github.com/HenriquesLab/ZeroCostDL4Mic/wiki) - Deep-Learning in Microscopy. 
-#### Packages +##### Packages [Awesome Cytodata](https://github.com/cytodata/awesome-cytodata) [BD Spectrum Viewer](https://www.bdbiosciences.com/en-us/resources/bd-spectrum-viewer) - Calculate spectral overlap, bleed through for fluorescence microscopy dyes. [seg-eval](https://github.com/lstrgar/seg-eval) - Cell segmentation performance evaluation without Ground Truth labels, [Paper](https://www.biorxiv.org/content/10.1101/2023.02.23.529809v1.full.pdf). @@ -509,8 +520,6 @@ Embeddings - [GloVe](https://nlp.stanford.edu/projects/glove/) ([[1](https://www ##### Image-related See also Microscopy Section above. -[Overview over cell segmentation algorithms](https://biomag-lab.github.io/microscopy-tree/) -[python_for_microscopists](https://github.com/bnsreenu/python_for_microscopists) - Notebooks and associated [youtube channel](https://www.youtube.com/channel/UC34rW-HtPJulxr5wp2Xa04w/videos) for a variety of image processing tasks. [mahotas](http://luispedro.org/software/mahotas/) - Image processing (Bioinformatics), [example](https://github.com/luispedro/python-image-tutorial/blob/master/Segmenting%20cell%20images%20(fluorescent%20microscopy).ipynb). [imagepy](https://github.com/Image-Py/imagepy) - Software package for bioimage analysis. [scimap](https://github.com/labsyspharm/scimap) - Spatial Single-Cell Analysis Toolkit. @@ -518,7 +527,6 @@ See also Microscopy Section above. [imglyb](https://github.com/imglib/imglyb) - Viewer for large images, [talk](https://www.youtube.com/watch?v=Ddo5z5qGMb8), [slides](https://github.com/hanslovsky/scipy-2019/blob/master/scipy-2019-imglyb.pdf). [microscopium](https://github.com/microscopium/microscopium) - Unsupervised clustering of images + viewer, [talk](https://www.youtube.com/watch?v=ytEQl9xs8FQ). [cytokit](https://github.com/hammerlab/cytokit) - Analyzing properties of cells in fluorescent microscopy datasets. -[ZeroCostDL4Mic](https://github.com/HenriquesLab/ZeroCostDL4Mic/wiki) - Deep-Learning in Microscopy. 
##### Drug discovery [TDC](https://github.com/mims-harvard/TDC/tree/main) - Drug Discovery and Development. From 189c95527077eb7ebeeeb7fc9e08588a95c142a7 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sun, 7 May 2023 21:12:21 +0200 Subject: [PATCH 021/152] REMBI --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index dc43da4..afd350a 100644 --- a/README.md +++ b/README.md @@ -234,6 +234,7 @@ OME-Zarr - [Paper](https://www.biorxiv.org/content/10.1101/2023.02.17.528834v1.f [BatchConvert](https://github.com/Euro-BioImaging/BatchConvert) - Wrapper for bioformats2raw to parallelize conversions with nextflow, [video](https://www.youtube.com/watch?v=DeCWV274l0c). [napari](https://napari.org/stable) - Viewer for various image formats. [vizarr](https://github.com/hms-dbmi/vizarr) - Viewer for zarr files. +REMBI model - Recommended Metadata for Biological Images, [paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8606015/), [video](https://www.youtube.com/watch?v=GVmfOpuP2_c) ##### Labsyspharm [mcmicro](https://github.com/labsyspharm/mcmicro) - Multiple-choice microscopy pipeline, [Website](https://mcmicro.org/overview/), [Paper](https://www.nature.com/articles/s41592-021-01308-y). From caedcf53922c80d94be2863b80e8344711cbc654 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sun, 7 May 2023 21:15:30 +0200 Subject: [PATCH 022/152] Update README.md --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index afd350a..8f59418 100644 --- a/README.md +++ b/README.md @@ -228,13 +228,13 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal [broadinstitute/lincs-profiling-complementarity](https://github.com/broadinstitute/lincs-profiling-complementarity) - Cellpainting vs. L1000 assay. 
##### Data Formats and Converters -OME-Zarr - [Paper](https://www.biorxiv.org/content/10.1101/2023.02.17.528834v1.full), [Standard](https://ngff.openmicroscopy.org/latest/) +OME-Zarr - [paper](https://www.biorxiv.org/content/10.1101/2023.02.17.528834v1.full), [standard](https://ngff.openmicroscopy.org/latest/) [bioformats2raw](https://github.com/glencoesoftware/bioformats2raw) - Various formats to zarr. [raw2ometiff](https://github.com/glencoesoftware/raw2ometiff) - Zarr to tiff. [BatchConvert](https://github.com/Euro-BioImaging/BatchConvert) - Wrapper for bioformats2raw to parallelize conversions with nextflow, [video](https://www.youtube.com/watch?v=DeCWV274l0c). [napari](https://napari.org/stable) - Viewer for various image formats. [vizarr](https://github.com/hms-dbmi/vizarr) - Viewer for zarr files. -REMBI model - Recommended Metadata for Biological Images, [paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8606015/), [video](https://www.youtube.com/watch?v=GVmfOpuP2_c) +REMBI model - Recommended Metadata for Biological Images, [paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8606015/), [video](https://www.youtube.com/watch?v=GVmfOpuP2_c), [spreadsheed with additional info](https://docs.google.com/spreadsheets/d/1Ck1NeLp-ZN4eMGdNYo2nV6KLEdSfN6oQBKnnWU6Npeo/edit#gid=1023506919). ##### Labsyspharm [mcmicro](https://github.com/labsyspharm/mcmicro) - Multiple-choice microscopy pipeline, [Website](https://mcmicro.org/overview/), [Paper](https://www.nature.com/articles/s41592-021-01308-y). 
From 1cc9ed346f1365ab71e7e6c3dea6fbfd770f27db Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Sun, 7 May 2023 21:32:24 +0200
Subject: [PATCH 023/152] REMBI

---
 README.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 8f59418..40dbd55 100644
--- a/README.md
+++ b/README.md
@@ -234,7 +234,9 @@ OME-Zarr - [paper](https://www.biorxiv.org/content/10.1101/2023.02.17.528834v1.f
 [BatchConvert](https://github.com/Euro-BioImaging/BatchConvert) - Wrapper for bioformats2raw to parallelize conversions with nextflow, [video](https://www.youtube.com/watch?v=DeCWV274l0c).
 [napari](https://napari.org/stable) - Viewer for various image formats.
 [vizarr](https://github.com/hms-dbmi/vizarr) - Viewer for zarr files.
-REMBI model - Recommended Metadata for Biological Images, [paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8606015/), [video](https://www.youtube.com/watch?v=GVmfOpuP2_c), [spreadsheed with additional info](https://docs.google.com/spreadsheets/d/1Ck1NeLp-ZN4eMGdNYo2nV6KLEdSfN6oQBKnnWU6Npeo/edit#gid=1023506919).
+REMBI model - Recommended Metadata for Biological Images
+ * BioImage Archive: [Study Component Guidance](https://www.ebi.ac.uk/bioimage-archive/rembi-help-examples/), [File List Guide](https://www.ebi.ac.uk/bioimage-archive/help-file-list/)
+ * [paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8606015/), [video](https://www.youtube.com/watch?v=GVmfOpuP2_c), [spreadsheet](https://docs.google.com/spreadsheets/d/1Ck1NeLp-ZN4eMGdNYo2nV6KLEdSfN6oQBKnnWU6Npeo/edit#gid=1023506919)

 ##### Labsyspharm
 [mcmicro](https://github.com/labsyspharm/mcmicro) - Multiple-choice microscopy pipeline, [Website](https://mcmicro.org/overview/), [Paper](https://www.nature.com/articles/s41592-021-01308-y).
From 26913c121921775c3f6a7d8c5992b3ae37ec99a6 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Mon, 8 May 2023 14:50:18 +0200
Subject: [PATCH 024/152] Bioimaging and Bioimage Analysis Guide

---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index 40dbd55..a5bf465 100644
--- a/README.md
+++ b/README.md
@@ -219,6 +219,8 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal
 ##### Tutorials
 [Bio-image Analysis Notebooks](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/intro.html) - Large collection of image processing workflows, including [point-spread-function estimation](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/18a_deconvolution/extract_psf.html) and [deconvolution](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/18a_deconvolution/introduction_deconvolution.html), [3D cell segmentation](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/20_image_segmentation/Segmentation_3D.html), [feature extraction](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/22_feature_extraction/statistics_with_pyclesperanto.html) using [pyclesperanto](https://github.com/clEsperanto/pyclesperanto_prototype) and others.
 [python_for_microscopists](https://github.com/bnsreenu/python_for_microscopists) - Notebooks and associated [youtube channel](https://www.youtube.com/channel/UC34rW-HtPJulxr5wp2Xa04w/videos) for a variety of image processing tasks.
+[Bioimaging and Bioimage Analysis Guide](https://www.bioimagingguide.org/welcome.html)
+
 ##### Datasets
 [jump-cellpainting](https://github.com/jump-cellpainting/datasets) - Cellpainting dataset.
 [MedMNIST](https://github.com/MedMNIST/MedMNIST) - Datasets for 2D and 3D Biomedical Image Classification.
From b16fc22bcae0e17edc2e2926a21bc0cc40d9f246 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Mon, 8 May 2023 15:10:38 +0200
Subject: [PATCH 025/152] OMERO

---
 README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index a5bf465..4bb056b 100644
--- a/README.md
+++ b/README.md
@@ -205,9 +205,10 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal
 [Intro to Computer Vision](https://www.youtube.com/playlist?list=PLjMXczUzEYcHvw5YYSU92WrY8IwhTuq7p)

 #### Image Viewers
+[fiftyone](https://github.com/voxel51/fiftyone) - Viewer and tool for building high-quality datasets and computer vision models.
 [Fiji](https://fiji.sc/) - General purpose tool. Image viewer and image processing package.
 [napari](https://github.com/napari/napari) - Multi-dimensional image viewer.
-[fiftyone](https://github.com/voxel51/fiftyone) - Viewer and tool for building high-quality datasets and computer vision models.
+[OMERO](https://www.openmicroscopy.org/omero/) - Feature rich image viewer for high-content screening. [IDR](https://idr.openmicroscopy.org/) uses OMERO. [Intro](https://www.youtube.com/watch?v=nSCrMO_c-5s)

 #### Image Cleanup
 [DivNoising](https://github.com/juglab/DivNoising) - Unsupervised denoising method.
From 3c09717ffedf6840289662dab5baa69ca5fc84c7 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Wed, 10 May 2023 11:41:13 +0200
Subject: [PATCH 026/152] ipyflow

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 4bb056b..116fae6 100644
--- a/README.md
+++ b/README.md
@@ -18,6 +18,7 @@
 [General Jupyter Tricks](https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/)
 Fixing environment: [link](https://jakevdp.github.io/blog/2017/12/05/installing-python-packages-from-jupyter/)
 Python debugger (pdb) - [blog post](https://www.blog.pythonlibrary.org/2018/10/17/jupyter-notebook-debugging/), [video](https://www.youtube.com/watch?v=Z0ssNAbe81M&t=1h44m15s), [cheatsheet](https://nblock.org/2011/11/15/pdb-cheatsheet/)
+[ipyflow](https://github.com/ipyflow/ipyflow) - IPython kernel for Jupyter with additional features.
 [pyscaffold](https://github.com/pyscaffold/pyscaffold) - Python project template generator.
 [nteract](https://nteract.io/) - Open Jupyter Notebooks with doubleclick.
 [papermill](https://github.com/nteract/papermill) - Parameterize and execute Jupyter notebooks, [tutorial](https://pbpython.com/papermil-rclone-report-1.html).

From 13f1d96db8c7873b4c249d4f217443c3f5012ea9 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Wed, 10 May 2023 14:53:43 +0200
Subject: [PATCH 027/152] MEDIAR and cell segmentation datasets

---
 README.md | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 116fae6..9222597 100644
--- a/README.md
+++ b/README.md
@@ -227,7 +227,6 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal
 [jump-cellpainting](https://github.com/jump-cellpainting/datasets) - Cellpainting dataset.
 [MedMNIST](https://github.com/MedMNIST/MedMNIST) - Datasets for 2D and 3D Biomedical Image Classification.
 [CytoImageNet](https://github.com/stan-hua/CytoImageNet) - Huge diverse dataset like ImageNet but for cell images.
-[cellpose dataset](https://www.cellpose.org/dataset) - Cell images.
 [Haghighi](https://github.com/carpenterlab/2021_Haghighi_NatureMethods) - Gene Expression and Morphology Profiles.
 [broadinstitute/lincs-profiling-complementarity](https://github.com/broadinstitute/lincs-profiling-complementarity) - Cellpainting vs. L1000 assay.

@@ -250,6 +249,7 @@ REMBI model - Recommended Metadata for Biological Images

 ##### Segmentation
 [Overview](https://biomag-lab.github.io/microscopy-tree/) - Review of cell segmentation algorithms, [Paper](https://www.sciencedirect.com/science/article/abs/pii/S0962892421002518).
+[MEDIAR](https://github.com/Lee-Gihun/MEDIAR) - Cell segmentation.
 [cellpose](https://github.com/mouseland/cellpose) - Cell segmentation. [Paper](https://www.biorxiv.org/content/10.1101/2020.02.02.931238v1), [Dataset](https://www.cellpose.org/dataset).
 [stardist](https://github.com/stardist/stardist) - Cell segmentation with Star-convex Shapes.
 [UnMicst](https://github.com/HMS-IDAC/UnMicst) - Identifying Cells and Segmenting Tissue.
@@ -258,6 +258,12 @@ REMBI model - Recommended Metadata for Biological Images
 [Cell-ACDC](https://github.com/SchmollerLab/Cell_ACDC) - Python GUI for cell segmentation and tracking.
 [ZeroCostDL4Mic](https://github.com/HenriquesLab/ZeroCostDL4Mic/wiki) - Deep-Learning in Microscopy.

+##### Cell Segmentation Datasets
+[cellpose](https://www.cellpose.org/dataset) - Cell images.
+[omnipose](http://www.cellpose.org/dataset_omnipose) - Cell images.
+[LIVECell](https://github.com/sartorius-research/LIVECell) - Cell images.
+[Sartorius](https://www.kaggle.com/competitions/sartorius-cell-instance-segmentation/overview) - Neurons.
+
 ##### Packages
 [Awesome Cytodata](https://github.com/cytodata/awesome-cytodata)
 [BD Spectrum Viewer](https://www.bdbiosciences.com/en-us/resources/bd-spectrum-viewer) - Calculate spectral overlap, bleed through for fluorescence microscopy dyes.
From 7a9f6182baed99415e9b34622247ca29f958a31c Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Wed, 10 May 2023 19:56:12 +0200
Subject: [PATCH 028/152] Spring Cleaning

---
 README.md | 155 +++++++++++++-----------------------------------------
 1 file changed, 37 insertions(+), 118 deletions(-)

diff --git a/README.md b/README.md
index 9222597..a9c8609 100644
--- a/README.md
+++ b/README.md
@@ -4,100 +4,73 @@

 #### Core
 [pandas](https://pandas.pydata.org/) - Data structures built on top of [numpy](https://www.numpy.org/).
-[scikit-learn](https://scikit-learn.org/stable/) - Core ML library.
+[scikit-learn](https://scikit-learn.org/stable/) - Core ML library, [intelex](https://github.com/intel/scikit-learn-intelex).
 [matplotlib](https://matplotlib.org/) - Plotting library.
 [seaborn](https://seaborn.pydata.org/) - Data visualization library based on matplotlib.
-[datatile](https://github.com/polyaxon/datatile) - Basic statistics using `DataFrameSummary(df).summary()`.
-[pandas_profiling](https://github.com/pandas-profiling/pandas-profiling) - Descriptive statistics using `ProfileReport`.
+[ydata-profiling](https://github.com/ydataai/ydata-profiling) - Descriptive statistics using `ProfileReport`.
 [sklearn_pandas](https://github.com/scikit-learn-contrib/sklearn-pandas) - Helpful `DataFrameMapper` class.
 [missingno](https://github.com/ResidentMario/missingno) - Missing data visualization.
-[rainbow-csv](https://marketplace.visualstudio.com/items?itemName=mechatroner.rainbow-csv) - Plugin to display .csv files with nice colors.
-[xonsh](https://xon.sh) - Python-powered shell as alternative to Bash for simplifying data science automations.
+[rainbow-csv](https://marketplace.visualstudio.com/items?itemName=mechatroner.rainbow-csv) - VSCode plugin to display .csv files with nice colors.
+
+#### General Python Programming
+[more_itertools](https://more-itertools.readthedocs.io/en/latest/) - Extension of itertools.
+[tqdm](https://github.com/tqdm/tqdm) - Progress bars for for-loops. Also supports [pandas apply()](https://stackoverflow.com/a/34365537/1820480).
+[loguru](https://github.com/Delgan/loguru) - Python logging.
+[pyscaffold](https://github.com/pyscaffold/pyscaffold) - Python project template generator.
+[poetry](https://github.com/python-poetry/poetry) - Dependency management.
+[dateparser](https://github.com/scrapinghub/dateparser) - A better date parser.
+
+#### Pandas Tricks, Alternatives and Additions
+[pandasvault](https://github.com/firmai/pandasvault) - Large collection of pandas tricks.
+[polars](https://github.com/pola-rs/polars) - Multi-threaded alternative to pandas.
+[xarray](https://github.com/pydata/xarray/) - Extends pandas to n-dimensional arrays.
+[pandas_flavor](https://github.com/Zsailer/pandas_flavor) - Write custom accessors like `.str` and `.dt`.
+[duckdb](https://github.com/duckdb/duckdb) - Efficiently run SQL queries on pandas DataFrame.
+
+#### Pandas Parallelization
+[modin](https://github.com/modin-project/modin) - Parallelization library for faster pandas `DataFrame`.
+[vaex](https://github.com/vaexio/vaex) - Out-of-Core DataFrames.
+[pandarallel](https://github.com/nalepae/pandarallel) - Parallelize pandas operations.
+[swifter](https://github.com/jmcarpenter2/swifter) - Apply any function to a pandas DataFrame faster.

 #### Environment and Jupyter
-[General Jupyter Tricks](https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/)
-Fixing environment: [link](https://jakevdp.github.io/blog/2017/12/05/installing-python-packages-from-jupyter/)
-Python debugger (pdb) - [blog post](https://www.blog.pythonlibrary.org/2018/10/17/jupyter-notebook-debugging/), [video](https://www.youtube.com/watch?v=Z0ssNAbe81M&t=1h44m15s), [cheatsheet](https://nblock.org/2011/11/15/pdb-cheatsheet/)
+[Jupyter Tricks](https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/)
 [ipyflow](https://github.com/ipyflow/ipyflow) - IPython kernel for Jupyter with additional features.
-[pyscaffold](https://github.com/pyscaffold/pyscaffold) - Python project template generator.
 [nteract](https://nteract.io/) - Open Jupyter Notebooks with doubleclick.
 [papermill](https://github.com/nteract/papermill) - Parameterize and execute Jupyter notebooks, [tutorial](https://pbpython.com/papermil-rclone-report-1.html).
 [nbdime](https://github.com/jupyter/nbdime) - Diff two notebook files, Alternative GitHub App: [ReviewNB](https://www.reviewnb.com/).
 [RISE](https://github.com/damianavila/RISE) - Turn Jupyter notebooks into presentations.
 [qgrid](https://github.com/quantopian/qgrid) - Pandas `DataFrame` sorting.
-[pivottablejs](https://github.com/nicolaskruchten/jupyter_pivottablejs) - Drag n drop Pivot Tables and Charts for Jupyter notebooks.
+[lux](https://github.com/lux-org/lux) - DataFrame visualization within Jupyter.
+[pandasgui](https://github.com/adamerose/pandasgui) - GUI for viewing, plotting and analyzing Pandas DataFrames.
+[dtale](https://github.com/man-group/dtale) - View and analyze Pandas data structures, integrating with Jupyter.
 [itables](https://github.com/mwouts/itables) - Interactive tables in Jupyter.
-[jupyter-datatables](https://github.com/CermakM/jupyter-datatables) - Interactive tables in Jupyter.
-[debugger](https://blog.jupyter.org/a-visual-debugger-for-jupyter-914e61716559) - Visual debugger for Jupyter.
-[nbcommands](https://github.com/vinayak-mehta/nbcommands) - View and search notebooks from terminal.
 [handcalcs](https://github.com/connorferster/handcalcs) - More convenient way of writing mathematical equations in Jupyter.
 [notebooker](https://github.com/man-group/notebooker) - Productionize and schedule Jupyter Notebooks.
 [bamboolib](https://github.com/tkrabel/bamboolib) - Intuitive GUI for tables.
 [voila](https://github.com/QuantStack/voila) - Turn Jupyter notebooks into standalone web applications.
 [voila-gridstack](https://github.com/voila-dashboards/voila-gridstack) - Voila grid layout.

-#### Pandas Tricks, Alternatives and Additions
-[Pandas Tricks](https://towardsdatascience.com/5-lesser-known-pandas-tricks-e8ab1dd21431)
-[Using df.pipe() (video)](https://www.youtube.com/watch?v=yXGCKqo5cEY)
-[pandasvault](https://github.com/firmai/pandasvault) - Large collection of pandas tricks.
-[modin](https://github.com/modin-project/modin) - Parallelization library for faster pandas `DataFrame`.
-[vaex](https://github.com/vaexio/vaex) - Out-of-Core DataFrames.
-[pandarallel](https://github.com/nalepae/pandarallel) - Parallelize pandas operations.
-[xarray](https://github.com/pydata/xarray/) - Extends pandas to n-dimensional arrays.
-[swifter](https://github.com/jmcarpenter2/swifter) - Apply any function to a pandas DataFrame faster.
-[pandas_flavor](https://github.com/Zsailer/pandas_flavor) - Write custom accessors like `.str` and `.dt`.
-[pandas-log](https://github.com/eyaltrabelsi/pandas-log) - Find business logic issues and performance issues in pandas.
-[lux](https://github.com/lux-org/lux) - DataFrame visualization within Jupyter.
-[dtale](https://github.com/man-group/dtale) - View and analyze Pandas data structures, integrating with Jupyter.
-[polars](https://github.com/pola-rs/polars) - Multi-threaded alternative to pandas.
-[duckdb](https://github.com/duckdb/duckdb) - Efficiently run SQL queries on pandas DataFrame.
-
-#### Scikit-Learn Alternatives
-[scikit-learn-intelex](https://github.com/intel/scikit-learn-intelex) - Intel extension for scikit-learn for speed.
-
-#### Helpful
-[drawdata](https://github.com/koaning/drawdata) - Quickly draw some points and export them as csv, [website](https://drawdata.xyz/).
-[tqdm](https://github.com/tqdm/tqdm) - Progress bars for for-loops. Also supports [pandas apply()](https://stackoverflow.com/a/34365537/1820480).
-[icecream](https://github.com/gruns/icecream) - Simple debugging output.
-[loguru](https://github.com/Delgan/loguru) - Python logging.
-[pyprojroot](https://github.com/chendaniely/pyprojroot) - Helpful `here()` command from R.
-[intake](https://github.com/intake/intake) - Loading datasets made easier, [talk](https://www.youtube.com/watch?v=s7Ww5-vD2Os&t=33m40s).
-
 #### Extraction
 [textract](https://github.com/deanmalmgren/textract) - Extract text from any document.
-[camelot](https://github.com/socialcopsdev/camelot) - Extract text from PDF.

 #### Big Data
 [spark](https://docs.databricks.com/spark/latest/dataframes-datasets/introduction-to-dataframes-python.html#work-with-dataframes) - `DataFrame` for big data, [cheatsheet](https://gist.github.com/crawles/b47e23da8218af0b9bd9d47f5242d189), [tutorial](https://github.com/ericxiao251/spark-syntax).
-[sparkit-learn](https://github.com/lensacom/sparkit-learn), [spark-deep-learning](https://github.com/databricks/spark-deep-learning) - ML frameworks for spark.
-[koalas](https://github.com/databricks/koalas) - Pandas API on Apache Spark.
 [dask](https://github.com/dask/dask), [dask-ml](http://ml.dask.org/) - Pandas `DataFrame` for big data and machine learning library, [resources](https://matthewrocklin.com/blog//work/2018/07/17/dask-dev), [talk1](https://www.youtube.com/watch?v=ccfsbuqsjgI), [talk2](https://www.youtube.com/watch?v=RA_2qdipVng), [notebooks](https://github.com/dask/dask-ec2/tree/master/notebooks), [videos](https://www.youtube.com/user/mdrocklin).
-[dask-gateway](https://github.com/jcrist/dask-gateway) - Managing dask clusters.
-[turicreate](https://github.com/apple/turicreate) - Helpful `SFrame` class for out-of-memory dataframes.
 [h2o](https://github.com/h2oai/h2o-3) - Helpful `H2OFrame` class for out-of-memory dataframes.
 [datatable](https://github.com/h2oai/datatable) - Data Table for big data support.
 [cuDF](https://github.com/rapidsai/cudf) - GPU DataFrame Library, [Intro](https://www.youtube.com/watch?v=6XzS5XcpicM&t=2m50s).
+[cupy](https://github.com/cupy/cupy) - NumPy-like API accelerated with CUDA.
 [ray](https://github.com/ray-project/ray/) - Flexible, high-performance distributed execution framework.
-[mars](https://github.com/mars-project/mars) - Tensor-based unified framework for large-scale data computation.
 [bottleneck](https://github.com/kwgoodman/bottleneck) - Fast NumPy array functions written in C.
-[bolz](https://github.com/Blosc/bcolz) - A columnar data container that can be compressed.
-[cupy](https://github.com/cupy/cupy) - NumPy-like API accelerated with CUDA.
 [petastorm](https://github.com/uber/petastorm) - Data access library for parquet files by Uber.
 [zarr](https://github.com/zarr-developers/zarr-python) - Distributed NumPy arrays.
 [NVTabular](https://github.com/NVIDIA/NVTabular) - Feature engineering and preprocessing library for tabular data by Nvidia.
 [tensorstore](https://github.com/google/tensorstore) - Reading and writing large multi-dimensional arrays (Google).

-#### Distributed Systems
-[nextflow](https://github.com/goodwright/nextflow.py) - Run scripts and workflow graphs in Docker image using Google Life Sciences, AWS Batch, [Website](https://github.com/nextflow-io/nextflow).
-[dsub](https://github.com/DataBiosphere/dsub) - Run batch computing tasks in Docker image in the Google Cloud.
-
 #### Command line tools, CSV
-[ni](https://github.com/spencertipping/ni) - Command line tool for big data.
-[xsv](https://github.com/BurntSushi/xsv) - Command line tool for indexing, slicing, analyzing, splitting and joining CSV files.
-[csvkit](https://csvkit.readthedocs.io/en/1.0.3/) - Another command line tool for CSV files.
+[csvkit](https://github.com/wireservice/csvkit) - Command line tool for CSV files.
 [csvsort](https://pypi.org/project/csvsort/) - Sort large csv files.
-[tsv-utils](https://github.com/eBay/tsv-utils) - Tools for working with CSV files by eBay.
-[cheat](https://github.com/cheat/cheat) - Make cheatsheets for command line commands.

 #### Classical Statistics

@@ -121,7 +94,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal
 [torch-two-sample](https://github.com/josipd/torch-two-sample) - Friedman-Rafsky Test: Compare two population based on a multivariate generalization of the Runstest. [Explanation](https://www.real-statistics.com/multivariate-statistics/multivariate-normal-distribution/friedman-rafsky-test/), [Application](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5014134/)

 ##### Interim Analyses / Sequential Analysis / Stopping
- [Sequential Analysis](https://en.wikipedia.org/wiki/Sequential_analysis) - Wikipedia.
+[Sequential Analysis](https://en.wikipedia.org/wiki/Sequential_analysis) - Wikipedia.
 [Treatment Effects Monitoring](https://online.stat.psu.edu/stat509/node/75/) - Design and Analysis of Clinical Trials PennState.
 [sequential](https://cran.r-project.org/web/packages/Sequential/Sequential.pdf) - Exact Sequential Analysis for Poisson and Binomial Data (R package).
 [confseq](https://github.com/gostevehoward/confseq) - Uniform boundaries, confidence sequences, and always-valid p-values.
@@ -167,17 +140,13 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal

 #### Exploration and Cleaning
 [Checklist](https://github.com/r0f1/ml_checklist).
-[pandasgui](https://github.com/adamerose/pandasgui) - GUI for viewing, plotting and analyzing Pandas DataFrames.
-[janitor](https://pyjanitor.readthedocs.io/) - Clean messy column names.
+[pyjanitor](https://github.com/pyjanitor-devs/pyjanitor) - Clean messy column names.
 [pandera](https://github.com/unionai-oss/pandera) - Data / Schema validation.
 [impyute](https://github.com/eltonlaw/impyute) - Imputations.
 [fancyimpute](https://github.com/iskandr/fancyimpute) - Matrix completion and imputation algorithms.
 [imbalanced-learn](https://github.com/scikit-learn-contrib/imbalanced-learn) - Resampling for imbalanced datasets.
 [tspreprocess](https://github.com/MaxBenChrist/tspreprocess) - Time series preprocessing: Denoising, Compression, Resampling.
 [Kaggler](https://github.com/jeongyoonlee/Kaggler) - Utility functions (`OneHotEncoder(min_obs=100)`)
-[pyupset](https://github.com/ImSoErgodic/py-upset) - Visualizing intersecting sets.
-[pyemd](https://github.com/wmayner/pyemd) - Earth Mover's Distance / Wasserstein distance, similarity between histograms. [OpenCV implementation](https://docs.opencv.org/3.4/d6/dc7/group__imgproc__hist.html), [POT implementation](https://pythonot.github.io/auto_examples/plot_OT_2D_samples.html)
-[littleballoffur](https://github.com/benedekrozemberczki/littleballoffur) - Sampling from graphs.

 #### Noisy Labels
 [cleanlab](https://github.com/cleanlab/cleanlab) - Machine learning with noisy labels, finding mislabelled data, and uncertainty quantification. Also see awesome list below.
@@ -187,11 +156,11 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal
 [iterative-stratification](https://github.com/trent-b/iterative-stratification) - Stratification of multilabel data.

 #### Feature Engineering
-[Talk](https://www.youtube.com/watch?v=68ABAU_V8qI)
+[Vincent Warmerdam: Untitled12.ipynb](https://www.youtube.com/watch?v=yXGCKqo5cEY) - Using df.pipe()
+[Vincent Warmerdam: Winning with Simple, even Linear, Models](https://www.youtube.com/watch?v=68ABAU_V8qI)
 [sklearn](https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html) - Pipeline, [examples](https://github.com/jem1031/pandas-pipelines-custom-transformers).
 [pdpipe](https://github.com/shaypal5/pdpipe) - Pipelines for DataFrames.
 [scikit-lego](https://github.com/koaning/scikit-lego) - Custom transformers for pipelines.
-[skoot](https://github.com/tgsmith61591/skoot) - Pipeline helper functions.
 [categorical-encoding](https://github.com/scikit-learn-contrib/categorical-encoding) - Categorical encoding of variables, [vtreat (R package)](https://cran.r-project.org/web/packages/vtreat/vignettes/vtreat.html).
 [dirty_cat](https://github.com/dirty-cat/dirty_cat) - Encoding dirty categorical variables.
 [patsy](https://github.com/pydata/patsy/) - R-like syntax for statistical models.
@@ -200,7 +169,6 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal
 [tsfresh](https://github.com/blue-yonder/tsfresh) - Time series feature engineering.
 [pypeln](https://github.com/cgarciae/pypeln) - Concurrent data pipelines.
 [feature_engine](https://github.com/solegalli/feature_engine) - Encoders, transformers, etc.
-[NVTabular](https://github.com/NVIDIA/NVTabular) - Feature engineering and preprocessing library for tabular data by Nvidia.

 #### Computer Vision
 [Intro to Computer Vision](https://www.youtube.com/playlist?list=PLjMXczUzEYcHvw5YYSU92WrY8IwhTuq7p)
@@ -214,7 +182,6 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal
 #### Image Cleanup
 [DivNoising](https://github.com/juglab/DivNoising) - Unsupervised denoising method.
 [aydin](https://github.com/royerlab/aydin) - Image denoising.
-[unprocessing](https://github.com/timothybrooks/unprocessing) - Image denoising by reverting the image processing pipeline.

 #### Microscopy / Segmentation

@@ -373,10 +340,6 @@ Faster t-SNE implementations: [lvdmaaten](https://lvdmaaten.github.io/tsne/), [M
 [linearsdr](https://github.com/HarrisQ/linearsdr) - Linear Sufficient Dimension Reduction (R package).
 [PHATE](https://github.com/KrishnaswamyLab/PHATE) - Tool for visualizing high dimensional data.

-#### Training-related
-[iterative-stratification](https://github.com/trent-b/iterative-stratification) - Cross validators with stratification for multilabel data.
-[livelossplot](https://github.com/stared/livelossplot) - Live training loss plot in Jupyter Notebook.
-
 #### Visualization
 [All charts](https://datavizproject.com/), [Austrian monuments](https://github.com/njanakiev/austrian-monuments-visualization).
 [Better heatmaps and correlation plots](https://towardsdatascience.com/better-heatmaps-and-correlation-matrix-plots-in-python-41445d0f2bec).
@@ -455,12 +418,10 @@ Predict economic indicators from Open Street Map [ipynb](https://github.com/njan
 #### Recommender Systems
 Examples: [1](https://lazyprogrammer.me/tutorial-on-collaborative-filtering-and-matrix-factorization-in-python/), [2](https://medium.com/@james_aka_yale/the-4-recommendation-engines-that-can-predict-your-movie-tastes-bbec857b8223), [2-ipynb](https://github.com/khanhnamle1994/movielens/blob/master/Content_Based_and_Collaborative_Filtering_Models.ipynb), [3](https://www.kaggle.com/morrisb/how-to-recommend-anything-deep-recommender).
 [surprise](https://github.com/NicolasHug/Surprise) - Recommender, [talk](https://www.youtube.com/watch?v=d7iIb_XVkZs).
-[turicreate](https://github.com/apple/turicreate) - Recommender.
 [implicit](https://github.com/benfred/implicit) - Fast Collaborative Filtering for Implicit Feedback Datasets.
 [spotlight](https://github.com/maciejkula/spotlight) - Deep recommender models using PyTorch.
 [lightfm](https://github.com/lyst/lightfm) - Recommendation algorithms for both implicit and explicit feedback.
 [funk-svd](https://github.com/gbolmier/funk-svd) - Fast SVD.
-[pywFM](https://github.com/jfloff/pywFM) - Factorization.

 #### Decision Tree Models
 [Intro to Decision Trees and Random Forests](https://victorzhou.com/blog/intro-to-random-forests/), Intro to Gradient Boosting [1](https://explained.ai/gradient-boosting/), [2](https://www.gormanalysis.com/blog/gradient-boosting-explained/), [Decision Tree Visualization](https://explained.ai/decision-tree-viz/index.html)
@@ -468,22 +429,15 @@ Examples: [1](https://lazyprogrammer.me/tutorial-on-collaborative-filtering-and-
 [xgboost](https://github.com/dmlc/xgboost) - Gradient boosting (GBDT, GBRT or GBM) library, [doc](https://sites.google.com/view/lauraepp/parameters), Methods for CIs: [link1](https://stats.stackexchange.com/questions/255783/confidence-interval-for-xgb-forecast), [link2](https://towardsdatascience.com/regression-prediction-intervals-with-xgboost-428e0a018b).
 [catboost](https://github.com/catboost/catboost) - Gradient boosting.
 [h2o](https://github.com/h2oai/h2o-3) - Gradient boosting and general machine learning framework.
-[snapml](https://www.zurich.ibm.com/snapml/) - Gradient boosting and general machine learning framework by IBM, for CPU and GPU. [PyPI](https://pypi.org/project/snapml/)
 [pycaret](https://github.com/pycaret/pycaret) - Wrapper for xgboost, lightgbm, catboost etc.
-[thundergbm](https://github.com/Xtra-Computing/thundergbm) - GBDTs and Random Forest.
-[h2o](https://github.com/h2oai/h2o-3) - Gradient boosting.
 [forestci](https://github.com/scikit-learn-contrib/forest-confidence-interval) - Confidence intervals for random forests.
-[scikit-garden](https://github.com/scikit-garden/scikit-garden) - Quantile Regression.
 [grf](https://github.com/grf-labs/grf) - Generalized random forest.
 [dtreeviz](https://github.com/parrt/dtreeviz) - Decision tree visualization and model interpretation.
 [Nuance](https://github.com/SauceCat/Nuance) - Decision tree visualization.
 [rfpimp](https://github.com/parrt/random-forest-importances) - Feature Importance for RandomForests using Permuation Importance.
 Why the default feature importance for random forests is wrong: [link](http://explained.ai/rf-importance/index.html)
-[treeinterpreter](https://github.com/andosa/treeinterpreter) - Interpreting scikit-learn's decision tree and random forest predictions.
 [bartpy](https://github.com/JakeColtman/bartpy) - Bayesian Additive Regression Trees.
-[infiniteboost](https://github.com/arogozhnikov/infiniteboost) - Combination of RFs and GBDTs.
 [merf](https://github.com/manifoldai/merf) - Mixed Effects Random Forest for Clustering, [video](https://www.youtube.com/watch?v=gWj4ZwB7f3o)
-[rrcf](https://github.com/kLabUM/rrcf) - Robust Random Cut Forest algorithm for anomaly detection on streams.
 [groot](https://github.com/tudelft-cda-lab/GROOT) - Robust decision trees.
 [linear-tree](https://github.com/cerlymarco/linear-tree) - Trees with linear models at the leaves.

@@ -504,14 +458,10 @@ Embeddings - [GloVe](https://nlp.stanford.edu/projects/glove/) ([[1](https://www
 [infomap](https://github.com/mapequation/infomap) - Cluster (word-)vectors to find topics, [example](https://github.com/mapequation/infomap/blob/master/examples/python/infomap-examples.ipynb).
 [datasketch](https://github.com/ekzhu/datasketch) - Probabilistic data structures for large data (MinHash, HyperLogLog).
 [flair](https://github.com/zalandoresearch/flair) - NLP Framework by Zalando.
-[stanfordnlp](https://github.com/stanfordnlp/stanfordnlp) - NLP Library.
+[stanza](https://github.com/stanfordnlp/stanza) - NLP Library.
 [Chatistics](https://github.com/MasterScrat/Chatistics) - Turn Messenger, Hangouts, WhatsApp and Telegram chat logs into DataFrames.
-[textvec](https://github.com/textvec/textvec) - Supervised text vectorization tool.
 [textdistance](https://github.com/life4/textdistance) - Collection for comparing distances between two or more sequences.

-##### Papers
-[Search Engine Correlation](https://arxiv.org/pdf/1107.2691.pdf)
-
 #### Biology / Bioinformatics

 ##### Assay
@@ -538,8 +488,6 @@ See also Microscopy Section above.
 [scimap](https://github.com/labsyspharm/scimap) - Spatial Single-Cell Analysis Toolkit.
 [CellProfiler](https://github.com/CellProfiler/CellProfiler) - Biological image analysis.
 [imglyb](https://github.com/imglib/imglyb) - Viewer for large images, [talk](https://www.youtube.com/watch?v=Ddo5z5qGMb8), [slides](https://github.com/hanslovsky/scipy-2019/blob/master/scipy-2019-imglyb.pdf).
-[microscopium](https://github.com/microscopium/microscopium) - Unsupervised clustering of images + viewer, [talk](https://www.youtube.com/watch?v=ytEQl9xs8FQ).
-[cytokit](https://github.com/hammerlab/cytokit) - Analyzing properties of cells in fluorescent microscopy datasets.

 ##### Drug discovery
 [TDC](https://github.com/mims-harvard/TDC/tree/main) - Drug Discovery and Development.
@@ -651,7 +599,6 @@ Feature Visualization: [Blog](https://distill.pub/2017/feature-visualization/),

 ##### Image Annotation
 [cvat](https://github.com/openvinotoolkit/cvat) - Image annotation tool.
-[pigeon](https://github.com/agermanidis/pigeon) - Create annotations from within a Jupyter notebook.

 ##### Image Classification
 [nfnets](https://github.com/ypeleg/nfnets-keras) - Neural network.
@@ -904,6 +851,7 @@ Distances for comparing histograms and detecting outliers - [Talk](https://www.y [banpei](https://github.com/tsurubee/banpei) - Anomaly detection library based on singular spectrum transformation. [telemanom](https://github.com/khundman/telemanom) - Detect anomalies in multivariate time series data using LSTMs. [luminaire](https://github.com/zillow/luminaire) - Anomaly Detection for time series. +[rrcf](https://github.com/kLabUM/rrcf) - Robust Random Cut Forest algorithm for anomaly detection on streams. #### Concept Drift & Domain Shift [TorchDrift](https://github.com/TorchDrift/TorchDrift) - Drift Detection for PyTorch Models. @@ -915,9 +863,6 @@ Distances for comparing histograms and detecting outliers - [Talk](https://www.y #### Ranking [lightning](https://github.com/scikit-learn-contrib/lightning) - Large-scale linear classification, regression and ranking. -#### Scoring -[SLIM](https://github.com/ustunb/slim-python) - Scoring systems for classification, Supersparse linear integer models. - #### Causal Inference [CS 594 Causal Inference and Learning](https://www.cs.uic.edu/~elena/courses/fall19/cs594cil.html) [Statistical Rethinking](https://github.com/rmcelreath/stat_rethinking_2022) - Video Lecture Series, Bayesian Statistics, Causal Models, [R](https://bookdown.org/content/4857/), [python](https://github.com/pymc-devs/resources/tree/master/Rethinking_2), [numpyro1](https://github.com/asuagar/statrethink-course-numpyro-2019), [numpyro2](https://fehiepsi.github.io/rethinking-numpyro/), [tensorflow-probability](https://github.com/ksachdeva/rethinking-tensorflow-probability). @@ -975,10 +920,6 @@ Plotting learning curve: [link](http://www.ritchieng.com/machinelearning-learnin [awesome-conformal-prediction](https://github.com/valeman/awesome-conformal-prediction) - Uncertainty quantification. 
[uncertainty-toolbox](https://github.com/uncertainty-toolbox/uncertainty-toolbox) - Predictive uncertainty quantification, calibration, metrics, and visualization. -#### Interpretable Classifiers and Regressors -[skope-rules](https://github.com/scikit-learn-contrib/skope-rules) - Interpretable classifier, IF-THEN rules. -[sklearn-expertsys](https://github.com/tmadl/sklearn-expertsys) - Interpretable classifiers, Bayesian Rule List classifier. - #### Model Explanation, Interpretability, Feature Importance [Princeton - Reproducibility Crisis in ML‑based Science](https://sites.google.com/princeton.edu/rep-workshop) [Book](https://christophm.github.io/interpretable-ml-book/agnostic.html), [Examples](https://github.com/jphall663/interpretable_machine_learning_with_python) @@ -992,9 +933,6 @@ Plotting learning curve: [link](http://www.ritchieng.com/machinelearning-learnin [pycebox](https://github.com/AustinRochford/PyCEbox) - Individual Conditional Expectation Plot Toolbox. [pdpbox](https://github.com/SauceCat/PDPbox) - Partial dependence plot toolbox, [example](https://www.kaggle.com/dansbecker/partial-plots). [partial_dependence](https://github.com/nyuvis/partial_dependence) - Visualize and cluster partial dependence. -[skater](https://github.com/datascienceinc/Skater) - Unified framework to enable model interpretation. -[anchor](https://github.com/marcotcr/anchor) - High-Precision Model-Agnostic Explanations for classifiers. -[l2x](https://github.com/Jianbo-Lab/L2X) - Instancewise feature selection as methodology for model interpretation. [contrastive_explanation](https://github.com/MarcelRobeer/ContrastiveExplanation) - Contrastive explanations. [DrWhy](https://github.com/ModelOriented/DrWhy) - Collection of tools for explainable AI. [lucid](https://github.com/tensorflow/lucid) - Neural network interpretability. 
@@ -1009,10 +947,8 @@ Plotting learning curve: [link](http://www.ritchieng.com/machinelearning-learnin #### Automated Machine Learning [AdaNet](https://github.com/tensorflow/adanet) - Automated machine learning based on TensorFlow. [tpot](https://github.com/EpistasisLab/tpot) - Automated machine learning tool, optimizes machine learning pipelines. -[auto_ml](https://github.com/ClimbsRocks/auto_ml) - Automated machine learning for analytics & production. [autokeras](https://github.com/jhfjhfj1/autokeras) - AutoML for deep learning. [nni](https://github.com/Microsoft/nni) - Toolkit for neural architecture search and hyper-parameter tuning by Microsoft. -[automl-gs](https://github.com/minimaxir/automl-gs) - Automated machine learning. [mljar](https://github.com/mljar/mljar-supervised) - Automated machine learning. [automl_zero](https://github.com/google-research/google-research/tree/master/automl_zero) - Automatically discover computer programs that can solve machine learning tasks from Google. [AlphaPy](https://github.com/ScottfreeLLC/AlphaPy) - Automated Machine Learning using scikit-learn xgboost, LightGBM and others. @@ -1056,7 +992,6 @@ Optometrist algorithm - [paper](https://www.nature.com/articles/s41598-017-06645 sklearn - [PassiveAggressiveClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.PassiveAggressiveClassifier.html), [PassiveAggressiveRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.PassiveAggressiveRegressor.html). [river](https://github.com/online-ml/river) - Online machine learning. [Kaggler](https://github.com/jeongyoonlee/Kaggler) - Online Learning algorithms. -[onelearn](https://github.com/onelearn/onelearn) - Online Random Forests. 
#### Active Learning [Talk](https://www.youtube.com/watch?v=0efyjq5rWS4) @@ -1072,6 +1007,7 @@ AlphaZero methodology - [1](https://github.com/AppliedDataSciencePartners/DeepRe #### Deployment and Lifecycle Management ##### Workflow Scheduling and Orchestration +[nextflow](https://github.com/goodwright/nextflow.py) - Run scripts and workflow graphs in Docker image using Google Life Sciences, AWS Batch, [Website](https://github.com/nextflow-io/nextflow). [airflow](https://github.com/apache/airflow) - Schedule and monitor workflows. [prefect](https://github.com/PrefectHQ/prefect) - Python specific workflow scheduling. [dagster](https://github.com/dagster-io/dagster) - Development, production and observation of data assets. @@ -1085,9 +1021,6 @@ AlphaZero methodology - [1](https://github.com/AppliedDataSciencePartners/DeepRe [Optimize Docker Image Size](https://www.augmentedmind.de/2022/02/06/optimize-docker-image-size/) [cog](https://github.com/replicate/cog) - Facilitates building Docker images. -##### Dependency Management -[poetry](https://github.com/python-poetry/poetry) - Dependency management. - ##### Data Versioning, Databases, Pipelines and Model Serving [dvc](https://github.com/iterative/dvc) - Version control for large files. [kedro](https://github.com/quantumblacklabs/kedro) - Build data pipelines. @@ -1119,20 +1052,6 @@ Gilbert Strang - [Linear Algebra](https://ocw.mit.edu/courses/mathematics/18-06- Gilbert Strang - [Matrix Methods in Data Analysis, Signal Processing, and Machine Learning ](https://ocw.mit.edu/courses/mathematics/18-065-matrix-methods-in-data-analysis-signal-processing-and-machine-learning-spring-2018/) -#### Other -[daft](https://github.com/dfm/daft) - Render probabilistic graphical models using matplotlib. -[unyt](https://github.com/yt-project/unyt) - Working with units. -[scrapy](https://github.com/scrapy/scrapy) - Web scraping library. -[VowpalWabbit](https://github.com/VowpalWabbit/vowpal_wabbit) - ML Toolkit from Microsoft. 
-[Python Record Linkage Toolkit](https://github.com/J535D165/recordlinkage) - link records in or between data sources. - -#### General Python Programming -[more_itertools](https://more-itertools.readthedocs.io/en/latest/) - Extension of itertools. -[funcy](https://github.com/Suor/funcy) - Fancy and practical functional tools. -[dateparser](https://dateparser.readthedocs.io/en/latest/) - A better date parser. -[jellyfish](https://github.com/jamesturk/jellyfish) - Approximate string matching. -[coloredlogs](https://github.com/xolox/python-coloredlogs) - Colored logging output. - #### Resources [Distill.pub](https://distill.pub/) - Blog. [Machine Learning Videos](https://github.com/dustinvtran/ml-videos) From e3b8592236cea423648e09a06d4cc21a37099352 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Wed, 10 May 2023 19:58:40 +0200 Subject: [PATCH 029/152] hydra --- README.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index a9c8609..a481232 100644 --- a/README.md +++ b/README.md @@ -16,9 +16,10 @@ [more_itertools](https://more-itertools.readthedocs.io/en/latest/) - Extension of itertools. [tqdm](https://github.com/tqdm/tqdm) - Progress bars for for-loops. Also supports [pandas apply()](https://stackoverflow.com/a/34365537/1820480). [loguru](https://github.com/Delgan/loguru) - Python logging. +[dateparser](https://github.com/scrapinghub/dateparser) - A better date parser. +[hydra](https://github.com/facebookresearch/hydra) - Configuration management. [pyscaffold](https://github.com/pyscaffold/pyscaffold) - Python project template generator. [poetry](https://github.com/python-poetry/poetry) - Dependency management. -[dateparser](https://github.com/scrapinghub/dateparser) - A better date parser. #### Pandas Tricks, Alternatives and Additions [pandasvault](https://github.com/firmai/pandasvault) - Large collection of pandas tricks. 
From 1e1d6408d9deccbea91047646a4a887928e51e07 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Thu, 11 May 2023 13:37:03 +0200
Subject: [PATCH 030/152] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index a481232..21bbe41 100644
--- a/README.md
+++ b/README.md
@@ -217,7 +217,7 @@ REMBI model - Recommended Metadata for Biological Images
 
 ##### Segmentation
 [Overview](https://biomag-lab.github.io/microscopy-tree/) - Review of cell segmentation algorithms, [Paper](https://www.sciencedirect.com/science/article/abs/pii/S0962892421002518).
-[MEDIAR](https://github.com/Lee-Gihun/MEDIAR) - Cell segmentation.
+[MEDIAR](https://github.com/Lee-Gihun/MEDIAR) - Cell segmentation.
 [cellpose](https://github.com/mouseland/cellpose) - Cell segmentation. [Paper](https://www.biorxiv.org/content/10.1101/2020.02.02.931238v1), [Dataset](https://www.cellpose.org/dataset).
 [stardist](https://github.com/stardist/stardist) - Cell segmentation with Star-convex Shapes.
 [UnMicst](https://github.com/HMS-IDAC/UnMicst) - Identifying Cells and Segmenting Tissue.

From acea859443431edded51afd969396ec479c0bd08 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Thu, 11 May 2023 15:17:57 +0200
Subject: [PATCH 031/152] quartets

---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index 21bbe41..76b76a9 100644
--- a/README.md
+++ b/README.md
@@ -130,6 +130,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal
 [Wainer - The Most Dangerous Equation](http://www-stat.wharton.upenn.edu/~hwainer/Readings/Most%20Dangerous%20eqn.pdf)
 [Gigerenzer - The Bias Bias in Behavioral Economics](https://www.nowpublishers.com/article/Details/RBE-0092)
 [Cook - Estimating the chances of something that hasn’t happened yet](https://www.johndcook.com/blog/2010/03/30/statistical-rule-of-three/)
+[Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing](https://www.researchgate.net/publication/316652618_Same_Stats_Different_Graphs_Generating_Datasets_with_Varied_Appearance_and_Identical_Statistics_through_Simulated_Annealing), [Youtube](https://www.youtube.com/watch?v=DbJyPELmhJc)
 
 #### Epidemiology
 [R Epidemics Consortium](https://www.repidemicsconsortium.org/projects/) - Large tool suite for working with epidemiological data (R packages). [Github](https://github.com/reconhub)
@@ -138,6 +139,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal
 [researchpy](https://github.com/researchpy/researchpy) - Helpful `summary_cont()` function for summary statistics (Table 1).
 [zEpid](https://github.com/pzivich/zEpid) - Epidemiology analysis package, [Tutorial](https://github.com/pzivich/Python-for-Epidemiologists).
 [tipr](https://github.com/LucyMcGowan/tipr) - Sensitivity analyses for unmeasured confounders (R package).
+[quartets](https://github.com/r-causal/quartets) - Anscombe’s Quartet, Causal Quartet, [Datasaurus Dozen](https://github.com/jumpingrivers/datasauRus) and others (R package).
 
 #### Exploration and Cleaning
 [Checklist](https://github.com/r0f1/ml_checklist).

From 5c58cd0b2132139d37feba7bcea1b3a8dc9363d0 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Sun, 14 May 2023 08:07:22 +0200
Subject: [PATCH 032/152] fractal

---
 README.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 76b76a9..91abcce 100644
--- a/README.md
+++ b/README.md
@@ -211,7 +211,10 @@ REMBI model - Recommended Metadata for Biological Images
  * BioImage Archive: [Study Component Guidance](https://www.ebi.ac.uk/bioimage-archive/rembi-help-examples/), [File List Guide](https://www.ebi.ac.uk/bioimage-archive/help-file-list/)
  * [paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8606015/), [video](https://www.youtube.com/watch?v=GVmfOpuP2_c), [spreadsheet](https://docs.google.com/spreadsheets/d/1Ck1NeLp-ZN4eMGdNYo2nV6KLEdSfN6oQBKnnWU6Npeo/edit#gid=1023506919)
 
-##### Labsyspharm
+##### Platforms and Pipelines
+[fractal](https://fractal-analytics-platform.github.io/) - Framework to process high-content imaging data.
+
+###### Labsyspharm
 [mcmicro](https://github.com/labsyspharm/mcmicro) - Multiple-choice microscopy pipeline, [Website](https://mcmicro.org/overview/), [Paper](https://www.nature.com/articles/s41592-021-01308-y).
 [MCQuant](https://github.com/labsyspharm/quantification) - Quantification of cell features.
 [cylinter](https://github.com/labsyspharm/cylinter) - Quality assurance for microscopy images, [Website](https://labsyspharm.github.io/cylinter/).

From 3148411885676f12bcf4fc85387720bf8e230218 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sun, 14 May 2023 14:53:33 +0200 Subject: [PATCH 033/152] Update README.md --- README.md | 216 ++++++++++++++++++++++++++---------------------------- 1 file changed, 104 insertions(+), 112 deletions(-) diff --git a/README.md b/README.md index 91abcce..d1efea0 100644 --- a/README.md +++ b/README.md @@ -176,98 +176,6 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal #### Computer Vision [Intro to Computer Vision](https://www.youtube.com/playlist?list=PLjMXczUzEYcHvw5YYSU92WrY8IwhTuq7p) -#### Image Viewers -[fiftyone](https://github.com/voxel51/fiftyone) - Viewer and tool for building high-quality datasets and computer vision models. -[Fiji](https://fiji.sc/) - General purpose tool. Image viewer and image processing package. -[napari](https://github.com/napari/napari) - Multi-dimensional image viewer. -[OMERO](https://www.openmicroscopy.org/omero/) - Feature rich image viewer for high-content screening. [IDR](https://idr.openmicroscopy.org/) uses OMERO. [Intro](https://www.youtube.com/watch?v=nSCrMO_c-5s) - -#### Image Cleanup -[DivNoising](https://github.com/juglab/DivNoising) - Unsupervised denoising method. -[aydin](https://github.com/royerlab/aydin) - Image denoising. 
- -#### Microscopy / Segmentation - -##### Tutorials -[Bio-image Analysis Notebooks](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/intro.html) - Large collection of image processing workflows, including [point-spread-function estimation](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/18a_deconvolution/extract_psf.html) and [deconvolution](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/18a_deconvolution/introduction_deconvolution.html), [3D cell segmentation](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/20_image_segmentation/Segmentation_3D.html), [feature extraction](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/22_feature_extraction/statistics_with_pyclesperanto.html) using [pyclesperanto](https://github.com/clEsperanto/pyclesperanto_prototype) and others. -[python_for_microscopists](https://github.com/bnsreenu/python_for_microscopists) - Notebooks and associated [youtube channel](https://www.youtube.com/channel/UC34rW-HtPJulxr5wp2Xa04w/videos) for a variety of image processing tasks. -[Bioimaging and Bioimage Analysis Guide](https://www.bioimagingguide.org/welcome.html) - -##### Datasets -[jump-cellpainting](https://github.com/jump-cellpainting/datasets) - Cellpainting dataset. -[MedMNIST](https://github.com/MedMNIST/MedMNIST) - Datasets for 2D and 3D Biomedical Image Classification. -[CytoImageNet](https://github.com/stan-hua/CytoImageNet) - Huge diverse dataset like ImageNet but for cell images. -[Haghighi](https://github.com/carpenterlab/2021_Haghighi_NatureMethods) - Gene Expression and Morphology Profiles. -[broadinstitute/lincs-profiling-complementarity](https://github.com/broadinstitute/lincs-profiling-complementarity) - Cellpainting vs. L1000 assay. 
- -##### Data Formats and Converters -OME-Zarr - [paper](https://www.biorxiv.org/content/10.1101/2023.02.17.528834v1.full), [standard](https://ngff.openmicroscopy.org/latest/) -[bioformats2raw](https://github.com/glencoesoftware/bioformats2raw) - Various formats to zarr. -[raw2ometiff](https://github.com/glencoesoftware/raw2ometiff) - Zarr to tiff. -[BatchConvert](https://github.com/Euro-BioImaging/BatchConvert) - Wrapper for bioformats2raw to parallelize conversions with nextflow, [video](https://www.youtube.com/watch?v=DeCWV274l0c). -[napari](https://napari.org/stable) - Viewer for various image formats. -[vizarr](https://github.com/hms-dbmi/vizarr) - Viewer for zarr files. -REMBI model - Recommended Metadata for Biological Images - * BioImage Archive: [Study Component Guidance](https://www.ebi.ac.uk/bioimage-archive/rembi-help-examples/), [File List Guide](https://www.ebi.ac.uk/bioimage-archive/help-file-list/) - * [paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8606015/), [video](https://www.youtube.com/watch?v=GVmfOpuP2_c), [spreadsheet](https://docs.google.com/spreadsheets/d/1Ck1NeLp-ZN4eMGdNYo2nV6KLEdSfN6oQBKnnWU6Npeo/edit#gid=1023506919) - -##### Platforms and Pipelines -[fractal](https://fractal-analytics-platform.github.io/) - Framework to process high-content imaging data. - -###### Labsyspharm -[mcmicro](https://github.com/labsyspharm/mcmicro) - Multiple-choice microscopy pipeline, [Website](https://mcmicro.org/overview/), [Paper](https://www.nature.com/articles/s41592-021-01308-y). -[MCQuant](https://github.com/labsyspharm/quantification) - Quantification of cell features. -[cylinter](https://github.com/labsyspharm/cylinter) - Quality assurance for microscopy images, [Website](https://labsyspharm.github.io/cylinter/). -[ashlar](https://github.com/labsyspharm/ashlar) - Whole-slide microscopy image stitching and registration. 
- -##### Segmentation -[Overview](https://biomag-lab.github.io/microscopy-tree/) - Review of cell segmentation algorithms, [Paper](https://www.sciencedirect.com/science/article/abs/pii/S0962892421002518). -[MEDIAR](https://github.com/Lee-Gihun/MEDIAR) - Cell segmentation. -[cellpose](https://github.com/mouseland/cellpose) - Cell segmentation. [Paper](https://www.biorxiv.org/content/10.1101/2020.02.02.931238v1), [Dataset](https://www.cellpose.org/dataset). -[stardist](https://github.com/stardist/stardist) - Cell segmentation with Star-convex Shapes. -[UnMicst](https://github.com/HMS-IDAC/UnMicst) - Identifying Cells and Segmenting Tissue. -[nnUnet](https://github.com/MIC-DKFZ/nnUNet) - 3D biomedical image segmentation. -[allencell](https://www.allencell.org/segmenter.html) - Tools for 3D segmentation, classical and deep learning methods. -[Cell-ACDC](https://github.com/SchmollerLab/Cell_ACDC) - Python GUI for cell segmentation and tracking. -[ZeroCostDL4Mic](https://github.com/HenriquesLab/ZeroCostDL4Mic/wiki) - Deep-Learning in Microscopy. - -##### Cell Segmentation Datasets -[cellpose](https://www.cellpose.org/dataset) - Cell images. -[omnipose](http://www.cellpose.org/dataset_omnipose) - Cell images. -[LIVECell](https://github.com/sartorius-research/LIVECell) - Cell images. -[Sartorius](https://www.kaggle.com/competitions/sartorius-cell-instance-segmentation/overview) - Neurons. - -##### Packages -[Awesome Cytodata](https://github.com/cytodata/awesome-cytodata) -[BD Spectrum Viewer](https://www.bdbiosciences.com/en-us/resources/bd-spectrum-viewer) - Calculate spectral overlap, bleed through for fluorescence microscopy dyes. -[seg-eval](https://github.com/lstrgar/seg-eval) - Cell segmentation performance evaluation without Ground Truth labels, [Paper](https://www.biorxiv.org/content/10.1101/2023.02.23.529809v1.full.pdf). -[skimage](https://scikit-image.org/docs/dev/api/skimage.exposure.html#skimage.exposure.equalize_adapthist) - Illumination correction (CLAHE). 
-[cidre](https://github.com/smithk/cidre) - Illumination correction method for optical microscopy. -[BaSiCPy](https://github.com/peng-lab/BaSiCPy) - Background and Shading Correction of Optical Microscopy Images, [BaSiC](https://github.com/marrlab/BaSiC). -[CSBDeep](https://github.com/CSBDeep/CSBDeep) - Image denoising, restoration and object detection, [Project page](https://csbdeep.bioimagecomputing.com/tools/). -[atomai](https://github.com/pycroscopy/atomai) - Deep and Machine Learning for Microscopy. -[py-clesperanto](https://github.com/clesperanto/pyclesperanto_prototype/) - Tools for 3D microscopy analysis, [deskewing](https://github.com/clEsperanto/pyclesperanto_prototype/blob/master/demo/transforms/deskew.ipynb) and lots of other tutorials, interacts with napari. -[cytoflow](https://github.com/cytoflow/cytoflow) - Flow cytometry. Includes Bleedthrough correction methods. -Linear unmixing in Fiji for Bleedthrough Correction - [Youtube](https://www.youtube.com/watch?v=W90qs0J29v8). -Bleedthrough Correction using Lumos and Fiji - [Link](https://imagej.net/plugins/lumos-spectral-unmixing). - -#### Domain Adaptation / Batch-Effect Correction -[Tran - A benchmark of batch-effect correction methods for single-cell RNA sequencing data](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1850-9), [Code](https://github.com/JinmiaoChenLab/Batch-effect-removal-benchmarking). -[R Tutorial on correcting batch effects](https://broadinstitute.github.io/2019_scWorkshop/correcting-batch-effects.html). -[harmonypy](https://github.com/slowkow/harmonypy) - Fuzzy k-means and locally linear adjustments. -[pyliger](https://github.com/welch-lab/pyliger) - Batch-effect correction, [Example](https://github.com/welch-lab/pyliger/blob/master/pyliger/factorization/_iNMF_ANLS.py#L65), [R package](https://github.com/welch-lab/liger). -[nimfa](https://github.com/mims-harvard/nimfa) - Nonnegative matrix factorization. 
-[scgen](https://github.com/theislab/scgen) - Batch removal. [Doc](https://scgen.readthedocs.io/en/stable/). -[CORAL](https://github.com/google-research/google-research/tree/30e54523f08d963ced3fbb37c00e9225579d2e1d/correct_batch_effects_wdn) - Correcting for Batch Effects Using Wasserstein Distance, [Code](https://github.com/google-research/google-research/blob/30e54523f08d963ced3fbb37c00e9225579d2e1d/correct_batch_effects_wdn/transform.py#L152), [Paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7050548/). -[adapt](https://github.com/adapt-python/adapt) - Awesome Domain Adaptation Python Toolbox. -[pytorch-adapt](https://github.com/KevinMusgrave/pytorch-adapt) - Various neural network models for domain adaptation. - -#### Feature Engineering Images -[skimage](https://scikit-image.org/docs/dev/api/skimage.measure.html#skimage.measure.regionprops) - Regionprops: area, eccentricity, extent. -[mahotas](https://github.com/luispedro/mahotas) - Zernike, Haralick, LBP, and TAS features. -[pyradiomics](https://github.com/AIM-Harvard/pyradiomics) - Radiomics features from medical imaging. -[pyefd](https://github.com/hbldh/pyefd) - Elliptical feature descriptor, approximating a contour with a Fourier series. 
- #### Feature Selection [Overview Paper](https://www.sciencedirect.com/science/article/pii/S016794731930194X), [Talk](https://www.youtube.com/watch?v=JsArBz46_3s), [Repo](https://github.com/Yimeng-Zhang/feature-engineering-and-feature-selection) Blog post series - [1](http://blog.datadive.net/selecting-good-features-part-i-univariate-selection/), [2](http://blog.datadive.net/selecting-good-features-part-ii-linear-models-and-regularization/), [3](http://blog.datadive.net/selecting-good-features-part-iii-random-forests/), [4](http://blog.datadive.net/selecting-good-features-part-iv-stability-selection-rfe-and-everything-side-by-side/) @@ -468,15 +376,114 @@ Embeddings - [GloVe](https://nlp.stanford.edu/projects/glove/) ([[1](https://www [Chatistics](https://github.com/MasterScrat/Chatistics) - Turn Messenger, Hangouts, WhatsApp and Telegram chat logs into DataFrames. [textdistance](https://github.com/life4/textdistance) - Collection for comparing distances between two or more sequences. -#### Biology / Bioinformatics +#### Bio Image Analysis +[Awesome Cytodata](https://github.com/cytodata/awesome-cytodata) + +##### Tutorials +[Bio-image Analysis Notebooks](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/intro.html) - Large collection of image processing workflows, including [point-spread-function estimation](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/18a_deconvolution/extract_psf.html) and [deconvolution](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/18a_deconvolution/introduction_deconvolution.html), [3D cell segmentation](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/20_image_segmentation/Segmentation_3D.html), [feature extraction](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/22_feature_extraction/statistics_with_pyclesperanto.html) using [pyclesperanto](https://github.com/clEsperanto/pyclesperanto_prototype) and others. 
+[python_for_microscopists](https://github.com/bnsreenu/python_for_microscopists) - Notebooks and associated [youtube channel](https://www.youtube.com/channel/UC34rW-HtPJulxr5wp2Xa04w/videos) for a variety of image processing tasks. +[Bioimaging and Bioimage Analysis Guide](https://www.bioimagingguide.org/welcome.html) + +##### Datasets +[jump-cellpainting](https://github.com/jump-cellpainting/datasets) - Cellpainting dataset. +[MedMNIST](https://github.com/MedMNIST/MedMNIST) - Datasets for 2D and 3D Biomedical Image Classification. +[CytoImageNet](https://github.com/stan-hua/CytoImageNet) - Huge diverse dataset like ImageNet but for cell images. +[Haghighi](https://github.com/carpenterlab/2021_Haghighi_NatureMethods) - Gene Expression and Morphology Profiles. +[broadinstitute/lincs-profiling-complementarity](https://github.com/broadinstitute/lincs-profiling-complementarity) - Cellpainting vs. L1000 assay. -##### Assay +#### Assay +[BD Spectrum Viewer](https://www.bdbiosciences.com/en-us/resources/bd-spectrum-viewer) - Calculate spectral overlap, bleed through for fluorescence microscopy dyes. [PlateEditor](https://github.com/vindelorme/PlateEditor) - Drug Layout for plates, [app](https://plateeditor.sourceforge.io/), [zip](https://sourceforge.net/projects/plateeditor/), [paper](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0252488). -##### Biostatistics / Robust statistics +#### Biostatistics / Robust statistics +[Z-factor](https://en.wikipedia.org/wiki/Z-factor) - Measure of statistical effect size. [MinCovDet](https://scikit-learn.org/stable/modules/generated/sklearn.covariance.MinCovDet.html) - Robust estimator of covariance, RMPV, [Paper](https://wires.onlinelibrary.wiley.com/doi/full/10.1002/wics.1421), [App1](https://journals.sagepub.com/doi/10.1177/1087057112469257?url_ver=Z39.88-2003&rfr_id=ori%3Arid%3Acrossref.org&rfr_dat=cr_pub++0pubmed&), [App2](https://www.cell.com/cell-reports/pdf/S2211-1247(21)00694-X.pdf). 
-[winsorize](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mstats.winsorize.html#scipy.stats.mstats.winsorize) - Simple adjustment of outliers. [moderated z-score](https://clue.io/connectopedia/replicate_collapse) - Weighted average of z-scores based on Spearman correlation. +[winsorize](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mstats.winsorize.html#scipy.stats.mstats.winsorize) - Simple adjustment of outliers. + +##### Data Formats and Converters +OME-Zarr - [paper](https://www.biorxiv.org/content/10.1101/2023.02.17.528834v1.full), [standard](https://ngff.openmicroscopy.org/latest/) +[bioformats2raw](https://github.com/glencoesoftware/bioformats2raw) - Various formats to zarr. +[raw2ometiff](https://github.com/glencoesoftware/raw2ometiff) - Zarr to tiff. +[BatchConvert](https://github.com/Euro-BioImaging/BatchConvert) - Wrapper for bioformats2raw to parallelize conversions with nextflow, [video](https://www.youtube.com/watch?v=DeCWV274l0c). +REMBI model - Recommended Metadata for Biological Images + * BioImage Archive: [Study Component Guidance](https://www.ebi.ac.uk/bioimage-archive/rembi-help-examples/), [File List Guide](https://www.ebi.ac.uk/bioimage-archive/help-file-list/) + * [paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8606015/), [video](https://www.youtube.com/watch?v=GVmfOpuP2_c), [spreadsheet](https://docs.google.com/spreadsheets/d/1Ck1NeLp-ZN4eMGdNYo2nV6KLEdSfN6oQBKnnWU6Npeo/edit#gid=1023506919) + +#### Image Viewers +[vizarr](https://github.com/hms-dbmi/vizarr) - Browser-based image viewer for zarr format. +[avivator](https://github.com/hms-dbmi/viv) - Browser-based image viewer for tiff files. +[napari](https://github.com/napari/napari) - Image viewer and image processing tool. +[Fiji](https://fiji.sc/) - General purpose tool. Image viewer and image processing tool. +[OMERO](https://www.openmicroscopy.org/omero/) - Image viewer for high-content screening. 
[IDR](https://idr.openmicroscopy.org/) uses OMERO. [Intro](https://www.youtube.com/watch?v=nSCrMO_c-5s) +[fiftyone](https://github.com/voxel51/fiftyone) - Viewer and tool for building high-quality datasets and computer vision models. + +##### Image Restoration and Denoising +[aydin](https://github.com/royerlab/aydin) - Image denoising. +[DivNoising](https://github.com/juglab/DivNoising) - Unsupervised denoising method. +[CSBDeep](https://github.com/CSBDeep/CSBDeep) - Content-aware image restoration, [Project page](https://csbdeep.bioimagecomputing.com/tools/). + +##### Illumination correction + Bleed through correction +[skimage](https://scikit-image.org/docs/dev/api/skimage.exposure.html#skimage.exposure.equalize_adapthist) - Illumination correction (CLAHE). +[cidre](https://github.com/smithk/cidre) - Illumination correction method for optical microscopy. +[BaSiCPy](https://github.com/peng-lab/BaSiCPy) - Background and Shading Correction of Optical Microscopy Images, [BaSiC](https://github.com/marrlab/BaSiC). +[cytoflow](https://github.com/cytoflow/cytoflow) - Flow cytometry. Includes Bleedthrough correction methods. +Linear unmixing in Fiji for Bleedthrough Correction - [Youtube](https://www.youtube.com/watch?v=W90qs0J29v8). +Bleedthrough Correction using Lumos and Fiji - [Link](https://imagej.net/plugins/lumos-spectral-unmixing). + +##### Platforms and Pipelines +[fractal](https://fractal-analytics-platform.github.io/) - Framework to process high-content imaging data. +[atomai](https://github.com/pycroscopy/atomai) - Deep and Machine Learning for Microscopy. +[py-clesperanto](https://github.com/clesperanto/pyclesperanto_prototype/) - Tools for 3D microscopy analysis, [deskewing](https://github.com/clEsperanto/pyclesperanto_prototype/blob/master/demo/transforms/deskew.ipynb) and lots of other tutorials, interacts with napari. 
+ +##### Labsyspharm +[mcmicro](https://github.com/labsyspharm/mcmicro) - Multiple-choice microscopy pipeline, [Website](https://mcmicro.org/overview/), [Paper](https://www.nature.com/articles/s41592-021-01308-y). +[MCQuant](https://github.com/labsyspharm/quantification) - Quantification of cell features. +[cylinter](https://github.com/labsyspharm/cylinter) - Quality assurance for microscopy images, [Website](https://labsyspharm.github.io/cylinter/). +[ashlar](https://github.com/labsyspharm/ashlar) - Whole-slide microscopy image stitching and registration. +[scimap](https://github.com/labsyspharm/scimap) - Spatial Single-Cell Analysis Toolkit. + +##### Cell Segmentation +[Overview](https://biomag-lab.github.io/microscopy-tree/) - Review of cell segmentation algorithms, [Paper](https://www.sciencedirect.com/science/article/abs/pii/S0962892421002518). +[MEDIAR](https://github.com/Lee-Gihun/MEDIAR) - Cell segmentation. +[cellpose](https://github.com/mouseland/cellpose) - Cell segmentation. [Paper](https://www.biorxiv.org/content/10.1101/2020.02.02.931238v1), [Dataset](https://www.cellpose.org/dataset). +[stardist](https://github.com/stardist/stardist) - Cell segmentation with Star-convex Shapes. +[UnMicst](https://github.com/HMS-IDAC/UnMicst) - Identifying Cells and Segmenting Tissue. +[nnUnet](https://github.com/MIC-DKFZ/nnUNet) - 3D biomedical image segmentation. +[allencell](https://www.allencell.org/segmenter.html) - Tools for 3D segmentation, classical and deep learning methods. +[Cell-ACDC](https://github.com/SchmollerLab/Cell_ACDC) - Python GUI for cell segmentation and tracking. +[ZeroCostDL4Mic](https://github.com/HenriquesLab/ZeroCostDL4Mic/wiki) - Deep-Learning in Microscopy. +[EmbedSeg](https://github.com/juglab/EmbedSeg) - Embedding-based Instance Segmentation. + +##### Cell Segmentation Datasets +[cellpose](https://www.cellpose.org/dataset) - Cell images. +[omnipose](http://www.cellpose.org/dataset_omnipose) - Cell images. 
+[LIVECell](https://github.com/sartorius-research/LIVECell) - Cell images. +[Sartorius](https://www.kaggle.com/competitions/sartorius-cell-instance-segmentation/overview) - Neurons. +[EmbedSeg](https://github.com/juglab/EmbedSeg/releases/tag/v0.1.0) - 2D + 3D images. + +##### Evaluation +[seg-eval](https://github.com/lstrgar/seg-eval) - Cell segmentation performance evaluation without Ground Truth labels, [Paper](https://www.biorxiv.org/content/10.1101/2023.02.23.529809v1.full.pdf). + +##### Feature Engineering Images +[Computer vision challenges in drug discovery - Maciej Hermanowicz](https://www.youtube.com/watch?v=Y5GJmnIhvFk) +[CellProfiler](https://github.com/CellProfiler/CellProfiler) - Biological image analysis. +[scikit-image](https://github.com/scikit-image/scikit-image) - Image processing. +[scikit-image regionprops](https://scikit-image.org/docs/dev/api/skimage.measure.html#skimage.measure.regionprops) - Regionprops: area, eccentricity, extent. +[mahotas](https://github.com/luispedro/mahotas) - Zernike, Haralick, LBP, and TAS features, [example](https://github.com/luispedro/python-image-tutorial/blob/master/Segmenting%20cell%20images%20(fluorescent%20microscopy).ipynb). +[pyradiomics](https://github.com/AIM-Harvard/pyradiomics) - Radiomics features from medical imaging. +[pyefd](https://github.com/hbldh/pyefd) - Elliptical feature descriptor, approximating a contour with a Fourier series. + +#### Domain Adaptation / Batch-Effect Correction +[Tran - A benchmark of batch-effect correction methods for single-cell RNA sequencing data](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1850-9), [Code](https://github.com/JinmiaoChenLab/Batch-effect-removal-benchmarking). +[R Tutorial on correcting batch effects](https://broadinstitute.github.io/2019_scWorkshop/correcting-batch-effects.html). +[harmonypy](https://github.com/slowkow/harmonypy) - Fuzzy k-means and locally linear adjustments. 
+[pyliger](https://github.com/welch-lab/pyliger) - Batch-effect correction, [Example](https://github.com/welch-lab/pyliger/blob/master/pyliger/factorization/_iNMF_ANLS.py#L65), [R package](https://github.com/welch-lab/liger). +[nimfa](https://github.com/mims-harvard/nimfa) - Nonnegative matrix factorization. +[scgen](https://github.com/theislab/scgen) - Batch removal. [Doc](https://scgen.readthedocs.io/en/stable/). +[CORAL](https://github.com/google-research/google-research/tree/30e54523f08d963ced3fbb37c00e9225579d2e1d/correct_batch_effects_wdn) - Correcting for Batch Effects Using Wasserstein Distance, [Code](https://github.com/google-research/google-research/blob/30e54523f08d963ced3fbb37c00e9225579d2e1d/correct_batch_effects_wdn/transform.py#L152), [Paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7050548/). +[adapt](https://github.com/adapt-python/adapt) - Awesome Domain Adaptation Python Toolbox. +[pytorch-adapt](https://github.com/KevinMusgrave/pytorch-adapt) - Various neural network models for domain adaptation. ##### Sequencing [Single cell tutorial](https://github.com/theislab/single-cell-tutorial). @@ -487,28 +494,13 @@ Embeddings - [GloVe](https://nlp.stanford.edu/projects/glove/) ([[1](https://www [janggu](https://github.com/BIMSBbioinfo/janggu) - Deep Learning for Genomics. [gdsctools](https://github.com/CancerRxGene/gdsctools) - Drug responses in the context of the Genomics of Drug Sensitivity in Cancer project, ANOVA, IC50, MoBEM, [doc](https://gdsctools.readthedocs.io/en/master/). -##### Image-related -See also Microscopy Section above. -[mahotas](http://luispedro.org/software/mahotas/) - Image processing (Bioinformatics), [example](https://github.com/luispedro/python-image-tutorial/blob/master/Segmenting%20cell%20images%20(fluorescent%20microscopy).ipynb). -[imagepy](https://github.com/Image-Py/imagepy) - Software package for bioimage analysis. -[scimap](https://github.com/labsyspharm/scimap) - Spatial Single-Cell Analysis Toolkit. 
-[CellProfiler](https://github.com/CellProfiler/CellProfiler) - Biological image analysis. -[imglyb](https://github.com/imglib/imglyb) - Viewer for large images, [talk](https://www.youtube.com/watch?v=Ddo5z5qGMb8), [slides](https://github.com/hanslovsky/scipy-2019/blob/master/scipy-2019-imglyb.pdf). - ##### Drug discovery [TDC](https://github.com/mims-harvard/TDC/tree/main) - Drug Discovery and Development. [DeepPurpose](https://github.com/kexinhuang12345/DeepPurpose) - Deep Learning Based Molecular Modelling and Prediction Toolkit. -##### Courses -[mit6874](https://mit6874.github.io/) - Computational Systems Biology: Deep Learning in the Life Sciences. - -#### Image Processing -[Talk](https://www.youtube.com/watch?v=Y5GJmnIhvFk) -[cv2](https://github.com/skvark/opencv-python) - OpenCV, classical algorithms: [Gaussian Filter](https://docs.opencv.org/3.1.0/d4/d13/tutorial_py_filtering.html), [Morphological Transformations](https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_morphological_ops/py_morphological_ops.html). -[scikit-image](https://github.com/scikit-image/scikit-image) - Image processing. - #### Neural Networks [Convolutional Neural Networks for Visual Recognition](https://cs231n.github.io/) - Stanford CS class. +[mit6874](https://mit6874.github.io/) - Computational Systems Biology: Deep Learning in the Life Sciences. [ConvNet Shape Calculator](https://madebyollin.github.io/convnet-calculator/) - Calculate output dimensions of Conv2D layer. [Great Gradient Descent Article](https://towardsdatascience.com/10-gradient-descent-optimisation-algorithms-86989510b5e9). [Intro to semi-supervised learning](https://lilianweng.github.io/lil-log/2021/12/05/semi-supervised-learning.html). 
From 111f878572847925851aadc3770fe274ce77281e Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sun, 14 May 2023 14:55:07 +0200 Subject: [PATCH 034/152] Update README.md --- README.md | 3 --- 1 file changed, 3 deletions(-) diff --git a/README.md b/README.md index d1efea0..55fcb7e 100644 --- a/README.md +++ b/README.md @@ -595,9 +595,6 @@ Feature Visualization: [Blog](https://distill.pub/2017/feature-visualization/), [Detic](https://github.com/facebookresearch/Detic) - Detector with image classes that can use image-level labels (facebookresearch). [EasyCV](https://github.com/alibaba/EasyCV) - Image segmentation, classification, metric-learning, object detection, pose estimation. -##### Image Annotation -[cvat](https://github.com/openvinotoolkit/cvat) - Image annotation tool. - ##### Image Classification [nfnets](https://github.com/ypeleg/nfnets-keras) - Neural network. [efficientnet](https://github.com/lukemelas/EfficientNet-PyTorch) - Neural network. From 9b20a278619f27aa8518f2090f3fd3a00bbcb036 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sun, 14 May 2023 14:56:38 +0200 Subject: [PATCH 035/152] Update README.md --- README.md | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index 55fcb7e..6d3ca6e 100644 --- a/README.md +++ b/README.md @@ -405,10 +405,8 @@ Embeddings - [GloVe](https://nlp.stanford.edu/projects/glove/) ([[1](https://www OME-Zarr - [paper](https://www.biorxiv.org/content/10.1101/2023.02.17.528834v1.full), [standard](https://ngff.openmicroscopy.org/latest/) [bioformats2raw](https://github.com/glencoesoftware/bioformats2raw) - Various formats to zarr. [raw2ometiff](https://github.com/glencoesoftware/raw2ometiff) - Zarr to tiff. -[BatchConvert](https://github.com/Euro-BioImaging/BatchConvert) - Wrapper for bioformats2raw to parallelize conversions with nextflow, [video](https://www.youtube.com/watch?v=DeCWV274l0c). 
-REMBI model - Recommended Metadata for Biological Images - * BioImage Archive: [Study Component Guidance](https://www.ebi.ac.uk/bioimage-archive/rembi-help-examples/), [File List Guide](https://www.ebi.ac.uk/bioimage-archive/help-file-list/) - * [paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8606015/), [video](https://www.youtube.com/watch?v=GVmfOpuP2_c), [spreadsheet](https://docs.google.com/spreadsheets/d/1Ck1NeLp-ZN4eMGdNYo2nV6KLEdSfN6oQBKnnWU6Npeo/edit#gid=1023506919) +[BatchConvert](https://github.com/Euro-BioImaging/BatchConvert) - Wrapper for bioformats2raw to parallelize conversions with nextflow, [video](https://www.youtube.com/watch?v=DeCWV274l0c). +REMBI model - Recommended Metadata for Biological Images, BioImage Archive: [Study Component Guidance](https://www.ebi.ac.uk/bioimage-archive/rembi-help-examples/), [File List Guide](https://www.ebi.ac.uk/bioimage-archive/help-file-list/), [paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8606015/), [video](https://www.youtube.com/watch?v=GVmfOpuP2_c), [spreadsheet](https://docs.google.com/spreadsheets/d/1Ck1NeLp-ZN4eMGdNYo2nV6KLEdSfN6oQBKnnWU6Npeo/edit#gid=1023506919) #### Image Viewers [vizarr](https://github.com/hms-dbmi/vizarr) - Browser-based image viewer for zarr format. From 63747612731d929bb5c58e3cce5c31f642bc55ab Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Mon, 15 May 2023 20:18:29 +0200 Subject: [PATCH 036/152] BioImage Model Zoo --- README.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 6d3ca6e..7166e5f 100644 --- a/README.md +++ b/README.md @@ -442,7 +442,8 @@ Bleedthrough Correction using Lumos and Fiji - [Link](https://imagej.net/plugins [scimap](https://github.com/labsyspharm/scimap) - Spatial Single-Cell Analysis Toolkit. ##### Cell Segmentation -[Overview](https://biomag-lab.github.io/microscopy-tree/) - Review of cell segmentation algorithms, [Paper](https://www.sciencedirect.com/science/article/abs/pii/S0962892421002518). 
+[microscopy-tree](https://biomag-lab.github.io/microscopy-tree/) - Review of cell segmentation algorithms, [Paper](https://www.sciencedirect.com/science/article/abs/pii/S0962892421002518). +[BioImage.IO](https://bioimage.io/#/) - BioImage Model Zoo. [MEDIAR](https://github.com/Lee-Gihun/MEDIAR) - Cell segmentation. [cellpose](https://github.com/mouseland/cellpose) - Cell segmentation. [Paper](https://www.biorxiv.org/content/10.1101/2020.02.02.931238v1), [Dataset](https://www.cellpose.org/dataset). [stardist](https://github.com/stardist/stardist) - Cell segmentation with Star-convex Shapes. From 67f846fb3031d192e0700104d5729092c234015b Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Wed, 17 May 2023 17:00:52 +0200 Subject: [PATCH 037/152] hatch and Microscopy Resolution Calculator --- README.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 7166e5f..62820c0 100644 --- a/README.md +++ b/README.md @@ -20,6 +20,7 @@ [hydra](https://github.com/facebookresearch/hydra) - Configuration management. [pyscaffold](https://github.com/pyscaffold/pyscaffold) - Python project template generator. [poetry](https://github.com/python-poetry/poetry) - Dependency management. +[hatch](https://github.com/pypa/hatch) - Python project management. #### Pandas Tricks, Alternatives and Additions [pandasvault](https://github.com/firmai/pandasvault) - Large collection of pandas tricks. @@ -391,8 +392,9 @@ Embeddings - [GloVe](https://nlp.stanford.edu/projects/glove/) ([[1](https://www [Haghighi](https://github.com/carpenterlab/2021_Haghighi_NatureMethods) - Gene Expression and Morphology Profiles. [broadinstitute/lincs-profiling-complementarity](https://github.com/broadinstitute/lincs-profiling-complementarity) - Cellpainting vs. L1000 assay. -#### Assay +#### Microscopy + Assay [BD Spectrum Viewer](https://www.bdbiosciences.com/en-us/resources/bd-spectrum-viewer) - Calculate spectral overlap, bleed through for fluorescence microscopy dyes. 
+[Microscopy Resolution Calculator](https://www.microscope.healthcare.nikon.com/microtools/resolution-calculator) - Calculate resolution of images (Nikon). [PlateEditor](https://github.com/vindelorme/PlateEditor) - Drug Layout for plates, [app](https://plateeditor.sourceforge.io/), [zip](https://sourceforge.net/projects/plateeditor/), [paper](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0252488). #### Biostatistics / Robust statistics From e68f7cda82823492aa2f82bab37208eb07b48b7d Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Thu, 18 May 2023 10:05:11 +0200 Subject: [PATCH 038/152] matrix data formats --- README.md | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/README.md b/README.md index 62820c0..45a6d90 100644 --- a/README.md +++ b/README.md @@ -392,24 +392,30 @@ Embeddings - [GloVe](https://nlp.stanford.edu/projects/glove/) ([[1](https://www [Haghighi](https://github.com/carpenterlab/2021_Haghighi_NatureMethods) - Gene Expression and Morphology Profiles. [broadinstitute/lincs-profiling-complementarity](https://github.com/broadinstitute/lincs-profiling-complementarity) - Cellpainting vs. L1000 assay. -#### Microscopy + Assay -[BD Spectrum Viewer](https://www.bdbiosciences.com/en-us/resources/bd-spectrum-viewer) - Calculate spectral overlap, bleed through for fluorescence microscopy dyes. -[Microscopy Resolution Calculator](https://www.microscope.healthcare.nikon.com/microtools/resolution-calculator) - Calculate resolution of images (Nikon). -[PlateEditor](https://github.com/vindelorme/PlateEditor) - Drug Layout for plates, [app](https://plateeditor.sourceforge.io/), [zip](https://sourceforge.net/projects/plateeditor/), [paper](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0252488). - #### Biostatistics / Robust statistics [Z-factor](https://en.wikipedia.org/wiki/Z-factor) - Measure of statistical effect size. 
[MinCovDet](https://scikit-learn.org/stable/modules/generated/sklearn.covariance.MinCovDet.html) - Robust estimator of covariance, RMPV, [Paper](https://wires.onlinelibrary.wiley.com/doi/full/10.1002/wics.1421), [App1](https://journals.sagepub.com/doi/10.1177/1087057112469257?url_ver=Z39.88-2003&rfr_id=ori%3Arid%3Acrossref.org&rfr_dat=cr_pub++0pubmed&), [App2](https://www.cell.com/cell-reports/pdf/S2211-1247(21)00694-X.pdf). [moderated z-score](https://clue.io/connectopedia/replicate_collapse) - Weighted average of z-scores based on Spearman correlation. [winsorize](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mstats.winsorize.html#scipy.stats.mstats.winsorize) - Simple adjustment of outliers. -##### Data Formats and Converters +#### Microscopy + Assay +[BD Spectrum Viewer](https://www.bdbiosciences.com/en-us/resources/bd-spectrum-viewer) - Calculate spectral overlap, bleed through for fluorescence microscopy dyes. +[Microscopy Resolution Calculator](https://www.microscope.healthcare.nikon.com/microtools/resolution-calculator) - Calculate resolution of images (Nikon). +[PlateEditor](https://github.com/vindelorme/PlateEditor) - Drug Layout for plates, [app](https://plateeditor.sourceforge.io/), [zip](https://sourceforge.net/projects/plateeditor/), [paper](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0252488). + +##### Image Formats and Converters OME-Zarr - [paper](https://www.biorxiv.org/content/10.1101/2023.02.17.528834v1.full), [standard](https://ngff.openmicroscopy.org/latest/) [bioformats2raw](https://github.com/glencoesoftware/bioformats2raw) - Various formats to zarr. [raw2ometiff](https://github.com/glencoesoftware/raw2ometiff) - Zarr to tiff. [BatchConvert](https://github.com/Euro-BioImaging/BatchConvert) - Wrapper for bioformats2raw to parallelize conversions with nextflow, [video](https://www.youtube.com/watch?v=DeCWV274l0c). 
REMBI model - Recommended Metadata for Biological Images, BioImage Archive: [Study Component Guidance](https://www.ebi.ac.uk/bioimage-archive/rembi-help-examples/), [File List Guide](https://www.ebi.ac.uk/bioimage-archive/help-file-list/), [paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8606015/), [video](https://www.youtube.com/watch?v=GVmfOpuP2_c), [spreadsheet](https://docs.google.com/spreadsheets/d/1Ck1NeLp-ZN4eMGdNYo2nV6KLEdSfN6oQBKnnWU6Npeo/edit#gid=1023506919) +##### Matrix Formats +[anndata](https://github.com/scverse/anndata) - annotated data matrices in memory and on disk, [Docs](https://anndata.readthedocs.io/en/latest/index.html). +[muon](https://github.com/scverse/muon) - Multimodal omics framework. +[mudata](https://github.com/scverse/mudata) - Multimodal Data (.h5mu) implementation. +[bdz](https://github.com/openssbd/bdz) - Zarr-based format for storing quantitative biological dynamics data. + #### Image Viewers [vizarr](https://github.com/hms-dbmi/vizarr) - Browser-based image viewer for zarr format. [avivator](https://github.com/hms-dbmi/viv) - Browser-based image viewer for tiff files. From f5bde1fdebeaefe48f4b8284a6459eb06fe7a736 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Fri, 19 May 2023 23:16:06 +0200 Subject: [PATCH 039/152] Thermofisher Spectrum Viewer --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 45a6d90..69cfc5a 100644 --- a/README.md +++ b/README.md @@ -400,6 +400,7 @@ Embeddings - [GloVe](https://nlp.stanford.edu/projects/glove/) ([[1](https://www #### Microscopy + Assay [BD Spectrum Viewer](https://www.bdbiosciences.com/en-us/resources/bd-spectrum-viewer) - Calculate spectral overlap, bleed through for fluorescence microscopy dyes. +[Thermofisher Spectrum Viewer](https://www.thermofisher.com/order/stain-it) - Thermofisher Spectrum Viewer. 
[Microscopy Resolution Calculator](https://www.microscope.healthcare.nikon.com/microtools/resolution-calculator) - Calculate resolution of images (Nikon). [PlateEditor](https://github.com/vindelorme/PlateEditor) - Drug Layout for plates, [app](https://plateeditor.sourceforge.io/), [zip](https://sourceforge.net/projects/plateeditor/), [paper](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0252488). From 5125f64bfa92203ffe1c05f0a1ef224b55fa9272 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sat, 20 May 2023 17:41:26 +0200 Subject: [PATCH 040/152] Cleanup Dead Links --- README.md | 40 +++++++++++++++++----------------------- 1 file changed, 17 insertions(+), 23 deletions(-) diff --git a/README.md b/README.md index 69cfc5a..6f1e24a 100644 --- a/README.md +++ b/README.md @@ -86,7 +86,7 @@ [scipy.stats](https://docs.scipy.org/doc/scipy/reference/stats.html#statistical-tests) - Statistical tests. [scikit-posthocs](https://github.com/maximtrp/scikit-posthocs) - Statistical post-hoc tests for pairwise multiple comparisons. Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandaltman.html), [2](http://www.statsmodels.org/dev/generated/statsmodels.graphics.agreement.mean_diff_plot.html) - Plot for agreement between two methods of measurement. -[ANOVA](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.f_oneway.html), Tutorials: [One-way](https://pythonfordatascience.org/anova-python/), [Two-way](https://pythonfordatascience.org/anova-2-way-n-way/), [Type 1,2,3 explained](https://mcfromnz.wordpress.com/2011/03/02/anova-type-iiiiii-ss-explained/). +[ANOVA](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.f_oneway.html) ##### Statistical Tests [test_proportions_2indep](https://www.statsmodels.org/dev/generated/statsmodels.stats.proportion.test_proportions_2indep.html) - Proportion test. 
@@ -97,7 +97,6 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal ##### Interim Analyses / Sequential Analysis / Stopping [Sequential Analysis](https://en.wikipedia.org/wiki/Sequential_analysis) - Wikipedia. -[Treatment Effects Monitoring](https://online.stat.psu.edu/stat509/node/75/) - Design and Analysis of Clinical Trials PennState. [sequential](https://cran.r-project.org/web/packages/Sequential/Sequential.pdf) - Exact Sequential Analysis for Poisson and Binomial Data (R package). [confseq](https://github.com/gostevehoward/confseq) - Uniform boundaries, confidence sequences, and always-valid p-values. @@ -126,9 +125,9 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal [Greenland - Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4877414/) [Blume - Second-generation p-values: Improved rigor, reproducibility, & transparency in statistical analyses](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0188299) [Lindeløv - Common statistical tests are linear models](https://lindeloev.github.io/tests-as-linear/) -[Chatruc - The Central Limit Theorem and its misuse](https://lambdaclass.com/data_etudes/central_limit_theorem_misuse/) +[Chatruc - The Central Limit Theorem and its misuse](https://web.archive.org/web/20191229234155/https://lambdaclass.com/data_etudes/central_limit_theorem_misuse/) [Al-Saleh - Properties of the Standard Deviation that are Rarely Mentioned in Classrooms](http://www.stat.tugraz.at/AJS/ausg093/093Al-Saleh.pdf) -[Wainer - The Most Dangerous Equation](http://www-stat.wharton.upenn.edu/~hwainer/Readings/Most%20Dangerous%20eqn.pdf) +[Wainer - The Most Dangerous Equation](http://nsmn1.uh.edu/dgraur/niv/themostdangerousequation.pdf) [Gigerenzer - The Bias Bias in Behavioral Economics](https://www.nowpublishers.com/article/Details/RBE-0092) [Cook - Estimating the chances of something 
that hasn’t happened yet](https://www.johndcook.com/blog/2010/03/30/statistical-rule-of-three/) [Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing](https://www.researchgate.net/publication/316652618_Same_Stats_Different_Graphs_Generating_Datasets_with_Varied_Appearance_and_Identical_Statistics_through_Simulated_Annealing), [Youtube](https://www.youtube.com/watch?v=DbJyPELmhJc) @@ -277,7 +276,7 @@ Faster t-SNE implementations: [lvdmaaten](https://lvdmaaten.github.io/tsne/), [M [dtreeviz](https://github.com/parrt/dtreeviz) - Decision tree visualization and model interpretation. [chartify](https://github.com/spotify/chartify/) - Generate charts. [VivaGraphJS](https://github.com/anvaka/VivaGraphJS) - Graph visualization (JS package). -[pm](https://github.com/anvaka/pm) - Navigatable 3D graph visualization (JS package), [example](https://w2v-vis-dot-hcg-team-di.appspot.com/#/galaxy/word2vec?cx=5698&cy=-5135&cz=5923&lx=0.1127&ly=0.3238&lz=-0.1680&lw=0.9242&ml=150&s=1.75&l=1&v=hc). +[pm](https://github.com/anvaka/pm) - Navigatable 3D graph visualization (JS package). [python-ternary](https://github.com/marcharper/python-ternary) - Triangle plots. [falcon](https://github.com/uwdata/falcon) - Interactive visualizations for big data. [hiplot](https://github.com/facebookresearch/hiplot) - High dimensional Interactive Plotting. @@ -296,7 +295,7 @@ Faster t-SNE implementations: [lvdmaaten](https://lvdmaaten.github.io/tsne/), [M #### Dashboards [py-shiny](https://github.com/rstudio/py-shiny) - Shiny for Python, [talk](https://www.youtube.com/watch?v=ijRBbtT2tgc). [superset](https://github.com/apache/superset) - Dashboarding solution by Apache. -[streamlit](https://github.com/streamlit/streamlit) - Dashboarding solution. 
[Resources](https://github.com/marcskovmadsen/awesome-streamlit), [Gallery](https://awesome-streamlit.org/) [Components](https://www.streamlit.io/components), [bokeh-events](https://github.com/ash2shukla/streamlit-bokeh-events). +[streamlit](https://github.com/streamlit/streamlit) - Dashboarding solution. [Resources](https://github.com/marcskovmadsen/awesome-streamlit), [Gallery](http://awesome-streamlit.org/) [Components](https://www.streamlit.io/components), [bokeh-events](https://github.com/ash2shukla/streamlit-bokeh-events). [mercury](https://github.com/mljar/mercury) - Convert Python notebook to web app, [Example](https://github.com/pplonski/dashboard-python-jupyter-notebook). [dash](https://dash.plot.ly/gallery) - Dashboarding solution by plot.ly. [Resources](https://github.com/ucg8j/awesome-dash). [visdom](https://github.com/facebookresearch/visdom) - Dashboarding library by Facebook. @@ -316,7 +315,7 @@ Faster t-SNE implementations: [lvdmaaten](https://lvdmaaten.github.io/tsne/), [M [gmaps](https://github.com/pbugnion/gmaps) - Google Maps for Jupyter notebooks. [stadiamaps](https://stadiamaps.com/) - Plot geographical maps. [datashader](https://github.com/bokeh/datashader) - Draw millions of points on a map. -[sklearn](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.BallTree.html) - BallTree, [Example](https://tech.minodes.com/experiments-with-in-memory-spatial-radius-queries-in-python-e40c9e66cf63). +[sklearn](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.BallTree.html) - BallTree. [pynndescent](https://github.com/lmcinnes/pynndescent) - Nearest neighbor descent for approximate nearest neighbors. [geocoder](https://github.com/DenisCarriere/geocoder) - Geocoding of addresses, IP addresses. 
Conversion of different geo formats: [talk](https://www.youtube.com/watch?v=eHRggqAvczE), [repo](https://github.com/dillongardner/PyDataSpatialAnalysis) @@ -325,7 +324,7 @@ Low Level Geospatial Tools (GEOS, GDAL/OGR, PROJ.4) Vector Data (Shapely, Fiona, Pyproj) Raster Data (Rasterio) Plotting (Descartes, Cartopy) -Predict economic indicators from Open Street Map [ipynb](https://github.com/njanakiev/osm-predict-economic-measurements/blob/master/osm-predict-economic-indicators.ipynb). +[Predict economic indicators from Open Street Map](https://janakiev.com/blog/osm-predict-economic-indicators/). [PySal](https://github.com/pysal/pysal) - Python Spatial Analysis Library. [geograpy](https://github.com/ushahidi/geograpy) - Extract countries, regions and cities from a URL or text. [cartogram](https://go-cart.io/cartogram) - Distorted maps based on population. @@ -370,7 +369,7 @@ Embeddings - [GloVe](https://nlp.stanford.edu/projects/glove/) ([[1](https://www [annoy](https://github.com/spotify/annoy) - Approximate nearest neighbor search. [faiss](https://github.com/facebookresearch/faiss) - Approximate nearest neighbor search. [pysparnn](https://github.com/facebookresearch/pysparnn) - Approximate nearest neighbor search. -[infomap](https://github.com/mapequation/infomap) - Cluster (word-)vectors to find topics, [example](https://github.com/mapequation/infomap/blob/master/examples/python/infomap-examples.ipynb). +[infomap](https://github.com/mapequation/infomap) - Cluster (word-)vectors to find topics. [datasketch](https://github.com/ekzhu/datasketch) - Probabilistic data structures for large data (MinHash, HyperLogLog). [flair](https://github.com/zalandoresearch/flair) - NLP Framework by Zalando. [stanza](https://github.com/stanfordnlp/stanza) - NLP Library.
@@ -486,7 +485,7 @@ Bleedthrough Correction using Lumos and Fiji - [Link](https://imagej.net/plugins [Tran - A benchmark of batch-effect correction methods for single-cell RNA sequencing data](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1850-9), [Code](https://github.com/JinmiaoChenLab/Batch-effect-removal-benchmarking). [R Tutorial on correcting batch effects](https://broadinstitute.github.io/2019_scWorkshop/correcting-batch-effects.html). [harmonypy](https://github.com/slowkow/harmonypy) - Fuzzy k-means and locally linear adjustments. -[pyliger](https://github.com/welch-lab/pyliger) - Batch-effect correction, [Example](https://github.com/welch-lab/pyliger/blob/master/pyliger/factorization/_iNMF_ANLS.py#L65), [R package](https://github.com/welch-lab/liger). +[pyliger](https://github.com/welch-lab/pyliger) - Batch-effect correction, [R package](https://github.com/welch-lab/liger). [nimfa](https://github.com/mims-harvard/nimfa) - Nonnegative matrix factorization. [scgen](https://github.com/theislab/scgen) - Batch removal. [Doc](https://scgen.readthedocs.io/en/stable/). [CORAL](https://github.com/google-research/google-research/tree/30e54523f08d963ced3fbb37c00e9225579d2e1d/correct_batch_effects_wdn) - Correcting for Batch Effects Using Wasserstein Distance, [Code](https://github.com/google-research/google-research/blob/30e54523f08d963ced3fbb37c00e9225579d2e1d/correct_batch_effects_wdn/transform.py#L152), [Paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7050548/). @@ -514,11 +513,11 @@ Bleedthrough Correction using Lumos and Fiji - [Link](https://imagej.net/plugins [Intro to semi-supervised learning](https://lilianweng.github.io/lil-log/2021/12/05/semi-supervised-learning.html). ##### Tutorials & Viewer -fast.ai course - [Lessons 1-7](https://course.fast.ai/videos/?lesson=1), [Lessons 8-14](http://course18.fast.ai/lessons/lessons2.html) +[fast.ai course](https://course.fast.ai/) - Practical Deep Learning for Coders. 
[Tensorflow without a PhD](https://github.com/GoogleCloudPlatform/tensorflow-without-a-phd) - Neural Network course by Google. Feature Visualization: [Blog](https://distill.pub/2017/feature-visualization/), [PPT](http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture12.pdf) [Tensorflow Playground](https://playground.tensorflow.org/) -[Visualization of optimization algorithms](https://vis.ensmallen.org/), [Another visualization](https://github.com/jettify/pytorch-optimizer) +[Visualization of optimization algorithms](http://vis.ensmallen.org/), [Another visualization](https://github.com/jettify/pytorch-optimizer) [cutouts-explorer](https://github.com/mgckind/cutouts-explorer) - Image Viewer. ##### Image Related @@ -706,7 +705,6 @@ Understanding SVM Regression: [slides](https://cs.adelaide.edu.au/~chhshen/teach [FCPS](https://github.com/Mthrun/FCPS) - Fundamental Clustering Problems Suite (R package). [GaussianMixture](https://scikit-learn.org/stable/modules/generated/sklearn.mixture.GaussianMixture.html) - Generalized k-means clustering using a mixture of Gaussian distributions, [video](https://www.youtube.com/watch?v=aICqoAG5BXQ). [nmslib](https://github.com/nmslib/nmslib) - Similarity search library and toolkit for evaluation of k-NN methods. -[buckshotpp](https://github.com/zjohn77/buckshotpp) - Outlier-resistant and scalable clustering algorithm. [merf](https://github.com/manifoldai/merf) - Mixed Effects Random Forest for Clustering, [video](https://www.youtube.com/watch?v=gWj4ZwB7f3o) [tree-SNE](https://github.com/isaacrob/treesne) - Hierarchical clustering algorithm based on t-SNE. [MiniSom](https://github.com/JustGlowing/minisom) - Pure Python implementation of the Self Organizing Maps. @@ -775,7 +773,7 @@ Other measures: [nupic](https://github.com/numenta/nupic) - Hierarchical Temporal Memory (HTM) for Time Series Prediction and Anomaly Detection. 
[tensorflow](https://github.com/tensorflow/tensorflow/) - LSTM and others, examples: [link]( https://machinelearningmastery.com/time-series-forecasting-long-short-term-memory-network-python/ -), [link](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/timeseries), [link](https://github.com/hzy46/TensorFlow-Time-Series-Examples), [Explain LSTM](https://github.com/slundberg/shap/blob/master/notebooks/deep_explainer/Keras%20LSTM%20for%20IMDB%20Sentiment%20Classification.ipynb), seq2seq: [1](https://machinelearningmastery.com/how-to-develop-lstm-models-for-multi-step-time-series-forecasting-of-household-power-consumption/), [2](https://github.com/guillaume-chevalier/seq2seq-signal-prediction), [3](https://github.com/JEddy92/TimeSeries_Seq2Seq/blob/master/notebooks/TS_Seq2Seq_Intro.ipynb), [4](https://github.com/LukeTonin/keras-seq-2-seq-signal-prediction) +), [link](https://github.com/hzy46/TensorFlow-Time-Series-Examples), seq2seq: [1](https://machinelearningmastery.com/how-to-develop-lstm-models-for-multi-step-time-series-forecasting-of-household-power-consumption/), [2](https://github.com/guillaume-chevalier/seq2seq-signal-prediction), [3](https://github.com/JEddy92/TimeSeries_Seq2Seq/blob/master/notebooks/TS_Seq2Seq_Intro.ipynb), [4](https://github.com/LukeTonin/keras-seq-2-seq-signal-prediction) [tspreprocess](https://github.com/MaxBenChrist/tspreprocess) - Preprocessing: Denoising, Compression, Resampling. [tsfresh](https://github.com/blue-yonder/tsfresh) - Time series feature engineering. [tsfel](https://github.com/fraunhoferportugal/tsfel) - Time series feature extraction. @@ -783,7 +781,7 @@ https://machinelearningmastery.com/time-series-forecasting-long-short-term-memor [gatspy](https://www.astroml.org/gatspy/) - General tools for Astronomical Time Series, [talk](https://www.youtube.com/watch?v=E4NMZyfao2c). 
[gendis](https://github.com/IBCNServices/GENDIS) - shapelets, [example](https://github.com/IBCNServices/GENDIS/blob/master/gendis/example.ipynb). [tslearn](https://github.com/rtavenar/tslearn) - Time series clustering and classification, `TimeSeriesKMeans`, `TimeSeriesKMeans`. -[pastas](https://pastas.readthedocs.io/en/latest/examples.html) - Simulation of time series. +[pastas](https://github.com/pastas/pastas) - Analysis of Groundwater Time Series. [fastdtw](https://github.com/slaypni/fastdtw) - Dynamic Time Warp Distance. [fable](https://www.rdocumentation.org/packages/fable/versions/0.0.0.9000) - Time Series Forecasting (R package). [pydlm](https://github.com/wwrechard/pydlm) - Bayesian time series modelling ([R package](https://cran.r-project.org/web/packages/bsts/index.html), [Blog post](http://www.unofficialgoogledatascience.com/2017/07/fitting-bayesian-structural-time-series.html)) @@ -850,7 +848,7 @@ RandomSurvivalForests (R packages: randomForestSRC, ggRandomForests). [eif](https://github.com/sahandha/eif) - Extended Isolation Forest. [AnomalyDetection](https://github.com/twitter/AnomalyDetection) - Anomaly detection (R package). [luminol](https://github.com/linkedin/luminol) - Anomaly Detection and Correlation library from Linkedin. -Distances for comparing histograms and detecting outliers - [Talk](https://www.youtube.com/watch?v=U7xdiGc7IRU): [Kolmogorov-Smirnov](https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.stats.ks_2samp.html), [Wasserstein](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.wasserstein_distance.html), [Energy Distance (Cramer)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.energy_distance.html), [Kullback-Leibler divergence](https://scipy.github.io/devdocs/generated/scipy.stats.entropy.html). 
+Distances for comparing histograms and detecting outliers - [Talk](https://www.youtube.com/watch?v=U7xdiGc7IRU): [Kolmogorov-Smirnov](https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.stats.ks_2samp.html), [Wasserstein](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.wasserstein_distance.html), [Energy Distance (Cramer)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.energy_distance.html), [Kullback-Leibler divergence](https://docs.scipy.org/doc/scipy/reference/generated/scipy.special.kl_div.html). [banpei](https://github.com/tsurubee/banpei) - Anomaly detection library based on singular spectrum transformation. [telemanom](https://github.com/khundman/telemanom) - Detect anomalies in multivariate time series data using LSTMs. [luminaire](https://github.com/zillow/luminaire) - Anomaly Detection for time series. @@ -885,7 +883,7 @@ Distances for comparing histograms and detecting outliers - [Talk](https://www.y #### Probabilistic Modelling and Bayes [Intro](https://erikbern.com/2018/10/08/the-hackers-guide-to-uncertainty-estimates.html), [Guide](https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers) -[PyMC3](https://docs.pymc.io/) - Bayesian modelling, [intro](https://docs.pymc.io/notebooks/getting_started) +[PyMC3](https://www.pymc.io/projects/docs/en/stable/learn.html) - Bayesian modelling. [numpyro](https://github.com/pyro-ppl/numpyro) - Probabilistic programming with numpy, built on [pyro](https://github.com/pyro-ppl/pyro). [pomegranate](https://github.com/jmschrei/pomegranate) - Probabilistic modelling, [talk](https://www.youtube.com/watch?v=dE5j6NW-Kzg). [pmlearn](https://github.com/pymc-learn/pymc-learn) - Probabilistic machine learning. 
@@ -931,7 +929,7 @@ Plotting learning curve: [link](http://www.ritchieng.com/machinelearning-learnin [lime](https://github.com/marcotcr/lime) - Explaining the predictions of any machine learning classifier, [talk](https://www.youtube.com/watch?v=C80SQe16Rao), [Warning (Myth 7)](https://crazyoscarchang.github.io/2019/02/16/seven-myths-in-machine-learning-research/). [lime_xgboost](https://github.com/jphall663/lime_xgboost) - Create LIMEs for XGBoost. [eli5](https://github.com/TeamHG-Memex/eli5) - Inspecting machine learning classifiers and explaining their predictions. -[lofo-importance](https://github.com/aerdem4/lofo-importance) - Leave One Feature Out Importance, [talk](https://www.youtube.com/watch?v=zqsQ2ojj7sE), examples: [1](https://www.kaggle.com/divrikwicky/pf-f-lofo-importance-on-adversarial-validation), [2](https://www.kaggle.com/divrikwicky/lofo-importance), [3](https://www.kaggle.com/divrikwicky/santanderctp-lofo-feature-importance). +[lofo-importance](https://github.com/aerdem4/lofo-importance) - Leave One Feature Out Importance, [talk](https://www.youtube.com/watch?v=zqsQ2ojj7sE). [pybreakdown](https://github.com/MI2DataLab/pyBreakDown) - Generate feature contribution plots. [pycebox](https://github.com/AustinRochford/PyCEbox) - Individual Conditional Expectation Plot Toolbox. [pdpbox](https://github.com/SauceCat/PDPbox) - Partial dependence plot toolbox, [example](https://www.kaggle.com/dansbecker/partial-plots). @@ -984,7 +982,6 @@ Optometrist algorithm - [paper](https://www.nature.com/articles/s41598-017-06645 [optuna](https://github.com/pfnet/optuna) - Hyperparamter optimization, [Talk](https://www.youtube.com/watch?v=tcrcLRopTX0). [skopt](https://scikit-optimize.github.io/) - `BayesSearchCV` for Hyperparameter search. [tune](https://ray.readthedocs.io/en/latest/tune.html) - Hyperparameter search with a focus on deep learning and deep reinforcement learning. 
-[hypergraph](https://github.com/aljabr0/hypergraph) - Global optimization methods and hyperparameter optimization. [bbopt](https://github.com/evhub/bbopt) - Black box hyperparameter optimization. [dragonfly](https://github.com/dragonfly/dragonfly) - Scalable Bayesian optimisation. [botorch](https://github.com/pytorch/botorch) - Bayesian optimization in PyTorch. @@ -1096,7 +1093,6 @@ Gilbert Strang - [Matrix Methods in Data Analysis, Signal Processing, and Machin [Awesome Machine Learning Books](http://matpalm.com/blog/cool_machine_learning_books/) [Awesome Machine Learning Interpretability](https://github.com/jphall663/awesome-machine-learning-interpretability) [Awesome Machine Learning Operations](https://github.com/EthicalML/awesome-machine-learning-operations) -[Awesome Metric Learning](https://github.com/kdhht2334/Survey_of_Deep_Metric_Learning) [Awesome Monte Carlo Tree Search](https://github.com/benedekrozemberczki/awesome-monte-carlo-tree-search-papers) [Awesome Neural Network Visualization](https://github.com/ashishpatel26/Tools-to-Design-or-Visualize-Architecture-of-Neural-Network) [Awesome Online Machine Learning](https://github.com/MaxHalford/awesome-online-machine-learning) @@ -1105,7 +1101,6 @@ Gilbert Strang - [Matrix Methods in Data Analysis, Signal Processing, and Machin [Awesome Python](https://github.com/vinta/awesome-python) [Awesome Python Data Science](https://github.com/krzjoa/awesome-python-datascience) [Awesome Python Data Science](https://github.com/thomasjpfan/awesome-python-data-science) -[Awesome Python Data Science](https://github.com/amitness/toolbox) [Awesome Pytorch](https://github.com/bharathgs/Awesome-pytorch-list) [Awesome Quantitative Finance](https://github.com/wilsonfreitas/awesome-quant) [Awesome Recommender Systems](https://github.com/grahamjenson/list_of_recommender_systems) @@ -1121,10 +1116,9 @@ Gilbert Strang - [Matrix Methods in Data Analysis, Signal Processing, and Machin [NYU Deep Learning 
SP21](https://www.youtube.com/playlist?list=PLLHTzKZzVU9e6xUfG10TkTWApKSZCzuBI) - YouTube Playlist. #### Things I google a lot -[Color codes](https://github.com/d3/d3-3.x-api-reference/blob/master/Ordinal-Scales.md#categorical-colors) +[Color Codes](https://github.com/d3/d3-3.x-api-reference/blob/master/Ordinal-Scales.md#categorical-colors) [Frequency codes for time series](https://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases) [Date parsing codes](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior) -[Feature Calculators tsfresh](https://github.com/blue-yonder/tsfresh/blob/master/tsfresh/feature_extraction/feature_calculators.py) ## Contributing Do you know a package that should be on this list? Did you spot a package that is no longer maintained and should be removed from this list? Then feel free to read the [contribution guidelines](CONTRIBUTING.md) and submit your pull request or create a new issue. From 116d32de00ba0a62ad37f7cd06072668902790f8 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Fri, 14 Jul 2023 01:13:48 +0200 Subject: [PATCH 041/152] jupyter-scatter --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 6f1e24a..ba3c7e5 100644 --- a/README.md +++ b/README.md @@ -286,6 +286,7 @@ Faster t-SNE implementations: [lvdmaaten](https://lvdmaaten.github.io/tsne/), [M [largeVis](https://github.com/elbamos/largeVis) - Visualize embeddings (t-SNE etc.) (R package). [proplot](https://github.com/proplot-dev/proplot) - Matplotlib wrapper. [morpheus](https://software.broadinstitute.org/morpheus/) - Broad Institute tool matrix visualization and analysis software. [Source](https://github.com/cmap/morpheus.js), Tutorial: [1](https://www.youtube.com/watch?v=0nkYDeekhtQ), [2](https://www.youtube.com/watch?v=r9mN6MsxUb0), [Code](https://github.com/broadinstitute/BBBC021_Morpheus_Exercise). 
+[jupyter-scatter](https://github.com/flekschas/jupyter-scatter) - Interactive 2D scatter plot widget for Jupyter. #### Colors [palettable](https://github.com/jiffyclub/palettable) - Color palettes from [colorbrewer2](https://colorbrewer2.org/#type=sequential&scheme=BuGn&n=3). From 1aeddeccc27025ec78c0f23b2447a8acf3b8546e Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Fri, 14 Jul 2023 14:10:56 +0200 Subject: [PATCH 042/152] bioimaging --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index ba3c7e5..aef94cd 100644 --- a/README.md +++ b/README.md @@ -381,6 +381,7 @@ Embeddings - [GloVe](https://nlp.stanford.edu/projects/glove/) ([[1](https://www [Awesome Cytodata](https://github.com/cytodata/awesome-cytodata) ##### Tutorials +[bioimaging.org](https://www.bioimagingguide.org/welcome.html) - A biologist's guide to planning and performing quantitative bioimaging experiments. [Bio-image Analysis Notebooks](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/intro.html) - Large collection of image processing workflows, including [point-spread-function estimation](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/18a_deconvolution/extract_psf.html) and [deconvolution](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/18a_deconvolution/introduction_deconvolution.html), [3D cell segmentation](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/20_image_segmentation/Segmentation_3D.html), [feature extraction](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/22_feature_extraction/statistics_with_pyclesperanto.html) using [pyclesperanto](https://github.com/clEsperanto/pyclesperanto_prototype) and others. [python_for_microscopists](https://github.com/bnsreenu/python_for_microscopists) - Notebooks and associated [youtube channel](https://www.youtube.com/channel/UC34rW-HtPJulxr5wp2Xa04w/videos) for a variety of image processing tasks. 
[Bioimaging and Bioimage Analysis Guide](https://www.bioimagingguide.org/welcome.html) From 049b51962acee19385830c60fa56847c9883a8ea Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Fri, 14 Jul 2023 14:11:29 +0200 Subject: [PATCH 043/152] Update README.md --- README.md | 1 - 1 file changed, 1 deletion(-) diff --git a/README.md b/README.md index aef94cd..3ec81d8 100644 --- a/README.md +++ b/README.md @@ -384,7 +384,6 @@ Embeddings - [GloVe](https://nlp.stanford.edu/projects/glove/) ([[1](https://www [bioimaging.org](https://www.bioimagingguide.org/welcome.html) - A biologist's guide to planning and performing quantitative bioimaging experiments. [Bio-image Analysis Notebooks](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/intro.html) - Large collection of image processing workflows, including [point-spread-function estimation](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/18a_deconvolution/extract_psf.html) and [deconvolution](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/18a_deconvolution/introduction_deconvolution.html), [3D cell segmentation](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/20_image_segmentation/Segmentation_3D.html), [feature extraction](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/22_feature_extraction/statistics_with_pyclesperanto.html) using [pyclesperanto](https://github.com/clEsperanto/pyclesperanto_prototype) and others. [python_for_microscopists](https://github.com/bnsreenu/python_for_microscopists) - Notebooks and associated [youtube channel](https://www.youtube.com/channel/UC34rW-HtPJulxr5wp2Xa04w/videos) for a variety of image processing tasks. -[Bioimaging and Bioimage Analysis Guide](https://www.bioimagingguide.org/welcome.html) ##### Datasets [jump-cellpainting](https://github.com/jump-cellpainting/datasets) - Cellpainting dataset. 
From 610500c9d3b41bc03bcb8ae5f12fc7a6693368e8 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Wed, 19 Jul 2023 14:03:18 +0200 Subject: [PATCH 044/152] SpectraViewer --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 3ec81d8..dc47b4c 100644 --- a/README.md +++ b/README.md @@ -400,6 +400,7 @@ Embeddings - [GloVe](https://nlp.stanford.edu/projects/glove/) ([[1](https://www #### Microscopy + Assay [BD Spectrum Viewer](https://www.bdbiosciences.com/en-us/resources/bd-spectrum-viewer) - Calculate spectral overlap, bleed through for fluorescence microscopy dyes. +[SpectraViewer](https://www.perkinelmer.com/lab-products-and-services/spectraviewer) - Visualize the spectral compatibility of fluorophores (PerkinElmer). [Thermofisher Spectrum Viewer](https://www.thermofisher.com/order/stain-it) - Thermofisher Spectrum Viewer. [Microscopy Resolution Calculator](https://www.microscope.healthcare.nikon.com/microtools/resolution-calculator) - Calculate resolution of images (Nikon). [PlateEditor](https://github.com/vindelorme/PlateEditor) - Drug Layout for plates, [app](https://plateeditor.sourceforge.io/), [zip](https://sourceforge.net/projects/plateeditor/), [paper](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0252488). From 1247afcc79bd3b78fac3a864effd93e336470f31 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Thu, 27 Jul 2023 20:36:59 +0200 Subject: [PATCH 045/152] micro-sam --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index dc47b4c..3da6c4e 100644 --- a/README.md +++ b/README.md @@ -463,6 +463,7 @@ Bleedthrough Correction using Lumos and Fiji - [Link](https://imagej.net/plugins [Cell-ACDC](https://github.com/SchmollerLab/Cell_ACDC) - Python GUI for cell segmentation and tracking. [ZeroCostDL4Mic](https://github.com/HenriquesLab/ZeroCostDL4Mic/wiki) - Deep-Learning in Microscopy. [EmbedSeg](https://github.com/juglab/EmbedSeg) - Embedding-based Instance Segmentation. 
+[micro-sam](https://github.com/computational-cell-analytics/micro-sam) - SegmentAnything for Microscopy. ##### Cell Segmentation Datasets [cellpose](https://www.cellpose.org/dataset) - Cell images. From fb28d59776bec62589ef006ac27d3381348f5651 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sat, 5 Aug 2023 16:50:17 +0200 Subject: [PATCH 046/152] Metrics reloaded --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 3da6c4e..98bd3be 100644 --- a/README.md +++ b/README.md @@ -593,8 +593,8 @@ Feature Visualization: [Blog](https://distill.pub/2017/feature-visualization/), [visualkeras](https://github.com/paulgavrikov/visualkeras) - Visualize Keras networks. ##### Object detection / Instance Segmentation +[Metrics reloaded: Recommendations for image analysis validation](https://arxiv.org/abs/2206.01653) - Guide for choosing correct image analysis metrics, [Code](https://github.com/Project-MONAI/MetricsReloaded), [Twitter Thread](https://twitter.com/lena_maierhein/status/1625450342006521857) [Good Yolo Explanation](https://jonathan-hui.medium.com/real-time-object-detection-with-yolo-yolov2-28b1b93e2088) -[segmentation_models](https://github.com/qubvel/segmentation_models) - Segmentation models with pretrained backbones: Unet, FPN, Linknet, PSPNet. [yolact](https://github.com/dbolya/yolact) - Fully convolutional model for real-time instance segmentation. [EfficientDet Pytorch](https://github.com/toandaominh1997/EfficientDet.Pytorch), [EfficientDet Keras](https://github.com/xuannianz/EfficientDet) - Scalable and Efficient Object Detection. [detectron2](https://github.com/facebookresearch/detectron2) - Object Detection (Mask R-CNN) by Facebook. 
From 04836ed61593252fda2fb36200a91786f5ea7b24 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Mon, 7 Aug 2023 10:27:06 +0200 Subject: [PATCH 047/152] SCIP --- README.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/README.md b/README.md index 98bd3be..f838f6c 100644 --- a/README.md +++ b/README.md @@ -444,6 +444,9 @@ Bleedthrough Correction using Lumos and Fiji - [Link](https://imagej.net/plugins [atomai](https://github.com/pycroscopy/atomai) - Deep and Machine Learning for Microscopy. [py-clesperanto](https://github.com/clesperanto/pyclesperanto_prototype/) - Tools for 3D microscopy analysis, [deskewing](https://github.com/clEsperanto/pyclesperanto_prototype/blob/master/demo/transforms/deskew.ipynb) and lots of other tutorials, interacts with napari. +##### Microscopy Pipelines +[SCIP](https://scalable-cytometry-image-processing.readthedocs.io/en/latest/usage.html) - Image processing pipeline on top of Dask. + ##### Labsyspharm [mcmicro](https://github.com/labsyspharm/mcmicro) - Multiple-choice microscopy pipeline, [Website](https://mcmicro.org/overview/), [Paper](https://www.nature.com/articles/s41592-021-01308-y). [MCQuant](https://github.com/labsyspharm/quantification) - Quantification of cell features. From 411e6593afda69e44fa6f0b693820203d4d20cea Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Tue, 8 Aug 2023 23:51:30 +0200 Subject: [PATCH 048/152] Image Viewers, DeepCell --- README.md | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index f838f6c..2e85a81 100644 --- a/README.md +++ b/README.md @@ -425,7 +425,11 @@ REMBI model - Recommended Metadata for Biological Images, BioImage Archive: [Stu [Fiji](https://fiji.sc/) - General purpose tool. Image viewer and image processing tool. [OMERO](https://www.openmicroscopy.org/omero/) - Image viewer for high-content screening. [IDR](https://idr.openmicroscopy.org/) uses OMERO. 
[Intro](https://www.youtube.com/watch?v=nSCrMO_c-5s) [fiftyone](https://github.com/voxel51/fiftyone) - Viewer and tool for building high-quality datasets and computer vision models. - +Image Data Explorer - Microscopy Image Viewer, [Shiny App](https://shiny-portal.embl.de/shinyapps/app/01_image-data-explorer), [Video](https://www.youtube.com/watch?v=H8zIZvOt1MA). +[ImSwitch](https://github.com/ImSwitch/ImSwitch) - Microscopy Image Viewer, [Doc](https://imswitch.readthedocs.io/en/stable/gui.html), [Video](https://www.youtube.com/watch?v=XsbnMkGSPQQ). +[piximi](https://github.com/piximi/piximi) - Web-based image annotation and classification tool, [App](https://www.piximi.app/). +[DeepCell Label](https://label.deepcell.org/) - Data labeling tool to segment images, [Video](https://www.youtube.com/watch?v=zfsvUBkEeow). + ##### Image Restoration and Denoising [aydin](https://github.com/royerlab/aydin) - Image denoising. [DivNoising](https://github.com/juglab/DivNoising) - Unsupervised denoising method. @@ -446,6 +450,7 @@ Bleedthrough Correction using Lumos and Fiji - [Link](https://imagej.net/plugins ##### Microscopy Pipelines [SCIP](https://scalable-cytometry-image-processing.readthedocs.io/en/latest/usage.html) - Image processing pipeline on top of Dask. +[DeepCell Kiosk](https://github.com/vanvalenlab/kiosk-console/tree/master) - Image analysis platform. ##### Labsyspharm [mcmicro](https://github.com/labsyspharm/mcmicro) - Multiple-choice microscopy pipeline, [Website](https://mcmicro.org/overview/), [Paper](https://www.nature.com/articles/s41592-021-01308-y). @@ -467,6 +472,7 @@ Bleedthrough Correction using Lumos and Fiji - [Link](https://imagej.net/plugins [ZeroCostDL4Mic](https://github.com/HenriquesLab/ZeroCostDL4Mic/wiki) - Deep-Learning in Microscopy. [EmbedSeg](https://github.com/juglab/EmbedSeg) - Embedding-based Instance Segmentation. [micro-sam](https://github.com/computational-cell-analytics/micro-sam) - SegmentAnything for Microscopy. 
+[deepcell-tf](https://github.com/vanvalenlab/deepcell-tf/tree/master) - Cell segmentation, [DeepCell](https://deepcell.org/). ##### Cell Segmentation Datasets [cellpose](https://www.cellpose.org/dataset) - Cell images. From 50b7750592e4b07fef5f12c969e90de6e0eea91c Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Thu, 10 Aug 2023 10:51:39 +0200 Subject: [PATCH 049/152] Update README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 2e85a81..1be411f 100644 --- a/README.md +++ b/README.md @@ -444,7 +444,7 @@ Linear unmixing in Fiji for Bleedthrough Correction - [Youtube](https://www.yout Bleedthrough Correction using Lumos and Fiji - [Link](https://imagej.net/plugins/lumos-spectral-unmixing). ##### Platforms and Pipelines -[fractal](https://fractal-analytics-platform.github.io/) - Framework to process high-content imaging data. +[fractal](https://fractal-analytics-platform.github.io/) - Framework to process high-content imaging data. [atomai](https://github.com/pycroscopy/atomai) - Deep and Machine Learning for Microscopy. [py-clesperanto](https://github.com/clesperanto/pyclesperanto_prototype/) - Tools for 3D microscopy analysis, [deskewing](https://github.com/clEsperanto/pyclesperanto_prototype/blob/master/demo/transforms/deskew.ipynb) and lots of other tutorials, interacts with napari. From d4dd18a4db89df9ede98b831197a7ec0277a271a Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Fri, 11 Aug 2023 00:55:58 +0200 Subject: [PATCH 050/152] huey --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 1be411f..3b5c59f 100644 --- a/README.md +++ b/README.md @@ -1027,6 +1027,7 @@ AlphaZero methodology - [1](https://github.com/AppliedDataSciencePartners/DeepRe [kestra](https://github.com/kestra-io/kestra) - Workflow orchestration. [cml](https://github.com/iterative/cml) - CI/CD for Machine Learning Projects. [rocketry](https://github.com/Miksus/rocketry) - Task scheduling. 
+[huey](https://github.com/coleifer/huey) - Task queue. ##### Containerization and Docker [Reduce size of docker images (video)](https://www.youtube.com/watch?v=Z1Al4I4Os_A) From 5fb3e5d41953fd4b690b617221283a61b3bcfa11 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Fri, 11 Aug 2023 00:56:18 +0200 Subject: [PATCH 051/152] Update README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 3b5c59f..c0fccf7 100644 --- a/README.md +++ b/README.md @@ -1019,7 +1019,7 @@ AlphaZero methodology - [1](https://github.com/AppliedDataSciencePartners/DeepRe #### Deployment and Lifecycle Management ##### Workflow Scheduling and Orchestration -[nextflow](https://github.com/goodwright/nextflow.py) - Run scripts and workflow graphs in Docker image using Google Life Sciences, AWS Batch, [Website](https://github.com/nextflow-io/nextflow). +[nextflow](https://github.com/goodwright/nextflow.py) - Run scripts and workflow graphs in Docker image using Google Life Sciences, AWS Batch, [Website](https://github.com/nextflow-io/nextflow). [airflow](https://github.com/apache/airflow) - Schedule and monitor workflows. [prefect](https://github.com/PrefectHQ/prefect) - Python specific workflow scheduling. [dagster](https://github.com/dagster-io/dagster) - Development, production and observation of data assets. From 263569ad02d564da78136aaf1094fa36ff6cc75b Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sat, 12 Aug 2023 17:17:41 +0200 Subject: [PATCH 052/152] pyenv --- README.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/README.md b/README.md index c0fccf7..846d4b1 100644 --- a/README.md +++ b/README.md @@ -13,14 +13,16 @@ [rainbow-csv](https://marketplace.visualstudio.com/items?itemName=mechatroner.rainbow-csv) - VSCode plugin to display .csv files with nice colors. 
#### General Python Programming +[Python Best Practices Guide](https://github.com/qiwihui/pocket_readings/issues/1148#issuecomment-874448132) +[pyenv](https://github.com/pyenv/pyenv) - Manage multiple Python versions on your system. +[poetry](https://github.com/python-poetry/poetry) - Dependency management. +[pyscaffold](https://github.com/pyscaffold/pyscaffold) - Python project template generator. +[hydra](https://github.com/facebookresearch/hydra) - Configuration management. +[hatch](https://github.com/pypa/hatch) - Python project management. [more_itertools](https://more-itertools.readthedocs.io/en/latest/) - Extension of itertools. [tqdm](https://github.com/tqdm/tqdm) - Progress bars for for-loops. Also supports [pandas apply()](https://stackoverflow.com/a/34365537/1820480). [loguru](https://github.com/Delgan/loguru) - Python logging. -[dateparser](https://github.com/scrapinghub/dateparser) - A better date parser. -[hydra](https://github.com/facebookresearch/hydra) - Configuration management. -[pyscaffold](https://github.com/pyscaffold/pyscaffold) - Python project template generator. -[poetry](https://github.com/python-poetry/poetry) - Dependency management. -[hatch](https://github.com/pypa/hatch) - Python project management. + #### Pandas Tricks, Alternatives and Additions [pandasvault](https://github.com/firmai/pandasvault) - Large collection of pandas tricks. From 4a30e4de6bde7e456582840745ba1f84d11db26a Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Tue, 5 Sep 2023 13:54:01 +0200 Subject: [PATCH 053/152] High-Content Screening Assay Design --- README.md | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 846d4b1..adc985b 100644 --- a/README.md +++ b/README.md @@ -395,11 +395,19 @@ Embeddings - [GloVe](https://nlp.stanford.edu/projects/glove/) ([[1](https://www [broadinstitute/lincs-profiling-complementarity](https://github.com/broadinstitute/lincs-profiling-complementarity) - Cellpainting vs. 
L1000 assay. #### Biostatistics / Robust statistics -[Z-factor](https://en.wikipedia.org/wiki/Z-factor) - Measure of statistical effect size. [MinCovDet](https://scikit-learn.org/stable/modules/generated/sklearn.covariance.MinCovDet.html) - Robust estimator of covariance, RMPV, [Paper](https://wires.onlinelibrary.wiley.com/doi/full/10.1002/wics.1421), [App1](https://journals.sagepub.com/doi/10.1177/1087057112469257?url_ver=Z39.88-2003&rfr_id=ori%3Arid%3Acrossref.org&rfr_dat=cr_pub++0pubmed&), [App2](https://www.cell.com/cell-reports/pdf/S2211-1247(21)00694-X.pdf). [moderated z-score](https://clue.io/connectopedia/replicate_collapse) - Weighted average of z-scores based on Spearman correlation. [winsorize](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mstats.winsorize.html#scipy.stats.mstats.winsorize) - Simple adjustment of outliers. +#### High-Content Screening Assay Design +[Zhang XHD (2008) - Novel analytic criteria and effective plate designs for quality control in genome-wide RNAi screens](https://slas-discovery.org/article/S2472-5552(22)08204-1/pdf) +[Iversen - A Comparison of Assay Performance Measures in Screening Assays, Signal Window, Z′ Factor, and Assay Variability Ratio](https://www.slas-discovery.org/article/S2472-5552(22)08460-X/pdf) +[Z-factor](https://en.wikipedia.org/wiki/Z-factor) - Measure of statistical effect size. +[Z'-factor](https://link.springer.com/referenceworkentry/10.1007/978-3-540-47648-1_6298) - Measure of statistical effect size. +[CV](https://en.wikipedia.org/wiki/Coefficient_of_variation) - Coefficient of variation. +[SSMD](https://en.wikipedia.org/wiki/Strictly_standardized_mean_difference) - Strictly standardized mean difference. +[Signal Window](https://www.intechopen.com/chapters/48130) - Assay quality measurement. + #### Microscopy + Assay [BD Spectrum Viewer](https://www.bdbiosciences.com/en-us/resources/bd-spectrum-viewer) - Calculate spectral overlap, bleed through for fluorescence microscopy dyes. 
[SpectraViewer](https://www.perkinelmer.com/lab-products-and-services/spectraviewer) - Visualize the spectral compatibility of fluorophores (PerkinElmer). From 2a219e593baa42704999a5aa47500280b8b81698 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Tue, 12 Sep 2023 16:29:11 +0200 Subject: [PATCH 054/152] How large is that number in the Law of Large Numbers? --- README.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index adc985b..59074d6 100644 --- a/README.md +++ b/README.md @@ -132,7 +132,8 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal [Wainer - The Most Dangerous Equation](http://nsmn1.uh.edu/dgraur/niv/themostdangerousequation.pdf) [Gigerenzer - The Bias Bias in Behavioral Economics](https://www.nowpublishers.com/article/Details/RBE-0092) [Cook - Estimating the chances of something that hasn’t happened yet](https://www.johndcook.com/blog/2010/03/30/statistical-rule-of-three/) -[Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing](https://www.researchgate.net/publication/316652618_Same_Stats_Different_Graphs_Generating_Datasets_with_Varied_Appearance_and_Identical_Statistics_through_Simulated_Annealing), [Youtube](https://www.youtube.com/watch?v=DbJyPELmhJc) +[Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing](https://www.researchgate.net/publication/316652618_Same_Stats_Different_Graphs_Generating_Datasets_with_Varied_Appearance_and_Identical_Statistics_through_Simulated_Annealing), [Youtube](https://www.youtube.com/watch?v=DbJyPELmhJc) +[How large is that number in the Law of Large Numbers?](https://thepalindrome.org/p/how-large-that-number-in-the-law) #### Epidemiology [R Epidemics Consortium](https://www.repidemicsconsortium.org/projects/) - Large tool suite for working with epidemiological data (R packages). 
[Github](https://github.com/reconhub) From fc2a64aa0eef5917e3313e240706210ebb6a609f Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Tue, 26 Sep 2023 13:54:56 +0200 Subject: [PATCH 055/152] monkeybread --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 59074d6..0f0a034 100644 --- a/README.md +++ b/README.md @@ -523,6 +523,7 @@ Bleedthrough Correction using Lumos and Fiji - [Link](https://imagej.net/plugins [besca](https://github.com/bedapub/besca) - Beyond single-cell analysis. [janggu](https://github.com/BIMSBbioinfo/janggu) - Deep Learning for Genomics. [gdsctools](https://github.com/CancerRxGene/gdsctools) - Drug responses in the context of the Genomics of Drug Sensitivity in Cancer project, ANOVA, IC50, MoBEM, [doc](https://gdsctools.readthedocs.io/en/master/). +[monkeybread](https://github.com/immunitastx/monkeybread) - Analysis of single-cell spatial transcriptomics data. ##### Drug discovery [TDC](https://github.com/mims-harvard/TDC/tree/main) - Drug Discovery and Development. From bb229efc74a54708ba32df133335d7204448a8bb Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Fri, 29 Sep 2023 00:22:42 +0200 Subject: [PATCH 056/152] temporian --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 0f0a034..11ffa79 100644 --- a/README.md +++ b/README.md @@ -173,6 +173,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal [mlxtend](https://rasbt.github.io/mlxtend/user_guide/feature_extraction/LinearDiscriminantAnalysis/) - LDA. [featuretools](https://github.com/Featuretools/featuretools) - Automated feature engineering, [example](https://github.com/WillKoehrsen/automated-feature-engineering/blob/master/walk_through/Automated_Feature_Engineering.ipynb). [tsfresh](https://github.com/blue-yonder/tsfresh) - Time series feature engineering. +[temporian](https://github.com/google/temporian) - Time series feature engineering by Google. 
[pypeln](https://github.com/cgarciae/pypeln) - Concurrent data pipelines. [feature_engine](https://github.com/solegalli/feature_engine) - Encoders, transformers, etc. From f234ffb9b01f5c5d2028fa1105ed1f27d12b3857 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Fri, 13 Oct 2023 11:23:50 +0200 Subject: [PATCH 057/152] qupath --- README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/README.md b/README.md index 11ffa79..8cd585c 100644 --- a/README.md +++ b/README.md @@ -456,9 +456,11 @@ Linear unmixing in Fiji for Bleedthrough Correction - [Youtube](https://www.yout Bleedthrough Correction using Lumos and Fiji - [Link](https://imagej.net/plugins/lumos-spectral-unmixing). ##### Platforms and Pipelines +[CellProfiler](https://github.com/CellProfiler/CellProfiler), [CellProfilerAnalyst](https://github.com/CellProfiler/CellProfiler-Analyst) - Create image analysis pipelines. [fractal](https://fractal-analytics-platform.github.io/) - Framework to process high-content imaging data. [atomai](https://github.com/pycroscopy/atomai) - Deep and Machine Learning for Microscopy. [py-clesperanto](https://github.com/clesperanto/pyclesperanto_prototype/) - Tools for 3D microscopy analysis, [deskewing](https://github.com/clEsperanto/pyclesperanto_prototype/blob/master/demo/transforms/deskew.ipynb) and lots of other tutorials, interacts with napari. +[qupath](https://github.com/qupath/qupath) - Image analysis. ##### Microscopy Pipelines [SCIP](https://scalable-cytometry-image-processing.readthedocs.io/en/latest/usage.html) - Image processing pipeline on top of Dask. 
From cc79a3f43f9c1700e88c75902189cf306bfc9556 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Fri, 13 Oct 2023 17:17:49 +0200 Subject: [PATCH 058/152] The Prosecutor's Fallacy --- README.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 8cd585c..adb902f 100644 --- a/README.md +++ b/README.md @@ -133,7 +133,8 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal [Gigerenzer - The Bias Bias in Behavioral Economics](https://www.nowpublishers.com/article/Details/RBE-0092) [Cook - Estimating the chances of something that hasn’t happened yet](https://www.johndcook.com/blog/2010/03/30/statistical-rule-of-three/) [Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing](https://www.researchgate.net/publication/316652618_Same_Stats_Different_Graphs_Generating_Datasets_with_Varied_Appearance_and_Identical_Statistics_through_Simulated_Annealing), [Youtube](https://www.youtube.com/watch?v=DbJyPELmhJc) -[How large is that number in the Law of Large Numbers?](https://thepalindrome.org/p/how-large-that-number-in-the-law) +[How large is that number in the Law of Large Numbers?](https://thepalindrome.org/p/how-large-that-number-in-the-law) +[The Prosecutor's Fallacy](https://www.cebm.ox.ac.uk/news/views/the-prosecutors-fallacy) #### Epidemiology [R Epidemics Consortium](https://www.repidemicsconsortium.org/projects/) - Large tool suite for working with epidemiological data (R packages). 
[Github](https://github.com/reconhub) From 883da97c42303941b2c3ef8d59b7299899185555 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Fri, 20 Oct 2023 11:25:23 +0200 Subject: [PATCH 059/152] IMCWorkflow --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index adb902f..d385516 100644 --- a/README.md +++ b/README.md @@ -466,6 +466,7 @@ Bleedthrough Correction using Lumos and Fiji - [Link](https://imagej.net/plugins ##### Microscopy Pipelines [SCIP](https://scalable-cytometry-image-processing.readthedocs.io/en/latest/usage.html) - Image processing pipeline on top of Dask. [DeepCell Kiosk](https://github.com/vanvalenlab/kiosk-console/tree/master) - Image analysis platform. +[IMCWorkflow](https://github.com/BodenmillerGroup/IMCWorkflow/) - Image analysis pipeline using [steinbock](https://github.com/BodenmillerGroup/steinbock), [Twitter](https://twitter.com/NilsEling/status/1715020265963258087), [Paper](https://www.nature.com/articles/s41596-023-00881-0), [workflow](https://bodenmillergroup.github.io/IMCDataAnalysis/). ##### Labsyspharm [mcmicro](https://github.com/labsyspharm/mcmicro) - Multiple-choice microscopy pipeline, [Website](https://mcmicro.org/overview/), [Paper](https://www.nature.com/articles/s41592-021-01308-y). From 4cfdafac78d7b9a96c5cbcb3c1a3754c9ff38472 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sat, 21 Oct 2023 13:34:59 +0200 Subject: [PATCH 060/152] evaluate --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index d385516..bf254bf 100644 --- a/README.md +++ b/README.md @@ -940,6 +940,7 @@ Distances for comparing histograms and detecting outliers - [Talk](https://www.y [combo](https://github.com/yzhao062/combo) - Combining ML models (stacking, ensembling). #### Model Evaluation +[evaluate](https://github.com/huggingface/evaluate) - Evaluate machine learning models (huggingface). [pycm](https://github.com/sepandhaghighi/pycm) - Multi-class confusion matrix. 
[pandas_ml](https://github.com/pandas-ml/pandas-ml) - Confusion matrix. Plotting learning curve: [link](http://www.ritchieng.com/machinelearning-learning-curve/). From d84a3d82efe8c4e3769b72c2b404407d99eddfff Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sat, 21 Oct 2023 16:07:38 +0200 Subject: [PATCH 061/152] feature-engine --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index bf254bf..964136f 100644 --- a/README.md +++ b/README.md @@ -176,7 +176,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal [tsfresh](https://github.com/blue-yonder/tsfresh) - Time series feature engineering. [temporian](https://github.com/google/temporian) - Time series feature engineering by Google. [pypeln](https://github.com/cgarciae/pypeln) - Concurrent data pipelines. -[feature_engine](https://github.com/solegalli/feature_engine) - Encoders, transformers, etc. +[feature-engine](https://github.com/feature-engine/feature_engine) - Encoders, transformers, etc. 
#### Computer Vision [Intro to Computer Vision](https://www.youtube.com/playlist?list=PLjMXczUzEYcHvw5YYSU92WrY8IwhTuq7p) From c2907bf8209134f1df9f9ec36b8e6301bf2a3e1d Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Mon, 23 Oct 2023 21:12:08 +0200 Subject: [PATCH 062/152] Permutation Importance --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 964136f..59eb120 100644 --- a/README.md +++ b/README.md @@ -954,6 +954,7 @@ Plotting learning curve: [link](http://www.ritchieng.com/machinelearning-learnin #### Model Explanation, Interpretability, Feature Importance [Princeton - Reproducibility Crisis in ML‑based Science](https://sites.google.com/princeton.edu/rep-workshop) [Book](https://christophm.github.io/interpretable-ml-book/agnostic.html), [Examples](https://github.com/jphall663/interpretable_machine_learning_with_python) +scikit-learn - [Permutation Importance](https://scikit-learn.org/stable/modules/generated/sklearn.inspection.permutation_importance.html) (can be used on any trained classifier) and [Partial Dependence](https://scikit-learn.org/stable/modules/generated/sklearn.inspection.partial_dependence.html) [shap](https://github.com/slundberg/shap) - Explain predictions of machine learning models, [talk](https://www.youtube.com/watch?v=C80SQe16Rao), [Good Shap intro](https://www.aidancooper.co.uk/a-non-technical-guide-to-interpreting-shap-analyses/). [treeinterpreter](https://github.com/andosa/treeinterpreter) - Interpreting scikit-learn's decision tree and random forest predictions. [lime](https://github.com/marcotcr/lime) - Explaining the predictions of any machine learning classifier, [talk](https://www.youtube.com/watch?v=C80SQe16Rao), [Warning (Myth 7)](https://crazyoscarchang.github.io/2019/02/16/seven-myths-in-machine-learning-research/). 
From aa0ffe32d7860a7b16ac0e042d9ff88a93d07f81 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Fri, 27 Oct 2023 14:59:13 +0200 Subject: [PATCH 063/152] pyvips --- README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/README.md b/README.md index 59eb120..9392e5f 100644 --- a/README.md +++ b/README.md @@ -508,6 +508,7 @@ Bleedthrough Correction using Lumos and Fiji - [Link](https://imagej.net/plugins [mahotas](https://github.com/luispedro/mahotas) - Zernike, Haralick, LBP, and TAS features, [example](https://github.com/luispedro/python-image-tutorial/blob/master/Segmenting%20cell%20images%20(fluorescent%20microscopy).ipynb). [pyradiomics](https://github.com/AIM-Harvard/pyradiomics) - Radiomics features from medical imaging. [pyefd](https://github.com/hbldh/pyefd) - Elliptical feature descriptor, approximating a contour with a Fourier series. +[pyvips](https://github.com/libvips/pyvips/tree/master) - Faster image processing operations. #### Domain Adaptation / Batch-Effect Correction [Tran - A benchmark of batch-effect correction methods for single-cell RNA sequencing data](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1850-9), [Code](https://github.com/JinmiaoChenLab/Batch-effect-removal-benchmarking). @@ -557,6 +558,7 @@ Feature Visualization: [Blog](https://distill.pub/2017/feature-visualization/), [augmix](https://github.com/google-research/augmix) - Image augmentation from Google. [kornia](https://github.com/kornia/kornia) - Image augmentation, feature extraction and loss functions. [augly](https://github.com/facebookresearch/AugLy) - Image, audio, text, video augmentation from Facebook. +[pyvips](https://github.com/libvips/pyvips/tree/master) - Faster image processing operations. ##### Lossfunction Related [SegLoss](https://github.com/JunMa11/SegLoss) - List of loss functions for medical image segmentation. 
From e8b3a1db13e649a6b288df0823fa1e85a6c2ff0e Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Fri, 27 Oct 2023 15:29:11 +0200 Subject: [PATCH 064/152] Filters --- README.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 9392e5f..4bb018d 100644 --- a/README.md +++ b/README.md @@ -781,12 +781,16 @@ Other measures: #### Signal Processing and Filtering [Stanford Lecture Series on Fourier Transformation](https://see.stanford.edu/Course/EE261), [Youtube](https://www.youtube.com/watch?v=gZNm7L96pfY&list=PLB24BC7956EE040CD&index=1), [Lecture Notes](https://see.stanford.edu/materials/lsoftaee261/book-fall-07.pdf). [Visual Fourier explanation](https://dsego.github.io/demystifying-fourier/). -[The Scientist & Engineer's Guide to Digital Signal Processing (1999)](https://www.analog.com/en/education/education-library/scientist_engineers_guide.html). +[The Scientist & Engineer's Guide to Digital Signal Processing (1999)](https://www.analog.com/en/education/education-library/scientist_engineers_guide.html) - Chapter 3 has good introduction to Bessel, Butterworth and Chebyshev filters. [Kalman Filter article](https://www.bzarg.com/p/how-a-kalman-filter-works-in-pictures). [Kalman Filter book](https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python) - Focuses on intuition using Jupyter Notebooks. Includes Bayesian and various Kalman filters. [Interactive Tool](https://fiiir.com/) for FIR and IIR filters, [Examples](https://plot.ly/python/fft-filters/). [filterpy](https://github.com/rlabbe/filterpy) - Kalman filtering and optimal estimation library. 
+#### Filtering in Python +[scipy.signal](https://docs.scipy.org/doc/scipy/reference/signal.html) - [Butterworth low-pass filter example](https://github.com/guillaume-chevalier/filtering-stft-and-laplace-transform), [Savitzky–Golay filter](https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.savgol_filter.html'), [W](https://en.wikipedia.org/wiki/Savitzky%E2%80%93Golay_filter) +[pandas.Series.rolling](https://pandas.pydata.org/docs/reference/api/pandas.Series.rolling.html) - Choose appropriate `win_type`. + #### Geometry [geomstats](https://github.com/geomstats/geomstats) - Computations and statistics on manifolds with geometric structures. From 2b119dc2013b5e41c1f59c9ba0ce18618de455fd Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Fri, 27 Oct 2023 15:40:30 +0200 Subject: [PATCH 065/152] Update README.md --- README.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 4bb018d..3f072b1 100644 --- a/README.md +++ b/README.md @@ -788,7 +788,9 @@ Other measures: [filterpy](https://github.com/rlabbe/filterpy) - Kalman filtering and optimal estimation library. 
#### Filtering in Python -[scipy.signal](https://docs.scipy.org/doc/scipy/reference/signal.html) - [Butterworth low-pass filter example](https://github.com/guillaume-chevalier/filtering-stft-and-laplace-transform), [Savitzky–Golay filter](https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.savgol_filter.html'), [W](https://en.wikipedia.org/wiki/Savitzky%E2%80%93Golay_filter) +[scipy.signal](https://docs.scipy.org/doc/scipy/reference/signal.html) +* [Butterworth low-pass filter example](https://github.com/guillaume-chevalier/filtering-stft-and-laplace-transform) +* [Savitzky–Golay filter](https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.savgol_filter.html), [W](https://en.wikipedia.org/wiki/Savitzky%E2%80%93Golay_filter) [pandas.Series.rolling](https://pandas.pydata.org/docs/reference/api/pandas.Series.rolling.html) - Choose appropriate `win_type`. #### Geometry From 9ca943e8a0ae9026078bd3822e404adab43d8df0 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Tue, 31 Oct 2023 21:24:08 +0100 Subject: [PATCH 066/152] PICASSO --- README.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 3f072b1..2332bda 100644 --- a/README.md +++ b/README.md @@ -448,10 +448,13 @@ Image Data Explorer - Microscopy Image Viewer, [Shiny App](https://shiny-portal. [DivNoising](https://github.com/juglab/DivNoising) - Unsupervised denoising method. [CSBDeep](https://github.com/CSBDeep/CSBDeep) - Content-aware image restoration, [Project page](https://csbdeep.bioimagecomputing.com/tools/). -##### Illumination correction + Bleed through correction +##### Illumination correction [skimage](https://scikit-image.org/docs/dev/api/skimage.exposure.html#skimage.exposure.equalize_adapthist) - Illumination correction (CLAHE). [cidre](https://github.com/smithk/cidre) - Illumination correction method for optical microscopy. 
[BaSiCPy](https://github.com/peng-lab/BaSiCPy) - Background and Shading Correction of Optical Microscopy Images, [BaSiC](https://github.com/marrlab/BaSiC). + +##### Bleedthrough correction / Spectral Unmixing +[PICASSO](https://github.com/nygctech/PICASSO) - Blind unmixing without reference spectra measurement, [Paper](https://www.biorxiv.org/content/10.1101/2021.01.27.428247v1.full) [cytoflow](https://github.com/cytoflow/cytoflow) - Flow cytometry. Includes Bleedthrough correction methods. Linear unmixing in Fiji for Bleedthrough Correction - [Youtube](https://www.youtube.com/watch?v=W90qs0J29v8). Bleedthrough Correction using Lumos and Fiji - [Link](https://imagej.net/plugins/lumos-spectral-unmixing). From d345848d25e3e6bd24295794e30847e4867bb5d5 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Thu, 2 Nov 2023 10:27:34 +0100 Subject: [PATCH 067/152] AutoUnmix --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 2332bda..f45a9aa 100644 --- a/README.md +++ b/README.md @@ -458,6 +458,7 @@ Image Data Explorer - Microscopy Image Viewer, [Shiny App](https://shiny-portal. [cytoflow](https://github.com/cytoflow/cytoflow) - Flow cytometry. Includes Bleedthrough correction methods. Linear unmixing in Fiji for Bleedthrough Correction - [Youtube](https://www.youtube.com/watch?v=W90qs0J29v8). Bleedthrough Correction using Lumos and Fiji - [Link](https://imagej.net/plugins/lumos-spectral-unmixing). +AutoUnmix - [Link](https://www.biorxiv.org/content/10.1101/2023.05.30.542836v1.full). ##### Platforms and Pipelines [CellProfiler](https://github.com/CellProfiler/CellProfiler), [CellProfilerAnalyst](https://github.com/CellProfiler/CellProfiler-Analyst) - Create image analysis pipelines. 
From 364baeeada1403e7a8845c89789506d65726faa8 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Thu, 2 Nov 2023 16:04:27 +0100 Subject: [PATCH 068/152] Segment Anything and Segment Everything Everywhere All at Once --- README.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index f45a9aa..300cac6 100644 --- a/README.md +++ b/README.md @@ -491,9 +491,13 @@ AutoUnmix - [Link](https://www.biorxiv.org/content/10.1101/2023.05.30.542836v1.f [Cell-ACDC](https://github.com/SchmollerLab/Cell_ACDC) - Python GUI for cell segmentation and tracking. [ZeroCostDL4Mic](https://github.com/HenriquesLab/ZeroCostDL4Mic/wiki) - Deep-Learning in Microscopy. [EmbedSeg](https://github.com/juglab/EmbedSeg) - Embedding-based Instance Segmentation. -[micro-sam](https://github.com/computational-cell-analytics/micro-sam) - SegmentAnything for Microscopy. +[segment-anything](https://github.com/facebookresearch/segment-anything) - Segment Anything (SAM) from Facebook. +[micro-sam](https://github.com/computational-cell-analytics/micro-sam) - Segment Anything for Microscopy. +[Segment-Everything-Everywhere-All-At-Once](https://github.com/UX-Decoder/Segment-Everything-Everywhere-All-At-Once) - Segment Everything Everywhere All at Once from Microsoft. [deepcell-tf](https://github.com/vanvalenlab/deepcell-tf/tree/master) - Cell segmentation, [DeepCell](https://deepcell.org/). +https://github.com/UX-Decoder/Segment-Everything-Everywhere-All-At-Once + ##### Cell Segmentation Datasets [cellpose](https://www.cellpose.org/dataset) - Cell images. [omnipose](http://www.cellpose.org/dataset_omnipose) - Cell images. 
From c98c9d94b2c277eea3eac8c5382280e5f2f735e0 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sat, 4 Nov 2023 23:21:21 +0100 Subject: [PATCH 069/152] BiaPy --- README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/README.md b/README.md index 300cac6..900707e 100644 --- a/README.md +++ b/README.md @@ -468,6 +468,8 @@ AutoUnmix - [Link](https://www.biorxiv.org/content/10.1101/2023.05.30.542836v1.f [qupath](https://github.com/qupath/qupath) - Image analysis. ##### Microscopy Pipelines +Labsyspharm Stack see below. +[BiaPy](https://github.com/danifranco/BiaPy) - Bioimage analysis pipelines. [SCIP](https://scalable-cytometry-image-processing.readthedocs.io/en/latest/usage.html) - Image processing pipeline on top of Dask. [DeepCell Kiosk](https://github.com/vanvalenlab/kiosk-console/tree/master) - Image analysis platform. [IMCWorkflow](https://github.com/BodenmillerGroup/IMCWorkflow/) - Image analysis pipeline using [steinbock](https://github.com/BodenmillerGroup/steinbock), [Twitter](https://twitter.com/NilsEling/status/1715020265963258087), [Paper](https://www.nature.com/articles/s41596-023-00881-0), [workflow](https://bodenmillergroup.github.io/IMCDataAnalysis/). From 6b89c8ab34e6f9704465302830c6d0feb21b6598 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sun, 5 Nov 2023 00:15:57 +0100 Subject: [PATCH 070/152] DL4MicEverywhere --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 900707e..3e0c679 100644 --- a/README.md +++ b/README.md @@ -492,6 +492,7 @@ Labsyspharm Stack see below. [allencell](https://www.allencell.org/segmenter.html) - Tools for 3D segmentation, classical and deep learning methods. [Cell-ACDC](https://github.com/SchmollerLab/Cell_ACDC) - Python GUI for cell segmentation and tracking. [ZeroCostDL4Mic](https://github.com/HenriquesLab/ZeroCostDL4Mic/wiki) - Deep-Learning in Microscopy. 
+[DL4MicEverywhere](https://github.com/HenriquesLab/DL4MicEverywhere) - Bringing the ZeroCostDL4Mic experience using Docker. [EmbedSeg](https://github.com/juglab/EmbedSeg) - Embedding-based Instance Segmentation. [segment-anything](https://github.com/facebookresearch/segment-anything) - Segment Anything (SAM) from Facebook. [micro-sam](https://github.com/computational-cell-analytics/micro-sam) - Segment Anything for Microscopy. From 1a136db5c9c6a9dfc63117dad868466a368da1a5 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Tue, 7 Nov 2023 16:08:54 +0100 Subject: [PATCH 071/152] ilastik --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 3e0c679..ad911f3 100644 --- a/README.md +++ b/README.md @@ -488,6 +488,7 @@ Labsyspharm Stack see below. [cellpose](https://github.com/mouseland/cellpose) - Cell segmentation. [Paper](https://www.biorxiv.org/content/10.1101/2020.02.02.931238v1), [Dataset](https://www.cellpose.org/dataset). [stardist](https://github.com/stardist/stardist) - Cell segmentation with Star-convex Shapes. [UnMicst](https://github.com/HMS-IDAC/UnMicst) - Identifying Cells and Segmenting Tissue. +[ilastik](https://github.com/ilastik/ilastik) - Segment, classify, track and count cells. [ImageJ Plugin](https://github.com/ilastik/ilastik4ij). [nnUnet](https://github.com/MIC-DKFZ/nnUNet) - 3D biomedical image segmentation. [allencell](https://www.allencell.org/segmenter.html) - Tools for 3D segmentation, classical and deep learning methods. [Cell-ACDC](https://github.com/SchmollerLab/Cell_ACDC) - Python GUI for cell segmentation and tracking. From c5922a213ae260b2256370a78844e89ad0109254 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Tue, 7 Nov 2023 16:33:00 +0100 Subject: [PATCH 072/152] labkit --- README.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/README.md b/README.md index ad911f3..dda4bc3 100644 --- a/README.md +++ b/README.md @@ -499,8 +499,7 @@ Labsyspharm Stack see below. 
[micro-sam](https://github.com/computational-cell-analytics/micro-sam) - Segment Anything for Microscopy. [Segment-Everything-Everywhere-All-At-Once](https://github.com/UX-Decoder/Segment-Everything-Everywhere-All-At-Once) - Segment Everything Everywhere All at Once from Microsoft. [deepcell-tf](https://github.com/vanvalenlab/deepcell-tf/tree/master) - Cell segmentation, [DeepCell](https://deepcell.org/). - -https://github.com/UX-Decoder/Segment-Everything-Everywhere-All-At-Once +[labkit](https://github.com/juglab/labkit-ui) - Fiji plugin for image segmentation. ##### Cell Segmentation Datasets [cellpose](https://www.cellpose.org/dataset) - Cell images. From e456dcbec9516494b4f4a2bddcb8c0e94e6fcf56 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Wed, 8 Nov 2023 10:18:33 +0100 Subject: [PATCH 073/152] Napari Plugins --- README.md | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index dda4bc3..dfc137b 100644 --- a/README.md +++ b/README.md @@ -432,17 +432,21 @@ REMBI model - Recommended Metadata for Biological Images, BioImage Archive: [Stu [bdz](https://github.com/openssbd/bdz) - Zarr-based format for storing quantitative biological dynamics data. #### Image Viewers -[vizarr](https://github.com/hms-dbmi/vizarr) - Browser-based image viewer for zarr format. -[avivator](https://github.com/hms-dbmi/viv) - Browser-based image viewer for tiff files. [napari](https://github.com/napari/napari) - Image viewer and image processing tool. [Fiji](https://fiji.sc/) - General purpose tool. Image viewer and image processing tool. +[vizarr](https://github.com/hms-dbmi/vizarr) - Browser-based image viewer for zarr format. +[avivator](https://github.com/hms-dbmi/viv) - Browser-based image viewer for tiff files. [OMERO](https://www.openmicroscopy.org/omero/) - Image viewer for high-content screening. [IDR](https://idr.openmicroscopy.org/) uses OMERO. 
[Intro](https://www.youtube.com/watch?v=nSCrMO_c-5s) [fiftyone](https://github.com/voxel51/fiftyone) - Viewer and tool for building high-quality datasets and computer vision models. Image Data Explorer - Microscopy Image Viewer, [Shiny App](https://shiny-portal.embl.de/shinyapps/app/01_image-data-explorer), [Video](https://www.youtube.com/watch?v=H8zIZvOt1MA). [ImSwitch](https://github.com/ImSwitch/ImSwitch) - Microscopy Image Viewer, [Doc](https://imswitch.readthedocs.io/en/stable/gui.html), [Video](https://www.youtube.com/watch?v=XsbnMkGSPQQ). [pixmi](https://github.com/piximi/piximi) - Web-based image annotation and classification tool, [App](https://www.piximi.app/). [DeepCell Label](https://label.deepcell.org/) - Data labeling tool to segment images, [Video](https://www.youtube.com/watch?v=zfsvUBkEeow). - + +#### Napari Plugins +[napari-sam](https://github.com/MIC-DKFZ/napari-sam) - Segment Anything Plugin. +[napari-chatgpt](https://github.com/royerlab/napari-chatgpt) - ChatGPT Plugin. + ##### Image Restoration and Denoising [aydin](https://github.com/royerlab/aydin) - Image denoising. [DivNoising](https://github.com/juglab/DivNoising) - Unsupervised denoising method. From edf75297f78ee5209a44ce5b0afc40b3efc7b0f2 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Wed, 8 Nov 2023 11:00:55 +0100 Subject: [PATCH 074/152] connectomics --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index dfc137b..fe0b355 100644 --- a/README.md +++ b/README.md @@ -511,6 +511,7 @@ Labsyspharm Stack see below. [LIVECell](https://github.com/sartorius-research/LIVECell) - Cell images. [Sartorius](https://www.kaggle.com/competitions/sartorius-cell-instance-segmentation/overview) - Neurons. [EmbedSeg](https://github.com/juglab/EmbedSeg/releases/tag/v0.1.0) - 2D + 3D images. +[connectomics](https://sites.google.com/view/connectomics/) - Annotation of the EPFL Hippocampus dataset. 
##### Evaluation [seg-eval](https://github.com/lstrgar/seg-eval) - Cell segmentation performance evaluation without Ground Truth labels, [Paper](https://www.biorxiv.org/content/10.1101/2023.02.23.529809v1.full.pdf). From df191d0415a7009efec2f43de57d0e13d3f0a798 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Wed, 8 Nov 2023 22:09:17 +0100 Subject: [PATCH 075/152] Fractal --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index fe0b355..79e8482 100644 --- a/README.md +++ b/README.md @@ -477,6 +477,7 @@ Labsyspharm Stack see below. [SCIP](https://scalable-cytometry-image-processing.readthedocs.io/en/latest/usage.html) - Image processing pipeline on top of Dask. [DeepCell Kiosk](https://github.com/vanvalenlab/kiosk-console/tree/master) - Image analysis platform. [IMCWorkflow](https://github.com/BodenmillerGroup/IMCWorkflow/) - Image analysis pipeline using [steinbock](https://github.com/BodenmillerGroup/steinbock), [Twitter](https://twitter.com/NilsEling/status/1715020265963258087), [Paper](https://www.nature.com/articles/s41596-023-00881-0), [workflow](https://bodenmillergroup.github.io/IMCDataAnalysis/). +[Fractal](https://fractal-analytics-platform.github.io/) - Image analytics pipeline, [Github](https://github.com/fractal-analytics-platform). ##### Labsyspharm [mcmicro](https://github.com/labsyspharm/mcmicro) - Multiple-choice microscopy pipeline, [Website](https://mcmicro.org/overview/), [Paper](https://www.nature.com/articles/s41592-021-01308-y). 
From 9065bfa331d916be51fb061830b87ab7216e5db1 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Wed, 8 Nov 2023 22:10:20 +0100 Subject: [PATCH 076/152] Update README.md --- README.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/README.md b/README.md index 79e8482..784a56a 100644 --- a/README.md +++ b/README.md @@ -466,7 +466,7 @@ AutoUnmix - [Link](https://www.biorxiv.org/content/10.1101/2023.05.30.542836v1.f ##### Platforms and Pipelines [CellProfiler](https://github.com/CellProfiler/CellProfiler), [CellProfilerAnalyst](https://github.com/CellProfiler/CellProfiler-Analyst) - Create image analysis pipelines. -[fractal](https://fractal-analytics-platform.github.io/) - Framework to process high-content imaging data. +[fractal](https://fractal-analytics-platform.github.io/) - Framework to process high-content imaging data from UZH, [Github](https://github.com/fractal-analytics-platform). [atomai](https://github.com/pycroscopy/atomai) - Deep and Machine Learning for Microscopy. [py-clesperanto](https://github.com/clesperanto/pyclesperanto_prototype/) - Tools for 3D microscopy analysis, [deskewing](https://github.com/clEsperanto/pyclesperanto_prototype/blob/master/demo/transforms/deskew.ipynb) and lots of other tutorials, interacts with napari. [qupath](https://github.com/qupath/qupath) - Image analysis. @@ -477,7 +477,6 @@ Labsyspharm Stack see below. [SCIP](https://scalable-cytometry-image-processing.readthedocs.io/en/latest/usage.html) - Image processing pipeline on top of Dask. [DeepCell Kiosk](https://github.com/vanvalenlab/kiosk-console/tree/master) - Image analysis platform. [IMCWorkflow](https://github.com/BodenmillerGroup/IMCWorkflow/) - Image analysis pipeline using [steinbock](https://github.com/BodenmillerGroup/steinbock), [Twitter](https://twitter.com/NilsEling/status/1715020265963258087), [Paper](https://www.nature.com/articles/s41596-023-00881-0), [workflow](https://bodenmillergroup.github.io/IMCDataAnalysis/). 
-[Fractal](https://fractal-analytics-platform.github.io/) - Image analytics pipeline, [Github](https://github.com/fractal-analytics-platform). ##### Labsyspharm [mcmicro](https://github.com/labsyspharm/mcmicro) - Multiple-choice microscopy pipeline, [Website](https://mcmicro.org/overview/), [Paper](https://www.nature.com/articles/s41592-021-01308-y). From 9abb6e524f259102de65bb0c7f1270ef05098ec5 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Thu, 9 Nov 2023 11:17:03 +0100 Subject: [PATCH 077/152] Review of organoid pipelines --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 784a56a..6f1e8b4 100644 --- a/README.md +++ b/README.md @@ -487,6 +487,7 @@ Labsyspharm Stack see below. ##### Cell Segmentation [microscopy-tree](https://biomag-lab.github.io/microscopy-tree/) - Review of cell segmentation algorithms, [Paper](https://www.sciencedirect.com/science/article/abs/pii/S0962892421002518). +Review of organoid pipelines - [Paper](https://arxiv.org/ftp/arxiv/papers/2301/2301.02341.pdf). [BioImage.IO](https://bioimage.io/#/) - BioImage Model Zoo. [MEDIAR](https://github.com/Lee-Gihun/MEDIAR) - Cell segmentation. [cellpose](https://github.com/mouseland/cellpose) - Cell segmentation. [Paper](https://www.biorxiv.org/content/10.1101/2020.02.02.931238v1), [Dataset](https://www.cellpose.org/dataset). 
From 7a4d4fa2c9f5be6332f921ca1c1f939608d76456 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Mon, 13 Nov 2023 15:25:40 +0100 Subject: [PATCH 078/152] Satellite Image Lists --- README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/README.md b/README.md index 6f1e8b4..04a1e83 100644 --- a/README.md +++ b/README.md @@ -1161,6 +1161,8 @@ Gilbert Strang - [Matrix Methods in Data Analysis, Signal Processing, and Machin [Awesome Pytorch](https://github.com/bharathgs/Awesome-pytorch-list) [Awesome Quantitative Finance](https://github.com/wilsonfreitas/awesome-quant) [Awesome Recommender Systems](https://github.com/grahamjenson/list_of_recommender_systems) +[Awesome Satellite Benchmark Datasets](https://github.com/Seyed-Ali-Ahmadi/Awesome_Satellite_Benchmark_Datasets) +[Awesome Satellite Image for Deep Learning](https://github.com/satellite-image-deep-learning/techniques) [Awesome Single Cell](https://github.com/seandavi/awesome-single-cell) [Awesome Semantic Segmentation](https://github.com/mrgloom/awesome-semantic-segmentation) [Awesome Sentence Embedding](https://github.com/Separius/awesome-sentence-embedding) From 227d21fabb472fa2a79632f8e527905f8f0bd380 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Tue, 14 Nov 2023 10:04:50 +0100 Subject: [PATCH 079/152] Awesome Biological Image Analysis --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 04a1e83..525e76a 100644 --- a/README.md +++ b/README.md @@ -1129,6 +1129,7 @@ Gilbert Strang - [Matrix Methods in Data Analysis, Signal Processing, and Machin [Awesome AI Booksmarks](https://github.com/goodrahstar/my-awesome-AI-bookmarks) [Awesome AI on Kubernetes](https://github.com/CognonicLabs/awesome-AI-kubernetes) [Awesome Big Data](https://github.com/onurakpolat/awesome-bigdata) +[Awesome Biological Image Analysis](https://github.com/hallvaaw/awesome-biological-image-analysis) [Awesome Business Machine Learning](https://github.com/firmai/business-machine-learning) 
 [Awesome Causality](https://github.com/rguo12/awesome-causality-algorithms)
 [Awesome Community Detection](https://github.com/benedekrozemberczki/awesome-community-detection)

From ff944cbb0b8a05ae8976c0a47e355354d1cef819 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Wed, 15 Nov 2023 16:32:35 +0100
Subject: [PATCH 080/152] ZeroCostDL4Mic training dataset

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 525e76a..9cca59c 100644
--- a/README.md
+++ b/README.md
@@ -513,6 +513,7 @@ Review of organoid pipelines - [Paper](https://arxiv.org/ftp/arxiv/papers/2301/2
 [Sartorius](https://www.kaggle.com/competitions/sartorius-cell-instance-segmentation/overview) - Neurons.
 [EmbedSeg](https://github.com/juglab/EmbedSeg/releases/tag/v0.1.0) - 2D + 3D images.
 [connectomics](https://sites.google.com/view/connectomics/) - Annotation of the EPFL Hippocampus dataset.
+[ZeroCostDL4Mic](https://www.ebi.ac.uk/biostudies/BioImages/studies/S-BIAD895) - Stardist example training and test dataset.
 
 ##### Evaluation
 [seg-eval](https://github.com/lstrgar/seg-eval) - Cell segmentation performance evaluation without Ground Truth labels, [Paper](https://www.biorxiv.org/content/10.1101/2023.02.23.529809v1.full.pdf).

From fdcff75c0ba93e5c01704c2f44c3c3fd6c2d11c6 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Tue, 21 Nov 2023 18:34:31 +0100
Subject: [PATCH 081/152] Update README.md

---
 README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 9cca59c..716edf3 100644
--- a/README.md
+++ b/README.md
@@ -928,9 +928,10 @@ Distances for comparing histograms and detecting outliers - [Talk](https://www.y
 [causallib](https://github.com/IBM/causallib) - Modular causal inference analysis and model evaluations by IBM, [examples](https://github.com/IBM/causallib/tree/master/examples).
 [causalml](https://github.com/uber/causalml) - Causal inference by Uber.
 [upliftml](https://github.com/bookingcom/upliftml) - Causal inference by Booking.com.
-[EconML](https://github.com/microsoft/EconML) - Heterogeneous Treatment Effects Estimation by Microsoft.
 [causality](https://github.com/akelleh/causality) - Causal analysis using observational datasets.
 [DoubleML](https://github.com/DoubleML/doubleml-for-py) - Machine Learning + Causal inference, [Tweet](https://twitter.com/ChristophMolnar/status/1574338002305880068), [Presentation](https://scholar.princeton.edu/sites/default/files/bstewart/files/felton.chern_.slides.20190318.pdf), [Paper](https://arxiv.org/abs/1608.00060v1).
+[EconML](https://github.com/py-why/EconML) - Heterogeneous Treatment Effects Estimation by Microsoft.
+
 
 ##### Papers
 [Bours - Confounding](https://edisciplinas.usp.br/pluginfile.php/5625667/mod_resource/content/3/Nontechnicalexplanation-counterfactualdefinition-confounding.pdf)

From eca7dcf871a8f8eec6c23ef7f91470e892063a51 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Mon, 27 Nov 2023 20:55:01 +0100
Subject: [PATCH 082/152] The Dunning-Kruger Effect is Autocorrelation

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 716edf3..99069be 100644
--- a/README.md
+++ b/README.md
@@ -135,6 +135,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal
 [Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing](https://www.researchgate.net/publication/316652618_Same_Stats_Different_Graphs_Generating_Datasets_with_Varied_Appearance_and_Identical_Statistics_through_Simulated_Annealing), [Youtube](https://www.youtube.com/watch?v=DbJyPELmhJc)
 [How large is that number in the Law of Large Numbers?](https://thepalindrome.org/p/how-large-that-number-in-the-law)
 [The Prosecutor's Fallacy](https://www.cebm.ox.ac.uk/news/views/the-prosecutors-fallacy)
+[The Dunning-Kruger Effect is Autocorrelation](https://economicsfromthetopdown.com/2022/04/08/the-dunning-kruger-effect-is-autocorrelation/)
 
 #### Epidemiology
 [R Epidemics Consortium](https://www.repidemicsconsortium.org/projects/) - Large tool suite for working with epidemiological data (R packages). [Github](https://github.com/reconhub)

From 2718fa5e8f0494c1abb3db6d57633375e6186fb0 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Wed, 29 Nov 2023 16:13:35 +0100
Subject: [PATCH 083/152] Friends don't let friends make certain types of data visualization

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 99069be..5ef07e1 100644
--- a/README.md
+++ b/README.md
@@ -103,6 +103,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal
 [confseq](https://github.com/gostevehoward/confseq) - Uniform boundaries, confidence sequences, and always-valid p-values.
 
 ##### Visualizations
+[Friends don't let friends make certain types of data visualization](https://github.com/cxli233/FriendsDontLetFriends)
 [Great Overview over Visualizations](https://textvis.lnu.se/)
 [Dependent Propabilities](https://static.laszlokorte.de/stochastic/)
 [Null Hypothesis Significance Testing (NHST) and Sample Size Calculation](https://rpsychologist.com/d3/NHST/)

From 4ec8206a856005fc54da2e60c23d84580d6d7801 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Thu, 30 Nov 2023 16:56:14 +0100
Subject: [PATCH 084/152] Staining and imaging videos.

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 5ef07e1..a6c1887 100644
--- a/README.md
+++ b/README.md
@@ -388,6 +388,7 @@ Embeddings - [GloVe](https://nlp.stanford.edu/projects/glove/) ([[1](https://www
 [Awesome Cytodata](https://github.com/cytodata/awesome-cytodata)
 
 ##### Tutorials
+[MIT 7.016 Introductory Biology, Fall 2018](https://www.youtube.com/playlist?list=PLUl4u3cNGP63LmSVIVzy584-ZbjbJ-Y63) - Videos 27, 28, and 29 talk about staining and imaging.
 [bioimaging.org](https://www.bioimagingguide.org/welcome.html) - A biologists guide to planning and performing quantitative bioimaging experiments.
 [Bio-image Analysis Notebooks](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/intro.html) - Large collection of image processing workflows, including [point-spread-function estimation](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/18a_deconvolution/extract_psf.html) and [deconvolution](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/18a_deconvolution/introduction_deconvolution.html), [3D cell segmentation](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/20_image_segmentation/Segmentation_3D.html), [feature extraction](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/22_feature_extraction/statistics_with_pyclesperanto.html) using [pyclesperanto](https://github.com/clEsperanto/pyclesperanto_prototype) and others.
 [python_for_microscopists](https://github.com/bnsreenu/python_for_microscopists) - Notebooks and associated [youtube channel](https://www.youtube.com/channel/UC34rW-HtPJulxr5wp2Xa04w/videos) for a variety of image processing tasks.

From 4e6e34f9d2ad581d31306a43ded34a2d58577754 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Tue, 19 Dec 2023 09:31:38 +0100
Subject: [PATCH 085/152] skimpy

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index a6c1887..a269085 100644
--- a/README.md
+++ b/README.md
@@ -150,6 +150,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal
 #### Exploration and Cleaning
 [Checklist](https://github.com/r0f1/ml_checklist).
 [pyjanitor](https://github.com/pyjanitor-devs/pyjanitor) - Clean messy column names.
+[skimpy](https://github.com/aeturrell/skimpy) - Create summary statistics of dataframes. Helpful `clean_columns()` function.
 [pandera](https://github.com/unionai-oss/pandera) - Data / Schema validation.
 [impyute](https://github.com/eltonlaw/impyute) - Imputations.
 [fancyimpute](https://github.com/iskandr/fancyimpute) - Matrix completion and imputation algorithms.

From f646546367ce89b141f4de8955411ed3e036425b Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Thu, 21 Dec 2023 14:09:21 +0100
Subject: [PATCH 086/152] PCA papers

---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index a269085..2b3777e 100644
--- a/README.md
+++ b/README.md
@@ -237,6 +237,8 @@ SimCLR - [link](https://github.com/lightly-ai/lightly)
 
 ##### Packages
 [Dangers of PCA (paper)](https://www.nature.com/articles/s41598-022-14395-4).
+[Phantom oscillations in PCA](https://www.biorxiv.org/content/10.1101/2023.06.20.545619v1.full).
+[What to use instead of PCA](https://www.pnas.org/doi/10.1073/pnas.2319169120).
 [Talk](https://www.youtube.com/watch?v=9iol3Lk6kyU), [tsne intro](https://distill.pub/2016/misread-tsne/).
 [sklearn.manifold](https://scikit-learn.org/stable/modules/classes.html#module-sklearn.manifold) and [sklearn.decomposition](https://scikit-learn.org/stable/modules/classes.html#module-sklearn.decomposition) - PCA, t-SNE, MDS, Isomaps and others.
 Additional plots for PCA - Factor Loadings, Cumulative Variance Explained, [Correlation Circle Plot](http://rasbt.github.io/mlxtend/user_guide/plotting/plot_pca_correlation_graph/), [Tweet](https://twitter.com/rasbt/status/1555999903398219777/photo/1)

From a1506eed0f78f50df7f1d22360c3195b0a97acff Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Thu, 21 Dec 2023 14:09:44 +0100
Subject: [PATCH 087/152] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 2b3777e..8190e8c 100644
--- a/README.md
+++ b/README.md
@@ -238,7 +238,7 @@ SimCLR - [link](https://github.com/lightly-ai/lightly)
 ##### Packages
 [Dangers of PCA (paper)](https://www.nature.com/articles/s41598-022-14395-4).
 [Phantom oscillations in PCA](https://www.biorxiv.org/content/10.1101/2023.06.20.545619v1.full).
-[What to use instead of PCA](https://www.pnas.org/doi/10.1073/pnas.2319169120).
+[What to use instead of PCA](https://www.pnas.org/doi/10.1073/pnas.2319169120).
 [Talk](https://www.youtube.com/watch?v=9iol3Lk6kyU), [tsne intro](https://distill.pub/2016/misread-tsne/).
 [sklearn.manifold](https://scikit-learn.org/stable/modules/classes.html#module-sklearn.manifold) and [sklearn.decomposition](https://scikit-learn.org/stable/modules/classes.html#module-sklearn.decomposition) - PCA, t-SNE, MDS, Isomaps and others.
 Additional plots for PCA - Factor Loadings, Cumulative Variance Explained, [Correlation Circle Plot](http://rasbt.github.io/mlxtend/user_guide/plotting/plot_pca_correlation_graph/), [Tweet](https://twitter.com/rasbt/status/1555999903398219777/photo/1)

From bce5e92a9c2f27781e2edbf9108c85c8ff8ce1bf Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Thu, 4 Jan 2024 17:15:24 +0100
Subject: [PATCH 088/152] Estimating Effect Sizes

---
 README.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/README.md b/README.md
index 8190e8c..d0546cc 100644
--- a/README.md
+++ b/README.md
@@ -90,6 +90,9 @@
 Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandaltman.html), [2](http://www.statsmodels.org/dev/generated/statsmodels.graphics.agreement.mean_diff_plot.html) - Plot for agreement between two methods of measurement.
 [ANOVA](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.f_oneway.html)
 
+##### Effect Size
+[Estimating Effect Sizes From Pretest-Posttest-Control Group Designs](https://journals.sagepub.com/doi/epdf/10.1177/1094428106291059) - Scott B. Morris
+
 ##### Statistical Tests
 [test_proportions_2indep](https://www.statsmodels.org/dev/generated/statsmodels.stats.proportion.test_proportions_2indep.html) - Proportion test.
 [G-Test](https://en.wikipedia.org/wiki/G-test) - Alternative to chi-square test, [power_divergence](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.power_divergence.html).
From 83f45eb723cdfd7138222e162ba485bdabe68ff8 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Thu, 4 Jan 2024 17:16:10 +0100
Subject: [PATCH 089/152] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index d0546cc..fe94d25 100644
--- a/README.md
+++ b/README.md
@@ -91,7 +91,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal
 [ANOVA](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.f_oneway.html)
 
 ##### Effect Size
-[Estimating Effect Sizes From Pretest-Posttest-Control Group Designs](https://journals.sagepub.com/doi/epdf/10.1177/1094428106291059) - Scott B. Morris
+[Estimating Effect Sizes From Pretest-Posttest-Control Group Designs](https://journals.sagepub.com/doi/epdf/10.1177/1094428106291059) - Scott B. Morris, [Twitter](https://twitter.com/MatthewBJane/status/1742588609025200557)
 
 ##### Statistical Tests
 [test_proportions_2indep](https://www.statsmodels.org/dev/generated/statsmodels.stats.proportion.test_proportions_2indep.html) - Proportion test.
From 62f62205acd922f3580cd22c0c560a5716835275 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Sat, 6 Jan 2024 11:29:26 +0100
Subject: [PATCH 090/152] Rafi, Greenland article

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index fe94d25..54a314d 100644
--- a/README.md
+++ b/README.md
@@ -140,6 +140,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal
 [How large is that number in the Law of Large Numbers?](https://thepalindrome.org/p/how-large-that-number-in-the-law)
 [The Prosecutor's Fallacy](https://www.cebm.ox.ac.uk/news/views/the-prosecutors-fallacy)
 [The Dunning-Kruger Effect is Autocorrelation](https://economicsfromthetopdown.com/2022/04/08/the-dunning-kruger-effect-is-autocorrelation/)
+[Rafi, Greenland - Semantic and cognitive tools to aid statistical science: replace confidence and significance by compatibility and surprise](https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-020-01105-9)
 
 #### Epidemiology
 [R Epidemics Consortium](https://www.repidemicsconsortium.org/projects/) - Large tool suite for working with epidemiological data (R packages). [Github](https://github.com/reconhub)

From 306118f6ee338781581a435301fcab9f6725277d Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Sun, 7 Jan 2024 10:26:05 +0100
Subject: [PATCH 091/152] mlx

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 54a314d..a8a09fd 100644
--- a/README.md
+++ b/README.md
@@ -28,6 +28,7 @@
 [pandasvault](https://github.com/firmai/pandasvault) - Large collection of pandas tricks.
 [polars](https://github.com/pola-rs/polars) - Multi-threaded alternative to pandas.
 [xarray](https://github.com/pydata/xarray/) - Extends pandas to n-dimensional arrays.
+[mlx](https://github.com/ml-explore/mlx) - An array framework for Apple silicon.
 [pandas_flavor](https://github.com/Zsailer/pandas_flavor) - Write custom accessors like `.str` and `.dt`.
 [duckdb](https://github.com/duckdb/duckdb) - Efficiently run SQL queries on pandas DataFrame.
 

From 915a1266e10f154de53eb2c391fe28e8592943f7 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Mon, 8 Jan 2024 13:14:54 +0100
Subject: [PATCH 092/152] Evaluation

---
 README.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/README.md b/README.md
index a8a09fd..abc109c 100644
--- a/README.md
+++ b/README.md
@@ -143,6 +143,9 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal
 [The Dunning-Kruger Effect is Autocorrelation](https://economicsfromthetopdown.com/2022/04/08/the-dunning-kruger-effect-is-autocorrelation/)
 [Rafi, Greenland - Semantic and cognitive tools to aid statistical science: replace confidence and significance by compatibility and surprise](https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-020-01105-9)
 
+#### Evaluation
+[Collins et al. - Evaluation of clinical prediction models (part 1): from development to external validation](https://www.bmj.com/content/384/bmj-2023-074819.full)
+
 #### Epidemiology
 [R Epidemics Consortium](https://www.repidemicsconsortium.org/projects/) - Large tool suite for working with epidemiological data (R packages). [Github](https://github.com/reconhub)
 [incidence2](https://github.com/reconhub/incidence2) - Computation, handling, visualisation and simple modelling of incidence (R package).
From e4b8afa9d5b874bcbb5ee7fbbe76477917bbf082 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Mon, 8 Jan 2024 13:15:59 +0100
Subject: [PATCH 093/152] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index abc109c..2dacc3f 100644
--- a/README.md
+++ b/README.md
@@ -144,7 +144,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal
 [Rafi, Greenland - Semantic and cognitive tools to aid statistical science: replace confidence and significance by compatibility and surprise](https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-020-01105-9)
 
 #### Evaluation
-[Collins et al. - Evaluation of clinical prediction models (part 1): from development to external validation](https://www.bmj.com/content/384/bmj-2023-074819.full)
+[Collins et al. - Evaluation of clinical prediction models (part 1): from development to external validation](https://www.bmj.com/content/384/bmj-2023-074819.full) - [Twitter](https://twitter.com/GSCollins/status/1744309712995098624)
 
 #### Epidemiology
 [R Epidemics Consortium](https://www.repidemicsconsortium.org/projects/) - Large tool suite for working with epidemiological data (R packages). [Github](https://github.com/reconhub)

From b02087d40409f8acbdc5c0c7afb95304eb3ec0ec Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Fri, 19 Jan 2024 21:59:18 +0100
Subject: [PATCH 094/152] Update README.md

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 2dacc3f..7bb2958 100644
--- a/README.md
+++ b/README.md
@@ -118,6 +118,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal
 [Bayesian two-sample t test](https://rpsychologist.com/d3/bayes/)
 [Distribution of p-values when comparing two groups](https://rpsychologist.com/d3/pdist/)
 [Understanding the t-distribution and its normal approximation](https://rpsychologist.com/d3/tdist/)
+[Statistical Power and Sample Size Calculation Tools](https://pwrss.shinyapps.io/index/)
 
 ##### Talks
 [Inverse Propensity Weighting](https://www.youtube.com/watch?v=SUq0shKLPPs)

From b5030f064bc013278c1d4091a47052355da18644 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Sun, 11 Feb 2024 21:43:18 +0100
Subject: [PATCH 095/152] ASA Statement on p-Values

---
 README.md | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 7bb2958..b24919c 100644
--- a/README.md
+++ b/README.md
@@ -79,6 +79,11 @@
 
 #### Classical Statistics
 
+##### p-values
+[The ASA Statement on p-Values: Context, Process, and Purpose](https://amstat.tandfonline.com/doi/full/10.1080/00031305.2016.1154108#.Vt2XIOaE2MN)
+[Greenland - Statistical tests, P-values, confidence intervals, and power: a guide to misinterpretations](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4877414/)
+[Blume - Second-generation p-values: Improved rigor, reproducibility, & transparency in statistical analyses](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0188299)
+
 ##### Correlation
 [phik](https://github.com/kaveio/phik) - Correlation between categorical, ordinal and interval variables.
 
@@ -130,8 +135,6 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal
 [Verifying the Assumptions of Linear Models](https://github.com/erykml/medium_articles/blob/master/Statistics/linear_regression_assumptions.ipynb)
 [Mediation and Moderation Intro](https://ademos.people.uic.edu/Chapter14.html)
 [Montgomery et al. - How conditioning on post-treatment variables can ruin your experiment and what to do about it](https://cpb-us-e1.wpmucdn.com/sites.dartmouth.edu/dist/5/2293/files/2021/03/post-treatment-bias.pdf)
-[Greenland - Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4877414/)
-[Blume - Second-generation p-values: Improved rigor, reproducibility, & transparency in statistical analyses](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0188299)
 [Lindeløv - Common statistical tests are linear models](https://lindeloev.github.io/tests-as-linear/)
 [Chatruc - The Central Limit Theorem and its misuse](https://web.archive.org/web/20191229234155/https://lambdaclass.com/data_etudes/central_limit_theorem_misuse/)
 [Al-Saleh - Properties of the Standard Deviation that are Rarely Mentioned in Classrooms](http://www.stat.tugraz.at/AJS/ausg093/093Al-Saleh.pdf)

From a2498ee142377d075d0033c78b78704777a5c90a Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Sun, 11 Feb 2024 21:43:55 +0100
Subject: [PATCH 096/152] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index b24919c..e244dfa 100644
--- a/README.md
+++ b/README.md
@@ -80,7 +80,7 @@
 #### Classical Statistics
 
 ##### p-values
-[The ASA Statement on p-Values: Context, Process, and Purpose](https://amstat.tandfonline.com/doi/full/10.1080/00031305.2016.1154108#.Vt2XIOaE2MN)
+[The ASA Statement on p-Values: Context, Process, and Purpose](https://amstat.tandfonline.com/doi/full/10.1080/00031305.2016.1154108#.Vt2XIOaE2MN)
 [Greenland - Statistical tests, P-values, confidence intervals, and power: a guide to misinterpretations](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4877414/)
 [Blume - Second-generation p-values: Improved rigor, reproducibility, & transparency in statistical analyses](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0188299)
 

From eadd434a5cec4c4dfde2472999be67c6864aad7f Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Thu, 22 Feb 2024 09:27:14 +0100
Subject: [PATCH 097/152] hoeffd

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index e244dfa..c70b7c0 100644
--- a/README.md
+++ b/README.md
@@ -86,6 +86,7 @@
 
 ##### Correlation
 [phik](https://github.com/kaveio/phik) - Correlation between categorical, ordinal and interval variables.
+[hoeffd](https://search.r-project.org/CRAN/refmans/Hmisc/html/hoeffd.html) - Hoeffding's D Statistics, measure of dependence (R package).
 
 ##### Packages
 [statsmodels](https://www.statsmodels.org/stable/index.html) - Statistical tests.

From 5e620ececd19760566f83ccbd1dbb8c3823256e0 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Mon, 26 Feb 2024 22:17:05 +0100
Subject: [PATCH 098/152] pwrss

---
 README.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/README.md b/README.md
index c70b7c0..0d2b3bc 100644
--- a/README.md
+++ b/README.md
@@ -107,6 +107,9 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal
 ##### Comparing Two Populations
 [torch-two-sample](https://github.com/josipd/torch-two-sample) - Friedman-Rafsky Test: Compare two population based on a multivariate generalization of the Runstest. [Explanation](https://www.real-statistics.com/multivariate-statistics/multivariate-normal-distribution/friedman-rafsky-test/), [Application](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5014134/)
 
+##### Power and Sample Size Calculations
+[pwrss](https://cran.r-project.org/web/packages/pwrss/index.html) - Statistical Power and Sample Size Calculation Tools (R package), [Tutorial with t-test](https://rpubs.com/metinbulus/welch)
+
 ##### Interim Analyses / Sequential Analysis / Stopping
 [Sequential Analysis](https://en.wikipedia.org/wiki/Sequential_analysis) - Wikipedia.
 [sequential](https://cran.r-project.org/web/packages/Sequential/Sequential.pdf) - Exact Sequential Analysis for Poisson and Binomial Data (R package).

From f821174cddc25750b5df297c1ab03767b4541d14 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Fri, 1 Mar 2024 14:04:32 +0100
Subject: [PATCH 099/152] daft

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 0d2b3bc..6a62ee1 100644
--- a/README.md
+++ b/README.md
@@ -31,6 +31,7 @@
 [mlx](https://github.com/ml-explore/mlx) - An array framework for Apple silicon.
 [pandas_flavor](https://github.com/Zsailer/pandas_flavor) - Write custom accessors like `.str` and `.dt`.
 [duckdb](https://github.com/duckdb/duckdb) - Efficiently run SQL queries on pandas DataFrame.
+[daft](https://github.com/Eventual-Inc/Daft) - Distributed DataFrame.
 
 #### Pandas Parallelization
 [modin](https://github.com/modin-project/modin) - Parallelization library for faster pandas `DataFrame`.
From 23fa0708b29f7848d10edc684bf3c36e20147c74 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Sat, 2 Mar 2024 14:11:28 +0100
Subject: [PATCH 100/152] Data Science Books with R

---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index 6a62ee1..61971d9 100644
--- a/README.md
+++ b/README.md
@@ -1148,6 +1148,8 @@ Gilbert Strang - [Matrix Methods in Data Analysis, Signal Processing, and Machin
 [Blum - Foundations of Data Science](https://www.cs.cornell.edu/jeh/book.pdf?file=book.pdf)
 [Chan - Introduction to Probability for Data Science](https://probability4datascience.com/index.html)
 [Colonescu - Principles of Econometrics with R](https://bookdown.org/ccolonescu/RPoE4/)
+[Rafael Irizarry - Introduction to Data Science](https://rafalab.dfci.harvard.edu/dsbook-part-1/) (R Language)
+[Rafael Irizarry - Advanced Data Science](https://rafalab.dfci.harvard.edu/dsbook-part-2/) (R Language)
 
 ##### Other Awesome Lists
 [Awesome Adversarial Machine Learning](https://github.com/yenchenlin/awesome-adversarial-machine-learning)

From c04e2be0397d11b606d2ccf4143806fef3ae98ad Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Sat, 2 Mar 2024 14:11:53 +0100
Subject: [PATCH 101/152] Update README.md

---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 61971d9..845faf6 100644
--- a/README.md
+++ b/README.md
@@ -1148,8 +1148,8 @@ Gilbert Strang - [Matrix Methods in Data Analysis, Signal Processing, and Machin
 [Blum - Foundations of Data Science](https://www.cs.cornell.edu/jeh/book.pdf?file=book.pdf)
 [Chan - Introduction to Probability for Data Science](https://probability4datascience.com/index.html)
 [Colonescu - Principles of Econometrics with R](https://bookdown.org/ccolonescu/RPoE4/)
-[Rafael Irizarry - Introduction to Data Science](https://rafalab.dfci.harvard.edu/dsbook-part-1/) (R Language)
-[Rafael Irizarry - Advanced Data Science](https://rafalab.dfci.harvard.edu/dsbook-part-2/) (R Language)
+[Rafael Irizarry - Introduction to Data Science](https://rafalab.dfci.harvard.edu/dsbook-part-1/) (R Language)
+[Rafael Irizarry - Advanced Data Science](https://rafalab.dfci.harvard.edu/dsbook-part-2/) (R Language)
 
 ##### Other Awesome Lists
 [Awesome Adversarial Machine Learning](https://github.com/yenchenlin/awesome-adversarial-machine-learning)

From 89f3ef1e9087c136c508fea00689f289004ef464 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Mon, 4 Mar 2024 16:43:31 +0100
Subject: [PATCH 102/152] Guess the Correlation

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 845faf6..544abf7 100644
--- a/README.md
+++ b/README.md
@@ -86,6 +86,7 @@
 [Blume - Second-generation p-values: Improved rigor, reproducibility, & transparency in statistical analyses](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0188299)
 
 ##### Correlation
+[Guess the Correlation](https://www.guessthecorrelation.com/) - Correlation guessing game.
 [phik](https://github.com/kaveio/phik) - Correlation between categorical, ordinal and interval variables.
 [hoeffd](https://search.r-project.org/CRAN/refmans/Hmisc/html/hoeffd.html) - Hoeffding's D Statistics, measure of dependence (R package).
 

From eb0addb9aa6bb56a57232981ec0532abdabf8842 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Mon, 11 Mar 2024 00:08:11 +0100
Subject: [PATCH 103/152] Introduction to Bioimage Analysis

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 544abf7..eea4e4e 100644
--- a/README.md
+++ b/README.md
@@ -411,6 +411,7 @@ Embeddings - [GloVe](https://nlp.stanford.edu/projects/glove/) ([[1](https://www
 ##### Tutorials
 [MIT 7.016 Introductory Biology, Fall 2018](https://www.youtube.com/playlist?list=PLUl4u3cNGP63LmSVIVzy584-ZbjbJ-Y63) - Videos 27, 28, and 29 talk about staining and imaging.
 [bioimaging.org](https://www.bioimagingguide.org/welcome.html) - A biologists guide to planning and performing quantitative bioimaging experiments.
+[Introduction to Bioimage Analysis](https://bioimagebook.github.io/index.html) - Book.
 [Bio-image Analysis Notebooks](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/intro.html) - Large collection of image processing workflows, including [point-spread-function estimation](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/18a_deconvolution/extract_psf.html) and [deconvolution](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/18a_deconvolution/introduction_deconvolution.html), [3D cell segmentation](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/20_image_segmentation/Segmentation_3D.html), [feature extraction](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/22_feature_extraction/statistics_with_pyclesperanto.html) using [pyclesperanto](https://github.com/clEsperanto/pyclesperanto_prototype) and others.
 [python_for_microscopists](https://github.com/bnsreenu/python_for_microscopists) - Notebooks and associated [youtube channel](https://www.youtube.com/channel/UC34rW-HtPJulxr5wp2Xa04w/videos) for a variety of image processing tasks.
From 6bb03c01d399dae26f6381016b49c63f14dd4136 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Sun, 7 Apr 2024 22:30:33 +0200
Subject: [PATCH 104/152] Rubin - Inconsistent multiple testing corrections

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index eea4e4e..746bca0 100644
--- a/README.md
+++ b/README.md
@@ -84,6 +84,7 @@
 [The ASA Statement on p-Values: Context, Process, and Purpose](https://amstat.tandfonline.com/doi/full/10.1080/00031305.2016.1154108#.Vt2XIOaE2MN)
 [Greenland - Statistical tests, P-values, confidence intervals, and power: a guide to misinterpretations](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4877414/)
 [Blume - Second-generation p-values: Improved rigor, reproducibility, & transparency in statistical analyses](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0188299)
+[Rubin - Inconsistent multiple testing corrections: The fallacy of using family-based error rates to make inferences about individual hypotheses](https://www.sciencedirect.com/science/article/pii/S2590260124000067?via%3Dihub)
 
 ##### Correlation
 [Guess the Correlation](https://www.guessthecorrelation.com/) - Correlation guessing game.
From 378c88e4eca367f1c613aecdd11371897dec070f Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Thu, 11 Apr 2024 17:27:42 +0200
Subject: [PATCH 105/152] On the uses and abuses of regression models

---
 README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 746bca0..349c5fa 100644
--- a/README.md
+++ b/README.md
@@ -152,7 +152,8 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal
 [How large is that number in the Law of Large Numbers?](https://thepalindrome.org/p/how-large-that-number-in-the-law)
 [The Prosecutor's Fallacy](https://www.cebm.ox.ac.uk/news/views/the-prosecutors-fallacy)
 [The Dunning-Kruger Effect is Autocorrelation](https://economicsfromthetopdown.com/2022/04/08/the-dunning-kruger-effect-is-autocorrelation/)
-[Rafi, Greenland - Semantic and cognitive tools to aid statistical science: replace confidence and significance by compatibility and surprise](https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-020-01105-9)
+[Rafi, Greenland - Semantic and cognitive tools to aid statistical science: replace confidence and significance by compatibility and surprise](https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-020-01105-9)
+[Carlin et al. - On the uses and abuses of regression models: a call for reform of statistical practice and teaching](https://arxiv.org/abs/2309.06668)
 
 #### Evaluation
 [Collins et al. - Evaluation of clinical prediction models (part 1): from development to external validation](https://www.bmj.com/content/384/bmj-2023-074819.full) - [Twitter](https://twitter.com/GSCollins/status/1744309712995098624)

From 228931142d43473553703296c23a5b0a4e3eca60 Mon Sep 17 00:00:00 2001
From: Florian Rohrer
Date: Fri, 12 Apr 2024 18:36:33 +0200
Subject: [PATCH 106/152] Gigerenzer - Mindless Statistics

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 349c5fa..31fc7cb 100644
--- a/README.md
+++ b/README.md
@@ -85,6 +85,7 @@
 [Greenland - Statistical tests, P-values, confidence intervals, and power: a guide to misinterpretations](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4877414/)
 [Blume - Second-generation p-values: Improved rigor, reproducibility, & transparency in statistical analyses](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0188299)
 [Rubin - Inconsistent multiple testing corrections: The fallacy of using family-based error rates to make inferences about individual hypotheses](https://www.sciencedirect.com/science/article/pii/S2590260124000067?via%3Dihub)
+[Gigerenzer - Mindless Statistics](https://library.mpib-berlin.mpg.de/ft/gg/GG_Mindless_2004.pdf)
 
 ##### Correlation
 [Guess the Correlation](https://www.guessthecorrelation.com/) - Correlation guessing game.
From 3186182b26c36e7967807cda409c5e7f6d86664a Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Fri, 12 Apr 2024 19:22:34 +0200 Subject: [PATCH 107/152] Update README.md --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 31fc7cb..23cfdb7 100644 --- a/README.md +++ b/README.md @@ -86,6 +86,7 @@ [Blume - Second-generation p-values: Improved rigor, reproducibility, & transparency in statistical analyses](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0188299) [Rubin - Inconsistent multiple testing corrections: The fallacy of using family-based error rates to make inferences about individual hypotheses](https://www.sciencedirect.com/science/article/pii/S2590260124000067?via%3Dihub) [Gigerenzer - Mindless Statistics](https://library.mpib-berlin.mpg.de/ft/gg/GG_Mindless_2004.pdf) +[Rubin - That's not a two-sided test! It's two one-sided tests!](https://rss.onlinelibrary.wiley.com/doi/full/10.1111/1740-9713.01405) ##### Correlation [Guess the Correlation](https://www.guessthecorrelation.com/) - Correlation guessing game. From da2da2f3b6ba6ef562f990c971b9864ed1a20798 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Fri, 10 May 2024 23:20:12 +0200 Subject: [PATCH 108/152] Rdatasets --- README.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/README.md b/README.md index 23cfdb7..7b017d8 100644 --- a/README.md +++ b/README.md @@ -80,6 +80,9 @@ #### Classical Statistics +##### Datasets +[Rdatasets](https://vincentarelbundock.github.io/Rdatasets/articles/data.html) - Collection of more than 2000 datasets, stored as csv files. 
+ ##### p-values [The ASA Statement on p-Values: Context, Process, and Purpose](https://amstat.tandfonline.com/doi/full/10.1080/00031305.2016.1154108#.Vt2XIOaE2MN) [Greenland - Statistical tests, P-values, confidence intervals, and power: a guide to misinterpretations](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4877414/) From 2660d85eb5f1abf5b448f4420f4d280931a41fe6 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Tue, 21 May 2024 23:45:29 +0200 Subject: [PATCH 109/152] vegdist --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 7b017d8..4618274 100644 --- a/README.md +++ b/README.md @@ -769,6 +769,7 @@ Understanding SVM Regression: [slides](https://cs.adelaide.edu.au/~chhshen/teach #### Distance Functions [scipy.spatial](https://docs.scipy.org/doc/scipy/reference/spatial.distance.html) - All kinds of distance metrics. +[vegdist](https://rdrr.io/cran/vegan/man/vegdist.html) - Distance metrics (R package). [pyemd](https://github.com/wmayner/pyemd) - Earth Mover's Distance / Wasserstein distance, similarity between histograms. [OpenCV implementation](https://docs.opencv.org/3.4/d6/dc7/group__imgproc__hist.html), [POT implementation](https://pythonot.github.io/auto_examples/plot_OT_2D_samples.html) [dcor](https://github.com/vnmabus/dcor) - Distance correlation and related Energy statistics. [GeomLoss](https://www.kernel-operations.io/geomloss/) - Kernel norms, Hausdorff divergences, Debiased Sinkhorn divergences (=approximation of Wasserstein distance). 
From 5bf5466bf48ad5d193aea4a8c5db679922c3e357 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Tue, 21 May 2024 23:47:09 +0200 Subject: [PATCH 110/152] Cosine-Similarity paper --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 4618274..95345b8 100644 --- a/README.md +++ b/README.md @@ -768,6 +768,7 @@ Understanding SVM Regression: [slides](https://cs.adelaide.edu.au/~chhshen/teach [TensorFlow similarity](https://github.com/tensorflow/similarity) - Metric learning. #### Distance Functions +[Steck et al. - Is Cosine-Similarity of Embeddings Really About Similarity?](https://arxiv.org/abs/2403.05440) [scipy.spatial](https://docs.scipy.org/doc/scipy/reference/spatial.distance.html) - All kinds of distance metrics. [vegdist](https://rdrr.io/cran/vegan/man/vegdist.html) - Distance metrics (R package). [pyemd](https://github.com/wmayner/pyemd) - Earth Mover's Distance / Wasserstein distance, similarity between histograms. [OpenCV implementation](https://docs.opencv.org/3.4/d6/dc7/group__imgproc__hist.html), [POT implementation](https://pythonot.github.io/auto_examples/plot_OT_2D_samples.html) From 6ac94707ef321ca4b882d80ccd43e463636e5b53 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Mon, 27 May 2024 23:10:10 +0200 Subject: [PATCH 111/152] episensr + Lesko paper --- README.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/README.md b/README.md index 95345b8..9929148 100644 --- a/README.md +++ b/README.md @@ -164,6 +164,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal [Collins et al. - Evaluation of clinical prediction models (part 1): from development to external validation](https://www.bmj.com/content/384/bmj-2023-074819.full) - [Twitter](https://twitter.com/GSCollins/status/1744309712995098624) #### Epidemiology +[Lesko et al. 
- A Framework for Descriptive Epidemiology](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10144679/) [R Epidemics Consortium](https://www.repidemicsconsortium.org/projects/) - Large tool suite for working with epidemiological data (R packages). [Github](https://github.com/reconhub) [incidence2](https://github.com/reconhub/incidence2) - Computation, handling, visualisation and simple modelling of incidence (R package). [EpiEstim](https://github.com/mrc-ide/EpiEstim) - Estimate time varying instantaneous reproduction number R during epidemics (R package) [paper](https://academic.oup.com/aje/article/178/9/1505/89262). @@ -171,6 +172,8 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal [zEpid](https://github.com/pzivich/zEpid) - Epidemiology analysis package, [Tutorial](https://github.com/pzivich/Python-for-Epidemiologists). [tipr](https://github.com/LucyMcGowan/tipr) - Sensitivity analyses for unmeasured confounders (R package). [quartets](https://github.com/r-causal/quartets) - Anscombe’s Quartet, Causal Quartet, [Datasaurus Dozen](https://github.com/jumpingrivers/datasauRus) and others (R package). +[episensr](https://cran.r-project.org/web/packages/episensr/vignettes/episensr.html) - Quantitative Bias Analysis for Epidemiologic Data (=simulation of possible effects of different sources of bias) (R package). + #### Exploration and Cleaning [Checklist](https://github.com/r0f1/ml_checklist). 
From 9239e403c9bc2b54bed90018e5b35e88990541e9 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sat, 1 Jun 2024 13:38:53 +0200 Subject: [PATCH 112/152] Marginal Effects Tutorial --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 9929148..e0473a7 100644 --- a/README.md +++ b/README.md @@ -958,6 +958,7 @@ Distances for comparing histograms and detecting outliers - [Talk](https://www.y #### Causal Inference [CS 594 Causal Inference and Learning](https://www.cs.uic.edu/~elena/courses/fall19/cs594cil.html) +[Marginal Effects Tutorial](https://marginaleffects.com/vignettes/gcomputation.html) - Marginal Effects, g-computation and more. [Statistical Rethinking](https://github.com/rmcelreath/stat_rethinking_2022) - Video Lecture Series, Bayesian Statistics, Causal Models, [R](https://bookdown.org/content/4857/), [python](https://github.com/pymc-devs/resources/tree/master/Rethinking_2), [numpyro1](https://github.com/asuagar/statrethink-course-numpyro-2019), [numpyro2](https://fehiepsi.github.io/rethinking-numpyro/), [tensorflow-probability](https://github.com/ksachdeva/rethinking-tensorflow-probability). [Python Causality Handbook](https://github.com/matheusfacure/python-causality-handbook) [dowhy](https://github.com/py-why/dowhy) - Estimate causal effects. @@ -969,7 +970,6 @@ Distances for comparing histograms and detecting outliers - [Talk](https://www.y [DoubleML](https://github.com/DoubleML/doubleml-for-py) - Machine Learning + Causal inference, [Tweet](https://twitter.com/ChristophMolnar/status/1574338002305880068), [Presentation](https://scholar.princeton.edu/sites/default/files/bstewart/files/felton.chern_.slides.20190318.pdf), [Paper](https://arxiv.org/abs/1608.00060v1). [EconML](https://github.com/py-why/EconML) - Heterogeneous Treatment Effects Estimation by Microsoft. 
- ##### Papers [Bours - Confounding](https://edisciplinas.usp.br/pluginfile.php/5625667/mod_resource/content/3/Nontechnicalexplanation-counterfactualdefinition-confounding.pdf) [Bours - Effect Modification and Interaction](https://www.sciencedirect.com/science/article/pii/S0895435621000330) From 29641ca60469f3cccb0f9302efd0b2bc808471e4 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sat, 1 Jun 2024 17:49:59 +0200 Subject: [PATCH 113/152] Awesome MLOps + Awesome Data Science --- README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/README.md b/README.md index e0473a7..25ce3ab 100644 --- a/README.md +++ b/README.md @@ -1176,6 +1176,7 @@ Gilbert Strang - [Matrix Methods in Data Analysis, Signal Processing, and Machin [Awesome Community Detection](https://github.com/benedekrozemberczki/awesome-community-detection) [Awesome CSV](https://github.com/secretGeek/AwesomeCSV) [Awesome Cytodata](https://github.com/cytodata/awesome-cytodata) +[Awesome Data Science](https://github.com/academic/awesome-datascience) [Awesome Data Science with Ruby](https://github.com/arbox/data-science-with-ruby) [Awesome Dash](https://github.com/ucg8j/awesome-dash) [Awesome Decision Trees](https://github.com/benedekrozemberczki/awesome-decision-tree-papers) @@ -1193,6 +1194,7 @@ Gilbert Strang - [Matrix Methods in Data Analysis, Signal Processing, and Machin [Awesome Machine Learning Interpretability](https://github.com/jphall663/awesome-machine-learning-interpretability) [Awesome Machine Learning Operations](https://github.com/EthicalML/awesome-machine-learning-operations) [Awesome Monte Carlo Tree Search](https://github.com/benedekrozemberczki/awesome-monte-carlo-tree-search-papers) +[Awesome MLOps](https://github.com/kelvins/awesome-mlops) [Awesome Neural Network Visualization](https://github.com/ashishpatel26/Tools-to-Design-or-Visualize-Architecture-of-Neural-Network) [Awesome Online Machine Learning](https://github.com/MaxHalford/awesome-online-machine-learning) [Awesome 
Pipeline](https://github.com/pditommaso/awesome-pipeline) From da232b54058baff7daaca5aeb1648feb20c34ad9 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sat, 1 Jun 2024 20:48:01 +0200 Subject: [PATCH 114/152] An introduction to g methods --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 25ce3ab..330a3bc 100644 --- a/README.md +++ b/README.md @@ -957,6 +957,7 @@ Distances for comparing histograms and detecting outliers - [Talk](https://www.y [lightning](https://github.com/scikit-learn-contrib/lightning) - Large-scale linear classification, regression and ranking. #### Causal Inference +[Naimi et al. - An introduction to g methods](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6074945/) [CS 594 Causal Inference and Learning](https://www.cs.uic.edu/~elena/courses/fall19/cs594cil.html) [Marginal Effects Tutorial](https://marginaleffects.com/vignettes/gcomputation.html) - Marginal Effects, g-computation and more. [Statistical Rethinking](https://github.com/rmcelreath/stat_rethinking_2022) - Video Lecture Series, Bayesian Statistics, Causal Models, [R](https://bookdown.org/content/4857/), [python](https://github.com/pymc-devs/resources/tree/master/Rethinking_2), [numpyro1](https://github.com/asuagar/statrethink-course-numpyro-2019), [numpyro2](https://fehiepsi.github.io/rethinking-numpyro/), [tensorflow-probability](https://github.com/ksachdeva/rethinking-tensorflow-probability). From c99db3cfc0242122e029d748c69b2336b47c62ea Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sun, 2 Jun 2024 23:04:55 +0200 Subject: [PATCH 115/152] Logs with zeros? 
Some problems and solutions --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 330a3bc..6a8469c 100644 --- a/README.md +++ b/README.md @@ -159,6 +159,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal [The Dunning-Kruger Effect is Autocorrelation](https://economicsfromthetopdown.com/2022/04/08/the-dunning-kruger-effect-is-autocorrelation/) [Rafi, Greenland - Semantic and cognitive tools to aid statistical science: replace confidence and significance by compatibility and surprise](https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-020-01105-9) [Carlin et al. - On the uses and abuses of regression models: a call for reform of statistical practice and teaching](https://arxiv.org/abs/2309.06668) +[Chen, Roth - Logs with zeros? Some problems and solutions](https://arxiv.org/abs/2212.06080) #### Evaluation [Collins et al. - Evaluation of clinical prediction models (part 1): from development to external validation](https://www.bmj.com/content/384/bmj-2023-074819.full) - [Twitter](https://twitter.com/GSCollins/status/1744309712995098624) From 085248f83f8de0fe8616dee473e9531c70cb6b71 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Tue, 4 Jun 2024 13:56:53 +0200 Subject: [PATCH 116/152] A beginner's guide to rigor and reproducibility in fluorescence imaging experiments --- README.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/README.md b/README.md index 6a8469c..da0a9d8 100644 --- a/README.md +++ b/README.md @@ -417,12 +417,11 @@ Embeddings - [GloVe](https://nlp.stanford.edu/projects/glove/) ([[1](https://www [textdistance](https://github.com/life4/textdistance) - Collection for comparing distances between two or more sequences. #### Bio Image Analysis +[Lee et al. 
- A beginner's guide to rigor and reproducibility in fluorescence imaging experiments](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6080651/) [Awesome Cytodata](https://github.com/cytodata/awesome-cytodata) ##### Tutorials [MIT 7.016 Introductory Biology, Fall 2018](https://www.youtube.com/playlist?list=PLUl4u3cNGP63LmSVIVzy584-ZbjbJ-Y63) - Videos 27, 28, and 29 talk about staining and imaging. -[bioimaging.org](https://www.bioimagingguide.org/welcome.html) - A biologists guide to planning and performing quantitative bioimaging experiments. -[Introduction to Bioimage Analysis](https://bioimagebook.github.io/index.html) - Book. [Bio-image Analysis Notebooks](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/intro.html) - Large collection of image processing workflows, including [point-spread-function estimation](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/18a_deconvolution/extract_psf.html) and [deconvolution](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/18a_deconvolution/introduction_deconvolution.html), [3D cell segmentation](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/20_image_segmentation/Segmentation_3D.html), [feature extraction](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/22_feature_extraction/statistics_with_pyclesperanto.html) using [pyclesperanto](https://github.com/clEsperanto/pyclesperanto_prototype) and others. [python_for_microscopists](https://github.com/bnsreenu/python_for_microscopists) - Notebooks and associated [youtube channel](https://www.youtube.com/channel/UC34rW-HtPJulxr5wp2Xa04w/videos) for a variety of image processing tasks. 
From 5ee499611ccaa7c268e88dc3070548a65a6acf21 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Wed, 5 Jun 2024 18:53:46 +0200 Subject: [PATCH 117/152] Update README.md StatCheck --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index da0a9d8..f9866b6 100644 --- a/README.md +++ b/README.md @@ -104,6 +104,7 @@ [scikit-posthocs](https://github.com/maximtrp/scikit-posthocs) - Statistical post-hoc tests for pairwise multiple comparisons. Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandaltman.html), [2](http://www.statsmodels.org/dev/generated/statsmodels.graphics.agreement.mean_diff_plot.html) - Plot for agreement between two methods of measurement. [ANOVA](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.f_oneway.html) +[StatCheck](https://statcheck.steveharoz.com/) - Extract statistics from articles and recompute p-values (R package). ##### Effect Size [Estimating Effect Sizes From Pretest-Posttest-Control Group Designs](https://journals.sagepub.com/doi/epdf/10.1177/1094428106291059) - Scott B. Morris, [Twitter](https://twitter.com/MatthewBJane/status/1742588609025200557) From 8e3eb241ce0c21366fa3a00d006165f79018d8b4 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Fri, 28 Jun 2024 13:52:48 +0200 Subject: [PATCH 118/152] The Causal Cookbook --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index f9866b6..bb5aa99 100644 --- a/README.md +++ b/README.md @@ -958,6 +958,7 @@ Distances for comparing histograms and detecting outliers - [Talk](https://www.y [lightning](https://github.com/scikit-learn-contrib/lightning) - Large-scale linear classification, regression and ranking. #### Causal Inference +[Chatton et al. - The Causal Cookbook: Recipes for Propensity Scores, G-Computation, and Doubly Robust Standardization](https://journals.sagepub.com/doi/10.1177/25152459241236149) [Naimi et al. 
- An introduction to g methods](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6074945/) [CS 594 Causal Inference and Learning](https://www.cs.uic.edu/~elena/courses/fall19/cs594cil.html) [Marginal Effects Tutorial](https://marginaleffects.com/vignettes/gcomputation.html) - Marginal Effects, g-computation and more. From c91856bb3b428013cd9826cada26f3cf26070c73 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sat, 29 Jun 2024 10:20:34 +0200 Subject: [PATCH 119/152] gibbs-diffusion --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index bb5aa99..9de5a6c 100644 --- a/README.md +++ b/README.md @@ -487,6 +487,7 @@ Image Data Explorer - Microscopy Image Viewer, [Shiny App](https://shiny-portal. [aydin](https://github.com/royerlab/aydin) - Image denoising. [DivNoising](https://github.com/juglab/DivNoising) - Unsupervised denoising method. [CSBDeep](https://github.com/CSBDeep/CSBDeep) - Content-aware image restoration, [Project page](https://csbdeep.bioimagecomputing.com/tools/). +[gibbs-diffusion](https://github.com/rubenohana/gibbs-diffusion) - Image denoising. ##### Illumination correction [skimage](https://scikit-image.org/docs/dev/api/skimage.exposure.html#skimage.exposure.equalize_adapthist) - Illumination correction (CLAHE). From dd5d3205e57a8720b774d01f34a9c8414de99bfc Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sat, 29 Jun 2024 13:41:16 +0200 Subject: [PATCH 120/152] Update README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 9de5a6c..bd0e7c4 100644 --- a/README.md +++ b/README.md @@ -13,7 +13,7 @@ [rainbow-csv](https://marketplace.visualstudio.com/items?itemName=mechatroner.rainbow-csv) - VSCode plugin to display .csv files with nice colors. 
#### General Python Programming -[Python Best Practices Guide](https://github.com/qiwihui/pocket_readings/issues/1148#issuecomment-874448132) +[Python Best Practices Guide](https://medium.com/@mronakjain94/comprehensive-guide-to-installing-poetry-on-ubuntu-and-managing-python-projects-949b49ef4f76) [pyenv](https://github.com/pyenv/pyenv) - Manage multiple Python versions on your system. [poetry](https://github.com/python-poetry/poetry) - Dependency management. [pyscaffold](https://github.com/pyscaffold/pyscaffold) - Python project template generator. From 7ebf024a6854a82c1ddaca32f94fdb57c53de53d Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Tue, 2 Jul 2024 22:38:13 +0200 Subject: [PATCH 121/152] TOSTER --- README.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index bd0e7c4..b99782e 100644 --- a/README.md +++ b/README.md @@ -86,10 +86,10 @@ ##### p-values [The ASA Statement on p-Values: Context, Process, and Purpose](https://amstat.tandfonline.com/doi/full/10.1080/00031305.2016.1154108#.Vt2XIOaE2MN) [Greenland - Statistical tests, P-values, confidence intervals, and power: a guide to misinterpretations](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4877414/) -[Blume - Second-generation p-values: Improved rigor, reproducibility, & transparency in statistical analyses](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0188299) [Rubin - Inconsistent multiple testing corrections: The fallacy of using family-based error rates to make inferences about individual hypotheses](https://www.sciencedirect.com/science/article/pii/S2590260124000067?via%3Dihub) [Gigerenzer - Mindless Statistics](https://library.mpib-berlin.mpg.de/ft/gg/GG_Mindless_2004.pdf) -[Rubin - That's not a two-sided test! It's two one-sided tests!](https://rss.onlinelibrary.wiley.com/doi/full/10.1111/1740-9713.01405) +[Rubin - That's not a two-sided test! It's two one-sided tests! 
(TOST)](https://rss.onlinelibrary.wiley.com/doi/full/10.1111/1740-9713.01405) +[Lakens - How were we supposed to move beyond p < .05, and why didn’t we?](https://errorstatistics.com/2024/07/01/guest-post-daniel-lakens-how-were-we-supposed-to-move-beyond-p-05-and-why-didnt-we-thoughts-on-abandon-statistical-significance-5-years-on/) ##### Correlation [Guess the Correlation](https://www.guessthecorrelation.com/) - Correlation guessing game. @@ -105,6 +105,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandaltman.html), [2](http://www.statsmodels.org/dev/generated/statsmodels.graphics.agreement.mean_diff_plot.html) - Plot for agreement between two methods of measurement. [ANOVA](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.f_oneway.html) [StatCheck](https://statcheck.steveharoz.com/) - Extract statistics from articles and recompute p-values (R package). +[TOSTER](https://github.com/Lakens/TOSTER) - TOST equivalence test and power functions (R package). ##### Effect Size [Estimating Effect Sizes From Pretest-Posttest-Control Group Designs](https://journals.sagepub.com/doi/epdf/10.1177/1094428106291059) - Scott B. Morris, [Twitter](https://twitter.com/MatthewBJane/status/1742588609025200557) From 3fb2dd03499a9de944e741c78aa2f5b8117d2178 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Wed, 3 Jul 2024 17:43:48 +0200 Subject: [PATCH 122/152] Abandon Statistical Significance --- README.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index b99782e..4cbe145 100644 --- a/README.md +++ b/README.md @@ -89,7 +89,8 @@ [Rubin - Inconsistent multiple testing corrections: The fallacy of using family-based error rates to make inferences about individual hypotheses](https://www.sciencedirect.com/science/article/pii/S2590260124000067?via%3Dihub) [Gigerenzer - Mindless Statistics](https://library.mpib-berlin.mpg.de/ft/gg/GG_Mindless_2004.pdf) [Rubin - That's not a two-sided test! 
It's two one-sided tests! (TOST)](https://rss.onlinelibrary.wiley.com/doi/full/10.1111/1740-9713.01405) -[Lakens - How were we supposed to move beyond p < .05, and why didn’t we?](https://errorstatistics.com/2024/07/01/guest-post-daniel-lakens-how-were-we-supposed-to-move-beyond-p-05-and-why-didnt-we-thoughts-on-abandon-statistical-significance-5-years-on/) +[Lakens - How were we supposed to move beyond p < .05, and why didn’t we?](https://errorstatistics.com/2024/07/01/guest-post-daniel-lakens-how-were-we-supposed-to-move-beyond-p-05-and-why-didnt-we-thoughts-on-abandon-statistical-significance-5-years-on/) +[McShane et al. - Abandon Statistical Significance](https://www.tandfonline.com/doi/full/10.1080/00031305.2018.1527253) ##### Correlation [Guess the Correlation](https://www.guessthecorrelation.com/) - Correlation guessing game. From e243bd083839efed1c0425531ec962a202f8d5db Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Thu, 11 Jul 2024 15:13:53 +0200 Subject: [PATCH 123/152] rye --- README.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 4cbe145..35f807f 100644 --- a/README.md +++ b/README.md @@ -14,6 +14,7 @@ #### General Python Programming [Python Best Practices Guide](https://medium.com/@mronakjain94/comprehensive-guide-to-installing-poetry-on-ubuntu-and-managing-python-projects-949b49ef4f76) +[rye](https://github.com/astral-sh/rye) - Dependency management. [pyenv](https://github.com/pyenv/pyenv) - Manage multiple Python versions on your system. [poetry](https://github.com/python-poetry/poetry) - Dependency management. [pyscaffold](https://github.com/pyscaffold/pyscaffold) - Python project template generator. @@ -665,7 +666,7 @@ Feature Visualization: [Blog](https://distill.pub/2017/feature-visualization/), [lightly](https://github.com/lightly-ai/lightly) - MoCo, SimCLR, SimSiam, Barlow Twins, BYOL, NNCLR. [MONAI](https://github.com/project-monai/monai) - Deep learning in healthcare imaging. 
[kornia](https://github.com/kornia/kornia) - Image transformations, epipolar geometry, depth estimation. -[torchinfo](https://github.com/TylerYep/torchinfo) - Nice model summary. +[torchinfo](https://github.com/Tylep/torchinfo) - Nice model summary. [lovely-tensors](https://github.com/xl0/lovely-tensors/) - Inspect tensors, mean, std, inf values. ##### Distributed Libs From e56f5d9abaf4d7dd2ff9407002b1e1b4afa1cf33 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Tue, 16 Jul 2024 18:09:09 +0200 Subject: [PATCH 124/152] Links to transformers --- README.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/README.md b/README.md index 35f807f..ad2f404 100644 --- a/README.md +++ b/README.md @@ -719,6 +719,9 @@ Cell Segmentation - [Talk](https://www.youtube.com/watch?v=dVFZpodqJiI), Blog Po [StudioGAN](https://github.com/POSTECH-CVLab/PyTorch-StudioGAN) - PyTorch GAN implementations. ##### Transformers +[The Annotated Transformer](https://nlp.seas.harvard.edu/annotated-transformer/) - Intro to transformers. +[Transformers from Scratch](https://e2eml.school/transformers.html) - Intro. +[Neural Networks: Zero to Hero](https://karpathy.ai/zero-to-hero.html) - Video series on building neural networks. [SegFormer](https://github.com/NVlabs/SegFormer) - Simple and Efficient Design for Semantic Segmentation with Transformers. [esvit](https://github.com/microsoft/esvit) - Efficient self-supervised Vision Transformers. [nystromformer](https://github.com/Rishit-dagli/Nystromformer) - More efficient transformer because of approximate self-attention. From bd988b51b660723090456be24538e2ad4885c677 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sat, 27 Jul 2024 01:29:47 +0200 Subject: [PATCH 125/152] quak --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index ad2f404..9aa9505 100644 --- a/README.md +++ b/README.md @@ -33,6 +33,7 @@ [pandas_flavor](https://github.com/Zsailer/pandas_flavor) - Write custom accessors like `.str` and `.dt`.
[duckdb](https://github.com/duckdb/duckdb) - Efficiently run SQL queries on pandas DataFrame. [daft](https://github.com/Eventual-Inc/Daft) - Distributed DataFrame. +[quak](https://github.com/manzt/quak) - Scalable, interactive data table, [twitter](https://x.com/trevmanz/status/1816760923949809982). #### Pandas Parallelization [modin](https://github.com/modin-project/modin) - Parallelization library for faster pandas `DataFrame`. From 3d020a3bec46832c89b712684c47d7f074982ef1 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Thu, 12 Sep 2024 22:53:32 +0200 Subject: [PATCH 126/152] instanseg --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 9aa9505..653b09a 100644 --- a/README.md +++ b/README.md @@ -533,6 +533,7 @@ Review of organoid pipelines - [Paper](https://arxiv.org/ftp/arxiv/papers/2301/2 [MEDIAR](https://github.com/Lee-Gihun/MEDIAR) - Cell segmentation. [cellpose](https://github.com/mouseland/cellpose) - Cell segmentation. [Paper](https://www.biorxiv.org/content/10.1101/2020.02.02.931238v1), [Dataset](https://www.cellpose.org/dataset). [stardist](https://github.com/stardist/stardist) - Cell segmentation with Star-convex Shapes. +[instanseg](https://github.com/instanseg/instanseg) - Cell segmentation. [UnMicst](https://github.com/HMS-IDAC/UnMicst) - Identifying Cells and Segmenting Tissue. [ilastik](https://github.com/ilastik/ilastik) - Segment, classify, track and count cells. [ImageJ Plugin](https://github.com/ilastik/ilastik4ij). [nnUnet](https://github.com/MIC-DKFZ/nnUNet) - 3D biomedical image segmentation. 
From 1092edffad433a8e1955bb9d7f47c701f004b19d Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Thu, 19 Sep 2024 22:21:59 +0200 Subject: [PATCH 127/152] litserve --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 653b09a..5096dbf 100644 --- a/README.md +++ b/README.md @@ -665,6 +665,7 @@ Feature Visualization: [Blog](https://distill.pub/2017/feature-visualization/), [torchcv](https://github.com/donnyyou/torchcv) - Deep Learning in Computer Vision. [pytorch-optimizer](https://github.com/jettify/pytorch-optimizer) - Collection of optimizers for PyTorch. [pytorch-lightning](https://github.com/PyTorchLightning/PyTorch-lightning) - Wrapper around PyTorch. +[litserve](https://github.com/Lightning-AI/LitServe) - Serve models. [lightly](https://github.com/lightly-ai/lightly) - MoCo, SimCLR, SimSiam, Barlow Twins, BYOL, NNCLR. [MONAI](https://github.com/project-monai/monai) - Deep learning in healthcare imaging. [kornia](https://github.com/kornia/kornia) - Image transformations, epipolar geometry, depth estimation. From 1a1cc17bafecd65e63c3705838f2521935311ea6 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Thu, 26 Sep 2024 08:55:19 +0200 Subject: [PATCH 128/152] uv, python-dotenv --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 5096dbf..23ecadd 100644 --- a/README.md +++ b/README.md @@ -14,7 +14,7 @@ #### General Python Programming [Python Best Practices Guide](https://medium.com/@mronakjain94/comprehensive-guide-to-installing-poetry-on-ubuntu-and-managing-python-projects-949b49ef4f76) -[rye](https://github.com/astral-sh/rye) - Dependency management. +[uv](https://github.com/astral-sh/uv) - Dependency management. [pyenv](https://github.com/pyenv/pyenv) - Manage multiple Python versions on your system. [poetry](https://github.com/python-poetry/poetry) - Dependency management. [pyscaffold](https://github.com/pyscaffold/pyscaffold) - Python project template generator. 
@@ -23,7 +23,7 @@ [more_itertools](https://more-itertools.readthedocs.io/en/latest/) - Extension of itertools. [tqdm](https://github.com/tqdm/tqdm) - Progress bars for for-loops. Also supports [pandas apply()](https://stackoverflow.com/a/34365537/1820480). [loguru](https://github.com/Delgan/loguru) - Python logging. - +[python-dotenv](https://github.com/theskumar/python-dotenv) - Manage environment variables. #### Pandas Tricks, Alternatives and Additions [pandasvault](https://github.com/firmai/pandasvault) - Large collection of pandas tricks. From 5d71f2b8268af4db84d5e6d3c47dc3f4783dd697 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Fri, 27 Sep 2024 11:51:09 +0200 Subject: [PATCH 129/152] shapiq --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 23ecadd..2e13764 100644 --- a/README.md +++ b/README.md @@ -1033,6 +1033,7 @@ Plotting learning curve: [link](http://www.ritchieng.com/machinelearning-learnin [Book](https://christophm.github.io/interpretable-ml-book/agnostic.html), [Examples](https://github.com/jphall663/interpretable_machine_learning_with_python) scikit-learn - [Permutation Importance](https://scikit-learn.org/stable/modules/generated/sklearn.inspection.permutation_importance.html) (can be used on any trained classifier) and [Partial Dependence](https://scikit-learn.org/stable/modules/generated/sklearn.inspection.partial_dependence.html) [shap](https://github.com/slundberg/shap) - Explain predictions of machine learning models, [talk](https://www.youtube.com/watch?v=C80SQe16Rao), [Good Shap intro](https://www.aidancooper.co.uk/a-non-technical-guide-to-interpreting-shap-analyses/). +[shapiq](https://github.com/mmschlk/shapiq) - Shapley interaction quantification. [treeinterpreter](https://github.com/andosa/treeinterpreter) - Interpreting scikit-learn's decision tree and random forest predictions. 
[lime](https://github.com/marcotcr/lime) - Explaining the predictions of any machine learning classifier, [talk](https://www.youtube.com/watch?v=C80SQe16Rao), [Warning (Myth 7)](https://crazyoscarchang.github.io/2019/02/16/seven-myths-in-machine-learning-research/). [lime_xgboost](https://github.com/jphall663/lime_xgboost) - Create LIMEs for XGBoost. From 1f51fd93aad2e4735e551fd2113c32445dc6e69f Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Fri, 4 Oct 2024 10:28:56 +0200 Subject: [PATCH 130/152] ultralytics --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 2e13764..4d7095a 100644 --- a/README.md +++ b/README.md @@ -684,6 +684,7 @@ Feature Visualization: [Blog](https://distill.pub/2017/feature-visualization/), ##### Object detection / Instance Segmentation [Metrics reloaded: Recommendations for image analysis validation](https://arxiv.org/abs/2206.01653) - Guide for choosing correct image analysis metrics, [Code](https://github.com/Project-MONAI/MetricsReloaded), [Twitter Thread](https://twitter.com/lena_maierhein/status/1625450342006521857) [Good Yolo Explanation](https://jonathan-hui.medium.com/real-time-object-detection-with-yolo-yolov2-28b1b93e2088) +[ultralytics](https://github.com/ultralytics/ultralytics) - Easily accessible Yolo and SAM models. [yolact](https://github.com/dbolya/yolact) - Fully convolutional model for real-time instance segmentation. [EfficientDet Pytorch](https://github.com/toandaominh1997/EfficientDet.Pytorch), [EfficientDet Keras](https://github.com/xuannianz/EfficientDet) - Scalable and Efficient Object Detection. [detectron2](https://github.com/facebookresearch/detectron2) - Object Detection (Mask R-CNN) by Facebook. 
From 62da70e60e2b6c9af760db53de38a6a06572fb0d Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Wed, 9 Oct 2024 20:06:43 +0200 Subject: [PATCH 131/152] Update README.md --- README.md | 1 - 1 file changed, 1 deletion(-) diff --git a/README.md b/README.md index 4d7095a..153c186 100644 --- a/README.md +++ b/README.md @@ -413,7 +413,6 @@ Embeddings - [GloVe](https://nlp.stanford.edu/projects/glove/) ([[1](https://www [fastText](https://github.com/facebookresearch/fastText) - Efficient text classification and representation learning. [annoy](https://github.com/spotify/annoy) - Approximate nearest neighbor search. [faiss](https://github.com/facebookresearch/faiss) - Approximate nearest neighbor search. -[pysparnn](https://github.com/facebookresearch/pysparnn) - Approximate nearest neighbor search. [infomap](https://github.com/mapequation/infomap) - Cluster (word-)vectors to find topics. [datasketch](https://github.com/ekzhu/datasketch) - Probabilistic data structures for large data (MinHash, HyperLogLog). [flair](https://github.com/zalandoresearch/flair) - NLP Framework by Zalando. From f2f0d214e5c8ddc9851cd7df6bb46c6d5ec3df86 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Thu, 10 Oct 2024 00:18:27 +0200 Subject: [PATCH 132/152] LSHForest --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 153c186..8abb9a2 100644 --- a/README.md +++ b/README.md @@ -413,6 +413,7 @@ Embeddings - [GloVe](https://nlp.stanford.edu/projects/glove/) ([[1](https://www [fastText](https://github.com/facebookresearch/fastText) - Efficient text classification and representation learning. [annoy](https://github.com/spotify/annoy) - Approximate nearest neighbor search. [faiss](https://github.com/facebookresearch/faiss) - Approximate nearest neighbor search. +[LSHForest](https://scikit-learn.org/0.16/modules/generated/sklearn.neighbors.LSHForest.html#sklearn.neighbors.LSHForest) - Locality-sensitive hashing (LSH) forest. 
[infomap](https://github.com/mapequation/infomap) - Cluster (word-)vectors to find topics. [datasketch](https://github.com/ekzhu/datasketch) - Probabilistic data structures for large data (MinHash, HyperLogLog). [flair](https://github.com/zalandoresearch/flair) - NLP Framework by Zalando. From 03007b4972f2b49a87614ebbf61bcf367a94c4d3 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Thu, 10 Oct 2024 11:11:06 +0200 Subject: [PATCH 133/152] Update README.md --- README.md | 1 - 1 file changed, 1 deletion(-) diff --git a/README.md b/README.md index 8abb9a2..153c186 100644 --- a/README.md +++ b/README.md @@ -413,7 +413,6 @@ Embeddings - [GloVe](https://nlp.stanford.edu/projects/glove/) ([[1](https://www [fastText](https://github.com/facebookresearch/fastText) - Efficient text classification and representation learning. [annoy](https://github.com/spotify/annoy) - Approximate nearest neighbor search. [faiss](https://github.com/facebookresearch/faiss) - Approximate nearest neighbor search. -[LSHForest](https://scikit-learn.org/0.16/modules/generated/sklearn.neighbors.LSHForest.html#sklearn.neighbors.LSHForest) - Locality-sensitive hashing (LSH) forest. [infomap](https://github.com/mapequation/infomap) - Cluster (word-)vectors to find topics. [datasketch](https://github.com/ekzhu/datasketch) - Probabilistic data structures for large data (MinHash, HyperLogLog). [flair](https://github.com/zalandoresearch/flair) - NLP Framework by Zalando. From beafe8684d6bb2035a82e0d7e2c5bb6d15e22c36 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Tue, 22 Oct 2024 14:47:57 +0200 Subject: [PATCH 134/152] Update README.md --- README.md | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/README.md b/README.md index 153c186..42f236b 100644 --- a/README.md +++ b/README.md @@ -13,17 +13,12 @@ [rainbow-csv](https://marketplace.visualstudio.com/items?itemName=mechatroner.rainbow-csv) - VSCode plugin to display .csv files with nice colors. 
#### General Python Programming -[Python Best Practices Guide](https://medium.com/@mronakjain94/comprehensive-guide-to-installing-poetry-on-ubuntu-and-managing-python-projects-949b49ef4f76) [uv](https://github.com/astral-sh/uv) - Dependency management. -[pyenv](https://github.com/pyenv/pyenv) - Manage multiple Python versions on your system. -[poetry](https://github.com/python-poetry/poetry) - Dependency management. -[pyscaffold](https://github.com/pyscaffold/pyscaffold) - Python project template generator. -[hydra](https://github.com/facebookresearch/hydra) - Configuration management. -[hatch](https://github.com/pypa/hatch) - Python project management. +[python-dotenv](https://github.com/theskumar/python-dotenv) - Manage environment variables. +[structlog](https://github.com/hynek/structlog) - Python logging. [more_itertools](https://more-itertools.readthedocs.io/en/latest/) - Extension of itertools. [tqdm](https://github.com/tqdm/tqdm) - Progress bars for for-loops. Also supports [pandas apply()](https://stackoverflow.com/a/34365537/1820480). -[loguru](https://github.com/Delgan/loguru) - Python logging. -[python-dotenv](https://github.com/theskumar/python-dotenv) - Manage environment variables. +[hydra](https://github.com/facebookresearch/hydra) - Configuration management. #### Pandas Tricks, Alternatives and Additions [pandasvault](https://github.com/firmai/pandasvault) - Large collection of pandas tricks. From 79b606628530699c7e6fb7aa23f2697865632912 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Tue, 22 Oct 2024 14:49:22 +0200 Subject: [PATCH 135/152] Update README.md --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 42f236b..000fda0 100644 --- a/README.md +++ b/README.md @@ -14,6 +14,7 @@ #### General Python Programming [uv](https://github.com/astral-sh/uv) - Dependency management. +[just](https://github.com/casey/just) - Command runner. Replacement for make. 
[python-dotenv](https://github.com/theskumar/python-dotenv) - Manage environment variables. [structlog](https://github.com/hynek/structlog) - Python logging. [more_itertools](https://more-itertools.readthedocs.io/en/latest/) - Extension of itertools. From 53d0447543be267221d0d36951eb237c66700c24 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sun, 27 Oct 2024 15:55:35 +0100 Subject: [PATCH 136/152] Added 3 R dataset collections --- README.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 000fda0..bc4fcb1 100644 --- a/README.md +++ b/README.md @@ -79,7 +79,10 @@ #### Classical Statistics ##### Datasets -[Rdatasets](https://vincentarelbundock.github.io/Rdatasets/articles/data.html) - Collection of more than 2000 datasets, stored as csv files. +[Rdatasets](https://vincentarelbundock.github.io/Rdatasets/articles/data.html) - Collection of more than 2000 datasets, stored as csv files (R package). +[MedDataSets](https://lightbluetitan.github.io/meddatasets/index.html) - Datasets related to medicine, diseases, treatments, drugs, and public health (R package). +[usdatasets](https://lightbluetitan.github.io/usdatasets/) - US-exclusive datasets (crime, economics, education, finance, energy, healthcare) (R package). +[timeseriesdatasets_R](https://lightbluetitan.github.io/timeseriesdatasets_R/) - Time series datasets (R package). 
##### p-values [The ASA Statement on p-Values: Context, Process, and Purpose](https://amstat.tandfonline.com/doi/full/10.1080/00031305.2016.1154108#.Vt2XIOaE2MN) From c7c939d2dc7cd963554c5f63fc8bd57d1052924a Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sun, 10 Nov 2024 10:50:45 +0100 Subject: [PATCH 137/152] typo fixed --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index bc4fcb1..a1116b1 100644 --- a/README.md +++ b/README.md @@ -722,7 +722,7 @@ Cell Segmentation - [Talk](https://www.youtube.com/watch?v=dVFZpodqJiI), Blog Po ##### Transformers [The Annotated Transformer](https://nlp.seas.harvard.edu/annotated-transformer/) - Intro to transformers. -[Transformers from Scratch](https://e2eml.school/transformers.html] - Intro. +[Transformers from Scratch](https://e2eml.school/transformers.html) - Intro. [Neural Networks: Zero to Hero](https://karpathy.ai/zero-to-hero.html) - Video series on building neural networks. [SegFormer](https://github.com/NVlabs/SegFormer) - Simple and Efficient Design for Semantic Segmentation with Transformers. [esvit](https://github.com/microsoft/esvit) - Efficient self-supervised Vision Transformers. From 42f7554f8a04e9b913cd85872ead18a418131080 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sat, 16 Nov 2024 15:27:16 +0100 Subject: [PATCH 138/152] BiaPy Paper --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index a1116b1..e216b9d 100644 --- a/README.md +++ b/README.md @@ -512,7 +512,7 @@ AutoUnmix - [Link](https://www.biorxiv.org/content/10.1101/2023.05.30.542836v1.f ##### Microscopy Pipelines Labsyspharm Stack see below. -[BiaPy](https://github.com/danifranco/BiaPy) - Bioimage analysis pipelines. +[BiaPy](https://github.com/danifranco/BiaPy) - Bioimage analysis pipelines, [paper](https://www.biorxiv.org/content/10.1101/2024.02.03.576026v2.full). 
[SCIP](https://scalable-cytometry-image-processing.readthedocs.io/en/latest/usage.html) - Image processing pipeline on top of Dask. [DeepCell Kiosk](https://github.com/vanvalenlab/kiosk-console/tree/master) - Image analysis platform. [IMCWorkflow](https://github.com/BodenmillerGroup/IMCWorkflow/) - Image analysis pipeline using [steinbock](https://github.com/BodenmillerGroup/steinbock), [Twitter](https://twitter.com/NilsEling/status/1715020265963258087), [Paper](https://www.nature.com/articles/s41596-023-00881-0), [workflow](https://bodenmillergroup.github.io/IMCDataAnalysis/). From 38f6ed80af3e79206e809821c36398a8681e4e6e Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Tue, 19 Nov 2024 10:01:10 +0100 Subject: [PATCH 139/152] MedImageInsight --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index e216b9d..afc3b50 100644 --- a/README.md +++ b/README.md @@ -545,6 +545,7 @@ Review of organoid pipelines - [Paper](https://arxiv.org/ftp/arxiv/papers/2301/2 [Segment-Everything-Everywhere-All-At-Once](https://github.com/UX-Decoder/Segment-Everything-Everywhere-All-At-Once) - Segment Everything Everywhere All at Once from Microsoft. [deepcell-tf](https://github.com/vanvalenlab/deepcell-tf/tree/master) - Cell segmentation, [DeepCell](https://deepcell.org/). [labkit](https://github.com/juglab/labkit-ui) - Fiji plugin for image segmentation. +[MedImageInsight](https://arxiv.org/abs/2410.06542) - Open-Source Embedding Model for General Domain Medical Imaging. ##### Cell Segmentation Datasets [cellpose](https://www.cellpose.org/dataset) - Cell images. 
From 444f6713bf258c366b0c4a3d16e1195b028ed079 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Tue, 19 Nov 2024 12:41:30 +0100 Subject: [PATCH 140/152] CHIEF --- README.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index afc3b50..817aa32 100644 --- a/README.md +++ b/README.md @@ -545,7 +545,8 @@ Review of organoid pipelines - [Paper](https://arxiv.org/ftp/arxiv/papers/2301/2 [Segment-Everything-Everywhere-All-At-Once](https://github.com/UX-Decoder/Segment-Everything-Everywhere-All-At-Once) - Segment Everything Everywhere All at Once from Microsoft. [deepcell-tf](https://github.com/vanvalenlab/deepcell-tf/tree/master) - Cell segmentation, [DeepCell](https://deepcell.org/). [labkit](https://github.com/juglab/labkit-ui) - Fiji plugin for image segmentation. -[MedImageInsight](https://arxiv.org/abs/2410.06542) - Open-Source Embedding Model for General Domain Medical Imaging. +[MedImageInsight](https://arxiv.org/abs/2410.06542) - Embedding Model for General Domain Medical Imaging. +[CHIEF](https://github.com/hms-dbmi/CHIEF) - Clinical Histopathology Imaging Evaluation Foundation Model. ##### Cell Segmentation Datasets [cellpose](https://www.cellpose.org/dataset) - Cell images. From aea211d3432f69e26eb2a1ea796f3cde75a69af3 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sat, 30 Nov 2024 08:52:23 +0100 Subject: [PATCH 141/152] supertree --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 817aa32..8e5f597 100644 --- a/README.md +++ b/README.md @@ -398,6 +398,7 @@ Why the default feature importance for random forests is wrong: [link](http://ex [merf](https://github.com/manifoldai/merf) - Mixed Effects Random Forest for Clustering, [video](https://www.youtube.com/watch?v=gWj4ZwB7f3o) [groot](https://github.com/tudelft-cda-lab/GROOT) - Robust decision trees. [linear-tree](https://github.com/cerlymarco/linear-tree) - Trees with linear models at the leaves. 
+[supertree](https://github.com/mljar/supertree) - Decision tree visualization. #### Natural Language Processing (NLP) / Text Processing [talk](https://www.youtube.com/watch?v=6zm9NC9uRkk)-[nb](https://nbviewer.jupyter.org/github/skipgram/modern-nlp-in-python/blob/master/executable/Modern_NLP_in_Python.ipynb), [nb2](https://ahmedbesbes.com/how-to-mine-newsfeed-data-and-extract-interactive-insights-in-python.html), [talk](https://www.youtube.com/watch?time_continue=2&v=sI7VpFNiy_I). From 285a95fb15d4061141d49f9e10691f22838652a1 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Tue, 3 Dec 2024 11:54:59 +0100 Subject: [PATCH 142/152] Update README.md 1 dataset 100 viz + The Return of Pseudosciences paper --- README.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/README.md b/README.md index 8e5f597..95ecbd7 100644 --- a/README.md +++ b/README.md @@ -130,6 +130,7 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal ##### Visualizations [Friends don't let friends make certain types of data visualization](https://github.com/cxli233/FriendsDontLetFriends) [Great Overview over Visualizations](https://textvis.lnu.se/) +[1 dataset, 100 visualizations](https://100.datavizproject.com/) [Dependent Probabilities](https://static.laszlokorte.de/stochastic/) [Null Hypothesis Significance Testing (NHST) and Sample Size Calculation](https://rpsychologist.com/d3/NHST/) [Correlation](https://rpsychologist.com/d3/correlation/) @@ -846,6 +847,9 @@ Other measures: #### Multi-label classification [scikit-multilearn](https://github.com/scikit-multilearn/scikit-multilearn) - Multi-label classification, [talk](https://www.youtube.com/watch?v=m-tAASQA7XQ&t=18m57s).
+#### Critical AI Texts +[Sublime - The Return of Pseudosciences in Artificial Intelligence: Have Machine Learning and Deep Learning Forgotten Lessons from Statistics and History?](https://arxiv.org/abs/2411.18656) + #### Signal Processing and Filtering [Stanford Lecture Series on Fourier Transformation](https://see.stanford.edu/Course/EE261), [Youtube](https://www.youtube.com/watch?v=gZNm7L96pfY&list=PLB24BC7956EE040CD&index=1), [Lecture Notes](https://see.stanford.edu/materials/lsoftaee261/book-fall-07.pdf). [Visual Fourier explanation](https://dsego.github.io/demystifying-fourier/). From 1baa3f71f8860eb93b3323713e049f1e4b090ff2 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Mon, 6 Jan 2025 15:52:20 +0100 Subject: [PATCH 143/152] pgvector --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 95ecbd7..3a5db68 100644 --- a/README.md +++ b/README.md @@ -1141,6 +1141,7 @@ AlphaZero methodology - [1](https://github.com/AppliedDataSciencePartners/DeepRe [dvc](https://github.com/iterative/dvc) - Version control for large files. [kedro](https://github.com/quantumblacklabs/kedro) - Build data pipelines. [feast](https://github.com/feast-dev/feast) - Feature store. [Video](https://www.youtube.com/watch?v=_omcXenypmo). +[pgvector](https://github.com/pgvector/pgvector) - Vector similarity search for Postgres. [pinecone](https://www.pinecone.io/) - Database for vector search applications. [truss](https://github.com/basetenlabs/truss) - Serve ML models. [milvus](https://github.com/milvus-io/milvus) - Vector database for similarity search. 
From f4badbe276bccc2c62be195d7664a7590a4ba05b Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Mon, 6 Jan 2025 21:13:42 +0100 Subject: [PATCH 144/152] Time Series Anomaly Detection Review Paper --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 3a5db68..13ec486 100644 --- a/README.md +++ b/README.md @@ -869,6 +869,7 @@ Other measures: [geomstats](https://github.com/geomstats/geomstats) - Computations and statistics on manifolds with geometric structures. #### Time Series +[Time Series Anomaly Detection Review Paper](https://arxiv.org/abs/2412.20512) [statsmodels](https://www.statsmodels.org/dev/tsa.html) - Time series analysis, [seasonal decompose](https://www.statsmodels.org/dev/generated/statsmodels.tsa.seasonal.seasonal_decompose.html) [example](https://gist.github.com/balzer82/5cec6ad7adc1b550e7ee), [SARIMA](https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.sarimax.SARIMAX.html), [granger causality](http://www.statsmodels.org/dev/generated/statsmodels.tsa.stattools.grangercausalitytests.html). [kats](https://github.com/facebookresearch/kats) - Time series prediction library by Facebook. [prophet](https://github.com/facebook/prophet) - Time series prediction library by Facebook. From a59241fa3830f5cd2a65050f3671399ae0d2fd7e Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Tue, 7 Jan 2025 23:49:20 +0100 Subject: [PATCH 145/152] Google Tuning Playbook --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 13ec486..cda3458 100644 --- a/README.md +++ b/README.md @@ -605,6 +605,7 @@ Review of organoid pipelines - [Paper](https://arxiv.org/ftp/arxiv/papers/2301/2 [Intro to semi-supervised learning](https://lilianweng.github.io/lil-log/2021/12/05/semi-supervised-learning.html). 
##### Tutorials & Viewer +[Google Tuning Playbook](https://github.com/google-research/tuning_playbook) - A playbook for systematically maximizing the performance of deep learning models by Google. [fast.ai course](https://course.fast.ai/) - Practical Deep Learning for Coders. [Tensorflow without a PhD](https://github.com/GoogleCloudPlatform/tensorflow-without-a-phd) - Neural Network course by Google. Feature Visualization: [Blog](https://distill.pub/2017/feature-visualization/), [PPT](http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture12.pdf) From 3815bf9d1674191f83d4b2c8abcc76362c93e74f Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Wed, 8 Jan 2025 21:27:39 +0100 Subject: [PATCH 146/152] Update README.md --- README.md | 1 - 1 file changed, 1 deletion(-) diff --git a/README.md b/README.md index cda3458..c6c5627 100644 --- a/README.md +++ b/README.md @@ -62,7 +62,6 @@ [spark](https://docs.databricks.com/spark/latest/dataframes-datasets/introduction-to-dataframes-python.html#work-with-dataframes) - `DataFrame` for big data, [cheatsheet](https://gist.github.com/crawles/b47e23da8218af0b9bd9d47f5242d189), [tutorial](https://github.com/ericxiao251/spark-syntax). [dask](https://github.com/dask/dask), [dask-ml](http://ml.dask.org/) - Pandas `DataFrame` for big data and machine learning library, [resources](https://matthewrocklin.com/blog//work/2018/07/17/dask-dev), [talk1](https://www.youtube.com/watch?v=ccfsbuqsjgI), [talk2](https://www.youtube.com/watch?v=RA_2qdipVng), [notebooks](https://github.com/dask/dask-ec2/tree/master/notebooks), [videos](https://www.youtube.com/user/mdrocklin). [h2o](https://github.com/h2oai/h2o-3) - Helpful `H2OFrame` class for out-of-memory dataframes. -[datatable](https://github.com/h2oai/datatable) - Data Table for big data support. [cuDF](https://github.com/rapidsai/cudf) - GPU DataFrame Library, [Intro](https://www.youtube.com/watch?v=6XzS5XcpicM&t=2m50s). 
[cupy](https://github.com/cupy/cupy) - NumPy-like API accelerated with CUDA. [ray](https://github.com/ray-project/ray/) - Flexible, high-performance distributed execution framework. From 0aa801c6ac8080a38b1e22510906a48c56c772d5 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Tue, 14 Jan 2025 08:56:15 +0100 Subject: [PATCH 147/152] Added R datasets --- README.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index c6c5627..bfc1d78 100644 --- a/README.md +++ b/README.md @@ -79,9 +79,13 @@ ##### Datasets [Rdatasets](https://vincentarelbundock.github.io/Rdatasets/articles/data.html) - Collection of more than 2000 datasets, stored as csv files (R package). +[crimedatasets](https://lightbluetitan.github.io/crimedatasets/) - Datasets focused on crimes, criminal activities (R package). +[educationr](https://lightbluetitan.github.io/educationr/) - Datasets related to education (performance, learning methods, test scores, absenteeism) (R package). [MedDataSets](https://lightbluetitan.github.io/meddatasets/index.html) - Datasets related to medicine, diseases, treatments, drugs, and public health (R package). -[usdatasets](https://lightbluetitan.github.io/usdatasets/) - US-exclusive datasets (crime, economics, education, finance, energy, healthcare) (R package). +[oncodatasets](https://lightbluetitan.github.io/oncodatasets/) - Datasets focused on cancer research, survival rates, genetic studies, biomarkers, epidemiology (R package). [timeseriesdatasets_R](https://lightbluetitan.github.io/timeseriesdatasets_R/) - Time series datasets (R package). +[usdatasets](https://lightbluetitan.github.io/usdatasets/) - US-exclusive datasets (crime, economics, education, finance, energy, healthcare) (R package). 
+ ##### p-values [The ASA Statement on p-Values: Context, Process, and Purpose](https://amstat.tandfonline.com/doi/full/10.1080/00031305.2016.1154108#.Vt2XIOaE2MN) From 4e1271bee8d2edf72314c26753e69fa84184cf83 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Tue, 14 Jan 2025 17:46:33 +0100 Subject: [PATCH 148/152] darts --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index bfc1d78..9e42355 100644 --- a/README.md +++ b/README.md @@ -875,6 +875,7 @@ Other measures: #### Time Series [Time Series Anomaly Detection Review Paper](https://arxiv.org/abs/2412.20512) [statsmodels](https://www.statsmodels.org/dev/tsa.html) - Time series analysis, [seasonal decompose](https://www.statsmodels.org/dev/generated/statsmodels.tsa.seasonal.seasonal_decompose.html) [example](https://gist.github.com/balzer82/5cec6ad7adc1b550e7ee), [SARIMA](https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.sarimax.SARIMAX.html), [granger causality](http://www.statsmodels.org/dev/generated/statsmodels.tsa.stattools.grangercausalitytests.html). +[darts](https://github.com/unit8co/darts) - Time Series library (LightGBM, Neural Networks). [kats](https://github.com/facebookresearch/kats) - Time series prediction library by Facebook. [prophet](https://github.com/facebook/prophet) - Time series prediction library by Facebook. [neural_prophet](https://github.com/ourownstory/neural_prophet) - Time series prediction built on PyTorch. 
From 7687443b8bdf101ab79cc05fd07c7284d49e85ba Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Mon, 3 Mar 2025 18:57:07 +0100 Subject: [PATCH 149/152] Statistical Inference and Regression book --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 9e42355..6d18279 100644 --- a/README.md +++ b/README.md @@ -760,6 +760,7 @@ Cell Segmentation - [Talk](https://www.youtube.com/watch?v=dVFZpodqJiI), Blog Po Legate Numpy - Distributed Numpy array multiple using GPUs by Nvidia (not released yet) [video](https://www.youtube.com/watch?v=Jxxs_moibog). #### Regression +Good introduction: [A User’s Guide to Statistical Inference and Regression](https://mattblackwell.github.io/gov2002-book/) Understanding SVM Regression: [slides](https://cs.adelaide.edu.au/~chhshen/teaching/ML_SVR.pdf), [forum](https://www.quora.com/How-does-support-vector-regression-work), [paper](http://alex.smola.org/papers/2003/SmoSch03b.pdf) [pyearth](https://github.com/scikit-learn-contrib/py-earth) - Multivariate Adaptive Regression Splines (MARS), [tutorial](https://uc-r.github.io/mars). From ef03448c2969561c518ef32679ea6a9b12935f93 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Fri, 7 Mar 2025 23:48:00 +0100 Subject: [PATCH 150/152] Applied Machine Learning in Python, Ridgeplot --- README.md | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/README.md b/README.md index 6d18279..2a1ebc5 100644 --- a/README.md +++ b/README.md @@ -183,6 +183,10 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal [quartets](https://github.com/r-causal/quartets) - Anscombe’s Quartet, Causal Quartet, [Datasaurus Dozen](https://github.com/jumpingrivers/datasauRus) and others (R package). [episensr](https://cran.r-project.org/web/packages/episensr/vignettes/episensr.html) - Quantitative Bias Analysis for Epidemiologic Data (=simulation of possible effects of different sources of bias) (R package). 
+#### Machine Learning Tutorials +[Statistical Inference and Regression](https://mattblackwell.github.io/gov2002-book/) +[Applied Machine Learning in Python](https://geostatsguy.github.io/MachineLearningDemos_Book/intro.html) +[Convolutional Neural Networks for Visual Recognition](https://cs231n.github.io/) - Stanford CS class. #### Exploration and Cleaning [Checklist](https://github.com/r0f1/ml_checklist). @@ -218,9 +222,6 @@ Bland-Altman Plot [1](https://pingouin-stats.org/generated/pingouin.plot_blandal [pypeln](https://github.com/cgarciae/pypeln) - Concurrent data pipelines. [feature-engine](https://github.com/feature-engine/feature_engine) - Encoders, transformers, etc. -#### Computer Vision -[Intro to Computer Vision](https://www.youtube.com/playlist?list=PLjMXczUzEYcHvw5YYSU92WrY8IwhTuq7p) - #### Feature Selection [Overview Paper](https://www.sciencedirect.com/science/article/pii/S016794731930194X), [Talk](https://www.youtube.com/watch?v=JsArBz46_3s), [Repo](https://github.com/Yimeng-Zhang/feature-engineering-and-feature-selection) Blog post series - [1](http://blog.datadive.net/selecting-good-features-part-i-univariate-selection/), [2](http://blog.datadive.net/selecting-good-features-part-ii-linear-models-and-regularization/), [3](http://blog.datadive.net/selecting-good-features-part-iii-random-forests/), [4](http://blog.datadive.net/selecting-good-features-part-iv-stability-selection-rfe-and-everything-side-by-side/) @@ -309,6 +310,7 @@ Faster t-SNE implementations: [lvdmaaten](https://lvdmaaten.github.io/tsne/), [M [physt](https://github.com/janpipek/physt) - Better histograms, [talk](https://www.youtube.com/watch?v=ZG-wH3-Up9Y), [notebook](https://nbviewer.jupyter.org/github/janpipek/pydata2018-berlin/blob/master/notebooks/talk.ipynb). [fast-histogram](https://github.com/astrofrog/fast-histogram) - Fast histograms. [matplotlib_venn](https://github.com/konstantint/matplotlib-venn) - Venn diagrams, [alternative](https://github.com/penrose/penrose). 
+[ridgeplot](https://github.com/tpvasconcelos/ridgeplot) - Ridge plots. [joypy](https://github.com/sbebo/joypy) - Draw stacked density plots (=ridge plots), [Ridge plots in seaborn](https://seaborn.pydata.org/examples/kde_ridgeplot.html). [mosaic plots](https://www.statsmodels.org/dev/generated/statsmodels.graphics.mosaicplot.mosaic.html) - Categorical variable visualization, [example](https://sukhbinder.wordpress.com/2018/09/18/mosaic-plot-in-python/). [scikit-plot](https://github.com/reiinakano/scikit-plot) - ROC curves and other visualizations for ML models. @@ -601,7 +603,6 @@ Review of organoid pipelines - [Paper](https://arxiv.org/ftp/arxiv/papers/2301/2 [DeepPurpose](https://github.com/kexinhuang12345/DeepPurpose) - Deep Learning Based Molecular Modelling and Prediction Toolkit. #### Neural Networks -[Convolutional Neural Networks for Visual Recognition](https://cs231n.github.io/) - Stanford CS class. [mit6874](https://mit6874.github.io/) - Computational Systems Biology: Deep Learning in the Life Sciences. [ConvNet Shape Calculator](https://madebyollin.github.io/convnet-calculator/) - Calculate output dimensions of Conv2D layer. [Great Gradient Descent Article](https://towardsdatascience.com/10-gradient-descent-optimisation-algorithms-86989510b5e9). @@ -760,7 +761,6 @@ Cell Segmentation - [Talk](https://www.youtube.com/watch?v=dVFZpodqJiI), Blog Po Legate Numpy - Distributed Numpy array multiple using GPUs by Nvidia (not released yet) [video](https://www.youtube.com/watch?v=Jxxs_moibog). 
#### Regression -Good introduction: [A User’s Guide to Statistical Inference and Regression](https://mattblackwell.github.io/gov2002-book/) Understanding SVM Regression: [slides](https://cs.adelaide.edu.au/~chhshen/teaching/ML_SVR.pdf), [forum](https://www.quora.com/How-does-support-vector-regression-work), [paper](http://alex.smola.org/papers/2003/SmoSch03b.pdf) [pyearth](https://github.com/scikit-learn-contrib/py-earth) - Multivariate Adaptive Regression Splines (MARS), [tutorial](https://uc-r.github.io/mars). @@ -768,7 +768,6 @@ Understanding SVM Regression: [slides](https://cs.adelaide.edu.au/~chhshen/teach [GLRM](https://github.com/madeleineudell/LowRankModels.jl) - Generalized Low Rank Models. [tweedie](https://xgboost.readthedocs.io/en/latest/parameter.html#parameters-for-tweedie-regression-objective-reg-tweedie) - Specialized distribution for zero inflated targets, [Talk](https://www.youtube.com/watch?v=-o0lpHBq85I). [MAPIE](https://github.com/scikit-learn-contrib/MAPIE) - Estimating prediction intervals. -[Regressio](https://github.com/brendanartley/Regressio) - Regression and Spline models. #### Polynomials [orthopy](https://github.com/nschloe/orthopy) - Orthogonal polynomials in all shapes and sizes. From 7e7c213a75fb7023bbe503a8c7b88fa391197228 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Fri, 7 Mar 2025 23:53:29 +0100 Subject: [PATCH 151/152] Update README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 2a1ebc5..0575777 100644 --- a/README.md +++ b/README.md @@ -949,7 +949,7 @@ Tutorial on using cvxpy: [1](https://calmcode.io/cvxpy-one/the-stigler-diet.html [Time-dependent Cox Model in R](https://stats.stackexchange.com/questions/101353/cox-regression-with-time-varying-covariates). [lifelines](https://lifelines.readthedocs.io/en/latest/) - Survival analysis, Cox PH Regression, [talk](https://www.youtube.com/watch?v=aKZQUaNHYb0), [talk2](https://www.youtube.com/watch?v=fli-yE5grtY). 
[scikit-survival](https://github.com/sebp/scikit-survival) - Survival analysis. -[xgboost](https://github.com/dmlc/xgboost) - `"objective": "survival:cox"` [NHANES example](https://slundberg.github.io/shap/notebooks/NHANES%20I%20Survival%20Model.html) +[xgboost](https://github.com/dmlc/xgboost) - `"objective": "survival:cox"` [NHANES example](https://shap.readthedocs.io/en/latest/example_notebooks/tabular_examples/tree_based_models/NHANES%20I%20Survival%20Model.html) [survivalstan](https://github.com/hammerlab/survivalstan) - Survival analysis, [intro](http://www.hammerlab.org/2017/06/26/introducing-survivalstan/). [convoys](https://github.com/better/convoys) - Analyze time lagged conversions. RandomSurvivalForests (R packages: randomForestSRC, ggRandomForests). From bec82aed08e30e6b4ab61fcc15a80f5244eb6e36 Mon Sep 17 00:00:00 2001 From: Florian Rohrer Date: Sat, 15 Mar 2025 19:30:52 +0100 Subject: [PATCH 152/152] fastplotlib --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 0575777..2898169 100644 --- a/README.md +++ b/README.md @@ -336,6 +336,7 @@ Faster t-SNE implementations: [lvdmaaten](https://lvdmaaten.github.io/tsne/), [M [proplot](https://github.com/proplot-dev/proplot) - Matplotlib wrapper. [morpheus](https://software.broadinstitute.org/morpheus/) - Broad Institute tool matrix visualization and analysis software. [Source](https://github.com/cmap/morpheus.js), Tutorial: [1](https://www.youtube.com/watch?v=0nkYDeekhtQ), [2](https://www.youtube.com/watch?v=r9mN6MsxUb0), [Code](https://github.com/broadinstitute/BBBC021_Morpheus_Exercise). [jupyter-scatter](https://github.com/flekschas/jupyter-scatter) - Interactive 2D scatter plot widget for Jupyter. +[fastplotlib](https://github.com/fastplotlib/fastplotlib) - Fast plotting library using pygfx. #### Colors [palettable](https://github.com/jiffyclub/palettable) - Color palettes from [colorbrewer2](https://colorbrewer2.org/#type=sequential&scheme=BuGn&n=3).