
Starred repositories
The leader in Customer Data Infrastructure
Deploy and manage containers (including Docker) on top of Apache Mesos at scale.
Breeze is/was a numerical processing library for Scala.
A Scala API for Apache Beam and Google Cloud Dataflow.
TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning
A low-code machine learning personalized ranking service for articles, listings, search results, and recommendations that boosts user engagement. A friendly learn-to-rank engine.
GeoMesa is a suite of tools for working with big geo-spatial data in a distributed fashion.
GeoTrellis is a geographic data processing engine for high performance applications.
Spark RAPIDS plugin - accelerate Apache Spark with GPUs
Essential Spark extensions and helper methods ✨😲
Qubole Sparklens tool for performance tuning Apache Spark
Examples for High Performance Spark
Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit)
Code snippets for solving common big data problems in various platforms. Inspired by Rosetta Code
Solution to Facebook's link prediction contest on Kaggle.
Visualize statistics from the MOOC "Functional Programming Principles in Scala" using Scala!
Spark Structured Streaming / Kafka / Cassandra / Elastic
Implements a spectral (third-order tensor decomposition) method for learning LDA topic models on Spark.
Performance optimization for Spark running on Kubernetes
OSMesa is an OpenStreetMap processing stack based on GeoTrellis and Apache Spark
Randomized SVD of large sparse matrices on Spark
Spark 2.0 Scala Machine Learning examples
Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines and apply best practices.