
📡 Real-time data pipeline with Kafka, Flink, Iceberg, Trino, MinIO, and Superset. Ideal for learning data systems.

Python 50 7 Updated Jan 18, 2025

Notes talking about the design and implementation of Apache Spark

5,342 1,843 Updated Apr 2, 2024

Roadmap for Data Engineers. The goal of the roadmap is to get you a job!

452 171 Updated Sep 15, 2025

This repository will contain all of the resources for the Mage component of the Data Engineering Zoomcamp: https://github.com/DataTalksClub/data-engineering-zoomcamp/tree/main

Dockerfile 101 115 Updated Aug 20, 2024

My Insight Data Engineering Fellowship project. I implemented a big data processing pipeline based on lambda architecture, that aggregates Twitter and US stock market data for user sentiment anal…

Scala 505 127 Updated Aug 24, 2022

100+ Python challenging programming exercises

28,168 6,917 Updated Apr 28, 2025

Practice your pandas skills!

Jupyter Notebook 11,693 8,661 Updated Aug 16, 2024

An example project that demonstrates real-time big data stream processing using GigaSpaces

Java 19 9 Updated Feb 26, 2022

100 numpy exercises (with solutions)

Python 13,293 6,293 Updated Aug 26, 2025

Data Engineering pet-project covering GCP, Docker, workflow orchestration with Mage, data transforming with dbt, batch processing via Spark

Python 1 Updated Apr 21, 2024

Pipeline that extracts data from Crinacle's Headphone and InEarMonitor databases and finalizes data for a Metabase Dashboard. The dashboard is then used to support a purchasing decision of which He…

Python 239 53 Updated Jan 1, 2023

The smart city reference pipeline shows how to integrate various media building blocks, with analytics powered by the OpenVINO™ Toolkit, for traffic or stadium sensing, analytics and management tasks.

Python 209 89 Updated May 5, 2025

Terminal User Interface (TUI) apps

Python 886 62 Updated Aug 15, 2025

This project shows how to capture changes from a Postgres database and stream them into Kafka

Python 38 20 Updated May 17, 2024

Learn how to design, develop, deploy and iterate on production-grade ML applications.

Jupyter Notebook 3,203 566 Updated Aug 16, 2024
Jupyter Notebook 11 24 Updated Aug 30, 2019

My solution to the book "A Collection of Data Science Take-Home Challenges"

Jupyter Notebook 981 528 Updated Oct 31, 2022
JavaScript 1 Updated Jun 15, 2023

Sample project to demonstrate data engineering best practices

Python 197 36 Updated Feb 24, 2024

DataTalks.Club's Data Engineering Zoomcamp Project

Python 12 3 Updated May 7, 2023

Final Project of the MLOps Zoomcamp hosted by DataTalksClub.

HTML 26 5 Updated Dec 19, 2022

DataTalks.Club's Data Engineering Zoomcamp Project

Python 23 7 Updated Jul 14, 2022

A repo to track data engineering projects

Jupyter Notebook 13 6 Updated Nov 11, 2022

A batch data pipeline that retrieves data from a user purchase table and a movie review table and transforms it into a user behaviour metric table.

HCL 17 3 Updated Aug 14, 2025

A project portfolio to accompany my resume

Python 29 6 Updated Sep 5, 2023

Insight Data Engineering Project

Python 15 10 Updated Jun 1, 2021

Data Engineering Project in GCP

Python 21 4 Updated Mar 29, 2023

In this project, we will build an ETL (Extract, Transform, Load) pipeline using the Spotify API on AWS. The pipeline will retrieve data from the Spotify API, transform it into the desired format and load it…

Jupyter Notebook 24 4 Updated May 6, 2023

A data engineering project with Airflow, dbt, Terraform, GCP and much more!

Python 25 5 Updated Nov 8, 2022

Data Engineering, Data Warehouse, Data Mart, Cloud Data, AWS, SAS, Redshift, S3

Jupyter Notebook 31 4 Updated Feb 2, 2021