Data Platform Data Engineering Projects

Infrastructure, frameworks, and tooling

Discover open-source data engineering projects in Data Platform from the community.

9 projects found

1. Drift Detective

Drift Detective is a Python library for tracking schema evolution using versioned JSON snapshots.

by varga.dani6

2. Silism Commerce 360 - Local AI Native Lakehouse

AI-native e-commerce data platform you can run locally (Airflow + dbt + MCP)

by Silism

3. Airflow Python React Widgets

Python React Experiment

by Rahul Rajasekharan

4. Airflow and DBT Analytics Accelerator

An open-source accelerator for a ready-to-run, end-to-end analytics platform

by gsaipurushoth7

5. Reddit ETL Pipeline in Docker

Reddit data engineering ETL pipeline: Spark, Airflow, and MinIO in Docker, following a medallion architecture

by Abdullah

6. Data Warehousing for Realtime Pipelines

Building a real-time data warehouse with tools such as Apache Kafka

by wahomewilberforce

7. E2E Real-Time Data Pipeline

Real-time data pipeline with Kafka, Flink, Iceberg, Trino, and Superset.

by abelst9

8. Batch data pipeline

Batch data pipeline with Airflow, DuckDB, Delta Lake, Trino, and Metabase, covering observability and quality.

by abelst9

9. Airflow Bulk Pause Unpause Plugin

Bulk-manage Airflow DAG states: pause or unpause many DAGs in one action.

by Rahul Rajasekharan