Apache Spark Data and AI Engineering Projects

Discover data and AI engineering projects built with Apache Spark. Browse workflows, pipelines, applications, and integrations from the community.

9 projects found

1. football-lakehouse

Football analytics lakehouse built with Apache Spark, Apache Iceberg, Project Nessie, MinIO, and Dremio.

Data Platform, ETL/ELT Pipeline, Batch Processing

Apache Spark

by كريم جلال

Why Apache Spark Shows Up In Data and AI Engineering Projects

Teams use Apache Spark to build, orchestrate, test, or operate data and AI systems in production. This page groups real projects that use Apache Spark, so readers can see how it is applied in practice across pipelines, applications, tooling, and platform work. With 9 published projects currently listed, the page is most useful for comparing implementations side by side rather than browsing a single tag.

What To Look For In Apache Spark Projects

Useful Apache Spark projects explain the problem they solve, the surrounding stack, and where Apache Spark fits in the broader architecture. Strong examples also show operational details: deployment approach, testing, observability, data quality controls, and documentation. That context helps readers who are evaluating tools, as well as automated agents looking for grounded examples.

Apache Spark Data and AI Engineering Projects

1. football-lakehouse

2. WikiStream Event Data Analytics Pipeline in AWS

3. Real-Time-Sales-Streaming-Pipeline

4. Bluesky NBA Real-Time Sentiment Analysis

5. Yelp Batch ETL Pipeline

6. Reddit ETL Pipeline in Docker

7. F1 Insights Real Time Replay

8. Automated News Intelligence Pipeline

9. AIRFLOW YAHOO ETL
