DataTalksClub/data-engineering-zoomcamp

GitHub: DataTalksClub/data-engineering-zoomcamp

Stars: 42150 | Forks: 8339

Data Engineering Zoomcamp Overview

Data Engineering Zoomcamp: A Free 9-Week Course on Data Engineering Fundamentals

Master the fundamentals of data engineering by building an end-to-end data pipeline from scratch. Gain hands-on experience with industry-standard tools and best practices.

Join Slack | #course-data-engineering Channel | Telegram Announcements | Course Playlist | FAQ

## How to Enroll

### 2026 Cohort

- **Start Date**: 12 January 2026
- **Register Here**: [Sign up](https://airtable.com/shr6oVXeQvSI5HuWD)

### Self-Paced Learning

1. Watch the course videos.
2. Join the [Slack community](https://datatalks.club/slack.html).
3. Refer to the [FAQ document](https://datatalks.club/faq/data-engineering-zoomcamp.html) for guidance.

## Syllabus Overview

The course consists of structured modules, hands-on workshops, and a final project to reinforce your learning.

### **Prerequisites**

To get the most out of this course, you should have:

- Basic coding experience
- Familiarity with SQL
- Experience with Python (helpful but not required)

No prior data engineering experience is necessary.

### **Modules**

#### [Module 1: Containerization and Infrastructure as Code](01-docker-terraform/)

- Introduction to GCP
- Docker and Docker Compose
- Running PostgreSQL with Docker
- Infrastructure setup with Terraform
- Homework

#### [Module 2: Workflow Orchestration](02-workflow-orchestration/)

- Data Lakes and Workflow Orchestration
- Workflow orchestration with Kestra
- Homework

#### [Workshop 1: Data Ingestion](cohorts/2026/workshops/dlt.md)

- API reading and pipeline scalability
- Data normalization and incremental loading
- Homework

#### [Module 3: Data Warehousing](03-data-warehouse/)

- Introduction to BigQuery
- Partitioning, clustering, and best practices
- Machine learning in BigQuery

#### [Module 4: Analytics Engineering](04-analytics-engineering/)

- Analytics Engineering and Data Modeling
- dbt (data build tool) with DuckDB & BigQuery
- Testing, documentation, and deployment

#### [Module 5: Data Platforms](05-data-platforms/)

- Building end-to-end data pipelines with Bruin
- Data ingestion, transformation, and quality
- Deployment to cloud (BigQuery)

#### [Module 6: Batch Processing](06-batch/)

- Introduction to Apache Spark
- DataFrames and SQL
- Internals of GroupBy and Joins

#### [Module 7: Streaming](07-streaming/)

- Introduction to Kafka
- Kafka Streams and KSQL
- Schema management with Avro

#### [Final Project](projects/)

- Apply all concepts learned in a real-world scenario
- Peer review and feedback process

## Testimonials

### **Getting Help on Slack**

Join the [`#course-data-engineering`](https://app.slack.com/client/T01ATQK62F8/C01FABYF2RG) channel on [DataTalks.Club Slack](https://datatalks.club/slack.html) for discussions, troubleshooting, and networking.

To keep discussions organized:

## Meet the Instructors

- [Alexey Grigorev](https://linkedin.com/in/agrigorev)
- [Michael Shoemaker](https://www.linkedin.com/in/michaelshoemaker1/)
- [Will Russell](https://www.linkedin.com/in/wrussell1999/)
- [Anna Geller](https://www.linkedin.com/in/anna-geller-12a86811a/)
- [Juan Manuel Perafan](https://www.linkedin.com/in/jmperafan/)
- [Arsalan Noorafkan](https://www.linkedin.com/in/arsalan0/)

Past instructors:

- [Victoria Perez Mola](https://www.linkedin.com/in/victoriaperezmola/)
- [Ankush Khanna](https://linkedin.com/in/ankushkhanna2)
- [Sejal Vaidya](https://www.linkedin.com/in/vaidyasejal/)
- [Irem Erturk](https://www.linkedin.com/in/iremerturk/)
- [Luis Oliveira](https://www.linkedin.com/in/lgsoliveira/)
- [Zach Wilson](https://www.linkedin.com/in/eczachly)

## About DataTalks.Club


DataTalks.Club is a global online community of data enthusiasts. It's a place to discuss data, learn, share knowledge, ask and answer questions, and support each other.

Website | Join Slack Community | Newsletter | Upcoming Events | YouTube | GitHub | LinkedIn | Twitter

Most of the activity at DataTalks.Club happens on [Slack](https://datatalks.club/slack.html). We post updates there and discuss different aspects of data, career questions, and more.