The Complete Hands-On Introduction to Apache Airflow

Learn to author, schedule and monitor data pipelines through practical examples using Apache Airflow
File size: 2.63 GB
Total length: 3h 28m
Instructor: Marc Lamberti
Language: English
Last update: 3/2023
Ratings: 4.4/5

What you’ll learn

Create plugins to add functionalities to Apache Airflow.
Use Docker with Airflow and different executors.
Master core functionalities such as DAGs, Operators, Tasks, Workflows, etc.
Understand and apply advanced concepts of Apache Airflow such as XComs, Branching and SubDAGs.
Understand the differences between the Sequential, Local and Celery Executors, how they work and how you can use them.
Use Apache Airflow in a Big Data ecosystem with Hive, PostgreSQL, Elasticsearch, etc.
Install and configure Apache Airflow.
Think through, answer and implement solutions to real data processing problems using Airflow.

Requirements

VirtualBox must be installed. A 3 GB VM will have to be downloaded.
At least 8 gigabytes of memory.
Some prior programming or scripting experience. Python experience will help you a lot, but since it is a very easy language to learn, the course shouldn't be too difficult if you are not familiar with it.

Description

Apache Airflow is an open-source platform to programmatically author, schedule and monitor workflows. If you have many ETL jobs to manage, Airflow is a must-have. In this course you will learn everything you need to start using Apache Airflow through theory and practical videos. Starting from basic notions such as what Airflow is and how it works, we will dive into advanced concepts such as how to create plugins and build truly dynamic pipelines.
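To make "programmatically author workflows" concrete, here is a minimal plain-Python sketch of the idea behind a DAG: tasks are ordinary callables, dependencies are declared as data, and each task runs only after its upstream tasks. This is not Airflow's API. It uses only the standard library, and the task names (`extract`, `process`, `store`) and the `run_pipeline` helper are hypothetical, chosen for illustration.

```python
from graphlib import TopologicalSorter


def run_pipeline(tasks, upstream):
    """Run every task once, after all of its upstream tasks have run.

    tasks:    maps a task name to a zero-argument callable.
    upstream: maps a task name to the set of task names it depends on.
    """
    # static_order() yields the names in an order that respects dependencies
    # and raises graphlib.CycleError if the graph is not acyclic.
    order = list(TopologicalSorter(upstream).static_order())
    for name in order:
        tasks[name]()
    return order


# Hypothetical three-step pipeline: each task only records that it ran.
log = []
tasks = {
    name: (lambda n=name: log.append(n))
    for name in ("extract", "process", "store")
}
upstream = {
    "extract": set(),
    "process": {"extract"},
    "store": {"process"},
}

order = run_pipeline(tasks, upstream)
print(order)
```

Airflow adds scheduling, retries, monitoring and distributed execution on top of this basic model. The sketch only shows the dependency-ordering part.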

Overview

Section 1: Course Introduction

Lecture 1 Prerequisites

Lecture 2 Course Objectives

Lecture 3 Who I am?

Lecture 4 Development Environment

Section 2: Getting Started with Airflow

Lecture 5 Why Airflow?

Lecture 6 What is Airflow?

Lecture 7 Core Components

Lecture 8 Core Concepts

Lecture 9 Airflow is not…

Lecture 10 Single Node Architecture

Lecture 11 Multi Node Architecture

Lecture 12 How does it work?

Lecture 13 [Practice] Installing Apache Airflow

Lecture 14 What is Docker?

Lecture 15 The docker-compose file

Lecture 16 Key Takeaways

Section 3: The important views of the Airflow UI

Lecture 17 The DAGs View

Lecture 18 The Grid View

Lecture 19 The Graph View

Lecture 20 The Landing Times View

Lecture 21 The Calendar View

Lecture 22 The Gantt View

Lecture 23 The Code View

Lecture 24 Wrap up!

Section 4: Coding Your First Data Pipeline with Airflow

Lecture 25 The Project

Lecture 26 Advices

Lecture 27 What is a DAG?

Lecture 28 DAG Skeleton

Lecture 29 What is an Operator?

Lecture 30 Providers

Lecture 31 Create a Table

Lecture 32 Create a connection

Lecture 33 The secret weapon!

Lecture 34 What is a Sensor?

Lecture 35 Is the API available?

Lecture 36 Extract users

Lecture 37 Process users

Lecture 38 Before running process_user

Lecture 39 What is a Hook?

Lecture 40 Store users

Lecture 41 Order matters!

Lecture 42 Your DAG in action!

Lecture 43 DAG Scheduling

Lecture 44 Backfilling: How does it work?

Lecture 45 Wrap up!

Section 5: The New Way of Scheduling DAGs

Lecture 46 Why do you need that feature?

Lecture 47 What is a Dataset?

Lecture 48 Adios schedule_interval!

Lecture 49 Create the Producer DAG

Lecture 50 Create the Consumer DAG

Lecture 51 Track your Datasets with the new view!

Lecture 52 Wait for many datasets

Lecture 53 Dataset limitations

Section 6: Databases and Executors

Lecture 54 What’s an executor?

Lecture 55 The default config

Lecture 56 The Sequential Executor

Lecture 57 The Local Executor

Lecture 58 The Celery Executor

Lecture 59 The current config

Lecture 60 Add the DAG parallel_dag.py into the dags folder

Lecture 61 Monitor your tasks with Flower

Lecture 62 Remove DAG examples

Lecture 63 Running tasks on Celery Workers

Lecture 64 What is a queue?

Lecture 65 Add a new Celery Worker

Lecture 66 Create a queue to better distribute tasks

Lecture 67 Send a task to a specific queue

Lecture 68 Concurrency, the parameters you must know!

Section 7: Implementing Advanced Concepts in Airflow

Lecture 69 Adios repetitive patterns

Lecture 70 Add the DAG group_dag.py

Lecture 71 How to use SubDAGs?

Lecture 72 Adios SubDAGs, welcome TaskGroups!

Lecture 73 Add the DAG xcom_dag.py

Lecture 74 Sharing data between tasks with XComs

Lecture 75 [Practice] XComs in action!

Lecture 76 Choosing a specific path in your DAG

Lecture 77 [Practice] Executing a task according to a condition

Lecture 78 Trigger rules or how tasks get triggered

Section 8: Creating Airflow Plugins with Elasticsearch and PostgreSQL

Lecture 79 Introduction

Lecture 80 What’s Elasticsearch?

Lecture 81 Running Elasticsearch with Airflow

Lecture 82 How the plugin system works?

Lecture 83 Create the connection

Lecture 84 Create the ElasticHook

Lecture 85 Add ElasticHook to the Plugin system

Lecture 86 Add the DAG elastic_dag.py

Lecture 87 Your Hook in Action!

Section 9: BONUS – APPENDIX

Lecture 88 [BLOG POST] How to use the DockerOperator with Templating and Apache Spark

Lecture 89 [BLOG POST] Apache Airflow with Kubernetes Executor

Lecture 90 [BLOG POST] How to use templates and macros in Apache Airflow

Lecture 91 [BLOG POST] How to use timezones in Apache Airflow

Lecture 92 [BLOG POST] How to use the BashOperator

Lecture 93 [BLOG POST] Variables in Apache Airflow: The Guide

Lecture 94 [BLOG POST] Best Practices in Apache Airflow (part 1)

Lecture 95 [VIDEO] Running Apache Airflow on a multi-nodes Kubernetes cluster locally

Lecture 96 [BLOG POST] The PostgresOperator: All you need to know

Lecture 97 [VIDEO] The DockerOperator: The Basics and more!

Lecture 98 [VIDEO] The New Way Of Scheduling your DAGs

Lecture 99 [VIDEO] Build a data pipeline with AWS, Snowflake and Airflow

Lecture 100 [VIDEO] What’s new in Airflow 2.4?

Lecture 101 BONUS : COUPON FOR MY OTHER COURSES!

People curious about data engineering.
People who want to learn basic and advanced concepts about Apache Airflow.
People who like a hands-on approach.

Course Information:

Udemy | English | 3h 28m | 2.63 GB
Created by: Marc Lamberti
