Azure Data Factory For Data Engineers Project on Covid19

Real world project for Data Engineers using Azure Data Factory, SQL, Data Lake, Databricks, HDInsight, CI/CD [DP203]
File Size: 4.78 GB
Total length: 12h 46m
Instructor: Ramesh Retnasamy
Language: English
Last update: 1/2023
Ratings: 4.6/5

What you’ll learn

You will learn how to build a real-world data pipeline in Azure Data Factory (ADF). This course has been taught using real world data used to report Covid-19 trends.
You will acquire good Data Engineering skills in Azure using Azure Data Factory (ADF), Azure Data Lake Storage Gen2, Azure SQL Database, Azure Blob Storage and Azure Monitor
You will learn how to ingest data from sources such as HTTP and Azure Blob Storage into Azure Data Lake Gen2 using Azure Data Factory (ADF)
You will learn how to transform data using Data Flows in Azure Data Factory (ADF) and load into Azure Data Lake Storage Gen2
You will learn how to transform data using Databricks Notebook Activity in Azure Data Factory (ADF) and load into Azure Data Lake Storage Gen2
You will learn how to transform data using Azure HDInsight Activity in Azure Data Factory (ADF) and load into Azure Data Lake Storage Gen2
You will learn how to load transformed data from Azure Data Lake Storage Gen2 to Azure SQL Database using Azure Data Factory (ADF)
You will learn extensively about Triggers in Azure Data Factory (ADF) and how to use them to schedule the data pipelines.
You will learn how to monitor pipelines using Azure Data Factory (ADF), Azure Monitor and Log Analytics with a real-world project.
You will learn how to build production-ready pipelines that follow good practices and naming standards.
You will gain most of the skills required to pass the Azure Data Engineer Associate certification exam DP203 (which replaced DP200 and DP201), but the primary objective of the course is not to teach you to pass the exam.

Requirements

A basic understanding of cloud computing will be useful, but is not necessary.
Experience in Azure is not required. I will take you through everything necessary to follow this course and build the project.
An Azure account is required. If you don't have one, we will create a free account in the course.

Description

Major updates to the course since the launch

January 2023: Updates to section 3 (Environment Set-up) to reflect the change to the user interface. Re-recorded 5 lessons.
November 2022: Addition of sections 15 & 16 focusing on Continuous Integration & Continuous Delivery (CI/CD).

Welcome! I am looking forward to helping you learn one of the in-demand data engineering tools in the cloud, Azure Data Factory (ADF). This course is taught by implementing a data engineering solution using Azure Data Factory (ADF) for a real-world problem: reporting Covid-19 trends and predicting the spread of the virus.

This is like no other course on Udemy for Azure Data Factory or data engineering technologies. Once you have completed the course, including all the assignments, I strongly believe that you will be in a position to start a real-world data engineering project on your own and will be proficient in Azure Data Factory (ADF). I have also included lessons on storage solutions such as Azure Data Lake Storage, Azure Blob Storage and Azure SQL Database, as well as lessons on Azure HDInsight and Azure Databricks. I have even included lessons on building reports using Power BI on the data processed by the Azure Data Factory data pipelines. I have considered the machine learning models to be out of scope. You can use this data to build your own models and predict the spread.

The course follows the logical progression of a real-world project implementation, with technical concepts being explained and the data pipelines in Azure Data Factory (ADF) being built at the same time. Even though this course is not specifically designed to teach you the skills required for passing the Azure Data Engineer Associate certification exam DP203, it can greatly help you get most of the necessary skills required for the exam.

I value your time as much as I do mine, so I have designed this course to be fast-paced and to the point. The course is also taught in simple English with no jargon. I start the course from the basics, and by the end of the course you will be proficient in the technologies used.

Currently the course teaches you the following:

Azure Data Factory

Building a solution architecture for a data engineering solution using Azure data engineering technologies such as Azure Data Factory (ADF), Azure Data Lake Gen2, Azure Blob Storage, Azure SQL Database, Azure Databricks, Azure HDInsight and Microsoft Power BI.
Integrating data from HTTP clients, Azure Blob Storage and Azure Data Lake Gen2 using Azure Data Factory.
Branching and chaining activities in Azure Data Factory (ADF) pipelines using control flow activities such as Get Metadata, If Condition, ForEach, Delete, Validation etc.
Using parameters and variables in pipelines, datasets and linked services to create metadata-driven pipelines in Azure Data Factory (ADF).
Debugging the data pipelines and resolving issues.
Scheduling pipelines using triggers such as Event Trigger, Schedule Trigger and Tumbling Window Trigger in Azure Data Factory (ADF).
Creating Mapping Data Flows to create transformation logic. The course covers all of the transformation steps such as Source, Filter, Select, Pivot, Lookup, Conditional Split, Derived Column, Aggregate, Join and Sink transformation.
Debugging data flows, investigating issues, fixing failures etc.
Implementing Azure Data Factory pipelines to invoke Mapping Data Flows and executing them.
Creating ADF pipelines to execute HDInsight activities and carry out data transformations.
Creating ADF pipelines to execute Databricks Notebook activities to carry out transformations.
Creating dependencies between pipelines to orchestrate the data flow.
Creating dependencies between triggers to orchestrate the data flow.
Monitoring data pipelines, creating alerts and reporting on metrics from the Azure Data Factory Monitor.
Monitoring Data Factory pipelines using Azure Monitor and configuring diagnostic settings to forward logs to an Azure Storage Account or a Log Analytics workspace.
Creating a Log Analytics workspace, and creating workbooks and charts from Log Analytics on the Azure Data Factory pipelines.
Implementing the Azure Data Factory Analytics monitoring tool and extending its capability further.

Azure Storage Solutions

Creating an Azure Storage Account, creating containers, uploading data, Access Control (IAM), and using Azure Storage Explorer to interact with the storage account.
Creating Azure Data Lake Gen2, creating containers, uploading data, Access Control (IAM), and using Azure Storage Explorer to interact with the storage account.
Creating an Azure SQL Database, pricing tiers, creating an admin user, creating tables, loading data and querying the database.

Azure HDInsight & Databricks

Creating HDInsight clusters, interacting with the UI, using Ambari, creating Hive tables, and invoking HDInsight activities from Azure Data Factory.
Creating an Azure Databricks workspace, creating Databricks clusters, mounting storage accounts, creating Databricks notebooks, performing transformations using Databricks notebooks, and invoking Databricks notebooks from Azure Data Factory.
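
The description lists the Tumbling Window Trigger among the scheduling options. As a rough illustration of the idea only (this is not course material and not ADF code), a tumbling window schedule splits time into fixed-size, back-to-back, non-overlapping intervals, and each interval is processed once. The sketch below shows that slicing in plain Python; the function name `tumbling_windows` and the dates are hypothetical, chosen just for the example.

```python
from datetime import datetime, timedelta


def tumbling_windows(start, end, interval):
    """Yield (window_start, window_end) pairs that tile [start, end).

    Windows are fixed-size, contiguous and non-overlapping, which is the
    defining property of a tumbling window schedule. A trailing partial
    window is not emitted.
    """
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    window_start = start
    while window_start + interval <= end:
        yield window_start, window_start + interval
        window_start += interval


# Hypothetical example: one day sliced into 6-hour windows.
windows = list(
    tumbling_windows(
        datetime(2023, 1, 1), datetime(2023, 1, 2), timedelta(hours=6)
    )
)
for window_start, window_end in windows:
    print(window_start.isoformat(), "->", window_end.isoformat())
```

Each window's end is the next window's start, so a pipeline run per window sees every record exactly once, provided it filters on a half-open range (start inclusive, end exclusive).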

Overview

Section 1: Introduction

Lecture 1 Course Introduction

Lecture 2 Course Structure

Lecture 3 Course Slides Download

Section 2: Overviews

Lecture 4 Azure Data Factory Overview

Lecture 5 Project Overview

Lecture 6 Solution Architecture Overview

Lecture 7 Azure Storage Solutions Overview

Lecture 8 Useful Links & Resources

Section 3: Environment Set-up

Lecture 9 Environment Set-up – Module Overview

Lecture 10 Creating Azure Free Account

Lecture 11 Azure Portal Overview

Lecture 12 Creating Azure Data Factory

Lecture 13 Creating Azure Storage Account

Lecture 14 Installing Azure Storage Explorer

Lecture 15 Creating Azure Data Lake Storage Gen2

Lecture 16 Creating Azure SQL Database

Lecture 17 Installing Azure Data Studio

Lecture 18 Useful Links & Resources

Section 4: Data Ingestion from Azure Blob

Lecture 19 Data Ingestion from Azure Blob – Module Overview

Lecture 20 Copy Activity Overview

Lecture 21 Environment Preparation

Lecture 22 Naming Standards

Lecture 23 Linked Services & Data Sets

Lecture 24 Creating ADF Pipeline

Lecture 25 Control Flow Activities (1) – Validation Activity

Lecture 26 Control Flow Activities (2) – Get Metadata, If Condition, Web Activities

Lecture 27 Control Flow Activities (3) – Delete Activity

Lecture 28 ADF Triggers Overview

Lecture 29 Creating Event Trigger

Lecture 30 Useful Links & Resources

Section 5: Data Ingestion From HTTP

Lecture 31 Data Ingestion From HTTP – Module Overview

Lecture 32 Important – Recent Changes to ECDC Data

Lecture 33 ECDC Data Overview

Lecture 34 Create Pipeline

Lecture 35 Frequently Asked Questions – Please Read

Lecture 36 Pipeline Variables

Lecture 37 Pipeline Parameters & Schedule Trigger

Lecture 38 Control Flow Activities

Lecture 39 ADF potential bug in the next lesson – Please Read

Lecture 40 Linked Service Parameters

Lecture 41 Metadata Driven Pipeline

Lecture 42 Useful Links & Resources

Section 6: Data Flows – Cases & Deaths Data Transformation

Lecture 43 Data Flows(1) – Module Overview

Lecture 44 Introduction to Data Flows

Lecture 45 Data Flow UI Overview

Lecture 46 Transformation Requirement Overview

Lecture 47 Source Transformation

Lecture 48 Filter Transformation

Lecture 49 Select Transformation

Lecture 50 Pivot Transformation

Lecture 51 Lookup Transformation

Lecture 52 Sink Transformation

Lecture 53 Create ADF Pipeline

Lecture 54 Useful Links & Resources
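
The transformations named in this section (Filter, Select, Pivot, Lookup) can be pictured with a small plain-Python sketch. It is an assumption-laden illustration, not the course's data flow: the column names, the sample rows and the `country_lookup` table are all made up, and real Mapping Data Flows are built in the ADF UI rather than written like this.

```python
from collections import defaultdict

# Hypothetical sample rows; names and values are invented for illustration.
rows = [
    {"country": "France", "continent": "Europe", "date": "2020-03-01",
     "indicator": "cases", "daily_count": 10, "source": "demo"},
    {"country": "France", "continent": "Europe", "date": "2020-03-01",
     "indicator": "deaths", "daily_count": 1, "source": "demo"},
    {"country": "Germany", "continent": "Europe", "date": "2020-03-01",
     "indicator": "cases", "daily_count": 7, "source": "demo"},
    {"country": "Brazil", "continent": "America", "date": "2020-03-01",
     "indicator": "cases", "daily_count": 5, "source": "demo"},
]

# Hypothetical lookup table mapping country name to a 3-letter code.
country_lookup = {"France": "FRA", "Germany": "DEU"}


def transform(rows, country_lookup):
    # Filter: keep only rows for one continent.
    filtered = [r for r in rows if r["continent"] == "Europe"]

    # Select: keep only the columns needed downstream.
    keep = ("country", "date", "indicator", "daily_count")
    selected = [{k: r[k] for k in keep} for r in filtered]

    # Pivot: one output row per (country, date), one column per indicator.
    pivoted = defaultdict(dict)
    for r in selected:
        pivoted[(r["country"], r["date"])][r["indicator"]] = r["daily_count"]

    # Lookup: enrich each row with the country code (None if no match).
    result = []
    for (country, date), counts in sorted(pivoted.items()):
        result.append({
            "country": country,
            "country_code": country_lookup.get(country),
            "date": date,
            "cases_count": counts.get("cases"),
            "deaths_count": counts.get("deaths"),
        })
    return result


output = transform(rows, country_lookup)
for row in output:
    print(row)
```

The pivot step is the one that changes the shape of the data: two input rows for the same country and date (one per indicator) collapse into a single output row with one column per indicator.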

Section 7: Data Flows – Hospital Admissions Data Transformation

Lecture 55 Data Flows(2) – Module Overview

Lecture 56 Transformation Requirement

Lecture 57 Source Transformation (Assignment)

Lecture 58 Select Transformation (Assignment)

Lecture 59 Lookup Country (Assignment)

Lecture 60 Conditional Split Transformation

Lecture 61 Source Transformation – DimDate

Lecture 62 Derived Column Transformation

Lecture 63 Aggregate Transformation

Lecture 64 Join Transformation

Lecture 65 Pivot Transformation (Assignment)

Lecture 66 Sort Transformation

Lecture 67 Sink Transformation (Assignment)

Lecture 68 Create ADF Pipeline (Assignment)

Lecture 69 Useful Links & Resources

Section 8: Prepare Data for HDInsight & Data Bricks

Lecture 70 Prepare Data for HDInsight & Data Bricks

Section 9: HDInsight Activity

Lecture 71 HDInsight Activity – Module Overview

Lecture 72 Note for Azure Free Tier & Student Subscription users

Lecture 73 Create HDInsight Cluster

Lecture 74 Tour of the HDInsight UI

Lecture 75 Transformation Requirement

Lecture 76 Hive Script Walkthrough

Lecture 77 Create ADF Pipeline with Hive Activity

Lecture 78 Delete HDInsight Cluster

Lecture 79 Useful Links & Resources

Section 10: Data Bricks Activity

Lecture 80 Data Bricks Activity – Module Overview

Lecture 81 Cluster Configuration – Only for Free and Student Subscriptions

Lecture 82 Create Azure Databricks Service

Lecture 83 Create Azure Databricks Cluster

Lecture 84 Mounting Azure Data Lake Storage

Lecture 85 Transformation Requirements

Lecture 86 Create ADF Pipeline Databricks Notebook Activity

Lecture 87 Only for students using Free Azure Subscription

Lecture 88 Useful Links & Resources

Section 11: Copy Data to Azure SQL

Lecture 89 Copy Data to Azure SQL – Module Overview

Lecture 90 Copy Data Activity – Cases & Deaths Data

Lecture 91 Copy Data Activity – Hospital Admissions Data

Lecture 92 Copy Data Activity – Testing Data

Lecture 93 Useful Links and Resources

Section 12: Making Pipelines Production Ready

Lecture 94 Making Pipelines Production Ready – Module Overview

Lecture 95 Option 1 – Pipeline Dependency

Lecture 96 Option 2 – Trigger Dependency

Lecture 97 Useful Links & Resources

Section 13: Monitoring

Lecture 98 Monitoring – Module Overview

Lecture 99 What to Monitor & How

Lecture 100 Azure Data Factory Monitor

Lecture 101 Creating Alerts

Lecture 102 Monitor Pipeline Failures

Lecture 103 Re-run Failed Pipelines

Lecture 104 Reporting on Metrics

Lecture 105 Introduction to Azure Monitor

Lecture 106 Introduction to Log Analytics

Lecture 107 Log Analytics Further capabilities

Lecture 108 Azure Data Factory Analytics

Lecture 109 Useful Links & Resources

Section 14: Power BI Reports

Lecture 110 Power BI Reports – Module Overview

Lecture 111 Introduction to PowerBI Desktop

Lecture 112 Walk through the Covid-19 Report

Lecture 113 Useful Links & Resources

Section 15: Continuous Integration / Continuous Delivery (CI/CD)

Lecture 114 Continuous Integration/ Continuous Delivery – Module Overview

Lecture 115 Introduction to Continuous Integration/ Continuous Delivery (CI/CD)

Lecture 116 Introduction to CI/CD for Azure Data Factory

Lecture 117 Overview of Azure DevOps

Lecture 118 Azure DevOps Environment Set-up

Lecture 119 Azure Data Factory Environment Set-up

Lecture 120 Azure Data Factory Git Configuration

Lecture 121 Azure Data Factory Code Development using Git

Lecture 122 Option 1 – Manual Build

Lecture 123 Option 1 – Release Pipeline Design

Lecture 124 Option 1 – Creating ARM Deployment Task

Lecture 125 Option 1 – Testing ARM Deployment Task

Lecture 126 Option 1 – Pitfalls of ARM Deployment Task

Lecture 127 Option 1 – Pre and Post Deployment Tasks

Lecture 128 Option 1 – Pipeline Variables

Lecture 129 Option 1 – Add Production Stage

Lecture 130 Option 2 – Overview

Lecture 131 Option 2 – YAML Build Pipeline Script Walkthrough

Lecture 132 Option 2 – Create YAML Build Pipeline

Lecture 133 Option 2 – Update Release Pipeline

Lecture 134 Option 2 – CI/CD End to End Testing

Lecture 135 Useful Links & Resources

Section 16: CI/CD Scenario – Data Lake Access

Lecture 136 Access to Data Lake Storage Overview

Lecture 137 Data Lake Storage Set-up

Lecture 138 Using Managed Identity – Grant access to Data Lake

Lecture 139 Using Managed Identity – Create Data Factory Pipeline

Lecture 140 Using Managed Identity – Release Pipeline changes

Lecture 141 Using Access Keys – Solution Options Overview

Lecture 142 Using Access Keys – Key Vault Set-up

Lecture 143 Using Access Keys – Create Data Factory Pipeline

Lecture 144 Using Access Keys – Release Pipeline Changes

Section 17: Conclusion

Lecture 145 Congratulations & Good Luck

Lecture 146 Bonus Lecture

Who this course is for

University students looking for a career in Data Engineering
IT developers working in other disciplines who are trying to move to Data Engineering
Data Engineers / Data Warehouse Developers currently working on on-premises technologies, or on other cloud platforms such as AWS or GCP, who want to learn Azure technologies
Data Architects looking to gain an understanding of the Azure Data Engineering stack
Data Scientists who want to extend their knowledge into data engineering

Course Information:

Udemy | English | 12h 46m | 4.78 GB
Created by: Ramesh Retnasamy
