Azure Data Factory For Data Engineers Project on Covid19
What you’ll learn
You will learn how to build a real-world data pipeline in Azure Data Factory (ADF). The course is taught using the real-world data used to report Covid-19 trends.
You will acquire good data engineering skills in Azure using Azure Data Factory (ADF), Azure Data Lake Storage Gen2, Azure SQL Database, Azure Blob Storage and Azure Monitor.
You will learn how to ingest data from sources such as HTTP and Azure Blob Storage into Azure Data Lake Gen2 using Azure Data Factory (ADF)
You will learn how to transform data using Data Flows in Azure Data Factory (ADF) and load into Azure Data Lake Storage Gen2
You will learn how to transform data using Databricks Notebook Activity in Azure Data Factory (ADF) and load into Azure Data Lake Storage Gen2
You will learn how to transform data using Azure HDInsight Activity in Azure Data Factory (ADF) and load into Azure Data Lake Storage Gen2
You will learn how to load transformed data from Azure Data Lake Storage Gen2 to Azure SQL Database using Azure Data Factory (ADF)
You will learn extensively about Triggers in Azure Data Factory (ADF) and how to use them to schedule the data pipelines.
You will learn how to monitor pipelines using Azure Data Factory (ADF), Azure Monitor and Log Analytics with a real-world project.
You will learn how to build production-ready pipelines, along with good practices and naming standards.
You will gain most of the skills required to pass the Azure Data Engineer Associate certification exam DP-203 (which replaced DP-200 and DP-201), but the primary objective of the course is not to teach you to pass the exam.
Requirements
A basic understanding of cloud computing is useful, but not necessary.
Experience in Azure is not required. I will take you through everything necessary to follow this course and build the project.
An Azure account is required. If you don't have one, we will create a free account in the course.
Description
Major updates to the course since the launch
January 2023 – Updates to section 3 (Environment Set-up) to reflect the change to the User Interface. Re-recorded 5 lessons.
November 2022 – Addition of sections 15 & 16 focusing on Continuous Integration & Continuous Delivery (CI/CD)
Welcome! I am looking forward to helping you learn one of the in-demand data engineering tools in the cloud, Azure Data Factory (ADF)! This course is taught by implementing a data engineering solution using Azure Data Factory (ADF) for the real-world problem of reporting Covid-19 trends and predicting the spread of the virus.
This is like no other course on Udemy for Azure Data Factory or data engineering technologies. Once you have completed the course, including all the assignments, I strongly believe that you will be in a position to start a real-world data engineering project on your own and will be proficient in Azure Data Factory (ADF). I have also included lessons on storage solutions such as Azure Data Lake Storage, Azure Blob Storage and Azure SQL Database, as well as lessons on Azure HDInsight and Azure Databricks. I have even included lessons on building reports using Power BI on the data processed by the Azure Data Factory data pipelines. I have considered machine learning models to be out of scope. You can use this data to build your own models and predict the spread.
The course follows the logical progression of a real-world project implementation, with technical concepts being explained and the data pipelines in Azure Data Factory (ADF) being built at the same time. Even though this course is not specifically designed to teach you the skills required for passing the Azure Data Engineer Associate certification exam DP-203, it can greatly help you gain most of the skills required for the exam.
I value your time as much as I do mine, so I have designed this course to be fast-paced and to the point. The course is also taught in simple English with no jargon. I start the course from the basics, and by the end of the course you will be proficient in the technologies used.
Currently the course teaches you the following:
Azure Data Factory
Building a solution architecture for a data engineering solution using Azure Data Engineering technologies such as Azure Data Factory (ADF), Azure Data Lake Gen2, Azure Blob Storage, Azure SQL Database, Azure Databricks, Azure HDInsight and Microsoft Power BI.
Integrating data from HTTP clients, Azure Blob Storage and Azure Data Lake Gen2 using Azure Data Factory.
Branching and chaining activities in Azure Data Factory (ADF) pipelines using control flow activities such as Get Metadata, If Condition, ForEach, Delete, Validation etc.
Using Parameters and Variables in Pipelines, Datasets and Linked Services to create metadata-driven pipelines in Azure Data Factory (ADF).
Debugging the data pipelines and resolving issues.
Scheduling pipelines using triggers such as Event Trigger, Schedule Trigger and Tumbling Window Trigger in Azure Data Factory (ADF).
Creating Mapping Data Flows to create transformation logic. The course covers all of the transformation steps such as Source, Filter, Select, Pivot, Lookup, Conditional Split, Derived Column, Aggregate, Join and Sink transformation.
Debugging data flows, investigating issues, fixing failures etc.
Implementing Azure Data Factory pipelines to invoke Mapping Data Flows and executing them.
Creating ADF pipelines to execute HDInsight activities and carry out data transformations.
Creating ADF pipelines to execute Databricks Notebook activities to carry out transformations.
Creating dependency between pipelines to orchestrate the data flow.
Creating dependency between triggers to orchestrate the data flow.
Monitoring data pipelines, creating alerts, and reporting on metrics from the Azure Data Factory Monitor.
Monitoring Data Factory pipelines using Azure Monitor and configuring diagnostic settings to forward logs to an Azure Storage Account or a Log Analytics Workspace.
Creating a Log Analytics workspace, and creating workbooks and charts from Log Analytics on the Azure Data Factory pipelines.
Implementing the Azure Data Factory Analytics monitoring tool and extending its capability further.
Azure Storage Solutions
Creating Azure Storage Account, Creating containers, Uploading data, Access Control (IAM), Using Azure Storage Explorer to interact with the storage account.
Creating Azure Data Lake Gen2, Creating containers, Uploading data, Access Control (IAM), Using Azure Storage Explorer to interact with the storage account.
Creating Azure SQL Database, Pricing Tiers, Creating Admin User, Creating Tables, Loading Data and Querying the database.
Azure HDInsight & Databricks
Creating HDInsight Clusters, Interacting with the UI, Using Ambari, Creating Hive tables, Invoking HDInsight activities from Azure Data Factory.
Creating Azure Databricks Workspace, Creating Databricks clusters, Mounting storage accounts, Creating Databricks notebooks, Performing transformations using Databricks notebooks, Invoking Databricks notebooks from Azure Data Factory.
Overview
Section 1: Introduction
Lecture 1 Course Introduction
Lecture 2 Course Structure
Lecture 3 Course Slides Download
Section 2: Overviews
Lecture 4 Azure Data Factory Overview
Lecture 5 Project Overview
Lecture 6 Solution Architecture Overview
Lecture 7 Azure Storage Solutions Overview
Lecture 8 Useful Links & Resources
Section 3: Environment Set-up
Lecture 9 Environment Set-up – Module Overview
Lecture 10 Creating Azure Free Account
Lecture 11 Azure Portal Overview
Lecture 12 Creating Azure Data Factory
Lecture 13 Creating Azure Storage Account
Lecture 14 Installing Azure Storage Explorer
Lecture 15 Creating Azure Data Lake Storage Gen2
Lecture 16 Creating Azure SQL Database
Lecture 17 Installing Azure Data Studio
Lecture 18 Useful Links & Resources
Section 4: Data Ingestion from Azure Blob
Lecture 19 Data Ingestion from Azure Blob – Module Overview
Lecture 20 Copy Activity Overview
Lecture 21 Environment Preparation
Lecture 22 Naming Standards
Lecture 23 Linked Services & Data Sets
Lecture 24 Creating ADF Pipeline
Lecture 25 Control Flow Activities (1) – Validation Activity
Lecture 26 Control Flow Activities (2) – Get Metadata, If Condition, Web Activities
Lecture 27 Control Flow Activities (3) – Delete Activity
Lecture 28 ADF Triggers Overview
Lecture 29 Creating Event Trigger
Lecture 30 Useful Links & Resources
Section 5: Data Ingestion From HTTP
Lecture 31 Data Ingestion From HTTP – Module Overview
Lecture 32 Important – Recent Changes to ECDC Data
Lecture 33 ECDC Data Overview
Lecture 34 Create Pipeline
Lecture 35 Frequently Asked Questions – Please Read
Lecture 36 Pipeline Variables
Lecture 37 Pipeline Parameters & Schedule Trigger
Lecture 38 Control Flow Activities
Lecture 39 ADF potential bug in the next lesson – Please Read
Lecture 40 Linked Service Parameters
Lecture 41 Metadata Driven Pipeline
Lecture 42 Useful Links & Resources
Section 6: Data Flows – Cases & Deaths Data Transformation
Lecture 43 Data Flows(1) – Module Overview
Lecture 44 Introduction to Data Flows
Lecture 45 Data Flow UI Overview
Lecture 46 Transformation Requirement Overview
Lecture 47 Source Transformation
Lecture 48 Filter Transformation
Lecture 49 Select Transformation
Lecture 50 Pivot Transformation
Lecture 51 Lookup Transformation
Lecture 52 Sink Transformation
Lecture 53 Create ADF Pipeline
Lecture 54 Useful Links & Resources
Section 7: Data Flows – Hospital Admissions Data Transformation
Lecture 55 Data Flows(2) – Module Overview
Lecture 56 Transformation Requirement
Lecture 57 Source Transformation (Assignment)
Lecture 58 Select Transformation (Assignment)
Lecture 59 Lookup Country (Assignment)
Lecture 60 Conditional Split Transformation
Lecture 61 Source Transformation – DimDate
Lecture 62 Derived Column Transformation
Lecture 63 Aggregate Transformation
Lecture 64 Join Transformation
Lecture 65 Pivot Transformation (Assignment)
Lecture 66 Sort Transformation
Lecture 67 Sink Transformation (Assignment)
Lecture 68 Create ADF Pipeline (Assignment)
Lecture 69 Useful Links & Resources
Section 8: Prepare Data for HDInsight & Data Bricks
Lecture 70 Prepare Data for HDInsight & Data Bricks
Section 9: HDInsight Activity
Lecture 71 HDInsight Activity – Module Overview
Lecture 72 Note for Azure Free Tier & Student Subscription users
Lecture 73 Create HDInsight Cluster
Lecture 74 Tour of the HDInsight UI
Lecture 75 Transformation Requirement
Lecture 76 Hive Script Walkthrough
Lecture 77 Create ADF Pipeline with Hive Activity
Lecture 78 Delete HDInsight Cluster
Lecture 79 Useful Links & Resources
Section 10: Data Bricks Activity
Lecture 80 Data Bricks Activity – Module Overview
Lecture 81 Cluster Configuration – Only for Free and Student Subscriptions
Lecture 82 Create Azure Databricks Service
Lecture 83 Create Azure Databricks Cluster
Lecture 84 Mounting Azure Data Lake Storage
Lecture 85 Transformation Requirements
Lecture 86 Create ADF Pipeline Databricks Notebook Activity
Lecture 87 Only for students using Free Azure Subscription
Lecture 88 Useful Links & Resources
Section 11: Copy Data to Azure SQL
Lecture 89 Copy Data to Azure SQL – Module Overview
Lecture 90 Copy Data Activity – Cases & Deaths Data
Lecture 91 Copy Data Activity – Hospital Admissions Data
Lecture 92 Copy Data Activity – Testing Data
Lecture 93 Useful Links and Resources
Section 12: Making Pipelines Production Ready
Lecture 94 Making Pipelines Production Ready – Module Overview
Lecture 95 Option 1 – Pipeline Dependency
Lecture 96 Option 2 – Trigger Dependency
Lecture 97 Useful Links & Resources
Section 13: Monitoring
Lecture 98 Monitoring – Module Overview
Lecture 99 What to Monitor & How
Lecture 100 Azure Data Factory Monitor
Lecture 101 Creating Alerts
Lecture 102 Monitor Pipeline Failures
Lecture 103 Re-run Failed Pipelines
Lecture 104 Reporting on Metrics
Lecture 105 Introduction to Azure Monitor
Lecture 106 Introduction to Log Analytics
Lecture 107 Log Analytics Further capabilities
Lecture 108 Azure data factory analytics
Lecture 109 Useful Links & Resources
Section 14: Power BI Reports
Lecture 110 Power BI Reports – Module Overview
Lecture 111 Introduction to PowerBI Desktop
Lecture 112 Walk through the Covid-19 Report
Lecture 113 Useful Links & Resources
Section 15: Continuous Integration / Continuous Delivery (CI/CD)
Lecture 114 Continuous Integration/ Continuous Delivery – Module Overview
Lecture 115 Introduction to Continuous Integration/ Continuous Delivery (CI/CD)
Lecture 116 Introduction to CI/CD for Azure Data Factory
Lecture 117 Overview of Azure DevOps
Lecture 118 Azure DevOps Environment Set-up
Lecture 119 Azure Data Factory Environment Set-up
Lecture 120 Azure Data Factory Git Configuration
Lecture 121 Azure Data Factory Code Development using Git
Lecture 122 Option 1 – Manual Build
Lecture 123 Option 1 – Release Pipeline Design
Lecture 124 Option 1 – Creating ARM Deployment Task
Lecture 125 Option 1 – Testing ARM Deployment Task
Lecture 126 Option 1 – Pitfalls of ARM Deployment Task
Lecture 127 Option 1 – Pre and Post Deployment Tasks
Lecture 128 Option 1 – Pipeline Variables
Lecture 129 Option 1 – Add Production Stage
Lecture 130 Option 2 – Overview
Lecture 131 Option 2 – YAML Build Pipeline Script Walkthrough
Lecture 132 Option 2 – Create YAML Build Pipeline
Lecture 133 Option 2 – Update Release Pipeline
Lecture 134 Option 2 – CI/CD End to End Testing
Lecture 135 Useful Links & Resources
Section 16: CI/CD Scenario – Data Lake Access
Lecture 136 Access to Data Lake Storage Overview
Lecture 137 Data Lake Storage Set-up
Lecture 138 Using Managed Identity – Grant access to Data Lake
Lecture 139 Using Managed Identity – Create Data Factory Pipeline
Lecture 140 Using Managed Identity – Release Pipeline changes
Lecture 141 Using Access Keys – Solution Options Overview
Lecture 142 Using Access Keys – Key Vault Set-up
Lecture 143 Using Access Keys – Create Data Factory Pipeline
Lecture 144 Using Access Keys – Release Pipeline Changes
Section 17: Conclusion
Lecture 145 Congratulations & Good Luck
Lecture 146 Bonus Lecture
Who this course is for
University students looking for a career in Data Engineering
IT developers working in other disciplines trying to move to Data Engineering
Data Engineers/Data Warehouse Developers currently working on on-premises technologies, or other cloud platforms such as AWS or GCP, who want to learn Azure technologies
Data Architects looking to gain an understanding of the Azure Data Engineering stack
Data Scientists who want to extend their knowledge into data engineering
Course Information:
Udemy | English | 12h 46m | 4.78 GB
Created by: Ramesh Retnasamy