Azure Synapse Analytics For Data Engineers Hands On Project
What you’ll learn
You will learn how to build a real-world project using Azure Synapse Analytics. The course is taught using real-world NYC Taxi Trips data
You will acquire professional level data engineering skills in Azure Synapse Analytics
You will learn how to create SQL scripts and Spark notebooks in Azure Synapse Analytics
You will learn how to create dedicated SQL pools and Spark pools in Azure Synapse Analytics
You will learn how to enable Synapse Link and the analytical store in Cosmos DB
You will learn how to ingest and transform data using Serverless SQL Pool and Spark Pool
You will learn how to load data into dedicated SQL Pool
You will learn how to serve data to Power BI from Serverless SQL Pool and Dedicated SQL Pool
You will learn how to execute scripts and notebooks using Synapse Pipelines and Triggers
You will learn how to do operational reporting from the data stored in Cosmos DB using Azure Synapse Analytics
You will learn how to build reports in Power BI for the data stored in Azure Synapse Analytics
Requirements
All the code and step-by-step instructions are provided, but the skills below will greatly benefit your journey
Basic SQL knowledge will be required
Basic Python programming experience will be required
Knowledge of cloud fundamentals will be beneficial, but not necessary
An Azure subscription will be required. If you don’t have one, we will create a free account in the course
Description
Welcome! I am looking forward to helping you learn one of the in-demand data engineering tools in the cloud, Azure Synapse Analytics! This course is taught by implementing a data engineering solution using Azure Synapse Analytics for a real-world project that analyses and reports on NYC Taxi Trips data.
This is like no other course on Udemy for Azure Synapse Analytics. Once you have completed the course, including all the assignments, I strongly believe that you will be in a position to start a real-world data engineering project on your own and will be proficient in Azure Synapse Analytics.
The primary focus of the course is Azure Synapse Analytics, but it also covers the relevant concepts and connectivity to the other technologies mentioned. The course follows the logical progression of a real-world project implementation, with technical concepts explained and the scripts and notebooks built at the same time.
Even though this course is not specifically designed to teach you the skills required to pass the exams Azure Data Engineer Associate Certification [DP-203] or Designing and Implementing Enterprise-Scale Analytics Solutions Using Microsoft Azure and Microsoft Power BI [DP-500], it can help you gain most of the skills required for them.
I value your time as much as I do mine, so I have designed this course to be fast-paced and to the point. The course is also taught in simple English with no jargon. I start from the basics, and by the end of the course you will be proficient in the technologies used.
Currently the course teaches you the following:
Azure Synapse Analytics Architecture
Serverless SQL Pool
Spark Pool
Dedicated SQL Pool
Synapse Pipelines
Synapse Link for Cosmos DB / Hybrid Transactional and Analytical Processing (HTAP) capability
Power BI Integration with Azure Synapse Analytics
Azure Data Lake Storage Gen2 integration with Azure Synapse Analytics
Project using NYC Taxi Trips data using the above technologies
Please note that the following are not currently covered:
Data Flows
Advanced concepts around Dedicated SQL Pool
Spark Programming
SQL Fundamentals
Overview
Section 1: Introduction
Lecture 1 Course Introduction
Lecture 2 Course Materials Download
Lecture 3 Useful Links
Section 2: Azure Subscription (Optional)
Lecture 4 Creating Azure Account
Lecture 5 Azure Portal Overview
Section 3: Azure Synapse Analytics Overview
Lecture 6 Introduction to Azure Synapse Analytics
Lecture 7 History of Data Warehouse/ Data Lakes
Lecture 8 Emergence of Azure Synapse Analytics
Lecture 9 Create Azure Synapse Analytics Workspace
Lecture 10 Azure Synapse Analytics Workspace Overview
Lecture 11 Azure Synapse Studio Overview
Lecture 12 Data Hub Overview
Lecture 13 Develop Hub Overview
Lecture 14 Integrate Hub Overview
Lecture 15 Monitor Hub Overview
Lecture 16 Manage Hub Overview
Section 4: NYC Taxi Project Overview
Lecture 17 Section Overview
Lecture 18 NYC Taxi Data Source Overview
Lecture 19 NYC Taxi Data Files Overview
Lecture 20 Upload NYC Taxi Data to Data Lake
Lecture 21 Project Requirements Overview
Lecture 22 Solution Architecture Overview
Section 5: Serverless SQL Pool – Overview
Lecture 23 Section Overview
Lecture 24 Introduction to Serverless SQL Pool
Lecture 25 Serverless SQL Pool Cost Control
Lecture 26 Connect from Azure Data Studio to Serverless SQL Pool (Optional)
Section 6: Serverless SQL Pool – Query CSV
Lecture 27 Section Overview
Lecture 28 OPENROWSET Function Overview
Lecture 29 Query Taxi Zone File (CSV File)
Lecture 30 Specify Data Types
Lecture 31 Specify Collation
Lecture 32 Query Subset of Columns
Lecture 33 Debugging & Identifying Errors
Lecture 34 Use External Data Source
Lecture 35 Query Calendar File (CSV File) – Assignment
Lecture 36 Query Vendor File (Quoted and Escaped Columns)
Lecture 37 Query Trip Type File (TSV File) – Assignment
Section 7: Serverless SQL Pool – Query JSON
Lecture 38 Section Overview
Lecture 39 Query Payment Type (Single Line JSON) – JSON_VALUE Function
Lecture 40 Query Payment Type (Single Line JSON) – OPENJSON Function
Lecture 41 Query JSON Array
Lecture 42 Query Standard JSON
Lecture 43 Query Multi Line JSON (Assignment)
Section 8: Serverless SQL Pool – Query Folders & Multiple Files
Lecture 44 Query Folders and Subfolders
Lecture 45 File Metadata Functions
Section 9: Serverless SQL Pool – Query Columnar Formats
Lecture 46 Query Single Parquet File
Lecture 47 Query Folders and Sub Folders (Assignment)
Lecture 48 Query Delta files
Section 10: Serverless SQL Pool – Data Discovery
Lecture 49 Data Discovery Overview
Lecture 50 Identify Duplicates
Lecture 51 Data Quality Checks
Lecture 52 Joining Files
Lecture 53 Transform Data
Lecture 54 Data Discovery Assignment
Section 11: Serverless SQL Pool – Data Virtualisation
Lecture 55 Data Virtualisation Overview
Lecture 56 Introduction to External Tables
Lecture 57 Create External Table – CSV
Lecture 58 Handling Rejections
Lecture 59 Create External Table – CSV (Assignment)
Lecture 60 Create External Table – Parquet
Lecture 61 Create External Table – Delta (Assignment)
Lecture 62 Views – Introduction
Lecture 63 Create View for JSON files
Lecture 64 Create View for JSON files (Assignment)
Lecture 65 Partition Pruning
Section 12: Serverless SQL Pool – Data Ingestion
Lecture 66 Section Overview
Lecture 67 CREATE EXTERNAL TABLE AS (CETAS)
Lecture 68 Transform to Parquet
Lecture 69 Transform to Parquet (Assignment)
Lecture 70 Transform JSON to Parquet
Lecture 71 Transform JSON to Parquet (Assignment)
Lecture 72 Transform Partitioned Data to Parquet
Lecture 73 Introduction to Stored Procedure
Lecture 74 Transform Partitioned Data – Solution
Lecture 75 Transform Partitioned Data – Solution Lab
Lecture 76 Create View (assignment)
Section 13: Serverless SQL Pool – Data Transformation
Lecture 77 Section Overview
Lecture 78 Project Requirements
Lecture 79 Select Statement
Lecture 80 Stored Procedure
Lecture 81 Create View
Lecture 82 Assignment
Section 14: Synapse Pipelines & Triggers
Lecture 83 Section Overview
Lecture 84 Synapse Pipelines Overview
Lecture 85 Synapse Pipeline Components
Lecture 86 Transformation Pipeline Design – Taxi Zone
Lecture 87 Create Linked Service & Dataset
Lecture 88 Create Pipeline – Delete Activity
Lecture 89 Create Pipeline – Script Activity
Lecture 90 Create Pipeline – Stored Procedure Activity
Lecture 91 Parameters & Variables Overview
Lecture 92 Parameters & Variables (Demo)
Lecture 93 Dynamic Pipeline
Lecture 94 Pipeline Design – Partitioned File
Lecture 95 Create Pipeline (Partitioned File) – Delete Activity
Lecture 96 Create Pipeline (Partitioned File) – Stored Procedure/ Script Activity
Lecture 97 Create Pipeline (Partitioned File) – Assignment (Gold Table)
Lecture 98 Pipeline Dependencies
Lecture 99 Manual Trigger
Lecture 100 Automated Trigger
Section 15: Spark Pool
Lecture 101 Section Overview
Lecture 102 Spark Pool Overview
Lecture 103 Create Spark Pool
Lecture 104 Notebooks Overview
Lecture 105 Integration between spark pool and serverless sql pool
Lecture 106 Trip Data Aggregation
Lecture 107 Create Synapse Pipeline
Section 16: Power BI Integration
Lecture 108 Section Overview
Lecture 109 Power BI Integration Overview
Lecture 110 Connecting from PowerBI Desktop
Lecture 111 Publish to PowerBI Workspace
Lecture 112 Access to Power BI
Lecture 113 Synapse Studio Power BI Integration
Lecture 114 Report from Synapse Studio
Lecture 115 Campaign Report
Lecture 116 Demand Report
Section 17: Synapse Link/ HTAP
Lecture 117 Section Overview
Lecture 118 Synapse Link Overview
Lecture 119 Project Overview
Lecture 120 Create Cosmos DB Service
Lecture 121 Data Preparation
Lecture 122 Query using Serverless SQL Pool
Lecture 123 Query using Spark Pool
Section 18: Dedicated SQL Pool
Lecture 124 Section Overview
Lecture 125 Dedicated SQL Pool Overview
Lecture 126 Create dedicated SQL Pool
Lecture 127 Project Requirement
Lecture 128 Copy to Dedicated Pool – Polybase
Lecture 129 Copy to Dedicated Pool – Copy Command
Lecture 130 Connection to Azure Data Studio & PowerBI
Section 19: Next Steps
Lecture 131 Good Luck
Lecture 132 Bonus Lecture
Who this course is for
University students looking for a career in Data Engineering
IT developers working in other disciplines who are trying to move to Data Engineering
Data Engineers/ Data Warehouse Developers currently working on on-premises technologies, or on other cloud platforms such as AWS or GCP, who want to learn Azure Data Technologies
Data Architects looking to gain an understanding of the Azure Data Engineering stack
Course Information:
Udemy | English | 13h 28m | 4.77 GB
Created by: Ramesh Retnasamy