Azure Databricks and Spark SQL Python

Hands-on course focusing on data engineering and analysis on Azure Databricks using Spark SQL
File size: 3.08 GB
Total length: 7h 21m



Malvik Vaghadia


Last updated 11/2022




What you’ll learn

Azure Databricks
Data Lakehouse
Delta Lakes
Spark SQL
Big Data
Real World Scenarios

Requirements


Basic SQL
Basic Python


Databricks is one of the most in-demand big data tools around. It is a fast, easy, and collaborative Spark-based big data analytics service designed for data science, ML and data engineering workflows.

The course is packed with lectures, code-along videos and dedicated challenge sections. This should be more than enough to keep you engaged and learning! As an added bonus you will also have lifetime access to all the lectures, and I have provided detailed notebooks as a downloadable asset. The notebooks contain step-by-step documentation with additional resources and links.

I have ensured that the delivery of the course is engaging and concise; the curriculum is extensive yet delivered in an efficient way. The course will provide you with hands-on training utilising a variety of different data sets.

The course is aimed at teaching you PySpark, Spark SQL in Python and the Databricks Lakehouse Architecture. You will primarily be using Databricks on Microsoft Azure in addition to other services such as Azure Data Lake Storage Gen 2.

The course will cover a variety of areas including:

Set Up and Overview
Azure Databricks Notebooks
Spark SQL
Reading and Writing Data
Data Analysis and Transformation with Spark SQL in Python
Charts and Dashboards in Databricks Notebooks
Databricks Medallion Architecture
Accessing Data in Cloud Object Storage
Hive Metastore
Databases, Tables and Views in Databricks
Delta Lake / Databricks Lakehouse Architecture


Section 1: Course Overview / Introduction to Spark and Databricks

Lecture 1 Course Introduction

Lecture 2 Big Data

Lecture 3 Hadoop, Spark and Databricks

Lecture 4 Apache Spark Architecture

Lecture 5 Spark vs Databricks Comparison

Lecture 6 Resource: Comparing Apache Spark vs Databricks

Section 2: Azure and Databricks Set Up

Lecture 7 Azure Account Set Up

Lecture 8 Azure UI Overview

Lecture 9 Resource: Azure Resources

Lecture 10 Creating your Databricks Service

Lecture 11 Databricks UI Overview

Lecture 12 Clusters

Lecture 13 Resource: Pricing, Cluster Pools and Runtime Versions

Lecture 14 How to use Databricks Notebooks

Lecture 15 Mix Languages and add Markdown text in your Notebook

Lecture 16 Databricks Utilities Module and FileStore Utilities

Lecture 17 Resource: How to use Notebooks

Lecture 18 IMPORTANT – Download Course Resource Notebooks

Lecture 19 Cost Management and Cancelling your Subscription

Lecture 20 Resource: Cancelling your Azure Subscription

Section 3: Reading and Writing Data

Lecture 21 Dataset Download

Lecture 22 Databricks FileStore

Lecture 23 Resource: File Types

Lecture 24 Reading Data

Lecture 25 Writing Data

Lecture 26 Parquet Files

Lecture 27 Deleting Files and Folders

Section 4: Data Analysis and Transformation with SparkSQL

Lecture 28 Selecting and Renaming Columns

Lecture 29 Adding New Columns

Lecture 30 Changing Data Types

Lecture 31 Math Functions and Simple Arithmetic

Lecture 32 Sort Functions

Lecture 33 String Functions

Lecture 34 Datetime Functions

Lecture 35 Filtering DataFrames

Lecture 36 Conditional Statements

Lecture 37 Using SQL Expressions with expr()

Lecture 38 Removing Columns

Lecture 39 Grouping your DataFrame

Lecture 40 Pivot your DataFrame

Lecture 41 Joining DataFrames

Lecture 42 Union

Lecture 43 Unpivot your DataFrame

Lecture 44 Pandas

Section 5: Utilising the Medallion Architecture in Databricks

Lecture 45 Medallion Architecture

Lecture 46 Resource: Medallion Architecture

Section 6: Challenge Section: Customer Orders

Lecture 47 Dataset Download and DBFS Upload

Lecture 48 Assignment 1: Bronze to Silver

Lecture 49 Assignment 1 Solutions Walkthrough

Lecture 50 Assignment 2: Silver to Gold

Lecture 51 Assignment 2 Solutions Walkthrough

Section 7: Visualizations and Dashboards

Lecture 52 Visualizations and Dashboards

Section 8: Accessing Data from Azure Data Lake Storage (ADLS) with Databricks

Lecture 53 Creating an ADLS Gen2 Account

Lecture 54 (Optional) Storage Explorer

Lecture 55 Accessing via Access Keys

Lecture 56 Accessing via SAS Token

Lecture 57 Mounting ADLS to DBFS Overview

Lecture 58 Mounting ADLS to DBFS Demo

Lecture 59 Secret Scopes

Lecture 60 End to End Walkthrough Example

Section 9: Hive Metastore, Databases, Tables and Views

Lecture 61 Running SQL on DataFrames

Lecture 62 Hive Metastore and Creating Databases

Lecture 63 Managed Tables

Lecture 64 Specifying a Location for your Underlying Managed Table Data

Lecture 65 Unmanaged (External) Tables

Lecture 66 Permanent Views

Section 10: Challenge Section: Employees

Lecture 67 Dataset Download and ADLS Upload

Lecture 68 Assignment: Employees

Lecture 69 Assignment Solutions Walkthrough

Section 11: Databricks Data Lakehouse / Delta Lake

Lecture 70 Databricks Data Lakehouse / Delta Lake Overview

Lecture 71 Delta Lake Data Files

Lecture 72 Deleting and Updating Records

Lecture 73 Merge Into

Lecture 74 Table Utility Commands

Section 12: Modularize Code and Link Notebooks

Lecture 75 Running a Notebook from another Notebook

Lecture 76 Text Widgets

Section 13: Challenge Section: Health Updates

Lecture 77 Dataset Download and Overview

Lecture 78 Assignment 1 Overview

Lecture 79 Assignment 1 Solutions Walkthrough

Lecture 80 Assignment 2 Overview (Difficult!)

Lecture 81 Assignment 2 Solutions Walkthrough

Who this course is for

Anyone interested in working with Big Data and Spark
Anyone interested in working with Databricks
Anyone interested in working with cloud platforms

Course Information:

Udemy | English | 7h 21m | 3.08 GB
Created by: Malvik Vaghadia
