Data Engineering for Beginner using Google Cloud Python

Basic data engineering: Python, pandas, Google Cloud Platform (GCP) BigQuery, Spark on Dataproc, GCS, data warehouse
File Size: 3.30 GB
Total length: 8h 3m
Instructor: Timotius Pamungkas
Language: English
Last update: 1/2023
Ratings: 4.4/5

What you’ll learn

Basic data engineering: what it is, why it is needed, and how to do it from zero
Relational database model: database modelling for normalization design, with hands-on using PostgreSQL and Python / pandas
NoSQL database model: denormalization design, with hands-on using Elasticsearch and Python / pandas
Introduction to Spark and Spark clusters using Google Cloud Platform

Requirements

Understanding of basic SQL statements (select, insert, update, delete is sufficient)
Understanding of basic Python / pandas
The course uses Google Cloud Platform. If you want to do the hands-on exercises, you need to provide credit card details for payment on Google Cloud. If you don’t, you can still watch the course videos.

Description

“Data is the new oil”. You might have heard the quote before. Data in the digital era is as valuable as oil was in the industrial era. However, just like oil, raw data itself is not usable. Rather, the value is created when it is gathered completely and accurately, connected to other relevant data, and done so in a timely manner.

Data engineers design and build pipelines that transform and transport data into a usable format. A different role, such as a data scientist or machine learning engineer, is then able to turn the data into valuable business insight, just as crude oil is transformed into petrol through a complex process.

Being a data engineer requires a lot of data literacy and practice. This course is the first step for those who want to learn about data engineering. In this course, we will go through theory and hands-on exercises to introduce you to data engineering. As the data field is very wide, this course will show you the basic, entry-level knowledge about data engineering processes and tools.

This course is well suited to building a foundation for entering the data field. In this course, we will learn about:

Introduction to data engineering
Relational & non-relational databases
Relational & non-relational data models
Table normalization
Fact & dimension tables
Table denormalization for data warehouse
ETL (Extract Transform Load) & data staging using Python pandas
Elasticsearch basics
Data warehouse
Numbers every engineer should know & how they relate to big data
Hadoop
Spark cluster on Google Cloud Dataproc
Data lake

Important Notes

The data field is HUGE! This course will be continuously updated, but for the time being, it contains an introduction to the concepts and sample hands-on exercises for data engineering. For now, this course is intended for beginners in data engineering.

If you have some programming experience and wonder about data engineering, this course is for you.
If you have experience in the data engineering field, this course might be too basic for you (although I’m very happy if you still purchase the course).
If you have never written Python or SQL before, this course is not for you. To understand the course, you must have basic knowledge of SQL and Python.

Overview

Section 1: Introduction

Lecture 1 Welcome to This Course

Lecture 2 Course Structure & Coverage

Lecture 3 How To Get Maximum Value From This Course

Section 2: Introduction to Data Engineering

Lecture 4 What is Data Engineering?

Lecture 5 Data Engineering Example

Lecture 6 What is Data Modelling?

Section 3: Database

Lecture 7 What is Database

Lecture 8 Relational Database

Lecture 9 When Not To Use Relational Database?

Lecture 10 NoSQL Database

Lecture 11 Demo : Postgresql

Lecture 12 Demo : Python for Postgresql

Lecture 13 Demo : Elasticsearch

Lecture 14 Demo : Python for Elasticsearch

Section 4: Relational Database Model

Lecture 15 The Importance of Relational Data Model

Lecture 16 OLTP vs OLAP

Lecture 17 Database Normalization

Lecture 18 First Normal Form (1NF)

Lecture 19 Second Normal Form (2NF)

Lecture 20 Third Normal Form (3NF)

Lecture 21 Normalization Python Demo

Lecture 22 Normalization Tips

Lecture 23 Database Denormalization

Lecture 24 Denormalization Python Demo

Lecture 25 Fact & Dimension Tables

Lecture 26 Star Schema

Lecture 27 Star Schema Python Demo

Lecture 28 Snowflake Schema

Lecture 29 Galaxy Schema

Lecture 30 Extract Transform Load (ETL) & Staging Tables

Lecture 31 ETL & Staging Tables – Demo Overview

Lecture 32 ETL & Staging Tables – Python Demo 1

Lecture 33 ETL & Staging Tables – Python Demo 2

Lecture 34 To Insert or To Update?

Lecture 35 ETL & Staging Tables – Python Demo 3

Lecture 36 ETL & Staging Tables – Python Demo 4

Lecture 37 ETL & Staging Tables – Tips

Section 5: NoSQL Database Model

Lecture 38 Basic NoSQL Concept

Lecture 39 CAP Theorem

Lecture 40 Denormalization on Elasticsearch

Lecture 41 Elasticsearch Basic Usage

Lecture 42 Elasticsearch Index & Document

Lecture 43 Elasticsearch ETL – Overview

Lecture 44 Elasticsearch Query DSL

Lecture 45 Elasticsearch ETL – Python Demo

Section 6: Data Warehouse

Lecture 46 Business Perspective

Lecture 47 Technical Perspective

Lecture 48 More Fact & Dimension Table

Lecture 49 OLAP Cube

Lecture 50 On-Premise or Cloud?

Lecture 51 Various Techniques

Lecture 52 Demo Overview

Lecture 53 Demo 1 – PostgreSQL Data Warehouse

Lecture 54 Demo 2 – BigQuery Data Warehouse

Lecture 55 Demo 3 – Data Warehouse Operations

Section 7: Numbers Every Engineer Should Know

Lecture 56 Numbers Every Engineer Should Know

Lecture 57 Small Numbers

Lecture 58 Big Numbers

Section 8: Hadoop & Spark

Lecture 59 Hadoop Ecosystem

Lecture 60 Introducing Spark

Lecture 61 Spark Programming

Lecture 62 Data Formats

Lecture 63 Hello Spark

Lecture 64 Spark Demo – Dataframe

Lecture 65 Spark Demo – Spark SQL

Lecture 66 Spark & BigQuery – Setting Environment

Lecture 67 Spark & BigQuery – ETL Movies

Lecture 68 Spark & BigQuery – Lesson Learned

Section 9: Spark Cluster on Google Cloud (Dataproc)

Lecture 69 Spark Cluster – Overview

Lecture 70 Demo : Big Data

Lecture 71 Google Dataproc

Section 10: Data Lake

Lecture 72 Data Lake Overview

Lecture 73 Schema On Read

Lecture 74 Lake, not Swamp

Lecture 75 Google Data Catalog

Section 11: Resources & References

Lecture 76 Download Source Code & Datasets

Lecture 77 Bonus & Discount Codes

Who this course is for

Beginner Python developers curious about data engineering
Software engineers who want to take the path of becoming a data engineer
Technical architects and engineering managers who want an overview of data engineering

Course Information:

Udemy | English | 8h 3m | 3.30 GB
Created by: Timotius Pamungkas
