BigQuery for Big Data Engineers: Master BigQuery Internals

A complete, in-depth BigQuery guide for data engineers and analysts, with hands-on BigQuery via the web console, CLI, and Python client library.
File size: 2.26 GB
Total length: 8h 30m
Instructor: J Garg - Real Time Learning
Language: English
Last update: 1/2023
Rating: 4.4/5

What you’ll learn

Learn Google Cloud BigQuery inside and out, from scratch, with hands-on examples.
Get an overview of Google Cloud Platform and a brief introduction to the set of services it provides.
Start with core BigQuery concepts: architecture, datasets, tables, views, materialized views, scheduled queries, and limitations and quotas.
Move on to advanced BigQuery topics such as the query execution plan, efficient schema design, optimization techniques, partitioning, and clustering.
Build big data pipelines using Google Cloud Platform services: Dataflow, Pub/Sub, BigQuery, Cloud Storage, Apache Beam, Data Studio, and Cloud Composer/Airflow.
Learn to interact with BigQuery using the web console, the command line, and the Python client library.
Learn best practices for performance and cost savings in real-world projects, for every component of BigQuery.
Understand BigQuery pricing models for storage, querying, API requests, and DML, plus free operations.
Datasets and queries used in the lectures are available in the resources tab, which saves you typing.

Requirements

Basic knowledge of SQL

Description

Note: This BigQuery course is NOT intended to teach SQL or PostgreSQL. Its focus is to give you in-depth knowledge of Google BigQuery concepts and internals.

“BigQuery is a serverless, highly scalable, and cost-effective data warehouse designed for Google Cloud Platform (GCP) to store and query petabytes of data.”

What’s included in the course?

Brief introduction to the set of services Google Cloud provides.
Complete, in-depth coverage of Google BigQuery concepts, explained from scratch through advanced topics to real-world implementation.
Every BigQuery concept is explained with hands-on examples.
Covers even the fine details of BigQuery.
Learn to interact with BigQuery using its web console, the bq CLI, and the Python client library.
Create, load, modify, and manage BigQuery datasets, tables, views, materialized views, etc.
*Exclusive*: query execution plan, efficient schema design, optimization techniques, partitioning, clustering.
Build and deploy end-to-end data pipelines (batch and stream) for real-world case studies in GCP.
Services used in the pipelines: Dataflow, Apache Beam, Pub/Sub, BigQuery, Cloud Storage, Data Studio, Cloud Composer/Airflow, etc.
Learn best practices and optimization techniques to follow in real-world Google Cloud BigQuery projects.

After completing this course, you can start working on any BigQuery project with full confidence.

Add-ons

Questions and queries will be answered very quickly.
Queries and datasets used in the lectures are attached to the course for your convenience.
I am going to update the course frequently, adding new components of BigQuery each time.

Overview

Section 1: Introduction to GCP & its services

Lecture 1 Introduction to Google Cloud Platform

Lecture 2 GCP vs AWS vs Azure – Why choose GCP

Lecture 3 Compute Services in GCP

Lecture 4 Storage Services in GCP

Lecture 5 Big data Services in GCP

Lecture 6 AI & ML Services in GCP

Lecture 7 Big data ecosystem in GCP

Section 2: Introduction to BigQuery

Lecture 8 Conventional Data Warehouse Problems

Lecture 9 What is BigQuery

Lecture 10 BigQuery Out-of-the-Box Features

Lecture 11 Architecture of BigQuery

Section 3: Dataset & Table creation

Lecture 12 Setup a GCP account

Lecture 13 Important note

Lecture 14 Create a Project

Lecture 15 BigQuery UI Tour

Lecture 16 Region Vs Multi-region

Lecture 17 Create a Dataset

Lecture 18 Create a Table

Section 4: Using BigQuery Dashboard options

Lecture 19 Running query with various Query Settings

Lecture 20 Caching features & limitations

Lecture 21 Querying Wildcard Tables

Lecture 22 Wildcard Table Limitations

Lecture 23 Schedule, Save, Share a Query

Lecture 24 Schema Auto detection

Section 5: Efficient Schema Design in BigQuery

Lecture 25 Design an Efficient schema for BigQuery Tables

Lecture 26 Nested & Repeated Columns

Section 6: Operations on Datasets & Tables

Lecture 27 Copying Datasets

Lecture 28 Transfer Service for scheduling Copy Jobs

Lecture 29 Native operations on Table for Schema change

Lecture 30 Manual operations on Table

Section 7: Execution Plan of BigQuery

Lecture 31 How BigQuery creates Execution Plan of a Query

Lecture 32 Understanding Execution Plan in UI Dashboard

Section 8: Partitioned Tables in BigQuery

Lecture 33 What is Partitioning & its benefits

Lecture 34 Ingestion time Partitioned Tables

Lecture 35 Date column Partitioned Tables

Lecture 36 Integer based Partitioned Tables

Lecture 37 ALTER, COPY operations on Partitioned Tables

Lecture 38 DML operations on Partitioned Tables

Lecture 39 Best Practices for Partitioning

Section 9: Clustered Tables in BigQuery

Lecture 40 What is Clustering

Lecture 41 When to use Clustering OR Partitioning OR Both

Lecture 42 Create Clustered Table

Lecture 43 Dos & Don’ts for Clustering

Section 10: Loading & Querying External Data Sources

Lecture 44 Introduction and Create Cloud Storage Bucket

Lecture 45 Create & Query Permanent Table on Cloud Storage bucket

Lecture 46 External data source Limitations

Section 11: Views in BigQuery

Lecture 47 Introduction to Views & its Advantages

Lecture 48 Create Views in BigQuery

Lecture 49 Restrict rows at User level in Views

Lecture 50 Limitations of Views

Section 12: Materialized Views in BigQuery

Lecture 51 What are Materialized Views

Lecture 52 Create a Materialized View

Lecture 53 ALTER Materialized View

Lecture 54 Design an optimized query for Materialized View

Lecture 55 Auto & Manual Refreshes of Materialized Views

Lecture 56 Limitations & Quotas of Materialized Views

Lecture 57 Best Practices in Materialized Views

Section 13: BQ Command Line

Lecture 58 Introduction

Lecture 59 Cloud SDK Setup

Lecture 60 BQ Basic commands

Lecture 61 BQ – Querying Commands

Lecture 62 BQ – Dataset creation command

Lecture 63 BQ – Create all types of Tables

Lecture 64 BQ – Load data into Table

Lecture 65 BQ – Exclusive operations

Section 14: Python Client Library of BigQuery

Lecture 66 Setup

Lecture 67 Python code to create dataset

Lecture 68 Python code to create table

Lecture 69 Python code to query tables

Section 15: Build End-to-End Data Pipelines (Apache Beam)

Lecture 70 Case Study Requirements

Lecture 71 GCP approach to case study

Lecture 72 Apache Beam Pipeline creation

Lecture 73 Write Transformations in Beam

Lecture 74 Write to BigQuery

Lecture 75 Create View for Daily data

Lecture 76 Python 3 code

Lecture 77 Run the Beam Pipeline

Lecture 78 Create Reports in Cloud DataStudio

Lecture 79 Create monthly reports in DataStudio

Lecture 80 Write Airflow DAG to schedule

Lecture 81 Create Cloud Composer environment and run DAG

Section 16: Build Streaming Data Pipelines

Lecture 82 Introduction

Lecture 83 Google Pub/Sub Architecture

Lecture 84 Publish messages to Pub/Sub

Lecture 85 Beam pipeline for Streaming data

Section 17: BigQuery Pricing

Lecture 86 Storage Pricing

Lecture 87 Query Pricing

Lecture 88 API, DML pricing

Lecture 89 Free operations in BigQuery

Lecture 90 Google Cloud Pricing Calculator

Section 18: Best Practices / Optimization Techniques

Lecture 91 Introduction

Lecture 92 Methods to restrict data scan

Lecture 93 Ways to reduce CPU time

Lecture 94 Which SQL anti-patterns to avoid

Section 19: Additional Learnings – Different File Formats & Apache Beam

Lecture 95 What do we need from a File

Lecture 96 Text, Sequence, Avro Files

Lecture 97 RC, ORC, Parquet Files

Lecture 98 Performance Test results of Various Files

Lecture 99 Which File Format to choose

Lecture 100 Introduction to Apache Beam

Lecture 101 Batch Vs Stream processing

Lecture 102 Thank you

Section 20: BONUS

Lecture 103 Bonus

Who this course is for

Students who want to learn the deep internals of BigQuery components.
Data engineers intending to build end-to-end data pipelines in GCP (Google Cloud Platform).
Anyone planning to take the Google Cloud Data Engineer certification.

Course Information:

Udemy | English | 8h 30m | 2.26 GB
Created by: J Garg – Real Time Learning
