BigQuery for Big Data Engineers: Master BigQuery Internals
What you’ll learn
Learn the ins and outs of Google Cloud BigQuery from scratch, with hands-on examples.
Get an Overview of Google Cloud Platform and a brief introduction to the set of services it provides.
Start with BigQuery core concepts: its architecture, datasets, tables, views, materialized views, scheduled queries, and limitations and quotas.
Advanced BigQuery topics such as the query execution plan, efficient schema design, optimization techniques, partitioning, and clustering.
Build big data pipelines using various Google Cloud Platform services: Dataflow, Pub/Sub, BigQuery, Cloud Storage, Beam, Data Studio, Cloud Composer/Airflow.
Learn to interact with BigQuery using the web console, the command line, the Python client library, etc.
Learn best practices to follow in real-time projects for performance and cost savings across every BigQuery component.
BigQuery pricing models for storage, querying, API requests, and DML, plus free operations.
Datasets and queries used in the lectures are available in the resources tab, which saves you typing.
Requirements
Basic knowledge of SQL
Description
Note: This BigQuery course is NOT intended to teach SQL or PostgreSQL. The focus of the course is to give you in-depth knowledge of Google BigQuery concepts and internals.
“BigQuery is a serverless, highly scalable, and cost-effective data warehouse designed for Google Cloud Platform (GCP) to store and query petabytes of data.”
What’s included in the course?
Brief introduction to the set of services Google Cloud provides.
Complete in-depth knowledge of Google BigQuery concepts, explained from scratch through advanced topics to real-time implementation.
Every BigQuery concept is explained with hands-on examples.
Covers every detail of BigQuery, even the fine ones.
Learn to interact with BigQuery using its web console, the bq CLI, and the Python client library.
Create, load, modify, and manage BigQuery datasets, tables, views, materialized views, etc.
*Exclusive*: query execution plan, efficient schema design, optimization techniques, partitioning, clustering.
Build and deploy end-to-end data pipelines (batch and stream) for real-time case studies in GCP.
Services used in the pipelines: Dataflow, Apache Beam, Pub/Sub, BigQuery, Cloud Storage, Data Studio, Cloud Composer/Airflow, etc.
Learn best practices and optimization techniques to follow in real-time Google Cloud BigQuery projects.
After completing this course, you can start working on any BigQuery project with full confidence.
Add-ons
Questions and queries will be answered very quickly.
Queries and datasets used in lectures are attached in the course for your convenience.
I am going to update the course frequently, each time adding new components of BigQuery.
Overview
Section 1: Introduction to GCP & its services
Lecture 1 Introduction to Google Cloud Platform
Lecture 2 GCP vs AWS vs Azure – Why choose GCP
Lecture 3 Compute Services in GCP
Lecture 4 Storage Services in GCP
Lecture 5 Big data Services in GCP
Lecture 6 AI & ML Services in GCP
Lecture 7 Big data ecosystem in GCP
Section 2: Introduction to BigQuery
Lecture 8 Conventional Data Warehouse Problems
Lecture 9 What is BigQuery
Lecture 10 BigQuery Out-of-the-Box Features
Lecture 11 Architecture of BigQuery
Section 3: Dataset & Table creation
Lecture 12 Setup a GCP account
Lecture 13 Important note
Lecture 14 Create a Project
Lecture 15 BigQuery UI Tour
Lecture 16 Region Vs Multi-region
Lecture 17 Create a Dataset
Lecture 18 Create a Table
Section 4: Using BigQuery Dashboard options
Lecture 19 Running query with various Query Settings
Lecture 20 Caching features & limitations
Lecture 21 Querying Wildcard Tables
Lecture 22 Wildcard Table Limitations
Lecture 23 Schedule, Save, Share a Query
Lecture 24 Schema Auto detection
Section 5: Efficient Schema Design in BigQuery
Lecture 25 Design an Efficient schema for BigQuery Tables
Lecture 26 Nested & Repeated Columns
Section 6: Operations on Datasets & Tables
Lecture 27 Copying Datasets
Lecture 28 Transfer Service for scheduling Copy Jobs
Lecture 29 Native operations on Table for Schema change
Lecture 30 Manual operations on Table
Section 7: Execution Plan of BigQuery
Lecture 31 How BigQuery creates Execution Plan of a Query
Lecture 32 Understanding Execution Plan in UI Dashboard
Section 8: Partitioned Tables in BigQuery
Lecture 33 What is Partitioning & its benefits
Lecture 34 Ingestion time Partitioned Tables
Lecture 35 Date column Partitioned Tables
Lecture 36 Integer based Partitioned Tables
Lecture 37 ALTER, COPY operations on Partitioned Tables
Lecture 38 DML operations on Partitioned Tables
Lecture 39 Best Practices for Partitioning
Section 9: Clustered Tables in BigQuery
Lecture 40 What is Clustering
Lecture 41 When to use Clustering OR Partitioning OR Both
Lecture 42 Create Clustered Table
Lecture 43 Dos & Don’ts for Clustering
Section 10: Loading & Querying External Data Sources
Lecture 44 Introduction and Create Cloud Storage Bucket
Lecture 45 Create & Query Permanent Table on Cloud Storage bucket
Lecture 46 External data source Limitations
Section 11: Views in BigQuery
Lecture 47 Introduction to Views & its Advantages
Lecture 48 Create Views in BigQuery
Lecture 49 Restrict rows at User level in Views
Lecture 50 Limitations of Views
Section 12: Materialized Views in BigQuery
Lecture 51 What are Materialized Views
Lecture 52 Create a Materialized View
Lecture 53 ALTER Materialized View
Lecture 54 Design an optimized query for Materialized View
Lecture 55 Auto & Manual Refreshes of Materialized Views
Lecture 56 Limitations & Quotas of Materialized Views
Lecture 57 Best Practices in Materialized Views
Section 13: BQ Command Line
Lecture 58 Introduction
Lecture 59 Cloud SDK Setup
Lecture 60 BQ Basic commands
Lecture 61 BQ – Querying Commands
Lecture 62 BQ – Dataset creation command
Lecture 63 BQ – Create all types of Tables
Lecture 64 BQ – Load data into Table
Lecture 65 BQ – Exclusive operations
Section 14: Python Client Library of BigQuery
Lecture 66 Setup
Lecture 67 Python code to create dataset
Lecture 68 Python code to create table
Lecture 69 Python code to query tables
Section 15: Build End-to-End Data Pipelines (Apache Beam)
Lecture 70 Case Study Requirements
Lecture 71 GCP approach to case study
Lecture 72 Apache Beam Pipeline creation
Lecture 73 Write Transformations in Beam
Lecture 74 Write to BigQuery
Lecture 75 Create View for Daily data
Lecture 76 Python 3 code
Lecture 77 Run the Beam Pipeline
Lecture 78 Create Reports in Data Studio
Lecture 79 Create monthly reports in Data Studio
Lecture 80 Write Airflow DAG to schedule
Lecture 81 Create Cloud Composer environment and run DAG
Section 16: Build Streaming Data Pipelines
Lecture 82 Introduction
Lecture 83 Google Pub/Sub Architecture
Lecture 84 Publish messages to Pub/Sub
Lecture 85 Beam pipeline for Streaming data
Section 17: BigQuery Pricing
Lecture 86 Storage Pricing
Lecture 87 Query Pricing
Lecture 88 API, DML pricing
Lecture 89 Free operations in BigQuery
Lecture 90 Google Cloud Pricing Calculator
Section 18: Best Practices / Optimization Techniques
Lecture 91 Introduction
Lecture 92 Methods to restrict data scan
Lecture 93 Ways to reduce CPU time
Lecture 94 Which SQL anti-patterns to avoid
Section 19: Additional Learnings – Different File Formats & Apache Beam
Lecture 95 What do we need from a File
Lecture 96 Text, Sequence, Avro Files
Lecture 97 RC, ORC, Parquet Files
Lecture 98 Performance Test results of Various Files
Lecture 99 Which File Format to choose
Lecture 100 Introduction to Apache Beam
Lecture 101 Batch Vs Stream processing
Lecture 102 Thank you
Section 20: BONUS
Lecture 103 Bonus
Who this course is for
Students who want to learn the deep internals of BigQuery components.
Data engineers intending to build end-to-end data pipelines in GCP (Google Cloud Platform).
Anyone planning to take the Google Cloud Data Engineer certification.
Course Information:
Udemy | English | 8h 30m | 2.26 GB
Created by: J Garg – Real Time Learning