Learn Big Data: The Hadoop Ecosystem Masterclass
What you’ll learn
Process Big Data using batch processing
Process Big Data using realtime processing
Be familiar with the technologies in the Hadoop Stack
Be able to install and configure the Hortonworks Data Platform (HDP)
Requirements
You will need a background in IT. The course is aimed at Software Engineers, System Administrators, and DBAs who want to learn about Big Data
Knowing any programming language will enhance your course experience
The course contains demos you can try out on your own machine. To run the Hadoop cluster on your own machine, you will need to run a virtual server. 8 GB or more RAM is recommended.
Description
Important update: Effective January 31, 2021, all Cloudera software requires a valid subscription and is only accessible behind the paywall. The sandbox can still be downloaded, but the full install requires a Cloudera subscription to get access to the yum repository.
In this course you will learn Big Data using the Hadoop Ecosystem. Why Hadoop? It is one of the most sought-after skills in the IT industry. The average salary in the US is $112,000 per year, rising to an average of $160,000 in San Francisco (source: Indeed).
The course is aimed at Software Engineers, Database Administrators, and System Administrators who want to learn about Big Data. Other IT professionals can also take this course, but might have to do some extra research to understand some of the concepts.
You will learn how to use the most popular software in the Big Data industry at the moment, using batch processing as well as realtime processing. This course will give you enough background to talk about real problems and solutions with experts in the industry. Adding these technologies to your LinkedIn profile will help you get noticed by recruiters and land interviews at the most prestigious companies in the world.
The course is very practical, with more than 6 hours of lectures. You will want to try everything out yourself, which adds multiple hours of learning. If you get stuck with the technology while trying, there is support available. I will answer your messages on the message boards, and we have a Facebook group where you can post questions.
Overview
Section 1: Introduction
Lecture 1 Course Introduction
Lecture 2 Course Guide
Section 2: What is Big Data and Hadoop
Lecture 3 What is Big Data
Lecture 4 Examples of Big Data
Lecture 5 What is Data Science
Lecture 6 What is Hadoop
Lecture 7 Hadoop Distributions
Section 3: Introduction to Hadoop
Lecture 8 Hadoop Installation
Lecture 9 Demo: Hortonworks Sandbox
Lecture 10 Demo: Hadoop Installation – Part 1
Lecture 11 Demo: Hadoop Installation – Part 2
Lecture 12 Introduction to HDFS
Lecture 13 DataNode Communications
Lecture 14 Demo: HDFS – Part 1
Lecture 15 Demo: HDFS – Part 2 – Using Ambari
Lecture 16 MapReduce WordCount Example
Lecture 17 Demo: MapReduce WordCount
Lecture 18 Lines that span blocks
Lecture 19 Introduction to Yarn
Lecture 20 Demo: Yarn and ResourceManager UI
Lecture 21 Ambari API and Blueprints
Lecture 22 Demo: Ambari API and Blueprints
Lecture 23 ETL Processing in Hadoop
Section 4: Pig
Lecture 24 Introduction to Pig
Lecture 25 Demo: Part 1 – Pig Installation
Lecture 26 Demo: Part 2 – Pig Commands
Lecture 27 Demo: Part 3 – More Pig Commands
Section 5: Apache Spark
Lecture 28 Introduction to Apache Spark
Lecture 29 Spark WordCount
Lecture 30 Demo: Spark installation and WordCount
Lecture 31 RDDs
Lecture 32 Demo: RDD Transformations and Actions
Lecture 33 Overview of RDD Transformations and Actions
Lecture 34 Spark MLLib
Section 6: Hive
Lecture 35 Introduction to Hive
Lecture 36 Hive Queries
Lecture 37 Demo: Hive Installation and Hive Queries
Lecture 38 Hive Partitioning, Buckets, UDFs, and SerDes
Lecture 39 The Stinger Initiative
Lecture 40 Hive in Spark
Section 7: Real Time Processing
Lecture 41 Introduction to Realtime Processing
Section 8: Kafka
Lecture 42 Introduction to Kafka
Lecture 43 Kafka Topics
Lecture 44 Kafka Messages and Log Compaction
Lecture 45 Kafka Use Cases and Usage
Lecture 46 Demo: Kafka Installation and Usage
Section 9: Storm
Lecture 47 Introduction to Storm
Lecture 48 A Storm Topology
Lecture 49 Demo: Storm installation and Example Topology
Lecture 50 Storm Message Processing and Reliability
Lecture 51 Trident
Section 10: Spark Streaming
Lecture 52 Introduction to Spark Streaming
Lecture 53 Spark Streaming Architecture
Lecture 54 Spark Receivers and WordCount Streaming Example
Lecture 55 Demo: Spark Streaming with Kafka
Lecture 56 Spark Streaming State and Checkpointing
Lecture 57 Demo: Stateful Spark Streaming
Lecture 58 More Spark Streaming Features
Section 11: HBase
Lecture 59 Introduction to HBase
Lecture 60 HBase Tables
Lecture 61 The HBase Meta Table
Lecture 62 HBase Writes
Lecture 63 HBase Reads
Lecture 64 Compactions
Lecture 65 Crash Recovery
Lecture 66 Region Splits
Lecture 67 Hotspotting
Lecture 68 Demo: HBase Install
Lecture 69 Demo: HBase Shell
Lecture 70 Demo: Spark HBase
Section 12: Phoenix
Lecture 71 Introduction to Phoenix
Lecture 72 Salting, Compression, and Indexes in Phoenix
Lecture 73 JOINs, VIEWs, and Phoenix in Spark
Lecture 74 Demo: Phoenix
Section 13: Hadoop Security
Lecture 75 Introduction to Kerberos
Lecture 76 Kerberos on Hadoop
Lecture 77 Kerberos Terminology
Lecture 78 Demo: Enabling Kerberos
Lecture 79 Introduction to SPNEGO
Lecture 80 Demo: SPNEGO
Lecture 81 Introduction to Knox
Section 14: Ranger
Lecture 82 Introduction to Ranger
Lecture 83 Demo: Ranger Installation
Lecture 84 Demo: Ranger with Hive
Section 15: HDFS Encryption
Lecture 85 Introduction to HDFS Transparent Encryption
Lecture 86 Demo: HDFS Encryption using Ranger KMS
Section 16: Advanced Topics
Lecture 87 Yarn Schedulers
Lecture 88 Demo: Capacity Scheduler
Lecture 89 Label based scheduling
Lecture 90 Yarn Sizing
Lecture 91 Hive Query Optimizations
Lecture 92 Join Strategies
Lecture 93 Spark Optimizations
Lecture 94 NameNode High Availability
Lecture 95 Demo: NameNode High Availability Setup
Lecture 96 Database High Availability
Section 17: Thank You
Lecture 97 Thank You!
Lecture 98 Bonus Lecture: My Other Courses
This course is for anyone who wants to know how Big Data works and what technologies are involved
The main focus is on the Hadoop ecosystem. We don’t cover any technologies that are not on the Hortonworks Data Platform Stack
The course compares MapR, Cloudera, and Hortonworks, but we only use the Hortonworks Data Platform (HDP) in the demos
Course Information:
Udemy | English | 5h 58m | 2.60 GB
Created by: Edward Viaene