Pentaho for ETL Data Integration Masterclass 2023 PDI 9
What you’ll learn
Understanding of the entire data integration process using PDI
Extracting data from all popular data sources including Excel, JSON, Zipped files, TXT files and even cloud storage
Cleaning the data using Pentaho Data Integration
Applying business rules on the data in PDI
Different types of Data transformations
Loading the data into different formats
Managing SQL database using PDI
Metadata Injection – a powerful tool offered by PDI
Understanding of the concepts of data marts and data warehouse
Requirements
Basic understanding of the data storage concepts will be helpful. Coding background is NOT required for this course
Description
What is ETL?
The ETL (extract, transform, load) process is the most popular method of collecting data from multiple sources and loading it into a centralized data warehouse. ETL is an essential component of data warehousing and analytics.

Why Pentaho for ETL?
Pentaho has phenomenal ETL, data analysis, metadata management and reporting capabilities. Pentaho is faster than other ETL tools (including Talend). Pentaho has a user-friendly GUI that is easy to use and takes less time to learn, which makes it great for beginners. Pentaho Data Integration (PDI) is also an important skill in the data analytics field.

How much can I earn?
In the US, the median salary of an ETL developer is $74,835 per year, and in India the average salary is Rs. 7,06,902 per year. Accenture, Tata Consultancy Services, Cognizant Technology Solutions, Capgemini, IBM and Infosys are among the major recruiters of people skilled in ETL tools, and Pentaho ETL is one of the most sought-after skills that recruiters look for. Demand for Pentaho Data Integration (PDI) skills is increasing day by day.

What makes us qualified to teach you?
The course is taught by Abhishek and Pukhraj. The instructors have been teaching Data Science and Machine Learning for over a decade. We have experience in teaching and implementing Pentaho ETL and Pentaho Data Integration (PDI) for data mining and data analysis purposes.
We are also the creators of some of the most popular online courses, with over 150,000 enrollments and thousands of 5-star reviews like these:
"I had an awesome moment taking this course. It broaden my knowledge more on the power use of Excel as an analytical tools. Kudos to the instructor!" – Sikiru
"Very insightful, learning very nifty tricks and enough detail to make it stick in your mind." – Armand

Our Promise
Teaching our students is our job and we are committed to it. If you have any questions about the course content, the practice sheets or anything else related to Pentaho and ETL, you can always post a question in the course or send us a direct message.

Download Practice files, take Quizzes, and complete Assignments
Each lecture has a practice sheet attached so that you can follow along. You can also take quizzes to check your understanding of Pentaho and ETL concepts. Each section contains a practice assignment that lets you apply what you have learned about Pentaho Data Integration. The solution to each assignment is also shared so that you can review your performance.

By the end of this course, your confidence in using Pentaho ETL and Pentaho Data Integration (PDI) will soar. You’ll have a thorough understanding of how to use Pentaho for ETL and Pentaho Data Integration (PDI) techniques for study or as a career opportunity.

Go ahead and click the enroll button, and I’ll see you in lesson 1 of this Pentaho ETL course!

Cheers
Start-Tech Academy
Overview
Section 1: Introduction
Lecture 1 Welcome to the course
Lecture 2 Course resources
Section 2: Pentaho Data Integration (PDI) Installation and Setup
Lecture 3 Setting up environment and installing PDI
Lecture 4 This is a milestone!
Lecture 5 Opening Spoon – The Graphical UI
Section 3: A Simple ETL Demonstration
Lecture 6 The example problem statement
Lecture 7 Demonstration of a PDI transformation
Lecture 8 Demonstration of a PDI Job
Section 4: Basic concepts – Theory for foundational understanding
Lecture 9 What is ETL?
Lecture 10 Data Warehouse, Ops Database and Data mart
Lecture 11 Inmon vs Kimball Architecture
Lecture 12 ETL vs ELT
Section 5: The ETL process: The practical part begins here
Lecture 13 Data and the ETL process
Section 6: DATA EXTRACTION: Extracting tabular data
Lecture 14 Manually entering data into PDI
Lecture 15 Inputting Data from a TXT (text) file
Lecture 16 Input from multiple CSV files at the same time
Lecture 17 Inputting Data from an Excel file
Lecture 18 Extracting Data from Zipped files
Section 7: DATA EXTRACTION: Extracting non-tabular data
Lecture 19 Extracting from XML
Lecture 20 Extracting from JSON
Section 8: Extracting from an SQL table
Lecture 21 Plan for importing sales data
Lecture 22 Installing PostgreSQL and pgAdmin on your PC
Lecture 23 Creating Sales table in SQL
Lecture 24 Extracting from an SQL table
Section 9: Storing and Retrieving Data from Cloud storage
Lecture 25 Storing Data on AWS S3
Lecture 26 Reading data from AWS S3
Section 10: Merging Data Streams
Lecture 27 Concepts: Merging Data Streams
Lecture 28 Sorted Merge Step – Merging customer data
Lecture 29 Merging product data
Lecture 30 Append data stream – merging sales data
Section 11: Data Cleansing
Lecture 31 Introduction to Data Cleansing
Lecture 32 Value Mapper Step
Lecture 33 Replace in String Step
Lecture 34 Fuzzy Match concepts
Lecture 35 Fuzzy Match Step in PDI
Lecture 36 Fuzzy Match Algorithms
Lecture 37 Formula Step and changing data format
Lecture 38 Common Data Cleaning Steps
Section 12: Data Validation
Lecture 39 Introduction to Data validation
Lecture 40 Data validation 1 – String-to-Int and integer range validations
Lecture 41 Data validation 2 – Checking Reference Values using stream look-up
Lecture 42 Data validation 3 – Order date < shipping date using calculator step
Lecture 43 Common Data Validation steps
Section 13: Error Handling
Lecture 44 Correcting the errors and merging with main stream
Lecture 45 Writing the errors to the log
Lecture 46 Writing the errors to a separate file
Section 14: Transformation and Analytics steps
Lecture 47 Concatenating Address Fields
Lecture 48 Data Aggregation using Group-by
Lecture 49 Normalization and Denormalization
Lecture 50 Number Range Step
Section 15: PDI SQL Connection
Lecture 51 Introduction to PDI – SQL connection
Lecture 52 Reading and filtering data from DB into PDI
Lecture 53 Updating and Inserting data into DB from PDI
Lecture 54 Deleting data from SQL DB using PDI
Section 16: Conceptual understanding for Loading Data
Lecture 55 Facts and Dimensions tables
Lecture 56 Surrogate Keys in Dimension tables
Lecture 57 Type 1 & 2 Slowly Changing Dimensions
Lecture 58 Schemas
Section 17: Loading the data into a Data Mart
Lecture 59 Creating tables in DB
Lecture 60 Loading customer data using combination lookup/update step
Lecture 61 Loading product data using dimension lookup step
Lecture 62 Loading sales data after database lookup steps
Section 18: Running Java and JavaScript
Lecture 63 Scripting Steps
Section 19: PDI Jobs
Lecture 64 PDI Jobs vs Transformation
Lecture 65 Controlling the flow of execution
Lecture 66 Setting variables using the Set Variables step
Lecture 67 File and Folder Management
Lecture 68 Sending Email Step
Lecture 69 Abort Job Step
Section 20: Scheduling a job for production environment
Lecture 70 Running using command prompt and scheduling
Section 21: Metadata injection
Lecture 71 Metadata injection
Section 22: Regex Notation
Lecture 72 Regular Expressions for advanced String Matching
Section 23: Congratulations and about your certificate
Lecture 73 Alternative to Pentaho
Lecture 74 The final milestone!
Lecture 75 Bonus Lecture
Who this course is for
Students who want a career as a data warehouse or ETL developer
ETL developers and data process automation developers
Business managers who want to understand the entire ETL process and become capable of implementing it
Course Information:
Udemy | English | 9h 22m | 3.46 GB
Created by: Start-Tech Academy