Python Data Science with Pandas: Master 12 Advanced Projects

Work with Pandas, SQL Databases, JSON, Web APIs & more to master your real-world Machine Learning & Finance Projects
File size: 6.32 GB
Total length: 15h 35m



Alexander Hagmann



What you’ll learn

Advanced Real-World Data Workflows with Pandas you won't find in any other Course.
Working with Pandas and SQL-Databases in parallel (getting the best out of two worlds)
Working with APIs, JSON and Pandas to import large Datasets from the Web
Bringing Pandas to its Limits (and beyond…)
Machine Learning Application: Predicting Real Estate Prices
Finance Applications: Backtesting & Forward Testing Investment Strategies + Index Tracking
Feature Engineering, Standardization, Dummy Variables and Sampling with Pandas
Working with large Datasets (millions of rows/columns)
Working with completely messy/unclean Datasets (the standard case in real-world)
Handling stringified and nested JSON Data with Pandas
Loading Data from Databases (SQL) into Pandas and vice versa
Loading JSON Data into Pandas and vice versa
Web-Scraping with Pandas
Cleaning large & messy Datasets (millions of rows/columns)
Working with APIs and Python Wrapper Packages to import large Datasets from the Web
Explanatory Data Analysis with large real-world Datasets
Advanced Visualizations with Matplotlib and Seaborn

Requirements


You should be familiar with Python (Standard Library, Numpy, Matplotlib)
You should have worked with Pandas before (at least you should know the basics)
A desktop computer (Windows, Mac, or Linux) capable of storing and running Anaconda. The course will walk you through installing the necessary free software.
An internet connection capable of streaming HD videos.
Some high school level math skills would be great (not mandatory, but it helps)


Welcome to the first advanced and project-based Pandas Data Science Course! This Course starts where many other courses end: You can write some Pandas code but you are still struggling with real-world Projects because

Real-World Data is typically not provided in a single or a few text/Excel files -> more advanced Data Importing Techniques are required
Real-World Data is large, unstructured, nested and unclean -> more advanced Data Manipulation and Data Analysis/Visualization Techniques are required
Many easy-to-use Pandas methods work best with relatively small and clean Datasets -> real-world Datasets require more General Code (incorporating other Libraries/Modules)

No matter if you need excellent Pandas skills for Data Analysis, Machine Learning or Finance purposes, this is the right Course for you to get your skills to Expert Level!

Master your real-world Projects! This Course covers the full Data Workflow A-Z:

Import (complex and nested) Data from JSON files.
Import (complex and nested) Data from the Web with Web APIs, JSON and Wrapper Packages.
Import (complex and nested) Data from SQL Databases.
Store (complex and nested) Data in JSON files.
Store (complex and nested) Data in SQL Databases.
Work with Pandas and SQL Databases in parallel (getting the best of both worlds).
Efficiently import and merge Data from many text/CSV files.
Clean large and messy Datasets with more General Code.
Clean, handle and flatten nested and stringified Data in DataFrames.
Know how to handle and normalize Unicode strings.
Merge and Concatenate many Datasets efficiently.
Scale and Automate data merging.
Explanatory Data Analysis and Data Presentation with advanced Visualization Tools (advanced Matplotlib & Seaborn).
Test the Performance Limits of Pandas with advanced Data Aggregations and Grouping.
Data Preprocessing and Feature Engineering for Machine Learning with simple Pandas code.
Use your Data 1: Train and test Machine Learning Models on preprocessed Data and analyze the results.
Use your Data 2: Backtesting and Forward Testing of Investment Strategies (Finance & Investment Stack).
Use your Data 3: Index Tracking (Finance & Investment Stack).
Use your Data 4: Present your Data with Python in a nice-looking HTML format (Website Quality).
and many more…

I am Alexander Hagmann, Finance Professional and Data Scientist (> 7 Years Industry Experience) and best-selling Instructor for Pandas, (Financial) Data Science and Finance with Python. Looking forward to seeing you in this Course!


Section 1: Getting Started

Lecture 1 Course Overview (don't skip!)

Lecture 2 Tips: How to get the most out of this Course (don't skip!)

Lecture 3 FAQ / Your Questions answered

Lecture 4 How to download and install Anaconda for Python coding

Lecture 5 Jupyter Notebooks – let's get started

Lecture 6 How to work with Jupyter Notebooks

Section 2: Project 1: Explanatory Data Analysis & Data Presentation (Movies Dataset)

Lecture 7 Project Overview

Lecture 8 Downloads (Project 1)

Lecture 9 Project Brief for Self-Coders

Lecture 10 Data Import from csv file and first Inspection

Lecture 11 The best and the worst movies… (Part 1)

Lecture 12 The best and the worst movies… (Part 2)

Lecture 13 Which Movie would you like to see next?

Lecture 14 What are the most common Words in Movie Titles, Taglines and Overviews?

Lecture 15 Are Franchises more successful?

Lecture 16 What are the most successful Franchises?

Lecture 17 The most successful Directors

Lecture 18 The most successful Actors (Part 1)

Lecture 19 The most successful Actors (Part 2)

Lecture 20 Now it's your turn (Homework)

Section 3: Project 2: Data Import – Working with APIs and JSON (Movies Dataset)

Lecture 21 Project Overview

Lecture 22 What is JSON?

Lecture 23 Downloads (Project 2)

Lecture 24 Project Brief for Self-Coders

Lecture 25 Importing Data from JSON files

Lecture 26 JSON and Orientation/Formats

Lecture 27 What is an API? – The Movie Database API

Lecture 28 Working with APIs and JSON (Part 1)

Lecture 29 How to work with your own API-KEY

Lecture 30 Working with APIs and JSON (Part 2)

Lecture 31 Importing and Storing the Movies Dataset (Best Practice)

Lecture 32 Importing and Storing the Movies Dataset (Real World Scenario)

Section 4: Project 3: Data Cleaning – Tidy up messy Datasets (Movies Dataset)

Lecture 33 Project Overview

Lecture 34 Downloads (Project 3)

Lecture 35 Project Brief for Self-Coders

Lecture 36 First Steps

Lecture 37 Dropping irrelevant Columns

Lecture 38 How to handle stringified JSON columns (Part 1)

Lecture 39 How to handle stringified JSON columns (Part 2)

Lecture 40 How to flatten nested Columns

Lecture 41 How to clean Numerical Columns (Part 1)

Lecture 42 How to clean Numerical Columns (Part 2)

Lecture 43 How to clean Columns with DateTime Information

Lecture 44 How to clean String / Text Columns

Lecture 45 How to remove Duplicates

Lecture 46 Handling Missing Values & Removing Observations/Rows

Lecture 47 Final Steps
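
As a rough illustration of what "stringified JSON columns" and "flattening nested columns" can look like in practice, here is a minimal sketch. The sample DataFrame, the column names (`title`, `genres`, `genre_names`) and the `|` separator are made up for this example and are not taken from the course material; the parsing approach (`ast.literal_eval` plus `pd.json_normalize`) is one common option, not necessarily the one used in the lectures.

```python
import ast

import pandas as pd

# Hypothetical sample data: "genres" holds lists of dicts stored as plain strings
df = pd.DataFrame({
    "title": ["Movie A", "Movie B"],
    "genres": [
        "[{'id': 1, 'name': 'Drama'}, {'id': 2, 'name': 'Comedy'}]",
        "[{'id': 3, 'name': 'Action'}]",
    ],
})

# Turn each string into real Python objects (a list of dicts).
# ast.literal_eval accepts Python-style literals with single quotes,
# which json.loads would reject.
df["genres"] = df["genres"].apply(ast.literal_eval)

# Option 1: keep one row per movie and reduce the nested data to a string
df["genre_names"] = df["genres"].apply(
    lambda items: "|".join(d["name"] for d in items)
)

# Option 2: flatten to one row per (movie, genre) pair
flat = pd.json_normalize(
    df.to_dict("records"), record_path="genres", meta=["title"]
)

print(df[["title", "genre_names"]])
print(flat)
```

Option 1 keeps the table shape and is handy for display; Option 2 produces a long table that is easier to group and count.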

Section 5: Project 4: Merging, Cleaning & Transforming Data (Movies Dataset)

Lecture 48 Project Overview

Lecture 49 Downloads (Project 4)

Lecture 50 Project Brief for Self-Coders

Lecture 51 Getting the Datasets

Lecture 52 Preparing the Data for Merge

Lecture 53 Merging the Data (Left Join)

Lecture 54 Cleaning and Transforming the new “Cast” Column

Lecture 55 Cleaning and Transforming the new “Crew” Column

Lecture 56 Final Steps

Section 6: Project 5: Working with Pandas and SQL Databases (Movies Dataset)

Lecture 57 Project Overview

Lecture 58 What is a Database / SQL?

Lecture 59 Downloads (Project 5)

Lecture 60 Project Brief for Self-Coders

Lecture 61 How to create an SQLite Database

Lecture 62 How to load Data from DataFrames into an SQLite Database

Lecture 63 How to load Data from SQLite Databases into DataFrames

Lecture 64 Some simple SQL Queries

Lecture 65 Some more SQL Queries

Lecture 66 Join Queries

Lecture 67 Final Case Study
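
The round trip between DataFrames and an SQLite database covered in this section can be sketched as follows. This is a minimal example under stated assumptions: the table name `movies`, its columns and the values are invented for illustration, and an in-memory database is used so that nothing is written to disk.

```python
import sqlite3

import pandas as pd

# Hypothetical sample data (not from the course's Movies Dataset)
movies = pd.DataFrame({
    "id": [1, 2, 3],
    "title": ["A", "B", "C"],
    "revenue": [100.0, 250.0, 50.0],
})

# ":memory:" creates a temporary database that disappears on close
con = sqlite3.connect(":memory:")

# DataFrame -> SQL table
movies.to_sql("movies", con, index=False, if_exists="replace")

# SQL query -> DataFrame: let the database do the filtering and sorting
query = """
    SELECT title, revenue
    FROM movies
    WHERE revenue > 75
    ORDER BY revenue DESC
"""
top = pd.read_sql(query, con)
con.close()

print(top)
```

Pushing filters into the SQL query means only the rows that are actually needed get loaded into memory, which matters once tables become large.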

Section 7: Project 6: Importing & Concatenating many files (Baby Names Dataset)

Lecture 68 Project Overview

Lecture 69 Downloads (Project 6)

Lecture 70 Project Brief for Self-Coders (Part 1)

Lecture 71 Getting the Data from the Web

Lecture 72 Importing one File & Understanding the Data Structure (easy case)

Lecture 73 Importing & merging many Files (easy case)

Lecture 74 Final Steps

Lecture 75 Project Brief for Self-Coders (Part 2)

Lecture 76 Importing one File & Understanding the Data Structure (complex case)

Lecture 77 The glob module

Lecture 78 Importing & merging many Files (complex case)

Lecture 79 Excursus: Saving Memory – Categorical Features
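
A minimal sketch of importing and concatenating many files with the glob module, followed by a categorical conversion. The file naming pattern (`yob<year>.txt`), the header-less name/gender/count layout and the toy values are assumptions made for this example; the files are written to a temporary folder first so the snippet runs on its own.

```python
import glob
import os
import tempfile

import pandas as pd

# Create two small, made-up files so the example is self-contained
tmp = tempfile.mkdtemp()
samples = {
    "yob2017.txt": "Emma,F,10\nLiam,M,20\n",
    "yob2018.txt": "Emma,F,30\nLiam,M,40\n",
}
for fname, text in samples.items():
    with open(os.path.join(tmp, fname), "w") as f:
        f.write(text)

frames = []
# glob returns all paths matching the pattern; sorted() fixes the order
for path in sorted(glob.glob(os.path.join(tmp, "yob*.txt"))):
    # Assumed naming scheme: characters 3-6 of the file name hold the year
    year = int(os.path.basename(path)[3:7])
    frame = pd.read_csv(path, header=None, names=["name", "gender", "count"])
    frame["year"] = year
    frames.append(frame)

# One concat at the end is cheaper than appending inside the loop
names = pd.concat(frames, ignore_index=True)

# Columns with few distinct values can be stored as categoricals
names["gender"] = names["gender"].astype("category")

print(names)
print(names.dtypes)
```

On a column with millions of rows and only a handful of distinct values, the categorical dtype typically needs far less memory than plain Python strings; on a toy table like this one the difference is not meaningful.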

Section 8: Project 7: Explanatory Data Analysis & Advanced Visualization (Baby Names)

Lecture 80 Project Overview

Lecture 81 Downloads (Project 7)

Lecture 82 Project Brief for Self-Coders

Lecture 83 First Inspection: The most popular Names in 2018

Lecture 84 Evergreen Names (1880 – 2018)

Lecture 85 Advanced Data Aggregation

Lecture 86 What are the most popular Names of all Time?

Lecture 87 General Trends over Time (1880 – 2018)

Lecture 88 Creating the Features “Popularity” and “Rank”

Lecture 89 Visualizing Name Trends over Time

Lecture 90 Why does a Name's Popularity suddenly change? (Part 1)

Lecture 91 Why does a Name's Popularity suddenly change? (Part 2)

Lecture 92 Persistent vs. Spike-Fade Names

Lecture 93 Most Popular Unisex Names

Section 9: Project 8: Data Preprocessing & Feature Engineering for Machine Learning

Lecture 94 Project Overview

Lecture 95 Downloads (Project 8)

Lecture 96 Project Brief for Self-Coders

Lecture 97 Data Import and first Inspection

Lecture 98 Data Cleaning and Creating additional Features

Lecture 99 Which Factors influence House Prices?

Lecture 100 Advanced Explanatory Data Analysis with Seaborn

Lecture 101 Feature Engineering – Part 1

Lecture 102 Feature Engineering – Part 2

Lecture 103 Splitting the Data into Train and Test Set

Lecture 104 Training the ML Model (Random Forest)

Lecture 105 Evaluating the Model on the Test Set

Lecture 106 Feature Importance

Section 10: Project 9: Data Import – Web Scraping, APIs & Wrappers (US Stocks)

Lecture 107 Project Overview

Lecture 108 Downloads (Project 9)

Lecture 109 Web Scraping – the Dow Jones Constituents

Lecture 110 Normalizing Unicode Strings and Getting the Ticker Symbols

Lecture 111 Download and Installation of an API Wrapper Package

Lecture 112 Loading and Saving Historical Stock Prices

Section 11: Project 10 (Finance Stack): Backtesting Investment Strategies (US Stocks)

Lecture 113 Project Overview

Lecture 114 Downloads (Project 10)

Lecture 115 Importing the Data

Lecture 116 Data Visualization & Returns

Lecture 117 Backtesting a simple Momentum Strategy

Lecture 118 Backtesting a simple Contrarian Strategy

Lecture 119 More complex Strategies & Backtesting vs. Fitting

Lecture 120 Simple Moving Averages (SMA)

Lecture 121 Backtesting Simple Moving Averages (SMA) Strategies

Lecture 122 Backtesting the Perfect Strategy (…in case you can predict the future…)

Section 12: Project 11 (Finance Stack): Index Tracking and Forward Testing (US Stocks)

Lecture 123 Project Overview

Lecture 124 Downloads (Project 11)

Lecture 125 Importing & Merging the Data

Lecture 126 Transforming the Data

Lecture 127 Explanatory Data Analysis (Risk, Return & Correlations)

Lecture 128 Index Tracking – Introduction

Lecture 129 Index Tracking – Selecting the Tracking Stocks

Lecture 130 Index Tracking – A simple Tracking Portfolio

Lecture 131 Index Tracking – The optimal Tracking Portfolio

Lecture 132 Forward Testing (Part 1)

Lecture 133 Forward Testing (Part 2)

Section 13: Project 12: Explanatory Data Analysis and Seaborn Visualization (Olympic Games)

Lecture 134 Project Overview

Lecture 135 Downloads (Project 12)

Lecture 136 Project Brief for Self-Coders

Lecture 137 Data Import and first Inspection

Lecture 138 Merging and Concatenating

Lecture 139 Data Cleaning (Part 1)

Lecture 140 Data Cleaning (Part 2)

Lecture 141 What are the most successful countries of all time?

Lecture 142 Do GDP, Population and Politics matter?

Lecture 143 Statistical Analysis and Hypothesis Testing with scipy

Lecture 144 Aggregating and Ranking

Lecture 145 Summer Games vs. Winter Games – does Geographical Location matter?

Lecture 146 Men vs. Women – do Culture & Religion matter?

Lecture 147 Do Traditions matter?

Section 14: Extra Project: Prepare yourself for the Future – Pandas Version 1.0

Lecture 148 Intro and Overview

Lecture 149 How to update Pandas to Version 1.0

Lecture 150 Downloads for this Section

Lecture 151 Important Recap: Pandas Display Options (Changed in Version 0.25)

Lecture 152 Info() method – new and extended output

Lecture 153 NEW Extension dtypes (“nullable” dtypes): Why do we need them?

Lecture 154 Creating the NEW extension dtypes with convert_dtypes()

Lecture 155 NEW pd.NA value for missing values

Lecture 156 The NEW “nullable” Int64Dtype

Lecture 157 The NEW StringDtype

Lecture 158 The NEW “nullable” BooleanDtype

Lecture 159 Addition of the ignore_index parameter

Lecture 160 Removal of prior Version Deprecations
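
The "nullable" extension dtypes, `pd.NA` and the `ignore_index` parameter listed in this section can be tried out with a short sketch like the one below. It assumes pandas 1.0 or later; the sample DataFrame and its column names are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values in every column
df = pd.DataFrame({
    "age": [25, np.nan, 31],        # the NaN forces float64
    "name": ["Ann", None, "Bob"],   # stored as generic Python objects
    "member": [True, None, False],  # object as well, because of None
})
print(df.dtypes)

# convert_dtypes() picks the best matching extension dtype per column
converted = df.convert_dtypes()
print(converted.dtypes)

# Missing values are now represented by pd.NA,
# and "age" can hold integers again despite the gap
print(converted.loc[1, "age"] is pd.NA)

# ignore_index=True gives the sorted result a fresh 0..n-1 index
sorted_df = converted.sort_values("age", ignore_index=True)
print(sorted_df)
```

The practical benefit is that an integer column with gaps no longer has to be stored as floats, and boolean columns with gaps no longer fall back to the generic object dtype.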

Section 15: Appendix: Pandas Crash Course

Lecture 161 Intro to Tabular Data / Pandas

Lecture 162 Downloads for this Section

Lecture 163 Create your very first Pandas DataFrame (from csv)

Lecture 164 Pandas Display Options and the methods head() & tail()

Lecture 165 First Data Inspection

Lecture 166 Built-in Functions, Attributes and Methods with Pandas

Lecture 167 Selecting Columns

Lecture 168 Selecting one Column with the “dot notation”

Lecture 169 Zero-based Indexing and Negative Indexing

Lecture 170 Selecting Rows with iloc (position-based indexing)

Lecture 171 Slicing Rows and Columns with iloc (position-based indexing)

Lecture 172 Position-based Indexing Cheat Sheets

Lecture 173 Selecting Rows with loc (label-based indexing)

Lecture 174 Slicing Rows and Columns with loc (label-based indexing)

Lecture 175 Label-based Indexing Cheat Sheets

Lecture 176 First Steps with Pandas Series

Lecture 177 Analyzing Numerical Series with unique(), nunique() and value_counts()

Lecture 178 Analyzing non-numerical Series with unique(), nunique(), value_counts()

Lecture 179 Sorting of Series and Introduction to the inplace – parameter

Lecture 180 Filtering DataFrames by one Condition

Lecture 181 Filtering DataFrames by many Conditions (AND)

Lecture 182 Filtering DataFrames by many Conditions (OR)

Lecture 183 Creating Columns based on other Columns

Lecture 184 User-defined Functions with apply(), map() and applymap()

Lecture 185 Data Visualization with Matplotlib

Lecture 186 GroupBy – an Introduction

Lecture 187 Understanding the GroupBy Object

Lecture 188 Splitting with many Keys

Lecture 189 split-apply-combine explained

Lecture 190 split-apply-combine applied

Lecture 191 Data with DateTime Information – Part 1

Lecture 192 Data with DateTime Information – Part 2

Lecture 193 Data with DateTime Information – Part 3

Lecture 194 Data with DateTime Information – Part 4

Section 16: What's next? (outlook and additional resources)

Lecture 195 Bonus Lecture

Who this course is for

Everyone who really wants to master large, messy and unclean Datasets.
Everyone who wants to improve skills from “I can write some Pandas Code” to “I can master my real-world Data Projects with Pandas”
Data Scientists
Machine Learning Professionals
Finance & Investment Professionals
Researchers

Course Information:

Udemy | English | 15h 35m | 6.32 GB
Created by: Alexander Hagmann
