The Complete Pandas Bootcamp 2022 Data Science with Python

Pandas fully explained | 150+ Exercises | Must-have skills for Machine Learning & Finance | + Scikit-Learn and Seaborn
File Size: 12.06 GB
Total length: 33h 54m



Alexander Hagmann



What you’ll learn

Bring your Data Handling & Data Analysis skills to an outstanding level.
Learn and practice all relevant Pandas methods and workflows with Real-World Datasets
Learn Pandas based on NEW Version 1.x (the days of versions 0.x are over)
Import, clean, and merge messy Data and prepare Data for Machine Learning
Master a complete Machine Learning Project A-Z with Pandas, Scikit-Learn, and Seaborn
Analyze, visualize, and understand your Data with Pandas, Matplotlib, and Seaborn
Practice and master your Pandas skills with Quizzes, 150+ Exercises, and Comprehensive Projects
Import Financial/Stock Data from Web Sources and analyze them with Pandas
Learn and master the most important Pandas workflows for Finance
Learn how to best transition from Versions 0.x to new Version 1.x
Learn the Basics of Pandas and Numpy Coding (Appendix)
Learn and master important Statistical Concepts with scipy

Requirements


A desktop computer (Windows, Mac, or Linux) capable of storing and running Anaconda. The course will walk you through installing the necessary free software.
An internet connection capable of streaming videos.
Ideally some Spreadsheet Basics/Programming Basics (not mandatory, the course guides you through the basics)


Welcome to the web's most comprehensive Pandas Bootcamp with 34 hours of video content, 150+ exercises, and two large and comprehensive Final Projects to test your skills! This course has one goal: bringing your data handling skills to the next level to build your career in Data Science, Machine Learning, Finance & co.

This course has five parts:
Pandas Basics – from Zero to Hero (Part 1)
The complete data workflow A-Z with Pandas: Importing, Cleaning, Merging, Aggregating, and Preparing Data for Machine Learning (Part 2)
Two Comprehensive Project Challenges that are frequently used in Data Science job recruiting/assessment centers: Test your skills! (Part 3)
Application 1: Pandas for Finance, Investing and other Time Series Data (Part 4)
Application 2: Machine Learning with Pandas and scikit-learn (Part 5)

Why should you learn Pandas?
The world is getting more and more data-driven. Data Scientists are gaining ground with $100k+ salaries. It's time to switch from soapbox cars (spreadsheet software like Excel) to highly tuned racing cars (Pandas)!
Python is a great platform/environment for Data Science with powerful tools for Science, Statistics, Finance, and Machine Learning. The Pandas Library is the heart of Python Data Science. Pandas enables you to import, clean, join/merge/concatenate, manipulate, and deeply understand your Data and finally prepare/process Data for further Statistical Analysis, Machine Learning, or Data Presentation. In practice, all of these tasks require a high proficiency in Pandas! Data Scientists typically spend up to 85% of their time manipulating Data in Pandas.

Can you start right now?
A frequently asked question of Python Beginners is: "Do I need to become an expert in Python coding before I can start working with Pandas?" The clear answer is: "No! Do you need to become a Microsoft Software Developer before you can start with Excel? Probably not!"
You require some Python Basics like data types, simple operations/operators, lists, and numpy arrays. In the Appendix of this course, you can find a Python crash course. This Python Introduction is tailor-made and sufficient for Data Science purposes! In addition, this course covers fundamental statistical concepts (coding with scipy).
In summary, if you primarily want to use Python for Data Science or as a replacement for Excel, this course is a perfect match!

Why should you take this Course?
It is the most relevant and comprehensive course on Pandas.
It is the most up-to-date course and the first that covers Pandas Version 1.x. The Pandas Library has experienced massive improvements in the last couple of months. Working with and relying on outdated code can be painful.
Pandas isn't an isolated tool. It is used together with other Libraries: Matplotlib and Seaborn for Data Visualization; Numpy, Scipy, and Scikit-Learn for Machine Learning, scientific, and statistical computing. This course covers all these Libraries.
In real-world projects, coding and the business side of things are equally important. This is probably the only Pandas course that teaches both: in-depth Pandas Coding and Big-Picture Thinking.
It serves as a Pandas Encyclopedia covering all relevant methods, attributes, and workflows for real-world projects. If you have problems with any method or workflow, you will most likely get help and find a solution in this course.
It shows and explains the full real-world Data Workflow A-Z: starting with importing messy data, cleaning data, merging and concatenating data, grouping and aggregating data, Explanatory Data Analysis through to preparing and processing data for Statistics, Machine Learning, Finance, and Data Presentation.
It explains Pandas Coding on real Data and real-world Problems. No toy data! This is the best way to learn and understand Pandas.
It gives you plenty of opportunities to practice and code on your own. Learning by doing. In the exercises, you can select the level of difficulty with optional hints and guidance/instruction.
Pandas is a very powerful tool. But it also has pitfalls that can lead to unintended and undiscovered errors in your data. This course also focuses on commonly made mistakes and errors and teaches you what you should not do.
Satisfaction guaranteed: otherwise, get your money back with the 30-day money-back guarantee.

I am looking forward to seeing you in the course!


Section 1: Getting Started

Lecture 1 Overview / Student FAQ

Lecture 2 Tips: How to get the most out of this course

Lecture 3 Did you know that…?

Lecture 4 More FAQ / Important Information

Lecture 5 Installation of Anaconda

Lecture 6 Opening a Jupyter Notebook

Lecture 7 How to use Jupyter Notebooks

Lecture 8 How to tackle Pandas Version 1.0


Lecture 9 Intro to Tabular Data / Pandas

Lecture 10 Download: Part 1 Course Materials

Section 3: Pandas Basics (DataFrame Basics I)

Lecture 11 Create your very first Pandas DataFrame (from csv)

Lecture 12 Pandas Display Options and the methods head() & tail()

Lecture 13 First Data Inspection

Lecture 14 Built-in Functions, Attributes and Methods with Pandas

Lecture 15 Make it easy: TAB Completion and Tooltip

Lecture 16 Explore your own Dataset: Jupyter Coding Exercise 1 (Intro)

Lecture 17 Explore your own Dataset: Jupyter Coding Exercise 1 (Solution)

Lecture 18 Selecting Columns

Lecture 19 Selecting one Column with the “dot notation”

Lecture 20 Zero-based Indexing and Negative Indexing

Lecture 21 Selecting Rows with iloc (position-based indexing)

Lecture 22 Slicing Rows and Columns with iloc (position-based indexing)

Lecture 23 Position-based Indexing Cheat Sheets

Lecture 24 Selecting Rows with loc (label-based indexing)

Lecture 25 Slicing Rows and Columns with loc (label-based indexing)

Lecture 26 Label-based Indexing Cheat Sheets

Lecture 27 Indexing and Slicing with reindex()

Lecture 28 Summary, Best Practices and Outlook

Lecture 29 Jupyter Coding Exercise 2 – Intro

Lecture 30 Jupyter Coding Exercise 2 – Solution

Lecture 31 Advanced Indexing and Slicing (optional)

Section 4: Pandas Series and Index Objects

Lecture 32 Intro

Lecture 33 First Steps with Pandas Series

Lecture 34 Analyzing Numerical Series with unique(), nunique() and value_counts()

Lecture 35 Analyzing non-numerical Series with unique(), nunique(), value_counts()

Lecture 36 Creating Pandas Series (Part 1)

Lecture 37 Creating Pandas Series (Part 2)

Lecture 38 Indexing and Slicing Pandas Series

Lecture 39 Sorting of Series and Introduction to the inplace parameter

Lecture 40 nlargest() and nsmallest()

Lecture 41 idxmin() and idxmax()

Lecture 42 Manipulating Pandas Series

Lecture 43 Jupyter Coding Exercise 3 (Intro)

Lecture 44 Jupyter Coding Exercise 3 (Solution)

Lecture 45 First Steps with Pandas Index Objects

Lecture 46 Creating Index Objects from Scratch

Lecture 47 Changing Row Index with set_index() and reset_index()

Lecture 48 Changing Column Labels

Lecture 49 Renaming Index & Column Labels with rename()

Lecture 50 Jupyter Coding Exercise 4 (Intro)

Lecture 51 Jupyter Coding Exercise 4 (Solution)

Section 5: DataFrame Basics II

Lecture 52 Intro

Lecture 53 Filtering DataFrames by one Condition

Lecture 54 Filtering DataFrames by many Conditions (AND)

Lecture 55 Filtering DataFrames by many Conditions (OR)

Lecture 56 Advanced Filtering with between(), isin() and ~

Lecture 57 any() and all()

Lecture 58 Removing Columns

Lecture 59 Removing Rows

Lecture 60 Adding new Columns to a DataFrame

Lecture 61 Creating Columns based on other Columns

Lecture 62 Adding Columns with insert()

Lecture 63 Creating DataFrames from Scratch with pd.DataFrame()

Lecture 64 Adding new Rows (hands-on approach)

Lecture 65 Jupyter Coding Exercise 5 (Intro)

Lecture 66 Jupyter Coding Exercise 5 (Solution)

Section 6: Manipulating Elements in a DataFrame / Slice +++Important, know the Pitfalls!+++

Lecture 67 Intro

Lecture 68 Best Practice (How you should do it)

Lecture 69 Chained Indexing: How you should NOT do it (Part 1)

Lecture 70 Chained Indexing: How you should NOT do it (Part 2)

Lecture 71 View vs. Copy

Lecture 72 Simple Rules what to do when…

Lecture 73 Coding Exercise 6 (Intro)

Lecture 74 Coding Exercise 6 (Solution)
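
The chained-indexing pitfall named in this section can be sketched as follows. This is a minimal illustration on hypothetical toy data (the column names and values are assumptions, not course material), contrasting a single .loc assignment with the discouraged chained form and an explicit copy.

```python
import pandas as pd

# Hypothetical toy data, only for illustration.
df = pd.DataFrame({"name": ["Ann", "Bob", "Cy"], "age": [25, 40, 31]})

# Recommended: one .loc call selects rows and column together,
# so the assignment reaches the original DataFrame.
df.loc[df["age"] > 30, "age"] = 30

# Discouraged: chained indexing such as
#     df[df["age"] > 30]["age"] = 30
# assigns to a temporary object, so df may stay unchanged
# and pandas usually emits a warning.

# When an independent subset is wanted, make the copy explicit.
young = df.loc[df["age"] < 30].copy()
young["age"] = 0  # modifies only the copy, not df
```

The design choice here is to decide up front whether you want to change the original (one .loc call) or work on a separate object (.copy()), rather than relying on whether pandas happens to return a view or a copy.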

Section 7: DataFrame Basics III

Lecture 75 Intro

Lecture 76 Sorting DataFrames with sort_index() and sort_values() (Version 1.0 Update)

Lecture 77 Ranking DataFrames with rank()

Lecture 78 nunique() and nlargest() / nsmallest() with DataFrames

Lecture 79 Summary Statistics and Accumulations

Lecture 80 The agg() method

Lecture 81 Coding Exercise 7 (Intro)

Lecture 82 Coding Exercise 7 (Solution)

Lecture 83 User-defined Functions with apply(), map() and applymap()

Lecture 84 Hierarchical Indexing (Part 1)

Lecture 85 Hierarchical Indexing (Part 2)

Lecture 86 String Operations (Part 1)

Lecture 87 String Operations (Part 2)

Lecture 88 Coding Exercise 8 (Intro)

Lecture 89 Coding Exercise 8 (Solution)

Section 8: Visualization with Matplotlib

Lecture 90 Intro

Lecture 91 The plot() method

Lecture 92 Customization of Plots

Lecture 93 Histograms (Part 1)

Lecture 94 Histograms (Part 2)

Lecture 95 Barcharts and Piecharts

Lecture 96 Scatterplots

Lecture 97 Coding Exercise 9 (Intro)

Lecture 98 Coding Exercise 9 (Solution)

Section 9: --- PART 2: FULL DATA WORKFLOW A-Z ---

Lecture 99 Welcome to PART 2: Full Data Workflow A-Z

Lecture 100 Download: Part 2 Course Materials

Section 10: Importing Data

Lecture 101 Importing csv-files with pd.read_csv

Lecture 102 Importing messy csv-files with pd.read_csv

Lecture 103 Importing Data from Excel with pd.read_excel()

Lecture 104 Importing messy Data from Excel with pd.read_excel()

Lecture 105 Importing Data from the Web with pd.read_html()

Lecture 106 Coding Exercise 10

Section 11: Cleaning Data

Lecture 107 First Inspection & Handling of inconsistent Data

Lecture 108 String Operations

Lecture 109 Changing Datatype of Columns with astype()

Lecture 110 Intro NA values / missing values

Lecture 111 Detection of missing Values

Lecture 112 Removing missing values

Lecture 113 Replacing missing values

Lecture 114 Intro Duplicates

Lecture 115 Detection of Duplicates

Lecture 116 Handling / Removing Duplicates

Lecture 117 The ignore_index parameter (NEW in Pandas 1.0)

Lecture 118 Detection of Outliers

Lecture 119 Handling / Removing Outliers

Lecture 120 Categorical Data

Lecture 121 Pandas Version 1.0: New dtypes and pd.NA

Lecture 122 Coding Exercise 11 (Intro)

Lecture 123 Coding Exercise 11 (Solution)

Section 12: Merging, Joining, and Concatenating Data

Lecture 124 Intro

Lecture 125 Adding Rows with append() and pd.concat() (Part 1)

Lecture 126 Adding Rows with pd.concat() (Part 2)

Lecture 127 Arithmetic with Pandas Objects / Data Alignment

Lecture 128 EXCURSUS: Comparing two DataFrames / Identify Differences

Lecture 129 Outer Joins with merge()

Lecture 130 Inner Joins with merge()

Lecture 131 Outer Joins (without Intersection) with merge()

Lecture 132 Left Joins (without Intersection) with merge()

Lecture 133 Right Joins (without Intersection) with merge()

Lecture 134 Left Joins with merge()

Lecture 135 Right Joins with merge()

Lecture 136 Joining on different Column Names / Indexes

Lecture 137 Joining on more than one Column

Lecture 138 pd.merge() and join()

Lecture 139 Coding Exercise 12
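
One possible reading of the "without Intersection" joins listed in this section is sketched below. It assumes the term means rows that have no match on the other side, and it uses the indicator parameter of merge() on hypothetical toy tables; the course may implement these joins differently.

```python
import pandas as pd

# Hypothetical toy tables, only for illustration.
left = pd.DataFrame({"id": [1, 2, 3], "city": ["Rome", "Oslo", "Lima"]})
right = pd.DataFrame({"id": [2, 3, 4], "temp": [5, 22, 30]})

# Full outer join; indicator=True adds a "_merge" column that records
# whether each row is "left_only", "right_only" or "both".
outer = left.merge(right, on="id", how="outer", indicator=True)

# Outer join without intersection: keep rows found on one side only.
outer_excl = outer.loc[outer["_merge"] != "both"]

# Left join without intersection: rows of `left` with no match in `right`.
left_excl = outer.loc[outer["_merge"] == "left_only"]

# Inner join for comparison: only ids present in both tables.
inner = left.merge(right, on="id", how="inner")
```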

Section 13: GroupBy Operations

Lecture 140 Intro

Lecture 141 Understanding the GroupBy Object

Lecture 142 Splitting with many Keys

Lecture 143 split-apply-combine explained

Lecture 144 split-apply-combine applied

Lecture 145 Advanced aggregation with agg()

Lecture 146 GroupBy Aggregation with Relabeling (NEW – Pandas Version 0.25)

Lecture 147 Transformation with transform()

Lecture 148 Replacing NA Values by group-specific Values

Lecture 149 Generalizing split-apply-combine with apply()

Lecture 150 Hierarchical Indexing with Groupby

Lecture 151 stack() and unstack()

Lecture 152 Coding Exercise 13 (Intro)

Lecture 153 Coding Exercise 13 (Solution)
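
The aggregation with relabeling and the transform() method named in this section can be sketched like this. The data and the output column names (total_points, mean_points, team_mean) are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical toy data, only for illustration.
df = pd.DataFrame({
    "team": ["A", "A", "B", "B"],
    "points": [10, 20, 5, 15],
})

# Named aggregation: each keyword becomes an output column,
# defined as a (source column, aggregation function) pair.
summary = df.groupby("team").agg(
    total_points=("points", "sum"),
    mean_points=("points", "mean"),
)

# transform() returns a result aligned with the original rows,
# here the group mean repeated for every member of the group.
df["team_mean"] = df.groupby("team")["points"].transform("mean")
```

agg() reduces each group to one row, while transform() keeps the original shape, which is what makes it suitable for tasks such as filling missing values with group-specific values.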

Section 14: Reshaping and Pivoting DataFrames

Lecture 154 Intro

Lecture 155 Transposing Rows and Columns

Lecture 156 Pivoting DataFrames with pivot()

Lecture 157 Limits of pivot()

Lecture 158 pivot_table()

Lecture 159 pd.crosstab()

Lecture 160 melting DataFrames with melt()

Lecture 161 Coding Exercise 14

Section 15: Data Preparation and Feature Creation

Lecture 162 Intro

Lecture 163 Arithmetic Operations (Part 1)

Lecture 164 Arithmetic Operations (Part 2)

Lecture 165 Transformation/Mapping with map()

Lecture 166 Conditional Transformation

Lecture 167 Discretization and Binning with pd.cut() (Part 1)

Lecture 168 Discretization and Binning with pd.cut() (Part 2)

Lecture 169 Discretization and Binning with pd.qcut()

Lecture 170 Floors and Caps

Lecture 171 Scaling / Standardization

Lecture 172 Creating Dummy Variables

Lecture 173 String Operations

Lecture 174 Coding Exercise 15
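
Several of the feature-creation steps listed in this section (binning, floors and caps, dummy variables) can be sketched in a few lines. The data, bin edges, labels, and limits below are hypothetical assumptions for illustration only.

```python
import pandas as pd

# Hypothetical toy data, only for illustration.
df = pd.DataFrame({"age": [5, 17, 30, 64, 90],
                   "sex": ["f", "m", "f", "m", "f"]})

# pd.cut(): bins with fixed edges; intervals are right-inclusive by default.
df["age_group"] = pd.cut(df["age"], bins=[0, 17, 64, 120],
                         labels=["child", "adult", "senior"])

# pd.qcut(): bins defined by quantiles (here split at the median).
df["age_half"] = pd.qcut(df["age"], q=2, labels=["lower", "upper"])

# Floors and caps: limit values to the range 10..80.
df["age_capped"] = df["age"].clip(lower=10, upper=80)

# Dummy variables: one indicator column per category.
dummies = pd.get_dummies(df["sex"], prefix="sex")
```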

Section 16: Advanced Visualization with Seaborn

Lecture 175 Intro

Lecture 176 First Steps in Seaborn

Lecture 177 Categorical Plots

Lecture 178 Joint Plots / Regression Plots

Lecture 179 Matrixplots / Heatmaps

Lecture 180 Coding Exercise 16


Lecture 181 Intro and Downloads

Section 18: Data Manipulation and Aggregation Challenge (Olympic Medal Tables)

Lecture 182 Olympic Medal Tables (Instruction & Hints)

Lecture 183 Olympic Medal Tables (Solution Part 1)

Lecture 184 Olympic Medal Tables (Solution Part 2)

Lecture 185 Olympic Medal Tables (Solution Part 3)

Section 19: Explanatory Data Analysis Challenge

Lecture 186 Challenge Introduction and Overview

Lecture 187 Merging and Concatenating (Solution Part 1)

Lecture 188 Data Cleaning 1 (Solution Part 2)

Lecture 189 Data Cleaning 2 (Solution Part 3)

Lecture 190 The most successful Countries (Solution Part 4)

Lecture 191 Impact of GDP, Population and Politics (Solution Part 5)

Lecture 192 Statistical Analysis and Hypothesis Testing (Solution Part 6)

Lecture 193 Aggregating and Ranking (Solution Part 7)

Lecture 194 Summer Games vs. Winter Games – does Location matter? (Solution Part 8)

Lecture 195 Men vs. Women – do Culture & Religion matter? (Solution Part 9)

Lecture 196 National Sports and Traditions (Solution Part 10)


Lecture 197 Welcome to PART 4: Finance and Investments with Pandas

Lecture 198 Download: Part 4 Course Materials

Section 21: Time Series Basics

Lecture 199 Importing Time Series Data from csv-files

Lecture 200 Converting strings to datetime objects with pd.to_datetime()

Lecture 201 Initial Analysis / Visualization of Time Series

Lecture 202 Indexing and Slicing Time Series

Lecture 203 Creating a customized DatetimeIndex with pd.date_range()

Lecture 204 More on pd.date_range()

Lecture 205 Downsampling Time Series with resample() (Part 1)

Lecture 206 Downsampling Time Series with resample (Part 2)

Lecture 207 The PeriodIndex object

Lecture 208 Advanced Indexing with reindex()

Section 22: Pandas for Finance and Investing

Lecture 209 Intro

Lecture 210 Getting Ready (Installing required package)

Lecture 211 Importing Stock Price Data from Yahoo Finance (it still works!)

Lecture 212 Initial Inspection and Visualization

Lecture 213 Normalizing Time Series to a Base Value (100)

Lecture 214 The shift() method

Lecture 215 The methods diff() and pct_change()

Lecture 216 Measuring Stock Performance with MEAN Returns and STD of Returns

Lecture 217 Financial Time Series – Return and Risk

Lecture 218 Financial Time Series – Covariance and Correlation

Lecture 219 Helpful DatetimeIndex Attributes and Methods

Lecture 220 Filling NA Values with bfill, ffill and interpolation

Lecture 221 Coding Exercise 17
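
The time series methods named in this section (normalizing to a base value of 100, shift(), diff(), pct_change(), mean and standard deviation of returns) can be sketched as follows. The prices are made-up values, not real market data, and the variable names are illustrative assumptions.

```python
import pandas as pd

# Hypothetical closing prices, only for illustration.
prices = pd.Series(
    [100.0, 110.0, 99.0],
    index=pd.date_range("2022-01-03", periods=3, freq="D"),
    name="close",
)

# Normalize to a base value of 100 at the first observation.
normalized = prices / prices.iloc[0] * 100

# Simple returns: pct_change() equals price / previous price - 1.
returns = prices.pct_change()
same_returns = prices / prices.shift(1) - 1

# Absolute change from one observation to the next.
changes = prices.diff()

# Mean and standard deviation of returns as basic reward and risk measures.
mean_return = returns.mean()
risk = returns.std()
```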


Lecture 222 Overview & Downloads

Section 24: Introduction to Regression and Classification

Lecture 223 Machine Learning – an Overview

Lecture 224 Linear Regression with scikit-learn – a simple Introduction

Lecture 225 Making Predictions with Linear Regression

Lecture 226 Overfitting

Lecture 227 Underfitting

Lecture 228 Logistic Regression with scikit-learn – a simple Introduction (Part 1)

Lecture 229 Logistic Regression with scikit-learn – a simple Introduction (Part 2)

Section 25: BONUS: Machine Learning Project A-Z (Regression)

Lecture 230 Project Intro

Lecture 231 Importing the Dataset and first Inspection

Lecture 232 Cleaning the Data and Creating more Features

Lecture 233 Explanatory Data Analysis (Part 1)

Lecture 234 Explanatory Data Analysis (Part 2)

Lecture 235 Feature Engineering (Part 1)

Lecture 236 Feature Engineering (Part 2)

Lecture 237 Splitting the Data into Training Set and Test Set

Lecture 238 Training the Machine Learning Model

Lecture 239 Testing/Evaluating the Model with the Test Set

Lecture 240 Feature Importance


Lecture 241 Intro and Overview

Lecture 242 How to update Pandas to Version 1.0

Lecture 243 Downloads for this Section

Lecture 244 Important Recap: Pandas Display Options (Changed in Version 0.25)

Lecture 245 Info() method – new and extended output

Lecture 246 NEW Extension dtypes (“nullable” dtypes): Why do we need them?

Lecture 247 Creating the NEW extension dtypes with convert_dtypes()

Lecture 248 NEW pd.NA value for missing values

Lecture 249 The NEW “nullable” Int64Dtype

Lecture 250 The NEW StringDtype

Lecture 251 The NEW “nullable” BooleanDtype

Lecture 252 Addition of the ignore_index parameter

Lecture 253 Removal of prior Version Deprecations
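
The motivation for the nullable extension dtypes and the ignore_index parameter covered in this section can be sketched as below. The example assumes pandas 1.0 or later and uses hypothetical toy data.

```python
import pandas as pd

# Hypothetical toy data with missing values, only for illustration.
df = pd.DataFrame({
    "count": [1, None, 3],
    "label": ["a", None, "c"],
    "flag": [True, False, None],
})

# With default dtypes, the missing value turns the integer column into float64.
default_count_dtype = df["count"].dtype

# convert_dtypes() switches to extension dtypes that support pd.NA.
converted = df.convert_dtypes()

# The count column now holds integers, with the gap marked as missing.
missing_mask = converted["count"].isna().tolist()

# ignore_index=True renumbers the result 0..n-1 after sorting.
ranked = converted.sort_values("count", ascending=False, ignore_index=True)
```

Without the nullable Int64 dtype, a single missing value forces an integer column to float, which is the problem these dtypes were introduced to address.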


Lecture 254 Welcome to the Appendix

Section 28: Python Basics

Lecture 255 Downloads

Lecture 256 Intro

Lecture 257 First Steps

Lecture 258 Variables

Lecture 259 Data Types: Integers and Floats

Lecture 260 Data Types: Strings

Lecture 261 Data Types: Lists (Part 1)

Lecture 262 Data Types: Lists (Part 2)

Lecture 263 Data Types: Tuples

Lecture 264 Data Types: Sets

Lecture 265 Operators & Booleans

Lecture 266 Conditional Statements (if, elif, else, while)

Lecture 267 For Loops

Lecture 268 Keywords break, pass, continue

Lecture 269 Generating Random Numbers

Lecture 270 User Defined Functions (Part 1)

Lecture 271 User Defined Functions (Part 2)

Lecture 272 User Defined Functions (Part 3)

Lecture 273 Visualization with Matplotlib

Lecture 274 Python Basics Quiz: Solution

Section 29: The Numpy Package

Lecture 275 Downloads

Lecture 276 Introduction to Numpy Arrays

Lecture 277 Numpy Arrays: Vectorization

Lecture 278 Numpy Arrays: Indexing and Slicing

Lecture 279 Numpy Arrays: Shape and Dimensions

Lecture 280 Numpy Arrays: Indexing and Slicing of multi-dimensional Arrays

Lecture 281 Numpy Arrays: Boolean Indexing

Lecture 282 Generating Random Numbers

Lecture 283 Performance Issues

Lecture 284 Case Study: Numpy vs. Python Standard Library

Lecture 285 Summary Statistics

Lecture 286 Visualization and (Linear) Regression

Lecture 287 Numpy Quiz: Solution

Section 30: Statistical Concepts

Lecture 288 Statistics – Overview, Terms and Vocabulary

Lecture 289 Downloads for this Section

Lecture 290 Population vs. Sample

Lecture 291 Visualizing Frequency Distributions with plt.hist()

Lecture 292 Relative and Cumulative Frequencies with plt.hist()

Lecture 293 Measures of Central Tendency (Theory)

Lecture 294 Coding Measures of Central Tendency – Mean and Median

Lecture 295 Coding Measures of Central Tendency – Geometric Mean

Lecture 296 Variability around the Central Tendency / Dispersion (Theory)

Lecture 297 Minimum, Maximum and Range with Python/Numpy

Lecture 298 Percentiles with Python/Numpy

Lecture 299 Variance and Standard Deviation with Python/Numpy

Lecture 300 Skew and Kurtosis (Theory)

Lecture 301 How to calculate Skew and Kurtosis with scipy.stats

Lecture 302 How to generate Random Numbers with Numpy

Lecture 303 Reproducibility with np.random.seed()

Lecture 304 Probability Distributions – Overview

Lecture 305 Discrete Uniform Distributions

Lecture 306 Continuous Uniform Distributions

Lecture 307 The Normal Distribution (Theory)

Lecture 308 Creating a normally distributed Random Variable

Lecture 309 Normal Distribution – Probability Density Function (pdf) with scipy.stats

Lecture 310 Normal Distribution – Cumulative Distribution Function (cdf) with scipy.stats

Lecture 311 The Standard Normal Distribution and Z-Values

Lecture 312 Properties of the Standard Normal Distribution (Theory)

Lecture 313 Probabilities and Z-Values with scipy.stats

Lecture 314 Confidence Intervals with scipy.stats

Lecture 315 Covariance and Correlation Coefficient (Theory)

Lecture 316 Cleaning and preparing the Data – Movies Database (Part 1)

Lecture 317 Cleaning and preparing the Data – Movies Database (Part 2)

Lecture 318 How to calculate Covariance and Correlation in Python

Lecture 319 Correlation and Scatterplots – visual Interpretation

Lecture 320 What is Linear Regression? (Theory)

Lecture 321 A simple Linear Regression Model with numpy & Scipy

Lecture 322 How to interpret Intercept and Slope Coefficient

Lecture 323 Case Study (Part 1): The Market Model (Single Factor Model)

Lecture 324 Case Study (Part 2): The Market Model (Single Factor Model)

Section 31: Download .py files

Lecture 325 Parts 1 & 2 .py files

Section 32: What's next? (outlook and additional resources)

Lecture 326 Bonus Lecture

Who this course is for:
Everyone who wants to step into Data Science. Pandas is key to everything.
Data Scientists who want to improve their Data Handling/Manipulation skills.
Everyone who wants to switch Data Projects from Excel to more powerful tools (e.g. in Research/Science).
Investment/Finance Professionals who have reached the limits of Excel.

Course Information:

Udemy | English | 33h 54m | 12.06 GB
Created by: Alexander Hagmann
