Web Scraping 101 with Python3 using REQUESTS, LXML & SPLASH
What you’ll learn
LXML core fundamentals
XPath & CSS selectors
How to send HTTP requests with Python
Scraping HTML web pages
Scraping multiple pages using recursion
Scraping APIs
Splash HTTP API
Scraping JavaScript websites using Splash
Authentication and Login to websites using Requests
Web scraping best practices
Building datasets
Requirements
Basic level of Python
PC with Internet connection
Description
What is web scraping?
Let's say your boss gives you a task: extract about 1,000 products from a website, structure the data and save it to a database. Would you copy and paste every product's details by hand, from the product name to the URL and the price? You could work day and night and still not finish, and this is where web scraping shines. Web scraping, also called web harvesting or web data extraction, means writing a script that automates data extraction from websites, so the job takes minutes.
Why learn web scraping?
Whether you're a data analyst, a web developer or someone who wants to work as a freelancer, you should learn web scraping. For a data analyst, building a dataset is extremely important, and web scraping is often the only practical way to generate one. Web scraping is also a plus on your resume. It can be used in a variety of fields, so here are some examples of what you can do with it:
Generate leads
Drop shipping, where you constantly scrape products from different online stores and showcase them on your website to make money
Monitor product prices to get the best deals
Automation
Machine learning
Web scraping freelancing
Of course, there are many more fields where web scraping can be extremely beneficial.
Is this course the right one for you?
I've carefully planned and designed this course to be beginner friendly. From my experience, most people who do web scraping are data analysts with no background knowledge of how the web works, how requests are made, or how to locate and parse data from the web. This is also the most up-to-date course in terms of the material included and the tools used. In this course I'll introduce you to the most used web scraping tools and frameworks:
We will set up the development environment from scratch
You will learn and understand LXML core fundamentals
How to use XPath & CSS selectors to select data from a web page
How the web works (Request/Response)
How to scrape simple HTML web pages
How to scrape multiple web pages
Extract data from APIs
You will learn Splash (crash course) so you can use it to scrape JavaScript websites
Authentication/Login
Store the extracted data in JSON/CSV files or in MongoDB/SQLite3
Exclusive tips and tricks regarding web scraping
Finally, this course is project based. In each section, starting from the 2nd one, we will experiment with a different website. Each project has its own degree of difficulty and is completely independent of the other projects.
Are there any assignments/exercises included in this course?
Yes, each section includes an assignment. This will help you get your hands dirty, and after completing the assignment at the end of each section you will feel more confident and comfortable with web scraping.
Why LXML and not BeautifulSoup?
LXML is a lightweight HTML parser; even the most popular web scraping framework (Scrapy) is built on top of LXML. BeautifulSoup exposes more functions, that's true. However, in web scraping we mostly use XPath and CSS selectors to navigate the HTML page (tree) and select what to scrape, so there is no need to spend time familiarizing yourself with the BeautifulSoup API and its internal architecture. In addition, LXML generally performs better than BeautifulSoup.
Who is your instructor?
Hi! I'm Ahmed, nice to meet you. My students like to call me the web scraping Ninja, and I have taught more than 2,000 students around the world how to do web scraping. I do web scraping on a daily basis, whether for fun, for personal projects or as a freelancer, and guess what? I even have a master's degree in computer science.
Should I enroll in this course?
Honestly, by enrolling in this course you have nothing to lose: if the course doesn't meet your requirements, you can ask for a refund within 30 days of enrolling, guaranteed by Udemy, with no questions asked.
So if you don't know anything about web scraping and you don't know where to start, enroll now! 🙂
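To give a feel for the XPath-based selection mentioned in the LXML versus BeautifulSoup discussion above, here is a minimal sketch. It is not taken from the course material: the HTML snippet, the class names and the extract_products helper are all made up for illustration, and the page is inlined so nothing is fetched from the network. It only assumes that the lxml package is installed.

```python
from lxml import html

# Hypothetical product listing, inlined so the sketch runs without a network call.
PAGE = """
<html><body>
  <ul id="products">
    <li class="product"><a href="/p/1">Keyboard</a><span class="price">$25.00</span></li>
    <li class="product"><a href="/p/2">Mouse</a><span class="price">$10.50</span></li>
  </ul>
</body></html>
"""


def extract_products(page_source):
    """Return one dict (name, url, price) per product found in the page."""
    tree = html.fromstring(page_source)  # parse the HTML string into an element tree
    products = []
    # Select every <li class="product">, then use relative XPath inside each item.
    for item in tree.xpath('//li[@class="product"]'):
        name = item.xpath('./a/text()')[0].strip()
        url = item.xpath('./a/@href')[0]
        price_text = item.xpath('./span[@class="price"]/text()')[0]
        products.append({
            "name": name,
            "url": url,
            "price": float(price_text.lstrip("$")),  # "$25.00" becomes 25.0
        })
    return products


if __name__ == "__main__":
    for product in extract_products(PAGE):
        print(product)
```

On a real site the page source would typically come from an HTTP response body instead of an inline string, and the XPath expressions would have to match that site's markup.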
Overview
Section 1: Getting Started
Lecture 1 Course Introduction
Lecture 2 Web Scraping tools
Lecture 3 Setting up the development environment
Lecture 4 Udemy 101 (OPTIONAL)
Lecture 5 How to Ask questions (Please don’t skip)
Section 2: LXML core fundamentals
Lecture 6 Section Info
Lecture 7 ElementTree object
Lecture 8 Element object
Lecture 9 Introduction to LXML with XPath
Lecture 10 Introduction to LXML with CSS Selectors
Lecture 11 Source code lecture by lecture
Section 3: XPath & CSS Selectors
Lecture 12 Section Info
Lecture 13 What is XPath & CSS
Lecture 14 CSS Selectors fundamentals
Lecture 15 CSS selectors in theory
Lecture 16 XPath fundamentals
Lecture 17 Navigating using XPath (Going UP)
Lecture 18 Navigating using XPath (Going DOWN)
Lecture 19 XPath in theory
Section 4: HTTP Requests with Python
Lecture 20 Section Info
Lecture 21 How the web works
Lecture 22 Python Requests
Lecture 23 Request/Response headers
Section 5: Project 1: Simple & Clean
Lecture 24 Section Info
Lecture 25 Locating the data
Lecture 26 Building the Scraper
Lecture 27 Cleaning the data
Lecture 28 Writing data to JSON/CSV files
Lecture 29 Turning it into a command line app
Lecture 30 Project 1 source code
Section 6: Project 2: Recursion
Lecture 31 Section Info
Lecture 32 Getting rid of unnecessary JavaScript
Lecture 33 Scraping Data
Lecture 34 Scraping multiple pages (Recursion)
Lecture 35 Storing the data in MongoDb cloud
Lecture 36 Prevent storing same records and updating records
Lecture 37 CoinMarketCap update
Lecture 38 Project 2 source code
Section 7: Project 3: APIs
Lecture 39 Section Info
Lecture 40 API/HTML What’s the difference ?
Lecture 41 Generating code using Postman
Lecture 42 Parsing APIs
Lecture 43 Recursion challenge
Lecture 44 Challenge solution (Scraping APIs recursively)
Lecture 45 Inserting data into SQLite3 database
Lecture 46 Project 3 source code
Section 8: Splash crash course
Lecture 47 Section Info
Lecture 48 What is Splash ?
Lecture 49 Setting up Splash
Lecture 50 Intro to Splash
Lecture 51 Selecting Elements, filling Inputs and clicking on Buttons
Lecture 52 Splash Request & Response headers
Lecture 53 Very useful resource (SPLASH FAQ)
Section 9: Project 4: Scraping JavaScript websites using Splash, Requests and LXML
Lecture 54 Section Info
Lecture 55 Splash private mode and cookies
Lecture 56 Quick note (Splash private mode)
Lecture 57 Using Splash with Requests
Lecture 58 Parsing the Response
Lecture 59 Project 4 source code
Section 10: Project 5: Authentication/Login
Lecture 60 Section Info
Lecture 61 Browser authentication
Lecture 62 Requests authentication
Lecture 63 Parse and clean HTML
Lecture 64 Project 5 source code
Section 11: BONUS
Lecture 65 Bonus lecture
Who this course is for
Anyone who wants to learn web scraping using Python, Requests and LXML
Anyone who wants to learn how to use Splash to scrape JavaScript websites
Complete beginners with no background in web scraping
Those who already have basic familiarity with web scraping and want to fill the gaps
Course Information:
Udemy | English | 4h 28m | 2.21 GB
Created by: Ahmed Rafik