What you will learn

  • Fundamental concepts of web scraping

  • Ethics and legality of web scraping

  • Common data scraping applications and ways to monetise them

  • Dealing with scraping countermeasures to scrape sites/apps that don't want to be scraped

  • Helpful tools for web and API scraper development: Chrome DevTools, curl, mitmproxy and others

  • Fetching and parsing HTML

  • Data extraction with XPath queries, CSS selectors and regular expressions

  • Reverse engineering the private APIs of web and mobile apps for data scraping purposes (API scraping)

  • How to get serious with the Scrapy framework

  • What Selenium is and why you don't need it most of the time

Who is this course for?

For anyone who knows the basics of Python and would benefit from scraping some data...

  • Web developers

  • Tech entrepreneurs

  • Security professionals

  • Ecommerce merchants

  • Data analysts and scientists

  • OSINT investigators

  • Digital marketers and growth hackers

Data is the new oil - learn to extract it

This course introduces the tools and techniques for scraping data, even from sites that use anti-bot technologies. Learn to gather data automatically with Python libraries such as requests, BeautifulSoup, lxml and Scrapy.
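To give a feel for the fetch-and-parse workflow, here is a minimal sketch rather than course material. It deliberately swaps in Python's standard library (`urllib.request` and `html.parser`) for requests and BeautifulSoup so it runs without installing anything. The names `LinkExtractor`, `fetch_html`, `extract_links`, the user agent string and the sample HTML are all assumptions made up for illustration.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class LinkExtractor(HTMLParser):
    """Collect (href, text) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag we are currently inside
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def fetch_html(url, user_agent="example-scraper/0.1"):
    """Download a page and return its body as text (hypothetical helper, not called below)."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_links(html):
    """Parse an HTML string and return the links it contains."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up sample page, used instead of a live request.
SAMPLE_HTML = """
<html><body>
  <ul>
    <li><a href="/item/1">First item</a></li>
    <li><a href="/item/2">Second <b>item</b></a></li>
  </ul>
</body></html>
"""

links = extract_links(SAMPLE_HTML)
for href, text in links:
    print(href, text)
```

In a real scraper you would pass the result of `fetch_html(url)` to `extract_links`. A library such as BeautifulSoup or lxml would usually replace the hand-written parser class.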

Introducing API Scraping

You don't always need to parse HTML. Sometimes the data is available in structured form through a private API, which makes extraction much easier. We will show you how.
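As a rough illustration of why structured responses are easier to work with, the sketch below builds a paginated request URL and flattens a JSON body into rows. Everything specific here is an assumption: the endpoint `https://api.example.com/v1/products`, the `page` and `per_page` parameters, and the response shape (`items`, `next_page`) are hypothetical and do not describe any real API.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
API_BASE = "https://api.example.com/v1/products"


def build_page_url(page, per_page=2):
    """Return the URL for one page of results (parameter names are assumed)."""
    return f"{API_BASE}?{urlencode({'page': page, 'per_page': per_page})}"


def parse_products(body):
    """Flatten a JSON response body into rows and return the next page number, if any."""
    payload = json.loads(body)
    rows = [
        {
            "id": item["id"],
            "name": item["name"],
            "price": item["price"]["amount"],
        }
        for item in payload["items"]
    ]
    return rows, payload.get("next_page")


# Made-up response body standing in for what such an API might return.
SAMPLE_RESPONSE = """
{
  "items": [
    {"id": 1, "name": "Runner", "price": {"amount": 120, "currency": "USD"}},
    {"id": 2, "name": "Trainer", "price": {"amount": 95, "currency": "USD"}}
  ],
  "next_page": 2
}
"""

url = build_page_url(1)
rows, next_page = parse_products(SAMPLE_RESPONSE)
print(url)
print(rows, next_page)
```

No selectors or regular expressions are needed: once the request is reproduced, the fields come back already named and typed.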

Scrape mobile app data as well!

We introduce tools and techniques for reverse engineering a mobile app's network traffic so that you can reproduce its requests for automation and data extraction.

Case studies

Three real-world examples of scraping projects that you may be doing as a freelancer.

  • Intro to web

    Get acquainted with HTTP, HTML, REST, JSON and XML

  • Tools of the trade

    Go beyond the GUI with curl, wget, mitmproxy and other tools

  • Scraping the web

    Learn how to traverse pages and extract information

  • Scraping the APIs

    Discover and reverse engineer hidden Application Programming Interfaces for data extraction from web and mobile apps

  • Scrapy framework

    Level up your web scraping for scale and reliability

  • Counter the countermeasures

    Scrape sites and apps that don't want to be scraped

Course curriculum

  1. 2
  2. 3
  3. 4
    • 01. Fetching HTML with Python

      FREE PREVIEW
    • 02. Parsing HTML with BeautifulSoup

    • 03. Parsing HTML with lxml

    • 04. XPath

    • 05. Traversing the pages

    • 06. Regular expressions

    • 07. Using pandas to parse HTML tables

    • 08. Using js2xml to get data from JavaScript

    • 09. Leveraging JSON inside HTML pages

    • 10. Using CSS selectors for scraping

    • 11. Programmatic browser control with Selenium

    • Resources

  4. 5
  5. 6
    • 01. Scraping public APIs

      FREE PREVIEW
    • 02. Discover hidden web APIs

    • 03. Scraping hidden web APIs

    • 04. Setting up mitmproxy with an iOS device

    • 05. Setting up mitmproxy with an Android device

    • 06. Scraping the private API of a mobile app

    • Resources

  6. 7
  7. 8
    • 01. Introducing email harvesting through Google

    • 02. Harvesting emails from Google SERP

    • 03. Harvesting emails from LinkedIn profiles

  8. 9
    • 01. Introducing Zillow FSBO scraping

    • 02. Implementing the Scrapy project

  9. 10
    • 01. Introducing GOAT API scraping

    • 02. Scraping sneaker data from GOAT API

  10. 11
    • 01. Summary

    • 02. Homework

    • 03. Further learning

    • Resources

Email [email protected] for any inquiries.