An introduction to Web Scraping with Python and Azure Functions.

pyladiesams/web-scraping-beginner-may2021

Level: Beginner

Presentation: Presentation slides

Workshop description

During the workshop you will learn how to implement a web scraper using Scrapy, store its output in Azure Blob Storage, and use an Azure Function to generate a word cloud from the scraped text.
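The last step of that pipeline, turning scraped text into a word cloud, rests on counting how often each word occurs. The following is a minimal standard-library sketch of that counting step only, not the workshop's actual function: the name `word_frequencies`, the regex tokenization, and the tiny `STOP_WORDS` set are all assumptions made for illustration, and the sample text is invented.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "and", "the", "of", "with", "in", "is"}


def word_frequencies(text: str, top_n: int = 5) -> list:
    """Return the top_n most common words, ignoring case and stop words."""
    # Lowercase the text and keep runs of letters/apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)


# Invented sample text standing in for scraped output.
sample = "Pasta with tomato and basil. Pasta with pesto. Tomato soup."
print(word_frequencies(sample, top_n=2))  # [('pasta', 2), ('tomato', 2)]
```

A rendering library would typically take frequencies like these and scale each word's size accordingly.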

Requirements

Python version: 3.8.5

You can check the Python libraries required to run this project here

Tools:

Usage

  • Clone the repository
  • Start Visual Studio Code and navigate to the solutions folder

scrapy_workshop_pyladies

To put our spider to work, go to the project’s top-level directory and run:

scrapy crawl cuisines

wordcloud_azure_function

BlobTrigger - Python

The BlobTrigger lets a function react to new blobs in Azure Blob Storage. This sample demonstrates a simple use case: processing the data from a given blob using Python.

How it works

A BlobTrigger requires a path that specifies where the blobs are located inside your container; the path can also restrict which blobs fire the trigger. For instance, setting the path to samples/{name}.png limits the trigger to blobs in the samples path whose names end in ".png".
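To make the path pattern concrete, here is a rough sketch of how a pattern such as samples/{name}.png could be checked against blob paths. It is an illustration of the idea only, not the Azure Functions runtime's matcher: the helper `match_blob_path` is hypothetical, and its rule that a placeholder matches any non-empty text is a simplifying assumption.

```python
import re
from typing import Optional


def match_blob_path(pattern: str, blob_path: str) -> Optional[dict]:
    """Return captured placeholder values if blob_path fits pattern, else None.

    Hypothetical helper for illustration; not part of any Azure SDK.
    """
    regex_parts = []
    # Split the pattern into literal text and {placeholder} tokens.
    for token in re.split(r"(\{[A-Za-z_]\w*\})", pattern):
        placeholder = re.fullmatch(r"\{([A-Za-z_]\w*)\}", token)
        if placeholder:
            # Assumption: a placeholder matches any non-empty text.
            regex_parts.append(f"(?P<{placeholder.group(1)}>.+)")
        else:
            # Literal parts (such as "samples/" or ".png") must match exactly.
            regex_parts.append(re.escape(token))
    match = re.fullmatch("".join(regex_parts), blob_path)
    return match.groupdict() if match else None


print(match_blob_path("samples/{name}.png", "samples/cat.png"))  # {'name': 'cat'}
print(match_blob_path("samples/{name}.png", "samples/cat.jpg"))  # None
print(match_blob_path("samples/{name}.png", "other/cat.png"))    # None
```

In a real function app the pattern is not evaluated by your own code; it is declared in the trigger's binding configuration, and the runtime decides which blobs invoke the function.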

Learn more

Azure Blob storage trigger for Azure Functions

Video record

Re-watch YouTube stream here

Credits

This workshop was set up by @pyladiesams and @danielamiranda