Web scraping is a method of extracting and restructuring information from web pages. Given the enormous quantity of unstructured data now available on the internet, web scraping is an essential skill for making sense of the world as we enter an information-led epoch. This tutorial introduces the concept and then moves on to basic techniques for web scraping using R, one of the most widely used programming languages for statistical analysis. By the end of the tutorial, participants will be able to harvest unstructured text data from a website of their choosing and then perform simple analytics on the data they have collected. The script is available for download on GitHub.
Web scraping tutorial for beginners