Survey your competitors’ prices directly online with Python Beautiful Soup

datamink
3 min read · Mar 27, 2020

In retail, surveying competitors’ prices is a frequent task that requires a lot of resources. A simple alternative is to extract the prices from their online assortment.

Web scraping performs price survey in minutes

Beautiful Soup Library

Recently, I started working on category management and had to compare the client’s price index with that of its competitors. Usually we do this by physically visiting the competitors’ stores and discreetly surveying their prices, which is time-consuming and, to be honest, very boring.

So I looked for a better alternative: the majority of retailers now sell online, and they apply (almost) the same prices as in their physical stores. So why not take the prices directly from their websites?

The concept was there, but I was afraid because I don’t know much HTML and was expecting a long period of black clouds. Then I quickly discovered Python Beautiful Soup and suddenly the sky became clear!

This library makes it very simple to get started with web scraping: even without much knowledge of HTML, you can simply run it and enjoy the show with popcorn.

You can find the full Beautiful Soup documentation here

Web Scraping

For the purpose of this article I will run the price survey on https://www.carrefouruae.com/; you can then simply adapt it to your needs.

The first step is to identify:

1/ The URL of the page you want to scrape

2/ The class attribute of the product name

3/ The class attribute of the product price

To get the URL, just copy the URL of the first page you want to scrape and identify where the page number is located:

https://www.carrefouruae.com/mafuae/en/food-cupboard/c/F1700000?&qsort=relevance&pg=0

Here, luckily, the page number is located at the end of the URL (the pg parameter), so you can simply iterate over the page numbers you want to scrape.
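The iteration over page numbers can be sketched as below. This is only an illustration: the helper name `page_urls` is mine, and it assumes the listing keeps the same URL pattern as the example above, with pages counted from 0.

```python
# Base listing URL copied from the example above, without the page number.
BASE_URL = "https://www.carrefouruae.com/mafuae/en/food-cupboard/c/F1700000"


def page_urls(n_pages, base_url=BASE_URL):
    """Return the listing URLs for pages 0 .. n_pages - 1 (hypothetical helper)."""
    return [f"{base_url}?&qsort=relevance&pg={page}" for page in range(n_pages)]


# Print the first three page URLs to check the pattern by eye.
for url in page_urls(3):
    print(url)
```

If the page number sat in the middle of the URL instead, the same f-string approach would still work, only with the placeholder moved.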

To get the class attribute, go to the page, right-click on the product name and choose “Inspect”:

Right-click, then “Inspect”

The developer tools will open with the HTML tree on the right side; you now have to find the class attribute:

In this case it is ('p', {'class': 'comp-productcard__name'})

Do the same for the product price and get its class attribute: ('p', {'class': 'comp-productcard__price'})
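To see what these two selectors do, here is a small sketch that runs Beautiful Soup's `find_all` on a made-up HTML fragment. The fragment and the product values in it are invented for illustration; only the tag and class names come from the inspection step above, and the real page markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML imitating two product cards (invented data, not from the site).
SAMPLE_HTML = """
<div class="comp-productcard">
  <p class="comp-productcard__name">Basmati Rice 5kg</p>
  <p class="comp-productcard__price">AED 32.50</p>
</div>
<div class="comp-productcard">
  <p class="comp-productcard__name">Olive Oil 1L</p>
  <p class="comp-productcard__price">AED 24.00</p>
</div>
"""


def extract_products(html):
    """Return (name, price) text pairs found with the two class selectors."""
    soup = BeautifulSoup(html, "html.parser")
    names = soup.find_all("p", {"class": "comp-productcard__name"})
    prices = soup.find_all("p", {"class": "comp-productcard__price"})
    # Pair names and prices by position; this assumes each card has both.
    return [(n.get_text(strip=True), p.get_text(strip=True))
            for n, p in zip(names, prices)]


print(extract_products(SAMPLE_HTML))
```

Pairing by position is the simplest option, but it would misalign if a card had a name without a price; looping over each card container and reading both fields inside it would be more robust.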

Code:
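A minimal sketch of such a script, under a few assumptions: the URL pattern and the two class names are the ones identified in the steps above and may no longer match the live site, the function names (`fetch_html`, `parse_products`, `survey`, `save_csv`) and the `User-Agent` header are my own choices, and pages are downloaded with the standard library's `urllib.request` rather than any particular HTTP library.

```python
import csv
import time
from urllib.request import Request, urlopen

from bs4 import BeautifulSoup

# URL pattern from the article; {page} is replaced by the page number.
URL_PATTERN = ("https://www.carrefouruae.com/mafuae/en/food-cupboard/c/F1700000"
               "?&qsort=relevance&pg={page}")


def fetch_html(url):
    """Download one page. The User-Agent value is an assumption, not a site requirement."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def parse_products(html):
    """Extract (name, price) text pairs using the two class attributes."""
    soup = BeautifulSoup(html, "html.parser")
    names = soup.find_all("p", {"class": "comp-productcard__name"})
    prices = soup.find_all("p", {"class": "comp-productcard__price"})
    return [(n.get_text(strip=True), p.get_text(strip=True))
            for n, p in zip(names, prices)]


def survey(n_pages, fetch=fetch_html, delay=1.0):
    """Scrape pages 0 .. n_pages - 1 and return all (name, price) rows."""
    rows = []
    for page in range(n_pages):
        rows.extend(parse_products(fetch(URL_PATTERN.format(page=page))))
        time.sleep(delay)  # small pause between requests to stay polite
    return rows


def save_csv(rows, path):
    """Write the survey rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["product", "price"])
        writer.writerows(rows)
```

Calling `save_csv(survey(3), "price_survey.csv")` would then scrape the first three pages into a CSV file. The `fetch` parameter is there so the parsing can be tried on saved HTML without hitting the website. If the site renders its product cards with JavaScript, the downloaded HTML may not contain them and this approach would return an empty list.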

Output:

That’s it: pretty easy with the Beautiful Soup library!

You can extract more information if you want, or even the product pictures if you are at the stage of building your own online business.

The next job is to match the price survey with your own assortment. As you don’t have the barcodes, you need to match on the product description (using a tf-idf matrix with cosine similarity), which will be described soon in another story.
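Just to give the idea in the meantime, here is a rough standard-library sketch of tf-idf weighting with cosine similarity. It is not the method of the upcoming story: the whitespace tokenizer, the smoothed idf formula and the function names are all my own assumptions, and the product descriptions in the usage lines are invented.

```python
import math
from collections import Counter


def tokenize(text):
    """Very naive tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def tfidf_vectors(documents):
    """Return one sparse {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed idf (an assumption): terms found everywhere keep a small weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return [{t: c * idf[t] for t, c in Counter(tokens).items()}
            for tokens in tokenized]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_matches(own, competitor):
    """For each own description, return (own, closest competitor, score)."""
    vectors = tfidf_vectors(own + competitor)
    own_vecs, comp_vecs = vectors[:len(own)], vectors[len(own):]
    matches = []
    for name, vec in zip(own, own_vecs):
        scores = [cosine(vec, c) for c in comp_vecs]
        best = max(range(len(scores)), key=scores.__getitem__)
        matches.append((name, competitor[best], scores[best]))
    return matches


# Invented descriptions, only to show the call.
own_items = ["basmati rice 5kg", "extra virgin olive oil 1l"]
scraped_items = ["olive oil extra virgin 1 l bottle", "rice basmati 5kg bag"]
for own_name, match, score in best_matches(own_items, scraped_items):
    print(own_name, "->", match, round(score, 2))
```

In practice a low best score should be treated as "no match" rather than accepted blindly, so a threshold would need to be tuned on real descriptions.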
