How to Scrape Product Data from Google Shopping
Originally posted on https://medium.com/serpapi/how-to-scrape-product-data-from-google-shopping-f1193abd5dc3
Google exposes a wealth of data, and much of it is useful once it has been collected and formatted for its intended use. This tutorial explores using SerpAPI’s Google Product Results API to scrape Google Shopping, specifically Google’s product pages. You can use this API to scrape a product’s price, description, reviews, title, price comparisons with other online stores, and other product suggestions.
For our tutorial, we will be using the “DeWalt DCD771C2” product throughout.
It’s important to note that SerpAPI has two APIs that target shopping and products: 1) the Google Shopping Results API, and 2) the Google Product Results API.
Google Shopping Results API vs Google Product Results API
Both APIs scrape Google for shopping and product data, but each has its own use. The Google Shopping Results API is used to scrape the search results returned when querying Google Shopping. You can filter results by price, seller, and other parameters specific to that product’s category, provided Google recognizes them. For our example product, the DeWalt drill, these category-specific parameters include battery features, weight, chuck size, and power type (cordless vs corded). The available parameters change depending on the type of product you search for.
Google Shopping SERP results for DeWalt DCD771C2
The Google Product Results API will allow you to scrape the data returned from that particular product’s unique Google product page. Each product box in the Google Shopping search result will direct the user to that product’s page.
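The difference between the two APIs can be sketched as two sets of request parameters. This is a minimal illustration, not SerpAPI's official client: the `engine=google_product` and `product_id` parameters come from the URLs shown later in this tutorial, while the `google_shopping` engine name and its `q` parameter are assumptions based on SerpAPI's usual naming. The helper function names are hypothetical, and a real request would also need your API key.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://serpapi.com/search.json"


def shopping_search_params(query: str) -> dict:
    # Google Shopping Results API: a free-text query that returns a results list.
    # The engine name "google_shopping" and the "q" parameter are assumptions.
    return {"engine": "google_shopping", "q": query}


def product_page_params(product_id: str) -> dict:
    # Google Product Results API: one specific product page, looked up by its ID.
    return {"engine": "google_product", "product_id": product_id}


def to_url(params: dict) -> str:
    # Encode the parameters into a query string (no API key is added here).
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(to_url(shopping_search_params("DeWalt DCD771C2")))
    print(to_url(product_page_params("2478210754218635618")))
```

The first URL asks for a list of search results, and the second asks for a single product page.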
What are Google Product Pages?
A Google product page is a derivative of Google’s Shopping platform, whereby each product has a unique identifying page containing several identifying properties. The best way to think of a product page is to frame it as a landing page for that specific product. A product page may contain its product title, price, rating, reviews, description, specs, features, prices from other online stores, other products from that manufacturer, and similar products from competing manufacturers. Each product listed is identified by a unique id number.
Google Product Page for DeWalt DCD771C2
Google product pages are accessible either by navigating directly to that product’s page, or via a link from Google Shopping’s search results, when searching for that product.
Starting on Google’s Shopping page, type in the query “DeWalt DCD771C2.” We will focus on the first result on the page. Clicking the title expands the product result box to show more details about that product. Note the two links at the bottom of the product box, “Related Items” and “Reviews”: both lead to that product’s unique stand-alone page, as pictured above.
Google Shopping result expanded
A screenshot taken from our documentation illustrates the scrape-able portions of the product pages. A complete breakdown of the page, a list of all available parameters, and an example of the data as returned in JSON format can be found in the documentation at https://serpapi.com/google-product-api.
Scrape-able portions of a Google Product Page — https://serpapi.com/google-product-api
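When exploring which portions of a product page came back in the JSON, it can help to list the top-level sections of the response before reading any individual field. The sketch below makes no claim about the real response schema. The function name is hypothetical, and the sample dictionary uses placeholder keys and values purely for illustration.

```python
def summarize_sections(response: dict) -> dict:
    """Map each top-level key of a parsed JSON response to a short description."""
    summary = {}
    for key, value in response.items():
        if isinstance(value, (list, dict)):
            # Containers: report the type and how many entries they hold.
            summary[key] = f"{type(value).__name__} with {len(value)} entries"
        else:
            # Scalars: report only the type name.
            summary[key] = type(value).__name__
    return summary


if __name__ == "__main__":
    # Placeholder data, NOT a real API response; key names are assumptions.
    sample = {
        "product_results": {"title": "Example title"},
        "sellers_results": [{"name": "Example store"}, {"name": "Another store"}],
    }
    for section, description in summarize_sections(sample).items():
        print(f"{section}: {description}")
```

Running this against a real response gives a quick map of which sections are present for a given product.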
We’re going to use our playground to simulate a search. For those who haven’t used it, the playground is a dashboard that provides a quick and easy way to try our APIs and their corresponding parameters. It returns links to the scraped data in HTML and JSON format. Head over to the playground and make sure you set the search type at the top left corner to Google Product.
Google Product API Playground
The next parameter to address is the Product ID (product_id) search field. This field identifies the product that will be queried and only accepts the unique Product ID assigned by Google. This number is found in the URL of the product’s page, immediately after product/; in this case it is 2478210754218635618.
Google Product ID for DeWalt DCD771C2
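If you have many product page URLs, copying the ID by hand gets tedious. The following is a minimal sketch of pulling out the digits that follow product/ in a URL. The function name is hypothetical, and the sample URL is an assumed shape of a Google product page URL, so adjust the pattern if your URLs look different.

```python
import re
from typing import Optional

# Matches the run of digits immediately after "product/" in a URL.
_PRODUCT_ID_PATTERN = re.compile(r"product/(\d+)")


def extract_product_id(url: str) -> Optional[str]:
    """Return the product ID found after 'product/', or None if there is none."""
    match = _PRODUCT_ID_PATTERN.search(url)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Assumed URL shape, used for illustration only.
    sample_url = "https://www.google.com/shopping/product/2478210754218635618?hl=en"
    print(extract_product_id(sample_url))
```

The ID is kept as a string rather than converted to an integer, since it is only ever used as a request parameter.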
Once all parameters are set, hit search. The API returns two views on the page: the results in HTML and in JSON. The links to these results are available by clicking the “Export To Code” button in the top-right corner. A drop-down box provides links to the HTML and JSON URLs, along with code for your query’s parameters in eight different languages/environments.
Google Product results links/code
You can navigate to your scraped data results by following either the HTML or JSON URLs provided. Here are the URLs to the data just scraped:
JSON — https://serpapi.com/search.json?engine=google_product&product_id=2478210754218635618&google_domain=google.com&gl=us
HTML — https://serpapi.com/search.html?engine=google_product&product_id=2478210754218635618&google_domain=google.com&gl=us
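The same URLs can be assembled in code. The sketch below uses only the Python standard library and the parameters visible in the URLs above. The function names are hypothetical, and the `api_key` parameter is an assumption that does not appear in the URLs shown here: a request made outside the playground will generally need your own key.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen


def build_product_url(product_id: str, api_key: Optional[str] = None,
                      output: str = "json") -> str:
    """Build a Google Product request URL; output is 'json' or 'html'."""
    params = {
        "engine": "google_product",
        "product_id": product_id,
        "google_domain": "google.com",
        "gl": "us",
    }
    if api_key:
        # Assumption: the key is passed as an "api_key" query parameter.
        params["api_key"] = api_key
    return f"https://serpapi.com/search.{output}?{urlencode(params)}"


def fetch_product_json(product_id: str, api_key: str) -> dict:
    """Request the JSON results and parse them into a dictionary."""
    with urlopen(build_product_url(product_id, api_key), timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Prints the JSON URL only; no network request is made here.
    print(build_product_url("2478210754218635618"))
```

Calling `fetch_product_json("2478210754218635618", "YOUR_API_KEY")` with a valid key would perform the request and return the parsed results.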
There you have it — we just used SerpAPI’s Google Product API. The uses and possibilities are endless.