How can we use WSA within our project? Let’s have a look at this quick example, where we scrape an Amazon search results page to find the most expensive graphics card on it. This example is written in JavaScript, but you can do it in any programming language you feel comfortable with.
First, we need to install two packages: got, to make the HTTP request, and jsdom, to parse the result. The final code loads got with require, and got is ESM-only from version 12 onward, so we pin version 11. Run this command in the project’s terminal:
npm install got@11 jsdom
Our next step is to set the parameters necessary to make our request:
const params = {
    api_key: "XXXXXX",
    url: "https://www.amazon.com/s?k=graphic+card"
}
This is how we prepare the request to WebScrapingAPI to scrape the website for us:
const response = await got('https://api.webscrapingapi.com/v1', {searchParams: params})
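To make the request above more concrete, here is a sketch of roughly how the params object ends up in the query string. It uses Node's built-in URL and URLSearchParams classes rather than got itself, so treat it as an illustration of the encoding and not of got's internals.

```javascript
// Same parameters as above; "XXXXXX" is a placeholder for your API key.
const params = {
  api_key: "XXXXXX",
  url: "https://www.amazon.com/s?k=graphic+card"
};

// Build the request URL by hand with Node's built-in URL classes.
const requestUrl = new URL("https://api.webscrapingapi.com/v1");
requestUrl.search = new URLSearchParams(params).toString();

// The target URL is percent-encoded so it travels safely as a single parameter.
console.log(requestUrl.href);
```

Printing the URL like this is also a quick way to check that the target address was encoded correctly before spending an API call on it.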
Now we need to see where each graphics card element is located inside the HTML. Using the browser’s Developer Tools, we found that each element with the class s-result-item contains all the details about one product, but we only need its price.
Inside that element, there is a price container with the class a-price. It holds a nested element with the class a-offscreen, whose text is the price we want to extract.
WebScrapingAPI will return the page in HTML format, so we need to parse it. JSDOM will do the trick.
const {document} = new JSDOM(response.body).window
After sending the request and parsing the received response from WSA, we need to filter the result and extract only what is important for us. From the previous step, we know that the details of each product are in the s-result-item class, so we iterate over them. Inside each element, we check if the price container class a-price exists, and if it does, we extract the price from the a-offscreen element inside it and push it into an array.
Finding out which is the most expensive product should be child’s play now. Just iterate through the array, convert each price string to a number, and keep the highest value.
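One detail to watch when comparing: prices above a thousand dollars usually contain a thousands separator, as in $1,299.99, and parseFloat stops reading at the comma. Below is a minimal sketch of the comparison step that strips those characters first. The helper names parsePrice and mostExpensive and the sample prices are made up for illustration, and the code assumes US-style price formatting.

```javascript
// Hypothetical helper: convert a price string such as "$1,299.99" to a number.
// Assumes US-style formatting (comma as thousands separator, dot as decimal point).
function parsePrice(text) {
  const cleaned = text.replace(/[^0-9.]/g, "");
  const value = parseFloat(cleaned);
  return Number.isNaN(value) ? null : value;
}

// Hypothetical helper: return the largest parsable price, or null if there is none.
function mostExpensive(priceTexts) {
  const values = priceTexts.map(parsePrice).filter((v) => v !== null);
  return values.length > 0 ? Math.max(...values) : null;
}

// Sample strings invented for illustration, not real Amazon data.
const samplePrices = ["$249.99", "$1,299.00", "$89.50"];
console.log(mostExpensive(samplePrices)); // prints 1299
```

Without the cleanup step, parseFloat("1,299.00") would return 1, and the cheapest-looking card would hide the most expensive one.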
Wrap everything in an async function, and the final code should look like this:
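const {JSDOM} = require("jsdom");
// require() works with got v11 and earlier; got v12+ is ESM-only and needs import
const got = require("got");

(async () => {
    // Request parameters: your API key and the page to scrape
    const params = {
        api_key: "XXX",
        url: "https://www.amazon.com/s?k=graphic+card"
    }

    // Ask WebScrapingAPI to fetch the page, then parse the returned HTML
    const response = await got('https://api.webscrapingapi.com/v1', {searchParams: params})
    const {document} = new JSDOM(response.body).window

    // Collect the price text of every product that actually shows a price
    const products = document.querySelectorAll('.s-result-item')
    const prices = []
    products.forEach(el => {
        const priceContainer = el.querySelector('.a-price')
        const priceElement = priceContainer && priceContainer.querySelector('.a-offscreen')
        if (priceElement) prices.push(priceElement.textContent)
    })

    // Strip the currency symbol and thousands separators, then keep the highest price
    let most_expensive = 0
    prices.forEach((price) => {
        const value = parseFloat(price.replace(/[^0-9.]/g, ''))
        if (value > most_expensive) most_expensive = value
    })

    console.log("The most expensive item is: ", most_expensive)
})();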