AutoScraper and Flask: Create an API From Any Website in Less Than 5 Minutes

Written by Mika | Published 2020/09/25
Tech Story Tags: programming | python | data-science | software-engineering | software-development | artificial-intelligence | webscraping-by-ai | technology | web-monetization

TL;DR: In this tutorial, we are going to create our own e-commerce search API with support for both eBay and Etsy search results. We are able to achieve this goal in fewer than 20 lines of Python code for each site using AutoScraper and Flask. The final code for this tutorial is available on GitHub. This is a development setup suitable for developing and testing. For production usage, please check Flask’s deployment options. This tutorial is intended for personal and educational use. If you want to scrape websites, check their policies regarding scraping bots first.

In this tutorial, we are going to create our own e-commerce search API with support for both eBay and Etsy without using any external APIs.
With the power of AutoScraper and Flask, we are able to achieve this goal in fewer than 20 lines of Python code for each site. I recommend reading my last article about AutoScraper if you haven’t done so yet.

Requirements

Install the required libraries using pip:
pip install -U autoscraper flask

Let’s Do It

First, we are going to create a smart scraper to fetch data from eBay’s search results page. Let’s say we want to get the title, price, and product link of each item. With AutoScraper, this is easily done by providing some sample data taken from the page.
Note that if you want to run this yourself, you may need to update the wanted_list, because the sample values must match what is currently shown on the page.
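As a minimal sketch of this step, the snippet below trains a scraper from a single search page. The search URL and the placeholder samples are assumptions, not values from the article; replace the placeholders with a title, price, and link copied from a live results page. The AutoScraper import is deferred into the function so the file can be loaded even where the package is not installed.

```python
# Assumed search URL for illustration; any eBay results page should work.
EBAY_SEARCH_URL = "https://www.ebay.com/sch/i.html?_nkw=iphone"

# Placeholder samples: one title, one price, one product link,
# each copied exactly as it appears on the page above.
WANTED_LIST = [
    "<title of one item on the page>",
    "<price of that item>",
    "<full product link of that item>",
]


def build_ebay_scraper(url, wanted_list):
    """Train an AutoScraper model from one page and a few sample values."""
    from autoscraper import AutoScraper  # deferred third-party import

    scraper = AutoScraper()
    # build() downloads the page and learns rules that locate each sample.
    scraper.build(url, wanted_list)
    return scraper
```

Once the placeholders are filled in, calling build_ebay_scraper(EBAY_SEARCH_URL, WANTED_LIST) returns the trained scraper used in the following steps.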
Now let’s get the results grouped by scraping rules:
scraper.get_result_similar(url, grouped=True)
From the output, we can tell which rule corresponds to which data, so we can use it accordingly. Let’s set some aliases based on the output, remove redundant rules, and save the model so we can use it later.
Note that the rule IDs will be different for you if you run the code.
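The sketch below shows one way this step could look. The article picks the rules by reading the grouped output; pick_rules is a hypothetical helper that does the same thing by checking which rule returned a known sample value, keeping only the first matching rule per alias. The alias names and the file name ebay-search are assumptions. set_rule_aliases, keep_rules, and save are AutoScraper methods.

```python
def pick_rules(grouped, samples):
    """Map rule IDs to aliases, keeping one rule per alias.

    grouped: {rule_id: [values]} as returned with grouped=True
    samples: {alias: a value known to belong to that alias}
    """
    chosen = {}
    for rule_id, values in grouped.items():
        for alias, sample in samples.items():
            # Skip aliases that already have a rule, so duplicates are dropped.
            if alias not in chosen.values() and sample in values:
                chosen[rule_id] = alias
                break
    return chosen


def alias_and_save(scraper, rule_aliases, path="ebay-search"):
    """Name the chosen rules, discard the others, and write the model to disk."""
    scraper.set_rule_aliases(rule_aliases)
    scraper.keep_rules(list(rule_aliases))
    scraper.save(path)
```

With the grouped output in a variable named grouped, pick_rules(grouped, {"Title": "<sample title>", "Price": "<sample price>", "Url": "<sample link>"}) yields the dictionary to pass to alias_and_save.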
OK, we’ve got eBay covered. Let’s add support for Etsy search results too. We’ll start by building its scraper. This time, we will use wanted_dict instead of wanted_list, which automatically sets the aliases for us.
Because Etsy generates item links with a unique ID each time, we added one sample product ID to the wanted_dict so we can create the link from it. We also provided two samples each for title and price, because items on Etsy search result pages come in more than one structure and we want the scraper to learn them all.
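A minimal sketch of the Etsy setup follows. The search URL, the alias names, and the placeholder samples are assumptions for illustration; the keys of wanted_dict become the aliases. listing_links is a hypothetical helper that turns scraped product IDs into links, assuming Etsy’s /listing/<id> URL pattern.

```python
# Assumed search URL for illustration.
ETSY_SEARCH_URL = "https://www.etsy.com/search?q=macbook"

# Placeholder samples: the dict keys become the rule aliases.
WANTED_DICT = {
    "title": ["<title of item A>", "<title of item B, laid out differently>"],
    "price": ["<price of item A>", "<price of item B>"],
    "url": ["<numeric product ID of item A>"],
}


def build_etsy_scraper(url, wanted_dict):
    """Train an AutoScraper model whose rules are aliased by the dict keys."""
    from autoscraper import AutoScraper  # deferred third-party import

    scraper = AutoScraper()
    scraper.build(url=url, wanted_dict=wanted_dict)
    return scraper


def listing_links(product_ids):
    """Build item links from scraped product IDs (assumed URL pattern)."""
    return [f"https://www.etsy.com/listing/{product_id}" for product_id in product_ids]
```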
After analyzing the output, let’s keep our desired rules, remove the rest, and save our model.
Now that we have our scrapers ready, we can create our fully functioning API for both sites in fewer than 40 lines.
The API takes the parameter q as its search query. It fetches the eBay and Etsy search results, joins them, and returns them as the response. Note that we pass group_by_alias=True to the scraper to get the results grouped by our defined aliases.
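The following is a sketch of such an API under stated assumptions, not the article’s exact listing. It assumes the two models were saved as ebay-search and etsy-search, that the Etsy model has a url alias holding product IDs, and that the search URL templates shown are valid. create_app and rows_from_groups are hypothetical names. Third-party imports are deferred into the factory so the helper can be used on its own.

```python
from urllib.parse import quote_plus


def rows_from_groups(grouped):
    """Turn {'title': [t1, t2], 'price': [p1, p2]} into one dict per item."""
    aliases = list(grouped)
    # zip() stops at the shortest list, so uneven rule outputs cannot raise.
    return [dict(zip(aliases, values)) for values in zip(*grouped.values())]


def create_app(ebay_model="ebay-search", etsy_model="etsy-search"):
    from autoscraper import AutoScraper
    from flask import Flask, jsonify, request

    ebay_scraper, etsy_scraper = AutoScraper(), AutoScraper()
    ebay_scraper.load(ebay_model)
    etsy_scraper.load(etsy_model)
    app = Flask(__name__)

    def scrape(scraper, url_template, query):
        # Escape the query so spaces and symbols survive in the URL.
        url = url_template.format(quote_plus(query))
        return scraper.get_result_similar(url, group_by_alias=True)

    @app.route("/")
    def search_api():
        query = request.args.get("q", "")
        ebay = scrape(ebay_scraper, "https://www.ebay.com/sch/i.html?_nkw={}", query)
        etsy = scrape(etsy_scraper, "https://www.etsy.com/search?q={}", query)
        if "url" in etsy:
            # The Etsy model stores product IDs; expand them to full links.
            etsy["url"] = [f"https://www.etsy.com/listing/{i}" for i in etsy["url"]]
        return jsonify(result=rows_from_groups(ebay) + rows_from_groups(etsy))

    return app
```

Calling create_app().run(port=8080) starts Flask’s development server on the port used in this tutorial.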
Running this code starts the API server, listening on port 8080. Let’s test our API by opening http://localhost:8080/?q=headphone in our browser.
[Screenshot: some results from eBay]
[Screenshot: some results from Etsy]
Voila! We have our e-commerce API ready. Just replace headphone in the URL with your desired query to get its search results.
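For queries with spaces or special characters, the URL should be encoded rather than typed by hand. The sketch below builds the request URL and, optionally, fetches the response using only the standard library. api_url and fetch_results are hypothetical helper names, and fetch_results assumes the server is running locally and returns JSON.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def api_url(query, base="http://localhost:8080/"):
    """Build the search URL, encoding the query for use in a URL."""
    return base + "?" + urlencode({"q": query})


def fetch_results(query):
    """Call the local API and decode its JSON response."""
    with urlopen(api_url(query)) as response:
        return json.load(response)
```

For example, api_url("wireless headphone") gives http://localhost:8080/?q=wireless+headphone.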

Final Notes

The final code for this tutorial is available on GitHub.
This is a development setup suitable for developing and testing. Flask’s built-in server is not suitable for production. For production usage, please check Flask’s deployment options.
This tutorial is intended for personal and educational use. If you want to scrape websites, check their policies regarding scraping bots first.
I hope this article is useful and helps to bring your ideas into code faster than ever. Happy coding!

Written by Mika | Software Engineer - Always Learning
Published by HackerNoon on 2020/09/25