With more than 40,000 searches happening on Google every second, Google Trends is a powerful tool that allows us to visualize search behavior and uncover trends in Web Search, Google News, Google Images, Google Shopping, and YouTube. A sample of that size can provide a lot of insight to inform a business’s marketing strategy, decide which products or services to focus on, identify interests based on location, and much more.

To get more out of the tool, today we’ll build a simple Google Trends scraper using PyTrends, an unofficial Google Trends API.

Why Use PyTrends Instead of the Google Trends Interface?

There’s no problem with using the Google Trends website as it is. It has all the features you’ll need to run your analysis and it has a clear interface.

So why would you complicate the process by building a script? Well, like most things in web scraping, it comes down to time and scalability.

You can enter each keyword one by one and track different timeframes. This is a good method until you have a list of hundreds of keywords. Checking each one by hand will result in a lot of wasted time.

Instead, we can automate the reports for a list of keywords in a few seconds or minutes using PyTrends, and because PyTrends is a Python API, the tool is effortless to use.

With that said, there are a few things to understand about Google Trends before actually writing our code.

How Does Google Trends Work?

Google Trends is a tool that represents, well, trends. It does not show any search volume for the keywords. Instead, it shows these trends in a graph that’s scaled based on the highest peak of interest over the time period we specify.

As an example, let’s look for ‘fidget spinner’ in the US:

According to the graph, the interest in fidget spinners has been constant and quite high, right? Well… that’s when it gets tricky. We have to remember that the graph is showing us data based on the highest point over 12 months.
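To make that scaling concrete, here is a minimal Python sketch. It assumes the simplest possible model, where every value in the chosen window is divided by the window’s peak and multiplied by 100. The helper name scale_to_peak and the raw numbers are made up for illustration, and whatever sampling or normalization Google applies before scaling is ignored.

```python
def scale_to_peak(values):
    """Rescale a series so that its highest value becomes 100 (hypothetical helper)."""
    peak = max(values)
    if peak == 0:
        # No interest at all in the window: everything stays at 0.
        return [0 for _ in values]
    return [round(100 * v / peak) for v in values]


# Made-up raw interest numbers, oldest first: a big spike followed by a flat tail.
raw_interest = [5, 400, 900, 300, 40, 20, 22, 18, 21, 19]

# Scaled against the peak of the whole period.
whole_period = scale_to_peak(raw_interest)

# Scaled against the peak of the five most recent points only.
recent_only = scale_to_peak(raw_interest[-5:])

print(whole_period)  # [1, 44, 100, 33, 4, 2, 2, 2, 2, 2]
print(recent_only)   # [91, 100, 82, 95, 86]
```

The same recent values read as high or as negligible depending on which peak sits inside the window.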
100 represents the highest point and 0 represents the lack of data. Let’s see what happens when we zoom out:

Doesn’t seem so stable now, does it? As we can see, after getting into the market, fidget spinners exploded in popularity and then declined just as rapidly.

Besides timeframe, the region, category, and platform will also influence the trends data, so make sure to understand what you’re looking for before doing any analysis. We’ll have to account for those factors in our scraper.

How to Build a Google Trends Scraper with PyTrends

Now that we know the basics, let’s start writing our PyTrends script:

1. Install Python and PyTrends

If you’re using a Mac, you probably already have a version of Python installed on your machine. To check if that’s the case, enter python --version into your terminal.

For those of you who don’t have any version of Python installed or want to upgrade, we recommend using Homebrew; instructions are inside the link.

With the Homebrew package manager installed, you can now install the latest version of Python by using the command brew install python3. You’ll be able to verify the installation with the python3 --version command.

The next step is to install PyTrends using pip install pytrends. If you don’t have pip installed on your machine, Python is now able to install it without any extra tool. Just go to your terminal and type sudo -H python -m ensurepip.

Your development environment should be ready to go now!

Note: when we check the documentation for PyTrends, it says that it requires Requests, LXML, and Pandas. If you don’t have those, just pip install them as well.

2. Connect to Google Trends Using PyTrends

Alright, we’ll create a new file named ‘gtrends-scraper.py’, open it in a text editor, and add our first lines to connect to Google Trends.

from pytrends.request import TrendReq

pytrends = TrendReq(hl='en-US')

Note: hl stands for ‘host language’ and it can be changed to any other locale you might need.

3. Write the Payload

The payload is where we’ll store all the parameters of our request to be sent to the server. When checking the PyTrends documentation, we can see there are five different inputs we can add to our payload (which are the same ones we would use on the original platform):

kw_list (list of keywords we want to analyze)
cat (category)
timeframe
geo (region or location of the data)
gprop (Google’s property)

Let’s take a look at the snippet presented in the documentation:

kw_list = ["Blockchain"]

pytrends.build_payload(kw_list, cat=0, timeframe='today 5-y', geo='', gprop='')

cat is equal to 0 by default. If you go to Google Trends and change the category, you’ll be able to see that every category has a value assigned. For example, for ‘Arts & Entertainment’ cat is equal to 3. Depending on your needs, you can change this value to whatever category you want to select. For this example, let’s set it to 14 for ‘People & Society’.

Note: Notice that although ‘blockchain’ is only one keyword, it is still passed as a list. Also, there’s a limit of 5 keywords we can send at a time, which is the same limit of keywords we can compare on the Google Trends site. But you can work through more than 5 keywords as long as you’re passing them one by one.

We’ll add our parameter values as variables to make it easier to work with them. So here’s how our code is looking so far:

from pytrends.request import TrendReq

pytrends = TrendReq(hl='en-US')

all_keywords = ['event management', 'event planning', 'event planner']
cat = 14 #people&society
timeframes = ['today 5-y', 'today 12-m', 'today 3-m', 'today 1-m']
geo = '' #worldwide
gprop = '' #websearch

pytrends.build_payload(
    all_keywords,
    cat,
    timeframes[0],
    geo,
    gprop
    )

We use a variable named all_keywords because we’ll use it to pass each keyword individually. If we sent the whole list at once, as this first draft does, Google Trends would compare the keywords with each other. We added a variable for timeframes to be able to analyze the keyword over different periods. For now, the [0] selects the first one in the list.

4. Extract Data from Google Trends

The first thing we need to implement is a temporary list called keywords, which starts out empty: keywords = []. Then, we’ll wrap our payload inside a function called check_trends(). Plus, we’ll add a new variable to our function to hold the pandas DataFrame that PyTrends returns: data = pytrends.interest_over_time()

keywords = []

def check_trends():
    pytrends.build_payload(
        keywords, #only the keyword currently being checked
        cat,
        timeframes[0],
        geo,
        gprop
        )
    data = pytrends.interest_over_time()

for keyword in all_keywords:
    keywords.append(keyword)
    check_trends()
    keywords.pop()

The for loop at the end will take a keyword from our main list and append it to our empty list. Then it will pass only that keyword to our check_trends() function for analysis before popping it out, leaving an empty list again, and moving on to the next one.

5. Determining If There’s a Trend

Our code as-is will get data from Google Trends and bring it back. However, we’ll need to do further analysis if we want to interpret the data. Let’s start by calculating the mean (these lines go inside check_trends(), right after data is assigned):

    mean = round(data.mean(), 2)

    print(keyword + ': ' + str(mean[keyword]))

Running the script now prints each keyword followed by its mean interest.

Here’s where understanding the way Google Trends works will pay off. Because we’re reading the mean over a 5-year timeframe, we need to remember that there was a point when the interest over time was 100, so a low number means that, relative to that peak, the interest has been pretty low over the five years.

To make it easier to visualize, here’s the graph for ‘event management’:

As you can see, over the period of five years, it has stayed pretty stable, with the highest peak around February 2020. We could read this query as stable but not rising in popularity. So for a new business, there’s a stable need for event management professionals, services, and software.

OK, we might need more information to jump to that conclusion, but as you can see, that’s exactly the kind of assumption we can make by using data from Google Trends.

For example, event planning at 25.96 is fairly low in comparison to the peak, so we could conclude that this is a query without much popularity, right? Well, maybe. But remember that this is a mean; it could be a seasonal query with peaks closer to 100 in specific months.

Automating Conclusions With PyTrends

Here’s an exercise: we want to know how last year’s trend compares to the rest of the timeframe.

    avg = round(data[keyword][-52:].mean(), 2)

    trend = round(((avg/mean[keyword])-1)*100, 2)
    print('The average 5 years interest of ' + keyword + ' was ' + str(mean[keyword]))
    print('The last year interest of ' + keyword + ' compared to the last 5 years has changed by ' + str(trend) + '%.')

What we’re doing at this point is calculating the average interest over the last 52 weeks (Google Trends is sending us weekly data, so 52 rows represent the last year) and then calculating how much it differs from the five-year mean, expressed as a percentage.

So there’s a new conclusion to be made. Yes, the mean interest of the queries has been fairly stable (especially for ‘event management’), but the overall interest has been decreasing in the last year.

However, we can automate more conclusions faster using Python than by going through the data by hand. These are a few conclusions (but not the only ones) we could also run:

    #Stable trend
    if mean[keyword] > 75 and abs(trend) <= 5:
        print('The interest for ' + keyword + ' is stable in the last 5 years.')
    elif mean[keyword] > 75 and trend > 5:
        print('The interest for ' + keyword + ' is stable and increasing in the last 5 years.')
    elif mean[keyword] > 75 and trend < -5:
        print('The interest for ' + keyword + ' is stable and decreasing in the last 5 years.')
    #Relatively stable
    elif mean[keyword] > 60 and abs(trend) <= 15:
        print('The interest for ' + keyword + ' is relatively stable in the last 5 years.')
    elif mean[keyword] > 60 and trend > 15:
        print('The interest for ' + keyword + ' is relatively stable and increasing in the last 5 years.')
    elif mean[keyword] > 60 and trend < -15:
        print('The interest for ' + keyword + ' is relatively stable and decreasing in the last 5 years.')
    #Seasonal
    elif mean[keyword] > 20 and abs(trend) <= 15:
        print('The interest for ' + keyword + ' is seasonal.')
    #New keyword
    elif mean[keyword] > 20 and trend > 15:
        print('The interest for ' + keyword + ' is trending.')
    #Declining keyword
    elif mean[keyword] > 20 and trend < -15:
        print('The interest for ' + keyword + ' is significantly decreasing.')
    #Cyclical
    elif mean[keyword] > 5 and abs(trend) <= 15:
        print('The interest for ' + keyword + ' is cyclical.')
    #New
    elif mean[keyword] > 0 and trend > 15:
        print('The interest for ' + keyword + ' is new and trending.')
    #Declining
    elif mean[keyword] > 0 and trend < -15:
        print('The interest for ' + keyword + ' is declining and not comparable to its peak.')
    #Other
    else:
        print('This is something to be checked.')
    print('')

The power of automating Google Trends with PyTrends is that we can automate as many conclusions as we want, and by changing the list of keywords, we can pull a lot of conclusions with just the press of a button.

Before we say goodbye, we want to show you, fairly quickly, another route you can take to pull trends data, using JavaScript this time.

Note: If you want to learn more about Python for web scraping, check our guides on How to scrape multiple pages with Python and Scrapy and How to build a Beautiful Soup scraper from scratch.

Google Trends Scraping with Fetch and Cheerio: Alternative Route

Data is crucial for any business decision. For this use case, let’s say we want to start a new business but we don’t want to just depend on intuition. We want to create a product that has demand right now.

Setting Up Our Development Environment

To set up our development environment, follow these instructions:

First, download and install Node.js on your machine: https://nodejs.org/en/download/

Next, create a new folder for your project (we’ll use the same one we used for our PyTrends project) and navigate to it from your terminal.
Inside the folder, enter the command npm init -y to create the necessary files.

To install our dependencies, let’s type npm install cheerio and then npm i node-fetch. Because our script will use import statements, also add "type": "module" to the generated package.json (or save the script with an .mjs extension).

Pretty simple, right? So let’s move to the next step!

Fetching Our Target URL

Exploding Topics is a relatively new site that analyzes millions of web searches to identify rising (or exploding) queries/topics. This makes it a perfect alternative to Google Trends, with the difference that we don’t need to analyze the data itself to draw conclusions, as the website is meant to provide only trending and rising queries.

We’ll change the parameters to ‘1 month’ (to try to catch those new rising topics) and business, and grab the resulting URL ‘https://explodingtopics.com/business-topics-this-month’.

import fetch from 'node-fetch';
import { load } from 'cheerio';

(async function () {
    const res = await fetch('https://explodingtopics.com/business-topics-this-month');
    const text = await res.text();
})();

Our code is now fetching the URL and then storing the response body in the text constant. That said, there’s one more thing we need to do before parsing the response.

If we want our scraper to scale, we can’t just send all requests through our IP address. Websites will quickly figure out our script isn’t a human and will block it, making our web scraper useless.

Integrating Fetch with ScraperAPI

For the next step, we want to tell Fetch to send the request through ScraperAPI’s servers. There, it will change the IP address automatically for every request, handle any CAPTCHAs that might get in the way, and use years of statistical analysis to determine the best header to use for the request.

All of this will be handled for us by just adding a few parameters to our URL. First, we’ll create a free ScraperAPI account to generate an API key.

With our API key, we can now build our target URL:

    const res = await fetch('http://api.scraperapi.com?api_key=51e43be283e4db2a5afb62660xxxxxxx&url=https://explodingtopics.com/business-topics-this-month');

Now that our scraper is better protected against blacklisting, we can send the response to Cheerio for parsing. First of all, we need to declare Cheerio at the top of our file like this: import { load } from 'cheerio', and then add const $ = load(text) to our function.

Done! Our response is now in Cheerio and we can navigate it using CSS selectors.

Let’s say that we want to pull the title, description, and searches per month. For that, let’s grab the bigger element that’s wrapping the elements we’re looking for, a <div> with the class “topicInfoContainer”:

    const containers = $('.topicInfoContainer').toArray();

As you can see, we’re converting the selection into an array. It will allow us to select several elements inside each big element we just grabbed.
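Going back to the ScraperAPI request URL for a moment, here is a small sketch of how that URL could be assembled programmatically. It is written in Python, to stay consistent with the first half of this guide. The function name build_proxy_url and the placeholder key are assumptions for illustration; the endpoint and the api_key and url parameters are the ones shown above. Percent-encoding the target is a general precaution when one URL is nested inside another URL’s query string, so that any query parameters in the target are not mistaken for parameters of the outer request.

```python
from urllib.parse import urlencode

# Endpoint as it appears in this article's example URL.
SCRAPERAPI_ENDPOINT = "http://api.scraperapi.com"


def build_proxy_url(api_key, target_url):
    """Return the endpoint URL with api_key and url as encoded query parameters.

    Hypothetical helper, not part of any ScraperAPI client library.
    """
    query = urlencode({"api_key": api_key, "url": target_url})
    return f"{SCRAPERAPI_ENDPOINT}?{query}"


proxy_url = build_proxy_url(
    "YOUR_API_KEY",  # placeholder, replace with your own key
    "https://explodingtopics.com/business-topics-this-month",
)
print(proxy_url)
```

Building the URL this way keeps the key and the target page in one place, which helps once the script has to loop over several target pages.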
Using CSS selectors, we can then get every element inside the array. For the sake of simplicity and time, here’s the finished code:

import fetch from 'node-fetch';
import { load } from 'cheerio';

(async function () {
    const res = await fetch('http://api.scraperapi.com?api_key=51e43be283e4db2a5afb62660xxxxxxx&url=https://explodingtopics.com/business-topics-this-month');
    const text = await res.text();
    const $ = load(text);
    const containers = $('.topicInfoContainer').toArray();
    const trends = containers.map(c => {
        const active = $(c);
        const keyword = active.find('.tileKeyword').text();
        const description = active.find('.tileDescription').text();
        const searches = active.find('.scoreTag').first().text();
        return {keyword, description, searches}
    })
    console.log({trends});
})();

If you run this code in your terminal (adding your own API key), it logs a trends array with the keyword, description, and searches of each topic.

Note: For a more in-depth guide on using JavaScript for web scraping, check our step-by-step tutorial on building a Node.js web scraper.

Congratulations! You have now successfully built not only one but two web scrapers that will bring more and better data to your business.

Whether you are using PyTrends to interact with Google Trends and automate conclusions or using Node.js to scrape Exploding Topics to find new business opportunities, ScraperAPI is ready to help you scrape the internet without getting blocked by anti-scraping techniques.

Until next time, happy scraping!

Also published here.
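As a closing aside for the PyTrends half of this guide, the last-year comparison can be tried out without calling Google at all. The sketch below is an assumption-laden stand-in: yearly_change is a hypothetical helper that works on a plain list of weekly values instead of the pandas DataFrame that PyTrends returns, and it rounds only the final result, so its output can differ slightly from the script above, which rounds the two means first.

```python
def yearly_change(weekly_values, weeks_in_year=52):
    """Percent change of the last year's mean versus the whole-period mean.

    Hypothetical helper: mirrors ((avg / mean) - 1) * 100 from the article,
    but on a plain list of weekly interest values.
    """
    overall_mean = sum(weekly_values) / len(weekly_values)
    if overall_mean == 0:
        # No interest in the whole period, so a relative change is undefined.
        return None
    recent = weekly_values[-weeks_in_year:]
    recent_mean = sum(recent) / len(recent)
    return round((recent_mean / overall_mean - 1) * 100, 2)


# Made-up data: three years at an interest of 50, then one year at 25.
sample = [50] * 156 + [25] * 52
print(yearly_change(sample))  # -42.86
```

A negative result means the last 52 weeks sit below the average of the whole period, which is the situation the script reported for the event-related keywords.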


Build a Powerful Google Trends Scraper Using PyTrends: A Step-By-Step Guide
