Ever since the Google Web Search API was deprecated in 2011, I've been searching for an alternative. I need a way to get links from Google search into my Python script. So I made my own, and here is a quick guide on scraping Google searches with requests and Beautiful Soup.

First, let's install the requirements. Save the following into a text file named requirements.txt, then install it with pip install -r requirements.txt. (urllib is part of the Python standard library, so it does not need to be listed.)

    requests
    bs4

Then import the modules the script uses:

    import urllib
    import requests
    from bs4 import BeautifulSoup

Google returns different search results for mobile vs. desktop. So depending on the use case, we need to specify the appropriate user-agent.

    # desktop user-agent
    USER_AGENT = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:65.0) Gecko/20100101 Firefox/65.0"
    # mobile user-agent
    MOBILE_USER_AGENT = "Mozilla/5.0 (Linux; Android 7.0; SM-G930V Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.125 Mobile Safari/537.36"

To perform a search, Google expects the query to be in the parameters of the URL. Additionally, all spaces must be replaced with a +. To build the URL, we format the query and put it into the q parameter.

    query = "hackernoon How To Scrape Google With Python"
    query = query.replace(' ', '+')
    URL = f"https://google.com/search?q={query}"

Making the request is easy. However, requests expects the user-agent to be in the headers. To set the headers, we pass in a dictionary.

    headers = {"user-agent": USER_AGENT}
    resp = requests.get(URL, headers=headers)

Next is parsing the data and extracting all anchor links from the page. That is easy with Beautiful Soup. As we iterate through the anchors, we store the results in a list.

    if resp.status_code == 200:
        soup = BeautifulSoup(resp.content, "html.parser")
        results = []
        # Each result is expected in a <div class="r">; this depends on
        # Google's markup and may need updating if the page layout changes.
        for g in soup.find_all('div', class_='r'):
            anchors = g.find_all('a')
            if anchors:
                link = anchors[0]['href']
                title = g.find('h3').text
                item = {
                    "title": title,
                    "link": link
                }
                results.append(item)
        print(results)

That is it. This script is pretty simple and error-prone, but it should get you started with your own Google scraper. You can clone or download the entire script over at the git repo.

There are also some caveats with scraping Google. If you perform too many requests over a short period, Google will start to throw a captcha at you. This is annoying and will limit how much or how fast you scrape. That is why we created a Google Search API, which lets you perform unlimited searches without worrying about captchas.

Previously published at https://blog.goog.io/web%20scraping/2019/12/30/how-to-scrape-google-with-python.html
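
A note on the URL-building step: replacing only spaces leaves other characters untouched, so a query containing & or # would produce a malformed URL. Below is a minimal sketch of a more defensive approach using urlencode from the standard library's urllib.parse. The helper name build_search_url and its base parameter are hypothetical and not part of the original script.

```python
from urllib.parse import urlencode


def build_search_url(query, base="https://google.com/search"):
    """Return a search URL with the query safely encoded.

    Hypothetical helper for illustration; not part of the original script.
    """
    # urlencode applies quote_plus by default: spaces become '+', and
    # reserved characters such as '&', '#' and '+' are percent-encoded.
    return f"{base}?{urlencode({'q': query})}"
```

For the query used in this guide, build_search_url("hackernoon How To Scrape Google With Python") gives the same URL as the replace-based version, while a query such as "C++ & C#" is encoded as q=C%2B%2B+%26+C%23 instead of breaking the parameter.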
