Introduction

This post shows how to scrape Google Images results with Node JS using multiple methods.

Requirements

Install Libraries

Before we begin, install these libraries so we can prepare our scraper.

npm i unirest
npm i cheerio

We will use Unirest JS to fetch the raw HTML and Cheerio JS to parse it.

Process

Method - 1

With the setup done, let us discuss our first method to scrape Google Images. First, we make a GET request to our target URL with the help of Unirest to extract the raw HTML data.

let header = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.157 Safari/537.36"
};
return unirest
.get("https://www.google.com/search?q=nike&oq=nike&hl=en&tbm=isch&asearch=ichunk&async=_id:rg_s,_pms:s,_fmt:pc&sourceid=chrome&ie=UTF-8")
.headers(header)
.then((response) => {
// Load the raw HTML into Cheerio so we can query it with selectors
let $ = cheerio.load(response.body);
});

Step-by-step explanation:

In the first step, we made a GET request to our target URL. In the second step, we passed the required headers along with the request. Then we loaded the response body into a Cheerio instance.

But a single User Agent might not be enough, as Google can block your requests. So, we will make an array of User Agents and pick one at random on every request.

const selectRandom = () => {
    const userAgents =  ["Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.157 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36",
    ];
    const randomNumber = Math.floor(Math.random() * userAgents.length);
    return userAgents[randomNumber];
    };
    let user_agent = selectRandom();
    let header = {
    "User-Agent": user_agent,
    };

Copy the target URL below and paste it into your browser, which will download a text file. Open that text file in your code editor and save it as an HTML file.

https://www.google.com/search?q=nike&oq=nike&hl=en&tbm=isch&asearch=ichunk&async=_id:rg_s,_pms:s,_fmt:pc&sourceid=chrome&ie=UTF-8

Additional parameters which can be used with this URL:

tbs - Term By Search parameter.
chips - Used to filter image results.
ijn - Used for pagination. ijn = 0 will return the first page of results, ijn = 1 will return the second page of results, and so on.

Scroll through the HTML file to the end of the style tag, where you will see the HTML tags of the respective image results.

Now, we will parse the fields we want from the response, starting with the title. You will find the title under the class .mVDMnf inside an anchor tag. Just below the title, the source sits under the class .FnqxG.

let images_results = [];
    $("div.rg_bx").each((i, el) => {
     images_results.push({    
     title: $(el).find(".iKjWAf .mVDMnf").text(),
     source: $(el).find(".iKjWAf .FnqxG").text()
    });
  });

After the end of the anchor tag, you will find a div tag with the class name rg_meta, which contains a JSON string.

{"bce":"rgb(249,252,249)","cb":21,"cl":21,"clt":"n","cr":21,"ct":21,"id":"qYZE1rcH_OCntM","isu":"www.nike.com","itg":0,"oh":1088,"os":"15KB","ou":"https://c.static-nike.com/a/images/w_1920,c_limit/bzl2wmsfh7kgdkufrrjq/image.jpg","ow":1920,"pt":"Nike. Just Do It. Nike.com","rh":"www.nike.com","rid":"mgtROrdDu1XGJM","rmt":0,"rt":0,"ru":"https://www.nike.com/","st":"www.nike.com","th":169,"tu":"https://encrypted-tbn0.gstatic.com/images?q\\u003dtbn:ANd9GcQQAtNCsBlvuD_5pu9bKrTr-Sv5mMwD1-hZE9MS4Px4GKk05naP\\u0026s","tw":298}

We will parse it and extract the link (the ru key) and the URL of the original image (the ou key) from it.

let images_results = [];
    $("div.rg_bx").each((i, el) => {
        let json_string = $(el).find(".rg_meta").text();
        // Parse the metadata once and reuse it for both fields
        let meta = JSON.parse(json_string);
        images_results.push({
        title: $(el).find(".iKjWAf .mVDMnf").text(),
        source: $(el).find(".iKjWAf .FnqxG").text(),
        link: meta.ru,
        original: meta.ou,
    });     
  });

Finally, we will find the thumbnail URL. In the HTML, there is an image tag under the first anchor tag, which contains the thumbnail URL. The parser selects it with .rg_l img and reads its src attribute, falling back to data-src when src is missing. Now, our parser looks like this:

let images_results = [];
    $("div.rg_bx").each((i, el) => {
        let json_string = $(el).find(".rg_meta").text();
        // Parse the metadata once and reuse it for both fields
        let meta = JSON.parse(json_string);
        images_results.push({
        title: $(el).find(".iKjWAf .mVDMnf").text(),
        source: $(el).find(".iKjWAf .FnqxG").text(),
        link: meta.ru,
        original: meta.ou,
        // Prefer src and fall back to data-src when src is missing
        thumbnail: $(el).find(".rg_l img").attr("src") || $(el).find(".rg_l img").attr("data-src")
    });
  });

Results:

[
{
title: 'Shoes, Clothing & Accessories. Nike ...',
source: 'www.nike.com',
link: 'https://www.nike.com/in/men',
original: 'https://c.static-nike.com/a/images/w_1920,c_limit/mdbgldn6yg1gg88jomci/image.jpg',
thumbnail: 'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTCsZWS0YPC1NFXd4g_Ucn4jkz8VYxL4VbLvWfKa5QI3PKRuHc&s'
},
{
title: 'Nike. Just Do It. Nike.com',
source: 'www.nike.com',
link: 'https://www.nike.com/',
original: 'https://static.nike.com/a/images/f_jpg,q_auto:eco/61b4738b-e1e1-4786-8f6c-26aa0008e80b/swoosh-logo-black.png',
thumbnail: 'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcRbbeIzjUozRCMzN8gaujUFBJlIFHheriDFvKhSCMD84JL8KeuX&s'
},
....

Here is the full code:

const unirest = require("unirest");
    const cheerio = require("cheerio");
    
    const getImagesData = () => {
        const selectRandom = () => {
        const userAgents = [
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36",
            "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36",
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.157 Safari/537.36",
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36",
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36",
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36",
        ];
        const randomNumber = Math.floor(Math.random() * userAgents.length);
        return userAgents[randomNumber];
        };
        let user_agent = selectRandom();
        let header = {
        "User-Agent": user_agent,
        };
        return unirest
        .get(
            "https://www.google.com/search?q=nike&oq=nike&hl=en&tbm=isch&asearch=ichunk&async=_id:rg_s,_pms:s,_fmt:pc&sourceid=chrome&ie=UTF-8"
        )
        .headers(header)
        .then((response) => {
            let $ = cheerio.load(response.body);
    
            let images_results = [];
            $("div.rg_bx").each((i, el) => {
            // Parse the rg_meta JSON once and reuse it for both fields
            let meta = JSON.parse($(el).find(".rg_meta").text());
            images_results.push({
                title: $(el).find(".iKjWAf .mVDMnf").text(),
                source: $(el).find(".iKjWAf .FnqxG").text(),
                link: meta.ru,
                original: meta.ou,
                // Prefer src and fall back to data-src when src is missing
                thumbnail: $(el).find(".rg_l img").attr("src") || $(el).find(".rg_l img").attr("data-src"),
            });
            });
    
            console.log(images_results);
        });
    };
    
    getImagesData();

Method - 2

In this method, we will use a simple GET request to fetch the first page of Google Images results. So, let us find the tags for the image results.

https://www.google.com/search?q=Badminton&gl=us&tbm=isch

First, we will find the tag for the title. You will find the title in an h3 tag under the div with the class name MSM1fd.

const images_results = [];

    $(".MSM1fd").each((i,el) => {
        images_results.push({
        title: $(el).find("h3").text(),
        })
    })

Then we will find the tag for the source. You will find the source of the image under the second anchor tag, which has the class name VFACy, inside the div with the class name MSM1fd. This anchor tag also contains our link. So, our parser would look like this:

const images_results = [];

    $(".MSM1fd").each((i,el) => {
        images_results.push({
        image: $(el).find("img").attr("src") ? $(el).find("img").attr("src") : $(el).find("img").attr("data-src"),
        title: $(el).find("h3").text(),
        source: $(el).find("a.VFACy .fxgdke").text(),
        link: $(el).find("a.VFACy").attr("href")
        })
    })

The img tag is the only image tag inside the div, so we do not need its class name to select it.

Results:

    {
    image: 'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcSjxyuvqYQfybxq9F2XgME-ya6xb81WUyw3Dpa-YA40-Fy7fx0IlOhXIrK17kNON-r6vNs&usqp=CAU',
    title: 'Nike for Men - Shop New Arrivals - FARFETCH',
    source: 'farfetch.com',
    link: 'https://www.farfetch.com/in/shopping/men/nike/items.aspx'
    },
    {
    image: 'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcSfCJOrZr0zFxogpQjNT_6kBQ3rmxSPqvCHPpTWLmpOltZinUpptGM-290ssFMCIzFnD1M&usqp=CAU',
    title: "Women's Clothing. Nike IN",
    source: 'nike.com',
    link: 'https://www.nike.com/in/w/womens-clothing-5e1x6z6ymx6'
    },
    ....

Note: You will also find some images with base64 URLs.

This method is fast, but it does not support pagination, while the first method does. Another option is the Puppeteer Infinite Scrolling Method, which solves the pagination problem but is very time-consuming.

Using Google Image API

With this API, you don't have to worry about creating and maintaining the scraper, and you can scale the number of requests easily without getting blocked.

Example

const axios = require('axios');

   axios.get('https://api.serpdog.io/images?api_key=APIKEY&q=football&gl=us')
  .then(response => {
    console.log(response.data);
  })
  .catch(error => {
    console.log(error);
  });

You can get your API key by registering with the API provider.

Results:

"image_results": [
      {
        "image": "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcS_Tu78LWxIu_M_sN_kMfj2guqIbu2VcSLyI84CQGbuFRIyTCVR&s",
        "title": "Football - Wikipedia",
        "source": "en.wikipedia.org",
        "link": "https://en.wikipedia.org/wiki/Football",
        "original": "https://upload.wikimedia.org/wikipedia/commons/b/b9/Football_iu_1996.jpg",
        "rank": 1
      },
      {
        "image": "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTxvsz_pjLnFyCnYyCxxY5rSHQCHjNJyYGFZqhQUtTm0XOzOWw&s",
        "title": "Soft toy, American football/brown - IKEA",
        "source": "www.ikea.com · In stock",
        "link": "https://www.ikea.com/us/en/p/oenskad-soft-toy-american-football-brown-90506769/",
        "original": "https://www.ikea.com/us/en/images/products/oenskad-soft-toy-american-football-brown__0982285_pe815602_s5.jpg",
        "rank": 2
      },
      {
        "image": "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTNJYuYBLrUxLrXkbnP18Y6DEgKf_H4HYGCzecsGRAoFtkiGEM&s",
        "title": "NFL postpones three games due to Covid ...",
        "source": "www.cnbc.com",
        "link": "https://www.cnbc.com/2021/12/17/nfl-will-postpone-some-games-over-covid-surge-source-says.html",
        "original": "https://image.cnbcfm.com/api/v1/image/106991253-1639786378304-GettyImages-1185558312r.jpg?v=1639786403",
        "rank": 3
      },
      {
        "image": "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTg4WJ88A83JqXwGXsWtiB5qSoHrU_RukbrfXdkWggEKMsJ5Ro1&s",
        "title": "USFL schedule Week 2: What football ...",
        "source": "www.sportingnews.com",
        "link": "https://www.sportingnews.com/us/nfl/news/usfl-schedule-week-2-football-tv-channels-times-scores/oadvrtsc5vn9l4knu8hvnpo0",
        "original": "https://library.sportingnews.com/2022-04/usfl-football-042122-getty-ftr.jpg",
        "rank": 4
      },
      {
        "image": "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTsOPQ2-nzsTrdRK1HHXIqB-x96yuiPg7pwfJsm8mToVlw5-UaM&s",
        "title": "Why is football called 'football'?",
        "source": "www.newsnationnow.com",
        "link": "https://www.newsnationnow.com/us-news/hold-why-is-football-called-football/",
        "original": "https://www.newsnationnow.com/wp-content/uploads/sites/108/2022/02/FootballGettyImages-78457130.jpg?w=1280",
        "rank": 5
      },
      {
        "image": "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcS0WimlY0Nykjy8k5An6k4wViDvyVZzo_0K1MSZNZcwDFGsugNW&s",
        "title": "The Duke NFL Football | Wilson Sporting ...",
        "source": "www.wilson.com · In stock",
        "link": "https://www.wilson.com/en-gb/product/the-duke-nfl-football-wf10011",
        "original": "https://www.wilson.com/en-gb/media/catalog/product/b/c/bc340309-c2a3-441d-ac36-a26187fd94f0_yceho2py9sgzklxk.png",
        "rank": 6
      },
      {
        "image": "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQ6ZuMtA_I4WZZKXPhiqMpIHsdKJX-SVkcN2yo1KAQ9SQuxdCI&s",
        "title": "National Football League",
        "source": "www.nfl.com",
        "link": "https://www.nfl.com/",
        "original": "https://static.www.nfl.com/image/private/t_editorial_landscape_mobile/f_auto/league/iaesayubxpbxmbxfwm3b.jpg",
        "rank": 7
      },

Conclusion

In this tutorial, we learned to scrape Google Images search results. If you have any questions, feel free to ask me in the comments. Follow me on Twitter. Thanks for reading!

Additional Resources

Scrape Google Search Organic Results with Node JS
Scrape Google Maps Reviews
Scrape Google News Results
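Appendix: paginating Method 1 with ijn

Method 1 mentions that the ijn parameter handles pagination, but the post never shows it in use. The following is only a minimal sketch of how the page URLs could be built. The helper name buildImagesUrl is hypothetical (it is not part of Unirest, Cheerio, or the original code), and the sketch assumes that ijn can simply be appended to the Method 1 URL as a zero-based page index.

```javascript
// Hypothetical helper (not from the original post): builds the Method 1
// target URL for a search query and a zero-based page index (ijn).
const buildImagesUrl = (query, page = 0) => {
  if (!Number.isInteger(page) || page < 0) {
    throw new RangeError("page must be a non-negative integer");
  }
  // Only the query needs escaping; the other parameters are copied
  // verbatim from the Method 1 URL.
  const q = encodeURIComponent(query);
  return (
    "https://www.google.com/search" +
    `?q=${q}&oq=${q}&hl=en&tbm=isch&asearch=ichunk` +
    "&async=_id:rg_s,_pms:s,_fmt:pc&sourceid=chrome&ie=UTF-8" +
    `&ijn=${page}`
  );
};

// Build the URLs for the first three pages of results.
const urls = [0, 1, 2].map((page) => buildImagesUrl("nike", page));
console.log(urls);
```

Each of these URLs could then be passed to unirest.get() in place of the hard-coded URL in getImagesData, one request per page.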
