Photo by Elijah O’Donell on Unsplash

As the news broke about Instagram hitting one billion monthly active users, I couldn’t help but create an account of my own and try to win myself some following.

Given my keen interest in IT, however, I wished to take the opportunity to test some real-world applications of what I had been learning and gain followers using a bot, instead of putting in all the tedious work.

The first thing I did was check for an Instagram API. To no avail, however, as it turned out to be a useless, outdated piece of software.

And although Facebook has been releasing a new Instagram API recently, it only supports business clients.

But hey, that’s no problem, I thought, I can create one of my own. And that is precisely what we will learn today.

If you think about it, Instagram’s website is in and of itself the platform’s API. All we need to do is figure out how to interact with it remotely instead of manually, like regular users. And where there’s a will, there’s a way.

Here is where Puppeteer comes into the picture. The library allows us to create a headless Google Chrome / Chromium instance and control it by using the DevTools protocol.

Setting up the project

You can go ahead and copy the
repository:

git clone https://github.com/maciejcieslar/instagrambot.git

Our structure looks like this:

| -- instagrambot/
| | -- .env
| | -- .eslintrc.js
| | -- .gitignore
| | -- README.md
| | -- package-lock.json
| | -- package.json
| | -- src/
| | | -- common/
| | | | -- browser/
| | | | | -- api/
| | | | | | -- authenticate.ts
| | | | | | -- comment-post.ts
| | | | | | -- find-posts.ts
| | | | | | -- follow-post.ts
| | | | | | -- get-following.ts
| | | | | | -- get-post-info.ts
| | | | | | -- get-user-info.ts
| | | | | | -- index.ts
| | | | | | -- like-post.ts
| | | | | | -- unfollow-user.ts
| | | | | -- index.ts
| | | | -- interfaces/
| | | | | -- index.ts
| | | | -- scheduler/
| | | | | -- index.ts
| | | | | -- jobs.ts
| | | | -- scraper/
| | | | | -- scraper.js
| | | | -- utils/
| | | | | -- index.ts
| | | | -- wit/
| | | | | -- index.ts
| | | -- config.ts
| | | -- index.ts
| | -- tsconfig.json
| | -- tslint.json

Browser Interface

With that covered, let’s create our Browser interface, which we will use to get rendered pages from Puppeteer.

Our getPage function creates a browser page for us, goes to the provided URL and injects our scraper (mentioned later). It then waits for our callback to return a promise, resolves it and closes the page.

src/common/browser/index.ts

To be perfectly clear, Puppeteer is a browser interface on its own; we just abstracted some code that would otherwise be constantly repeated. This way, we don’t have to worry about memory leaks caused by pages that we might have left open by accident.

Scraper

Another helpful thing for anything related to web scraping (which is more or less what we do here) is creating your own scraper helper. Our scraper will come in handy as we proceed to more advanced scraping.

First, we define some helpers, mostly for setting data attributes. Other than that, we have an Element class, which is an abstraction over a normal HTMLElement. There is also a find function, which gives us a more developer-friendly way of querying elements.
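To make the idea of a scraper helper more concrete, here is a minimal sketch of a wrapper class and a find function. It is not the code from scraper.js: the names ScrapedElement, NodeLike, setData and getData are hypothetical, and the sketch is written against a small structural type rather than the real DOM so that it can also be run outside a browser.

```typescript
// Covers only the subset of the DOM Element API that this sketch uses.
// (Hypothetical type, introduced for illustration.)
interface NodeLike {
  textContent: string | null;
  querySelectorAll(selector: string): ArrayLike<NodeLike>;
  getAttribute(name: string): string | null;
  setAttribute(name: string, value: string): void;
}

// Hypothetical wrapper; the name and methods are illustrative only.
class ScrapedElement {
  private readonly node: NodeLike;

  constructor(node: NodeLike) {
    this.node = node;
  }

  // Store a value in a data-* attribute, e.g. to mark an element
  // so it can be found again with a simple attribute selector.
  setData(key: string, value: string): this {
    this.node.setAttribute(`data-${key}`, value);
    return this;
  }

  // Read a data-* attribute back; null when it was never set.
  getData(key: string): string | null {
    return this.node.getAttribute(`data-${key}`);
  }

  // Trimmed text content, or an empty string when there is none.
  text(): string {
    return (this.node.textContent || '').trim();
  }

  // Query descendants of this element and get wrapped elements back.
  find(selector: string): ScrapedElement[] {
    return find(selector, this.node);
  }
}

// Query `root` and wrap every match, returning a plain array
// instead of an array-like node list.
function find(selector: string, root: NodeLike): ScrapedElement[] {
  return Array.from(root.querySelectorAll(selector), (node) => new ScrapedElement(node));
}
```

In a page context the root would be a DOM element, and the returned array can be used with map, filter and the other array methods directly.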
src/common/scraper/scraper.js (the data attribute helpers, the Element class and find)

Authentication

Finally, we can create our first function that is actually related to Instagram. When we open the browser for the first time, we need to authenticate our user.

First, we wait for the page to open, then we type in the credentials provided in config.ts. There is a 100 ms delay between each character. Then we look for the Log in button and, if it exists, we click on it.

src/common/browser/api/authenticate.ts

If you would like to see the magic happen, set headless to false in Puppeteer’s launch options. It will open a browser window in which you can follow every action our bot makes.

Now that we are logged in, Instagram will automatically set cookies in our browser, so we don’t have to worry about logging in again. We can close the page (our interface will take care of it) and move on to creating our first function for finding posts with a #hashtag.

Instagram’s URL for the most recent posts with a #hashtag (here, #follow4follow) is https://www.instagram.com/explore/tags/follow4follow

The first 9 posts are always Top posts, meaning that their authors will probably never return our follow or like, as they have thousands of them. Ideally, we should skip them and get only the recent ones.

More posts will load as we scroll down. Each scroll loads 12 more posts, so we have to calculate how many times we have to scroll in order to get the expected number.

On the first load, there are 9 top and 12 normal posts. That gives us 21 in total. If we wanted to find 36 posts and omit the first 9, we would have to subtract the first 12 and then divide the rest by 12, so we know how many times we have to scroll.

36 (total) - 12 (first) = 24 (the missing posts)
24 / 12 = 2 (the times we need to scroll)

Also, we will add one more scroll to the result as a safety net, in case something takes too long to render.

src/common/browser/api/find-posts.ts

We can iterate over the returned URLs and execute a given set of actions on each one.
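The scroll arithmetic described above can be sketched as a small pure function. This is only an illustration, not the code from find-posts.ts: the names scrollsNeeded and skipTopPosts are hypothetical, the constants are the numbers quoted in this post, and rounding up for counts that are not a multiple of 12 is my own assumption.

```typescript
// Numbers quoted in the post; Instagram may change them at any time.
const TOP_POSTS = 9; // "Top posts" shown first, which we want to skip
const RECENT_ON_FIRST_LOAD = 12; // recent posts present before any scrolling
const POSTS_PER_SCROLL = 12; // recent posts added by each scroll
const SAFETY_SCROLLS = 1; // one extra scroll in case rendering is slow

// How many times to scroll to have `wanted` recent posts on the page.
// Rounding up with Math.ceil is an assumption for non-multiples of 12.
function scrollsNeeded(wanted: number): number {
  if (!Number.isInteger(wanted) || wanted < 0) {
    throw new RangeError('wanted must be a non-negative integer');
  }
  const missing = Math.max(0, wanted - RECENT_ON_FIRST_LOAD);
  return Math.ceil(missing / POSTS_PER_SCROLL) + SAFETY_SCROLLS;
}

// Drop the top posts from the scraped URLs and keep at most `wanted`.
function skipTopPosts(urls: string[], wanted: number): string[] {
  return urls.slice(TOP_POSTS, TOP_POSTS + wanted);
}
```

For the example from the text, scrollsNeeded(36) gives 2 scrolls for the 24 missing posts plus the safety scroll, so 3 in total.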
The thing is, we don’t know anything about the post except its URL, but we can find all the necessary information by scraping it.

Getting information about a post

An Instagram post example

As we see above, there is a lot of useful information on the website regarding the post, such as:

- Is the author followed?
- The follow button selector
- Is the post liked?
- The like button selector
- The author’s username
- The description and the comments
- The comment selector
- The number of likes

But before we go any further…

Adding NLP for our comments

There is one thing that we should take into consideration: what is the purpose of the post? Is it to show off someone’s new watch or to mourn a departed relative?

Ideally, we would like to know what the post is about. Here’s how we can do that:

Wit.ai is a service from Facebook which lets us create an app and teach it to understand sentences. That’s where NLP comes in; it stands for Natural Language Processing.

It is also included in the Messenger API, if you would ever like to make a chatbot, for example.

While it may take some time, we can teach our app to understand the description of a post and give us insights. It is very simple, really: all we have to do is tell it what to look for in a sentence. In our case, the sentence will be a post’s description, which we will send with the node-wit library.

First, you will need to create an account on wit.ai. You can use your GitHub account to log in.

Then you can either create your own app or use someone else’s app. If you would like to use my trained app, you can find it here.

Our app takes a message and returns whether it’s a happy_description or a sad_description and how sure it is about it.

There’s also an emoji library for making our comments more lively.

Now let’s put our token in config.ts and make a little helper for transforming messages to intents and generating comments based on the intent provided.
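As a rough sketch of what such a helper might look like, the snippet below picks the most confident intent and turns it into a comment. It is not the code from wit/index.ts: the IntentGuess shape, the 0.7 threshold, the function names and the sample comments are all assumptions, and the call to Wit.ai itself is left out because the response format depends on the API version you use. Only the intent names happy_description and sad_description come from the app described above.

```typescript
type Intent = 'happy_description' | 'sad_description';

// Hypothetical, simplified shape; map the real Wit.ai response onto it.
interface IntentGuess {
  name: string;
  confidence: number; // between 0 and 1
}

// Sample comments, made up for this sketch.
const COMMENTS: Record<Intent, string[]> = {
  happy_description: ['Great shot!', 'Love this!'],
  sad_description: ['Stay strong.', 'Sending my best wishes.'],
};

// Assumed threshold below which we would rather not comment at all.
const MIN_CONFIDENCE = 0.7;

function isIntent(name: string): name is Intent {
  return name === 'happy_description' || name === 'sad_description';
}

// Return the known intent with the highest confidence,
// or null when nothing is confident enough.
function pickIntent(guesses: IntentGuess[], minConfidence: number = MIN_CONFIDENCE): Intent | null {
  let best: Intent | null = null;
  let bestConfidence = -1;
  for (const guess of guesses) {
    if (isIntent(guess.name) && guess.confidence >= minConfidence && guess.confidence > bestConfidence) {
      best = guess.name;
      bestConfidence = guess.confidence;
    }
  }
  return best;
}

// Pick one of the comments for the intent. `random` is injectable
// so the choice can be made deterministic in tests.
function generateComment(intent: Intent, random: () => number = Math.random): string {
  const options = COMMENTS[intent];
  return options[Math.floor(random() * options.length)];
}
```

Returning null for low-confidence guesses lets the caller skip commenting entirely, which seems safer than posting a cheerful comment under a sad post.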
src/common/wit/index.ts (the token comes from config.ts)

While the code is ready for us to put emoji into our comments, Puppeteer has had some issues lately with typing them. Once the issue is resolved, just uncomment the line and you are good to go.

Now that we can get the post’s intent, we can use the selectors we previously found on the website to reach the elements holding the data.

src/common/browser/api/get-post-info.ts

Getting information about users

There is only so much we can take from a post. Sometimes we would be interested in the user’s profile.

An Instagram profile example

We can glean a lot of useful information, such as:

- The number of posts
- The number of followers
- The number of following
- Is the account followed?
- Bio (the description), but since this is of no use to us, we are not going to scrape it.

src/common/browser/api/get-user-info.ts

Post actions

For now, we need to implement like-post.ts. We simply check if the post is already liked. If not, we take the like selector and click on it.

src/common/browser/api/like-post.ts

The same goes for follow-post.ts.

src/common/browser/api/follow-post.ts

With comment-post.ts, we type the comment in the textarea and press Enter.

src/common/browser/api/comment-post.ts

User actions

Our bot also has to be able to unfollow people, since otherwise it might cross the 7500 follows limit.

First, we need to get the URLs of the people whom we are following. We click on the following button in our profile.

A list should show up with the last 20 users that we have followed. We can then execute unfollow-user.ts for each URL.

src/common/browser/api/get-following.ts

Now that we have the URLs, we simply unfollow one user at a time. We click on the unfollow button and then confirm in the confirmation dialog.

src/common/browser/api/unfollow-user.ts

Scheduler

Now that everything is ready, we have to think about how our bot is going to work.
Obviously, it can’t just mindlessly keep following people all the time because, as you may have guessed, Instagram would ban it very quickly. From what I have read and seen during my own tests, there are different limits, depending on the size and age of the account.

Let’s play it safe for now. Don’t worry, though: as an account grows, the limits get pushed further.

We can create a simple scheduler that will execute registered jobs once an hour, provided that the hour falls within their expected time range.

src/common/scheduler/index.ts

src/common/scheduler/jobs.ts

As you might have noticed, we were using a lot of variables coming from a config. Some values come from process.env, which means they are sensitive and we should put them in the .env file. The rest can be changed manually.

src/config.ts

Let’s register our jobs with hour ranges that a normal user would be active in, and the only thing left will be to run the application.

DISCLAIMER

Due to Instagram’s robots.txt policy, such an application is not allowed to be run. The post was made for educational purposes only.

src/index.ts

npm run start

Thank you very much for reading, hope you liked it! If you have any questions or comments, feel free to put them in the comment section below or send me a message.

Follow me on Twitter @maciejcieslar. Join my newsletter!

Originally published at www.mcieslar.com on July 10, 2018.