I am a co-founder of a product called Alto's POS & Inventory. Every day, I review charts, logs, and numbers from multiple platforms to gain a complete understanding of whether everything is okay.
On average, this process takes about 15 minutes per day, but sometimes it can extend to 30 minutes. After some basic calculations, I realized that over the course of one year it adds up to more than 12 workdays spent on repetitive tasks. That is a substantial amount of time, so I decided to address this issue and allocate the time to more productive work.
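The arithmetic behind that estimate can be sketched roughly as follows. The 365 review days per year and the 8-hour workday are assumptions made for this illustration, and `workdaysPerYear` is a hypothetical helper name.

```javascript
// Rough estimate of the yearly time spent on the daily review.
// Assumptions (illustrative only): the review happens every day of the year,
// and one workday is 8 hours long.
const DAYS_PER_YEAR = 365;
const HOURS_PER_WORKDAY = 8;

function workdaysPerYear(minutesPerDay) {
    const hoursPerYear = (minutesPerDay * DAYS_PER_YEAR) / 60;
    return hoursPerYear / HOURS_PER_WORKDAY;
}

console.log(workdaysPerYear(15).toFixed(1)); // 11.4
console.log(workdaysPerYear(30).toFixed(1)); // 22.8
```

Under these assumptions, an average of 16 minutes a day already gives about 12.2 workdays per year, so the occasional 30-minute day is enough to push the total past 12.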
Now that we've identified the problem, let's outline the requirements for the solution:
The solution should be worth the effort. For instance, if I have to invest a significant amount of time in resolving this issue, it might be better to leave it as it is now. So, let's strive to find a solution that is easy to implement. Solutions are listed from the easiest to the most challenging:
In my case, the easiest way to receive notifications is to use Telegram Bots. I can quickly set up and start using it immediately.
To make the dashboard accessible on any device, at any time and from anywhere, it should be hosted on services like AWS or IONOS.
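For the notification requirement, a minimal sketch of sending a message through the Telegram Bot API (`sendMessage` method) could look like this. It assumes Node.js 18+ for the built-in `fetch`, and a bot token and chat ID stored in the hypothetical environment variables `TELEGRAM_BOT_TOKEN` and `TELEGRAM_CHAT_ID`. The names `buildTelegramRequest` and `notify` are illustrative and not taken from the original project.

```javascript
// Minimal sketch: send a text message through the Telegram Bot API.
// TELEGRAM_BOT_TOKEN and TELEGRAM_CHAT_ID are hypothetical variable names.

// Build the URL and fetch options for the Bot API "sendMessage" method
function buildTelegramRequest(token, chatId, text) {
    return {
        url: `https://api.telegram.org/bot${token}/sendMessage`,
        options: {
            method: 'POST',
            headers: {'Content-Type': 'application/json'},
            body: JSON.stringify({chat_id: chatId, text}),
        },
    };
}

// Send the message; fetchImpl can be swapped out in tests
async function notify(text, fetchImpl = fetch) {
    const {url, options} = buildTelegramRequest(
        process.env.TELEGRAM_BOT_TOKEN,
        process.env.TELEGRAM_CHAT_ID,
        text
    );
    const res = await fetchImpl(url, options);
    if (!res.ok) {
        throw new Error(`Telegram API responded with HTTP ${res.status}`);
    }
    return res.json();
}

// Example usage (not executed here):
// await notify('Daily check: something needs attention');
```

Keeping the request construction separate from the network call makes the sketch easy to check without contacting Telegram.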
Initially, I tried to find existing services that would allow me to easily gather data, but I couldn't find anything that suited my needs. So, I decided to explore the possibility of creating a custom solution.
My approach involved collecting data from the platforms we use through their APIs, processing that data, and presenting it on a dashboard. However, many of these platforms don't offer an API for data access. This forced me to explore alternative methods, and I found that Puppeteer was the most suitable choice for my specific requirements.
Puppeteer is a Node.js library that provides a high-level API to control Chrome/Chromium over the DevTools Protocol. Most things that you can do manually in the browser can be done using Puppeteer!
Here's a simple example of Puppeteer in action. It searches Google for a keyword, opens the first result, and takes a screenshot of that page:
import puppeteer from 'puppeteer';

(async (searchValue) => {
    // Launch the browser and open a new blank page
    const browser = await puppeteer.launch();
    const page = await browser.newPage();

    // Search
    await page.goto('https://google.com');
    await page.locator('textarea').fill(searchValue);
    // Start waiting for the navigation before triggering it to avoid a race
    await Promise.all([
        page.waitForNavigation(),
        page.$eval('form', form => form.submit()),
    ]);

    // Go to the first link
    await Promise.all([
        page.waitForNavigation(),
        page.click(`div[data-async-context^="query:"] a`),
    ]);

    // Take a screenshot
    await page.screenshot({path: './screenshot.png'});

    await browser.close();
})("HackerNoon");
Let's validate the idea with a simple implementation that closely resembles a real scenario.
Since we need to gather data from multiple platforms, let's choose a simpler scenario that is similar to this task, such as building a page that displays HackerNoon top stories and software engineering jobs.
The following code collects data from HackerNoon Top Stories and HackerNoon Jobs every hour, generates simple HTML content from this data, and then serves this HTML content when we receive an HTTP request. It's quite straightforward.
index.js:
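import http from 'http';
import * as scraper from './scraper.js';

(async () => {
    let scrapedHtml = 'Try again later...';

    // Serve the most recently scraped HTML on every request
    http.createServer((req, res) => {
        res.writeHead(200, {'Content-Type': 'text/html; charset=utf-8'});
        res.end(scrapedHtml);
    }).listen(8080);

    // Keep the previous content if a scrape fails, so a single bad run
    // does not end in an unhandled promise rejection
    const refresh = async () => {
        try {
            scrapedHtml = await scrapeAll();
        } catch (err) {
            console.error('Scraping failed:', err);
        }
    };

    // Scrape once at startup, then every hour
    await refresh();
    setInterval(refresh, 60 * 60 * 1000);
})();
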
async function scrapeAll() {
    const browser = await scraper.launchBrowser();
    try {
        // Scrape both sources in parallel using the same browser
        const [stories, jobs] = await Promise.all([
            scraper.getTopStories(browser),
            scraper.getJobs('Software Engineer', browser)
        ]);
        return `
            <h2>Top Stories</h2>
            <ul>${stories.map(e => linkToHtml(e.title, e.url)).join('')}</ul>
            <h2>Jobs</h2>
            <ul>${jobs.map(e => linkToHtml(e.title, e.url)).join('')}</ul>
        `;
    } finally {
        // Always close the browser, even if one of the scrapers throws
        await browser.close();
    }
}

const linkToHtml = (title, url) => {
    return `<li>
        <a target="_blank" href="${url}">
            ${title}
        </a>
    </li>`;
};
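The scraped titles and URLs are inserted into the HTML as-is. Since that text comes from third-party pages, it may be worth escaping it first. Below is a minimal sketch of that idea; `escapeHtml` and `safeLinkToHtml` are hypothetical helpers and not part of the original source.

```javascript
// Hypothetical helper: escape characters that have a special meaning in HTML
function escapeHtml(value) {
    return String(value)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;');
}

// Variant of linkToHtml that escapes both the title and the URL
const safeLinkToHtml = (title, url) =>
    `<li><a target="_blank" href="${escapeHtml(url)}">${escapeHtml(title)}</a></li>`;

console.log(safeLinkToHtml('Tips & <Tricks>', 'https://example.com/?a=1&b=2'));
// <li><a target="_blank" href="https://example.com/?a=1&amp;b=2">Tips &amp; &lt;Tricks&gt;</a></li>
```

The ampersand is replaced first so that the entities produced by the later replacements are not escaped a second time.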
scraper.js:
import puppeteer, {Browser} from 'puppeteer';

/**
 * Launches a headless browser instance.
 *
 * @returns {Promise<Browser>}
 */
export async function launchBrowser() {
    return puppeteer.launch();
}

/**
 * Collects the articles listed on the HackerNoon Top Stories page.
 *
 * @param {Browser} browser
 * @returns {Promise<Array<{title: string, url: string}>>}
 */
export async function getTopStories(browser) {
    const page = await browser.newPage();
    await page.goto('https://hackernoon.com/tagged/hackernoon-top-story');

    // Wait for articles
    await page.waitForSelector('main .story-card');

    // Get articles: map every matching link to a {title, url} pair
    return page.$$eval('main .story-card h2 a', links =>
        links.map(el => ({
            title: el.textContent,
            url: el.href,
        }))
    );
}

/**
 * Searches HackerNoon Jobs for the given keyword and collects the results.
 *
 * @param {string} keyword
 * @param {Browser} browser
 * @returns {Promise<Array<{title: string, url: string}>>}
 */
export async function getJobs(keyword, browser) {
    const page = await browser.newPage();
    await page.goto('https://jobs.hackernoon.com');

    // Search
    await page.locator('#search-jobkeyword input').fill(keyword);
    await page.click('button[type=submit]');

    // Wait for result
    await page.waitForSelector('.job-list-item');

    // Get jobs
    return page.$$eval('.job-list-item', items =>
        items.map(el => ({
            // Join the job name with the spans inside .desktop-view
            title: [
                el.querySelector('.job-name'),
                ...el.querySelectorAll('.desktop-view span')
            ].map(e => e.textContent).join(', '),
            url: el.href,
        }))
    );
}
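Scraped text often carries stray whitespace, and `el.href` is undefined whenever the matched element is not a link. A small post-processing step, sketched below under those assumptions, could tidy the items before they reach the HTML template. `normalizeLinks` is a hypothetical helper and not part of the original source.

```javascript
// Hypothetical helper: tidy scraped {title, url} items before rendering
function normalizeLinks(items) {
    const seen = new Set();
    const result = [];
    for (const item of items) {
        // Collapse runs of whitespace left over from the page markup
        const title = String(item.title ?? '').replace(/\s+/g, ' ').trim();
        const url = item.url;
        // Skip entries without a title or URL, and duplicates by URL
        if (!title || !url || seen.has(url)) continue;
        seen.add(url);
        result.push({title, url});
    }
    return result;
}

console.log(JSON.stringify(normalizeLinks([
    {title: '  Story\n one ', url: 'https://example.com/1'},
    {title: 'Story one', url: 'https://example.com/1'},
    {title: 'No link', url: undefined},
])));
// [{"title":"Story one","url":"https://example.com/1"}]
```

It could be applied to the results of `getTopStories` and `getJobs` inside `scrapeAll`, for example `normalizeLinks(stories)`.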
The result looks something like this:
From my perspective, this solution offers the following advantages:
From my point of view, the disadvantages of this solution are as follows:
This solution proved effective for my situation, allowing me to resolve my problem quickly. If you believe there's a better approach to solving it, I would be delighted if you could share your insights.
You can access the source code for this example on GitHub.