In this article, I share what I learned from exploring how to integrate Azure Functions and TagUI to create a process automation proof-of-concept (POC) application. The application runs on a schedule in the cloud and saves its output to blob storage: it scrapes a table from a Wikipedia page, saves the table as a CSV file, and uploads the file to Azure Blob storage. There is no need to keep a machine running at home or at work, to kick the script off manually, or to set up a separate scheduler to execute it. Nor is there a need to use a proprietary web scraping tool or to code a solution from scratch.

Azure Functions is a cloud service available on-demand that provides the infrastructure and resources needed to run applications. TagUI is an open-source Robotic Process Automation (RPA) tool developed by AI Singapore and the community to help you rapidly automate your repetitive or time-critical tasks; use cases include process automation, data acquisition and testing of web apps.

The article covers three main development areas: Azure Functions-specific choices, the application dependencies, and the tweaks necessary to make the integration work. The function uses NodeJS, as this seemed to be the most appropriate choice given that TagUI is a native JavaScript application. The local development environment consists of Visual Studio Code (VSC), the VSC Azure Tools extension, NodeJS and npm.

Please note that the application described in this article is a proof-of-concept (POC) rather than a fully fledged application. For this reason the code is kept to a minimum, with no error handling or similar software development best practices.

Azure Functions

Serverless vs. App Service Plan

Even though my preference would have been to implement the POC using the free, serverless Azure Functions Consumption Plan, it turned out that its ability to manipulate files and directories is limited. The Consumption Plan does not come with a data directory.
The App Service Plan, on the other hand, allows for far greater freedom. It comes with a data directory (/home/data/) and it provides access to Advanced Tools (in the portal UI), which include a Bash shell to navigate the directories of the function instance. However, the downside is that the App Service Plan option is not supported in the Free and Shared plans.

For this POC the following cloud architecture was used: an Azure Resource Group with

- a Storage Account (Standard, Locally-redundant, with hierarchical namespace)
- a Function App (Code, Node.js 16 LTS, Linux OS) with App Service Plan
- an Azure function of type Timer trigger to allow for automation based on a schedule

Anatomy of an Azure Function

The bare-bones code of such a function is shown below. Code can be placed inside or outside the async function. When placed outside the async function, the code instructions are executed once on startup. When placed inside the async function, instructions are executed on each request/trigger handled by the function. This distinction is noteworthy as it will guide where the various application code snippets will be placed.

    // This code only runs at first startup of the function instance
    console.log('Hello World');

    module.exports = async function (context, myTimer) {
        // This code runs at each invocation/request
        var timeStamp = new Date().toISOString();

        if (myTimer.isPastDue) {
            context.log('JavaScript is running late!');
        }
        context.log('JavaScript timer trigger function ran!', timeStamp);
    };

Dependencies

The POC Azure function has three core dependencies: TagUI, Azure-Storage and ShellJS, which is used to execute shell commands. These dependencies are defined in the package.json file as shown below. To install the dependencies, simply run npm install from the command line. NPM will create a new directory named node_modules to hold all the installation files, including the TagUI source files. Note that npm install will raise security warnings. For the purposes of this POC they will be ignored.
The warnings relate to casper.js, a JavaScript utility that is installed as part of the TagUI npm package.

    {
      "name": "tagui-azfunc-nodejs",
      "version": "1.0.0",
      "description": "",
      "scripts": {
        "start": "func start",
        "test": "echo \"No tests yet...\""
      },
      "dependencies": {
        "tagui": "^5.0.0",
        "shelljs": "^0.8.5",
        "azure-storage": "^2.10.7"
      }
    }

TagUI

Adapting the Source Code

TagUI was not designed to run on a third-party platform. In order to integrate TagUI with Azure Functions, some modifications to the TagUI source code need to be made. The path to the source code file is node_modules/tagui/src/tagui.

- Change the shebang line at the top of the script from #!/usr/bin/env bash to #!/usr/bin/env /bin/bash
- Set the OpenSSL configuration environment variable: add export OPENSSL_CONF=/etc/ssl/ to the script (e.g. to line 6)
- In line 83, change tagui_web_browser="phantomjs" to tagui_web_browser="google-chrome"

My knowledge of the inner workings of Azure Functions is limited at this moment, and the changes listed above reflect the solution that worked for me. Other approaches might be possible. I identified these changes through trial and error, and with the help of Google searches and Stack Overflow.

Changing the shebang line is needed to accommodate the default file structure of the VM hosting the Azure Function. The default browser used by TagUI is PhantomJS, a headless web browser. I changed it to Chrome to use a browser I am familiar with. It may be possible to use the default TagUI browser. For the remaining modification, I can only make an educated guess. The openssl environment variable must be set in order for TagUI to interact with the target website.

Dependencies

A key obstacle to integrating TagUI with Azure Functions is its dependencies. TagUI requires PHP, Python and a web browser, none of which is native to Azure Functions.
It turns out that the npm package ShellJS can be leveraged from JavaScript to execute shell commands and to install those dependencies. All that needs to be done is to wrap shell commands inside an exec statement, e.g. shell.exec(apt-get …). The dependencies of the POC are installed by adding the code below to the index.js file. Note how these instructions are above (i.e. outside) the async function to ensure that they are executed only once, on startup.

    const shell = require('shelljs');

    // PHP
    shell.exec('apt-get install php -y');

    // Python
    shell.exec('apt-get install python3');
    shell.exec('ln -s /usr/bin/python3.9 /bin/python');

    // Chrome
    shell.exec('wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb');
    shell.exec('sudo dpkg -i google-chrome-stable_current_amd64.deb');

    module.exports = async function (context, myTimer) {
        // …
    };

The TagUI script

The TagUI script used for this POC is shown below. In two lines of code, an HTML table is scraped from a Wiki page and saved as a CSV file. The script should be saved with a .tagui extension, e.g. wiki.tagui, in a custom directory, e.g. scripts.

    // visit website
    https://en.wikipedia.org/wiki/Instrumental_temperature_record

    // save table
    table (//table)[1] to /home/data/filename.csv

The TagUI script is executed by the async function as shown below. The pattern to execute a TagUI script is path-to-tagui-executable, followed by path-to-tagui-script. Note the -h flag to indicate that TagUI should execute in headless mode, i.e. without opening a visible browser window.

    module.exports = async function (context, myTimer) {
        // execute TagUI script
        var tg = shell.exec('/home/data/node_modules/tagui/src/tagui /home/data/scripts/wiki.tagui -h');

        // see the TagUI logs
        context.log(tg);
        ...
    };

The Data Directory

The primary reason for developing this application using the App Service Plan option is that it provides access to a data directory from where TagUI can be executed.
The two directories that are required for executing the application are scripts, where the TagUI script resides, and node_modules, where the TagUI executable resides. The code below shows how to copy the directories recursively to /home/data. The copy instructions are above (i.e. outside) the async function to ensure that they are executed only once, on startup.

    shell.exec('cp -ar ./scripts/ /home/data/');
    shell.exec('cp -ar ./node_modules/ /home/data/');

Upload to Blob storage

The upload of the CSV file to blob storage can be achieved using the code below. The createBlobService() function will look for credentials. These can be provided in the form of the connection string for the Storage Account. The connection string should be saved in the Function App Configuration section under Settings. Create a new application setting named AZURE_STORAGE_CONNECTION_STRING to store the connection string.

    const azure = require('azure-storage');

    // createBlobService() is synchronous; it reads the connection string from the environment
    var blobSvc = azure.createBlobService();
    // destination_filename is a placeholder for the name the blob should get
    blobSvc.createBlockBlobFromLocalFile('container_name', destination_filename, '/home/data/filename.csv', function (error, result, response) {
        if (!error) {
            context.log('File successfully uploaded to Blob Storage');
            // remove the file after upload
            shell.exec('rm -r /home/data/filename.csv');
        }
    });

Publishing and Testing

The process for deploying the application to Azure Functions consists of two steps. First, the application is deployed and built in its non-customised state. Second, the custom TagUI code is pushed to the Azure cloud. The first deployment is done by right-clicking on the code directory and selecting Deploy to Function App. This will push the files to the Function App and perform the npm install of the dependencies. Once completed, the custom TagUI source code can be pushed using the following command:

    func azure functionapp publish <Function App name>

To test the application, click on Functions in the Azure Function App, open the function and click on Code + Test.
The screenshot below shows the output produced by the TagUI script during execution. To view the TagUI output, store the TagUI execution call in a variable (e.g. tg) and print the variable to the Test/Run console with context.log(tg);

Final Remarks

There are many tried and tested ways of automating RPA tasks. However, to the best of my knowledge, the integration of TagUI and Azure Functions has not been done before. I spent a good few hours trying to get this integration to work and was quite satisfied to have been able to develop a working POC. The proposed solution has, I believe, very little overhead and is easy to implement. TagUI is a powerful open-source RPA tool that can, with a little programming experience, be programmed to automate a great deal of mundane web-based tasks. Azure Functions provides a cost-effective cloud environment for hosting a TagUI RPA solution.

I actually set out to get this integration working in order to learn more about Azure Functions. My verdict is that the learning curve for Azure Functions is steep. There are a lot of options to choose from, testing is somewhat cumbersome, and the serverless offering is rather limited for non-standard projects. That said, once you get the hang of them, Azure Functions are indeed quite useful and versatile. I will certainly explore them further.

The repo with the code for this proof-of-concept integration of TagUI and Azure Functions can be found here: https://github.com/raoulbia/azure-function-node-tagui-poc
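
A possible refinement of the blob upload step, offered as a sketch rather than as part of the POC: the upload call in the azure-storage package is callback-based, so the async function may return before the upload has finished. Wrapping the call in a Promise lets the function await it and see any error. The helper name uploadToBlob is hypothetical, and the container and file names in the usage comment are the placeholders used earlier in the article.

```javascript
// Sketch only: uploadToBlob is a hypothetical helper, not part of the POC repo.
// It wraps the callback-style createBlockBlobFromLocalFile call in a Promise.
function uploadToBlob(blobSvc, container, blobName, localPath) {
    return new Promise((resolve, reject) => {
        blobSvc.createBlockBlobFromLocalFile(container, blobName, localPath,
            (error, result) => {
                if (error) {
                    reject(error);   // surface the failure to the caller
                } else {
                    resolve(result); // the upload has completed
                }
            });
    });
}

// Possible use inside the timer-triggered async function (assumed names):
//   const azure = require('azure-storage');
//   const blobSvc = azure.createBlobService();
//   await uploadToBlob(blobSvc, 'container_name', 'filename.csv', '/home/data/filename.csv');
```

Passing the blob service in as an argument keeps the helper independent of how the credentials are supplied, and makes it easy to try out with a stand-in object before deploying.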