Warming Up

In the previous post I proposed a new approach to architecture specification. In that post, I argued that current modeling practices use too limited a vocabulary and produce artifacts without consistent semantics:

Serverless Architecture Language: Could we make it better? (medium.com)

As a running sample I chose the MakirOto application, presented by the IL AWS Solutions Architects team at the recent AWS Tel Aviv Summit.

Applying the new modeling approach to the MakirOto Frontend component produced the following high-level Process Model:

[Figure: MakirOto Process Model so far]

In this diagram, dashed rectangles represent services, implemented as AWS CloudFormation stacks, and arrows denote visibility and access permissions between computations and resources from various stacks.

The first post was about modeling online services. In this post, I will take a closer look at consistent modeling of long-running background processes. For that purpose I will use another MakirOto component, the Data Collector. This component is responsible for crawling social networks in order to obtain important information about user connections and interests.

MakirOto Data Collector: Starting Point

This is the Data Collector architecture presented during the session:

[Figure: MakirOto Data Collector]

In this diagram, some icons represent computations and resources, while others still represent AWS services. Connection lines do not have any clear semantics. Let’s see if we could improve it without too much effort:

[Figure: MakirOto Data Collector Process Model]

This is, indeed, an improvement. Here, every icon represents either a computation process instance (AWS Lambda, Step Function, Fargate Service) or a fully managed resource (S3 Bucket, SQS Queue). Every arrow denotes visibility and access rights (not control or data flow!).

Although this diagram is now semantically consistent, it is still hard to reason about.
It has too many boxes, too many connections, and too many unrelated concepts combined on one page. It is still hard to say what exactly this component does and how it relates to the rest of the system. To make progress we need to break this diagram into smaller chunks. Let’s start with the Crawler.

MakirOto Crawler

[Figure: MakirOto Crawler Process Model]

[Figure: MakirOto Crawler Typical Event Sequence]

With these two diagrams we have a better understanding of what’s going on, and can start evaluating alternatives. The whole purpose of introducing a consistent modeling language for serverless architecture is to enable systematic evaluation of multiple, clearly articulated alternatives. Ideally, every model element has to be scrutinized, justified, and compared with possible alternatives. While that might not always be possible due to time constraints, the architecture language must support such a process to the full extent.

Let’s start with the Worker Fargate service.

AWS Fargate: First Class Serverless Citizen

Although it is not officially admitted by AWS yet, there is a growing consensus among AWS practitioners that the AWS Fargate service is a first-class citizen in the Serverless Land. AWS Lambda and AWS Fargate constitute two alternative resolutions of the same “cost vs. control” trade-off. Like AWS Lambda, the AWS Fargate model does not break the main constraint of not managing servers directly.
Differences between the two options are summarized in the table below:

[Figure: AWS Lambda vs AWS Fargate (comparison table)]

You may find some interesting analysis of AWS Lambda performance here:

Comparing AWS Lambda performance when using Node.js, Java, C# or Python: How differently does a function perform when using the different programming languages supported by AWS Lambda? (read.acloud.guru)

and here:

Comparing AWS Lambda performance of Node.js, Python, Java, C# and Go: An updated runtime performance benchmark of all five programming languages supported by AWS Lambda (read.acloud.guru)

You may find an interesting experience report about migrating to AWS Fargate here:

Migrating to AWS ECS Fargate in Production 🚀: Following my talk at the AWS Summit Tel-Aviv 2018, I’m sharing our end to end journey of migrating our production… (medium.com)

The main reason for choosing Fargate over Lambda for the Worker is that a massive download from social networks, especially of photos, may take more than 5 minutes, the AWS Lambda execution time limit at the time of writing. Also, sometimes, due to external API constraints, this process must not be interrupted. Notice, however, that when the user base grows above a certain size, the crawling process will run all the time anyhow, refreshing social network data for existing users alongside the initial download for new ones. In the steady state, there will be almost no idle time to pay for.

The Crawler Tasks queue is also justified: crawling requests may come in bursts when, for example, many new users register at once. The queue is a good way of smoothing these bursts out.

The second Fargate service, Poller, however, raises some questions. Its only purpose is to periodically check whether the AWS Step Function has a pending activity task and to send a corresponding crawling task specification to the Crawler Tasks queue. That will not happen all the time, and we are going to pay for idle time. What would be an alternative?
Quite simple: wrap sending a crawling task request to the queue with a Lambda Function:

[Figure: Invoking a Lambda Function to Submit a Crawler Task]

Notice some subtle yet important changes in naming. Personally, I always prefer concrete, domain-specific names over generic but less informative names such as Poller, Worker, Manager, Dispatcher, Init, Update, and so on.

At the end of each crawling sequence, the Crawler Fargate service directly updates the Data Collector pending activity status. Is it really justified? Why does the Crawler micro-service need to know that it is orchestrated by a Step Function? This seems to be unnecessary coupling. What would be an alternative? One possibility is to use an AWS SNS Topic to signal that the crawling task is over:

[Figure: Introducing Crawler Task Status Notification]

As a nice side effect we will now be able to keep track of the crawling progress for other purposes, such as monitoring. AWS SNS is not the only possible notification mechanism. You will find a more detailed analysis here:

How to choose the best event source for pub/sub messaging with AWS Lambda: AWS offers a wealth of options for implementing messaging patterns such as Publish/Subscribe (often shortened to… (medium.freecodecamp.org)

There is another subtle problem with the current design. In order to understand what it is, we need to look at the Crawler service implementation model:

[Figure: Crawler Service Implementation Model]

This is a violation of the Open-Closed Principle, which states that the system should support adding new functionality without modifying existing functionality.

From the Implementation Model above it appears that we will need to rebuild and re-deploy the whole Crawler Service Docker image every time we decide to support a new social network, extract more data from an existing one, or upgrade to a new version of some API. Introducing any of these changes will jeopardize the operational stability of the whole service.
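To make the coupling tangible, here is a minimal sketch of what the entry point of a single, all-in-one crawler image tends to look like. It is not taken from the MakirOto code base: the function names, the task fields (`network`, `user_id`) and the choice of Instagram as a second network are all assumptions made for illustration only.

```python
# Hypothetical sketch: none of these names come from the MakirOto code base.

def crawl_facebook(user_id: str) -> dict:
    # Placeholder for real Facebook API calls.
    return {"network": "facebook", "user_id": user_id, "items": []}


def crawl_instagram(user_id: str) -> dict:
    # Placeholder for a second, assumed social network.
    return {"network": "instagram", "user_id": user_id, "items": []}


def handle_crawl_task(task: dict) -> dict:
    """Single entry point shipped inside the one Crawler image."""
    network = task["network"]
    user_id = task["user_id"]
    if network == "facebook":
        return crawl_facebook(user_id)
    if network == "instagram":
        return crawl_instagram(user_id)
    # Supporting another network means editing this function and
    # rebuilding the image that contains all the other crawlers too.
    raise ValueError(f"unsupported social network: {network}")
```

Every new network, extra data field or API upgrade touches this one module, so a defect introduced for one network is deployed together with the code paths of all the others.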
An alternative would be to encapsulate the crawling process for each social network into a separate service:

[Figure: Individual Crawler Service per Social Network]

This diagram looks a bit complicated. Let’s compress it into a high-level overview:

[Figure: MakirOto Crawler High-Level Structure]

Now, individual social network crawlers can be documented separately. For example:

[Figure: MakirOto Facebook Crawler]

We have made good progress in specifying an architecture of serverless long-running background tasks implemented on top of the AWS Fargate service. To complete the picture, we also need to look at long-running workflow processes, sometimes called Sagas, implemented using AWS Step Functions. In the case of the MakirOto application, we will need to take a closer look at the internal details of the Data Collector service. I will cover this topic in the next post.

Concluding Remarks

The architecture process is primarily about evaluating multiple alternatives and communicating decisions in a clear and unequivocal way. Without multiple alternatives on the table we are at risk of slipping from engineering into ideology, which is bad for business.

To be able to evaluate multiple alternatives and to communicate the final decision, we need to specify them precisely. For that purpose we need a suitable language. What we use currently is based on a very limited vocabulary and inconsistent semantics. For that reason, I started looking for an alternative based on the seminal work of P. Kruchten.

Developing a new language by writing a grammar book is a hopeless task. Languages are living creatures. To develop a language one has to speak it, to write prose and poems. To make bad jokes, if necessary. In the case of serverless architecture that means analyzing case studies and retelling their stories using this new language. The more, the better.

This is what I started doing with the MakirOto application. The IL AWS Solutions Architects team did a very decent job of picking a really good sample application.
This application will continue supplying great material for another couple of posts. After that I might start looking elsewhere, including analysis of Medium posts tagged with “serverless”.

The current version of the Serverless Architecture Language (shall I call it SAL?) is far from perfect. It will definitely need to undergo multiple modifications. Personally, I believe it can succeed only through a joint community effort. Premature fixation or, Heaven forbid, commercialization would be a fatal blow.

If you have a case study that you would like to try retelling in this emerging language, drop me a line. Otherwise, stay tuned for the next post. In any case, I would love to hear what you think.

Consistent modelling of Serverless long-running background tasks
