Always set concurrency limits on your Lambda functions, and add an AWS Config rule to enforce this guideline.
One Monday we came to work and saw that our bill had been going crazy on weekends. It turned out that one of our devs had screwed up and made a recursive call inside a Lambda function, so it kept invoking itself indefinitely.
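Here's a rough sketch of what this could look like with boto3. The function name, the limit of 10, and the rule name are placeholders; LAMBDA_CONCURRENCY_CHECK is the AWS managed Config rule that flags functions with no function-level concurrency limit.

```python
import boto3

lambda_client = boto3.client("lambda")
config_client = boto3.client("config")

# Cap the function so a runaway recursive loop can't scale out indefinitely.
# "image-resizer" and the limit of 10 are placeholder values.
lambda_client.put_function_concurrency(
    FunctionName="image-resizer",
    ReservedConcurrentExecutions=10,
)

# AWS managed Config rule that marks Lambda functions without a
# function-level concurrency limit as non-compliant.
config_client.put_config_rule(
    ConfigRule={
        "ConfigRuleName": "lambda-concurrency-check",
        "Source": {
            "Owner": "AWS",
            "SourceIdentifier": "LAMBDA_CONCURRENCY_CHECK",
        },
        "Scope": {"ComplianceResourceTypes": ["AWS::Lambda::Function"]},
    }
)
```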
Don’t be afraid to contact AWS support in case of weird behavior.
I was working with S3 event notifications to trigger a Lambda function, but I couldn't manage to bind the Lambda to the S3 bucket: an error with no description kept coming up. The team started building a workaround, but I contacted the free AWS support tier and they explained some undocumented behavior that allowed us to solve the problem.
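I won't try to reproduce the undocumented detail here, but for reference, the basic wiring with boto3 looks roughly like the sketch below (bucket name, function name, and ARN are placeholders). The add_permission step is the one people most often miss; without it, attaching the notification fails with a fairly unhelpful validation error.

```python
import boto3

lambda_client = boto3.client("lambda")
s3_client = boto3.client("s3")

# Placeholder ARN for the target function.
function_arn = "arn:aws:lambda:us-east-1:123456789012:function:process-upload"

# S3 must be allowed to invoke the function before the notification
# can be attached to the bucket.
lambda_client.add_permission(
    FunctionName="process-upload",
    StatementId="allow-s3-invoke",
    Action="lambda:InvokeFunction",
    Principal="s3.amazonaws.com",
    SourceArn="arn:aws:s3:::my-upload-bucket",
)

# Attach the event notification to the bucket.
s3_client.put_bucket_notification_configuration(
    Bucket="my-upload-bucket",
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                "Events": ["s3:ObjectCreated:*"],
            }
        ]
    },
)
```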
Don’t be afraid to request a quota increase even if you can’t measure exactly how much you need.
We ran into some weird behavior with Step Functions that made working with them almost impossible, and simply requesting higher quotas didn’t resolve the problem.
So Amazon provided us with Premium Support for free. We supplied all the information the support engineers requested, and the problem was solved.
Sounds very simple, but the whole team had been reluctant to mess with it and preferred to suffer, simply because they didn’t know the issue was that easy to fix.
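If you want to try the self-service route first, quota increases can be requested through the Service Quotas API. A minimal sketch with boto3, assuming the Step Functions service code "states"; the quota code and desired value below are placeholders:

```python
import boto3

quotas = boto3.client("service-quotas")

# List Step Functions quotas to find the code of the one you need.
for quota in quotas.list_service_quotas(ServiceCode="states")["Quotas"]:
    print(quota["QuotaCode"], quota["QuotaName"], quota["Value"])

# Request an increase; the quota code and value are placeholders.
quotas.request_service_quota_increase(
    ServiceCode="states",
    QuotaCode="L-XXXXXXXX",
    DesiredValue=5000.0,
)
```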
Use Compute Savings Plans.
For most projects, compute makes up the biggest part of the bill, so any money you can save there is noticeable. Compute Savings Plans give you flexibility while still saving money: you commit to using a particular amount of compute without pinning down the specifics (instance family, size, region, and so on).
So I suggest establishing the bare CPU minimum that you will always use and covering that minimum with a Compute Savings Plan, as in the sketch below.
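Cost Explorer can actually suggest that baseline for you. A minimal sketch with boto3; the term, payment option, and lookback window are just example choices:

```python
import boto3

ce = boto3.client("ce")  # Cost Explorer

# Ask AWS to recommend a Compute Savings Plan commitment based on
# the last 60 days of actual usage.
response = ce.get_savings_plans_purchase_recommendation(
    SavingsPlansType="COMPUTE_SP",
    TermInYears="ONE_YEAR",
    PaymentOption="NO_UPFRONT",
    LookbackPeriodInDays="SIXTY_DAYS",
)

summary = response["SavingsPlansPurchaseRecommendation"][
    "SavingsPlansPurchaseRecommendationSummary"
]
# Includes the suggested hourly commitment and estimated savings.
print(summary)
```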
Use new generation instances.
AWS regularly adds new instance types and optimizes their performance, thus reducing cost. You should periodically check for these improvements. This applies not only to EC2 instances but also to ElastiCache, RDS, and Redshift.
As an example, Amazon recently released a small 3rd-generation RA3 instance for Redshift. Despite having half the CPU and RAM for the same money, it still works faster. We also got elastic storage for Redshift thanks to this move.
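When you do move to a newer node type, the switch itself is just a resize. A rough sketch with boto3; the cluster identifier, node type, and node count are placeholders, and you should check whether an elastic resize is supported for your source/target combination first:

```python
import boto3

redshift = boto3.client("redshift")

# Resize the cluster onto new-generation RA3 nodes.
# Classic=False requests an elastic resize where supported.
redshift.resize_cluster(
    ClusterIdentifier="analytics-cluster",  # placeholder
    NodeType="ra3.xlplus",
    NumberOfNodes=2,
    Classic=False,
)
```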
Redshift has no built-in way to guarantee record uniqueness: you can declare primary and unique keys, but they are informational only and are not enforced.
It was a surprise to me that Redshift has no native functionality to enforce record uniqueness.
For example, SELECT DISTINCT will not always do what you expect; you can’t rely on it alone, as it can still return improper data. You have to deduplicate explicitly, as in the sketch below.
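Here's a rough sketch of explicit deduplication using the Redshift Data API via boto3. The cluster, database, user, table, and column names are all made-up placeholders; the idea is simply to keep the latest row per business key.

```python
import boto3

rsd = boto3.client("redshift-data")

# Keep only the latest row per business key, since Redshift will not
# enforce uniqueness for us. Table and column names are placeholders.
dedup_sql = """
    SELECT *
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY order_id
                   ORDER BY loaded_at DESC
               ) AS rn
        FROM staging_orders
    ) AS dedup
    WHERE rn = 1;
"""

rsd.execute_statement(
    ClusterIdentifier="analytics-cluster",  # placeholder
    Database="analytics",
    DbUser="etl_user",
    Sql=dedup_sql,
)
```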
Use S3 Intelligent-Tiering.
This is probably the most obvious advice.
Intelligent-Tiering automatically moves objects between a frequent access tier (priced like S3 Standard) and an infrequent access tier (priced like Standard-IA) depending on usage patterns.
There is a small per-object monitoring charge associated with this option, but overall it is really cheap and you can benefit from it greatly.
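One way to adopt it is a lifecycle rule that moves objects into Intelligent-Tiering right after upload. A minimal sketch with boto3; the bucket name and rule ID are placeholders:

```python
import boto3

s3 = boto3.client("s3")

# Transition all new objects to the Intelligent-Tiering storage class
# immediately (Days=0), letting S3 shuffle them between access tiers.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-data-bucket",  # placeholder
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "move-to-intelligent-tiering",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # apply to the whole bucket
                "Transitions": [
                    {"Days": 0, "StorageClass": "INTELLIGENT_TIERING"}
                ],
            }
        ]
    },
)
```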