AWS CodePipeline: Setup And Maintenance From Scratch

Written by WestPoint | Published 2020/06/27


AWS CDK is a relatively new framework that aims for faster development of AWS Cloud stacks when compared to AWS CloudFormation and AWS SAM. This article shows how to deploy a complete AWS CodePipeline using AWS CDK and how to troubleshoot the common issues that may occur while creating the CDK application. For additional information about the framework, read the previous article, “How AWS CDK facilitates the development process of AWS Cloud Stacks”.

AWS CDK setup

To create any AWS CDK application, the CDK must be installed on the machine and updated to its latest version. The following command does both:
npm install -g aws-cdk
After this command finishes, AWS CDK will be at its latest version. This matters because updates are released constantly, adding new functionality and fixing issues from previous versions.
To check whether the installed CDK version is the latest one, compare the latest release on the CDK GitHub repository with the output of the following command:
cdk --version
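The comparison can also be scripted. The sketch below is a minimal, hypothetical helper (parseVersion and isOutdated are not part of the CDK). It assumes that the version output starts with a semantic version, possibly followed by a build identifier, and that the latest release number has been looked up by hand; the version numbers are illustrative only.

```typescript
// Hypothetical helper: parses "MAJOR.MINOR.PATCH" from the start of a string,
// ignoring an optional leading "v" and anything after the patch number.
function parseVersion(text: string): [number, number, number] {
    const match = /^v?(\d+)\.(\d+)\.(\d+)/.exec(text.trim());
    if (!match) {
        throw new Error(`Not a semantic version: ${text}`);
    }
    return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Returns true when the installed version is older than the latest release.
function isOutdated(installed: string, latest: string): boolean {
    const a = parseVersion(installed);
    const b = parseVersion(latest);
    for (let i = 0; i < 3; i++) {
        if (a[i] !== b[i]) {
            return a[i] < b[i];
        }
    }
    return false; // identical versions
}

// Illustrative values only.
console.log(isOutdated('1.47.0 (build 0123abc)', 'v1.48.0')); // prints true
```

Comparing the parts as numbers rather than strings matters here: as plain strings, 1.9.0 would sort after 1.10.0.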

Quick starting an AWS CDK application

With the latest version of the AWS CDK installed, it is possible to start the CDK project that will generate the desired AWS cloud environment, containing all the resources needed for a given application. The following command starts the project:
cdk init app --language=typescript
The command above starts a CDK application using the app template, in TypeScript. Issues that may happen when using this command are:
  • Unknown init template: note that app is not the name of the application; it is the template used to quick start it. Three templates are available: app, lib and sample-app, and each template supports its own list of languages. The current default template is app;
  • Unsupported language: as described in the previous item, some languages are not yet supported by all templates. The command below lists all available templates and the languages each one supports in the installed CDK version:
    cdk init --list
  • No language was selected: at the time of this article, AWS CDK does not have a default language. One of the main goals of the framework is to let developers decide which programming language they want to use. Because of this, the error is raised when the init command is run without specifying a language;
  • cdk init cannot be run in a non-empty directory: since the command only specifies the template to be used, AWS CDK relies on the folder name to generate the name of the application, and it cannot be initialized in a folder that is not empty. To solve this issue, create a new folder and run the command again inside it.

AWS CDK constructs and versioning

The main goal when creating CDK applications is to easily deploy AWS Cloud resources. With that in mind, the construct module for each resource used in the application needs to be installed. Each language has its own way of installing those packages, which is documented in the troubleshooting section, and each construct module name can be found in its section of the API Reference.
The same CDK application is usually used for long periods of time, as it defines the entire architecture of AWS projects. As the project grows, it is common to add more constructs to the stacks. Using Amazon S3 as an example, this is done with the command below:
npm i @aws-cdk/aws-s3
Doing so may cause an issue: dependencies whose versions do not match, which triggers the error Argument of type 'this' is not assignable to parameter of type 'Construct'. It happens because constructs from different versions cannot interact with each other, so the this parameter being passed is marked as invalid.
To prevent this issue, keep all @aws-cdk packages on the same version by updating the dependencies every time a new construct is added to the project or a new version of the AWS CDK core is released:
npm update
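A version mismatch can be spotted before the compiler complains. The following is a minimal sketch, not a CDK feature: groupCdkVersions and hasVersionMismatch are hypothetical helpers that receive the dependencies object of package.json and compare the declared version strings exactly as written. The package versions in the example are made up.

```typescript
// Dependency map as found under "dependencies" in package.json.
type Dependencies = Record<string, string>;

// Groups every @aws-cdk/* package by its declared version. More than one
// group means the constructs may fail to type-check against each other.
function groupCdkVersions(deps: Dependencies): Map<string, string[]> {
    const groups = new Map<string, string[]>();
    for (const [name, version] of Object.entries(deps)) {
        if (!name.startsWith('@aws-cdk/')) {
            continue; // ignore non-CDK packages
        }
        const packages = groups.get(version) ?? [];
        packages.push(name);
        groups.set(version, packages);
    }
    return groups;
}

function hasVersionMismatch(deps: Dependencies): boolean {
    return groupCdkVersions(deps).size > 1;
}

// Illustrative package.json excerpt; the version numbers are made up.
const exampleDependencies: Dependencies = {
    '@aws-cdk/core': '1.47.0',
    '@aws-cdk/aws-s3': '1.48.0',
    'source-map-support': '^0.5.16',
};
console.log(hasVersionMismatch(exampleDependencies)); // prints true
```

Because the strings are compared literally, a range such as ^1.47.0 and a pinned 1.47.0 are reported as different, which is a deliberately strict choice for this sketch.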
Note also that some resources require additional dependencies. AWS CodePipeline, for example, may require features from the @aws-cdk/aws-codepipeline-actions module. Being aware of those cases helps prevent issues and delays in the stack development.

Development of an AWS CodePipeline Stack with AWS CDK

AWS CodePipeline is a cloud resource that provides CI/CD services in the AWS environment. This section walks through creating a CodePipeline using AWS CDK in TypeScript and solving some of the issues that may appear along the way. The example pipeline automates the deployment of a ReactJS application that lives in a GitHub repository and is hosted on an Amazon S3 bucket.
CodePipelines consist of stages, which group the actions performed on the artifacts that flow through the pipeline. The example pipeline has three stages:
  1. Source: the ReactJS application code is fetched from GitHub, triggered via webhook, to be used as input for the pipeline;
  2. Build: this stage builds the artifacts coming from the previous stage in a CodeBuild instance;
  3. Deploy: the last stage of this example pipeline deploys the built website to the S3 bucket that hosts it on the web.

Initializing the CodePipeline environment

To standardize all the resources in the cloud environment, it is good practice to prefix each resource name with the stack it belongs to. To do so, a property named environment is set in the cdk.json file and imported in a context file, together with any other environment variables the application may require.
This is how the JSON file should look after setting the property:
{
    "app": "npx ts-node bin/aws-cdk-example.ts",
    "context": {
        "@aws-cdk/core:enableStackNameDuplicates": "true",
        "aws-cdk:enableDiffNoFail": "true",
        "environment": "aws-cdk-example"
    }
}
Besides the property just set, the file stays as it was created by the init command. An issue that may happen when modifying this file or the project structure is the Cannot find module error, triggered when the file set in the app property is not present in the given path. It usually happens after the file has been renamed or moved to better suit the development team’s conventions.
To solve it, change the value of the property to the proper path of the file where the stacks are synthesized.
To read those values, a context function is recommended, importing all the context variables used in the application. The following snippet shows the function and its interface, reading the variable from the JSON file and returning it when called:
import { Construct } from '@aws-cdk/core';

interface IContext {
    environment: string
};

function getContext(app: Construct): IContext {
    return {
        environment: app.node.tryGetContext('environment')
    }
}

export default getContext;
Note that the string passed to the tryGetContext call must match the key in the JSON object; otherwise the result is an undefined value, which may cause errors depending on how the variable is used.
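One way to surface a mistyped key early is to fail as soon as the value is read. The sketch below is an assumption layered on top of the article's getContext, not part of it: requireContext is a hypothetical helper, and IContextSource only describes the node.tryGetContext call that the original snippet already uses, so that the example runs without the CDK installed.

```typescript
// Minimal shape of what the helper needs; a CDK App or Stack satisfies it
// through its `node` property.
interface IContextSource {
    node: { tryGetContext(key: string): any };
}

// Hypothetical helper: reads a context key and throws a descriptive error
// instead of silently returning undefined.
function requireContext(scope: IContextSource, key: string): string {
    const value = scope.node.tryGetContext(key);
    if (typeof value !== 'string' || value.length === 0) {
        throw new Error(`Context key "${key}" is missing from cdk.json`);
    }
    return value;
}

// Stand-in for a CDK app, used here only to exercise the helper.
const fakeApp: IContextSource = {
    node: {
        tryGetContext: (key: string) =>
            key === 'environment' ? 'aws-cdk-example' : undefined,
    },
};

console.log(requireContext(fakeApp, 'environment')); // prints aws-cdk-example
```

With this approach a typo such as 'enviroment' stops the synthesis with a readable message instead of producing resource names that start with undefined.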

Initializing the hosting bucket

This bucket hosts the ReactJS application, storing its files and serving them on the web. This example assumes that every user should be able to access the hosted website, so CORS rules and public read access are used to allow it.
The following snippet demonstrates how to define an S3 bucket to host a ReactJS application, using the aws-s3 module from AWS CDK:
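import { RemovalPolicy } from '@aws-cdk/core';
import { Bucket, CorsRule, HttpMethods } from '@aws-cdk/aws-s3';

// Allow GET requests from any origin.
const corsRules: CorsRule[] = [{
    allowedOrigins: ['*'],
    allowedMethods: [HttpMethods.GET]
}];

const bucketName = `${environment}-bucket`;

// Public website bucket; index.html also serves as the error document.
const bucket = new Bucket(this, bucketName, {
    bucketName: bucketName,
    publicReadAccess: true,
    cors: corsRules,
    websiteIndexDocument: "index.html",
    websiteErrorDocument: "index.html",
    removalPolicy: RemovalPolicy.DESTROY
});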
Note that environment in the code is the context variable previously set in the cdk.json file, which lets the cloud resources have names following the same pattern. Issues that may appear when creating such S3 buckets are:
  • Forbidden status when entering the website: this happens when the publicReadAccess property is not set on the S3 bucket since, for safety purposes, its default value is false. To solve this issue, set publicReadAccess to true, granting read access to everyone on the web who accesses the bucket URL;
  • Page not found: this issue, common in websites built with ReactJS and similar frameworks, happens because websiteErrorDocument points to a nonexistent file: ReactJS builds do not generate an error.html file to act as a redirect page when a route is not found. To solve this issue, set both websiteErrorDocument and websiteIndexDocument to index.html when creating the resource Construct;
  • Bucket name already taken: Amazon S3 bucket names must be globally unique. The bucket name must therefore be well chosen and very specific, preventing the BucketAlreadyExists error from being raised when deploying the CDK application.
As for the removalPolicy property in the bucket creation, it is set to DESTROY, overriding the default RETAIN value, so that the bucket can be removed when the cdk destroy command is run. This deletes the bucket if it is empty when the command is used; if the bucket contains any data, it still needs to be emptied and removed manually.
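For the bucket-name issue listed above, a name can at least be validated locally before deploying. The sketch below is hypothetical: isValidBucketName checks only a subset of the Amazon S3 naming rules and cannot tell whether a name is already taken, and buildBucketName assumes that appending a project-specific suffix (an account ID is used as a placeholder) is an acceptable way to make the name more specific.

```typescript
// Checks a subset of the Amazon S3 naming rules: 3 to 63 characters, only
// lowercase letters, digits, dots and hyphens, starting and ending with a
// letter or digit. It cannot tell whether the name is already taken.
function isValidBucketName(name: string): boolean {
    return /^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$/.test(name);
}

// Hypothetical naming helper: appends a project-specific suffix (for example
// an account ID) to make a global collision less likely.
function buildBucketName(environment: string, suffix: string): string {
    const name = `${environment}-bucket-${suffix}`.toLowerCase();
    if (!isValidBucketName(name)) {
        throw new Error(`Invalid S3 bucket name: ${name}`);
    }
    return name;
}

console.log(buildBucketName('aws-cdk-example', '123456789012'));
// prints aws-cdk-example-bucket-123456789012
```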

Building the website with AWS CodeBuild

After creating the bucket that hosts the website, the next step is creating the CodeBuild project that builds the website from the code coming from the GitHub repository. The first step is creating the AWS IAM Role for the CodeBuild project, granting it permissions to perform actions on other AWS cloud resources, such as the S3 buckets used by the pipeline. The following code snippet shows how to create this role:
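import { Effect, PolicyStatement, Role, ServicePrincipal } from '@aws-cdk/aws-iam';

const codebuildRoleName = `${environment}-codebuild-role`;

// Role that the CodeBuild service is allowed to assume.
const codebuildRole: Role = new Role(this, codebuildRoleName, {
    roleName: codebuildRoleName,
    assumedBy: new ServicePrincipal('codebuild.amazonaws.com')
});
codebuildRole.addToPolicy(new PolicyStatement({
    effect: Effect.ALLOW,
    actions: [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "s3:*"
    ],
    // Bucket ARN for bucket-level actions, object ARNs for object-level ones.
    // The logs actions have no effect on these ARNs; they need log-group ARNs.
    resources: [artifactsBucket.bucketArn, artifactsBucket.arnForObjects('*')]
}));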
The definition of the role has one important property: assumedBy. This property tells the AWS environment which service the role is going to work for. Here it is assigned to codebuild.amazonaws.com, which represents AWS CodeBuild. It is important to ensure that the proper service principal name is specified, as it creates the Trusted Entity that AWS IAM uses to give the resource the basic permissions to operate.
If the value is not properly set, an error like the following is raised when running the pipeline: Error calling startBuild: CodeBuild is not authorized to perform: sts:AssumeRole. To solve it, set the ServicePrincipal to the service principal of the desired AWS service.
The next part of the snippet handles the policies granted to the previously created role. IAM policies allow AWS cloud resources to perform actions on other resources in the environment. Each one is a PolicyStatement that usually contains effect, actions, resources and conditions; this CodeBuild project needs no conditions, so that property is not used. It is very common to have issues in this section of CDK applications, as there is no easy method to determine which permissions a resource needs in order to run.
With time and experience in architecting AWS projects, it becomes easier to define which permissions each resource requires, reducing the time spent on trial and error.
To solve policy issues, the best method is tailing the logs while the stack is being created and when the resource runs, as the console specifies which permission is missing from the policy and which resource needs to be accessed.
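To make the effect, actions and resources parts concrete, the sketch below builds the statements as plain IAM policy JSON rather than with CDK constructs, so it runs on its own. It is an illustration under assumptions: codebuildStatements is a hypothetical helper, the S3 actions are a reduced subset chosen for the example rather than the exact set this pipeline needs, and both ARNs are placeholders.

```typescript
// Shape of one statement in an IAM policy document.
interface IStatement {
    Effect: 'Allow' | 'Deny';
    Action: string[];
    Resource: string[];
}

// Builds two narrowly scoped statements instead of a single broad one:
// log actions target the log group, object actions target the bucket objects.
function codebuildStatements(bucketArn: string, logGroupArn: string): IStatement[] {
    return [
        {
            Effect: 'Allow',
            Action: ['logs:CreateLogGroup', 'logs:CreateLogStream', 'logs:PutLogEvents'],
            Resource: [logGroupArn, `${logGroupArn}:*`],
        },
        {
            Effect: 'Allow',
            Action: ['s3:GetObject', 's3:PutObject'],
            // Object-level actions apply to object ARNs, hence the "/*".
            Resource: [`${bucketArn}/*`],
        },
    ];
}

// Placeholder ARNs, for illustration only.
const statements = codebuildStatements(
    'arn:aws:s3:::aws-cdk-example-artifacts-bucket',
    'arn:aws:logs:us-east-1:123456789012:log-group:/aws/codebuild/aws-cdk-example-codebuild'
);
console.log(JSON.stringify(statements, null, 2));
```

Splitting the statements this way keeps each action next to the kind of resource it applies to, which makes a missing permission reported in the logs easier to map back to the policy.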
The next step is to create the CodeBuild resource itself, using the aws-codebuild module from AWS CDK. The following snippet demonstrates how to declare the CodeBuild construct:
const codebuildProjectName = `${environment}-codebuild`;
const codebuildProject = new PipelineProject(this, codebuildProjectName, {
    environment: {
        buildImage: LinuxBuildImage.STANDARD_2_0,
        computeType: ComputeType.SMALL
    },
    role: codebuildRole,
    projectName: codebuildProjectName
});
This resource is instantiated with the PipelineProject construct, which simplifies the usage of CodeBuild inside a CodePipeline project, as in this example. The construct has a large number of properties, but only some of them need to be set; this project uses environment, role and projectName.
The environment property sets the specifications of the CodeBuild instance used to build the project, and the chosen image needs to support the language of the project being built. The CodeBuild reference describes all images and their specifics; checking it prevents CodeBuild from raising an error informing that the runtime version is not supported by the selected build image.
Another property, not used in this project because its default value is correct, is buildSpec. The buildspec file contains the information the resource uses to build the source code, with step-by-step instructions to CodeBuild on how the build should proceed.
The lack of a file named buildspec.yml in the root of the project to be built, or of the custom file if a value is set for the property, results in an error labeled YAML_FILE_ERROR Message: YAML file does not exist. To solve it, create this file, containing the steps to build the application and following the pattern described in the AWS docs.
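As an illustration of what such a file contains, the sketch below writes a buildspec as a plain TypeScript object. Everything specific in it is an assumption: the Node.js runtime version, the npm commands and the build output directory match a default Create React App project and may differ in a real repository. The same structure can be translated by hand into buildspec.yml, or passed to the buildSpec property through BuildSpec.fromObject from the aws-codebuild module.

```typescript
// Sketch of a buildspec for a React application, written as a plain object.
// The runtime version, commands and "build" output directory are assumptions
// that match a default Create React App project.
const buildSpec = {
    version: '0.2',
    phases: {
        install: {
            'runtime-versions': { nodejs: 10 },
            commands: ['npm ci'],
        },
        build: {
            commands: ['npm run build'],
        },
    },
    artifacts: {
        // Only the compiled site is handed to the next stage.
        'base-directory': 'build',
        files: ['**/*'],
    },
};

// Printed only to inspect the structure.
console.log(JSON.stringify(buildSpec, null, 2));
```

Setting base-directory to the build output means the artifact passed to the next stage contains only the compiled website, not the source tree.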

Creating the AWS CodePipeline

The CodePipeline is where everything created so far comes together. Like CodeBuild, the pipeline needs its own IAM Role with the necessary permissions to operate successfully, as well as its definition, which uses the Pipeline construct from the aws-codepipeline module of AWS CDK.
The definition of the pipeline follows the pattern in the snippet below:
const codepipelineName = `${environment}-pipeline`;
const codepipeline: Pipeline = new Pipeline(this, codepipelineName,{
    artifactBucket: artifactsBucket,
    pipelineName: codepipelineName,
    role: pipelineRole,
});
const outputSource = new Artifact();
const outputBuild = new Artifact();
That short snippet is all that is necessary to declare the pipeline. The only parameter not yet covered in this article is artifactBucket, which receives the S3 bucket used to store the artifacts passed from stage to stage. It is an optional parameter, set here to standardize the bucket names; leaving it out would result in the creation of a bucket with a default name.
The Artifact items hold the files that are passed along the pipeline from one stage to another.
After everything is set up, all that is left is to declare the stages of the pipeline, where all the action happens. As previously stated, three stages are created: Source, Build and Deploy. The following sections show how to create those stages using the aws-codepipeline-actions module from AWS CDK.
Adding stages: Source
This stage is responsible for fetching the code from the GitHub repository, to be used as the source for the CodeBuild build. This stage can be added using the snippet below:
const { repo, owner, oauthToken } = github;
codepipeline.addStage({
    stageName: 'Source',
    actions: [
        new GitHubSourceAction({
            actionName: 'Source',
            oauthToken: new SecretValue(oauthToken),
            output: outputSource,
            owner,
            repo
        })
    ]
});
The github object is set in the cdk.json file and read the same way as environment was previously. Note that the object containing this information should match the interface defined in the context.ts file, allowing tryGetContext to get its value without issues. If the pattern is not followed, an error is triggered, as one or more properties will be undefined because their values are not present in the JSON file.
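A small runtime check can confirm that the object read from cdk.json really has the expected shape. The following sketch is hypothetical: the article does not show the github part of the context interface, so IGithubContext and isGithubContext are assumed names, and the property names are taken from the destructuring in the snippet above.

```typescript
// Assumed shape of the "github" object kept in the cdk.json context.
interface IGithubContext {
    owner: string;
    repo: string;
    oauthToken: string;
}

// Type guard: true only when all three properties are non-empty strings.
function isGithubContext(value: unknown): value is IGithubContext {
    if (typeof value !== 'object' || value === null) {
        return false;
    }
    const candidate = value as Record<string, unknown>;
    return ['owner', 'repo', 'oauthToken'].every(
        (key) => typeof candidate[key] === 'string' && candidate[key] !== ''
    );
}

// Placeholder values; a real token should never be committed to a repository.
const fromCdkJson: unknown = { owner: 'example-owner', repo: 'example-repo' };
console.log(isGithubContext(fromCdkJson)); // prints false, oauthToken is missing
```

Running such a guard right after tryGetContext turns a silent undefined into an explicit failure that names the problem before the stage is created.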
Basically, each stage contains a stageName and the actions to be performed. In this case, the stage is named Source and performs a GitHubSourceAction, which collects the code from the GitHub repository specified in its parameters.
Some issues may appear when creating this stage:
  • Authentication fails with GitHub oauthToken: GitHub provides developers with a token that must be passed to the pipeline to grant access to GitHub repositories. To solve this issue, generate an oauthToken in the repository owner’s profile with the following permissions: admin:repo_hook and repo. The generated token is passed to the pipeline in the oauthToken property as a SecretValue, the CDK type for values that should be kept secret; instead of keeping the token as plain text in cdk.json, it can be stored in AWS Secrets Manager and referenced with SecretValue.secretsManager;
  • Invalid repository/owner: this issue usually happens when the repository name and/or the owner are mistyped in the context file, generating an incorrect link that CodePipeline cannot find, or when the provided oauthToken does not have the necessary permissions to view the repository. CodePipeline reports it as: Either the GitHub repository “repo-name” does not exist, or the GitHub access token provided has insufficient permissions to access the repository.
Adding stages: Build
This stage is responsible for using the previously defined AWS CodeBuild resource to build the code coming from the previous stage. To define this stage, the example snippet below can be used:
codepipeline.addStage({
    stageName: 'Build',
    actions: [
        new CodeBuildAction({
            actionName: 'Build',
            project: codebuildProject,
            outputs: [outputBuild],
            input: outputSource
        })
    ]
});
This stage is far simpler than the Source stage, as the CodeBuild resource was already created in this application. It uses the outputSource artifact, which contains the source code coming from the repository, as shown in the Source stage snippet, to create the build of the application.
All CodeBuild issues appear in this stage, as this is when the resource is put to use. To debug them, open the CodeBuild project in the AWS Console and check the logs of the build. If successful, the build generates the compiled application, which is stored in the outputBuild artifact to be used in the next stage.
Adding stages: Deploy
This stage is responsible for delivering the website via Amazon S3, using the previously created bucket. It requires the S3DeployAction from the pipeline actions module, as shown in the snippet below:
codepipeline.addStage({
    stageName: 'Deploy',
    actions: [
        new S3DeployAction({
            actionName: 'Deploy',
            bucket: bucket,
            input: outputBuild
        })
    ]
});
This last stage requires just a few parameters. The only one not described yet is bucket, which refers to the bucket the application is deployed to, hosting the built website.
Most of the common issues that may happen in this stage were covered in the definition of the bucket, so at this point the pipeline should work.

Instantiating the artifactsBucket and the CodePipeline

Now, working directly in a class that extends a CDK Stack, the pipeline and its artifacts bucket are created, as the last step before deploying the pipeline. The following code snippet shows this creation:
const artifactsBucketName = `${environment}-artifacts-bucket`;
const artifactsBucket: Bucket = new Bucket(this, artifactsBucketName, {
    removalPolicy: RemovalPolicy.DESTROY,
    bucketName: artifactsBucketName
});
new CDKExamplePipeline(this, `${environment}-stack`, {
    artifactsBucket
});
This is quite a simple step, considering an S3 bucket has already been created and the artifactsBucket is a much less complex resource, as it does not need to host a website. As previously mentioned, this bucket is optional, since CodePipeline creates a bucket by default, but be mindful that the default bucket will most likely not follow the naming pattern used in the stack. As for the pipeline itself, the bucket is passed as a parameter, given that the CodePipeline class just created has a props interface like the snippet below:
interface ICDKExamplePipelineProps {
    artifactsBucket: Bucket
}
This interface is used as the type of the props value in the constructor of the class, allowing developers to create custom interfaces and pass any parameters they need.

Post-development issues

The development of the application might be the trickiest part, but issues can happen even after the code is completely right. Issues that happen after the development of the application may include:
  • Bootstrap required: when the issue Template too large to deploy (“cdk bootstrap” is required) appears, it means the CDK application generated a CloudFormation template larger than the maximum accepted for a direct deployment, which is 50 KiB. This issue is simple to solve: run the bootstrap command indicated in the terminal, which provisions the bucket to which the CDK application is uploaded and from which AWS CloudFormation imports it once the deploy command is used again;
  • Exceeding resource limit: since AWS CDK applications are converted to CloudFormation templates, they have the same limitations. One of them is the hard cap on the number of resources in a single stack: 200 at the time of this article. While this may seem a lot, a single CDK construct usually generates more than one resource in the stack. To solve this problem, splitting the application into multiple stacks is the recommended option, allowing developers to organize the cloud environment in multiple parts that are connected when deployed;
  • Resources not being deleted after cdk destroy: the cdk destroy command deletes a given stack and all of its resources. However, AWS blocks the removal of some resources that may contain data, such as Amazon DynamoDB tables and Amazon S3 buckets, leaving them in the cloud environment even after the entire stack has been destroyed. Specifying RemovalPolicy.DESTROY in the Construct definition allows CDK to destroy all empty resources created. For resources that are not empty, running cdk destroy raises an exception, and those resources then have to be emptied and deleted manually via the AWS console.
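For the resource limit in particular, the number of resources can be counted from the synthesized template before deploying. The sketch below is a hypothetical helper, not a CDK command: countResources and isNearLimit are assumed names, the 80% warning threshold is arbitrary, and the limit is a parameter because service limits can change. Real input would be the template JSON that cdk synth writes to the cdk.out directory; a tiny inline template is used here so the example runs on its own.

```typescript
// Minimal shape of a synthesized CloudFormation template.
interface ITemplate {
    Resources?: Record<string, { Type: string }>;
}

// Limit quoted in this article; kept as a default parameter value because
// service limits can change over time.
const RESOURCE_LIMIT = 200;

function countResources(template: ITemplate): number {
    return Object.keys(template.Resources ?? {}).length;
}

// Returns true when the stack uses at least `threshold` (0 to 1) of the limit.
function isNearLimit(template: ITemplate, threshold = 0.8, limit = RESOURCE_LIMIT): boolean {
    return countResources(template) >= limit * threshold;
}

// Tiny illustrative template with made-up logical IDs.
const sampleTemplate: ITemplate = {
    Resources: {
        WebsiteBucket: { Type: 'AWS::S3::Bucket' },
        WebsiteBucketPolicy: { Type: 'AWS::S3::BucketPolicy' },
    },
};
console.log(countResources(sampleTemplate)); // prints 2
```

The sample also shows why the cap is reached sooner than expected: a single public bucket construct already yields two resources, the bucket and its policy.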

Conclusion

This article pointed out the most common issues that may happen while developing cloud environments using AWS CDK and how to fix them. It is a complementary article to “How AWS CDK facilitates the development process of AWS Cloud Stacks”, which introduces the tool and gives an overview of how it can be used. As the framework evolves, those issues will be addressed by AWS and the community, improving the tool and the lives of every AWS developer.
Mark Avdi
CTO | WestPoint.io | Lover of all things Serverless

Published by HackerNoon on 2020/06/27