Things are heating up in the CouchDB universe now that CouchDB 2 is an out-of-the-box multi-master database that can scale to store a lot of data! Unfortunately, there is still a bit of a shortage of documentation when it comes to using CouchDB 2 in production. The point of this tutorial is to take you step by step through the process of setting up a CouchDB cluster in production using AWS and Docker. We’ve used a similar setup for Quizster, a digital dropbox and grading system, and it is working great!

The setup below uses open source software and can therefore easily be adapted to work on the Google Cloud Platform, Azure or any other hosting provider, i.e. no vendor lock-in. Moreover, because we are using open source software, you can also set up a local environment to develop against! (VirtualBox and Vagrant are great for this.)

Why are we going to use Docker?

Keeping up to date with the latest version of a database can be a real drag. One of the latest trends is to just stand up a new server and migrate your data over each time you need to upgrade. In some cases, this is the best option, but by using Docker, we also have the option of just issuing a docker update when a new CouchDB docker image is released. This way, we don’t need to worry about whether our distro has the latest CouchDB binary and don’t have to fight our way out of dependency hell. Moreover, we can easily stand up a new server, install docker on this server and then bam, run a docker image for CouchDB! Docker also has some nice built-in functionality for handling restarts when your servers are rebooted or CouchDB just crashes.

Our initial design was pretty ambitious and used Docker Swarm with AWS’s Network File System, called EFS. The advantage of this design was that you could stand up a cluster of docker swarm nodes and then just use docker service scale to add more CouchDB nodes. The deal breaker, however, was that we found that running CouchDB on top of EFS made the database over 10 times slower! In addition, Docker Swarm doesn’t appear to allow routing to a swarm node based on task slot. So, we decided to drop Docker Swarm in favor of a design where our CouchDB images are statically bound to specific servers. (Managing persistent storage with Docker Swarm is a known issue and nothing has yet really emerged to solve this problem.)

Here is what we are going to do:

1. Create two EC2 instances on AWS, both running Docker. Each node will be located in a different availability zone (physical location).
2. Run an instance of the CouchDB image on each EC2 instance
3. Run a simple script to connect the CouchDB nodes
4. Use a load balancer to distribute traffic to each node according to load and availability. The load balancer will also be used to serve database traffic over SSL.

Note: AWS has a free tier, but it isn’t going to cover all the costs incurred by following the steps in this tutorial. Fortunately, AWS charges by the hour, so you can easily follow this tutorial and then destroy all the pieces without incurring much of a cost. If you were to continue to use this setup in one of the cheaper regions, e.g. the US West region, you’d be looking at a monthly bill of about $26 ($16 for the load balancer + $10 for the EC2 servers). This is pretty darn good for a production-ready 2-node CouchDB cluster!

I’ll assume you have little to no AWS experience. If this assumption is wrong, then please feel free to skip around.

Step 1 — Create an AWS account

1. Create a free AWS account

Step 2 — Import Your Public SSH Key

Overview: like most modern hosting providers, AWS encourages users to connect to their servers via SSH keys instead of passwords, as passwords are a lot easier to crack.

1. Search for the EC2 service
2. Select Key Pairs
3. Click Import Key Pair. You’ll then need to paste in your public SSH key and click Import. On Mac/Linux based systems, this text is found in ~/.ssh/id_rsa.pub

Step 3 — Create Security Groups

Overview: security groups allow your servers to communicate with each other in a private cloud while exposing specific ports to the world. We are going to create 2 security groups as this configuration will give us a lot of flexibility to make changes in the future.

1. From the EC2 dashboard, click Security Groups
2. Click Create Security Group
3. Enter a name and description of ssh and specify an inbound rule on port 22 from anywhere. Adding this rule simplifies our setup, but exposes a security hole where any box can SSH into our servers (assuming they have our SSH key). Therefore, after you have completed this tutorial, you should remove the port 22 rule and set up a VPN instead.
4. Repeat the steps above to create a new security group, except call this new group couchdb-load-balancer and create a rule to allow inbound connections on port 443 from anywhere.

When you are done, you should have 3 security groups: default, ssh and couchdb-load-balancer.

Step 4 — Create The 1st EC2 Instance

1. Return to the EC2 Dashboard and then click Launch Instance
2. Select Ubuntu (you can of course select almost any other OS that runs docker, but this tutorial is tailored for Ubuntu)
3. Select t2.nano and click Review and Launch
4. On the next screen, click Edit security groups
5. Select the ssh and default security groups and click Review and Launch
6. Then click Launch
7. Choose the key pair that you imported above and click Launch Instances
8. Click View Instances. Select the instance and make a note of the Public DNS and Private IP. We’ll refer to this Public DNS as DB1-PUBLIC-DNS and this Private IP as DB1-PRIVATE-IP.

Note: if you ever stop and then start this instance, the Public DNS will change.
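The Public DNS and Private IP noted above come up repeatedly in the following steps, so it can be handy to keep them in shell variables on your local machine. The snippet below is only a sketch: the variable names and sample values are placeholders (they do not belong to a real instance), and is_private_ipv4 is a hypothetical helper that checks the address falls in one of the RFC 1918 private ranges, which may help catch the two values being swapped by accident.

```shell
#!/bin/sh
# Placeholder values (assumptions): replace with the ones shown in the EC2 console.
DB1_PUBLIC_DNS="ec2-203-0-113-10.us-west-2.compute.amazonaws.com"
DB1_PRIVATE_IP="172.31.16.10"

# Hypothetical helper: succeeds if $1 looks like an RFC 1918 private IPv4 address
# (10.0.0.0/8, 172.16.0.0/12 or 192.168.0.0/16).
is_private_ipv4() {
  case "$1" in
    10.*.*.*|192.168.*.*) return 0 ;;
    172.1[6-9].*.*|172.2[0-9].*.*|172.3[01].*.*) return 0 ;;
    *) return 1 ;;
  esac
}

if is_private_ipv4 "$DB1_PRIVATE_IP"; then
  echo "Private IP looks plausible: $DB1_PRIVATE_IP"
  echo "Connect with: ssh ubuntu@$DB1_PUBLIC_DNS"
else
  echo "Warning: $DB1_PRIVATE_IP is not in a private range" >&2
fi
```

Keep in mind the note above: the Public DNS value changes whenever the instance is stopped and started, so the first variable would need to be refreshed in that case.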
Step 5 — Install Docker and Run the CouchDB Container

1. SSH into the EC2 instance
$ ssh ubuntu@DB1-PUBLIC-DNS

2. Download and run scripts to configure Ubuntu and Docker
$ git clone https://github.com/redgeoff/docker-ce-vagrant
$ cd docker-ce-vagrant
$ sudo ./ubuntu.sh # Select "keep the local version ... "
$ sudo ./docker.sh

3. Create a directory for hosting your DB files
$ mkdir /home/ubuntu/common

4. Run a CouchDB Docker Container and make sure to replace DB1-PRIVATE-IP accordingly.
$ sudo docker run -d --name couchdb ...

Notes:
- Docker only has to download the image once and then will just run the container on all subsequent starts/restarts.
- The --restart always parameter ensures that your CouchDB node will automatically restart if it crashes or when the server is rebooted.
- All the nodes in your cluster must use the same values. The hashed password value above will result in the password admin. You can use the [couch-hash-pwd](https://github.com/redgeoff/couch-hash-pwd) utility to generate this hash. For example, if your password is mypassword you can use couchdb-hash-pwd -p mypassword

5. Enable CORS so that your application can communicate with the database from another domain/subdomain.
$ curl -sL https://deb.nodesource.com/setup_8.x | sudo -E bash -
$ sudo apt-get install -y nodejs build-essential
$ sudo npm install npm -g
$ sudo npm install -g add-cors-to-couchdb
$ add-cors-to-couchdb -u admin -p admin http://localhost:5984

Step 6 — Create Another EC2 Instance

Overview: we are now going to create another EC2 instance and then run another CouchDB docker container. Most of the steps are the same as before. (An alternative route, which isn’t covered by this tutorial, is to create an Amazon Machine Image (AMI) of the 1st EC2 instance and then use this AMI to create other instances. This is a good option if you are going to be spinning up many nodes.)
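Since the same container setup is repeated on every node, a quick sanity check that can be reused on each instance may save some head-scratching later. The sketch below relies on the fact that CouchDB answers a GET on its root URL with a small JSON document containing "couchdb":"Welcome". The helper names is_couch_welcome and check_node are hypothetical, and port 5984 is assumed because it is the port used throughout this tutorial.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given text looks like CouchDB's welcome JSON.
is_couch_welcome() {
  printf '%s' "$1" | grep -Eq '"couchdb"[[:space:]]*:[[:space:]]*"Welcome"'
}

# Hypothetical helper: fetch the root endpoint of a node and inspect the reply.
# Assumes CouchDB is listening on port 5984, as in this tutorial.
check_node() {
  host="$1"
  body=$(curl -s --max-time 5 "http://$host:5984/") || return 1
  is_couch_welcome "$body"
}

# Usage (run on the EC2 instance itself):
#   check_node localhost && echo "CouchDB is up"
```

If the check fails, sudo docker ps and sudo docker logs couchdb are reasonable places to start looking, assuming the container is named couchdb as in this tutorial.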
1. Return to the EC2 Dashboard and select Instances
2. Select the 1st instance and then select Launch More Like This
3. Click the Configure Instance tab at the top of the page and be sure to select a different subnet/zone. Why? Well, we want our two CouchDB nodes to be located in different physical locations, also known as Availability Zones in the AWS world. This way, if there is something like a natural disaster in one zone, we won’t lose any data as our other node will remain intact. (Note: AWS works its magic to make sure that it is super fast to transfer data between different availability zones, but the data transfer between regions is a lot slower. Therefore, you should not attempt to run a cluster of nodes across different AWS regions.)
4. Click Review and Launch, then Launch, select your SSH key and click Launch Instance.
5. Make a note of the Public DNS and Private IP of this new instance and repeat Step 5 to update Ubuntu, install docker and run the CouchDB container. In the docker run command, be sure to use the Private IP of your 2nd EC2 instance.

Step 7 — Create the Cluster

SSH into either EC2 instance and run the following commands. Be sure to replace DB1-PRIVATE-IP and DB2-PRIVATE-IP accordingly. This script connects the 2 nodes and creates system databases.

$ git clone https://gist.github.com/redgeoff/5099f46ae63acbd8da1137e2ed436a7c create-cluster
$ cd create-cluster
$ chmod +x ./create-cluster.sh
$ ./create-cluster.sh admin admin 5984 5986 "DB1-PRIVATE-IP DB2-PRIVATE-IP"

You can then use curl http://admin:admin@localhost:5984/_membership to ensure that your cluster has been configured correctly. In the all_nodes entry, you should see both your values for DB1-PRIVATE-IP and DB2-PRIVATE-IP. If you don’t, double check the parameters in your docker run command. Note: COUCHDB_USER, COUCHDB_PASSWORD, COUCHDB_SECRET and the value used after setcookie must be the same on all nodes. See Node Management for more info on how to troubleshoot the cluster.

Step 8 — Import an SSL Certificate

I highly recommend that you buy an SSL certificate if you do not already have one, as transferring database data over an insecure connection just isn’t going to cut it in production. If you don’t have an SSL certificate and wish to purchase one, there is a great deal for $42/yr for the AlphaSSL Wildcard Certificate. If you wish to proceed without SSL, skip this step.

1. Click on the cube in the top-left corner of the page and search for the Certificate Manager
2. Click Getting Started
3. Click Import Certificate
4. Enter the certificate details, click Review and Import and then click Import.

Step 9 — Set Up a Load Balancer

1. On the EC2 Dashboard, select Load Balancers.
2. Click Create Load Balancer
3. Select Application Load Balancer
4. Specify HTTPS and port 443. If you wish to proceed without SSL (not recommended) then you can use HTTP and port 80.
5. Select all the availability zones and click Next: Configure Security Settings.
6. Choose an existing certificate and then click Next: Configure Security Groups.
7. Select the couchdb-load-balancer and default security groups and then click Next: Configure Routing.
8. Configure the routing and click Next: Register Targets
9. Select both your EC2 instances and click Add to registered.
10. Then click Create.

Step 10 — Configure the DNS

Overview: we are going to set up DNS routing via AWS’s awesome Route 53 service as it can dynamically map to our load balancer.

1. Click on the cube in the top-left corner and search for Route 53
2. Click Get started now
3. Click Create Hosted Zone
4. Enter the hosted zone details
5. Check the Alias box, click on the Alias Target and select your load balancer.
6. Make a note of the name servers in your hosted zone.
7. Visit the domain registrar with which you have registered your domain name, e.g. GoDaddy, Google Domains, AWS, etc. and point your domain to these name servers.
You’ll probably have to wait a few minutes until the DNS switches over.

Step 11 — Relax

Spin up Fauxton by visiting https://db.mydomain.com/_utils and log in with admin/admin. It’s time to relax!

(Note: if the DNS is slow to propagate, you can access your database via the Public DNS for your load balancer, e.g. https://LOAD-BALANCER-PUBLIC-DNS/_utils. Just click through the SSL warning displayed by your browser.)

Step 12 — Update When A New Version of CouchDB Is Available

One of the coolest things about this setup is that you can update to the latest version of CouchDB just by running the following on all your boxes:

$ sudo docker pull couchdb
$ sudo docker rm couchdb --force
$ sudo docker run -d --name couchdb ... # See docker run above

And, this can be done one node at a time, because the CouchDB API maintains backwards compatibility. Of course, having a backup is always a best practice in case something unexpected happens.

If you enjoyed this tutorial, please like it and share it. And, if you have any feedback, please leave it below.

About the Author

Geoff Cox is the creator of MSON, a new declarative programming language that can be used to generate an app from JSON. He’s been self-employed for the greater part of the last 15 years and loves taking on ambitious, yet wife-maddening, projects like creating a database and distributed data syncing system. You can reach him @redgeoff7 or at github.