Basics of running anything on AWS (or anything) part 1 — setup, running, logging

EC2, ECR, Docker, systemd, and basic CD capability

This article will give you a basic understanding of how to prepare your code for deployment and execution anywhere, and then show you how to deploy it to AWS, making it easily accessible, easily deployable etc. I’ll then build on this case and its variations to provide you with more detailed solutions (and help you understand them, of course). For this part AWS is just an example; the knowledge you acquire here can mostly be applied to any cloud provider, and even to a custom stack with a private server (virtual or dedicated).

Note that I have chosen NodeJS as the server platform, but this can be applied to basically any technology that runs on Linux. Also, this is not a big, robust enterprise solution - for something like that you will need to upgrade it in a couple of places. This is your startup how-to guide.

There are two fully distinct use cases here:

  • Permanent - running a NodeJS server on an AWS EC2 instance.
  • Transient - running a computation-heavy NodeJS task on an EC2 instance and then stopping the instance until the next time it is needed.

What is it actually

Server

Your regular web application: a server listening on a certain port, exposing an API, rendering HTML etc.

Task

A longer-running, computation-heavy job. It probably includes downloading some data, merging it with some stream and pushing the result to a database, or similar.

In this part, I’ll cover just code/server deployment and some important related matters. In part two I’ll show you how to use what we learned here to run tasks on EC2 (with the instance running only for the length of the task), how to schedule tasks, trigger them ad hoc etc.

Before we start

Security

  • Pay attention to where and how you store your access keys; tons of personal data and computing power get stolen because people commit their credentials to public GitHub repositories.
  • Be sure to take a closer look at the documentation whenever I point that out (in cases like Security Group and IAM Role setups, various authentications etc). Misconfiguration of permissions can lead to all kinds of data leaks and hacks (look up the Chicago voter data leak incident).
  • Make sure never to commit any of your auth data (yes, this is the same as the first point).

Common

  • When building this system, make sure to set everything up in the same AWS region.

System elements - what are we going to use

  • Docker - virtualization
  • systemd - running stuff
  • AWS EC2 - running docker with systemd
  • AWS ECR - as repository for docker images
  • AWS CloudWatch - Logging

AWS specifics

  • Security Group for EC2
  • Access key (.pem) for EC2
  • IAM access key id + key
  • IAM roles for EC2 and Lambda

Your web application

NodeJS

Let’s start with a simple web server. Here is your server.js:

var http = require('http');

console.log('Listening on 8080');
http.createServer(function (request, response) {
   response.writeHead(200, {
      'Content-Type': 'text/plain',
      'Access-Control-Allow-Origin' : '*'
   });
   response.end('Hello people!');
 }).listen(8080);
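Before containerizing anything, it is worth a quick local smoke test. Here is a sketch, assuming Node.js and curl are installed on your machine; it recreates the exact server.js above, hits it once and shuts it down:

```shell
# Recreate server.js from above and smoke-test it locally
# (assumes node and curl are available)
cat > server.js <<'EOF'
var http = require('http');

console.log('Listening on 8080');
http.createServer(function (request, response) {
   response.writeHead(200, {
      'Content-Type': 'text/plain',
      'Access-Control-Allow-Origin' : '*'
   });
   response.end('Hello people!');
}).listen(8080);
EOF

node server.js &
SERVER_PID=$!
sleep 1

RESPONSE=$(curl -s http://localhost:8080)
echo "$RESPONSE"

kill "$SERVER_PID"
```

If everything is wired up correctly, you should see the server greeting printed back at you.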

Docker

Docker is a container engine that allows us to create a dedicated Linux container for our app, define all the parameters of the system, preinstall dependencies and run it anywhere with little to no overhead. It also brings tons of infrastructure-as-code goodness and possible improvements (docker-compose, docker swarm, the ability to run on a Kubernetes cluster…) that will help you provision, scale and otherwise easily build and maintain your app’s infrastructure.

Simply put, all the above means that if you have a Linux server with Docker installed, you can run your app without any additional installs or dependencies.

Put a Dockerfile with the following content in the same folder:

FROM node:boron
WORKDIR /usr/src/app
COPY . .
CMD ["node", "server.js"]

The above code tells Docker to download a specific NodeJS-oriented Docker image (at the time of writing, boron is the NodeJS LTS version), prepare the workdir, copy whatever is in the same folder as the Dockerfile, and run node server.js when the container is started.

This approach came to be known as infrastructure as code, which clearly illustrates the ability it gives you (to write code for your infrastructure).
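One caveat with COPY . .: it copies everything in the folder into the image, including things like node_modules or the access.pem key we will later put in the repo. A .dockerignore file next to the Dockerfile keeps those out of the image; a plausible minimal example (adjust to your repo):

```
node_modules
.git
*.pem
```

Entries use the same pattern syntax as .gitignore and are applied when Docker assembles the build context.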

First results - local machine

Go on and do the following:

docker build . -t node-app
docker run -p 8080:8080 node-app

So, by doing this, we managed to create a server that is virtualized inside a Docker container… To run it anywhere, you just need Docker on the machine. This is basically the only thing you need to quickly run multiple instances of the same server.

At this point you are platform agnostic: you can run your NodeJS code on any system that supports Docker. This is, of course, not a complete solution… and not easy to automate either.

Making it available for deployment

When you are building something, a logical part of it is deploying it somewhere (your laptop does not count). What we will do is build a Docker image and then push it to a container repository for further use.

What you need to do is create a container repository, name it whatever you want (I will assume you named it node-test) and push your containerized server there as the latest version. Here are the steps to set that up on your local or build machine:

  1. The first command fetches the ECR login command and executes it immediately. AWS went for a really interesting approach to authentication here - you run a CLI command and, as a result, you get a command to execute in order to authenticate to ECR… For this to work, your system has to be authenticated with AWS already.
  2. Build the image.
  3. Tag it as <aws_account_id>.dkr.ecr.<region>.amazonaws.com/<repo_name>
  4. Push it to the repo.
aws ecr get-login --no-include-email --region <your_region> | /bin/bash
docker build -t node-test .
docker tag node-test <aws_account_id>.dkr.ecr.<region>.amazonaws.com/node-test
docker push <aws_account_id>.dkr.ecr.<region>.amazonaws.com/node-test:latest
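The four steps above can be wrapped in a small script. This is a sketch only - the region, account id and repo name are placeholders you must replace with your own values:

```shell
#!/bin/sh
# Placeholders - replace with your own values
REGION=<your_region>
ACCOUNT_ID=<aws_account_id>
REPO=node-test
ECR_URL="$ACCOUNT_ID.dkr.ecr.$REGION.amazonaws.com/$REPO"

# 1. Authenticate to ECR (your AWS CLI must already be configured)
aws ecr get-login --no-include-email --region "$REGION" | /bin/bash
# 2. Build the image
docker build -t "$REPO" .
# 3. Tag it with the full repository URL
docker tag "$REPO" "$ECR_URL:latest"
# 4. Push it
docker push "$ECR_URL:latest"
```

Note that newer versions of the AWS CLI replace aws ecr get-login with aws ecr get-login-password piped into docker login; the flow stays the same.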

Cloud setup

I intentionally left out the provisioning for this article to truly guide you through basics.

Since we are talking about getting as much potential as possible out of an infrastructure without doing too much work (especially setting up private servers and such), it is logical to go for a cloud solution.

This solution uses AWS Elastic Compute Cloud to deploy your code, plus some other AWS services for additional segments, but I strongly suggest considering other options as well.

Server (VM) setup - what does your EC2 need

Create an EC2 instance (I’ll go on and act as if you created an Ubuntu one) and download the .pem file (to get SSH access).

Log in to the EC2 instance you created using its pem file (name it access.pem and put it inside the repo - and gitignore it, of course). You also need to give the file proper permissions, about which you will be notified at some point:

chmod 400 access.pem
ssh -i access.pem ubuntu@<instance_url/ip>

Once in, you have to make sure it has two things installed - Docker and the AWS CLI:

sudo su
apt-get update
apt-get install -y docker.io awscli

Also, create ~/.aws/credentials with the following content:

[default]
aws_access_key_id = <your aws key id>
aws_secret_access_key = <your aws key>

That is the base of it - now all the dependencies are in, and we can have as many instances of this VM as we need. It is good practice to make your code platform agnostic; the VM should be as simple as possible - a shell that runs your virtualization + code. When dealing with a single machine, there is some service daemon provisioning to be done in order to achieve a high level of automation; for the sake of this guide we are going to do it manually.

Systemd - core of your vm setup

For the setup to be good enough, created VMs need to be simple shells with a minimum of dependencies, loosely coupled to the build system.

For this purpose we can create a systemd daemon that starts our script (or server) at system start, meaning that a newly provisioned or restarted server will do the same thing when starting:

  • Pull docker image from docker repository
  • Run new docker image

Additionally, for the purpose of this article, since I’m going to guide you through deploying this on AWS, it needs to authenticate with the AWS Elastic Container Registry (ECR) to be able to pull Docker images from it.

Here is how to do it:

Let’s call it node-js; this means we should create the file /etc/systemd/system/node-js.service with the following content:

[Unit]
Description=Docker NodeJS service
Requires=docker.service
After=docker.service syslog.target
[Service]
StandardOutput=syslog+console
# Change KillMode from "control-group" to "none" so that the
# docker kill/rm cleanup below works correctly.
KillMode=none
#EnvironmentFile=-/opt/svc/driver-portal/systemd.env
# Pre-start and Start
## Directives with "=-" are allowed to fail without consequence
ExecStartPre=/bin/sh -c '/usr/bin/aws ecr get-login --region <your_region> | /bin/bash'
ExecStartPre=-/usr/bin/docker kill nodejs-server
ExecStartPre=-/usr/bin/docker rm nodejs-server
ExecStartPre=/usr/bin/docker pull <your ecr repo>:latest
ExecStart=/usr/bin/docker run --log-driver=awslogs --log-opt awslogs-region=<your_region> --log-opt awslogs-group=nodejs --log-opt awslogs-stream=server \
--name nodejs-server <your ecr repo>:latest
# Stop
ExecStop=/usr/bin/docker stop nodejs-server
[Install]
WantedBy=multi-user.target

As you can see, apart from basic systemd setup there are three distinct things we are doing here:

  • Authenticating to AWS ECR - for this, the IAM Role of the EC2 instance has to allow usage of ECR.
  • Stopping and cleaning up the Docker space (getting rid of the old instance if there is any).
  • Pulling the newest version of the Docker image and running it.
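The unit file alone does nothing until systemd is told about it. Assuming the file above is in place on the VM, register and start the service like this:

```shell
# Pick up the new unit file
sudo systemctl daemon-reload
# Start the service at every boot, and start it right now
sudo systemctl enable node-js
sudo systemctl start node-js
# Verify it is up and inspect recent output
sudo systemctl status node-js
```

From here on, restarting the service (sudo systemctl restart node-js) will re-run all the ExecStartPre steps, i.e. pull and run the latest image.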

Bonus - logging

As you can see, there is one additional part I included when running the Docker image: --log-driver=awslogs --log-opt awslogs-region=<your_region> --log-opt awslogs-group=nodejs --log-opt awslogs-stream=server. You need this for logging purposes and I recommend it, but be aware that this is the first cloud-specific (in this case AWS-specific) thing that will change if you decide to go for a different cloud solution or even a different logging approach.

This is the part that ensures all the logs from the Docker instance go to AWS CloudWatch. It also means you will need additional authentication for the Docker service on the VM (yeah, the IAM role is… not enough, for reasons).

For the Docker service on the VM to have AWS auth, create /etc/systemd/system/docker.service.d/aws-credentials.conf:

[Service]
Environment="AWS_ACCESS_KEY_ID=<your_aws_access_key_id>"
Environment="AWS_SECRET_ACCESS_KEY=<your_aws_secret_access_key>"

Pay attention: this is a specific authentication - not for the Docker process/container/image itself (which runs the code that can use the AWS API through the local auth config), nor for the VM (which logs into ECR to pull the Docker image), but for the Docker service that runs all the virtualizations. This lets us, for instance, stream all logs from various Docker instances to CloudWatch. Why do we need this… I am guessing Amazon has let too many people assume the mantle of security standard creator.

After adding this, just go on and add a log group named nodejs to CloudWatch. When your instances are started, you’ll be able to find your logs there.
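If you prefer the CLI over the console, the log group can also be created like this (the region is a placeholder; use the same one as in the awslogs options above):

```shell
# Create the CloudWatch log group that the awslogs driver writes to
aws logs create-log-group --log-group-name nodejs --region <your_region>
```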

The CI/CD ability

The systemd setup described in this section is a big step towards something that every new project needs - the ability to do continuous integration/delivery.

In other words, your workflow at the start, as a one-man team (or a multi-person team without a CI server), would be:

  • Finish a feature
  • Test locally on Docker
  • Build the Docker image and push it to ECR
  • SSH into the remote machine and restart the service that runs your stuff.

Since this is not ideal, you’ll probably want a solution using some CI. I’ll only describe the steps that deal with code delivery:

  • Code passes pre-build checks (linters, unit tests etc)
  • The Docker image is built
  • The system pushes the built image to a container repository (ECR in our case, but you can go with anything)
  • By executing a remote SSH command to restart the service on the server that runs your code, CI triggers systemd, which downloads the Docker image, removes any previous version that might be there and runs the new image.
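The last step above boils down to a single remote command, using the same pem file from the EC2 setup (the host is a placeholder, as before):

```shell
# Restart the systemd service remotely; on start it pulls and runs the latest image
ssh -i access.pem ubuntu@<instance_url/ip> 'sudo systemctl restart node-js'
```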

Further on

There it is - CI ability using a single EC2 instance, Docker, some Linux magic and ECR. There are a few logical future paths to explore to improve this:

  • Use a CI tool - Jenkins or GoCD should integrate easily into this flow, building Docker images, pushing them to ECR and, for deployment, just remotely restarting systemd on EC2

  • Look into scalability - in some cases this would mean replacing systemd, depending on what you use (docker swarm, kubernetes, mesos…)

  • Use a provisioning tool - Ansible, Puppet, Chef…

  • SSL - you need an elastic load balancer in front of your EC2 instance

I may write about some of this stuff in the future, but next stop - running and automating tasks.