This table of contents is under construction. It will be updated to reflect the project's progress.
- Presentation
- Preparing to deploy
- Deployment
- Project topology
- Application environment
- Terraformation
- Dockerizing
- Ansible in action
- Database running
- Database replication
- The application server
- The service host
- Load balancing with HAProxy
- Database failover
- Region failover
- Running tests
- 👉 Monitoring with Zabbix and Grafana
- CI/CD with Jenkins
- Orchestration with Kubernetes
- Personnel considerations
- References
This project consists of a high availability cluster running on two AWS Regions. I have chosen to use non-cloud-native tools in order to reduce coupling between the project and the cloud provider.
The infrastructure was provisioned with Terraform 1.1.5, whereas the software layer was deployed using Ansible v. 2.12.1.
The Web application was built into a Docker image with Docker v. 20.10.12 and is available on Docker Hub, from where the playbook pulls it.
Moreover, this project is divided into three subprojects:
- Application project: includes the Web application code and Dockerfile
- Terraform project: creates the infrastructure resources
- Ansible project: handles the software layer deployment
The provisioned infrastructure comprises:
- An EC2 service host with HAProxy load balancer
- An EC2 application host running 3 instances of the Web application (Django/Python v. 3.2.5)
- An EC2 database host running the main PostgreSQL v. 13.5 database
- A second service host with HAProxy load balancer
- Another application host running 3 more instances of the Web application
- A database host running the Standby database
The cluster takes advantage of the low latency offered by a VPC Peering connection between the two Regions.
It not only enables database replication but also scales the application across six Docker containers (three in each Region).
After cloning this project to your local machine, you will need to take two steps in order to deploy it:
- Create the "inventories" directory inside the "ansible" directory
- Create the "variables.auto.tfvars" file in the ./terraform/aws/ directory, and set values for the project variables
At this point, I assume that you already have Terraform and Ansible installed and configured on your machine.
variables.auto.tfvars
terraform_access_key = "..." # insert here your access key for terraform
terraform_secret_key = "..." # insert here your secret key for terraform
application_ports = [22, 8001, 8002, 8003]
database_ports = [22, 5432]
service_ports = [22, 81, 8000]
ansible_inventories = "../../ansible/inventories"
ssh_public_key = "..." # insert here the ssh public key for remote hosts' admin user
appserver_secret_key = "django-..." # insert here the django server secret key
dbport = 5432
dbname = "revolutdb" # set a name for the application database as you wish
dbuser = "dbuser" # set a name for the application's user
dbpass = "..." # set a password for the application's user
dbappname = "Birthday Application"
haproxy_conf = "../../ansible/roles/haproxy/files"
Make sure you have created the "variables.auto.tfvars" file as described above before you run the deployment commands.
Use the following Python scripts to ease the deployment.
To deploy the application and get it running:
❯ python3 run_deploy.py
To perform the failover:
❯ python3 run_failover.py
To destroy the environment:
❯ python3 destroy_all.py
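The helper scripts are essentially thin wrappers around the Terraform and Ansible commands described in the topics below. The following is only a sketch of what run_deploy.py could look like under that assumption; the actual script in the repository may differ.
# run_deploy.py -- illustrative sketch only; the real script may differ.
# Assumes Terraform and Ansible are on the PATH and that
# variables.auto.tfvars has already been created.
import subprocess

def run(command, cwd):
    """Run a shell command in the given directory and stop on failure."""
    print(f"==> {command} (in {cwd})")
    subprocess.run(command, cwd=cwd, shell=True, check=True)

# Provision the infrastructure.
run("terraform init", "terraform/aws")
run('terraform plan -out "hometask_plan"', "terraform/aws")
run('terraform apply "hometask_plan"', "terraform/aws")

# Deploy the software layer.
run("ansible-playbook -i inventories --forks 1 deploy.yml", "ansible")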
To run all processes manually, first create the "inventories" directory inside the "ansible" directory.
❯ cd ansible
❯ mkdir inventories
❯ cd ..
Into the "terraform/aws" directory, initialize Terraform, create the project plan, and apply it to provision the infrastructure.
❯ cd terraform/aws
❯ terraform init
❯ terraform plan -out "hometask_plan"
❯ terraform apply "hometask_plan"
❯ cd ../../
To install the entire software layer, including the database cluster, the Web application cluster, and the load balancer cluster, go to the ansible directory and run the "deploy.yml" playbook:
❯ cd ansible
❯ ansible-playbook -i inventories deploy.yml
❯ cd ../
To destroy the infrastructure you have created, go to the Terraform directory.
Destroying the environment
❯ cd terraform/aws
❯ terraform destroy
❯ cd ..
When asked if you really want to destroy all resources, just type "yes" and press return to proceed.
The project follows this topology. Items dimmed in gray are yet to be implemented.
The .env file currently present in the application directory exists just to build the Docker image. It will be replaced during the deployment process.
After provisioning the infrastructure, Terraform will create, among other files, the "site1.env" and "site2.env" files, which will later be copied to the application host.
Ansible will then replace the .env file with the content of "site1.env" in all application containers.
The "site2.env" file will be kept to be used in case Site 1 becomes unavailable.
All the project's "hardware-representing" components are created with Terraform.
Some configuration parameters - security group ingress ports, for instance - are defined in the "variables.auto.tfvars" file.
Other parameters are set in configuration files used by the Ansible playbooks. The way those files are filled or created purposely differs from one to another.
The application image was built and pushed to my Docker Hub repository.
The application's Dockerfile
FROM python:3
LABEL maintainer="Valerio Oliveira <https://github.com/valerio-oliveira>"
LABEL build_date="2022-01-29"
EXPOSE 8000
WORKDIR /usr/src/app
COPY . .
RUN pip3 install --no-cache-dir -r requirements.txt
RUN python3 manage.py makemigrations
RUN python3 manage.py migrate
CMD [ "python3", "manage.py", "runserver", "0.0.0.0:8000" ]
Building the application Docker image locally
❯ docker build -t valerionet/haproxyht:latest .
Deploying to Docker hub
❯ docker push valerionet/haproxyht:latest
As most of the work is performed by the Ansible playbooks, details for every role are not written yet. They will be added here little by little in the near future.
To start deploying the application, run the following command in the ./ansible directory:
❯ ansible-playbook -i inventories --forks 1 deploy.yml
* The "--forks 1" directive will only be needed if in your local machine Ansible is configured to ask to confirm "yes" at first ssh access to remote servers.
Validating PostgreSQL instalation and the database creation after deployment.
❯ ssh -i ./REVOLUT/exam_01/PEM/aws admin@x.x.x.x
admin@site1-db-x:~$ sudo su - postgres
postgres@site1-db-x:~$ psql -d revolutdb -c "select * from base.users;"
username | birthday
----------+----------
(0 rows)
After the deployment, you may validate that the database cluster is working by running the following commands:
On the main host:
❯ ssh -i ./REVOLUT/exam_01/PEM/aws admin@x.x.x.x
admin@site1-db-x:~$ sudo su - postgres
postgres@site1-db-x:~$ psql -d revolutdb -c "select * from pg_stat_replication;"
On the standby:
❯ ssh -i ./REVOLUT/exam_01/PEM/aws admin@x.x.x.x
admin@site1-db-x:~$ sudo su - postgres
postgres@site1-db-x:~$ psql -d revolutdb -c "select \* from pg_stat_wal_receiver;"
The application server is designed to host the Web application cluster instances. In addition, a Zabbix agent is active, collecting usage statistics.
In case the main database becomes unavailable for any reason, the DataOps team will run the database-failover playbook.
The failover process consists of two steps:
- promoting the standby to main database; and
- redirecting all application requests to the new main database server.
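As an illustration of what the first step automates (this is not the project's playbook), a PostgreSQL 13 standby can be promoted with pg_promote(). The sketch below assumes a psycopg2 connection to the standby with hypothetical connection details and sufficient privileges.
# Illustrative sketch of the promotion step, not the actual playbook.
# Assumes psycopg2 is installed and the standby accepts connections.
import psycopg2

conn = psycopg2.connect(
    host="standby-db-host",   # hypothetical address of the standby database
    port=5432,
    dbname="revolutdb",
    user="dbuser",
    password="...",
)
conn.autocommit = True
with conn.cursor() as cur:
    # pg_promote() asks the standby to leave recovery and become the primary.
    cur.execute("SELECT pg_promote(wait := true);")
    print("standby promoted:", cur.fetchone()[0])
conn.close()

# The second step -- pointing the application at the new primary -- is done
# by replacing the containers' .env file with the "site2.env" content.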
In case the entire main Region becomes unavailable, the DataOps team shall run the same database-failover playbook as before.
In addition, the Region failover will require DNS redirection.
About DNS
As DNS management itself is not within the scope of this project, it is important to mention that, in case a Region goes down, redirecting the DNS to the second load balancer is part of the failover process.
The main resource on the service host is the HAProxy load balancer. All requests to the application cluster are made through it.
In this project there are six application instances, three in each Region. It is possible to monitor the application cluster's health by using the HAProxy statistics report, which is available at "http://ServiceHostAddress:81/stats".
As a way to validate the cluster's efficiency, I have written a simple test application running on multiple threads, each one sending a certain number of requests to the Web application.
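As a usage example, and assuming the stats page is exposed without authentication, the same report can be scraped programmatically; HAProxy returns a CSV version of the data when ";csv" is appended to the stats URI.
# Quick health check against the HAProxy statistics endpoint.
# Replace ServiceHostAddress with the service host's address.
import csv
import io
import urllib.request

url = "http://ServiceHostAddress:81/stats;csv"
with urllib.request.urlopen(url, timeout=5) as response:
    body = response.read().decode("utf-8")

# The first CSV line starts with "# pxname,svname,..."; strip the "# ".
reader = csv.DictReader(io.StringIO(body.lstrip("# ")))
for row in reader:
    print(row["pxname"], row["svname"], row["status"])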
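The test application itself is not reproduced here; the following is a minimal sketch of the idea, assuming the application answers through the load balancer on port 8000 (the hostname and request count are placeholders).
# Simple multi-threaded load test sketch -- not the exact test application.
# Each thread sends a number of GET requests to the Web application through
# the load balancer; success and failure counts are printed at the end.
import threading
import urllib.request

URL = "http://ServiceHostAddress:8000/"   # placeholder address
THREADS = 50                              # e.g. 4, 50, or 150 as in the tests
REQUESTS_PER_THREAD = 100

lock = threading.Lock()
results = {"ok": 0, "failed": 0}

def worker():
    for _ in range(REQUESTS_PER_THREAD):
        try:
            with urllib.request.urlopen(URL, timeout=10) as response:
                response.read()
            key = "ok"
        except Exception:
            key = "failed"
        with lock:
            results[key] += 1

threads = [threading.Thread(target=worker) for _ in range(THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)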
HAProxy load average test with 4 threads:
HAProxy load average test with 50 threads:
HAProxy load average test with 150 threads:
After scaling tests to 150 threads, an issue was detected. As each thread sends hundreds of requests, at some point some of those requests started to get connection timeouts.
At first I thought I would have to improve queue control for the application. I figured out later that the bottleneck was the 1-core CPU on my Free Tier AWS instance.
Load bottleneck on database CPU:
As monitoring is one of a database administrator's main responsibilities, I'm currently working on Zabbix and Grafana installation playbooks.
A next step will be creating a Jenkins pipeline to deploy new versions of the application image.
Since this approach will demand a webhook on the server side, I've configured a local GitLab service to which the application's source code is committed.
On the GitLab repository, a webhook will be set to trigger the pipeline, which will build the new image, upload it to Docker Hub, and trigger the Ansible playbook. The Ansible playbook, in its turn, will update the application containers.
Another next step in the near future on my learning path will be implementing container orchestration using Kubernetes.
This project is a landmark in my career as a Software Developer and Database Administrator, since it helped me expand my competences as a DevOps practitioner. It filled the gaps I had in understanding the full development life cycle. Putting the acquired knowledge into practice is part and parcel of validating it, and I strongly recommend anyone who wants to master a tech role to create their own Tech-Trail just as I did.
These are just a few of the many references I made use of:
Ansible official docs for PostgreSQL
Ansible playbook for PostgreSQL
Docker container pull creation
Install Docker on Debian with Ansible
Zabbix Docs