The Components of Highly Available (HA) Architecture in AWS

Tech Writers
9 min read · Jul 15, 2021

What is a highly available architecture?

As businesses grow, continuous and uninterrupted operation becomes critical. Your application needs to cater to varying traffic loads. But what happens if your system breaks down and your applications suddenly become inaccessible to your clients? Your users face unexpected outages, and you lose revenue for the entire downtime period. A highly available architecture is the best way to avoid such sudden application downtime and ensure the application is always available without impacting your customers.

Amazon Web Services (AWS) provides the essential components and services to build a highly available and fault-tolerant architecture when designing a solution for your cloud application. Let’s get to know those components in this article.

Elastic Load Balancing

Elastic Load Balancing is an essential component of an HA architecture. It automatically balances incoming traffic to your application by distributing it across multiple AWS targets such as EC2 instances, IP addresses, microservices, containers, and Lambda functions. Thus, the capacity of the infrastructure can scale automatically with the incoming traffic. In addition, you can secure the load balancers using security groups.

AWS provides four types of load balancers, operating at different network layers, to balance the traffic load in a single or across multiple Availability Zones.

Types of Load Balancers

Application Load Balancers (ALB)

This type of load balancer is most suitable for balancing HTTP and HTTPS traffic; it operates at layer seven, the application layer. You can configure it as internet-facing (your VPC must have an internet gateway for this) or as an internal load balancer without a public IP address. The following are the target types you can register with an ALB.

  • Instance

Facilitates load balancing to instances inside a VPC

  • IP

Enables load balancing to applications addressed by IP address — for example, a specific network interface of an instance, or applications in on-premises locations.

  • Lambda function

Lambda functions can serve HTTP requests through an ALB to access serverless applications from HTTP clients.
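As a sketch of how these pieces fit together, the following CloudFormation fragment defines an internet-facing ALB, an instance-type target group, and an HTTP listener. All names and IDs here (MyALB, the subnet, security group, and VPC IDs) are placeholders, not values from this article.

```yaml
Resources:
  MyALB:
    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
    Properties:
      Type: application
      Scheme: internet-facing          # requires an internet gateway in the VPC
      Subnets:
        - subnet-aaaa1111              # placeholder subnet IDs in two AZs
        - subnet-bbbb2222
      SecurityGroups:
        - sg-0123456789abcdef0         # placeholder security group
  MyTargetGroup:
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      VpcId: vpc-0123456789abcdef0     # placeholder VPC ID
      Protocol: HTTP
      Port: 80
      TargetType: instance             # could also be "ip" or "lambda"
  MyHttpListener:
    Type: AWS::ElasticLoadBalancingV2::Listener
    Properties:
      LoadBalancerArn: !Ref MyALB
      Protocol: HTTP
      Port: 80
      DefaultActions:
        - Type: forward
          TargetGroupArn: !Ref MyTargetGroup
```

Changing `TargetType` is how you choose between the three target types described above.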

Network Load Balancers (NLB)

NLBs are suitable for balancing TCP and UDP traffic. They operate at layer four, the transport layer, and can handle millions of requests per second with very low latencies. Using NLBs, you can route connections to the following targets.

  • Instances
  • Microservices
  • Containers

AWS services like Route 53, Auto Scaling, CloudFormation, Elastic Beanstalk, CloudWatch, CloudTrail, and CodeDeploy integrate with NLB. One of the prominent features of an NLB is its support for DNS failover: if none of the targets registered with the NLB are healthy, or the NLB nodes in an Availability Zone (AZ) are unhealthy, Amazon Route 53 can route traffic to load balancer nodes in another AZ.
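A minimal NLB definition differs from the ALB case mainly in the load balancer type and the listener protocol. This is a hedged sketch with placeholder IDs; `MyTcpTargetGroup` is a hypothetical TCP target group not defined here.

```yaml
Resources:
  MyNLB:
    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
    Properties:
      Type: network                   # NLB instead of "application"
      Scheme: internet-facing
      Subnets:
        - subnet-aaaa1111             # placeholder subnet IDs in two AZs
        - subnet-bbbb2222
  MyTcpListener:
    Type: AWS::ElasticLoadBalancingV2::Listener
    Properties:
      LoadBalancerArn: !Ref MyNLB
      Protocol: TCP                   # layer-four listener (TCP/UDP)
      Port: 443
      DefaultActions:
        - Type: forward
          TargetGroupArn: !Ref MyTcpTargetGroup   # hypothetical target group
```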

Gateway Load Balancers (GLB)

Customers use third-party virtual appliances like firewalls, intrusion detection systems, DDoS prevention systems, analytics tools, etc., in the cloud. A GLB allows them to deploy such appliances, distribute traffic across them, and scale them according to demand. Gateway Load Balancers use Gateway Load Balancer endpoints to exchange traffic with these services securely. Note that GLBs only support target groups that use the GENEVE protocol.

Classic Load Balancers

If you have applications built on the EC2-Classic network, you can use this load balancer.

It provides load balancing for layer seven traffic, with features like X-Forwarded headers and sticky sessions, as well as for layer four traffic. Additionally, it supports both IPv4 and IPv6 in EC2-Classic networks.

Valuable Features of Load Balancers

Sticky Sessions

Sticky sessions allow binding a specific user’s session to one particular target, such as a single EC2 instance. During that user’s session, all of their requests are routed to that instance only. You also decide how long the load balancer should keep routing the session’s traffic to the same target (the cookie duration).
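For an ALB target group, stickiness is controlled through target group attributes. The fragment below is a sketch with a placeholder VPC ID; the one-hour duration is an arbitrary example value.

```yaml
StickyTargetGroup:
  Type: AWS::ElasticLoadBalancingV2::TargetGroup
  Properties:
    VpcId: vpc-0123456789abcdef0      # placeholder VPC ID
    Protocol: HTTP
    Port: 80
    TargetGroupAttributes:
      - Key: stickiness.enabled
        Value: "true"
      - Key: stickiness.type
        Value: lb_cookie              # duration-based cookie generated by the ALB
      - Key: stickiness.lb_cookie.duration_seconds
        Value: "3600"                 # keep the session on the same target for 1 hour
```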

Cross-Zone Load Balancing

You can deploy applications across different Availability Zones (AZs). Cross-zone load balancing lets a load balancer distribute traffic across targets in all of those AZs. It is always enabled on Application Load Balancers; for the other load balancer types, you can enable it after creating the load balancer. If it is disabled, each load balancer node distributes traffic only to the targets registered in its own AZ.

Let’s take the following example: there are load balancer nodes in two AZs. AZ one has four instances, and AZ two has only two instances registered with the respective nodes.

When clients send requests, Amazon Route 53 distributes 50% of the traffic to each load balancer node. Each node then distributes its traffic equally among its own instances, as shown in the following diagram.

If cross-zone load balancing is enabled, each load balancer node routes traffic to all registered instances across both AZs. Therefore, each of the six targets receives 16.66% of the traffic.

So, if one of the load balancers fails or becomes unhealthy, its instances can still get traffic from the load balancer in the other AZ.
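The arithmetic in this example can be sketched as a small function. It assumes, as above, that Route 53 splits traffic evenly between the AZs and that each node splits its share evenly among its targets.

```python
def per_instance_share(instances_per_az, cross_zone):
    """Return the percentage of total traffic each instance receives,
    as a dict mapping AZ index -> share per instance in that AZ."""
    num_azs = len(instances_per_az)
    node_share = 100.0 / num_azs  # Route 53 splits evenly across AZ nodes
    if cross_zone:
        # Every node distributes across all targets in all AZs,
        # so every instance ends up with an equal share.
        share = 100.0 / sum(instances_per_az)
        return {az: share for az in range(num_azs)}
    # Each node only serves the targets registered in its own AZ.
    return {az: node_share / n for az, n in enumerate(instances_per_az)}

# AZ one has four instances, AZ two has two.
print(per_instance_share([4, 2], cross_zone=False))  # {0: 12.5, 1: 25.0}
print(per_instance_share([4, 2], cross_zone=True))   # each instance ~16.66%
```

Without cross-zone balancing, the two instances in AZ two are each twice as loaded as the four in AZ one; enabling it evens the load out.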

Path-Based Routing

Application Load Balancers also support path-based routing: listener rules can forward requests to different target groups based on the URL path, for example routing /api/* to one service and /images/* to another.

Auto Scaling

Suppose your application servers crash and suddenly become unavailable. You need a plan to replace those instances and spin up new ones running the application. Also, if the demand for your application suddenly increases, increasing your resource capacity is the ideal solution. AWS Auto Scaling lets you adjust your resources automatically based on need, constantly maintaining the performance of your applications at a lower cost. Based on your defined scaling plan, you can scale different resources like EC2 instances, DynamoDB tables, and Aurora replicas.

Auto Scaling Launch Configuration

To devise a scaling plan, you define the configuration in a launch configuration. For example, if you are scaling EC2 instances, you can specify the following parameters for the instances you want to spin up when scaling.

  • The ID of the Amazon Machine Image (AMI)
  • The instance type (e.g., t2.micro, m1.large)
  • The EC2 key pair
  • Security groups
  • EBS block device mappings

Auto Scaling groups use the settings defined in the launch configuration (LC) to execute the scaling. One Auto Scaling group can have only one launch configuration, but one launch configuration can be used by multiple Auto Scaling groups. Once you have created an LC, you cannot modify it; to change it, you must create a new LC and attach it to the Auto Scaling group. Amazon now advises avoiding launch configurations and using launch templates instead.

If you are scaling EC2 instances, an LC is generated automatically once you create the Auto Scaling group.

Auto Scaling Launch Template

A launch template is similar to a launch configuration: you specify the same configuration information for the resources you want to scale. However, unlike launch configurations, launch templates support versioning. For example, you can create a default launch template containing the standard settings and then create another version that applies configurations specific to a particular use case.
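The request body for creating a launch template might look like the fragment below (the field names follow the EC2 CreateLaunchTemplate API; the template name, AMI, key pair, and security group IDs are placeholders).

```json
{
  "LaunchTemplateName": "web-server-template",
  "LaunchTemplateData": {
    "ImageId": "ami-0123456789abcdef0",
    "InstanceType": "t2.micro",
    "KeyName": "my-key-pair",
    "SecurityGroupIds": ["sg-0123456789abcdef0"],
    "BlockDeviceMappings": [
      {
        "DeviceName": "/dev/xvda",
        "Ebs": { "VolumeSize": 8, "VolumeType": "gp3" }
      }
    ]
  }
}
```

Creating a new version later reuses the same `LaunchTemplateName` with a modified `LaunchTemplateData`.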

Auto Scaling Group

An Auto Scaling group defines the desired, maximum, and minimum number of resources you want to scale. The group makes sure you always have the desired capacity: it carries out periodic health checks on the instances, and if it finds any instance unhealthy, it terminates that instance and launches a replacement to restore the desired capacity.
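In CloudFormation, such a group could be sketched as follows. The launch template reference and subnet IDs are hypothetical placeholders.

```yaml
WebServerGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  Properties:
    MinSize: "2"
    MaxSize: "6"
    DesiredCapacity: "2"
    HealthCheckType: ELB            # use load balancer health checks, not only EC2 status
    HealthCheckGracePeriod: 300     # seconds to wait before the first health check
    LaunchTemplate:
      LaunchTemplateId: !Ref WebLaunchTemplate              # hypothetical launch template
      Version: !GetAtt WebLaunchTemplate.LatestVersionNumber
    VPCZoneIdentifier:
      - subnet-aaaa1111             # placeholder subnets in two AZs
      - subnet-bbbb2222
```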

A scaling policy lets you auto-scale dynamically based on particular conditions. It adjusts the desired capacity between the minimum and maximum capacities by launching and terminating instances. You have several scaling options to best suit your requirements.

Scaling Options

Maintain a Fixed Number of Instances at All Times

You can configure the Auto Scaling group to make sure you always have the desired capacity. It carries out periodic health checks on the instances; if it finds any instance unhealthy, it terminates it and automatically scales back to the desired capacity.

Manual Scaling

With manual scaling, you specify only the change to your Auto Scaling group’s maximum, minimum, or desired capacity. Amazon EC2 Auto Scaling then manages the process of creating or terminating instances to maintain the updated capacity.

Scheduled Scaling

You can use scheduled scaling to increase or decrease the capacity at a specified date and time, which is useful when load changes are predictable.

On-demand Scaling

You can scale resources based on demand. For example, let’s say you want to keep the CPU utilization of your EC2 instances at 50%: you can scale out whenever utilization exceeds that target because of increased demand. This is particularly useful when demand changes frequently.
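The CPU-utilization example above corresponds to a target tracking scaling policy. The JSON below is the kind of configuration you would pass to `aws autoscaling put-scaling-policy` with `--policy-type TargetTrackingScaling`; the 50.0 target is the example value from this section.

```json
{
  "TargetTrackingConfiguration": {
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ASGAverageCPUUtilization"
    },
    "TargetValue": 50.0
  }
}
```

Auto Scaling then adds or removes instances to keep the group’s average CPU utilization near 50%.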

Predictive scaling

Predictive scaling uses machine learning to forecast traffic patterns and provision capacity ahead of expected load. Combine predictive scaling with dynamic scaling to scale your Amazon EC2 capacity faster.

CloudFormation

Creating and managing AWS resources manually can be a very hectic task. Moreover, you may sometimes have to adjust and replicate an entire stack of resources. CloudFormation lifts that burden by provisioning and managing your infrastructure according to the architecture you define.

In CloudFormation, you model the collection of resources for your infrastructure as a single stack, described in a JSON- or YAML-formatted template. You can then deploy the entire stack with a single call. CloudFormation makes it easier to focus on application development by simplifying infrastructure provisioning and management.

You can write a template describing your stack or create a template from an existing stack. A stack can include multiple EC2 instances with security groups, load balancers, Auto Scaling groups, RDS instances with security groups, Elastic IP addresses, SNS, SQS, etc. For example, the sample LAMP stack template includes an Amazon RDS instance as the back-end store.

You can browse sample templates given by AWS.
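To give a feel for the template format, here is a minimal sketch: one EC2 instance behind a security group that allows HTTP in. The AMI ID is a placeholder, and a real template would usually parameterize it.

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Description: Minimal sketch - one EC2 instance behind a security group.
Resources:
  WebSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Allow inbound HTTP
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          CidrIp: 0.0.0.0/0
  WebServer:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: ami-0123456789abcdef0   # placeholder AMI ID
      InstanceType: t2.micro
      SecurityGroups:
        - !Ref WebSecurityGroup
Outputs:
  PublicIp:
    Value: !GetAtt WebServer.PublicIp
```

Deploying this template as a stack creates both resources together; deleting the stack removes them together.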

Elastic Beanstalk (EB)

Elastic Beanstalk (EB) is another AWS service that handles the deployment and management of the cloud infrastructure required for an application on your behalf. It automatically performs the actions required to deploy the application, such as provisioning the required capacity, load balancing, auto scaling, and health monitoring. All you have to do is write the code and upload it to Elastic Beanstalk.

Elastic Beanstalk supports applications written in Go, .NET, Java, Python, PHP, Ruby, and Node.js, as well as Docker containers and platforms like GlassFish and Tomcat. There are sample applications for each platform that you can deploy to create a test environment. Once you have selected the application type you want to deploy, EB creates the environment and deploys the application by provisioning the necessary infrastructure.

For example, let’s create and deploy a sample application written in Node.js.

  1. Specify a name for the application and select the platform, platform branch, and platform version for Node.js.
  2. Choose ‘Sample application’ as the application code.
  3. Optionally, define application tags.
  4. Either deploy with the default settings for a load-balanced test environment, or edit and configure the default application infrastructure.
  5. Select ‘Create application.’

EB automatically starts creating the infrastructure for the test environment and deploys the application to it. EB creates the resources according to a CloudFormation stack specified in a JSON template, and it stores your application code, logs, and artifacts in an S3 bucket. This takes a few minutes.

Once the deployment is complete, you are redirected to the environment’s dashboard, where you can see all the infrastructure settings for your application.

This is how your application looks in the test environment.

High availability with Bastions

Bastion hosts can be used to securely access Linux instances deployed in private or public subnets. To ensure higher availability, you can run two bastion hosts in two Availability Zones behind a network load balancer, with an Auto Scaling group that performs health checks. If a bastion host fails, the Auto Scaling group launches a new instance in that Availability Zone.

High availability with Multiple Availability Zones (AZ)

Multi-AZ deployments let your applications automatically fail over between AZs without impacting your applications and databases. For example, Amazon provides Multi-AZ deployments for databases such as Oracle, MySQL, PostgreSQL, and MariaDB. In a failure, the failover technology automatically switches to a standby database replica in another AZ.
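For RDS, enabling Multi-AZ is a single property on the database resource. A hedged sketch (instance class, storage size, and the Secrets Manager reference are placeholder choices, not requirements):

```yaml
Database:
  Type: AWS::RDS::DBInstance
  Properties:
    Engine: mysql
    DBInstanceClass: db.t3.micro
    AllocatedStorage: "20"
    MultiAZ: true                  # provisions a synchronous standby replica in another AZ
    MasterUsername: admin          # placeholder credentials
    MasterUserPassword: "{{resolve:secretsmanager:MyDbSecret}}"  # hypothetical secret name
```

With `MultiAZ: true`, RDS handles replication to the standby and performs the failover automatically.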
