[[TOC]]
Regional and Global AWS Architecture
- Globally, DNS is used for service discovery, region-based health checks, and request routing.
- Content Delivery Networks (CDNs) cache content globally, as close to end users as possible, to improve performance.
- Customers usually enter your region at the web tier.
- The compute tier supports the web tier.
- The compute tier consumes storage services.
- Databases are accessed from the compute tier via a caching tier.
- App services add additional functionality to all of the above.
Evolution of the Elastic Load Balancer
3 Types of ELBs
- v1 Classic load balancers - old (avoid these)
- v2 - Application Load Balancer and Network Load Balancers
Use ALB when you are using HTTP, HTTPS, or WebSocket.
Use NLB when you are using TCP, TLS, or UDP.
Elastic Load Balancer Architecture
Load balancers accept connections from the user and distribute them across registered backend targets.
- Each ELB is configured with an A record DNS name. This resolves to the ELB nodes, one per Availability Zone the load balancer is deployed into.
Internet facing or internal facing
- internet facing has public IPs
- internal facing has only private IPs.
- Internet facing can access public and private EC2 instances.
An ELB needs 8 or more free IP addresses in each subnet it uses, so /27 is the usual minimum subnet size for an ELB.
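The subnet sizing above can be checked with a little arithmetic. This is a minimal sketch, assuming AWS reserves 5 addresses in every VPC subnet and using the 8-free-IP figure from the note; the function names and example CIDR ranges are illustrative only.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # assumption: 5 addresses per VPC subnet are reserved by AWS
ELB_MIN_FREE_IPS = 8         # figure taken from the note above


def usable_ips(cidr: str) -> int:
    """Addresses left in a subnet once the AWS-reserved ones are removed."""
    return ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_PER_SUBNET


def fits_elb(cidr: str, already_used: int = 0) -> bool:
    """True if the subnet still has enough free addresses for an ELB."""
    return usable_ips(cidr) - already_used >= ELB_MIN_FREE_IPS


for cidr in ("10.0.0.0/28", "10.0.0.0/27", "10.0.0.0/24"):
    print(cidr, usable_ips(cidr), fits_elb(cidr, already_used=4))
```

A /28 leaves 11 usable addresses, which is enough on paper but leaves almost no headroom once a few instances are running, which is why /27 is the usual floor.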
You can somewhat loosely couple your architecture by putting ELBs in front of Auto Scaling groups.
Cross Zone Load Balancing
This solves an uneven distribution problem. Suppose there are 4 EC2 instances in one AZ and 1 EC2 instance in another. Without cross-zone load balancing, traffic is split 50/50 between the two AZs, so each of the 4 instances receives 12.5% of the load while the lone instance receives 50%. Cross Zone Load Balancing allows all EC2 instances that are registered with the ELB to receive equal traffic, regardless of the AZ they are in.
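The 4-versus-1 example can be worked through in code. This is a rough sketch, assuming traffic is split evenly between the AZ nodes and that every AZ has at least one instance; `traffic_share` and the AZ labels are made up for illustration.

```python
def traffic_share(instances_per_az: dict, cross_zone: bool) -> dict:
    """Return the percentage of total traffic received by EACH instance in an AZ."""
    total_instances = sum(instances_per_az.values())
    az_count = len(instances_per_az)
    shares = {}
    for az, count in instances_per_az.items():
        if cross_zone:
            # Every registered instance gets an equal slice of all traffic.
            shares[az] = 100 / total_instances
        else:
            # Each AZ node gets an equal slice, then splits it locally.
            shares[az] = 100 / az_count / count
    return shares


layout = {"az-a": 4, "az-b": 1}
print(traffic_share(layout, cross_zone=False))
print(traffic_share(layout, cross_zone=True))
```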
Application Load Balancer vs Network Load Balancer
Application Load balancer
Layer 7 load balancer - HTTP or HTTPS only.
- can understand Layer 7 content - cookies, headers, user location, app behavior
- SSL/TLS is always terminated on the ALB, so there is no unbroken end-to-end SSL
- ALBs must have SSL certs if HTTPS is used
- ALBs are slower than NLBs.
- health checks can evaluate app health at layer 7.
Rules direct connections that arrive at a listener
- processed in priority order
- default rule = catchall
- Rule conditions: host-header, http-header, http-request-method, path-pattern, query-string, and source-ip
- Actions: forward, redirect, fixed-response, authenticate-oidc, and authenticate-cognito
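Priority-ordered rule evaluation with a catchall default can be sketched as follows. This is a simplified model, not the ALB implementation: it covers only host-header and path-pattern conditions, uses shell-style wildcards via `fnmatchcase`, and the `Rule` class, action strings, and host names are all hypothetical.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional

DEFAULT_ACTION = "forward:web-default"  # hypothetical catchall target


@dataclass
class Rule:
    priority: int
    action: str
    host_header: Optional[str] = None
    path_pattern: Optional[str] = None

    def matches(self, host: str, path: str) -> bool:
        """A rule matches only if every condition it sets is satisfied."""
        if self.host_header is not None and not fnmatchcase(host.lower(), self.host_header.lower()):
            return False
        if self.path_pattern is not None and not fnmatchcase(path, self.path_pattern):
            return False
        return True


def route(rules: list, host: str, path: str) -> str:
    """Evaluate rules lowest priority number first; fall back to the default rule."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.matches(host, path):
            return rule.action
    return DEFAULT_ACTION


rules = [
    Rule(priority=20, action="forward:api", path_pattern="/api/*"),
    Rule(priority=10, action="redirect:https://example.com/", host_header="old.example.com"),
]
print(route(rules, "www.example.com", "/api/users"))
```

Because evaluation stops at the first match, a request that satisfies both rules takes the action of the one with the lower priority number.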
Network Load Balancer
Layer 4 load balancer - TCP, TLS, UDP, TCP_UDP
- cannot interpret headers, cookies or session stickiness
- faster than ALBs
- no understanding of HTTP or HTTPS
- Health checks are not app aware by default; they do simple TCP handshakes
- NLBs can have static IPs so you can whitelist them
- They also forward TCP to instances, giving unbroken encryption.
- Used with PrivateLink to provide services to other VPCs
Launch Configuration and Templates
These allow you to configure the EC2 instances in advance
- AMI, instance type, storage, key pair
- networking and security groups
- Userdata and IAM roles
Launch templates can be versioned and have newer features such as placement groups, capacity reservations, Elastic Graphics.
Autoscaling groups use these launch configurations to create EC2 instances. They can also use launch templates, and a launch template can also be used on its own to launch EC2 instances directly.
Autoscaling groups
Autoscaling groups are responsible for automatic scaling and self healing for EC2. They use launch templates or configurations to deploy EC2 instances.
- Three capacity settings: minimum, desired, and maximum.
- The ASG keeps the number of running instances at the desired capacity by provisioning or terminating instances.
- Scaling policies automate changes to the desired capacity based on metrics.
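The way the three capacity values interact can be sketched as a small reconciliation function. This is an illustration only: clamping an out-of-range desired value is a simplification made here, and `reconcile` is a made-up name.

```python
def reconcile(running: int, minimum: int, desired: int, maximum: int) -> int:
    """Return how many instances to launch (positive) or terminate (negative)."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    # Simplification: keep the desired capacity inside the min..max bounds.
    target = max(minimum, min(desired, maximum))
    return target - running


print(reconcile(running=2, minimum=1, desired=4, maximum=6))
print(reconcile(running=5, minimum=1, desired=3, maximum=6))
```

Self healing falls out of the same loop: when an instance fails, `running` drops below the target and the difference is launched again.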
Policies
Manual scaling - manually adjusting the desired capacity
Scheduled scaling - time-based adjustment (for example, at 8 am, spin up 4 EC2 instances)
Dynamic scaling - based on metrics
- Simple - if CPU is above 50%, add 1 instance.
- Step - the adjustment grows with the size of the breach. For example, if CPU is at 80%, launch 4 EC2 instances.
- Target tracking - hold an aggregate metric, such as average CPU across the group, at a desired value.
A cooldown period is also needed so that the group has time to stabilize after one scaling action before the next one starts.
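Step scaling and the cooldown check can be sketched together. The step table below is hypothetical apart from the two figures quoted in the notes (above 50% adds 1, 80% adds 4), and the 300-second cooldown is an assumed default.

```python
# (CPU lower bound in percent, instances to add); the middle step is invented
STEPS = [(50, 1), (65, 2), (80, 4)]


def step_adjustment(cpu_percent: float) -> int:
    """Pick the adjustment for the highest step whose lower bound is reached."""
    adjustment = 0
    for lower_bound, add in STEPS:
        if cpu_percent >= lower_bound:
            adjustment = add
    return adjustment


def may_scale(now_s: float, last_scaled_at_s: float, cooldown_s: float = 300) -> bool:
    """Block a new scaling action until the cooldown has elapsed."""
    return now_s - last_scaled_at_s >= cooldown_s


print(step_adjustment(55), step_adjustment(80))
print(may_scale(now_s=400, last_scaled_at_s=200))
```

A simple policy is the special case with a single step, which is why step scaling reacts better to large spikes.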
ASGs and Load Balancers
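Attaching an ASG to a load balancer is an example of elasticity: instances are registered with the load balancer as they launch and deregistered as they terminate.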
Processes
- Launch and Terminate - suspend and resume
- AddToLoadBalancer - add to the LB on launch
- AlarmNotification - accept notifications from CloudWatch
- AZRebalance - balances instances evenly across AZs
- HealthCheck - instance health checks on or off
- ReplaceUnhealthy - terminate an unhealthy instance and replace it
- ScheduledActions - schedule these on or off
- Standby - use this for instances InService vs Standby
Points
- Autoscaling groups are free
- only the resources created are billed
- think about using a larger number of smaller instances, which gives finer-grained scaling
- use with ALBs for elasticity
- The ASG defines when and where; the launch template (LT) defines what.
ASG Scaling Policies
ASGs don't need scaling policies, but policies make them far more powerful.
- Manual - you set the min, max, and desired values yourself. This is useful for testing and urgent change scenarios.
Dynamic Scaling Policies
- Simple - if an alarm fires, do something.
- Step - if the metric falls within defined bounds, apply the adjustment for that band. This one is more specific than simple scaling.
- Target tracking - you set a target value for a metric, and the ASG adds or removes instances to keep the metric at that target.
- Scaling based on SQS - if there are more messages in the queue, scale out.
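Target tracking and queue-based scaling both come down to choosing a capacity that brings a metric back to a target. The sketch below uses a simple proportional estimate, which is an assumption for illustration and not AWS's published algorithm; the function names and the backlog-per-instance figure are hypothetical.

```python
import math


def clamp(value: int, minimum: int, maximum: int) -> int:
    return max(minimum, min(value, maximum))


def target_tracking_capacity(current: int, metric: float, target: float,
                             minimum: int, maximum: int) -> int:
    """Scale capacity in proportion to how far the metric is from its target."""
    return clamp(math.ceil(current * metric / target), minimum, maximum)


def sqs_capacity(visible_messages: int, backlog_per_instance: int,
                 minimum: int, maximum: int) -> int:
    """Size the group so each instance handles an acceptable backlog."""
    return clamp(math.ceil(visible_messages / backlog_per_instance), minimum, maximum)


# 4 instances at 80% CPU with a 50% target
print(target_tracking_capacity(4, 80, 50, minimum=1, maximum=10))
# 250 queued messages, 100 acceptable per instance
print(sqs_capacity(250, 100, minimum=1, maximum=10))
```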
ASG Lifecycle Hooks
These are custom actions run on instances during ASG actions. Normally an instance simply launches when the group scales out and terminates when it scales in; that is the lifecycle of an instance.
- With a hook, the instance is instead paused in that flow once the Scale Out or Scale In action starts, so that custom work can be done before the instance enters service or is terminated. The pause lasts until a timeout expires or you explicitly tell the hook to Continue or Abandon.
Notifications can be sent to an SNS topic, or EventBridge can be used to initiate other processes based on hooks.
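The pause, timeout, and Continue/Abandon flow can be modelled as a small state function. This is a sketch under stated assumptions: the wait-state names (`Pending:Wait`, `Terminating:Wait`), the 3600-second default timeout, and ABANDON as the default result reflect my understanding of the service defaults, and `hook_outcome` is a made-up helper.

```python
from typing import Optional


def hook_outcome(transition: str, result: Optional[str], elapsed_s: int,
                 timeout_s: int = 3600, default_result: str = "ABANDON") -> str:
    """Return the state an instance ends up in for a given lifecycle hook.

    transition: "launch" or "terminate"
    result: "CONTINUE", "ABANDON", or None if nothing has completed the hook yet
    """
    if transition not in ("launch", "terminate"):
        raise ValueError("transition must be 'launch' or 'terminate'")
    wait_state = "Pending:Wait" if transition == "launch" else "Terminating:Wait"
    if result is None:
        if elapsed_s < timeout_s:
            return wait_state          # still paused, custom work in progress
        result = default_result        # timeout reached, fall back to the default
    if transition == "launch":
        return "InService" if result == "CONTINUE" else "Terminated"
    return "Terminated"                # a terminating instance ends up terminated either way


print(hook_outcome("launch", None, elapsed_s=120))
print(hook_outcome("launch", "CONTINUE", elapsed_s=120))
```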
ASGs and Health Checks
There are three types of health check: EC2 (the default), ELB (can be enabled), and Custom.
- EC2 - Stopping, Stopped, Terminated, Shutting down, or Impaired (not passing 2/2 status checks) is considered unhealthy.
- ELB - to be healthy, the instance needs to be running and passing the ELB health check. These can be application aware.
- Custom - these are instances marked healthy or unhealthy by an external system.
The health check grace period defaults to 300s. This creates a useful delay before health checks start, which allows launch, bootstrapping, and app start to finish before the instance is judged. A bad example: if the instance takes 400s to start and health checks begin at 300s, it is marked unhealthy and terminated, so your instances could launch and terminate repeatedly without ever finishing initialization.
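The relationship between boot time and grace period can be checked with a short sketch. It assumes an instance that is not ready when the grace period ends fails its first health check and is replaced; the 25% safety margin and both function names are arbitrary choices for illustration.

```python
import math


def first_check_passes(boot_time_s: float, grace_period_s: float = 300) -> bool:
    """Health checks are ignored until the grace period ends; the app must be up by then."""
    return boot_time_s <= grace_period_s


def suggested_grace_period(boot_times_s: list, margin: float = 1.25) -> int:
    """Slowest observed boot plus a safety margin, rounded up to whole seconds."""
    return math.ceil(max(boot_times_s) * margin)


observed = [310, 400, 365]  # hypothetical boot times in seconds
print(first_check_passes(400))
print(suggested_grace_period(observed))
```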
SSL Offload and Session Stickiness
Bridging
The ELB needs a certificate for the domain name. SSL is terminated on the ELB, which decrypts the traffic and then establishes new SSL connections to the backend instances. These instances need SSL certificates and the compute required for cryptographic operations. AWS needs to have access to the ELB's certificate in order to terminate and re-create these connections.
Pass Through
The NLB simply passes the encrypted connection to the backend EC2 instances. Each instance needs to have an SSL cert installed and there is no certificate exposure to AWS. Listener is configured for TCP. No decryption occurs.
Offload
The ELB needs an SSL certificate to decrypt, but then sends the data to the backend EC2 instances unencrypted, which removes the cryptographic compute load from those EC2 instances.
Session Stickiness
Users connect to EC2 instances through the LB. Imagine Amazon.com, where you add something to the cart. Without session stickiness, a reconnect could land on a different EC2 instance, so the cart would appear empty, or you could end up adding different things to different carts on different instances. Session stickiness is a way to keep your session on one EC2 instance. This is not the same as storing your cart in a backend database like DynamoDB for persistent carts.
If enabled, a cookie is created with a defined duration. Any subsequent connections will use that cookie to send the user to the same EC2 instance.
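- If the instance fails, stickiness is lost and the user is sent to a different instance.
- If the cookie expires, stickiness is lost and the user may be sent to a different instance.
Stickiness can create uneven load: everyone pinned to EC2 instance A might come back at the same time, 3 days later, to check their carts, and that traffic is not redistributed across the ASG.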
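Cookie-based stickiness and its two failure cases can be sketched with a toy balancer. This is a simplified model: the cookie is a plain `(instance, issued_at)` tuple, new sessions are assigned round-robin, and `StickyBalancer` and the instance IDs are hypothetical.

```python
import itertools


class StickyBalancer:
    def __init__(self, instances, cookie_ttl_s):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self.cookie_ttl_s = cookie_ttl_s
        self._round_robin = itertools.cycle(self.instances)

    def _next_healthy(self):
        if not self.healthy:
            raise RuntimeError("no healthy instances")
        for instance in self._round_robin:
            if instance in self.healthy:
                return instance

    def route(self, cookie, now):
        """cookie is None or (instance, issued_at); returns (instance, cookie)."""
        if cookie is not None:
            instance, issued_at = cookie
            # Honour the cookie only while it is unexpired and its instance is healthy.
            if instance in self.healthy and now - issued_at < self.cookie_ttl_s:
                return instance, cookie
        instance = self._next_healthy()
        return instance, (instance, now)


lb = StickyBalancer(["i-a", "i-b"], cookie_ttl_s=3600)
instance, cookie = lb.route(None, now=0)
print(instance, lb.route(cookie, now=100)[0])
```

Returning users bypass the round-robin step entirely, which is where the uneven load comes from.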
Demo - Advanced Architecture Evolution
Demo
Gateway Load Balancer
GWLBs are a way to split traffic coming into the VPC to a separate VPC or other location that contains a security appliance.
- help you run those 3rd party appliances
- monitor inbound and outbound traffic transparently
GWLB endpoints - traffic enters and leaves via these endpoints
The GWLB balances across multiple backend appliances.
Packets need to remain untouched, so they are tunneled using the GENEVE protocol.
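The idea of tunneling a packet unchanged can be illustrated by wrapping it in a GENEVE base header. This sketch follows the 8-byte fixed header layout from RFC 8926 as I understand it, carries no option TLVs, and omits the outer UDP/IP headers, so it is not what a GWLB actually emits; the function names and the VNI value are illustrative.

```python
import struct

GENEVE_UDP_PORT = 6081  # destination port of the outer UDP header (not built here)


def geneve_encapsulate(inner_packet: bytes, vni: int, protocol_type: int = 0x0800) -> bytes:
    """Prepend a base GENEVE header; the inner packet bytes are not modified."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Byte 0: version (2 bits) + option length (6 bits); byte 1: flags; then protocol type.
    first_word = struct.pack("!BBH", 0, 0, protocol_type)
    # 24-bit VNI followed by 8 reserved bits.
    second_word = struct.pack("!I", vni << 8)
    return first_word + second_word + inner_packet


def geneve_decapsulate(frame: bytes):
    """Return (vni, inner_packet), skipping any options that are present."""
    ver_optlen, _flags, _protocol = struct.unpack("!BBH", frame[:4])
    options_len = (ver_optlen & 0x3F) * 4  # option length is counted in 4-byte words
    vni = struct.unpack("!I", frame[4:8])[0] >> 8
    return vni, frame[8 + options_len:]


packet = bytes(range(20))  # stand-in for an original IP packet
frame = geneve_encapsulate(packet, vni=0xABCDEF)
print(geneve_decapsulate(frame)[1] == packet)
```

The appliance sees the original source and destination addresses inside the tunnel, which is what lets it inspect traffic transparently.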