Bootstrapping EC2 using User Data


  • EC2 bootstrapping is the process of configuring an EC2 instance to perform automated install and configuration steps after deployment. It covers everything from launch until the instance is ready for use by the consumer. Say you want to deploy a web server: without bootstrapping, you deploy an instance via the GUI, remote into it, then install the web server, the database manager and WordPress, all manually. Compare this with bootstrapping, and with the other form of automation, baking the software into the image.

Boot time to service time


  1. AMI ----minutes----> Instance ready ----manual post-deploy----> Ready for use
  2. AMI ----minutes----> Instance ready ----bootstrapping----> Ready for use
  3. Baked AMI ----minutes----> Instance ready ----bootstrapping----> Ready for use

Note: the optimal config is to combine the two: bake AND bootstrap.

Architecture


An AMI is used to create an instance. At launch the instance pulls in its user data and executes it, which brings the instance into service.

  • EC2 does not validate the user data: if it contains a bad config, you end up with a badly configured instance

User data is:

  • not secure - don't put passwords or other secrets in it
  • unique to EC2
  • limited to 16 KB in size (measured before base64 encoding)
  • modifiable while the instance is stopped
  • only executed once by default - at first launch (remember this)
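The size limit above is easy to check before launch. The following is a minimal sketch, not a definitive implementation: the function name `encode_user_data` and the sample script (an Amazon Linux 2023-style web server install) are assumptions made for illustration.

```python
import base64

# The 16 KB limit applies to the raw script, before base64 encoding.
USER_DATA_LIMIT_BYTES = 16 * 1024


def encode_user_data(script: str) -> str:
    """Return the base64 form of a user data script, refusing oversized input."""
    raw = script.encode("utf-8")
    if len(raw) > USER_DATA_LIMIT_BYTES:
        raise ValueError(
            f"user data is {len(raw)} bytes, limit is {USER_DATA_LIMIT_BYTES}"
        )
    return base64.b64encode(raw).decode("ascii")


# Hypothetical bootstrap script: package and path names are illustrative only.
# Note that it contains no passwords, since user data is not secure.
WEB_SERVER_SCRIPT = """#!/bin/bash -xe
dnf -y install httpd
systemctl enable --now httpd
echo "bootstrapped at $(date)" > /var/www/html/index.html
"""

encoded = encode_user_data(WEB_SERVER_SCRIPT)
```

Failing early on an oversized script is cheaper than finding out at launch time. Anything larger is usually better fetched by a small user data script from somewhere else.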

Demo: Bootstrapping WordPress directly, with CFN and Terraform


Demo

Enhanced Bootstrapping with CFN INIT


cfn-init is a desired state configuration tool for AWS CloudFormation.

  • Procedural (user data) vs desired state (cfn-init)
  • Can configure packages, groups, users, sources, files, commands and services

The desired state is declared in the template, in the AWS::CloudFormation::Init metadata of the resource. The user data then calls cfn-init, which reads that metadata and applies it to the instance.

  • cfn-init can also work with stack updates. It is close to idempotent and can run multiple times, unlike user data, which runs at launch only.
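As a rough sketch of how the template and cfn-init fit together, the fragment below builds an instance resource as a Python dict and renders it as JSON. The logical ID `WebServer`, the package and the file content are assumptions chosen for illustration. The metadata keys (`packages`, `files`, `services`) and the cfn-init flags follow the CloudFormation helper script documentation.

```python
import json

# Hypothetical logical ID of the instance resource in the template.
RESOURCE_ID = "WebServer"

# The user data stays tiny: it only invokes cfn-init.
USER_DATA = "\n".join([
    "#!/bin/bash -xe",
    "/opt/aws/bin/cfn-init -v --stack ${AWS::StackId}"
    " --resource " + RESOURCE_ID + " --region ${AWS::Region}",
    "",
])

instance = {
    "Type": "AWS::EC2::Instance",
    "Metadata": {
        # Desired state: what should exist, not the commands to get there.
        "AWS::CloudFormation::Init": {
            "config": {
                "packages": {"yum": {"httpd": []}},
                "files": {
                    "/var/www/html/index.html": {
                        "content": "<h1>configured by cfn-init</h1>\n",
                        "mode": "000644",
                        "owner": "root",
                        "group": "root",
                    }
                },
                "services": {
                    "sysvinit": {
                        "httpd": {"enabled": "true", "ensureRunning": "true"}
                    }
                },
            }
        }
    },
    "Properties": {
        # ImageId, InstanceType and so on are omitted from this fragment.
        "UserData": {"Fn::Base64": {"Fn::Sub": USER_DATA}},
    },
}

template_fragment = json.dumps({"Resources": {RESOURCE_ID: instance}}, indent=2)
```

The procedural part shrinks to a single command, while everything the instance should end up with lives in the metadata.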

CreationPolicy and Signals


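cfn-signal relays a message from inside the EC2 instance back to CloudFormation to confirm that the deployment is complete (or that it failed). With a CreationPolicy on the resource, the stack waits for that signal instead of treating the instance as done as soon as it has launched.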
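A minimal sketch of the two halves, assuming a hypothetical logical ID `WebServer` and a 15 minute timeout picked for illustration: the CreationPolicy tells CloudFormation how many signals to wait for and for how long, and the last line of the user data sends the signal.

```python
import json

RESOURCE_ID = "WebServer"  # hypothetical logical ID

# No "-e" on the shebang: cfn-signal must still run when cfn-init fails,
# and "$?" hands cfn-init's exit code to CloudFormation.
USER_DATA = "\n".join([
    "#!/bin/bash -x",
    "/opt/aws/bin/cfn-init -v --stack ${AWS::StackId}"
    " --resource " + RESOURCE_ID + " --region ${AWS::Region}",
    "/opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackId}"
    " --resource " + RESOURCE_ID + " --region ${AWS::Region}",
    "",
])

instance = {
    "Type": "AWS::EC2::Instance",
    # Wait for one signal, for at most 15 minutes (ISO 8601 duration).
    "CreationPolicy": {"ResourceSignal": {"Count": 1, "Timeout": "PT15M"}},
    "Properties": {
        # ImageId, InstanceType and so on are omitted from this fragment.
        "UserData": {"Fn::Base64": {"Fn::Sub": USER_DATA}},
    },
}

template_fragment = json.dumps({"Resources": {RESOURCE_ID: instance}}, indent=2)
```

A non-zero exit code, or no signal before the timeout, marks the resource as failed rather than complete.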

Demo: cfn-init and CFN Creation Policies


Demo

Instance Role


An instance role is an IAM role that the EC2 instance itself (via the EC2 service) is allowed to assume. The role is not attached to the instance directly: it is wrapped in an instance profile, and the instance profile is what gets attached to the instance. The temporary credentials for the role are delivered via the metadata of the instance. They are automatically renewed (rotated) before they expire, so anything that reads them fresh from the metadata does not end up with lost or expired credentials.

CLI tools within the instance pick up these credentials automatically, as long as no other credentials (environment variables, a credentials file) take precedence.

Instance roles:

  • have credentials that are automatically rotated
  • should always be used rather than adding access keys into the instance
  • are used automatically by CLI tools on the instance
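To make the metadata delivery concrete, here is a minimal sketch of reading the role credentials by hand over IMDSv2 with the standard library only. The helper names are assumptions. In practice the CLI and SDKs do this for you, and `fetch_role_credentials` only works when run on an EC2 instance with a role attached.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone

IMDS = "http://169.254.169.254/latest"


def _imds(path, method="GET", headers=None):
    """Issue one request against the instance metadata service."""
    req = urllib.request.Request(IMDS + path, method=method, headers=headers or {})
    with urllib.request.urlopen(req, timeout=2) as resp:
        return resp.read().decode("utf-8")


def fetch_role_credentials():
    """Return the credentials document of the attached instance role."""
    # IMDSv2: obtain a short-lived session token first.
    token = _imds("/api/token", "PUT",
                  {"X-aws-ec2-metadata-token-ttl-seconds": "60"})
    auth = {"X-aws-ec2-metadata-token": token}
    base = "/meta-data/iam/security-credentials/"
    role_name = _imds(base, headers=auth).splitlines()[0]
    return json.loads(_imds(base + role_name, headers=auth))


def expires_soon(creds, now, margin=timedelta(minutes=5)):
    """True when the credentials are within `margin` of their expiration."""
    expiry = datetime.strptime(creds["Expiration"], "%Y-%m-%dT%H:%M:%SZ")
    return expiry.replace(tzinfo=timezone.utc) - now <= margin
```

The document returned contains `AccessKeyId`, `SecretAccessKey`, `Token` and `Expiration`. Reading it again close to the expiration time returns the rotated set, which is why nothing needs to be stored on the instance.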

Demo:


AWS Systems Manager Parameter store


Parameter Store is the part of AWS Systems Manager (SSM) that allows for the storage and retrieval of parameters. The service supports encryption with KMS.

Passing secrets through the user data, as learned previously, is bad practice. Anyone with access to the EC2 instance can read them.

  • Can store license codes, database strings, full configs and passwords
  • Hierarchies and versioning
  • Plaintext and ciphertext using KMS
  • Public Parameters - e.g. the latest AMIs per region

Access to parameters is authorized by IAM. SecureString values are then decrypted via KMS, which requires the IAM identity to also have permission on the KMS key.

Three types:

  • String
  • StringList
  • SecureString
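The hierarchy and the three types can be sketched together. The function below walks one path of the hierarchy and returns plain values. It expects an SSM client such as the one returned by `boto3.client("ssm")` and uses only its `get_parameters_by_path` call. The function name and the `/app/...` parameter names in the usage note are hypothetical.

```python
def load_parameters(ssm, path):
    """Return {name: value} for every parameter below `path`.

    `ssm` is any object exposing get_parameters_by_path the way the
    boto3 SSM client does.
    """
    values = {}
    kwargs = {"Path": path, "Recursive": True, "WithDecryption": True}
    while True:
        page = ssm.get_parameters_by_path(**kwargs)
        for param in page["Parameters"]:
            value = param["Value"]
            if param.get("Type") == "StringList":
                # A StringList is stored as one comma-separated string.
                value = value.split(",")
            values[param["Name"]] = value
        token = page.get("NextToken")
        if not token:
            return values
        # More results are available: ask for the next page.
        kwargs["NextToken"] = token
```

On an instance whose role allows it, `load_parameters(boto3.client("ssm"), "/app/")` would return entries such as `/app/db/host`. With `WithDecryption=True`, SecureString values come back as plaintext, so the caller needs both the SSM and the KMS permissions.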

Demo: Parameter Store


Demo

System and Application Logging on EC2


Two related services:

  • CloudWatch is used for metrics
  • CloudWatch Logs is used for logging

Neither of these natively captures data from inside an EC2 instance. This is not an agentless configuration: you must install the CloudWatch Agent.

Log groups:

  • one log group for each log file
  • one log stream for each EC2 instance
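The layout above (a log group per log file, a stream per instance) maps directly onto the CloudWatch Agent configuration file. Below is a minimal sketch that generates such a config. The function name and the log file paths are assumptions, and the choice of memory and disk metrics is only an example of data that CloudWatch cannot see without the agent.

```python
import json


def agent_config(log_files):
    """Build a CloudWatch Agent config: one log group per file,
    one log stream per instance."""
    collect_list = [
        {
            "file_path": path,
            "log_group_name": path,
            # The agent replaces this placeholder with the instance ID.
            "log_stream_name": "{instance_id}",
        }
        for path in log_files
    ]
    return {
        "logs": {"logs_collected": {"files": {"collect_list": collect_list}}},
        "metrics": {
            "metrics_collected": {
                "mem": {"measurement": ["mem_used_percent"]},
                "disk": {"measurement": ["used_percent"], "resources": ["/"]},
            }
        },
    }


# Hypothetical log files for a web server instance.
config_json = json.dumps(
    agent_config(["/var/log/secure", "/var/log/httpd/access_log"]), indent=2
)
```

The resulting JSON is what the agent reads at start-up. The instance also needs an instance role that allows it to write to CloudWatch and CloudWatch Logs.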

Demo: Logging and Metrics with CloudWatch Agent


Demo

EC2 Placement Groups


A placement group controls the physical location of EC2 instances within the AWS datacenters. This matters because if all of your EC2 instances sit on one host in one datacenter, a single hardware failure takes all of them down, which defeats much of the point of the cloud. Set up badly, your supposedly highly available cloud infrastructure is no more available than a physical server running in your office.

Types:

  • Cluster - instances are kept close together for maximum performance
  • Spread - instances are kept separated on distinct hardware
  • Partition - groups of instances are spread apart

Cluster


  • Capacity is tied to the hardware the cluster landed on: if you spin up 6 instances in a cluster and the max the hardware can hold is 9 instances, you won't be able to scale this cluster up to 12 instances
  • Lives in one availability zone, cannot span multiple AZs
  • Can span peered VPCs
  • Only supported on specific instance types
  • Use the same type of instance within a cluster
  • Same location, so low latency and high bandwidth: instances get the best network performance between each other because they are so closely connected
  • Not highly resilient by itself: if the hardware fails, it could take down your complete cluster

Spread


Instances are placed on separate racks, which provides infrastructure isolation between them. A spread placement group can cover multiple availability zones in a region, but you are limited to 7 running instances per AZ per group. This is a hard AWS limit on spread placement groups.

Partition


Use a partition placement group when you need more than 7 instances. You specify a number of partitions, with a max of 7 partitions per AZ. Each partition has isolated infrastructure, and you can spin up as many instances within each individual partition as you want. This allows you to exceed the limit of 7 instances that applies to spread groups.

  • Max 7 partitions
  • Instances can be placed into specific partition
  • great for topology aware applications such as HDFS, HBase and Cassandra
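The limits for the three strategies can be captured in a small validator. This is a sketch under stated assumptions: the function names are made up, and the returned dict mirrors the `GroupName`, `Strategy` and `PartitionCount` parameters of the EC2 `create_placement_group` call without sending anything to AWS.

```python
MAX_PARTITIONS_PER_AZ = 7
MAX_SPREAD_INSTANCES_PER_AZ = 7


def placement_group_request(name, strategy, partition_count=None):
    """Return keyword arguments for creating a placement group."""
    if strategy not in ("cluster", "spread", "partition"):
        raise ValueError(f"unknown strategy: {strategy}")
    request = {"GroupName": name, "Strategy": strategy}
    if strategy == "partition":
        if partition_count is None:
            raise ValueError("partition groups need a partition count")
        if not 1 <= partition_count <= MAX_PARTITIONS_PER_AZ:
            raise ValueError("a group supports 1 to 7 partitions per AZ")
        request["PartitionCount"] = partition_count
    elif partition_count is not None:
        raise ValueError("only partition groups take a partition count")
    return request


def spread_capacity(az_count):
    """Most instances one spread group can run across `az_count` AZs."""
    return MAX_SPREAD_INSTANCES_PER_AZ * az_count
```

For example, `spread_capacity(3)` is 21, so a workload that needs more isolated instances than that in three AZs is a candidate for a partition group instead.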

Dedicated Hosts


Dedicated Hosts have a very narrow use case in real-world applications. Few people pay for a whole dedicated host in the cloud.

  • No instance charges, as you're paying for the entire host
  • On-Demand and Reserved options are available
  • Host hardware has physical sockets and cores - good for software licensed per socket or core
  • Each host only has capacity for a certain number of instances, e.g. 1 4xlarge might take the same capacity as 16 medium instances

Not supported:

  • AMI limits - RHEL, SUSE Linux and Windows AMIs are unsupported
  • RDS instances
  • Placement Groups

Hosts can be shared with other accounts in the organization using Resource Access Manager (RAM). An account that a host is shared with can only see the EC2 instances that it created on that host.

Enhanced Networking and EBS Optimized


Networking


SR-IOV - makes the physical NIC aware of the virtualization

  • available for no extra charge on most EC2 instance types
  • lower CPU usage
  • more bandwidth
  • higher packets per second (PPS)
  • lower and more consistent latency

EBS Optimized


  • EBS is block storage delivered over the network
  • EBS Optimized means dedicated capacity has been allocated for EBS traffic
  • supported and enabled by default on current instance types; on some older types it has to be enabled explicitly
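Both features show up as attributes on the instance, so they can be checked after the fact. The sketch below summarizes the `EnaSupport` and `EbsOptimized` fields from a response shaped like the output of the EC2 `describe_instances` call. The function name is an assumption, and it only looks at ENA, not at the older Intel 82599 VF flavor of enhanced networking.

```python
def networking_summary(response):
    """Map instance ID to its ENA and EBS-optimized flags.

    `response` has the shape returned by the EC2 describe_instances call.
    """
    summary = {}
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            summary[instance["InstanceId"]] = {
                # Treat a missing attribute as "not enabled".
                "ena": bool(instance.get("EnaSupport", False)),
                "ebs_optimized": bool(instance.get("EbsOptimized", False)),
            }
    return summary
```

With boto3 available, `networking_summary(boto3.client("ec2").describe_instances())` would list the instances in the current region, which is a quick way to spot older instance types where one of the two is not switched on.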