AWS EC2 in Depth + IAM Console: A Practical Guide with Examples
Who this is for: Developers and platform engineers who want a clear, practical path to launching, securing, automating, and troubleshooting EC2 using strong IAM fundamentals.
What you will learn:
- The EC2 building blocks (compute, networking, storage) and how they fit together
- IAM users, roles, policies, and how to apply least privilege
- Launching EC2 securely via Console, CLI, and Terraform
- Managing access with IAM conditions, ABAC, and policy simulator
- Troubleshooting, cost, and security best practices
1) EC2: The Essentials You Actually Use
- AMIs: Machine images that define your OS and base software. Use Amazon Linux 2/2023, your hardened AMIs, or marketplace AMIs.
- Instance types: General purpose (t3/t4g/m7g/m7i), compute optimized (c6i/c7g/c7i), memory optimized (r6i/r7g), storage optimized (i3/i4i), accelerated/GPU (g and p families). Pick based on CPU, memory, and network needs.
- Storage:
  - EBS: Persistent block storage. Prefer gp3; encrypt by default with KMS.
  - Instance store: Ephemeral, very fast, data lost on stop/terminate.
- Networking:
  - VPC: Your private network. Subnets are AZ-scoped.
  - Internet access: Public subnet + Internet Gateway + public IP or private subnet + NAT for egress.
  - Security Groups: Stateful firewall. Allow inbound only what you need. Outbound default allow.
  - NACLs: Stateless subnet ACLs. Usually keep default unless you really need them.
  - Elastic IPs: Static public IPv4. Avoid unless necessary; consider SSM instead of SSH + public IP.
- IMDS: Instance Metadata Service. Use IMDSv2 only.
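To make the IMDSv2 bullet concrete, here is a minimal Python sketch of the two-step session-token flow, using only the standard library. The helper names are illustrative and not part of any AWS SDK; the link-local address and header names are the documented IMDS ones. fetch_metadata only succeeds when run on an EC2 instance.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254/latest"

def build_token_request(ttl_seconds=21600):
    """IMDSv2 step 1: a PUT request that asks for a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )

def build_metadata_request(path, token):
    """IMDSv2 step 2: a GET request that presents the session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/meta-data/{path.lstrip('/')}",
        headers={"X-aws-ec2-metadata-token": token},
    )

def fetch_metadata(path, timeout=2.0):
    """Fetch one metadata value. Only works from inside an EC2 instance."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(build_metadata_request(path, token), timeout=timeout) as resp:
        return resp.read().decode()
```

On an instance, fetch_metadata("instance-id") should return the instance ID. With IMDSv2 required, a plain GET without the token header is rejected, which is the point of enforcing it.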
2) IAM: Just Enough to Be Dangerous (Safely)
Key concepts:
- Users: Human identities. Put them in groups. Require MFA. Use access keys sparingly.
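- Roles: Identities without long-term credentials. A principal named in the role's trust policy assumes the role and receives temporary credentials. Used by AWS services (including EC2), applications, and federated users. EC2 gets an Instance Profile that contains a role.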
- Policies: JSON documents with Allow/Deny on actions + resources + conditions. Identity policies attach to users/groups/roles. Resource policies attach to AWS resources.
- Permission boundaries: Upper limit on permissions for a principal. Helpful for delegation.
- Deny beats Allow. Service Control Policies (SCPs) in AWS Organizations can apply account-wide limits.
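The "Deny beats Allow" rule can be illustrated with a deliberately simplified evaluator. This is a teaching sketch, not AWS's real evaluation engine: it assumes a flat list of statements, ignores Condition, NotAction, NotResource, permission boundaries, SCPs, and session policies, and approximates IAM wildcards with fnmatch. The function names are hypothetical.

```python
from fnmatch import fnmatchcase

def _as_list(value):
    return [value] if isinstance(value, str) else list(value)

def _action_matches(patterns, action):
    # IAM action names are case-insensitive, so compare in lower case.
    return any(fnmatchcase(action.lower(), p.lower()) for p in _as_list(patterns))

def _resource_matches(patterns, resource):
    # Resource ARNs are compared case-sensitively in this sketch.
    return any(fnmatchcase(resource, p) for p in _as_list(patterns))

def evaluate(statements, action, resource):
    """Return 'explicitDeny', 'allowed', or 'implicitDeny' for one request."""
    decision = "implicitDeny"  # default when no statement matches
    for stmt in statements:
        if not _action_matches(stmt["Action"], action):
            continue
        if not _resource_matches(stmt["Resource"], resource):
            continue
        if stmt["Effect"] == "Deny":
            return "explicitDeny"  # an explicit Deny always wins
        decision = "allowed"
    return decision
```

For example, with an Allow on ec2:* and a Deny on ec2:TerminateInstances, starting an instance is allowed, terminating one is an explicit deny, and any S3 action is an implicit deny because nothing matches.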
3) Example: Launch a Secure, Manageable EC2 Instance (Console)
Goal: Launch an instance without opening SSH (22) to the world and manage it via AWS Systems Manager (SSM).
High-level steps:
- Create an IAM role for EC2 with SSM access + least-privilege S3 read.
- Launch an instance with that role attached.
- Use SSM Session Manager to connect (no inbound ports needed).
3.1 Create the IAM role (Console)
- Go to IAM Console > Roles > Create role
- Trusted entity type: AWS service
- Use case: EC2
- Attach policies: AmazonSSMManagedInstanceCore (managed)
- Create a custom inline policy for S3 read-only on a specific bucket (optional):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadSpecificBucket",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-app-bucket",
        "arn:aws:s3:::my-app-bucket/*"
      ]
    }
  ]
}
- Name the role: ec2-ssm-s3-read-role
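In the policy above, s3:ListBucket is evaluated against the bucket ARN and s3:GetObject against object ARNs. A small Python sketch can generate an equivalent policy with that pairing made explicit, one statement per resource type. The function name and Sid values are illustrative choices, not AWS conventions.

```python
import json

def bucket_read_policy(bucket):
    """Build a read-only policy for one bucket, one statement per resource type."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": bucket_arn,           # bucket-level action
            },
            {
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"{bucket_arn}/*",    # object-level action
            },
        ],
    }

# Paste the printed JSON into the console's policy editor.
print(json.dumps(bucket_read_policy("my-app-bucket"), indent=2))
```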
3.2 Launch the instance (Console)
- EC2 Console > Launch instance
- AMI: Amazon Linux 2023 (or your standard)
- Instance type: t3.small (right-size for your workload)
- Key pair: None (we'll use SSM)
- Network: Choose your VPC
- Subnet: Private subnet (recommended) or public if needed
- Auto-assign public IP: Disable (if private subnet + NAT)
- Security group: Allow only what your app needs. If using SSM only, you can allow no inbound.
- Advanced details:
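  - IAM instance profile: ec2-ssm-s3-read-role (the console creates an instance profile with the same name as the role)
  - Metadata options: IMDSv2 required, hop limit 2 (2 is needed only when containers on the instance call IMDS; otherwise 1 is enough)
  - User data (optional):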
#!/bin/bash
set -euxo pipefail
yum -y update
# SSM Agent is preinstalled on standard Amazon Linux 2023 AMIs (not the minimal AMI); the CloudWatch agent is not and must be installed separately.
# Ensure SSM connectivity: in a private subnet, configure a NAT gateway or VPC endpoints for ssm, ec2messages, and ssmmessages.
- Launch. After the instance is running and SSM is healthy, connect via:
- EC2 Console > Connect > Session Manager
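"SSM is healthy" can also be checked programmatically. The sketch below assumes a Boto3 SSM client is passed in and uses its describe_instance_information call; the function name and the "NotRegistered" fallback label are hypothetical, chosen here for illustration.

```python
def ssm_ping_status(ssm_client, instance_id):
    """Return the SSM PingStatus for an instance, or 'NotRegistered' if SSM has no record of it."""
    resp = ssm_client.describe_instance_information(
        Filters=[{"Key": "InstanceIds", "Values": [instance_id]}]
    )
    infos = resp.get("InstanceInformationList", [])
    # 'NotRegistered' is a label invented for this sketch, not an AWS value.
    return infos[0]["PingStatus"] if infos else "NotRegistered"

# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   print(ssm_ping_status(boto3.client("ssm"), "i-0123456789abcdef0"))
```

A status of Online means Session Manager should be able to connect. An instance that never appears usually points to a missing instance profile, no route to the SSM endpoints, or a stopped agent.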
4) Same Flow via AWS CLI
Prereqs: AWS CLI v2 configured, an existing VPC+subnet, and the role created above.
# Variables
VPC_ID=vpc-0123456789abcdef0
SUBNET_ID=subnet-0123456789abcdef0
SG_NAME=ec2-ssm-sg
ROLE_NAME=ec2-ssm-s3-read-role
PROFILE_NAME=$ROLE_NAME
INSTANCE_NAME=dev-ec2-ssm
# 1) Security group with no inbound (SSM only)
SG_ID=$(aws ec2 create-security-group \
--group-name $SG_NAME \
--description "SG for SSM-managed EC2 (no inbound)" \
--vpc-id $VPC_ID \
--query 'GroupId' --output text)
# Optional: Restrict outbound instead of allowing all
# aws ec2 revoke-security-group-egress --group-id $SG_ID --protocol all --port all --cidr 0.0.0.0/0
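# aws ec2 authorize-security-group-egress --group-id $SG_ID --ip-permissions 'IpProtocol=tcp,FromPort=443,ToPort=443,IpRanges=[{CidrIp=0.0.0.0/0}]'   # HTTPS only; SSM needs outbound 443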
# 2) Minimal user data
cat > user-data.sh <<'EOF'
#!/bin/bash
set -euxo pipefail
yum -y update
EOF
# 3) Launch instance (use your AMI)
AMI_ID=$(aws ec2 describe-images --owners amazon \
--filters Name=name,Values="al2023-ami-2023.*-kernel-6.1-x86_64" Name=state,Values=available \
--query 'reverse(sort_by(Images,&CreationDate))[:1].ImageId' --output text)
aws ec2 run-instances \
--image-id $AMI_ID \
--instance-type t3.small \
--subnet-id $SUBNET_ID \
--security-group-ids $SG_ID \
--iam-instance-profile Name=$PROFILE_NAME \
--metadata-options HttpTokens=required,HttpPutResponseHopLimit=2 \
--tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value='"$INSTANCE_NAME"'}]' \
--user-data file://user-data.sh \
--count 1 \
--query 'Instances[0].InstanceId' --output text
# 4) Connect via SSM (requires the Session Manager plugin for the AWS CLI locally, plus SSM VPC endpoints or NAT for private subnets)
# aws ssm start-session --target <instance-id>
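For readers who script launches in Python, the run-instances flags above map onto keyword arguments of Boto3's run_instances. The helper below is a sketch that only builds the argument dictionary; its name and signature are illustrative, and the instance type simply mirrors the CLI example.

```python
def run_instances_params(ami_id, subnet_id, sg_id, profile_name, name, user_data=None):
    """Build keyword arguments for ec2.run_instances that mirror the CLI flags above."""
    params = {
        "ImageId": ami_id,
        "InstanceType": "t3.small",
        "SubnetId": subnet_id,
        "SecurityGroupIds": [sg_id],
        "IamInstanceProfile": {"Name": profile_name},
        # Require IMDSv2 session tokens, same as --metadata-options
        "MetadataOptions": {"HttpTokens": "required", "HttpPutResponseHopLimit": 2},
        "TagSpecifications": [
            {"ResourceType": "instance", "Tags": [{"Key": "Name", "Value": name}]}
        ],
        "MinCount": 1,
        "MaxCount": 1,
    }
    if user_data:
        params["UserData"] = user_data  # plain text; Boto3 base64-encodes it
    return params

# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   resp = boto3.client("ec2").run_instances(**run_instances_params(...))
#   print(resp["Instances"][0]["InstanceId"])
```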
Cleanup (CLI):
# Match only instances that have not been terminated yet
INSTANCE_ID=$(aws ec2 describe-instances \
  --filters Name=tag:Name,Values=$INSTANCE_NAME Name=instance-state-name,Values=pending,running,stopping,stopped \
  --query 'Reservations[].Instances[].InstanceId' --output text)
if [ -n "$INSTANCE_ID" ]; then
  aws ec2 terminate-instances --instance-ids $INSTANCE_ID
  # Wait so the security group is no longer in use
  aws ec2 wait instance-terminated --instance-ids $INSTANCE_ID
fi
aws ec2 delete-security-group --group-id $SG_ID
5) IaC with Terraform (Minimal, Copy-Paste)
Note: Replace placeholder IDs and names. This example creates: SG, role, instance profile, and EC2.
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = ">= 5.0"
}
}
}
provider "aws" {
region = var.region
}
variable "region" { default = "us-east-1" }
variable "vpc_id" {}
variable "subnet_id" {}
resource "aws_iam_role" "ec2_role" {
name = "ec2-ssm-s3-read-role"
assume_role_policy = jsonencode({
Version = "2012-10-17",
Statement = [{
Effect = "Allow",
Principal = { Service = "ec2.amazonaws.com" },
Action = "sts:AssumeRole"
}]
})
}
resource "aws_iam_role_policy_attachment" "ssm_core" {
role = aws_iam_role.ec2_role.name
policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}
resource "aws_iam_policy" "s3_read" {
name = "s3-read-specific-bucket"
policy = jsonencode({
Version = "2012-10-17",
Statement = [{
Effect = "Allow",
Action = ["s3:GetObject","s3:ListBucket"],
Resource = [
"arn:aws:s3:::my-app-bucket",
"arn:aws:s3:::my-app-bucket/*"
]
}]
})
}
resource "aws_iam_role_policy_attachment" "s3_read_attach" {
role = aws_iam_role.ec2_role.name
policy_arn = aws_iam_policy.s3_read.arn
}
resource "aws_iam_instance_profile" "ec2_profile" {
name = "ec2-ssm-s3-read-profile"
role = aws_iam_role.ec2_role.name
}
resource "aws_security_group" "ec2_sg" {
name = "ec2-ssm-sg"
description = "No inbound; SSM only"
vpc_id = var.vpc_id
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
data "aws_ami" "al2023" {
most_recent = true
owners = ["amazon"]
filter {
name = "name"
values = ["al2023-ami-2023.*-kernel-6.1-x86_64"]
}
}
resource "aws_instance" "ec2" {
ami = data.aws_ami.al2023.id
instance_type = "t3.small"
subnet_id = var.subnet_id
vpc_security_group_ids = [aws_security_group.ec2_sg.id]
iam_instance_profile = aws_iam_instance_profile.ec2_profile.name
metadata_options {
http_tokens = "required"
http_put_response_hop_limit = 2
}
user_data = <<-EOT
#!/bin/bash
set -euxo pipefail
yum -y update
EOT
tags = { Name = "dev-ec2-ssm" }
}
6) Boto3: Start/Stop EC2 by Tag
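import boto3

ec2 = boto3.client("ec2")

def ids_with_tag(tag_key, tag_value, states):
    """Return IDs of instances that carry the tag and are in one of the given states."""
    paginator = ec2.get_paginator("describe_instances")
    pages = paginator.paginate(
        Filters=[
            {"Name": f"tag:{tag_key}", "Values": [tag_value]},
            {"Name": "instance-state-name", "Values": list(states)},
        ]
    )
    return [
        inst["InstanceId"]
        for page in pages
        for reservation in page["Reservations"]
        for inst in reservation["Instances"]
    ]

# Stop: select only instances that are currently running.
ids = ids_with_tag("Environment", "dev", ["running"])
if ids:
    print("Stopping:", ids)
    ec2.stop_instances(InstanceIds=ids)

# Start: select ["stopped"] instead and call ec2.start_instances(InstanceIds=ids).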
7) IAM Console Tips That Save Hours
- Policy Simulator: Test whether a principal can perform an action on a resource with given conditions. Open it from the IAM console dashboard (Tools > Policy simulator) or at https://policysim.aws.amazon.com/.
- Access Advisor: On roles and users, see last-accessed services. Remove unused permissions.
- Inline vs Managed: Prefer customer managed policies for reuse and version control; avoid inline where possible.
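- Conditions and ABAC: Use tags to control access. Example policy allowing users with a Project tag to start/stop only instances tagged with the same value. ec2:DescribeInstances does not support resource-level permissions or resource-tag conditions, so it needs its own statement without the condition:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "StartStopOwnProject",
      "Effect": "Allow",
      "Action": [
        "ec2:StartInstances",
        "ec2:StopInstances"
      ],
      "Resource": "arn:aws:ec2:*:*:instance/*",
      "Condition": {
        "StringEquals": {
          "ec2:ResourceTag/Project": "${aws:PrincipalTag/Project}"
        }
      }
    },
    {
      "Sid": "DescribeInstances",
      "Effect": "Allow",
      "Action": "ec2:DescribeInstances",
      "Resource": "*"
    }
  ]
}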
- Permission boundaries: Let teams create roles/users but cap their max permissions with a boundary policy.
- Deny precedence: An explicit Deny anywhere (SCP, permission boundary, identity, or resource policy) overrides Allows.
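The Policy Simulator is also available through the IAM API, which helps when you want to check permissions in a script or CI job. The sketch below assumes a Boto3 IAM client is passed in and calls simulate_principal_policy; the wrapper name is illustrative, and it ignores pagination of results for brevity.

```python
def simulate_actions(iam_client, principal_arn, actions, resource_arns):
    """Return (action, resource, decision) tuples from the IAM policy simulator."""
    resp = iam_client.simulate_principal_policy(
        PolicySourceArn=principal_arn,
        ActionNames=list(actions),
        ResourceArns=list(resource_arns),
    )
    # EvalDecision is one of 'allowed', 'explicitDeny', or 'implicitDeny'.
    return [
        (r["EvalActionName"], r["EvalResourceName"], r["EvalDecision"])
        for r in resp["EvaluationResults"]
    ]

# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   for row in simulate_actions(
#       boto3.client("iam"),
#       "arn:aws:iam::111122223333:role/ec2-ssm-s3-read-role",
#       ["s3:GetObject"],
#       ["arn:aws:s3:::my-app-bucket/config.json"],
#   ):
#       print(row)
```

The account ID and object key in the usage comment are placeholders. Policies that depend on condition keys (such as the tag-based example above) generally need matching values supplied through the API's ContextEntries parameter, which this sketch omits.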
8) Security Best Practices for EC2 + IAM
- Prefer SSM Session Manager over SSH/RDP. If you must use SSH, restrict the Security Group to your IP and use short-lived SSH keys pushed through EC2 Instance Connect.
- Enforce IMDSv2 and set hop limit to the minimum needed (usually 1–2).
- Encrypt EBS (default KMS key or a customer-managed key). Apply backup lifecycle policies to snapshots.
- Use roles for applications instead of embedding long-lived access keys.
- Minimal IAM for instance profiles. Start with SSM core and add narrowly scoped policies.
- Log everything: Enable CloudTrail Org trails, VPC Flow Logs, and SSM session logging to CloudWatch or S3.
- Patch baseline and inventory via Systems Manager.
- Avoid public IPs for servers. Use ALB/NLB or API Gateway at the edge, private instances behind.
- Rotate AMIs and use golden images/hardening (CIS) with Packer.
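The IMDSv2 recommendation is easy to audit. As a sketch, the function below scans a describe_instances response (as returned by Boto3) and lists instances whose MetadataOptions.HttpTokens is not "required". The function name is illustrative; treating a missing MetadataOptions block as non-compliant is an assumption of this sketch.

```python
def instances_without_imdsv2(describe_instances_response):
    """Return IDs of instances that do not enforce IMDSv2 (HttpTokens != 'required')."""
    offenders = []
    for reservation in describe_instances_response.get("Reservations", []):
        for inst in reservation.get("Instances", []):
            tokens = inst.get("MetadataOptions", {}).get("HttpTokens")
            if tokens != "required":
                offenders.append(inst["InstanceId"])
    return offenders

# Usage (assumes boto3 is installed and credentials are configured;
# use a paginator for accounts with many instances):
#   import boto3
#   print(instances_without_imdsv2(boto3.client("ec2").describe_instances()))
```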
9) Cost and Performance
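- Right-size: Track CPU, disk I/O, and network with CloudWatch metrics (memory and disk-space usage require the CloudWatch agent), and review AWS Compute Optimizer recommendations.
- Use Auto Scaling for spiky workloads; use Spot for stateless/batch workloads that tolerate interruption. AWS advertises Spot savings of up to 90% versus On-Demand.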
- Prefer gp3 volumes and tune IOPS/throughput only when needed.
- Clean up: Unused EBS volumes, snapshots, and Elastic IPs cost money.
- Network: Enable enhanced networking (ENA) for high throughput; place chatty services in the same AZ.
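The "clean up" bullet can be turned into a quick report. This sketch takes the Volumes list from a describe_volumes response and picks out volumes in the available state, meaning not attached to any instance. The function names are illustrative, and no pricing is computed because rates vary by region and volume type.

```python
def unattached_volumes(volumes):
    """Return (VolumeId, size in GiB) for volumes that are not attached to anything."""
    return [(v["VolumeId"], v["Size"]) for v in volumes if v.get("State") == "available"]

def total_unattached_gib(volumes):
    """Sum the provisioned size of all unattached volumes."""
    return sum(size for _, size in unattached_volumes(volumes))

# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   vols = boto3.client("ec2").describe_volumes(
#       Filters=[{"Name": "status", "Values": ["available"]}]
#   )["Volumes"]
#   print(unattached_volumes(vols), total_unattached_gib(vols))
```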
10) Troubleshooting Checklist
Connectivity:
- Public access: Check IGW, route tables (0.0.0.0/0), public IP/Elastic IP, Security Groups, NACLs, OS firewall.
- Private access: Check NAT GW or VPC endpoints (for SSM: com.amazonaws.<region>.ssm, ec2messages, ssmmessages), routes.
- SSM not connecting: Ensure role has AmazonSSMManagedInstanceCore and the SSM Agent is running; DNS works; endpoints reachable.
Boot/user data:
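- Linux: /var/log/cloud-init-output.log and /var/log/cloud-init.log; for system logs, /var/log/messages (Amazon Linux 2) or journalctl (Amazon Linux 2023, which does not write /var/log/messages by default)
- Windows: C:\ProgramData\Amazon\EC2-Windows\Launch\Log\UserdataExecution.log (EC2Launch v1) or C:\ProgramData\Amazon\EC2Launch\log\agent.log (EC2Launch v2)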
IAM/metadata:
- Is IMDSv2 required and your SDKs/agents using it?
- Does the role have permission for the exact resource ARN and region? Check policy simulator.
Storage:
- EBS attach failures: AZ mismatch; limits; missing IAM permissions.
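For the private-subnet SSM check above, a small helper can report which of the three interface endpoints are missing in a VPC. This is a sketch: the function names are illustrative, and the list of existing service names is assumed to come from describe_vpc_endpoints.

```python
SSM_ENDPOINT_SUFFIXES = ("ssm", "ec2messages", "ssmmessages")

def required_ssm_endpoints(region):
    """Service names of the VPC endpoints listed in the checklist above."""
    return [f"com.amazonaws.{region}.{suffix}" for suffix in SSM_ENDPOINT_SUFFIXES]

def missing_ssm_endpoints(region, existing_service_names):
    """Return the required endpoint service names that are not present."""
    existing = set(existing_service_names)
    return [name for name in required_ssm_endpoints(region) if name not in existing]

# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   eps = boto3.client("ec2").describe_vpc_endpoints(
#       Filters=[{"Name": "vpc-id", "Values": ["vpc-0123456789abcdef0"]}]
#   )["VpcEndpoints"]
#   print(missing_ssm_endpoints("us-east-1", [e["ServiceName"] for e in eps]))
```

An empty result only shows the endpoints exist. Their security groups must still allow inbound 443 from the instance, and private DNS should be enabled.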
11) Clean Up (IAM + Networking) via CLI
# Terminate instances with a specific Name tag
aws ec2 describe-instances \
  --filters Name=tag:Name,Values=dev-ec2-ssm Name=instance-state-name,Values=pending,running,stopping,stopped \
  --query 'Reservations[].Instances[].InstanceId' --output text \
  | xargs -r aws ec2 terminate-instances --instance-ids
# Delete SG by name (be sure no refs remain)
SG_ID=$(aws ec2 describe-security-groups --filters Name=group-name,Values=ec2-ssm-sg \
--query 'SecurityGroups[0].GroupId' --output text)
[ "$SG_ID" != "None" ] && aws ec2 delete-security-group --group-id $SG_ID || true
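# Detach policies, delete the custom policy, then empty and delete the instance profile, then delete the role
# (the profile is ec2-ssm-s3-read-profile if created by Terraform; the console names it after the role)
# If you added the S3 policy inline (step 3.1), remove it first with: aws iam delete-role-policy
POL_ARN=$(aws iam list-policies --scope Local --query 'Policies[?PolicyName==`s3-read-specific-bucket`].Arn' --output text)
[ -n "$POL_ARN" ] && aws iam detach-role-policy --role-name ec2-ssm-s3-read-role --policy-arn $POL_ARN || true
[ -n "$POL_ARN" ] && aws iam delete-policy --policy-arn $POL_ARN || true
aws iam detach-role-policy --role-name ec2-ssm-s3-read-role --policy-arn arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore || true
aws iam remove-role-from-instance-profile --instance-profile-name ec2-ssm-s3-read-profile --role-name ec2-ssm-s3-read-role || true
aws iam delete-instance-profile --instance-profile-name ec2-ssm-s3-read-profile || true
aws iam delete-role --role-name ec2-ssm-s3-read-role || true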
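IAM refuses to delete a role that still has attached policies, inline policies, or instance-profile membership, so teardown order matters. The Python sketch below walks those dependencies with a Boto3 IAM client passed in. The function name is illustrative, pagination is ignored for brevity, and it leaves the instance profiles themselves and any customer managed policies in place.

```python
def delete_role_completely(iam_client, role_name):
    """Remove everything that blocks role deletion, then delete the role."""
    # 1) Detach managed policies (AWS managed and customer managed).
    attached = iam_client.list_attached_role_policies(RoleName=role_name)
    for policy in attached["AttachedPolicies"]:
        iam_client.detach_role_policy(RoleName=role_name, PolicyArn=policy["PolicyArn"])

    # 2) Delete inline policies.
    for policy_name in iam_client.list_role_policies(RoleName=role_name)["PolicyNames"]:
        iam_client.delete_role_policy(RoleName=role_name, PolicyName=policy_name)

    # 3) Remove the role from any instance profiles that contain it.
    profiles = iam_client.list_instance_profiles_for_role(RoleName=role_name)
    for profile in profiles["InstanceProfiles"]:
        iam_client.remove_role_from_instance_profile(
            InstanceProfileName=profile["InstanceProfileName"], RoleName=role_name
        )

    # 4) The role no longer has dependents and can be deleted.
    iam_client.delete_role(RoleName=role_name)

# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   delete_role_completely(boto3.client("iam"), "ec2-ssm-s3-read-role")
```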
References
- EC2 user guide: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/
- IAM user guide: https://docs.aws.amazon.com/IAM/latest/UserGuide/
- Systems Manager: https://docs.aws.amazon.com/systems-manager/latest/userguide/
- Terraform AWS provider: https://registry.terraform.io/providers/hashicorp/aws/latest/docs
You now have a repeatable, secure pattern for EC2 using strong IAM foundations, with console, CLI, Terraform, and Boto3 examples you can adapt to production.