AWS: Monitor an EC2 Instance with CloudWatch Alarms and Email Alerts
Step-by-Step Guide: Set Up the CloudWatch Agent on Linux EC2 with Alarms and SNS Notifications
1. Prerequisites
Before starting the setup, ensure the following requirements are met.
1.1 EC2 Linux Instance
You must have a running Linux EC2 instance. Supported operating systems include:
- Amazon Linux
- RHEL / Rocky Linux / CentOS
- Ubuntu
Example configuration:
OS: Amazon Linux / RHEL / Ubuntu
Instance type: t2.micro or above
Internet access: Required (to download packages and to reach the CloudWatch endpoints)
1.2 AWS Permissions
The EC2 instance must have permission to push metrics and logs to CloudWatch. This is done using an IAM role.
---
2. Create IAM Role for CloudWatch Agent
1. Go to AWS Console
2. Open IAM
3. Click Roles
4. Click Create Role
Select Trusted Entity
Trusted entity: AWS Service
Use case: EC2
Attach Permission Policy
Attach the following policy:
CloudWatchAgentServerPolicy
Role Name
EC2-CloudWatch-Agent-Role
Create the role.
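For readers who prefer the AWS CLI, the console steps above can be sketched roughly as follows. This is a sketch, not the only way to do it: it assumes the AWS CLI is installed and configured with IAM permissions, and the function name create_agent_role and the file name trust-policy.json are illustrative. One difference worth knowing: the console creates an instance profile with the same name as the role automatically for the EC2 use case, while the CLI requires you to create it explicitly.

```shell
# Trust policy that lets the EC2 service assume the role.
# trust-policy.json is an illustrative file name.
cat > trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

# Hypothetical helper: nothing is sent to AWS until it is called.
# $1 = role name (defaults to the name used in this guide)
create_agent_role() {
  role_name="${1:-EC2-CloudWatch-Agent-Role}"

  aws iam create-role \
    --role-name "$role_name" \
    --assume-role-policy-document file://trust-policy.json

  # AWS-managed policy named in the step above.
  aws iam attach-role-policy \
    --role-name "$role_name" \
    --policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy

  # The console does this part for you; the CLI does not.
  aws iam create-instance-profile \
    --instance-profile-name "$role_name"
  aws iam add-role-to-instance-profile \
    --instance-profile-name "$role_name" \
    --role-name "$role_name"
}
```

Calling create_agent_role with no arguments uses the role name from this guide.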
---
3. Attach IAM Role to EC2 Instance
1. Go to EC2 Dashboard
2. Select your EC2 instance
3. Click Actions
4. Select Security
5. Click Modify IAM Role
6. Choose:
EC2-CloudWatch-Agent-Role
Save changes.
Now the EC2 instance has permission to send logs and metrics to CloudWatch.
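The same attachment can be done from the AWS CLI, and the result can be checked from inside the instance. The sketch below assumes an instance profile named after the role already exists; the helper names attach_agent_role and show_attached_profile are hypothetical, and the instance ID in the usage note is a placeholder. The second helper queries the instance metadata service (IMDSv2) and is meant to run on the instance itself.

```shell
# Hypothetical helper: attach the instance profile to an instance.
# $1 = instance ID, $2 = instance profile name (optional)
attach_agent_role() {
  aws ec2 associate-iam-instance-profile \
    --instance-id "$1" \
    --iam-instance-profile "Name=${2:-EC2-CloudWatch-Agent-Role}"
}

# Hypothetical helper, run on the instance: ask the metadata service
# which instance profile is attached.
show_attached_profile() {
  # IMDSv2 requires a short-lived session token first.
  token="$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 300")"
  curl -s -H "X-aws-ec2-metadata-token: $token" \
    "http://169.254.169.254/latest/meta-data/iam/info"
}
```

For example, attach_agent_role i-0123456789abcdef0 attaches the role from this guide, and show_attached_profile on the instance should then print a small JSON document that names the instance profile.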
---
4. Install CloudWatch Agent
Log in to the EC2 instance via SSH.
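For Amazon Linux
sudo yum install amazon-cloudwatch-agent -y
For RHEL / Rocky (the package is usually not in the distribution repositories, so install the RPM that AWS publishes; this URL is for x86-64)
wget https://amazoncloudwatch-agent.s3.amazonaws.com/redhat/amd64/latest/amazon-cloudwatch-agent.rpm
sudo rpm -U ./amazon-cloudwatch-agent.rpm
For Ubuntu (likewise, install the .deb that AWS publishes; this URL is for x86-64)
wget https://amazoncloudwatch-agent.s3.amazonaws.com/ubuntu/amd64/latest/amazon-cloudwatch-agent.deb
sudo dpkg -i -E ./amazon-cloudwatch-agent.deb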
Verify Installation
On Amazon Linux / RHEL / Rocky:
rpm -qa | grep cloudwatch
On Ubuntu:
dpkg -l | grep cloudwatch
---
5. Configure CloudWatch Agent
Run the configuration wizard.
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
You will be prompted with several questions.
Example Configuration
Operating system: Linux
Run agent as root: Yes
Collect metrics: Yes
Collect logs: Yes
---
Metrics to Enable
Typical system metrics to collect:
CPU usage
Memory usage
Disk usage
Disk path example:
/
---
Logs to Monitor
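Common Linux system logs on Amazon Linux and RHEL-family systems:
/var/log/messages
/var/log/secure
On Ubuntu, the equivalent files are /var/log/syslog and /var/log/auth.log.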
If running a web server such as Nginx:
/var/log/nginx/access.log
/var/log/nginx/error.log
---
Configuration File Location
After the wizard completes, the configuration file is created at:
/opt/aws/amazon-cloudwatch-agent/bin/config.json
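The file is plain JSON, so it can be reviewed or edited by hand. A minimal configuration along the lines of the choices above might look like the following sketch. The exact content the wizard writes depends on your answers; the file name config.sample.json and the log group names here are illustrative assumptions. The aggregation_dimensions entry is what produces the InstanceId-only view of the metrics in the CloudWatch console.

```shell
# Write a sample config to the current directory for review.
# The quoted 'EOF' stops the shell from expanding ${aws:InstanceId},
# which the agent resolves itself at run time.
cat > config.sample.json <<'EOF'
{
  "agent": {
    "metrics_collection_interval": 60,
    "run_as_user": "root"
  },
  "metrics": {
    "namespace": "CWAgent",
    "append_dimensions": {
      "InstanceId": "${aws:InstanceId}"
    },
    "aggregation_dimensions": [["InstanceId"]],
    "metrics_collected": {
      "cpu": {
        "measurement": ["cpu_usage_user", "cpu_usage_system", "cpu_usage_idle"],
        "totalcpu": true
      },
      "mem": {
        "measurement": ["mem_used_percent"]
      },
      "disk": {
        "measurement": ["used_percent"],
        "resources": ["/"]
      }
    }
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/messages",
            "log_group_name": "messages",
            "log_stream_name": "{instance_id}"
          },
          {
            "file_path": "/var/log/secure",
            "log_group_name": "secure",
            "log_stream_name": "{instance_id}"
          }
        ]
      }
    }
  }
}
EOF
```

With this configuration the disk measurement is published as disk_used_percent and the memory measurement as mem_used_percent, both in the CWAgent namespace.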
---
6. Start CloudWatch Agent
Start the agent using the configuration file.
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
-a fetch-config \
-m ec2 \
-c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json \
-s
---
Verify Agent Status
sudo systemctl status amazon-cloudwatch-agent
Expected output:
active (running)
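If the status is anything other than active (running), the agent's own status command and its log file are usually the quickest places to look. The sketch below wraps both in a helper; the function name agent_diagnostics is hypothetical, and the paths assume the default installation directory used throughout this guide.

```shell
AGENT_HOME=/opt/aws/amazon-cloudwatch-agent

# Hypothetical helper: print the agent's state and its recent log lines.
agent_diagnostics() {
  # The agent's own view of its state (prints a small JSON document).
  sudo "$AGENT_HOME/bin/amazon-cloudwatch-agent-ctl" -m ec2 -a status

  # Recent log lines; credential and permission errors show up here.
  sudo tail -n 50 "$AGENT_HOME/logs/amazon-cloudwatch-agent.log"
}
```

A missing or wrong IAM role typically appears in the log as a credentials or access-denied error.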
---
7. Verify Metrics in CloudWatch
1. Open AWS Console
2. Navigate to CloudWatch
3. Click Metrics
4. Select
CWAgent → InstanceId
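After a few minutes you should see metrics such as the following (the exact names depend on what you selected in the wizard):
CPU usage (for example cpu_usage_user, cpu_usage_idle)
Memory usage (mem_used_percent)
Disk usage (disk_used_percent)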
This confirms the agent is successfully sending metrics to CloudWatch.
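The same check can be made from the AWS CLI, which also shows the exact metric names and dimension sets the agent publishes. This is a sketch that assumes the CLI is configured for the instance's region; list_agent_metrics is a hypothetical helper name.

```shell
# Hypothetical helper: list CWAgent metrics reported for one instance.
# $1 = instance ID
list_agent_metrics() {
  aws cloudwatch list-metrics \
    --namespace CWAgent \
    --dimensions "Name=InstanceId,Value=$1"
}
```

The output lists each metric with its full set of dimensions. An alarm has to name a dimension set exactly as it appears here, so this output is worth keeping at hand for the alarm step.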
---
8. Create CloudWatch Alarm
1. Go to CloudWatch
2. Click Alarms
3. Click Create Alarm
Select Metric
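CWAgent → InstanceId → cpu_usage_user (or another cpu_usage_* metric that the agent reports)
Note: CPUUtilization is the built-in metric in the AWS/EC2 namespace; it is not published by the CloudWatch Agent.
Set Condition
Example:
cpu_usage_user > 80 (percent)
Statistic: Average
Duration: 5 minutes (a period of 5 minutes, 1 evaluation period)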
---
9. Create SNS Topic for Notifications
1. Go to SNS
2. Click Topics
3. Click Create Topic
Topic Type
Standard
Topic Name
ec2-alert-topic
Create the topic.
---
10. Create SNS Subscription
Inside the topic, click Create Subscription.
Protocol
Email
Endpoint
your-email@example.com
After creating the subscription, confirm it using the email you receive.
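Steps 9 and 10 can also be done from the AWS CLI. The sketch below creates the topic, subscribes an email address, and prints the topic ARN for later use. The helper name create_alert_topic is hypothetical; the topic name matches the one used in this guide. The confirmation email still has to be accepted by hand before any notification is delivered.

```shell
# Hypothetical helper: create the topic and an email subscription.
# $1 = email address to notify
create_alert_topic() {
  # create-topic returns the existing ARN if the topic already exists.
  topic_arn="$(aws sns create-topic \
    --name ec2-alert-topic \
    --query TopicArn \
    --output text)"

  # The subscription stays pending until the recipient confirms it.
  aws sns subscribe \
    --topic-arn "$topic_arn" \
    --protocol email \
    --notification-endpoint "$1" >/dev/null

  echo "$topic_arn"
}
```

For example, TOPIC_ARN="$(create_alert_topic your-email@example.com)" stores the ARN in a shell variable.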
---
11. Attach SNS Topic to Alarm
In the Configure actions step of the alarm (edit the alarm from step 8 if you have already saved it), choose the action for the In alarm state:
Send notification to SNS topic
Select the topic:
ec2-alert-topic
Save the alarm.
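The alarm from step 8 and the SNS action from this step can be expressed as a single AWS CLI call. This is a sketch under a few assumptions: the agent publishes cpu_usage_user with an InstanceId-only dimension set (which is the case when metrics are aggregated by InstanceId), the helper name create_cpu_alarm and the alarm name prefix high-cpu- are hypothetical, and the ARN in the usage note is a placeholder. A 5-minute duration is written as a 300-second period with one evaluation period.

```shell
# Hypothetical helper: create (or overwrite) a CPU alarm that notifies SNS.
# $1 = instance ID, $2 = SNS topic ARN
create_cpu_alarm() {
  instance_id="$1"
  topic_arn="$2"

  aws cloudwatch put-metric-alarm \
    --alarm-name "high-cpu-$instance_id" \
    --namespace CWAgent \
    --metric-name cpu_usage_user \
    --dimensions "Name=InstanceId,Value=$instance_id" \
    --statistic Average \
    --period 300 \
    --evaluation-periods 1 \
    --threshold 80 \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions "$topic_arn"
}
```

For example: create_cpu_alarm i-0123456789abcdef0 arn:aws:sns:us-east-1:123456789012:ec2-alert-topic. If the alarm stays in the Insufficient data state, the dimensions most likely do not match any dimension set the agent actually publishes.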
---
12. Test the Alarm
Generate CPU load on the EC2 server.
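Install the stress tool (on Amazon Linux and RHEL-family systems the package comes from the EPEL repository, which may need to be enabled first):
sudo yum install stress -y
On Ubuntu:
sudo apt install stress -y
Run CPU load with 4 workers. The run is longer than the 5-minute alarm period so that at least one full period averages above the threshold:
stress --cpu 4 --timeout 600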
If CPU usage crosses the defined threshold:
CloudWatch Alarm → SNS Topic → Email Notification
You will receive an alert email.
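Two variations on this test may be useful. The first forces the alarm into the ALARM state once, which exercises only the alarm-to-email path without generating any load; the alarm returns to its real state at the next evaluation. The second sizes the stress run to the number of vCPUs on the instance instead of a fixed 4. Both helper names are hypothetical, and the alarm name passed to the first one is whatever you named your alarm.

```shell
# Hypothetical helper: force the alarm into ALARM once to test delivery.
# $1 = alarm name
trigger_alarm_once() {
  aws cloudwatch set-alarm-state \
    --alarm-name "$1" \
    --state-value ALARM \
    --state-reason "Manual notification test"
}

# Hypothetical helper: load every vCPU for a number of seconds.
# $1 = duration in seconds (defaults to 600)
load_all_cpus() {
  # nproc (GNU coreutils) prints the number of available processing units.
  stress --cpu "$(nproc)" --timeout "${1:-600}"
}
```

If the forced state change produces an email but real load does not, the problem is in the metric or threshold rather than in SNS.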
---
Monitoring Flow Architecture
EC2 Instance
│
CloudWatch Agent
│
CloudWatch Metrics
│
CloudWatch Alarm
│
SNS Topic
│
Email Notification
---
Conclusion
Combining the CloudWatch Agent with CloudWatch alarms and SNS enables proactive monitoring of Linux EC2 instances. The agent publishes system metrics such as CPU, memory, and disk usage, alarms evaluate them against your thresholds, and SNS sends an email shortly after a threshold is exceeded. The same pattern carries over to production environments, where it supports system reliability and a quick response to infrastructure issues.