
Overutilized AWS EC2 Instances

Last updated: 15 September 2017
Performance efficiency

Risk level: High (not acceptable risk)

Identify any Amazon EC2 instances that appear to be overutilized and upgrade (resize) them to help your EC2-hosted applications handle the workload better and improve response time. By default, an EC2 instance is considered "overutilized" when it matches the following criteria:

The average CPU utilization has been more than 90% for the last 7 days.

The average memory utilization has been more than 90% for the last 7 days. By default, AWS CloudWatch cannot record the memory utilization of an EC2 instance because the necessary metric cannot be implemented at the hypervisor level. To report memory utilization through CloudWatch, you need to install an agent (a Perl script) on the instance that you want to monitor and create a custom metric (we'll name it EC2MemoryUtilization) on the CloudWatch dashboard. The instructions for installing the monitoring agent, based on the operating system used by the instance, are available in the AWS CloudWatch documentation.
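To illustrate what such an agent does, the sketch below computes the used-memory percentage from /proc/meminfo and publishes it as a custom metric with the put-metric-data command. It is a minimal sketch under stated assumptions, not the AWS monitoring script itself: the function names are hypothetical, the System/Linux namespace is an assumption (custom metrics cannot be published under a namespace that begins with AWS/), and the Linux kernel is assumed to report MemAvailable.

```shell
# Hypothetical helper: reads /proc/meminfo-style text on stdin and prints
# the used-memory percentage, derived from MemTotal and MemAvailable.
mem_used_percent() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END {
             if (total > 0 && avail != "")
                 printf "%.2f\n", (total - avail) * 100 / total
             else
                 exit 1
         }'
}

# Hypothetical helper: publishes one datapoint for the given instance.
# $1 = instance ID, $2 = region. The namespace is an assumption.
publish_memory_metric() {
    value=$(mem_used_percent < /proc/meminfo) || return 1
    aws cloudwatch put-metric-data \
        --region "$2" \
        --namespace "System/Linux" \
        --metric-name EC2MemoryUtilization \
        --dimensions InstanceId="$1" \
        --unit Percent \
        --value "$value"
}
```

Such a function would typically be scheduled (for example from cron) on the monitored instance, e.g. publish_memory_metric i-0127275f7c1643a13 us-east-1.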


Note: You can change the default threshold values for this rule on the Cloud Conformity console and set your own values for the CPU (percent), memory utilization (percent) and the preferred number of days for each condition to configure a custom overuse level for your EC2 instances. You can also change the default name for the memory utilization metric (i.e. EC2MemoryUtilization) and use a custom name for this metric. The console also provides information about each EC2 instance marked as overutilized, details such as region, ID, instance type, launch time and operating system to help you perform the EC2 right-sizing analysis.
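The rule and its configurable thresholds can be expressed as a small check. The sketch below assumes that both conditions must hold, as the audit steps in this guide suggest; the function name and argument order are hypothetical, and the 7-day averages are expected to be computed beforehand.

```shell
# Hypothetical helper: succeeds (exit status 0) when both 7-day averages
# exceed their thresholds. Thresholds default to 90, the rule's default.
# $1 = average CPU %, $2 = average memory %,
# $3 = CPU threshold (optional), $4 = memory threshold (optional)
is_overutilized() {
    awk -v c="$1" -v m="$2" -v ct="${3:-90}" -v mt="${4:-90}" \
        'BEGIN { exit !((c + 0 > ct + 0) && (m + 0 > mt + 0)) }'
}
```

For example, is_overutilized 95.2 97.1 succeeds, while is_overutilized 95.2 80 does not because only the CPU condition is met.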

Overutilized instances could indicate that the applications running on these machines do not have enough hardware resources to perform optimally. Upgrading (upsizing) overutilized EC2 instances (vertical scaling) or adding more instances to your Auto Scaling Groups (horizontal scaling) to meet the load will directly improve the health of your applications, resulting in a more stable environment and a faster response time.

Audit

To identify any overutilized EC2 instances that could benefit from a more efficient hardware configuration, perform the following:

Using AWS Console

01 Sign in to the AWS Management Console.

02 Navigate to EC2 dashboard at https://console.aws.amazon.com/ec2/.

03 In the left navigation panel, under INSTANCES section, choose Instances.

04 Select the EC2 instance that you want to examine.

05 Select the Monitoring tab from the dashboard bottom panel.

06 Within the CloudWatch metrics section, perform the following actions:

  1. Click on the CPU Utilization (Percent) usage graph thumbnail to open the instance CPU usage details box. Inside the CloudWatch Monitoring Details dialog box, set the following parameters:
    • From the Statistic dropdown list, select Average.
    • From the Time Range list, select Last 1 Week.
    • From the Period dropdown list, select 1 Hour.
  2. Once the monitoring data is loaded, verify the instance CPU usage for the last 7 days. If the average usage (percent) has been more than 90%, the selected EC2 instance qualifies as a candidate for an overutilized instance. Click X (close) to return to the dashboard.

07 Now determine the EC2 instance memory utilization by reading the EC2MemoryUtilization metric data reported by the CloudWatch agent (script) installed on the selected EC2 instance (this rule assumes that the script has been successfully installed and it has returned memory usage data within the past 7 days). To verify the instance memory usage reported by the custom CloudWatch metric, perform the following actions:

  1. Navigate to the CloudWatch dashboard at https://console.aws.amazon.com/cloudwatch/.
  2. In the navigation panel, select Metrics to access your existing CloudWatch metrics.
  3. Choose the All metrics tab from the dashboard bottom panel, click Linux System, then select InstanceId to list any custom metrics installed on your EC2 instances.
  4. Select the right EC2 instance from the list (see Audit section part I, step no. 4), click the Action dropdown button from the dashboard top-right menu, then choose the Add to dashboard option.
  5. In the Add to dashboard dialog box, perform the following:
    • Under Select a dashboard, click the Create new button and provide a unique name for the new dashboard inside the Dashboard name box.
    • Within the Select a widget type section, choose Number.
    • Review the settings, then click Add to dashboard to create the required CloudWatch dashboard and redirect to the Dashboards page.
  6. Click the Custom dropdown button from the dashboard top-right menu, select Relative, then choose the 1 Weeks option to return the data recorded in the last week.
  7. Once the monitoring data is loaded within the widget, verify the instance memory usage for the last 7 days. If the average usage (percent) has been above 90%, the selected EC2 instance qualifies as a candidate for an overutilized instance.

If the rule conditions are met, based on the usage data outlined at step no. 6 and 7, the selected AWS EC2 instance is considered "overutilized" and should be upgraded to a better hardware configuration in order to meet the workload needs.

08 Repeat steps no. 4 – 7 to verify the CPU and memory usage data available in the last 7 days for the rest of the EC2 instances provisioned in the current region.

09 Change the AWS region from the navigation bar and repeat the audit process for other regions.

Using AWS CLI

01 Run describe-instances command (OSX/Linux/UNIX) using necessary filtering to list the IDs of all active (running) EC2 instances provisioned in the selected region:

aws ec2 describe-instances \
	--region us-east-1 \
	--filters Name=instance-state-name,Values=running \
	--output table \
	--query 'Reservations[*].Instances[*].InstanceId'

02 The command output should return a table with the requested instance IDs:

-------------------------
|   DescribeInstances   |
+-----------------------+
|  i-0127275f7c1643a13  |
|  i-0be498a00d01f7c1b  |
+-----------------------+

03 Run get-metric-statistics command (OSX/Linux/UNIX) to get the statistics recorded by AWS CloudWatch for the CPUUtilization metric representing the CPU usage of the selected EC2 instance. The following command example returns the average CPU utilization for an EC2 instance identified by the ID i-0127275f7c1643a13, usage data captured during a 7-day time frame, using a time interval of 1 hour as the granularity of the returned datapoints:

aws cloudwatch get-metric-statistics \
	--region us-east-1 \
	--metric-name CPUUtilization \
	--start-time 2017-04-21T14:21:00 \
	--end-time 2017-04-28T14:21:00 \
	--period 3600 \
	--namespace AWS/EC2 \
	--statistics Average \
	--dimensions Name=InstanceId,Value=i-0127275f7c1643a13

04 The command output should return the CPU usage details requested:

{
    "Datapoints": [
        {
            "Timestamp": "2017-04-21T14:21:00Z",
            "Average": 153.2085333333333333,
            "Unit": "Percent"
        },
        {
            "Timestamp": "2017-04-21T15:21:00Z",
            "Average": 137.03345,
            "Unit": "Percent"
        },
        {
            "Timestamp": "2017-04-21T16:21:00Z",
            "Average": 131.4999999999999993,
            "Unit": "Percent"
        },

        ...

        {
            "Timestamp": "2017-04-28T12:21:00Z",
            "Average": 312.0365,
            "Unit": "Percent"
        },
        {
            "Timestamp": "2017-04-28T13:21:00Z",
            "Average": 290.0283,
            "Unit": "Percent"
        },
        {
            "Timestamp": "2017-04-28T14:21:00Z",
            "Average": 227.0278,
            "Unit": "Percent"
        }
    ],
    "Label": "CPUUtilization"
}

If the average CPU usage returned is above 90%, the selected EC2 instance qualifies as a candidate for an overutilized instance.
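The command returns one datapoint per hour, so the values still have to be reduced to a single 7-day figure before comparing against the threshold. A minimal sketch of one way to do this is shown below: it asks the CLI for the bare Average values as text and computes their mean with awk. The function names are hypothetical, and taking the mean of hourly averages assumes each hour holds roughly the same number of samples.

```shell
# Hypothetical helper: reads whitespace-separated numbers on stdin and
# prints their mean with two decimals, or "no-data" if nothing was read.
mean_of() {
    awk '{ for (i = 1; i <= NF; i++) { sum += $i; n++ } }
         END { if (n) printf "%.2f\n", sum / n; else print "no-data" }'
}

# Hypothetical helper: prints the mean of the hourly CPU averages.
# $1 = instance ID, $2 = region, $3 = start time, $4 = end time
cpu_week_average() {
    aws cloudwatch get-metric-statistics \
        --region "$2" \
        --namespace AWS/EC2 \
        --metric-name CPUUtilization \
        --dimensions Name=InstanceId,Value="$1" \
        --start-time "$3" \
        --end-time "$4" \
        --period 3600 \
        --statistics Average \
        --query 'Datapoints[].Average' \
        --output text | mean_of
}
```

Usage example: cpu_week_average i-0127275f7c1643a13 us-east-1 2017-04-21T14:21:00 2017-04-28T14:21:00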

05 Determine the EC2 instance memory usage by querying the EC2MemoryUtilization metric data (or whatever name you have used for your custom metric) reported by the CloudWatch script installed on the selected EC2 instance (this rule assumes that the script has been successfully installed and it has recorded memory usage data within the past 7 days). To verify the instance memory usage reported by your custom CloudWatch metric, run get-metric-statistics command (OSX/Linux/UNIX) using your custom metric name as identifier. The following command example returns the average memory utilization for an EC2 instance identified by the ID i-0127275f7c1643a13 from the usage data captured by a CloudWatch metric named EC2MemoryUtilization during a 7-day time frame, using a time interval of 1 hour as the granularity of the returned datapoints:

aws cloudwatch get-metric-statistics \
	--region us-east-1 \
	--metric-name EC2MemoryUtilization \
	--start-time 2017-04-21T15:10:00 \
	--end-time 2017-04-28T15:10:00 \
	--period 3600 \
	--namespace System/Linux \
	--statistics Average \
	--dimensions Name=InstanceId,Value=i-0127275f7c1643a13

06 The command output should return the memory usage details requested:

{
    "Datapoints": [
        {
            "Timestamp": "2017-04-21T15:10:00Z",
            "Average": 97.2085,
            "Unit": "Percent"
        },
        {
            "Timestamp": "2017-04-21T16:10:00Z",
            "Average": 95.0334,
            "Unit": "Percent"
        },
        {
            "Timestamp": "2017-04-21T17:10:00Z",
            "Average": 95.1062,
            "Unit": "Percent"
        },

        ...

        {
            "Timestamp": "2017-04-28T13:10:00Z",
            "Average": 98.03999999999999993,
            "Unit": "Percent"
        },
        {
            "Timestamp": "2017-04-28T14:10:00Z",
            "Average": 98.02833333333333333,
            "Unit": "Percent"
        },
        {
            "Timestamp": "2017-04-28T15:10:00Z",
            "Average": 93.18783333333333333,
            "Unit": "Percent"
        }
    ],
    "Label": "EC2MemoryUtilization"
}

If the average memory utilization recorded in the past 7 days is more than 90%, the selected EC2 instance qualifies as a candidate for an overutilized instance.
If the usage data returned at steps no. 3 - 6 satisfies the conditions set by the conformity rule (i.e. average CPU and memory usage above 90%), the selected EC2 instance is considered "overutilized" and should be upsized in order to handle the workload efficiently.

07 Repeat steps no. 3 – 6 to verify the required CPU and memory usage data for the rest of the EC2 instances available within the current region.

08 Change the AWS region by updating the --region command parameter value and repeat steps no. 1 - 7 to perform the entire audit process for other regions.
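Repeating the listing step region by region can be scripted. The sketch below is one possible approach, with a hypothetical function name: it obtains the region names from the describe-regions command and lists the running instances in each one. It assumes the CLI has a default region configured so that describe-regions itself can run.

```shell
# Hypothetical helper: prints "REGION INSTANCE_ID" for every running
# EC2 instance in every region returned by describe-regions.
list_running_instances_all_regions() {
    for region in $(aws ec2 describe-regions \
            --query 'Regions[].RegionName' --output text); do
        for id in $(aws ec2 describe-instances \
                --region "$region" \
                --filters Name=instance-state-name,Values=running \
                --query 'Reservations[].Instances[].InstanceId' \
                --output text); do
            printf '%s %s\n' "$region" "$id"
        done
    done
}
```

Each printed pair can then be fed to the get-metric-statistics commands shown in steps no. 3 and 5.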

Remediation / Resolution

Case A: Upgrade (upsize) the overused EC2 instances provisioned within your AWS account by adding more hardware resources (CPU and RAM) to the existing instances (vertical scaling). To resize an overutilized EC2 instance, perform the following actions:

(!) Important note: the following process assumes that the EC2 instances selected for upgrade are NOT currently used in production or for critical operations. To resize production instances without any downtime, you should create an image of the current instance and launch a new instance from that image using the required instance type.

Using AWS Console

01 Sign in to the AWS Management Console.

02 Navigate to EC2 dashboard at https://console.aws.amazon.com/ec2/.

03 In the navigation panel, under INSTANCES section, choose Instances.

04 Select the overused EC2 instance that you want to resize (see Audit section part I to identify the right resource).

05 Click Actions button from the dashboard top menu, select Instance State, then select Stop.

06 Inside the Stop Instances dialog box, review the action details and click Yes, Stop to confirm the action.

07 Once the instance is stopped (i.e. Instance State set to stopped), use the Actions button from the dashboard top menu again, select Instance Settings, then select Change Instance Type.

08 In the Change Instance Type dialog box, perform the following:

  1. From the Instance Type dropdown list, select the instance type to upgrade to (e.g. m3.large; see the Amazon EC2 Instance Types page to help you choose the right instance type).
  2. (Optional) Select EBS-optimized to enable EBS optimization or deselect EBS-optimized to disable EBS optimization. This feature provides dedicated throughput to your AWS EBS volumes for best I/O performance (additional charges apply).
  3. Click Apply to resize the selected instance.

09 Now click Actions button from the dashboard top menu, select Instance State, then select Start.

10 In the Start Instances dialog box, click Yes, Start to restart the instance. Once the booting process is complete, the EC2 instance status should change from pending to running (this may take a few minutes).

11 Repeat steps no. 4 - 10 to upgrade (upsize) any other overutilized EC2 instances provisioned in the current region.

12 Change the AWS region from the navigation bar and repeat the remediation process for other regions.

Using AWS CLI

01 Run stop-instances command (OSX/Linux/UNIX) using the resource ID as identifier to stop the overused EC2 instance that you want to resize (see Audit section part II to identify the right instance):

aws ec2 stop-instances \
	--region us-east-1 \
	--instance-ids i-0127275f7c1643a13

02 The command output should return the stop request metadata:

{
    "StoppingInstances": [
        {
            "InstanceId": "i-0127275f7c1643a13",
            "CurrentState": {
                "Code": 64,
                "Name": "stopping"
            },
            "PreviousState": {
                "Code": 16,
                "Name": "running"
            }
        }
    ]
}

03 Once the instance is stopped (this should take a few minutes), run the modify-instance-attribute command (OSX/Linux/UNIX) to resize the selected EC2 instance to the desired type. The following command example upgrades an EC2 instance identified by the ID i-0127275f7c1643a13 by resizing it from an m3.medium-type instance to an m3.large-type instance (no output is returned):

aws ec2 modify-instance-attribute \
	--region us-east-1 \
	--instance-id i-0127275f7c1643a13 \
	--instance-type "{\"Value\": \"m3.large\"}"

04 Run the start-instances command (OSX/Linux/UNIX) to restart the EC2 instance resized at the previous step (it may take a few minutes until the instance enters the running state):

aws ec2 start-instances \
	--region us-east-1 \
	--instance-ids i-0127275f7c1643a13

05 The command output should return the start request information:

{
    "StartingInstances": [
        {
            "InstanceId": "i-0127275f7c1643a13",
            "CurrentState": {
                "Code": 0,
                "Name": "pending"
            },
            "PreviousState": {
                "Code": 80,
                "Name": "stopped"
            }
        }
    ]
}

06 Repeat steps no. 1 - 5 to upgrade (upsize) any other overused EC2 instances available within the current region.

07 Change the AWS region by updating the --region command parameter value and repeat the entire process for other regions.
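The stop, resize and start sequence above can be chained into one function so that each step runs only after the previous one has succeeded. The sketch below is a possible arrangement rather than a definitive procedure: the function name is hypothetical, and it relies on the EC2 waiters (aws ec2 wait instance-stopped and aws ec2 wait instance-running) instead of checking the instance state manually. The same caveat applies as for the manual steps: the instance is unavailable while it is stopped.

```shell
# Hypothetical helper: stops an instance, changes its type, starts it again.
# $1 = instance ID, $2 = new instance type, $3 = region
# Each command runs only if the previous one succeeded; the two wait
# commands block until the instance reaches the expected state.
resize_instance() {
    instance_id="$1"
    new_type="$2"
    region="$3"
    if [ -z "$instance_id" ] || [ -z "$new_type" ] || [ -z "$region" ]; then
        echo "usage: resize_instance INSTANCE_ID NEW_TYPE REGION" >&2
        return 1
    fi
    aws ec2 stop-instances --region "$region" --instance-ids "$instance_id" &&
    aws ec2 wait instance-stopped --region "$region" --instance-ids "$instance_id" &&
    aws ec2 modify-instance-attribute --region "$region" \
        --instance-id "$instance_id" --instance-type "Value=$new_type" &&
    aws ec2 start-instances --region "$region" --instance-ids "$instance_id" &&
    aws ec2 wait instance-running --region "$region" --instance-ids "$instance_id"
}
```

Usage example: resize_instance i-0127275f7c1643a13 m3.large us-east-1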

Case B: Increase the capacity of the Auto Scaling Group (ASG) consisting of overused EC2 instances by adding more machines (instances) to the existing group (horizontal scaling). To upgrade an overutilized AWS Auto Scaling Group, perform the following actions:

Using AWS Console

01 Sign in to the AWS Management Console.

02 Navigate to EC2 dashboard at https://console.aws.amazon.com/ec2/.

03 In the left navigation panel, under AUTO SCALING section, choose Auto Scaling Groups.

04 Select the AWS ASG that you want to upgrade.

05 Select the Details tab from the dashboard bottom panel and click the Edit button to edit the selected ASG configuration.

06 Raise the number of EC2 instances that run within the selected group by incrementing the existing number available in the Desired and Max fields.

07 Click Save to apply the configuration changes. Once the ASG configuration is updated, the service will launch new EC2 instances and add them to the group, increasing the total capacity of the ASG cluster so that it can handle the workload better.

08 Repeat steps no. 4 - 7 to upgrade other overutilized AWS ASGs by increasing the number of instances within the cluster provisioned in the current region (horizontal scaling).

09 Change the AWS region from the navigation bar and repeat the remediation process for other regions.

Using AWS CLI

01 Run the update-auto-scaling-group command (OSX/Linux/UNIX) to increase the capacity of the AWS Auto Scaling Group consisting of overutilized EC2 instances by updating the group configuration. The following command example increases the number of instances within an AWS ASG named CloudConformityASG to 3 (the command does not produce an output):

aws autoscaling update-auto-scaling-group \
	--region us-east-1 \
	--auto-scaling-group-name CloudConformityASG \
	--desired-capacity 3 \
	--max-size 3

Once the ASG configuration is updated, the service will launch new EC2 instances within the group, increasing the total capacity of the cluster so that it can handle the workload better, a process also known as horizontal scaling.

02 Repeat step no. 1 to upgrade other overutilized AWS ASGs by increasing the number of instances within the cluster, available in the selected region.
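Rather than hard-coding the new capacity, the current values can be read first and incremented. The following sketch, with hypothetical function names, reads DesiredCapacity and MaxSize using the describe-auto-scaling-groups command, adds the requested number of instances, and raises the maximum size only when the new desired capacity would exceed it.

```shell
# Hypothetical helper: prints "NEW_DESIRED NEW_MAX".
# $1 = current desired capacity, $2 = current max size,
# $3 = number of instances to add (defaults to 1)
next_capacity() {
    new_desired=$(( $1 + ${3:-1} ))
    if [ "$new_desired" -gt "$2" ]; then
        new_max="$new_desired"
    else
        new_max="$2"
    fi
    printf '%s %s\n' "$new_desired" "$new_max"
}

# Hypothetical helper: reads the group's current capacity and updates it.
# $1 = ASG name, $2 = region, $3 = number of instances to add
scale_out_asg() {
    name="$1"
    region="$2"
    add="${3:-1}"
    current=$(aws autoscaling describe-auto-scaling-groups \
        --region "$region" \
        --auto-scaling-group-names "$name" \
        --query 'AutoScalingGroups[0].[DesiredCapacity,MaxSize]' \
        --output text) || return 1
    # An unknown group yields an empty or "None" result: stop here.
    case "$current" in
        ''|*None*) echo "group not found: $name" >&2; return 1 ;;
    esac
    # $current is intentionally unquoted: it holds two numbers.
    target=$(next_capacity $current "$add")
    aws autoscaling update-auto-scaling-group \
        --region "$region" \
        --auto-scaling-group-name "$name" \
        --desired-capacity "${target% *}" \
        --max-size "${target#* }"
}
```

Usage example: scale_out_asg CloudConformityASG us-east-1 1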

Publication date May 2, 2017