Enable Automatic Instance Repairs

Risk level: Medium (should be achieved)
Rule ID: VirtualMachines-026

Ensure that unhealthy virtual machine instances are automatically deleted from your scale sets and replaced with new instances created from the latest instance model settings. The Automatic Instance Repairs feature relies on health checks performed for the individual instances running in a scale set. These instances can be configured to emit an application health status using the Azure Application Health extension or a load balancer health probe. If an instance is found to be unhealthy, as reported by the Application Health extension or by the associated load balancer health probe, the scale set performs the repair action by deleting the unhealthy instance and creating a new one to replace it.

Reliability

Enabling automatic instance repairs for Microsoft Azure virtual machine scale sets helps achieve high availability for your cloud applications by keeping the scale set instances healthy.


Audit

To determine whether the automatic repairs policy is enabled for the instances within your Azure virtual machine scale sets, perform the following actions:

Using Azure Portal

01 Sign in to the Azure Management Console.

02 Navigate to the All resources blade at https://portal.azure.com/#blade/HubsExtension/BrowseAll to access all your Microsoft Azure resources.

03 Choose the Azure subscription that you want to access from the Subscription filter box.

04 From the Type filter box, select Virtual machine scale set to list only the Azure virtual machine scale sets created in the selected subscription.

05 Click on the name of the virtual machine scale set that you want to examine.

06 In the navigation panel, under Settings, select Health and repair to view the automatic repair policy configured for the selected VM scale set.

07 On the Health and repair page, check the Automatic repairs configuration attribute value. If the attribute value is set to Off, the Automatic repairs feature is not enabled for the selected Microsoft Azure virtual machine scale set.

08 Repeat steps no. 5 – 7 for each Azure virtual machine scale set available in the selected subscription.

09 Repeat steps no. 3 – 8 for each subscription created in your Microsoft Azure cloud account.

Using Azure CLI

01 Run the account list command (Windows/macOS/Linux) using custom query filters to list the IDs of the subscriptions available in your Azure account:

az account list \
    --query '[*].id'

02 The command output should return the requested subscription identifiers (IDs):

[
  "abcd1234-abcd-1234-abcd-abcd1234abcd",
  "abcd1234-abcd-1234-abcd-abcd1234abcd"
]

03 Run the vmss list command (Windows/macOS/Linux) using custom query filters to list the name and the associated resource group of each virtual machine scale set deployed in the selected Azure subscription:

az vmss list \
    --subscription abcd1234-abcd-1234-abcd-abcd1234abcd \
    --output table \
    --query '[*].{name:name, resourceGroup:resourceGroup}'

04 The command output should return the name and resource group of each virtual machine scale set:

Name                    ResourceGroup
---------------------   ------------------------------
cc-web-prod-scale-set   cloud-shell-storage-westeurope
cc-web-test-scale-set   cloud-shell-storage-westeurope

05 Run the vmss show command (Windows/macOS/Linux), using the name of the virtual machine scale set that you want to examine as the identifier parameter, to return the status of the Automatic repairs feature for the selected VM scale set:

az vmss show \
    --name cc-web-prod-scale-set \
    --resource-group cloud-shell-storage-westeurope \
    --query '{"AutomaticRepairsPolicyEnabled": automaticRepairsPolicy.enabled}'

06 The command output should return the requested feature status:

{
  "AutomaticRepairsPolicyEnabled": false
}

If the "AutomaticRepairsPolicyEnabled" configuration attribute value is set to null or false, as shown in the output example above, the Automatic repairs feature is not currently enabled for the selected Microsoft Azure virtual machine scale set.

07 Repeat steps no. 5 and 6 for each Azure virtual machine scale set provisioned in the selected subscription.

08 Repeat steps no. 3 – 7 for each subscription created in your Microsoft Azure cloud account.
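
The audit steps above can be combined into one loop. The following is a minimal sketch, not an official script: audit_automatic_repairs is a hypothetical helper name, and the function assumes that the Azure CLI is installed, that you are already signed in, and that a POSIX-compatible shell is used. It treats any value other than "true" (false, null or empty) as disabled, in line with the rule stated in step 06.

```shell
# Hypothetical helper (not an Azure CLI command): prints one tab-separated
# line per scale set: subscription ID, resource group, name, status.
audit_automatic_repairs() (
  tab=$(printf '\t')
  # Step 01: subscription IDs, one per line.
  for sub in $(az account list --query '[*].id' --output tsv); do
    # Step 03: name and resource group of each scale set, tab-separated.
    az vmss list --subscription "$sub" \
        --query '[*].[name, resourceGroup]' --output tsv |
    while IFS="$tab" read -r name rg; do
      # Step 05: read the policy flag. stdin is redirected so the inner
      # command cannot consume the list that the loop is reading.
      enabled=$(az vmss show --subscription "$sub" \
          --name "$name" --resource-group "$rg" \
          --query 'automaticRepairsPolicy.enabled' --output tsv </dev/null |
          tr '[:upper:]' '[:lower:]')
      # Anything other than "true" counts as disabled.
      if [ "$enabled" = "true" ]; then status=enabled; else status=disabled; fi
      printf '%s\t%s\t%s\t%s\n' "$sub" "$rg" "$name" "$status"
    done
  done
)
```

Calling audit_automatic_repairs with no arguments prints the report; piping it through grep disabled narrows the output to the scale sets that need attention.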

Remediation / Resolution

To enable the Automatic Instance Repairs feature for your Microsoft Azure virtual machine scale sets, perform the following actions:

Note: Before enabling the automatic repairs policy for an existing scale set, ensure that all the requirements for opting in to this feature are met. The application endpoint should be correctly configured for the scale set instances to avoid triggering unintended repairs while the endpoint is still being configured.

Using Azure Portal

01 Sign in to the Azure Management Console.

02 Navigate to the All resources blade at https://portal.azure.com/#blade/HubsExtension/BrowseAll to access all your Microsoft Azure resources.

03 Choose the Azure subscription that you want to access from the Subscription filter box.

04 From the Type filter box, select Virtual machine scale set to list only the virtual machine scale sets deployed in the selected subscription.

05 Click on the name of the virtual machine scale set that you want to reconfigure.

06 In the navigation panel, under Settings, select Health and repair to access the automatic repair policy configured for the selected VM scale set.

07 On the Health and repair page, perform the following:

  1. Ensure that Monitor application health option is configured based on your application requirements.
  2. In the Automatic repair policy section, under Automatic repairs, select On to enable the Automatic Instance Repair feature for the selected virtual machine scale set.
  3. In the Grace period (min) box, enter the appropriate grace period in minutes. The grace period is the amount of time for which automatic repairs are suspended after a state change on the virtual machine. It starts once the state change is completed and helps avoid premature or accidental repairs. Allowed values for this setting are between 30 and 90 minutes.
  4. Click Save to apply the configuration changes.

08 Repeat steps no. 5 – 7 to enable automatic instance repairs for the other Azure virtual machine scale sets created within the selected subscription.

09 Repeat steps no. 3 – 8 for each subscription available in your Microsoft Azure cloud account.
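
The grace period rule from step 07 can be checked before a value is submitted. The sketch below is illustrative only: grace_period_iso8601 is a hypothetical helper name, the 30 to 90 minute range is taken from the step above, and the PT30M style of output mirrors the ISO 8601 duration that Azure reports for a stored grace period.

```shell
# Hypothetical helper: validates a grace period given in minutes against the
# 30 to 90 range stated above and prints it as an ISO 8601 duration.
grace_period_iso8601() {
  case "$1" in
    # Reject empty input and anything containing a non-digit character.
    ''|*[!0-9]*)
      echo "grace period must be a whole number of minutes" >&2
      return 1 ;;
  esac
  if [ "$1" -lt 30 ] || [ "$1" -gt 90 ]; then
    echo "grace period must be between 30 and 90 minutes" >&2
    return 1
  fi
  printf 'PT%sM\n' "$1"
}
```

For example, grace_period_iso8601 45 prints PT45M, while grace_period_iso8601 20 prints an error message and returns a non-zero status.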

Using Azure CLI

01 Run the vmss update command (Windows/macOS/Linux), using the name of the virtual machine scale set that you want to reconfigure as the identifier parameter, to enable the Automatic Instance Repairs feature for the selected VM scale set. The following command example enables automatic repairs for a scale set named "cc-project5-scale-set", using a grace period of 30 minutes. The grace period is the amount of time (in minutes, between 30 and 90) for which automatic repairs are suspended after a state change on the virtual machine:

az vmss update \
    --name cc-project5-scale-set \
    --resource-group cloud-shell-storage-westeurope \
    --enable-automatic-repairs true \
    --automatic-repairs-grace-period 30 \
    --query 'automaticRepairsPolicy'

02 The command output should return the feature configuration metadata:

{
  "enabled": true,
  "gracePeriod": "PT30M"
}

03 Repeat steps no. 1 and 2 to enable automatic instance repairs for the other Azure virtual machine scale sets deployed in the selected subscription.

04 Repeat steps no. 1 – 3 for each subscription created in your Microsoft Azure cloud account.
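
The repetition in steps 03 and 04 can be scripted per subscription. This is a hedged sketch rather than a recommended procedure: enable_automatic_repairs is a hypothetical helper name, the function assumes a signed-in Azure CLI and a POSIX-compatible shell, and it defaults to a dry run because the health monitoring prerequisites in the note above still have to be verified for each scale set before anything is changed.

```shell
# Hypothetical helper (not an Azure CLI command).
# Usage: enable_automatic_repairs <subscription-id> <grace-minutes> [--apply]
# Without --apply it only lists the scale sets it would change (dry run).
enable_automatic_repairs() (
  sub=$1
  grace=$2
  mode=${3:-}
  tab=$(printf '\t')
  # List name, resource group and current policy flag for every scale set.
  az vmss list --subscription "$sub" --output tsv \
      --query '[*].[name, resourceGroup, automaticRepairsPolicy.enabled]' |
  while IFS="$tab" read -r name rg enabled; do
    # Skip scale sets that already have the policy enabled.
    case "$enabled" in [Tt]rue) continue ;; esac
    if [ "$mode" = "--apply" ]; then
      # Same command as step 01; stdin is redirected so it cannot
      # consume the list that the loop is reading.
      az vmss update --subscription "$sub" \
          --name "$name" --resource-group "$rg" \
          --enable-automatic-repairs true \
          --automatic-repairs-grace-period "$grace" \
          --query 'automaticRepairsPolicy' </dev/null
    else
      printf 'would enable: %s\t%s\n' "$rg" "$name"
    fi
  done
)
```

Running enable_automatic_repairs abcd1234-abcd-1234-abcd-abcd1234abcd 30 lists the candidates; adding --apply as the third argument performs the update for each of them.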

Publication date Oct 26, 2020
