Ensure that all Amazon EMR cluster log files are periodically archived and uploaded to S3, so that the logging data is kept for historical purposes and can be used to track and analyze the behavior of your EMR clusters over a long period of time.
This rule can help you with the following compliance standards:
This rule can help you work with the AWS Well-Architected Framework.
This rule resolution is part of the Cloud Conformity Security & Compliance tool for AWS.
By default, all EMR log files are automatically deleted from the clusters after the retention period ends. With logging to S3 enabled, Elastic MapReduce uploads the log files from the cluster master instance(s) to Amazon S3 so that the logging data (step logs, Hadoop logs, instance state logs, etc.) can be used later for troubleshooting or compliance purposes. Once the feature is active, the EMR service archives and sends the log files to Amazon S3 at 5-minute intervals.
Audit
To determine whether your Amazon EMR clusters capture log data to S3, perform the following:
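The audit can be sketched in Python, assuming the boto3 EMR client, whose `list_clusters` and `describe_cluster` calls mirror the CLI commands of the same name. The sketch treats a cluster as non-compliant when its description has no `LogUri` value. The helper names `has_s3_logging` and `clusters_without_logging`, and the choice of which cluster states to scan, are illustrative assumptions rather than part of the original rule.

```python
from typing import Any, Dict, List, Sequence

# Assumption: only clusters that are still alive are worth auditing.
ACTIVE_STATES = ("STARTING", "BOOTSTRAPPING", "RUNNING", "WAITING")


def has_s3_logging(cluster: Dict[str, Any]) -> bool:
    """Return True when a describe_cluster 'Cluster' dict has a log location.

    EMR reports the S3 destination in the LogUri field; a missing or
    empty value means the cluster does not archive its logs to S3.
    """
    return bool(cluster.get("LogUri"))


def clusters_without_logging(
    emr_client: Any, states: Sequence[str] = ACTIVE_STATES
) -> List[str]:
    """Return the IDs of clusters in the given states that have no LogUri.

    emr_client is expected to behave like boto3.client("emr").
    """
    missing: List[str] = []
    paginator = emr_client.get_paginator("list_clusters")
    for page in paginator.paginate(ClusterStates=list(states)):
        for summary in page.get("Clusters", []):
            detail = emr_client.describe_cluster(ClusterId=summary["Id"])
            cluster = detail["Cluster"]
            if not has_s3_logging(cluster):
                missing.append(cluster["Id"])
    return missing
```

With AWS credentials configured, `clusters_without_logging(boto3.client("emr", region_name=...))` would return the cluster IDs to remediate in one region; the check has to be repeated for every region in use.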
Remediation / Resolution
To enable Amazon EMR cluster logging to S3, you need to clone the required cluster and change its logging configuration, because the log location can be set only when a cluster is launched. Perform the following commands:
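One way to sketch the clone step in Python is to turn the existing cluster's description into a launch request that also carries a `LogUri`. The sketch assumes the boto3 EMR client, where `run_job_flow` is the API behind the `create-cluster` CLI command, and assumes a cluster built from uniform instance groups (instance fleets, bootstrap actions, steps, and custom configurations are not copied). The function name `build_clone_request` is hypothetical.

```python
from typing import Any, Dict, List


def build_clone_request(
    cluster: Dict[str, Any],
    instance_groups: List[Dict[str, Any]],
    log_uri: str,
) -> Dict[str, Any]:
    """Build run_job_flow keyword arguments for a clone with S3 logging.

    cluster         -- the 'Cluster' dict returned by describe_cluster
    instance_groups -- the 'InstanceGroups' list from list_instance_groups
    log_uri         -- S3 location that will receive the archived logs
    """
    if not log_uri.startswith("s3://"):
        raise ValueError("log_uri must be an s3:// location")

    ec2 = cluster.get("Ec2InstanceAttributes", {})
    instances: Dict[str, Any] = {
        "InstanceGroups": [
            {
                "Name": group.get("Name", group["InstanceGroupType"]),
                # MASTER / CORE / TASK are valid for both fields.
                "InstanceRole": group["InstanceGroupType"],
                "InstanceType": group["InstanceType"],
                "InstanceCount": group["RequestedInstanceCount"],
                "Market": group.get("Market", "ON_DEMAND"),
            }
            for group in instance_groups
        ],
        "KeepJobFlowAliveWhenNoSteps": not cluster.get("AutoTerminate", False),
        "TerminationProtected": cluster.get("TerminationProtected", False),
    }
    # Copy the optional network and key settings only when present.
    for key in ("Ec2KeyName", "Ec2SubnetId"):
        if ec2.get(key):
            instances[key] = ec2[key]

    return {
        "Name": cluster["Name"],
        "LogUri": log_uri,  # the only intended change in the clone
        "ReleaseLabel": cluster["ReleaseLabel"],
        "Applications": [
            {"Name": app["Name"]} for app in cluster.get("Applications", [])
        ],
        "Instances": instances,
        "ServiceRole": cluster["ServiceRole"],
        "JobFlowRole": ec2["IamInstanceProfile"],
        "VisibleToAllUsers": cluster.get("VisibleToAllUsers", True),
    }
```

The resulting dictionary could be passed as `emr.run_job_flow(**request)`. Once the clone is running and verified, the original cluster could be shut down with `emr.terminate_job_flows(JobFlowIds=[...])`, the boto3 counterpart of `terminate-clusters`. Review the request against the source cluster before launching, since any setting not listed above is left at its default.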
References
- AWS Documentation
  - Amazon EMR FAQs
  - Configure Cluster Logging and Debugging
  - View Log Files
- AWS Command Line Interface (CLI) Documentation
  - emr
  - list-clusters
  - describe-cluster
  - create-cluster
  - terminate-clusters
You are auditing:
EMR Cluster Logging
Risk level: Low