DevOps.dev

Devops.dev is a community of DevOps enthusiasts sharing insight, stories, and the latest…


Managing Amazon ECR Repository Limits: A DevOps Perspective

As a DevOps engineer, I’m always looking for ways to optimize and streamline our deployment processes. Recently, I encountered a challenge that highlighted the importance of managing our resources efficiently, specifically with Amazon Elastic Container Registry (ECR). In this blog post, I want to share an issue we faced, how we discovered the root cause, and the solution we implemented using ECR lifecycle policies. My hope is that this will help other DevOps professionals avoid similar pitfalls and maintain smooth CI/CD operations.

The Issue: Hitting the ECR Image Limit

Our team relies heavily on Amazon ECR to store Docker images for various services. Everything was running smoothly until our GitHub Actions pipeline started failing with an obscure error message:

failed commit on ref "manifest-sha256:xxxxx": unexpected status from PUT request

Initially, I suspected the usual culprits: network issues, authentication problems, or even a misconfiguration in the GitHub Actions workflow. However, after some investigation, I realized that the problem was due to an ECR repository limit that we had unknowingly hit.

Discovering the Root Cause

To diagnose the issue, I followed these steps:

  1. Checked the Image Count: I logged into the AWS Management Console and navigated to our ECR repositories. There, I found a notification indicating that we had reached the 10,000-image limit for one of our repositories.
  2. Reviewed Error Logs: The detailed error logs from the GitHub Actions workflow showed that the push operation failed with an “unexpected status” from ECR. The message did not name the cause, but together with the image count it pointed to a quota-related issue.
  3. Consulted AWS Documentation: A quick review of the AWS documentation confirmed that, at the time, the default quota was 10,000 images per ECR repository. This was the cap we had inadvertently hit.
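
The image-count check from step 1 can also be done outside the console. Below is a minimal sketch in Python, assuming boto3 is installed and AWS credentials are configured. The helper names `count_images` and `usage_ratio`, the repository name `my-service`, and the 10,000 default for `limit` (taken from the figure in this post) are illustrative, not part of our actual setup.

```python
def count_images(ecr_client, repository_name):
    """Count images in one ECR repository using the describe_images paginator."""
    paginator = ecr_client.get_paginator("describe_images")
    total = 0
    for page in paginator.paginate(repositoryName=repository_name):
        total += len(page["imageDetails"])
    return total


def usage_ratio(image_count, limit=10_000):
    """Fraction of the per-repository image quota in use (limit is an assumption)."""
    return image_count / limit


def main():
    # boto3 is imported here so the helpers above can be exercised without it.
    import boto3

    client = boto3.client("ecr")
    count = count_images(client, "my-service")  # hypothetical repository name
    print(f"{count} images, {usage_ratio(count):.1%} of the assumed quota")
```

Calling `main()` on a schedule, or as an early pipeline step, would surface a repository that is approaching its quota before a push fails.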

The Solution: Implementing a Lifecycle Policy

To prevent future disruptions, I decided to implement a lifecycle policy for our ECR repositories. Lifecycle policies allow us to define rules for automatically managing the lifecycle of images, such as expiring older ones, thus keeping the repository size manageable.
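
To make the idea concrete, here is a small local simulation in Python of how a count-based expiry rule picks images. It is a sketch, not ECR's implementation: it assumes images are ordered by push time, newest first, and that everything beyond the configured count expires. The function name `images_to_expire` and the sample digests are made up for illustration.

```python
from datetime import datetime, timedelta


def images_to_expire(images, keep):
    """Return the images a count-based rule would expire.

    `images` is a list of (digest, pushed_at) tuples. Images are ordered
    newest first by push time, and everything past the first `keep` expires.
    """
    newest_first = sorted(images, key=lambda item: item[1], reverse=True)
    return newest_first[keep:]


# Made-up sample data: five images pushed one day apart.
start = datetime(2024, 1, 1)
sample = [(f"sha256:{i:04d}", start + timedelta(days=i)) for i in range(5)]

expired = images_to_expire(sample, keep=3)
print([digest for digest, _ in expired])  # ['sha256:0001', 'sha256:0000']
```

With five images and `keep=3`, the two oldest pushes are the ones selected for expiry.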

Steps to Implement a Lifecycle Policy

  1. Open the Amazon ECR Console: Navigate to the Amazon ECR console.
  2. Select the Repository: Choose the repository that had hit the image limit.
  3. Access Lifecycle Policies: Go to the “Lifecycle policies” tab.
  4. Create a New Policy: Click “Create lifecycle policy”, set up the policy with the following JSON configuration, and save it.
{
  "rules": [
    {
      "rulePriority": 1,
      "description": "Keep only the last 50 images",
      "selection": {
        "tagStatus": "any",
        "countType": "imageCountMoreThan",
        "countNumber": 50
      },
      "action": {
        "type": "expire"
      }
    }
  ]
}

NOTE: This policy retains only the 50 most recently pushed images, tagged or untagged. Older images are expired automatically, which keeps the repository well below the image limit.
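
For teams that manage many repositories, the same policy can be applied from a script instead of the console. The following is a minimal sketch in Python, assuming boto3 is installed and credentials are configured. It relies on the ECR `put_lifecycle_policy` call; the helper names `build_keep_last_policy` and `apply_policy` and the repository name `my-service` are hypothetical.

```python
import json


def build_keep_last_policy(count):
    """Build the lifecycle policy document shown above for a given image count."""
    return {
        "rules": [
            {
                "rulePriority": 1,
                "description": f"Keep only the last {count} images",
                "selection": {
                    "tagStatus": "any",
                    "countType": "imageCountMoreThan",
                    "countNumber": count,
                },
                "action": {"type": "expire"},
            }
        ]
    }


def apply_policy(ecr_client, repository_name, policy):
    """Attach the policy to a repository; the API takes the policy as a JSON string."""
    return ecr_client.put_lifecycle_policy(
        repositoryName=repository_name,
        lifecyclePolicyText=json.dumps(policy),
    )


def main():
    import boto3  # imported lazily; needs boto3 and AWS credentials

    client = boto3.client("ecr")
    apply_policy(client, "my-service", build_keep_last_policy(50))  # hypothetical repo
```

Keeping the policy as a Python dict and serializing it only at the API boundary makes it easy to reuse the same rule with a different count per repository.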

Benefits of Lifecycle Policies

  • Automated Management: No more manual cleanups. The policy automatically expires old images, keeping our repositories clean.
  • Cost Efficiency: By removing unnecessary images, we save on storage costs.
  • Operational Continuity: Ensures that our CI/CD pipeline runs without interruptions due to hitting repository limits.

Conclusion

As DevOps engineers, we often juggle multiple responsibilities, from infrastructure management to deployment optimization. Implementing lifecycle policies in Amazon ECR is a simple yet effective way to manage repository sizes, ensuring that we stay within AWS limits and maintain efficient operations.

This experience reminded me of the importance of regularly reviewing and optimizing our resource usage. I hope this post helps others in the DevOps community manage their ECR repositories more effectively. If you have any questions or suggestions, feel free to reach out. Let’s continue to learn and grow together!

Happy DevOps-ing!

Published in DevOps.dev

Devops.dev is a community of DevOps enthusiasts sharing insight, stories, and the latest development in the field.

Written by Chetan Bothra

AWS Certified | GCP | DevOps | SRE | Docker | DevSecOps | Kubernetes | Automation | Terraform | Serverless | Blockchain
