Make Your Release Smarter


Mastering Stability and Risk Mitigation with Slow Injection Blue-Green Deployment

Introduction

Deploying changes in a complex and dynamic software environment can be a daunting task. When it comes to migrating backend infrastructure or upgrading critical components, the stakes are even higher. Any disruption to user-facing services or scheduled jobs can have severe consequences for a business. Enter Slow Injection Blue-Green Deployment, a deployment strategy that allows you to upgrade your infrastructure and move services to production with confidence and minimal risk.

Let’s discuss that!

Brief on my architecture:

We have a polyglot architecture with two types of services:
1. Exposed services (interact directly with users)
2. Non-exposed services (scheduled jobs)

Aim: migrate from EKS to EKS Fargate.

“After testing the infrastructure in Dev and QA, blue-green deployment would be the natural approach. However, I want greater control over the release process so that, even though we are making significant infrastructure-level changes, the release stays safe and can be treated like any other release.”

Slowly Injecting Deployment


Slow Injection Blue-Green Deployment combines the principles of blue-green deployments (switching between two separate environments) with the gradual release approach of canary deployments. It’s designed to ensure the stability of your system, particularly when dealing with the migration of scheduled jobs and upgrading backend infrastructure.
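
To make the gradual part concrete, here is a minimal sketch of one possible routing mechanism: AWS Route 53 weighted records updated via boto3. The hosted zone, record name, and load balancer endpoints below are placeholders for illustration, not details from our actual setup.

```python
# A sketch of "slow injection" using weighted DNS records: the new
# environment's weight is raised in steps while the old one is lowered.
# All identifiers below are hypothetical.
import boto3

route53 = boto3.client("route53")

HOSTED_ZONE_ID = "Z0EXAMPLE"             # hypothetical hosted zone ID
RECORD_NAME = "api.example.com."         # hypothetical user-facing record
OLD_LB = "old-eks-lb.example.com."       # load balancer in the old EKS cluster
NEW_LB = "new-fargate-lb.example.com."   # load balancer in the new Fargate setup


def set_weights(old_weight: int, new_weight: int) -> None:
    """Send a chosen fraction of traffic to each environment."""
    changes = [
        {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": RECORD_NAME,
                "Type": "CNAME",
                "SetIdentifier": identifier,
                "Weight": weight,
                "TTL": 60,
                "ResourceRecords": [{"Value": target}],
            },
        }
        for identifier, target, weight in (
            ("old-env", OLD_LB, old_weight),
            ("new-env", NEW_LB, new_weight),
        )
    ]
    route53.change_resource_record_sets(
        HostedZoneId=HOSTED_ZONE_ID,
        ChangeBatch={"Comment": "slow injection traffic shift", "Changes": changes},
    )


# Slowly inject traffic into the new environment: 10% -> 50% -> 100%.
for new_pct in (10, 50, 100):
    set_weights(100 - new_pct, new_pct)
    # In practice: pause here, watch dashboards and alerts, and only continue
    # (or roll back) once the new environment looks healthy.
```

The same helper doubles as the emergency switch: setting the weights back to 100/0 returns all traffic to the old environment.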

Advantages of Slow Injection Blue-Green Deployment:

  • Risk Mitigation: Slow Injection Blue-Green Deployment is all about risk reduction. By gradually moving services to the new environment, you can identify and address issues early in the migration process, reducing the risk of critical failures.
  • Stability: Users and scheduled jobs continue to run on the old infrastructure until you’re confident that the new environment is stable and performant. This minimizes the chances of disruptions and downtime.
  • Easy Rollback: Should problems arise during the migration, you can quickly switch back to the old infrastructure, minimizing downtime and impact on users and business operations (a one-line example follows this list).
  • Isolation: Exposed services, those interacting with users, remain on the old infrastructure, ensuring minimal risk to the user experience.
  • Hassle-Free Release: Gradually migrating and thoroughly testing critical services in the new environment results in a smooth and hassle-free release day. There’s less stress and last-minute problem-solving.
  • No Chaos/Blame Game on Release Day: Slow Injection Blue-Green Deployment minimizes the chaos and blame game on release day. With critical services already moved to production and tested, there’s less room for finger-pointing or emergency fixes.
  • Critical Services in Production: The approach ensures that critical services are already running in the new production environment, providing essential functionality to users and maintaining business continuity.
  • Architecture Change as a Non-Incident Cause: When executed correctly, the change in architecture itself becomes a non-issue on the release day. Any incidents or problems that might occur are less likely to be attributed to the architecture change, as its stability has already been validated.
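
As an example of the easy rollback point above: with the weighted-record sketch shown earlier, switching all traffic back is a single call to the hypothetical set_weights helper.

```python
# Route 100% of traffic back to the old EKS environment, 0% to Fargate.
set_weights(100, 0)
```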

When to Use Slow Injection Blue-Green Deployment:

  • Backend Infrastructure Upgrade: This approach is particularly useful when upgrading your backend infrastructure, such as moving to a serverless platform or transitioning to a new cloud provider.
  • Scheduled Jobs: It’s an excellent choice when you have scheduled jobs that are critical to your business, as it ensures their uninterrupted execution.
  • Large Applications: For complex, large-scale applications, this approach allows you to manage the migration of different components separately, reducing complexity and risk.

Process:

  1. Preparation Phase:
    a) Identify Services: Categorize your services into exposed (user-facing) and non-exposed (e.g., scheduled jobs).
    b) Environment Setup: Provision the new environment (e.g., EKS Fargate) alongside the existing one (e.g., EKS).
    c) Configuration Management: Ensure that configurations for both environments match as closely as possible.
  2. Gradual Migration Phase:
    a) Start with Non-Exposed Services: Begin by migrating non-exposed services (e.g., scheduled jobs) to the new environment. This ensures that critical backend processes are running smoothly (see the CronJob sketch after this list).
    b) Testing and Validation: Thoroughly test the migrated services in the new environment, checking for compatibility, performance, and stability.
    c) Monitoring: Implement monitoring and alerting to closely observe the behaviour of the migrated services.
    d) Iterative Approach: Gradually migrate services one by one or in small groups. Test and validate each migration before moving to the next.
  3. Parallel Operation Phase:
    a) Continue Operating Old Services: Keep the old environment (e.g., EKS) operational for exposed services, ensuring minimal disruption to users.
    b) Traffic Routing: Use load balancers or routing mechanisms to direct user traffic to the old environment while routing non-exposed service traffic (scheduled triggers) to the new one.
    c) Monitoring and Metrics: Continue monitoring both environments to identify any discrepancies or issues.
  4. Validation and Confidence Phase:
    a) Stress Testing: Conduct stress and load testing to ensure that the new environment can handle production-level traffic and workloads.
    b) User Acceptance Testing (UAT): If applicable, involve users or stakeholders in testing the exposed services in the new environment.
    c) Performance Benchmarks: Compare the performance and stability of the old and new environments to ensure parity or improvements.
    d) Gradual Transition: As confidence builds, gradually increase the traffic directed to the new environment (for example, by raising DNS weights step by step, as in the earlier sketch).
  5. Full Transition and Final Checks Phase:
    a) Final Service Migration: Move all remaining services, including exposed ones, to the new environment.
    b) Final Testing: Conduct comprehensive testing, including end-to-end testing of the entire system in the new environment.
    c) Data Migration: If applicable, ensure that data is migrated seamlessly to the new environment.
  6. Release and Monitoring Phase:
    a) Full Deployment: Complete the transition by directing all traffic to the new environment.
    b) Intensive Monitoring: Monitor intensively during the initial hours and days after the full release.
    c) Alerting and Incident Response: Set up alerts and establish incident response procedures for any unforeseen issues (see the alarm sketch after this list).
  7. Rollback Plan:
    a) Emergency Rollback: Have a well-defined rollback plan in case of critical issues. This should include procedures to quickly switch traffic back to the old environment.
    b) Data Synchronization: Ensure data synchronization between old and new environments to facilitate rollback if needed.
  8. Post-Release Evaluation:
    a) Post-Migration Review: Conduct a post-release review to evaluate the success of the migration, gather lessons learned, and document improvements for future deployments.
    b) Optimization: There will always be scope to optimize and fine-tune the new environment based on performance data and user feedback.
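
As a sketch of step 2 above (migrating a scheduled job first), here is roughly how a single CronJob could be moved, assuming the old and new clusters are reachable through the kubectl contexts eks-old and eks-fargate and that the job's manifest is in cronjob.yaml. All names are hypothetical.

```python
# Move one scheduled job to the new cluster and suspend it in the old one.
import subprocess

OLD_CTX, NEW_CTX = "eks-old", "eks-fargate"              # hypothetical contexts
NAMESPACE, CRONJOB, MANIFEST = "jobs", "nightly-report", "cronjob.yaml"


def kubectl(*args: str) -> str:
    """Run a kubectl command and return its stdout, raising on failure."""
    result = subprocess.run(["kubectl", *args], check=True,
                            capture_output=True, text=True)
    return result.stdout


# 1. Suspend the job in the old cluster first so it cannot run in both places.
#    (This accepts a short window in which the job is not scheduled anywhere.)
kubectl("--context", OLD_CTX, "-n", NAMESPACE,
        "patch", "cronjob", CRONJOB, "-p", '{"spec":{"suspend":true}}')

# 2. Create the same CronJob in the new Fargate-backed cluster.
kubectl("--context", NEW_CTX, "-n", NAMESPACE, "apply", "-f", MANIFEST)

# 3. Confirm the new cluster is now the one scheduling the job.
print(kubectl("--context", NEW_CTX, "-n", NAMESPACE, "get", "cronjob", CRONJOB))
```

Repeating this job by job (or in small groups), and validating each run in the new environment, is what keeps the migration iterative and low-risk.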

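For the alerting part of step 6, here is a small sketch of a CloudWatch alarm on the new environment's error rate, assuming the exposed services sit behind an Application Load Balancer; the load balancer dimension and SNS topic ARN are placeholders.

```python
# Alarm on sustained 5xx responses from the new environment after cut-over.
import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="fargate-migration-5xx-spike",
    Namespace="AWS/ApplicationELB",
    MetricName="HTTPCode_Target_5XX_Count",
    Dimensions=[{"Name": "LoadBalancer",
                 "Value": "app/new-fargate-alb/0123456789abcdef"}],  # placeholder
    Statistic="Sum",
    Period=60,                   # one-minute buckets
    EvaluationPeriods=5,         # breach must persist for five minutes
    Threshold=50,                # more than 50 5xx responses per minute
    ComparisonOperator="GreaterThanThreshold",
    TreatMissingData="notBreaching",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:release-alerts"],  # placeholder
)
```

If this alarm fires during the intensive-monitoring window, the rollback plan in step 7 (for example, flipping the DNS weights back as in the first sketch) can be executed immediately.
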
Thanks for reading our blog. Feel free to reach out to us for any AWS/DevOps/Open Source-related discussions.

Manoj Kumar — LinkedIn.

Poonam Pawar — LinkedIn

Happy Releases!!
