AWS Deployment Guide

This guide provides instructions and best practices for deploying Highflame within your Amazon Web Services (AWS) environment.

HA Setup across AWS

High Availability (HA) / Disaster Recovery (DR) Mode

The Highflame HA/DR Mode is designed for enterprise-grade applications where continuous operation is paramount. This mode ensures resilience against regional failures.

  • Configuration: Active-Passive deployment architecture spanning two distinct geographical regions.

  • Suitable For: Production and mission-critical systems requiring maximum uptime and fault tolerance.

  • Redundancy: Built-in redundancy with mechanisms for automatic cross-region failover.

  • Implication: Enables seamless business continuity by rerouting traffic to the passive region during disasters or major outages in the active region.

Core Configuration: Active-Passive Model

This model is built around two regions:

  • Primary Region (Active): This region is responsible for handling all live, incoming production traffic.

  • Secondary Region (Passive/Warm Standby): This region remains ready on standby. It typically runs minimal compute resources (or scaled-down instances) but maintains up-to-date state data via replication.

Example: Deploying across the US East (N. Virginia, us-east-1) and US West (Oregon, us-west-2) regions.

Essential Prerequisites for the setup

A successful HA/DR deployment hinges on symmetry and operational readiness across both cloud regions. Two foundational pillars ensure seamless failover and minimal disruption:

  1. Symmetrical Infrastructure: Both the Primary and Secondary regions must be provisioned with identical resource types and configurations, so that the Secondary region can take over the full production workload.

  2. State Replication: Robust replication mechanisms are mandatory to ensure that all stateful components (e.g., persistent databases) are synchronized between the Active and Passive regions. This minimizes data loss (low RPO).
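As a rough illustration of the RPO point above, observed replication lag can be compared against a target. This is a minimal sketch, not part of Highflame: the function and class names are hypothetical, the 5-second target in the usage note is an arbitrary example, and the lag samples are assumed to come from your own monitoring (for example, the CloudWatch `AuroraGlobalDBReplicationLag` metric, which is reported in milliseconds).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RpoReport:
    """Result of comparing observed replication lag to an RPO target."""
    worst_lag_ms: float
    target_ms: float

    @property
    def within_target(self) -> bool:
        # The worst sample bounds how much data could be lost on failover.
        return self.worst_lag_ms <= self.target_ms


def evaluate_rpo(lag_samples_ms: Sequence[float], target_ms: float) -> RpoReport:
    """Compare the worst observed replication lag against an RPO target."""
    if not lag_samples_ms:
        raise ValueError("at least one lag sample is required")
    return RpoReport(worst_lag_ms=max(lag_samples_ms), target_ms=target_ms)
```

For instance, `evaluate_rpo([120.0, 480.0, 95.0], target_ms=5000.0).within_target` is true because the worst sample (480 ms) is below the assumed 5-second target.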

Active-Passive Environment Setup

The setup has two phases: Build, which covers deployment, and Operate, which covers failover execution.

Deployment: Active-Passive Environment Setup (Build Phase)

This phase focuses on provisioning symmetrical infrastructure across two AWS regions.

Step 1: Define Target Regions

  • Select two geographically distinct AWS regions (e.g., us-east-1 as R1 and us-west-2 as R2) and confirm that every required service is available in both regions.

Step 2: Prepare Scripts for Resource Creation

  • Prepare custom scripts to provision required infrastructure components across both regions.

  • If applicable, reuse or adapt the existing Terraform modules from the javelin-iac repository to keep both regions consistent and reduce duplication.

Step 3: Provision Core Resources in Both Regions

Provision the following infrastructure components uniformly in Region 1 (R1) and Region 2 (R2) to maintain symmetry and enable seamless failover.

  • Virtual Private Cloud (VPC)

  • Amazon EKS (Elastic Kubernetes Service)

  • Application Load Balancer (ALB)

  • Redis (e.g., ElastiCache)

  • AWS Secrets Manager

Step 4: Validate Configuration Parity

  • Ensure all resource parameters (e.g., instance types, subnet CIDRs, security groups) match across both regions.
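The parity check in Step 4 could be automated along these lines. This is a sketch under assumptions: each region's settings are represented as a flat dictionary that you assemble yourself (for example, from Terraform outputs), and the key names and the set of region-specific keys are hypothetical.

```python
from typing import Any, Dict, FrozenSet, List, Tuple

# Keys that are expected to differ between regions (hypothetical names).
REGION_SPECIFIC_KEYS: FrozenSet[str] = frozenset({"region", "availability_zones"})

_MISSING = "<missing>"  # placeholder reported when a key exists in one region only


def find_parity_drift(
    r1: Dict[str, Any],
    r2: Dict[str, Any],
    ignore: FrozenSet[str] = REGION_SPECIFIC_KEYS,
) -> List[Tuple[str, Any, Any]]:
    """Return (key, r1_value, r2_value) for every setting that differs."""
    drift = []
    for key in sorted(set(r1) | set(r2)):
        if key in ignore:
            continue
        left = r1.get(key, _MISSING)
        right = r2.get(key, _MISSING)
        if left != right:
            drift.append((key, left, right))
    return drift
```

An empty result means the two regions match on every compared setting; any returned tuple names a parameter to reconcile before relying on failover.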

Step 5: Tag and Document Resources

  • Apply consistent tagging for auditability and environment identification.

  • Document region-specific endpoints and failover readiness.
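One way to keep tagging consistent is to generate tags from a single helper and check resources against a required set. The tag keys below (`Environment`, `Application`, `DRRole`, `Region`) are a hypothetical scheme chosen for illustration; substitute whatever convention your organization already uses.

```python
from typing import Dict, List, Tuple

# Hypothetical tagging scheme; adjust to your organization's convention.
REQUIRED_TAGS: Tuple[str, ...] = ("Environment", "Application", "DRRole", "Region")


def build_tags(region: str, dr_role: str, environment: str = "production") -> Dict[str, str]:
    """Build the standard tag set for a resource in one region."""
    if dr_role not in ("active", "passive"):
        raise ValueError("dr_role must be 'active' or 'passive'")
    return {
        "Environment": environment,
        "Application": "highflame",
        "DRRole": dr_role,
        "Region": region,
    }


def missing_tags(tags: Dict[str, str], required: Tuple[str, ...] = REQUIRED_TAGS) -> List[str]:
    """Return the required tag keys that are absent or blank."""
    return [key for key in required if not tags.get(key, "").strip()]
```

Running `missing_tags` over the tags of each provisioned resource gives a quick audit of which resources still need attention.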

DR Runbook: Failover Execution (Operate Phase)

Failover can be triggered manually or automatically, depending on your setup. Below are the operational steps to execute a successful failover:

Step 1: Trigger Global Database Failover (Aurora)

  • Initiate a managed failover within the Aurora Global DB cluster.

  • This promotes the Passive region (R2) to become the new READ/WRITE Primary, while the Active region (R1) is demoted to a READ-only Replica.
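If the failover is scripted, the promotion step might look like the sketch below. It assumes boto3 is the tooling in use and that `rds_client` is the object returned by `boto3.client("rds")`; the helper names and the identifiers in the usage note are hypothetical. The RDS `FailoverGlobalCluster` operation expects the target cluster to be identified by its ARN.

```python
from typing import Any, Dict


def build_failover_request(global_cluster_id: str, target_cluster_arn: str) -> Dict[str, Any]:
    """Build keyword arguments for the RDS FailoverGlobalCluster operation."""
    if not target_cluster_arn.startswith("arn:"):
        raise ValueError("the target cluster must be identified by its ARN")
    return {
        "GlobalClusterIdentifier": global_cluster_id,
        "TargetDbClusterIdentifier": target_cluster_arn,
    }


def trigger_failover(rds_client: Any, global_cluster_id: str, target_cluster_arn: str) -> Dict[str, Any]:
    """Ask Aurora to promote the R2 cluster to primary.

    `rds_client` is assumed to be boto3.client("rds"); it is passed in
    so the function can be exercised with a stub in tests.
    """
    request = build_failover_request(global_cluster_id, target_cluster_arn)
    return rds_client.failover_global_cluster(**request)
```

A call would resemble `trigger_failover(boto3.client("rds"), "highflame-global", "arn:aws:rds:us-west-2:123456789012:cluster:highflame-r2")`, where both identifiers are placeholders.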

Step 2: Redirect Traffic via Global Accelerator

  • Update the Traffic Dial settings in Global Accelerator’s Endpoint Groups.

  • This reroutes all consumer traffic to the newly promoted region.

  • Configuration targets:

    • New Active Region (R2): Set traffic dial to 100

    • Original Active Region (R1): Set traffic dial to 0

    Execution can be done via:

    • AWS Management Console

    • Automated IaC scripts
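For the scripted path, the traffic dial change can be sketched as follows. This assumes boto3 and a client created with `boto3.client("globalaccelerator", region_name="us-west-2")` (the Global Accelerator API is served from us-west-2 regardless of where the endpoints live). The function names and the region-to-ARN mapping are hypothetical. The plan raises the new active region first so that traffic is never dialed to 0 in both regions at once.

```python
from typing import Any, Dict, List


def plan_traffic_dials(endpoint_group_arns: Dict[str, str], new_active: str) -> List[Dict[str, Any]]:
    """Build UpdateEndpointGroup requests: 100 for the new active region, 0 elsewhere.

    `endpoint_group_arns` maps a region label (e.g. "R1", "R2") to the ARN
    of that region's endpoint group.
    """
    if new_active not in endpoint_group_arns:
        raise ValueError(f"unknown region label: {new_active!r}")
    # Raise the new active region first, then drain the others.
    plan = [{
        "EndpointGroupArn": endpoint_group_arns[new_active],
        "TrafficDialPercentage": 100.0,
    }]
    for label, arn in endpoint_group_arns.items():
        if label != new_active:
            plan.append({"EndpointGroupArn": arn, "TrafficDialPercentage": 0.0})
    return plan


def apply_traffic_dials(ga_client: Any, plan: List[Dict[str, Any]]) -> None:
    """Apply each request in order using a Global Accelerator client."""
    for request in plan:
        ga_client.update_endpoint_group(**request)
```

Separating planning from applying makes it possible to review or log the intended changes before any API call is made.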

Step 3: Validate Application Health

  • Confirm that health checks (e.g., /healthz endpoint) return a healthy status in Region R2.

Verify that:

  • Global traffic is flowing through Global Accelerator

  • Requests are reaching the public ALB endpoint in R2

  • End-to-end application functionality is intact.
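The health validation can be scripted as a simple poll against the health endpoint. This is a minimal sketch using only the Python standard library; it treats HTTP 200 as healthy, which is an assumption about the `/healthz` contract, and the retry count and delay are arbitrary defaults.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for `url`, or 0 if the request fails outright."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except OSError:
        return 0  # DNS failure, connection refused, timeout, etc.


def wait_until_healthy(
    url: str,
    fetch: Callable[[str], int] = http_status,
    attempts: int = 5,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `url` until it returns 200 or the attempts are exhausted."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay_s)
    return False
```

Pointing `wait_until_healthy` first at the R2 ALB endpoint and then at the Global Accelerator address covers both the regional check and the end-to-end traffic path.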
