# Single Region

## Single Region Deployment on EKS

### **⚠️ Reference Only** (If you are using Terraform)

The [`highflame-iac`](https://github.com/highflame-ai/highflame-iac/blob/main/terraform/README.md) repository contains reference Terraform configurations for managing Highflame cloud resources. **Do not** copy these directly into your IaC code or run them against your cloud environment without first reviewing and patching them to match your account, networking, and security requirements.

### Benefits of Single-Region Deployment

* **Simplified Management**: Easier oversight and maintenance due to a single, centralized location for all resources.
* **Cost Efficiency**: Potentially lower data transfer costs between services within the same region.
* **Consistency**: Ensures uniformity of services and offerings within a specific AWS region.

### Prerequisites <a href="#prerequisites" id="prerequisites"></a>

Ensure that the following tools and resources are installed and available:

* Access to the AWS Management Console
* A set of domain names for Highflame services (the Highflame team will share the list of services that require ingress)
* An EKS cluster with at least 6 worker nodes (**Best Practice:** span the VPC across 3 Availability Zones and run 2 nodes in each zone, 6 in total, to ensure high availability)
* EKS worker nodes should have the following IAM managed policies attached
  * `arn:aws:iam::aws:policy/CloudWatchFullAccess`
  * `arn:aws:iam::aws:policy/CloudWatchLogsFullAccess`
  * `arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy`
* The following EKS add-ons should be installed
  * coredns
  * kube-proxy
  * vpc-cni
  * aws-ebs-csi-driver
  * amazon-cloudwatch-observability
* EKS can be configured with [cluster-autoscaler](https://kubernetes.github.io/autoscaler) for managing the worker node group scaling
* EKS should be configured with [metrics-server](https://kubernetes-sigs.github.io/metrics-server) for managing the Highflame service HPA
* Aurora PostgreSQL cluster on RDS
* Redis cluster on ElastiCache
* ALB ingress controller for EKS, with SSL certificates in ACM
* Helm v3
* `kubectl`
* All cloud resources (managed services such as Postgres and Redis) should be in the same VPC as the Kubernetes cluster, or otherwise reachable from it
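
The tooling part of this checklist can be verified before you start. The following is a minimal sketch, assuming a Bash shell; the helper name `missing_tools` is hypothetical, and the tool list (which also includes the `aws` and `eksctl` CLIs used later in this guide) should be adjusted to your environment.

```bash
#!/usr/bin/env bash
# Pre-flight sketch: report which required CLI tools are missing from PATH.

# Prints each missing tool on its own line; returns 1 if any tool is missing.
missing_tools() {
  local tool missing=0
  for tool in "$@"; do
    if ! command -v "${tool}" >/dev/null 2>&1; then
      echo "${tool}"
      missing=1
    fi
  done
  return "${missing}"
}

# Example: check the CLI tools this guide relies on.
if missing_tools aws eksctl kubectl helm >/dev/null; then
  echo "All required tools are installed"
else
  echo "Some required tools are missing; run missing_tools again to list them"
fi
```

This only checks that the binaries exist; it does not validate versions or AWS credentials.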

### Cloud Resources and Sizing

| Highflame components  | Cloud Resources                                                           | Size                                                                                                                                                                                                                                                                                                                                                                                                  |
| --------------------- | ------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Highflame services     | Deployed into your EKS cluster using the Helm charts                      | <ul><li><strong>CPU nodes</strong>: Set up at least 3 nodes across different zones; the worker node type can be <code>c6a.2xlarge</code> with AMI type <code>AL2023\_x86\_64\_STANDARD</code></li><li><strong>GPU nodes</strong>: Set up at least 4 nodes across different zones; the worker node type can be <code>g4dn.xlarge</code> with AMI type <code>AL2023\_x86\_64\_NVIDIA</code></li></ul> |
| Transactional Database | Aurora database compatible with PostgreSQL                                | The Aurora server can be a single node or a cluster; the node type can be `db.r6g.large` |
| Analytical Database    | ClickHouse database server                                                | Deployed into the EKS cluster |
| Cache                  | ElastiCache for Redis OSS                                                 | Redis cluster of at least 2 nodes with node type `cache.m5.large` |
| Object Store           | S3 bucket                                                                 | Highflame services need an object store for files and data, for example to keep ClickHouse database backups in S3 for disaster recovery, since ClickHouse runs inside the EKS cluster |
| Logging                | CloudWatch                                                                | The EKS service logs can be pushed to CloudWatch with the help of Fluent Bit |
| Authentication        | Clerk                                                                     | External - Managed by Highflame                                                                                                                                                                                                                                                                                                                                                                       |

### Highflame AWS IAM Role

#### Prerequisites <a href="#prerequisites" id="prerequisites"></a>

Ensure that the following tools and resources are installed and available:

* Access to the EKS cluster set up above
* AWS CLI
* eksctl

#### IAM Role

1. Set up the environment variables

```bash
export K8S_CLUSTER_NAME="highflame-poc-eks"    ## AWS EKS cluster name
export K8S_NAMESPACE="highflame-poc"           ## Highflame k8s namespace
export IAM_ROLE_NAME="highflame-svc-role"      ## Name given for the AWS IAM role
export K8S_SA_NAME="highflame-k8s-sa"          ## Name for K8S service account
```

2. Get the AWS Account ID

```bash
AWS_ACC_ID=$(aws sts get-caller-identity --query Account --output text)
```

3. Fetch the EKS cluster's OIDC issuer

```bash
OIDC_ISSUER=$(aws eks describe-cluster --name ${K8S_CLUSTER_NAME} --query "cluster.identity.oidc.issuer" --output text | sed -e "s~https://~~")
```

4. Create an IAM OIDC provider if it does not exist

```bash
# Create the IAM OIDC provider only if it is not already registered
if ! aws iam list-open-id-connect-providers | grep -qF "${OIDC_ISSUER}"; then
  eksctl utils associate-iam-oidc-provider \
      --cluster "${K8S_CLUSTER_NAME}" --approve
fi
```

5. Create a Trust Relationship

```bash
cat >iam_trust.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${AWS_ACC_ID}:oidc-provider/${OIDC_ISSUER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_ISSUER}:aud": "sts.amazonaws.com",
          "${OIDC_ISSUER}:sub": "system:serviceaccount:${K8S_NAMESPACE}:${K8S_SA_NAME}"
        }
      }
    }
  ]
}
EOF

aws iam create-role \
  --role-name ${IAM_ROLE_NAME} \
  --assume-role-policy-document file://iam_trust.json
```

6. Get the ARN of the IAM role

```bash
IAM_ROLE_ARN=$(aws iam get-role --role-name ${IAM_ROLE_NAME} --query "Role.Arn" --output text)
echo "${IAM_ROLE_ARN}"
```

#### Resource Access - Amazon S3

Highflame requires a few S3 buckets, which are listed in the [Highflame service variables](https://github.com/highflame-ai/highflame-iac/blob/main/docs/service-vars.md) list.

For each S3 bucket, add this policy to the IAM role created above:

<pre class="language-bash"><code class="lang-bash">{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [ "s3:*" ],
      "Resource": [
        "arn:aws:s3:::${S3_BUCKET_NAME}/*",
        "arn:aws:s3:::${S3_BUCKET_NAME}"
      ]
    }
  ]
}
</code></pre>
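
Since the same policy has to be attached once per bucket, it can help to script it. The sketch below renders the policy above for a given bucket and attaches it as an inline policy with `aws iam put-role-policy`. The function names and the `highflame-s3-<bucket>` policy naming are hypothetical choices, not part of the Highflame tooling.

```bash
#!/usr/bin/env bash
# Render the per-bucket S3 policy for one bucket name.
render_s3_policy() {
  local bucket="$1"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [ "s3:*" ],
      "Resource": [
        "arn:aws:s3:::${bucket}/*",
        "arn:aws:s3:::${bucket}"
      ]
    }
  ]
}
EOF
}

# Attach the rendered policy for every bucket passed after the role name.
# The inline policy name "highflame-s3-<bucket>" is an assumed convention.
attach_s3_policies() {
  local role="$1"; shift
  local bucket
  for bucket in "$@"; do
    aws iam put-role-policy \
      --role-name "${role}" \
      --policy-name "highflame-s3-${bucket}" \
      --policy-document "$(render_s3_policy "${bucket}")"
  done
}
```

For example, `attach_s3_policies "${IAM_ROLE_NAME}" bucket-one bucket-two` would attach two inline policies to the role created earlier; the bucket names here are placeholders.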

### Logging Setup

Fluent Bit is deployed as a **DaemonSet** in the EKS cluster to collect logs from each node. It is configured to:

* Tail logs from `/var/log/containers/*.log` for each microservice deployed in the EKS cluster
* Parse Kubernetes logs using the Kubernetes filter and create an individual log group with the Highflame service name
* Forward logs to **Amazon CloudWatch Logs** under a log stream whose name matches the Highflame service name

**IAM Permissions** for Fluent Bit (via IRSA) are configured to allow writing to CloudWatch Logs.

The rest of this section describes how to deploy **Fluent Bit** as a DaemonSet on an Amazon EKS cluster using **Helm 3**, with logs shipped to **Amazon CloudWatch Logs**.

#### Overview

Fluent Bit is a lightweight log processor and forwarder built for containerized environments. On EKS, we deploy it as a **DaemonSet**, so a single pod runs on every node and tails:

* `/var/log/containers/*.log` — all container stdout/stderr
* `/var/log/messages` — node systemd / kubelet logs (optional)
* `/var/log/kube-apiserver-audit.log` — audit logs (optional)

Logs are enriched with Kubernetes metadata (pod, namespace, labels) and shipped to a configured **output** — CloudWatch Logs by default.

#### IAM Setup (IRSA)

Fluent Bit needs permission to write to CloudWatch Logs. You can either rely on the EKS worker node permissions listed in the Prerequisites, or create a custom IAM policy and role and pass the role to the Fluent Bit Helm deployment through **IAM Roles for Service Accounts (IRSA)**, which avoids static keys.
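
If you take the custom-role route, the two pieces involved can be sketched as follows: a CloudWatch Logs policy for the `cloudwatch_logs` output, and the Helm `--set` argument that annotates the chart's service account with the role ARN. The function names are hypothetical, `"Resource": "*"` is a simplification you may want to narrow to your log group ARN, and the role itself can be created the same way as in the IAM Role section, with a trust policy scoped to the Fluent Bit service account.

```bash
#!/usr/bin/env bash
# Minimal CloudWatch Logs policy for the cloudwatch_logs output (a sketch).
# logs:PutRetentionPolicy is only needed when log_retention_days is set.
render_fluentbit_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:PutRetentionPolicy"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}

# Build the Helm --set argument that adds the IRSA annotation to the
# chart's service account. Dots in the annotation key are escaped for Helm.
irsa_helm_arg() {
  local role_arn="$1"
  printf '%s\n' "serviceAccount.annotations.eks\\.amazonaws\\.com/role-arn=${role_arn}"
}
```

The argument can then be appended to the install command, for example `--set "$(irsa_helm_arg "${FLUENT_BIT_ROLE_ARN}")"`, where `FLUENT_BIT_ROLE_ARN` is a placeholder for the ARN of the role you created.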

#### Add the Helm Repository

```bash
helm repo add fluent https://fluent.github.io/helm-charts
helm repo update fluent

# See available chart versions
helm search repo fluent/fluent-bit --versions
```

#### Install Fluent Bit

Before installing, change the following settings in the Helm values file to match your setup:

```yaml
logNamespace: prod
projectName: highflame
region: us-east-1
logRetention: 365
k8sCluster: highflame-prod-eks
```

A complete, annotated `values.yaml` for shipping EKS logs to CloudWatch:

```yaml
## Change it for each namespace
logNamespace: prod
projectName: highflame
region: us-east-1
logRetention: 365
k8sCluster: highflame-prod-eks

## Actual Helm chart Value file https://github.com/fluent/helm-charts/blob/main/charts/fluent-bit/values.yaml
## https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/classic-mode/configuration-file
config:
  service: |
    [SERVICE]
        Daemon            Off
        Flush             {{ .Values.flush }}
        Log_Level         {{ .Values.logLevel }}
        Parsers_File      /fluent-bit/etc/parsers.conf
        Parsers_File      /fluent-bit/etc/conf/custom_parsers.conf
        HTTP_Server       On
        HTTP_Listen       0.0.0.0
        HTTP_Port         {{ .Values.metricsPort }}
        Health_Check      On

  ## https://docs.fluentbit.io/manual/pipeline/inputs
  inputs: |
    [INPUT]
        Name              tail
        Path              /var/log/containers/*_{{ .Values.logNamespace }}*.log
        Parser            docker
        Tag               {{ .Values.projectName }}.*
        multiline.parser  docker, cri
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On

  ## https://docs.fluentbit.io/manual/pipeline/filters
  filters: |
    [FILTER]
        Name                kubernetes
        Match               {{ .Values.projectName }}.*
        Kube_Tag_Prefix     {{ .Values.projectName }}.var.log.containers.
        Merge_Log           On
        Merge_Log_Key       log_processed
        Keep_Log            Off
        K8S-Logging.Parser  On
        K8S-Logging.Exclude On

    [FILTER]
        Name                grep
        Match               {{ .Values.projectName }}.*
        Exclude             log /.*/health*./
        Exclude             msg /.*/health*./

  ## https://docs.fluentbit.io/manual/pipeline/outputs
  outputs: |
    [OUTPUT]
        Name                cloudwatch_logs
        Match               {{ .Values.projectName }}.*
        region              {{ .Values.region }}
        log_group_name      {{ .Values.k8sCluster }}-service-log
        auto_create_group   On
        log_retention_days  {{ .Values.logRetention }}
        log_stream_name     {{ .Values.projectName }}

  ## https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/classic-mode/upstream-servers
  ## This configuration is deprecated, please use `extraFiles` instead.
  upstream: {}

  ## https://docs.fluentbit.io/manual/pipeline/parsers
  customParsers: |
    [PARSER]
        Name docker_no_time
        Format json
        Time_Keep Off
        Time_Key time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
```

Deploy Fluent Bit:

```bash
helm install fluent-bit fluent/fluent-bit \
  --namespace logging \
  --create-namespace \
  --version 0.46.2 \
  --values values.yaml \
  --wait --timeout 5m
```
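
After the install returns, it is worth confirming that the DaemonSet is running and that the log group exists. The sketch below assumes the release name `fluent-bit` and the `logging` namespace from the command above, and that the chart names the DaemonSet after the release; the function names are hypothetical. The log group name follows the `log_group_name` setting in the `[OUTPUT]` section of the values file.

```bash
#!/usr/bin/env bash
# Derive the CloudWatch log group name used by the [OUTPUT] section:
# "<k8sCluster>-service-log".
log_group_name() {
  local cluster="$1"
  printf '%s-service-log\n' "${cluster}"
}

# Wait for the DaemonSet rollout, then list matching log groups.
# Assumes the DaemonSet is called "fluent-bit" in namespace "logging".
verify_fluent_bit() {
  local cluster="$1" region="$2"
  kubectl -n logging rollout status daemonset/fluent-bit --timeout=5m
  aws logs describe-log-groups \
    --region "${region}" \
    --log-group-name-prefix "$(log_group_name "${cluster}")" \
    --query 'logGroups[].logGroupName' --output text
}
```

For the example values, `verify_fluent_bit highflame-prod-eks us-east-1` would look for a log group named `highflame-prod-eks-service-log`. The group is created on first write, so it may take a short while to appear.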

#### References

* Fluent Bit Helm chart: <https://github.com/fluent/helm-charts/tree/main/charts/fluent-bit>
* Fluent Bit docs: [https://docs.fluentbit.io](https://docs.fluentbit.io/)
* EKS logging best practices: <https://docs.aws.amazon.com/eks/latest/best-practices/logging.html>
* CloudWatch output plugin: <https://docs.fluentbit.io/manual/pipeline/outputs/cloudwatch>
* IRSA: <https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html>

### Analytical Database Setup - ClickHouse on EKS

This section is the long-form runbook for installing ClickHouse on an EKS cluster using the [Altinity ClickHouse Operator](https://github.com/Altinity/clickhouse-operator) and an S3-backed [clickhouse-backup](https://github.com/Altinity/clickhouse-backup) sidecar.

#### What gets deployed <a href="#id-1-what-gets-deployed" id="id-1-what-gets-deployed"></a>

The full stack lives entirely inside a dedicated `clickhouse` namespace and consists of six logical components:

<table><thead><tr><th width="187">Component</th><th>Kind</th><th>Source</th><th>Notes</th></tr></thead><tbody><tr><td>Altinity operator</td><td>Helm release <code>ch-operator</code></td><td>Upstream chart</td><td>Watches <code>ClickHouseInstallation</code> (CHI) and <code>ClickHouseKeeperInstallation</code> (CHK) CRDs</td></tr><tr><td>Service account, ConfigMap, Secret</td><td>k8s resources</td><td><code>highflame-clickhouse-deps.yml</code></td><td>IRSA-annotated <code>clickhouse-sa</code>, <code>clickhouse-cm</code>, and <code>clickhouse-secrets</code></td></tr><tr><td>ClickHouse Keeper</td><td><code>ClickHouseKeeperInstallation/ch</code></td><td><code>highflame-clickhouse-values.yml</code></td><td>1 replica, 20Gi PVC. Replaces ZooKeeper</td></tr><tr><td>ClickHouse cluster</td><td><code>ClickHouseInstallation/ch</code></td><td><code>highflame-clickhouse-values.yml</code></td><td>1 shard × 1 replica, 4 CPU / 12 Gi RAM, 512 Gi PVC, <code>clickhouse-backup</code> sidecar</td></tr><tr><td>Backup config</td><td>k8s resources</td><td><code>highflame-clickhouse-backup-config.yml</code></td><td>Backup config for S3 integration</td></tr><tr><td>CronJob + Job</td><td>k8s resources</td><td><code>highflame-clickhouse-addons-values.yml</code></td><td>12-hourly remote backups, 60-backup retention; setup Job applies 3-day TTL on <code>system.*_log</code> tables</td></tr></tbody></table>

Pinned versions (verify before installing):

* Operator chart: `0.25.5`
* `clickhouse/clickhouse-server`: `25.10.2`
* `clickhouse/clickhouse-keeper`: `25.10.2`
* `altinity/clickhouse-backup`: `2.6.39`

After installation, the in-cluster service endpoints are:

* HTTP: `clickhouse-ch.clickhouse.svc.cluster.local:8123`
* Native TCP: `clickhouse-ch.clickhouse.svc.cluster.local:9000`
* Keeper: `keeper-ch.clickhouse.svc.cluster.local:2181`

These match the `CLICKHOUSE_HOST` ConfigMap value and the `zookeeper.nodes` block in the CHI spec.
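
The endpoints above follow a `<prefix>-<installation name>.<namespace>.svc.cluster.local` pattern, which makes them easy to derive and probe in a script. The following is a small sketch with hypothetical function names; the probe uses the ClickHouse HTTP `/ping` endpoint and has to run from a pod inside the cluster, where the service DNS names resolve.

```bash
#!/usr/bin/env bash
# Build the in-cluster DNS name of a service, following the pattern of the
# endpoints listed above (prefix "clickhouse" or "keeper", installation "ch").
service_host() {
  local prefix="$1" name="$2" namespace="$3"
  printf '%s-%s.%s.svc.cluster.local\n' "${prefix}" "${name}" "${namespace}"
}

# Probe the ClickHouse HTTP interface; /ping answers "Ok." when healthy.
# Returns non-zero if the server is unreachable or unhealthy.
clickhouse_ping() {
  local host="$1"
  curl -s --max-time 5 "http://${host}:8123/ping" | grep -q "Ok"
}
```

For example, `clickhouse_ping "$(service_host clickhouse ch clickhouse)"` checks the HTTP endpoint of the installation described in this section.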

#### Prerequisites

Before you start, you need:

1. **An EKS cluster** with `kubectl` context pointing at it (`kubectl config current-context`).
2. **A storage class** that supports `ReadWriteOnce` PVCs — `gp3` is recommended. Both the Keeper and the ClickHouse server use the cluster default unless you uncomment `storageClassName` in the CHI/CHK templates.
3. **An S3 bucket** for remote backups, in the same AWS account as the cluster. Versioning + lifecycle rules are recommended but not required.
4. **An IAM role for the backup Service Account (IRSA)** with permission to `s3:GetObject`, `s3:PutObject`, `s3:DeleteObject`, `s3:ListBucket` on that bucket. The role's trust policy must allow the OIDC provider of the cluster to assume it for `system:serviceaccount:clickhouse:clickhouse-sa`.
5. **`helm` ≥ 3.10**
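
The permissions in item 4 can be expressed as a policy with two statements, because `s3:ListBucket` applies to the bucket ARN while the object actions apply to the keys under it. The sketch below renders such a policy; the function name is hypothetical. The trust policy for the role can follow the same shape as the one in the IAM Role section, with the `sub` condition set to `system:serviceaccount:clickhouse:clickhouse-sa`.

```bash
#!/usr/bin/env bash
# Render a backup policy limited to the four actions listed in item 4.
render_backup_policy() {
  local bucket="$1"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [ "s3:ListBucket" ],
      "Resource": "arn:aws:s3:::${bucket}"
    },
    {
      "Effect": "Allow",
      "Action": [ "s3:GetObject", "s3:PutObject", "s3:DeleteObject" ],
      "Resource": "arn:aws:s3:::${bucket}/*"
    }
  ]
}
EOF
}
```

The output can be saved to a file and attached to the backup role, for example with `aws iam put-role-policy --policy-document file://<file>`.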

#### Config files

{% hint style="info" %}
highflame-clickhouse-deps.yml
{% endhint %}

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    eks.amazonaws.com/role-arn: ""
  name: clickhouse-sa
  namespace: clickhouse
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: clickhouse-cm
  namespace: clickhouse
data:
  CLICKHOUSE_HOST: "clickhouse-ch.clickhouse.svc.cluster.local"
  BACKUP_KEEP: "60"
  CREATE_DB_LIST: ""
  BACKUP_DB_LIST: "highflame"
---
apiVersion: v1
kind: Secret
metadata:
  name: clickhouse-secrets
  namespace: clickhouse
type: Opaque
stringData:
  CH_ADMIN_USERNAME: "highflame_admin"
  CH_ADMIN_PASSWORD: ""
  CH_READONLY_USERNAME: "highflame_readonly"
  CH_READONLY_PASSWORD: ""
  CH_BACKUP_USERNAME: "highflame_backup"
  CH_BACKUP_PASSWORD: ""
```

Defines:

* `ServiceAccount/clickhouse-sa` — annotation `eks.amazonaws.com/role-arn` must be set to the IRSA role ARN created in the prerequisites.
* `ConfigMap/clickhouse-cm` — `CLICKHOUSE_HOST`, `BACKUP_KEEP=60`, `CREATE_DB_LIST` (comma-separated DBs to create on first boot), `BACKUP_DB_LIST=highflame` (comma-separated DBs to back up).
* `Secret/clickhouse-secrets` — admin / readonly / backup usernames + passwords.

You must edit:

| Field                        | Required value                                         |
| ---------------------------- | ------------------------------------------------------ |
| `eks.amazonaws.com/role-arn` | IRSA role ARN                                          |
| `CH_ADMIN_PASSWORD`          | strong password for admin user `highflame_admin`       |
| `CH_READONLY_PASSWORD`       | strong password for readonly user `highflame_readonly` |
| `CH_BACKUP_PASSWORD`         | strong password for backup user `highflame_backup`     |
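
One way to produce the three passwords is to draw them from `/dev/urandom`. This is a minimal sketch with a hypothetical helper name; it restricts the output to alphanumeric characters so the values need no escaping in YAML or on the command line.

```bash
#!/usr/bin/env bash
# Generate a random alphanumeric password of the given length (default 32).
gen_password() {
  local length="${1:-32}" pw=""
  # Keep drawing random bytes until enough alphanumeric characters remain.
  while [ "${#pw}" -lt "${length}" ]; do
    pw+="$(head -c 64 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9')"
  done
  printf '%s\n' "${pw:0:length}"
}

# Example: print one candidate value for each password field.
for field in CH_ADMIN_PASSWORD CH_READONLY_PASSWORD CH_BACKUP_PASSWORD; do
  echo "${field}: $(gen_password 32)"
done
```

Paste the generated values into the `stringData` block of `highflame-clickhouse-deps.yml`, and avoid committing the edited file to version control.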

{% hint style="info" %}
highflame-clickhouse-values.yml
{% endhint %}

Defines the `ClickHouseKeeperInstallation/ch` and `ClickHouseInstallation/ch` custom resources.

```yaml
apiVersion: "clickhouse-keeper.altinity.com/v1"
kind: "ClickHouseKeeperInstallation"

metadata:
  name: "ch"
  namespace: clickhouse

spec:
  configuration:
    clusters:
      - name: cluster
        layout:
          replicasCount: 1
    settings:
      keeper_server/tcp_port: "2181"
      keeper_server/raft_port: "9444"

  defaults:
    templates:
      podTemplate: keeper
      volumeClaimTemplate: keeper

  templates:
    podTemplates:
      - name: keeper
        metadata:
          labels:
            app: clickhouse-keeper
        spec:
          affinity:
            podAntiAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                - labelSelector:
                    matchExpressions:
                      - key: "app"
                        operator: In
                        values:
                          - clickhouse-keeper
                  topologyKey: "kubernetes.io/hostname"
          containers:
            - name: keeper
              image: "clickhouse/clickhouse-keeper:25.10.2"
              command: [ "/usr/bin/clickhouse-keeper", "start" ]
              resources:
                requests:
                  memory: "512Mi"
                  cpu: "500m"
                limits:
                  memory: "1024Mi"
                  cpu: "1000m"
          securityContext:
            fsGroup: 101

    volumeClaimTemplates:
      - name: keeper
        spec:
          # storageClassName: storage_classname
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 20Gi
---
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"

metadata:
  name: "ch"
  namespace: clickhouse

spec: 
  configuration:
    zookeeper:
      nodes:
        - host: "keeper-ch.clickhouse.svc.cluster.local"
          port: 2181
    clusters:
      - name: cluster
        layout:
          shardsCount: 1
          replicasCount: 1
    users:
      highflame_admin/password:
        valueFrom:
          secretKeyRef:
            name: clickhouse-secrets
            key: CH_ADMIN_PASSWORD
      highflame_admin/networks/ip:
        - "0.0.0.0/0"
        - "::/0"
      highflame_admin/profile: default
      highflame_admin/quota: default
      highflame_readonly/password:
        valueFrom:
          secretKeyRef:
            name: clickhouse-secrets
            key: CH_READONLY_PASSWORD
      highflame_readonly/networks/ip:
        - "0.0.0.0/0"
        - "::/0"
      highflame_readonly/profile: readonly
      highflame_readonly/quota: readonly
      highflame_backup/password:
        valueFrom:
          secretKeyRef:
            name: clickhouse-secrets
            key: CH_BACKUP_PASSWORD
      highflame_backup/networks/ip:
        - "0.0.0.0/0"
        - "::/0"
      highflame_backup/profile: backup
      highflame_backup/quota: backup      
    profiles:
      readonly/readonly: 2
      readonly/max_execution_time: 120
      readonly/max_memory_usage: 1000000000
      readonly/max_rows_to_read: 100000000
      readonly/max_result_rows: 10000
      readonly/read_overflow_mode: throw
      readonly/result_overflow_mode: throw
      readonly/timeout_overflow_mode: throw
      backup/readonly: 0
    quotas:
      readonly/interval/duration: 3600
      backup/interval/duration: 3600

  defaults:
    templates:
      podTemplate: clickhouse
      dataVolumeClaimTemplate: clickhouse-data
      # logVolumeClaimTemplate: clickhouse-log

  templates:
    podTemplates:
      - name: clickhouse
        spec:
          serviceAccountName: clickhouse-sa
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:25.10.2
              resources:
                requests:
                  cpu: "4"
                  memory: "12Gi"
                limits:
                  cpu: "4"
                  memory: "12Gi"
              volumeMounts:
                - name: clickhouse-data
                  mountPath: /var/lib/clickhouse
                # - name: clickhouse-log
                #   mountPath: /var/log/clickhouse-server
            - name: clickhouse-backup
              image: altinity/clickhouse-backup:2.6.39
              command:
                - /bin/bash
                - -c
                - "/bin/clickhouse-backup -c /opt/backup-config.yml server"
              volumeMounts:
                - name: clickhouse-data
                  mountPath: /var/lib/clickhouse
                - name: clickhouse-backup-config
                  mountPath: /opt/backup-config.yml
                  subPath: backup-config.yml
                  readOnly: true
          volumes:
            - name: clickhouse-backup-config
              secret:
                secretName: clickhouse-backup-config
                optional: false
          # nodeSelector:
          #   kube/nodegroup: "general"

    volumeClaimTemplates:
      - name: clickhouse-data
        spec:
          # storageClassName: storage_classname
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 512Gi
      # - name: clickhouse-log
      #   spec:
      #     storageClassName: storage_classname
      #     accessModes:
      #       - ReadWriteOnce
      #     resources:
      #       requests:
      #         storage: 100Gi
```

Other useful knobs in this file (edit if your environment differs):

* `templates.podTemplates[clickhouse].spec.containers[clickhouse].resources` — default request and limit are both `cpu: 4`, `memory: 12Gi`.
* `templates.volumeClaimTemplates[clickhouse-data].resources.requests.storage` — default `512Gi`.
* `templates.podTemplates[clickhouse].spec.nodeSelector` — uncomment to pin to a specific nodegroup.

{% hint style="info" %}
highflame-clickhouse-backup-config.yml
{% endhint %}

```yaml
general:
  remote_storage: s3
  disable_progress_bar: false
  backups_to_keep_local: 1
  backups_to_keep_remote: 10
  log_level: info
  allow_empty_backups: false
clickhouse:
  username: highflame_backup
  password: ""
  host: clickhouse-ch.clickhouse.svc.cluster.local
  port: 9000
  disk_mapping: {}
  skip_tables:
    - system.*
  timeout: 5m
  freeze_by_part: false
  secure: false
  skip_verify: false
  sync_replicated_tables: true
  log_sql_queries: false
s3:
  bucket: "S3_BUCKET"
  endpoint: "https://s3.S3_REGION.amazonaws.com"
  region: S3_REGION
  path: highflame-clickhouse
  force_path_style: true
  compression_level: 1
  compression_format: tar
  disable_cert_verification: false
  storage_class: STANDARD
```

This file is rendered into `Secret/clickhouse-backup-config` (mounted into the sidecar at `/opt/backup-config.yml`). Update:

* `S3_BUCKET` — bucket name.
* `S3_REGION` — AWS region (drives both `region` and the `endpoint` URL).
* `clickhouse.username` / `clickhouse.password` to match `CH_BACKUP_USERNAME` / `CH_BACKUP_PASSWORD` from `highflame-clickhouse-deps.yml`.
* The default `path: highflame-clickhouse` is a key prefix inside the bucket — change it per environment if you share a bucket between environments.
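
If you apply these files by hand, the placeholder substitution and Secret creation could look like the sketch below. The function names are hypothetical and this is not necessarily how the Highflame tooling renders the file. The Secret key is set to `backup-config.yml` because the sidecar mounts it with `subPath: backup-config.yml`. The `password` field still has to be filled in separately.

```bash
#!/usr/bin/env bash
# Replace the S3_BUCKET and S3_REGION placeholders in the backup config
# template and print the result to stdout.
render_backup_config() {
  local template="$1" bucket="$2" region="$3"
  sed -e "s/S3_BUCKET/${bucket}/g" -e "s/S3_REGION/${region}/g" "${template}"
}

# Create the Secret that the clickhouse-backup sidecar mounts.
# The key name must match the subPath used in the pod template.
create_backup_secret() {
  local rendered="$1"
  kubectl -n clickhouse create secret generic clickhouse-backup-config \
    --from-file=backup-config.yml="${rendered}"
}
```

A typical sequence would be `render_backup_config highflame-clickhouse-backup-config.yml my-bucket us-east-1 > rendered.yml` followed by `create_backup_secret rendered.yml`, where the bucket name and region are placeholders.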

{% hint style="info" %}
highflame-clickhouse-addons-values.yml
{% endhint %}

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: clickhouse-backup
  namespace: clickhouse
spec:
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 1
  failedJobsHistoryLimit: 1
  schedule: "0 */12 * * *"
  jobTemplate:
    spec:
      activeDeadlineSeconds: 14400
      backoffLimit: 1
      template:
        spec:
          serviceAccountName: clickhouse-sa
          restartPolicy: Never
          containers:
            - name: clickhouse-backup
              image: altinity/clickhouse-backup:2.6.39
              envFrom:
                - configMapRef:
                    name: clickhouse-cm
              command:
                - /bin/bash
                - -c
                - |
                  set -e
                  BACKUP_CFG="/opt/backup-config.yml"
                  BACKUP_TIME="$(date +%F-%H-%M-%S)"

                  if [[ "$${BACKUP_DB_LIST}" != "" ]] ; then
                    IFS=',' read -r -a DBS <<< "$${BACKUP_DB_LIST}"
                    for db in "$${DBS[@]}" ; do
                      BACKUP_NAME="$${db}--$${BACKUP_TIME}"
                      echo "Creating backup $${BACKUP_NAME}"
                      /bin/clickhouse-backup -c $${BACKUP_CFG} create_remote --tables="$${db}.*" $${BACKUP_NAME}

                      echo "Deleting all local backups"
                      rm -rf /var/lib/clickhouse/backup/$${BACKUP_NAME}

                      echo "Rotating old remote backups - keeping last $${BACKUP_KEEP}"
                      BACKUPS=$$(clickhouse-backup -c $${BACKUP_CFG} list remote | grep -i "$${db}--" | awk '{print $$1}')
                      BKP_COUNT=$$(echo "$${BACKUPS}" | wc -l)

                      if [[ "$${BKP_COUNT}" -le "$${BACKUP_KEEP}" ]] ; then
                        echo "Nothing to delete...($${BKP_COUNT} backups, keep $${BACKUP_KEEP})"
                      else
                        DELETE_COUNT=$$((BKP_COUNT - BACKUP_KEEP))
                        DELETE_LIST=$$(echo "$${BACKUPS}" | head -n $${DELETE_COUNT})

                        for bkp in $${DELETE_LIST} ; do
                          echo "Deleting remote backup: $${bkp}"
                          clickhouse-backup -c $${BACKUP_CFG} delete remote "$${bkp}"
                        done
                      fi
                      echo "Backup job completed for database $${db}"
                    done
                  else
                    echo "Backup is disabled. Current BACKUP_DB_LIST is empty"
                  fi
              volumeMounts:
                - name: clickhouse-data
                  mountPath: /var/lib/clickhouse
                - name: clickhouse-backup-config
                  mountPath: /opt/backup-config.yml
                  subPath: backup-config.yml
                  readOnly: true
          volumes:
            - name: clickhouse-data
              persistentVolumeClaim:
                claimName: clickhouse-data
            - name: clickhouse-backup-config
              secret:
                secretName: clickhouse-backup-config
                optional: false
          affinity:
            podAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                - labelSelector:
                    matchExpressions:
                      - key: clickhouse.altinity.com/chi
                        operator: In
                        values:
                          - ch
                  topologyKey: "kubernetes.io/hostname"
---
apiVersion: batch/v1
kind: Job
metadata:
  name: clickhouse-setup
  namespace: clickhouse
spec:
  completions: 1
  parallelism: 1
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      initContainers:
        - name: wait
          image: curlimages/curl:8.17.0
          command:
            - /bin/sh
            - -c
            - |
              echo "Waiting for ClickHouse to be ready..."

              until curl -s "http://$${CLICKHOUSE_HOST}:8123/ping" | grep -q "Ok" ; do
                echo "ClickHouse not ready yet. Waiting..."
                sleep 5
              done

              echo "ClickHouse is ready!"
          envFrom:
            - configMapRef:
                name: clickhouse-cm
      containers:
        - name: setup
          image: clickhouse/clickhouse-server:25.10.2
          command:
            - /bin/bash
            - -c
            - |
              set -e

              export SYS_DB_NAME="system"
              export LOG_TABLES=($(clickhouse-client \
                  --host="$${CLICKHOUSE_HOST}" \
                  --user="$${CLICKHOUSE_USER}" \
                  --password="$${CLICKHOUSE_PASSWORD}" \
                  --query="SELECT name FROM $${SYS_DB_NAME}.tables WHERE database = '$${SYS_DB_NAME}' AND (engine LIKE '%Log' OR name LIKE '%_log')"))

              if [[ "$${CREATE_DB_LIST}" != "" ]] ; then
                IFS=',' read -r -a DBS <<< "$${CREATE_DB_LIST}"
                for db in "$${DBS[@]}" ; do
                  echo "Creating DB: $${db}"
                  clickhouse-client \
                    --host="$${CLICKHOUSE_HOST}" \
                    --user="$${CLICKHOUSE_USER}" \
                    --password="$${CLICKHOUSE_PASSWORD}" \
                    --query="CREATE DATABASE IF NOT EXISTS $${db}"
                done
              else
                echo "Database creation is disabled. Current CREATE_DB_LIST is empty"
              fi

              echo "DB creation is completed...!"

              for log_tbl in $${LOG_TABLES[@]} ; do
                echo "Applying $${SYS_DB_NAME} log TTL to $${log_tbl}"

                table_name="$${log_tbl##*.}"

                exists=$$(clickhouse-client \
                          --host="$${CLICKHOUSE_HOST}" \
                          --user="$${CLICKHOUSE_USER}" \
                          --password="$${CLICKHOUSE_PASSWORD}" \
                          --query="SELECT count() FROM $${SYS_DB_NAME}.tables WHERE database='$${SYS_DB_NAME}' AND name='$${table_name}'")

                if [ "$${exists}" -eq 1 ]; then
                  clickhouse-client \
                    --host="$${CLICKHOUSE_HOST}" \
                    --user="$${CLICKHOUSE_USER}" \
                    --password="$${CLICKHOUSE_PASSWORD}" \
                    --query "
                              SET alter_sync = 0;
                              ALTER TABLE $${SYS_DB_NAME}.$${log_tbl} MODIFY TTL event_date + INTERVAL 3 DAY;
                              ALTER TABLE $${SYS_DB_NAME}.$${log_tbl} MATERIALIZE TTL;
                            "
                else
                  echo "Table $${log_tbl} does not exist yet. Skipping."
                fi
              done

              for log_tbl in $${LOG_TABLES[@]} ; do
                echo "Verifying the TTL definition of table $${log_tbl}"
                table_name="$${log_tbl##*.}"

                clickhouse-client \
                  --host="$${CLICKHOUSE_HOST}" \
                  --user="$${CLICKHOUSE_USER}" \
                  --password="$${CLICKHOUSE_PASSWORD}" \
                  --query "
                            SELECT name, engine_full
                            FROM $${SYS_DB_NAME}.tables
                            WHERE database='$${SYS_DB_NAME}' AND name IN ('$${table_name}');
                          "
              done
          envFrom:
            - configMapRef:
                name: clickhouse-cm
          env:
            - name: CLICKHOUSE_USER
              valueFrom:
                secretKeyRef:
                  name: clickhouse-secrets
                  key: CH_ADMIN_USERNAME
            - name: CLICKHOUSE_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: clickhouse-secrets
                  key: CH_ADMIN_PASSWORD
```

Defines:

* `CronJob/clickhouse-backup`: runs every 12 hours (`0 */12 * * *`), iterates over `BACKUP_DB_LIST` from the ConfigMap, calls `clickhouse-backup create_remote` for each database, and prunes remote backups beyond the `BACKUP_KEEP` limit.
* `Job/clickhouse-setup`: a one-shot Job that waits for ClickHouse to answer `/ping`, creates the databases listed in `CREATE_DB_LIST`, and applies `MODIFY TTL event_date + INTERVAL 3 DAY` to every log table in the `system` database (tables whose engine ends in `Log` or whose name ends in `_log`), then materializes the TTL.
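To preview what the setup Job would do for a given `CREATE_DB_LIST` value, the list-splitting step can be reproduced locally. The following is a minimal sketch that mirrors the Job's comma-splitting logic. The function name `preview_create_db` and the sample database names are illustrative assumptions, and nothing is sent to ClickHouse.

```bash
#!/usr/bin/env bash
# Sketch: print the statements the setup Job would issue for a CREATE_DB_LIST value.
# preview_create_db and the sample names below are hypothetical, not part of the manifest.
preview_create_db() {
  local list="$1"
  local db
  local -a dbs

  if [[ -z "$list" ]]; then
    # Same message the Job prints when the list is empty.
    echo "Database creation is disabled. Current CREATE_DB_LIST is empty"
    return 0
  fi

  # Split on commas, exactly as the Job does with IFS=','.
  IFS=',' read -r -a dbs <<< "$list"
  for db in "${dbs[@]}"; do
    echo "CREATE DATABASE IF NOT EXISTS ${db}"
  done
}

preview_create_db "db_one,db_two"
```

Because this runs directly in a shell rather than inside the manifest, variables use a single `$`.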

#### Install — step by step

Run from the repo root unless noted otherwise.

Namespace

```bash
kubectl create ns clickhouse
```

Operator

```bash
helm repo add altinity https://helm.altinity.com
helm repo update

helm install ch-operator altinity/altinity-clickhouse-operator \
  --namespace clickhouse \
  --version 0.25.5
```

Verify that the operator pod is running and the CRDs are installed:

```bash
kubectl -n clickhouse get pods -l app=clickhouse-operator
kubectl get crds | grep -E 'clickhouseinstallation|clickhousekeeper'
```

You should see:

* `clickhouseinstallations.clickhouse.altinity.com`
* `clickhouseinstallationtemplates.clickhouse.altinity.com`
* `clickhousekeeperinstallations.clickhouse-keeper.altinity.com`
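If you prefer an explicit check over reading `grep` output, the three CRDs listed above can be tested one by one. This is a sketch only. The function name `check_operator_crds` and the `KUBECTL` override variable are assumptions introduced here for illustration, and the function returns non-zero when any CRD is missing.

```bash
#!/usr/bin/env bash
# KUBECTL is a hypothetical override (useful for dry runs); it defaults to kubectl.
KUBECTL="${KUBECTL:-kubectl}"

# Report each expected operator CRD as found or missing.
check_operator_crds() {
  local crd missing=0
  for crd in \
    clickhouseinstallations.clickhouse.altinity.com \
    clickhouseinstallationtemplates.clickhouse.altinity.com \
    clickhousekeeperinstallations.clickhouse-keeper.altinity.com
  do
    if "$KUBECTL" get crd "$crd" >/dev/null 2>&1; then
      echo "found:   $crd"
    else
      echo "missing: $crd"
      missing=1
    fi
  done
  return "$missing"
}
```

Call `check_operator_crds` after the Helm install; a non-zero exit status means at least one CRD was not registered.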

Dependencies (SA + ConfigMap + Secret)

```bash
kubectl apply -f highflame-clickhouse-deps.yml
```

Keeper + ClickHouse cluster

```bash
kubectl apply -f highflame-clickhouse-values.yml
```

Verify the deployment:

```bash
kubectl -n clickhouse get pods
```
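Rather than polling `get pods` by hand, you can block until the ClickHouse pods report `Ready`. The sketch below assumes the installation is named `ch`, which is inferred from the `clickhouse.altinity.com/chi` label value used in the backup manifest. The function name, the `KUBECTL` override, and the 600-second timeout are illustrative choices, not values from this guide.

```bash
#!/usr/bin/env bash
# KUBECTL is a hypothetical override (useful for dry runs); it defaults to kubectl.
KUBECTL="${KUBECTL:-kubectl}"

wait_for_clickhouse() {
  # Show the ClickHouseInstallation resources and the status the operator reports.
  "$KUBECTL" --namespace clickhouse get clickhouseinstallations || return 1

  # Block until every pod carrying the installation label is Ready.
  # Assumption: the installation is named "ch". kubectl wait fails
  # immediately if no pod matches the selector yet.
  "$KUBECTL" --namespace clickhouse wait pod \
    --selector clickhouse.altinity.com/chi=ch \
    --for=condition=Ready \
    --timeout=600s
}
```

Run `wait_for_clickhouse` a short while after applying the manifest, once the operator has had time to create the pods.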

Add-ons (backup CronJob + setup Job)

```bash
kubectl apply -f highflame-clickhouse-addons-values.yml
```

The setup Job runs once and exits with status `Completed`. The CronJob fires every 12 hours.
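To confirm both add-ons after applying `highflame-clickhouse-addons-values.yml`, a check along these lines may help. It is a sketch under stated assumptions: the function name `verify_addons`, the `KUBECTL` override, and the 300-second timeout are introduced here for illustration. The resource and container names come from the manifest above.

```bash
#!/usr/bin/env bash
# KUBECTL is a hypothetical override; set KUBECTL=echo to print the commands
# instead of running them. It defaults to kubectl.
KUBECTL="${KUBECTL:-kubectl}"
NS="clickhouse"

verify_addons() {
  # Block until the one-shot setup Job reports the Complete condition.
  # The Job uses backoffLimit: 0, so a failed run is never retried and this times out.
  "$KUBECTL" --namespace "$NS" wait --for=condition=complete \
    job/clickhouse-setup --timeout=300s || return 1

  # Print the setup container's log (database creation and TTL steps).
  "$KUBECTL" --namespace "$NS" logs job/clickhouse-setup --container setup || return 1

  # Confirm the backup CronJob exists and show its schedule and last run time.
  "$KUBECTL" --namespace "$NS" get cronjob clickhouse-backup
}
```

If `verify_addons` times out on the first step, inspect the Job's pod with `kubectl -n clickhouse describe job clickhouse-setup` before re-applying the manifest.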

#### ClickHouse S3 backup management (from the backup container)

List the remote backups

```bash
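# Open a shell in the clickhouse-backup container of the ClickHouse pod
kubectl -n clickhouse exec -it chi-ch-cluster-0-0-0 -c clickhouse-backup -- /bin/bash
# Then, inside the container:
/bin/clickhouse-backup -c /opt/backup-config.yml list remote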
```

Restore from a remote backup (for disaster recovery). Replace `BACKUP_NAME` with a name from the list above.

```bash
/bin/clickhouse-backup -c /opt/backup-config.yml restore_remote BACKUP_NAME
```

### Highflame service deployment

Follow the guide below to deploy Highflame services to your Kubernetes cluster using Helm charts. It covers both fresh installations and upgrades of existing environments.

{% content-ref url="/pages/SOHIdzebu16rS55b9pZg" %}
[Highflame Services](/deployment-guides/highflame-services.md)
{% endcontent-ref %}

