# Support

## BYOC Customer Access via Tailscale

> Operating `kubectl` against private customer clusters across AWS, GCP, and Azure — without VPNs, public endpoints, or per-customer firewall rules.

### Why this exists

Highflame is deployed into customer-owned VPCs across AWS, GCP, and Azure (BYOC). The K8s API server and Studio UI are private — reachable only from inside the customer's network. Our support and SRE teams still need:

* `kubectl` against each customer's private cluster
* Browser access to each customer's Studio UI (optional, for debugging)

The Tailscale Kubernetes operator covers both needs. The customer installs **one Helm chart**. From there, any Highflame-internal service (a Kubernetes resource) can be exposed with a single Service annotation: no new NLBs, no PrivateLink endpoints, no inbound firewall rules.

### Architecture

<figure><img src="/files/Oj0IY61UX4YQnkOq8wqe" alt=""><figcaption></figcaption></figure>

### How it works

The Tailscale operator runs as a pod inside the customer's cluster. It dials **outbound** to Tailscale (UDP 41641 or TCP 443 DERP relay) and registers as a node on **our** tailnet, tagged with a per-customer tag. With `apiServerProxyConfig.mode="true"`, it also exposes the cluster's K8s API at a stable tailnet hostname like `customer-prod-kube.tail-scale.ts.net` and proxies authenticated requests into the API server using our tailnet identity. The customer revokes access by uninstalling the operator; we revoke it by removing the customer's tags from our tailnet ACL. No inbound firewall rules ever touch the customer's network.
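From the Highflame side, reaching a cluster through the API-server proxy could look roughly like the sketch below. It assumes the engineer's workstation is already logged in to the Highflame tailnet and that the operator hostname follows the `<customer>-<env>-kube` pattern set in the Helm install. `connect_cluster` is a hypothetical helper name, not part of any tool.

```bash
# Sketch only: connect_cluster is a hypothetical helper, not a Highflame or Tailscale command.
# Assumes the workstation is already on the tailnet and kubectl is installed.

connect_cluster() {
  local customer="$1" deploy_env="$2"
  local host="${customer}-${deploy_env}-kube"   # mirrors operatorConfig.hostname

  # Adds a kubeconfig entry that points kubectl at the operator's API-server proxy.
  tailscale configure kubeconfig "$host" || return 1

  # From here on, ordinary kubectl calls travel over the tailnet.
  kubectl get namespaces
}

# Usage:
#   connect_cluster customer prod
```

What the final `kubectl` call may return depends on the ClusterRole bound to `highflame-sre` in that cluster.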

### Security properties — what a customer's security team needs to know

**Customer network — perimeter stays untouched**

* **No inbound firewall rule is ever added on the customer side.** The operator pod dials outbound only; there is no listener on the customer's VPC reachable from the public internet.
* **No public IP, NLB, or PrivateLink endpoint is provisioned** for the K8s API server or Studio. They remain on private cluster addresses.
* **Egress is narrowly scoped** — only `*.tailscale.com:443/tcp` (plus optional UDP for direct peer-to-peer). See Customer egress requirements. The operator does not make arbitrary internet calls.
* **End-to-end WireGuard encryption.** Tailscale's coordination and DERP servers route ciphertext only; they cannot decrypt `kubectl` traffic.
* **No third-party data plane.** Traffic flows through Tailscale's coordination/DERP infrastructure only. No CDN, no inspection proxy, no shared bastion.

**Access — minimum-privilege, identity-bound**

* **Only named Highflame SRE engineers** can reach the cluster, authenticated via their Tailscale identity. No shared bearer tokens, no static API keys the customer has to rotate.
* **Read-only by default.** Highflame SRE is bound to a custom `highflame-sre` ClusterRole that excludes `secrets`, RBAC objects, and `pods/exec` / `pods/portforward`. The scope is configurable per cluster.
* **Per-customer isolation.** Each customer has its own OAuth clients and per-customer tag chain (`tag:k8s-operator-<customer>`, `tag:k8s-<customer>`, `tag:cust-<customer>`) — shared across that customer's environments (dev/staging/prod), distinct between customers.
* **Short-lived auth keys.** The operator's OAuth client mints ephemeral, single-use auth keys per device registration. No long-lived credential is ever stored on a registered tailnet node — only the OAuth client secret (held in a single in-cluster Kubernetes Secret).
* **All access is logged in the K8s audit log** as the impersonated group `highflame-sre`. The customer's existing audit pipeline captures every request without changes.

**Visibility — opt-in service exposure**

* **Nothing is exposed by default.** A Service inside the cluster is reachable from Highflame's tailnet only if the customer opts it in, either with a `tailscale.com/expose: "true"` annotation on the Service or with a Tailscale `Ingress` resource pointing at it. Removing the annotation or deleting the Ingress removes the exposure.
* **Authenticated K8s API access only.** The operator's API-server proxy rejects connections without a valid Tailscale identity and a `tailscale.com/cap/kubernetes` impersonation grant. Anonymous TCP from the tailnet is dropped, not just unauthorized API calls.
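The opt-in and opt-out described above can be sketched with plain `kubectl annotate` calls. The wrapper names `expose_service` and `unexpose_service`, and the namespace and Service names in the usage lines, are hypothetical; only the `tailscale.com/expose` annotation comes from this page.

```bash
# Sketch only: expose_service / unexpose_service are hypothetical wrapper names.

expose_service() {
  local namespace="$1" service="$2"
  # Opt the Service in by setting the annotation to "true".
  kubectl annotate service "$service" --namespace "$namespace" \
    tailscale.com/expose=true --overwrite
}

unexpose_service() {
  local namespace="$1" service="$2"
  # A trailing dash after the key removes the annotation, withdrawing the exposure.
  kubectl annotate service "$service" --namespace "$namespace" \
    tailscale.com/expose-
}

# Usage (placeholder names):
#   expose_service highflame studio
#   unexpose_service highflame studio
```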

**Revocation — either side can cut access independently**

* **Customer kill switch:** `helm uninstall tailscale-operator -n tailscale`. Every Highflame device on the tailnet loses routing to the cluster within seconds. No Highflame coordination required.
* **Highflame kill switch:** deleting the customer's tag entries from Highflame's tailnet ACL revokes access for all SRE engineers in one edit. No customer coordination required.
* **Per-engineer revocation:** removing an SRE's email from `group:highflame-sre` in the tailnet ACL revokes their access to every customer cluster simultaneously, in one edit, without touching any customer.

**Audit & monitoring**

* **K8s API server audit log** — every `kubectl` call shows up under impersonated identity `highflame-sre` on the customer's existing audit pipeline.
* **Tailscale audit log** — every Highflame-side connection (source identity → destination tag, timestamp) is recorded in Highflame's tailnet (shareable on request).
* **Tailscale device registration log** — every operator/proxy pod registration is logged with its OAuth client ID, so an unexpected device join can be traced to a specific credential.

**Supply chain**

* **Open-source operator.** Image source: <https://github.com/tailscale/tailscale/tree/main/cmd/k8s-operator>. Auditable, reproducible.
* **Pinned image tag.** The helm install pins both operator and proxy images to a specific Tailscale release (currently `v1.96.5`). No `latest`, no auto-upgrade — upgrades require an explicit `helm upgrade` with a new tag.
* **Single namespace footprint.** All operator and proxy pods run in the `tailscale` namespace. The customer can apply a NetworkPolicy to restrict the operator's in-cluster reach further, if desired (e.g., limit it to specific namespaces).
* **Single chart, single Service annotation.** No CRDs the customer has to maintain, no sidecars injected into customer workloads.
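A customer who wants to confirm the pinned-tag claim on a running cluster could use a check along these lines. `unpinned_images` is a hypothetical helper; it only compares the tag suffix, so an image referenced by digest would be reported as unpinned.

```bash
# Sketch only: unpinned_images is a hypothetical helper.
# Reads image references (one per line) on stdin, prints any that do not end
# in the expected tag, and returns non-zero if at least one was printed.

unpinned_images() {
  local expected_tag="$1" image found=0
  while IFS= read -r image || [ -n "$image" ]; do
    [ -z "$image" ] && continue
    case "$image" in
      *":${expected_tag}") ;;            # pinned to the expected release
      *) echo "$image"; found=1 ;;
    esac
  done
  [ "$found" -eq 0 ]
}

# Usage against a live cluster:
#   kubectl get pods -n tailscale \
#     -o jsonpath='{range .items[*]}{range .spec.containers[*]}{.image}{"\n"}{end}{end}' \
#     | unpinned_images v1.96.5
```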

**Cryptographic & identity properties**

* **WireGuard (Noise IK)** for the data plane between tailnet peers.
* **TLS 1.3 with publicly-trusted (Let's Encrypt) certificates** for the API-server proxy hostname. The certificate is renewed automatically by the operator; no manual rotation.

### Customer egress requirements

There are **no inbound** firewall rules to add on the customer side, but the operator pod must reach Tailscale outbound. Most BYOC customers run default-deny egress, so they need to allowlist the destinations below before the helm install will succeed.

**Minimum allowlist (TCP only — works in every environment):**

| Destination                                    | Port | Protocol | Purpose                                                               |
| ---------------------------------------------- | ---- | -------- | --------------------------------------------------------------------- |
| `controlplane.tailscale.com`                   | 443  | TCP      | Coordination server (key exchange, peer discovery)                    |
| `login.tailscale.com`                          | 443  | TCP      | OAuth auth during operator startup                                    |
| `derp1.tailscale.com` … `derp30.tailscale.com` | 443  | TCP      | DERP relay (encrypted fallback over HTTPS)                            |
| `pkgs.tailscale.com`                           | 443  | TCP      | Operator image pulls (skip if image is mirrored to customer registry) |

A single wildcard rule — `*.tailscale.com:443/tcp` — covers all four and is what most customers ultimately write.
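Before running the install, the allowlist can be checked with a small preflight like the one below. This is a sketch: `check_egress` is a hypothetical helper, and it assumes `curl` is available on a host or pod that shares the cluster's egress path.

```bash
# Sketch only: check_egress is a hypothetical helper. Run it from somewhere that
# uses the same egress rules as the cluster, otherwise the result says nothing
# about what the operator pod can reach.

check_egress() {
  local host rc=0
  for host in "$@"; do
    # Only the TCP/TLS connection matters here; the HTTP status code is ignored.
    if curl --silent --output /dev/null --connect-timeout 5 "https://${host}/"; then
      echo "ok   ${host}"
    else
      echo "FAIL ${host}"
      rc=1
    fi
  done
  return "$rc"
}

# Usage:
#   check_egress controlplane.tailscale.com login.tailscale.com pkgs.tailscale.com
```

A `FAIL` line can also indicate a TLS-inspecting proxy on the egress path, since `curl` rejects certificates it does not trust.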

### One-time setup (Highflame side)

The Highflame team manages this step. Highflame engineers can refer to this [doc](https://github.com/highflame-ai/highflame-cloud/blob/main/reference/docs/tailscale-k8s-access.md).

### Install the operator

#### Prerequisites <a href="#prerequisites" id="prerequisites"></a>

You will need the following before starting.

#### Tooling on your workstation

| Tool      | Minimum version | Notes                                 |
| --------- | --------------- | ------------------------------------- |
| `kubectl` | 1.32            | Configured against the target cluster |
| `helm`    | 3.x             | For deploying the Helm chart          |

#### Credentials and licenses

You will need the following from your Highflame representative:

* **Highflame Tailscale Credentials** — `OAUTH_CLIENT_ID`, `OAUTH_CLIENT_SECRET`

#### Helm deployment *(per environment)*

All values are passed inline, so there are no values files to manage on the customer side. The customer runs the following against their target `kube-context`:

```bash
helm repo add tailscale https://pkgs.tailscale.com/helmcharts
helm repo update

CUSTOMER=customer                                          # ← customer name (used in TAGS, same for all envs)
DEPLOY_ENV=prod                                            # ← per-env identifier (used in HOSTNAMES, unique per env)
OAUTH_CLIENT_ID="<customer-prod oauth client id>"
OAUTH_CLIENT_SECRET="<customer-prod oauth client secret>"

helm upgrade --install tailscale-operator tailscale/tailscale-operator \
  --namespace tailscale --create-namespace \
  --version 1.96.5 \
  --set-string oauth.clientId="${OAUTH_CLIENT_ID}" \
  --set-string oauth.clientSecret="${OAUTH_CLIENT_SECRET}" \
  --set-string apiServerProxyConfig.mode="true" \
  --set operatorConfig.hostname="${CUSTOMER}-${DEPLOY_ENV}-kube" \
  --set "operatorConfig.defaultTags[0]=tag:cust-${CUSTOMER}" \
  --set-string proxyConfig.defaultTags="tag:k8s-${CUSTOMER}" \
  --set operatorConfig.image.tag="v1.96.5" \
  --set proxyConfig.image.tag="v1.96.5" \
  --wait
```

| Variable                 | Used in                                             | Same across envs of the customer?                                                 |
| ------------------------ | --------------------------------------------------- | --------------------------------------------------------------------------------- |
| `CUSTOMER`               | `tag:cust-…`, `tag:k8s-…` (Tailscale tags)          | ✅ Yes: `customer` for dev/prod/test                                               |
| `DEPLOY_ENV`             | `operatorConfig.hostname` (`<customer>-<env>-kube`) | ❌ No: each env gets a distinct tailnet hostname so SRE can target it specifically |
| `OAUTH_CLIENT_ID/SECRET` | helm `oauth.*` values                               | ❌ No: one per env (rotate independently)                                          |
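To see what a given `CUSTOMER` / `DEPLOY_ENV` pair turns into before installing, a preview such as the following may help. `preview_names` is a hypothetical helper, and the validation rests on an assumption: that the tailnet hostname should be a valid DNS label (lowercase letters, digits, and hyphens, at most 63 characters).

```bash
# Sketch only: preview_names is a hypothetical helper, not part of the chart.
# Assumption: the hostname should be a valid DNS label.

preview_names() {
  local LC_ALL=C                      # keep a-z an ASCII-only range
  local customer="$1" deploy_env="$2"
  local host="${customer}-${deploy_env}-kube"

  case "$host" in
    *[!a-z0-9-]*) echo "invalid characters in hostname: ${host}" >&2; return 1 ;;
  esac
  if [ "${#host}" -gt 63 ]; then
    echo "hostname longer than 63 characters: ${host}" >&2
    return 1
  fi

  echo "hostname:     ${host}"
  echo "operator tag: tag:cust-${customer}"
  echo "proxy tag:    tag:k8s-${customer}"
}

# Usage:
#   preview_names customer prod
```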

Verify the operator pod is `Running`:

```bash
kubectl get pods -n tailscale
# operator-7c5b8f8b6c-xxxxx   1/1   Running
```

The node should also appear in the Tailscale machine dashboard; the Highflame representative will verify this for you.
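For scripted installs, the manual pod check can be replaced by waiting on the rollout. The sketch below assumes the operator Deployment is named `operator`, inferred from the pod name in the sample output above; `wait_for_operator` is a hypothetical helper.

```bash
# Sketch only: wait_for_operator is a hypothetical helper.
# Assumption: the Deployment is named "operator" in the "tailscale" namespace.

wait_for_operator() {
  local timeout="${1:-120s}"
  # Blocks until the Deployment has rolled out, or the timeout expires.
  kubectl rollout status deployment/operator --namespace tailscale --timeout="$timeout" || {
    # On failure, print recent operator logs to help with diagnosis.
    kubectl logs deployment/operator --namespace tailscale --tail=20
    return 1
  }
}

# Usage:
#   wait_for_operator        # default 120s
#   wait_for_operator 5m
```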

#### Apply K8s RBAC for the tailnet identity

The customer applies the following manifest (one file, **identical across all customers**):

```yaml
# rbac-highflame-sre.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: highflame-sre-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind:     ClusterRole
  name:     cluster-admin          # "cluster-admin" or "view"
subjects:
  - kind:    Group
    name:    highflame-sre
    apiGroup: rbac.authorization.k8s.io
```

{% hint style="info" %}
Choose the role according to the support requirements:\
\
\- `cluster-admin`: full permissions on the K8s cluster\
\- `view`: read-only permissions on the K8s cluster
{% endhint %}

```bash
kubectl apply -f rbac-highflame-sre.yaml
```

This binding grants permissions to the group `highflame-sre`. The operator's proxy impersonates that group when it forwards requests to the K8s API server, and the tailnet ACL grant maps the group to our SRE team.
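After applying the binding, the customer can check what the group is allowed to do by impersonating it locally. This is a sketch: `sre_can_i` is a hypothetical helper, `support-check` is a placeholder user name (kubectl requires `--as` whenever `--as-group` is set), and the caller needs impersonation rights on the cluster.

```bash
# Sketch only: sre_can_i is a hypothetical helper around "kubectl auth can-i".

sre_can_i() {
  # "support-check" is a placeholder user; only the group matters for the binding.
  kubectl auth can-i "$@" --as=support-check --as-group=highflame-sre
}

# Usage:
#   sre_can_i list pods --all-namespaces
#   sre_can_i get secrets --namespace default
#   sre_can_i create deployments --namespace default
```

With the built-in `view` role, the last two checks should come back negative; with `cluster-admin`, all three should be allowed.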

