# Setup Guide

This guide walks you through connecting Tailscale Aperture to Highflame. Highflame supports both synchronous `pre_request` enforcement and asynchronous completed-request observability through the same hook endpoint.

### Prerequisites

* A Tailscale tailnet with [Aperture](https://tailscale.com/docs/aperture) deployed and configured with at least one LLM provider
* A Highflame account with an active project
* A Highflame agent or service API key available for the Aperture hook
* Access to the Aperture settings UI at `http://ai/ui/`
* A Shield policy set up for the project

{% stepper %}
{% step %}
**Generate a Highflame API key**

1. Log in or sign up to the [Highflame Platform](https://console.highflame.ai/).
2. Navigate to the **Code Agents > Getting Started** page.
3. Click the **Tailscale Aperture** card and generate the API key.

Keep the key secure. You will paste it into the Aperture hook configuration in the next step.

<figure><img src="/files/cPhuiWwRPBza1dYCOsFv" alt=""><figcaption></figcaption></figure>
{% endstep %}

{% step %}
**Configure the Highflame hook endpoint**

In Aperture settings, add a `highflame` hook under the top-level `hooks` section:

```json
"hooks": {
  "highflame": {
    "url": "https://cerberus.api.highflame.ai/v1/agent/events",
    "apikey": "<HIGHFLAME_API_KEY>",
    "timeout": "30s",
    "fail_policy": "fail_open"
  }
}
```

<table><thead><tr><th width="150.7109375">Field</th><th>Description</th></tr></thead><tbody><tr><td><code>url</code></td><td>Highflame endpoint for Aperture hook events. Use <code>POST /v1/agent/events</code>.</td></tr><tr><td><code>apikey</code></td><td>Your Highflame agent or service API key. Aperture sends it with the hook request so Highflame can resolve the tenant and project.</td></tr><tr><td><code>timeout</code></td><td>How long Aperture waits before timing out the hook request.</td></tr><tr><td><code>fail_policy</code></td><td>Use <code>fail_open</code> so a temporary Highflame-side issue does not block your users' LLM requests.</td></tr></tbody></table>
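
If you generate the Aperture configuration from a script rather than editing it by hand, the hook section above can be sketched as follows. This is a minimal illustration, not an official Highflame or Aperture tool: the function name `build_hooks_section` and the placeholder check are assumptions made for this example, while the URL and field names are copied from the configuration above.

```python
import json

# Endpoint and field names are taken from the hook configuration above.
HIGHFLAME_HOOK_URL = "https://cerberus.api.highflame.ai/v1/agent/events"


def build_hooks_section(api_key: str, timeout: str = "30s") -> dict:
    """Return the top-level `hooks` section with a single `highflame` hook.

    `build_hooks_section` is a hypothetical helper for this sketch.
    """
    # Reject an empty key or one that still looks like the <...> placeholder.
    if not api_key or api_key.startswith("<"):
        raise ValueError("api_key is empty or still a placeholder")
    return {
        "hooks": {
            "highflame": {
                "url": HIGHFLAME_HOOK_URL,
                "apikey": api_key,
                "timeout": timeout,
                # fail_open: a Highflame-side issue does not block LLM requests.
                "fail_policy": "fail_open",
            }
        }
    }


def render(api_key: str) -> str:
    """Serialize the section as indented JSON for review before pasting."""
    return json.dumps(build_hooks_section(api_key), indent=2)
```

Reading the key from a secret store or environment variable and passing it to `render` keeps the key out of version-controlled files.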
{% endstep %}

{% step %}
**Choose your hook mode**

Configure `send_hooks` inside the Aperture capability grant that should send traffic to Highflame.

{% tabs %}
{% tab title="Sync mode" %}
Use `pre_request` when Highflame should make an inline allow/block decision before Aperture forwards the request to the model provider:

```json
"send_hooks": [
  {
    "name": "highflame",
    "events": [
      "pre_request"
    ],
    "send": [
      "user_message",
      "request_body",
      "tools"
    ]
  }
]
```

This mode evaluates prompt and request context before the model responds. It is the right path for enforcing Shield policies on prompts, secrets, injection attempts, and other pre-provider risks.
{% endtab %}

{% tab title="Async mode" %}
Use `tool_call_entire_request` when Highflame should observe completed requests that produced tool calls:

```json
"send_hooks": [
  {
    "name": "highflame",
    "events": [
      "tool_call_entire_request"
    ],
    "send": [
      "user_message",
      "tools",
      "request_body",
      "response_body",
      "raw_responses",
      "estimated_cost"
    ]
  }
]
```

This mode runs after the provider response completes. It does not block or modify the request, but it gives Highflame enough context to show tool calls, command-like actions, sessions, users, and policy telemetry in Code Agents.

If you want an audit event for every completed request, even when no tools are called, use `entire_request` instead of `tool_call_entire_request`.
{% endtab %}
{% endtabs %}

If you want to limit who or what triggers Highflame evaluation, narrow `src` or `models` in the surrounding grant instead of using wildcards.
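
The two modes differ only in their `events` and `send` lists, which the following sketch makes explicit. The event names and `send` fields are copied from the tabs above; the helper `build_send_hooks`, its `mode` labels, and the `audit_all` flag are hypothetical names introduced for illustration. The sketch also assumes that `entire_request` takes the same `send` fields as `tool_call_entire_request`.

```python
# Event names and `send` fields are copied from the sync and async examples.
SYNC_SEND = ["user_message", "request_body", "tools"]
ASYNC_SEND = [
    "user_message",
    "tools",
    "request_body",
    "response_body",
    "raw_responses",
    "estimated_cost",
]


def build_send_hooks(mode: str, audit_all: bool = False) -> list:
    """Build a `send_hooks` list for the `highflame` hook.

    mode: "sync" or "async" (labels used only in this sketch).
    audit_all: in async mode, use `entire_request` so that completed
    requests without tool calls are reported as well.
    """
    if mode == "sync":
        events, send = ["pre_request"], SYNC_SEND
    elif mode == "async":
        event = "entire_request" if audit_all else "tool_call_entire_request"
        events, send = [event], ASYNC_SEND
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    # Copy the list so callers can modify the result without side effects.
    return [{"name": "highflame", "events": events, "send": list(send)}]
```

The returned list can be serialized with `json.dumps` and placed inside the capability grant that should send traffic to Highflame.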
{% endstep %}

{% step %}
**Configure Shield policy behavior**

The Aperture hook itself does not select monitoring or enforcement. Your Shield policy configuration decides whether a request is monitored or enforced.

Use policy mode intentionally:

* Monitor mode logs would-block events while still returning `{"action":"allow"}` to Aperture for synchronous hooks.
* Enforce mode can return `{"action":"block"}` to Aperture and stop synchronous `pre_request` traffic before it reaches the provider.
* Asynchronous hooks always run after the provider response, so they are for visibility and audit rather than inline blocking.

This lets you start with observation-only policies and later move selected policies to blocking mode without changing the Aperture hook endpoint.
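
The interaction between hook event and policy mode can be summarized as a small decision function. This is a simplified model for reasoning about the behavior described above, not Highflame's implementation: the function name, the `"monitor"` and `"enforce"` labels, and the `violation` flag are assumptions, and the sketch further assumes that a request with no policy violation is allowed.

```python
from typing import Optional


def inline_decision(event: str, policy_mode: str, violation: bool) -> Optional[dict]:
    """Model the response Aperture acts on for one hook call.

    Returns None when the hook cannot affect the request inline.
    """
    if event != "pre_request":
        # Asynchronous events run after the provider response, so they
        # provide visibility and audit data only.
        return None
    if policy_mode == "enforce" and violation:
        # Enforce mode can stop the request before it reaches the provider.
        return {"action": "block"}
    # Monitor mode records the would-block event but still allows the request.
    return {"action": "allow"}
```

For example, `inline_decision("pre_request", "monitor", True)` yields an allow response, which matches the observation-first rollout described above.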
{% endstep %}

{% step %}
**Save and verify**

1. Save your Aperture configuration.
2. Send an LLM request through your Aperture proxy.
3. In Highflame, confirm that the request appears in the dashboard with the expected Shield decision.

For monitor-mode policies, the hook response should allow the request while Highflame records the would-block decision in policy telemetry.
{% endstep %}
{% endstepper %}

### Example request through Aperture

Use your Aperture base URL as the OpenAI-compatible endpoint. For clients running inside your tailnet, Aperture URLs commonly use `http://` because Tailscale already encrypts traffic between devices on the tailnet.

```bash
curl -i http://<aperture-host>/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "Summarize why pre-request hooks are useful."
      }
    ]
  }'
```

A successful response means Aperture forwarded the request to the provider. To verify the Highflame evaluation, check the Highflame dashboard for a `process_prompt` event from source `aperture`.
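
The same request can be issued from Python using only the standard library. This is a sketch that mirrors the curl example; the helper names `build_chat_request` and `send` are hypothetical, and the host value is whatever your Aperture deployment uses.

```python
import json
import urllib.request


def build_chat_request(
    aperture_host: str, prompt: str, model: str = "gpt-4o"
) -> urllib.request.Request:
    """Build the POST request from the curl example without sending it."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"http://{aperture_host}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req: urllib.request.Request, timeout: float = 60.0):
    """Send the request and return (HTTP status, parsed JSON body).

    Raises urllib.error.HTTPError for non-2xx responses.
    """
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, json.load(resp)
```

Calling `send(build_chat_request("<aperture-host>", "Summarize why pre-request hooks are useful."))` from a machine inside the tailnet should produce the same traffic as the curl command.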

### Example tool-call request

Use a tool definition when you want to verify asynchronous tool-call observability:

```bash
curl -i http://<aperture-host>/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "What is the weather in Delhi?"
      }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get weather for a city",
          "parameters": {
            "type": "object",
            "properties": {
              "city": {
                "type": "string"
              }
            },
            "required": ["city"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'
```

With `pre_request`, Highflame evaluates the prompt before the provider generates a tool call. With `tool_call_entire_request`, Highflame receives the completed request and extracted tool calls after the provider response.
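
Because `tool_call_entire_request` applies only to requests that produced tool calls, it helps to confirm that the provider actually returned one before looking for the event in Highflame. The sketch below assumes the response follows the OpenAI chat completions shape, where tool calls appear under `choices[].message.tool_calls`. The helper name and the sample response are illustrative, not captured output.

```python
def extract_tool_calls(response: dict) -> list:
    """Return (function name, raw JSON arguments) pairs from a response."""
    calls = []
    for choice in response.get("choices", []):
        message = choice.get("message") or {}
        # `tool_calls` is absent or null when the model answered in text.
        for call in message.get("tool_calls") or []:
            fn = call.get("function") or {}
            calls.append((fn.get("name"), fn.get("arguments")))
    return calls


# Illustrative response in the OpenAI chat completions shape.
SAMPLE_RESPONSE = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {
                        "id": "call_example",
                        "type": "function",
                        "function": {
                            "name": "get_weather",
                            "arguments": "{\"city\": \"Delhi\"}",
                        },
                    }
                ],
            }
        }
    ]
}
```

If `extract_tool_calls` returns an empty list for your response, the model answered in plain text, and only an `entire_request` hook would report that request.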

