Setup Guide

This guide walks you through connecting Tailscale Aperture to Highflame. Highflame supports both synchronous pre_request enforcement and asynchronous completed-request observability through the same hook endpoint.

Prerequisites

  • A Tailscale tailnet with Aperture deployed and configured with at least one LLM provider

  • A Highflame account with an active project

  • A Highflame agent or service API key available for the Aperture hook

  • Access to the Aperture settings UI at http://ai/ui/

  • A Shield policy set up for the project

1

Generate a Highflame API key

  1. Log in to the Highflame Platform, or sign up if you do not have an account.

  2. Navigate to the Code Agents > Getting Started page.

  3. Click the Tailscale Aperture card and generate the API key.

Keep the key secure. You will paste it into the Aperture hook configuration in the next step.

2

Configure the Highflame hook endpoint

In Aperture settings, add a highflame hook under the top-level hooks section:

"hooks": {
  "highflame": {
    "url": "https://cerberus.api.highflame.ai/v1/agent/events",
    "apikey": "<HIGHFLAME_API_KEY>",
    "timeout": "30s",
    "fail_policy": "fail_open"
  }
}

Field        Description
url          Highflame endpoint for Aperture hook events. Use POST /v1/agent/events.
apikey       Your Highflame agent or service API key. Aperture sends it with the hook request so Highflame can resolve the tenant and project.
timeout      How long Aperture waits before timing out the hook request.
fail_policy  Use fail_open so a temporary Highflame-side issue does not block your users' LLM requests.
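
If you would rather not paste the key into a file by hand, the hook section can be generated from an environment variable. The following is a minimal sketch, not an official tool: the field names and values come from the example in this step, while the HIGHFLAME_API_KEY variable name and the build_hooks_config helper are assumptions made for illustration.

```python
import json
import os


def build_hooks_config(api_key: str) -> dict:
    """Return the top-level "hooks" section with the Highflame hook filled in."""
    if not api_key:
        raise ValueError("Highflame API key is empty")
    return {
        "hooks": {
            "highflame": {
                "url": "https://cerberus.api.highflame.ai/v1/agent/events",
                "apikey": api_key,
                "timeout": "30s",
                "fail_policy": "fail_open",
            }
        }
    }


# HIGHFLAME_API_KEY is a hypothetical environment variable name. If it is
# unset or empty, the placeholder from the example above is kept.
api_key = os.environ.get("HIGHFLAME_API_KEY") or "<HIGHFLAME_API_KEY>"
print(json.dumps(build_hooks_config(api_key), indent=2))
```

The output contains the real key, so treat it like the key itself and merge the "hooks" object into your existing Aperture settings rather than replacing them.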

3

Choose your hook mode

Configure send_hooks inside the Aperture capability grant that should send traffic to Highflame.

Use pre_request when Highflame should make an inline allow/block decision before Aperture forwards the request to the model provider:

"send_hooks": [
  {
    "name": "highflame",
    "events": [
      "pre_request"
    ],
    "send": [
      "user_message",
      "request_body",
      "tools"
    ]
  }
]

This mode evaluates prompt and request context before the model responds. It is the right path for enforcing Shield policies on prompts, secrets, injection attempts, and other pre-provider risks.

If you want to limit who or what triggers Highflame evaluation, narrow src or models in the surrounding grant instead of using wildcards.
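
The two hook modes differ only in the send_hooks entry. As a rough sketch, the entries could be generated as shown below. The pre_request entry mirrors the example in this step, and the tool_call_entire_request event name is the asynchronous event referred to in the tool-call section of this guide. Reusing the same "send" list for the asynchronous entry is an assumption, and the send_hook helper is hypothetical, so check the Aperture documentation for the fields your deployment supports.

```python
import json

# Fields taken from the pre_request example in this guide.
PRE_REQUEST_SEND = ["user_message", "request_body", "tools"]


def send_hook(event: str, send: list) -> dict:
    """Build one send_hooks entry that routes `event` to the "highflame" hook."""
    return {"name": "highflame", "events": [event], "send": list(send)}


# Synchronous mode: inline allow/block decision before the provider call.
sync_entry = send_hook("pre_request", PRE_REQUEST_SEND)

# Asynchronous mode: the event name comes from this guide. Reusing the same
# "send" list is an ASSUMPTION, not a documented Aperture default.
async_entry = send_hook("tool_call_entire_request", PRE_REQUEST_SEND)

print(json.dumps({"send_hooks": [sync_entry]}, indent=2))
print(json.dumps({"send_hooks": [async_entry]}, indent=2))
```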

4

Configure Shield policy behavior

The Aperture hook does not force a mode on Highflame. Your Shield policy configuration decides whether a request is monitored or enforced.

Choose the policy mode deliberately:

  • Monitor mode logs would-block events while still returning {"action":"allow"} to Aperture for synchronous hooks.

  • Enforce mode can return {"action":"block"} to Aperture and stop synchronous pre_request traffic before it reaches the provider.

  • Asynchronous hooks always run after the provider response, so they are for visibility and audit rather than inline blocking.

This lets you start with observation-only policies and later move selected policies to blocking mode without changing the Aperture hook endpoint.
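
The interaction between policy mode and the hook response can be modeled in a few lines. This is an illustrative sketch, not Highflame's or Aperture's implementation: the function names are hypothetical, the two action values come from the list above, and treating any fail_policy other than fail_open as "do not forward" is an assumption.

```python
from typing import Optional


def hook_action(policy_mode: str, would_block: bool) -> dict:
    """Action returned to Aperture for a synchronous pre_request hook."""
    if policy_mode == "enforce" and would_block:
        return {"action": "block"}
    # Monitor mode records the would-block event but still allows the request.
    return {"action": "allow"}


def forwards_to_provider(response: Optional[dict], fail_policy: str = "fail_open") -> bool:
    """Model of Aperture's side: does the request reach the provider?

    `response` is None when the hook call failed or timed out.
    """
    if response is None:
        # fail_open lets traffic through on hook failure. Treating any other
        # value as "do not forward" is an assumption of this sketch.
        return fail_policy == "fail_open"
    return response.get("action") != "block"


for mode in ("monitor", "enforce"):
    action = hook_action(mode, would_block=True)
    print(mode, action, "forwarded:", forwards_to_provider(action))
print("hook timeout, forwarded:", forwards_to_provider(None))
```

In this model the same violating request is forwarded under a monitor-mode policy and stopped under an enforce-mode policy, while the hook configuration stays unchanged.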

5

Save and verify

  1. Save your Aperture configuration.

  2. Send an LLM request through your Aperture proxy.

  3. In Highflame, confirm that the request appears in the dashboard with the expected Shield decision.

For monitor-mode policies, the hook response should allow the request while Highflame records the would-block decision in policy telemetry.

Example request through Aperture

Use your Aperture base URL as the OpenAI-compatible endpoint. Clients running inside your tailnet commonly reach Aperture over http:// because Tailscale already encrypts traffic between devices on the tailnet.

curl -i http://<aperture-host>/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": "Summarize why pre-request hooks are useful."
      }
    ]
  }'

A successful response means Aperture forwarded the request to the provider. To verify the Highflame evaluation, check the Highflame dashboard for a process_prompt event from the source aperture.

Example tool-call request

Include a tool definition in the request when you want to verify asynchronous tool-call observability.

With pre_request, Highflame evaluates the prompt before the provider generates a tool call. With tool_call_entire_request, Highflame receives the completed request and extracted tool calls after the provider response.
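
A request that offers the model a tool might look like the sketch below. It assumes Aperture accepts the OpenAI chat completions "tools" format on the same /v1/chat/completions path used in the earlier example. The aperture-host name and the get_weather tool are placeholders made up for illustration, not part of Aperture or Highflame.

```python
import json
import urllib.request

# Placeholder host: replace with your Aperture base URL.
APERTURE_URL = "http://aperture-host/v1/chat/completions"


def build_tool_call_request(url: str = APERTURE_URL) -> urllib.request.Request:
    """Build (but do not send) a chat completion request that offers one tool."""
    payload = {
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
        # get_weather is an illustrative tool definition only.
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "description": "Look up the current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.status


request = build_tool_call_request()
print(request.get_method(), request.full_url)
# Call send(request) from a machine on your tailnet to issue the request.
```

If the provider responds with a tool call, the asynchronous hook should then have something to report, which you can look for in the Highflame dashboard.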
