Calling Guardrail APIs

REST API reference for Highflame Shield guardrails — evaluate prompts, tool calls, files, and responses against detection and Cedar policy.

All guardrail evaluation goes through a single endpoint, POST /v1/guard; a separate endpoint, POST /v1/detect, runs detection only. The SDK methods (client.guard.evaluate(), client.guard.evaluate_prompt(), etc.) are thin wrappers over these REST calls.

For the full endpoint reference including streaming and detection-only calls, see Shield REST APIs.


Base URL and Authentication

https://shield.api.highflame.ai

All requests require a Bearer JWT and account/project scope headers. Service keys (hf_sk_...) are exchanged for JWTs automatically by the SDK — for direct REST calls, exchange first via your token endpoint.

Authorization: Bearer <jwt>
X-Account-ID: <account-id>
X-Project-ID: <project-id>
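
For direct REST calls, the three headers above can be assembled once and reused. The following is a minimal sketch: `build_headers` is a hypothetical helper name, and the `Content-Type` header is an assumption (this page does not state the request encoding).

```python
BASE_URL = "https://shield.api.highflame.ai"


def build_headers(jwt: str, account_id: str, project_id: str) -> dict:
    """Return the auth and scope headers listed in this section."""
    if not jwt:
        # Service keys (hf_sk_...) must be exchanged for a JWT before this point.
        raise ValueError("a JWT is required")
    return {
        "Authorization": f"Bearer {jwt}",
        "X-Account-ID": account_id,
        "X-Project-ID": project_id,
        # Assumption: request bodies are sent as JSON.
        "Content-Type": "application/json",
    }
```

The helper only builds a dictionary; how the JWT is obtained depends on your token endpoint and is left out here.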

POST /v1/guard

Runs the tiered detection pipeline then Cedar policy evaluation. Returns a structured allow/deny decision.

Required fields: content, content_type, action

| Field | Type | Values or description |
| --- | --- | --- |
| content | string | The text to evaluate |
| content_type | string | One of "prompt", "response", "tool_call", "file" |
| action | string | One of "process_prompt", "call_tool", "read_file", "write_file", "connect_server" |
| mode | string | One of "enforce", "monitor", "alert" |
| session_id | string | Scoped session identifier for multi-turn tracking |
| explain | bool | Return Cedar policy trace and root causes |
| debug | bool | Return raw detector results |
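
The field table can be mirrored in a small client-side builder that rejects values the table does not list before a request is sent. This is a sketch only: `build_guard_payload` is a hypothetical name, and omitting unset optional fields assumes the service applies its own defaults.

```python
from typing import Optional

CONTENT_TYPES = {"prompt", "response", "tool_call", "file"}
ACTIONS = {"process_prompt", "call_tool", "read_file", "write_file", "connect_server"}
MODES = {"enforce", "monitor", "alert"}


def build_guard_payload(
    content: str,
    content_type: str,
    action: str,
    *,
    mode: Optional[str] = None,
    session_id: Optional[str] = None,
    explain: bool = False,
    debug: bool = False,
) -> dict:
    """Build a /v1/guard body from the documented fields."""
    if content_type not in CONTENT_TYPES:
        raise ValueError(f"unknown content_type: {content_type!r}")
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if mode is not None and mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")

    # The three required fields are always present.
    payload = {"content": content, "content_type": content_type, "action": action}
    # Optional fields are omitted unless set (assumption: server-side defaults exist).
    if mode is not None:
        payload["mode"] = mode
    if session_id is not None:
        payload["session_id"] = session_id
    if explain:
        payload["explain"] = True
    if debug:
        payload["debug"] = True
    return payload
```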

Evaluate a prompt
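
A prompt evaluation request might be built as follows, using only the Python standard library. This is a hedged sketch: `build_prompt_request` is a hypothetical helper, the JSON encoding is an assumption, and pairing content_type "prompt" with action "process_prompt" is inferred from the field table rather than stated on this page. The request is constructed but not sent.

```python
import json
import urllib.request

GUARD_URL = "https://shield.api.highflame.ai/v1/guard"


def build_prompt_request(
    jwt: str, account_id: str, project_id: str, prompt: str
) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/guard request for a prompt."""
    body = {
        "content": prompt,
        "content_type": "prompt",
        # Assumption: prompts are evaluated under the "process_prompt" action.
        "action": "process_prompt",
    }
    return urllib.request.Request(
        GUARD_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {jwt}",
            "X-Account-ID": account_id,
            "X-Project-ID": project_id,
            "Content-Type": "application/json",  # assumption: JSON bodies
        },
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and decode the JSON body of the reply.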

Response:

Evaluate a tool call
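
A tool call body could be assembled along these lines. This page does not show how the tool name and arguments are encoded into the content string, so the JSON layout below (`name` and `arguments` keys) is purely an assumption for illustration, and `build_tool_call_payload` is a hypothetical helper. Check the Shield REST APIs reference for the actual format.

```python
import json


def build_tool_call_payload(name: str, args: dict) -> dict:
    """Build a /v1/guard body for a tool call (content layout is assumed)."""
    # Assumption: the tool call is serialized to a JSON string in `content`.
    content = json.dumps({"name": name, "arguments": args}, sort_keys=True)
    return {
        "content": content,
        "content_type": "tool_call",
        "action": "call_tool",
    }
```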

Response:

Evaluate a model response

Get the full policy trace

Add "explain": true and "debug": true to any request to get the Cedar policy trace, root causes, and raw detector output:
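
Since the two flags apply to any request, they can be layered onto an existing body without modifying it. A minimal sketch, with `with_trace` as a hypothetical helper name:

```python
def with_trace(payload: dict) -> dict:
    """Return a copy of a /v1/guard body with explain and debug enabled."""
    # A new dict is returned so the original body can be reused without tracing.
    return {**payload, "explain": True, "debug": True}
```

For example, `with_trace({"content": "hi", "content_type": "prompt", "action": "process_prompt"})` yields the same body plus the two flags.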

Response includes additional fields:


POST /v1/detect

Detection only — runs the detector pipeline without Cedar policy evaluation. Useful for observability, shadow mode, or building custom policy logic on top of raw signal scores.
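
A detection-only request might look like the sketch below. The request fields for /v1/detect are not listed on this page, so the body here assumes it accepts the same content and content_type fields as /v1/guard; `build_detect_request` is a hypothetical helper, and the request is built but not sent.

```python
import json
import urllib.request

DETECT_URL = "https://shield.api.highflame.ai/v1/detect"


def build_detect_request(
    jwt: str, account_id: str, project_id: str, content: str, content_type: str
) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/detect request."""
    # Assumption: /v1/detect takes the same content fields as /v1/guard.
    body = {"content": content, "content_type": content_type}
    return urllib.request.Request(
        DETECT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {jwt}",
            "X-Account-ID": account_id,
            "X-Project-ID": project_id,
            "Content-Type": "application/json",  # assumption: JSON bodies
        },
        method="POST",
    )
```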

Response:


Enforcement Modes

| Mode | decision | alerted | Use for |
| --- | --- | --- | --- |
| "enforce" | "allow" or "deny" | false | Production blocking |
| "monitor" | Always "allow" | false | Shadow testing: logs decisions without blocking |
| "alert" | Always "allow" | true if violated | Alerting without blocking |
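
The mode table can be read as a small decision function. The sketch below is a client-side model of the table for reasoning and tests, not the service's implementation; `apply_mode` is a hypothetical name, and `violated` stands for "policy evaluation would deny".

```python
def apply_mode(mode: str, violated: bool) -> dict:
    """Map a mode and a policy outcome to the decision/alerted pair in the table."""
    if mode == "enforce":
        # Only enforce mode can return "deny".
        return {"decision": "deny" if violated else "allow", "alerted": False}
    if mode == "monitor":
        # Violations are logged but never block or alert.
        return {"decision": "allow", "alerted": False}
    if mode == "alert":
        # Never blocks; flags the violation instead.
        return {"decision": "allow", "alerted": violated}
    raise ValueError(f"unknown mode: {mode!r}")
```

A caller that blocks only on `decision == "deny"` will therefore never block traffic in "monitor" or "alert" mode.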


SDK Equivalents

These REST calls map directly to SDK methods:

| REST call | Python SDK | TypeScript SDK |
| --- | --- | --- |
| POST /v1/guard with content_type: "prompt" | client.guard.evaluate_prompt(text) | client.guard.evaluatePrompt(text) |
| POST /v1/guard with content_type: "tool_call" | client.guard.evaluate_tool_call(name, args) | client.guard.evaluateToolCall(name, args) |
| POST /v1/guard (full) | client.guard.evaluate(GuardRequest(...)) | client.guard.evaluate({...}) |
| POST /v1/detect | client.detect.run(DetectRequest(...)) | client.detect.run({...}) |
