FAQ

Common questions about deploying and operating Highflame.


General

What are the minimum infrastructure requirements?

Highflame Shield requires a container runtime (Docker or Kubernetes), at least 2 vCPUs and 2 GB RAM per instance for light workloads, and a PostgreSQL-compatible database for policy and audit storage. For production deployments, see the AWS Deployment Guide for recommended sizing.

Does Highflame require outbound internet access?

No. Shield can run fully self-hosted, and telemetry and update checks are opt-in, so Shield itself requires no outbound internet access. Model provider calls go wherever you configure them; the only external dependency is your LLM provider endpoint.

What databases does Highflame support?

PostgreSQL 14+ is the primary supported database. Amazon RDS (AWS), Azure Database for PostgreSQL, and Cloud SQL (GCP) are all supported managed options.


Authentication & API Keys

Where do I find my API key?

API keys are available in the Highflame console under Account → Developer Settings. Service keys use the hf_sk_... prefix. ZeroID API keys use the zid_sk_... prefix — these are separate credentials for the identity layer.

What is the difference between hf_sk_... and zid_sk_... keys?

hf_sk_... keys authenticate to the Shield API (guardrails, detection, policy evaluation). zid_sk_... keys authenticate to ZeroID (agent identity registry, token issuance). If you are using both products, you will have both key types.
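
When both key types live in the same secrets store, a small prefix check can catch a key sent to the wrong product. The sketch below relies only on the two prefixes described above; the function name and return labels are illustrative and not part of any Highflame SDK.

```python
def classify_key(key: str) -> str:
    """Return which product a credential targets, based on its prefix.

    Only the hf_sk_ and zid_sk_ prefixes come from the documentation;
    the labels "shield" and "zeroid" are illustrative.
    """
    if key.startswith("hf_sk_"):
        return "shield"
    if key.startswith("zid_sk_"):
        return "zeroid"
    raise ValueError("unrecognized key prefix")
```

Calling classify_key() at startup, before any network request, turns a misconfigured credential into an immediate and readable error.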

How long do access tokens last?

Shield access tokens (exchanged from service keys) are short-lived JWTs, typically valid for 1 hour. ZeroID agent tokens default to the TTL configured in the credential policy, usually 15–60 minutes. Both are automatically refreshed by the SDK.
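
The SDK handles refresh, but when debugging it can help to see how long a token has left. The sketch below reads the expiry from a JWT payload using only the standard library. It assumes the token carries the standard exp claim (seconds since the epoch); that is typical for JWTs but not confirmed here for Shield or ZeroID tokens. It does not verify the signature, so use it for inspection only.

```python
import base64
import json
import time
from typing import Optional


def jwt_seconds_remaining(token: str, now: Optional[float] = None) -> float:
    """Seconds until the token's `exp` claim, without verifying the signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    current = time.time() if now is None else now
    # Assumption: the token includes an `exp` claim.
    return claims["exp"] - current
```

A negative result means the token has already expired.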


Deployment

Can I run Highflame in a private VPC with no public ingress?

Yes. Shield is designed for private VPC deployment. Route traffic from your agents and services through an internal load balancer. ZeroID similarly supports private deployment — configure your agents to point at the internal ZeroID endpoint.

How do I handle high availability?

For production, run at least two Shield instances behind a load balancer. Shield is stateless at the request level: policy state lives in the database, with an in-memory cache on each instance. See the AWS Deployment Guide for an active-passive HA/DR configuration spanning two regions.

How do I upgrade Highflame?

Pull the new container image and perform a rolling restart. Shield instances are stateless and support zero-downtime rolling updates. Database migrations (if any) run automatically on startup and are backward-compatible with the previous version.

What ports does Shield listen on?

By default: 8080 (HTTP). TLS termination is handled at the load balancer or ingress layer. Configure the port via the SHIELD_PORT environment variable.
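
Client-side scripts, such as smoke tests, can follow the same convention when building the address of a Shield instance. The sketch below reads SHIELD_PORT and falls back to 8080. The helper name and the default host are illustrative. Plain HTTP is used because TLS terminates at the load balancer or ingress.

```python
import os
from typing import Mapping, Optional

DEFAULT_SHIELD_PORT = 8080  # default HTTP port stated above


def shield_base_url(host: str = "localhost",
                    env: Optional[Mapping[str, str]] = None) -> str:
    """Build the plain-HTTP base URL for a Shield instance."""
    env = os.environ if env is None else env
    port = int(env.get("SHIELD_PORT", DEFAULT_SHIELD_PORT))
    if not 1 <= port <= 65535:
        raise ValueError(f"SHIELD_PORT out of range: {port}")
    return f"http://{host}:{port}"
```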


Networking & Latency

What latency should I expect from Shield?

Shield targets sub-10ms p99 latency for guardrail evaluation at steady state with warm caches. Cold-start latency (the first request after deployment) may be higher because policies are still loading. Detection-only calls (POST /v1/detect) are typically faster than full guard evaluations.
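
To check the p99 target against your own measurements, compute the 99th percentile over a sample of request latencies. The sketch below uses the nearest-rank method; the function names are illustrative, and collecting the samples (for example, by timing calls from your application) is left to you.

```python
from typing import Sequence


def p99(latencies_ms: Sequence[float]) -> float:
    """Nearest-rank 99th percentile of a non-empty latency sample."""
    if not latencies_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(latencies_ms)
    # Nearest-rank: rank = ceil(0.99 * n), computed in integers to avoid float error.
    rank = -(-99 * len(ordered) // 100)
    return ordered[rank - 1]


def meets_target(latencies_ms: Sequence[float], target_ms: float = 10.0) -> bool:
    """True if the sample's p99 is below the target (10 ms by default)."""
    return p99(latencies_ms) < target_ms
```

Discard the first few requests after a deployment before sampling, since cold-start requests are expected to be slower.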

Can I use Shield as a forward proxy?

Shield is not a forward proxy — it is an inline evaluation API. Your application calls Shield before (and optionally after) calling the LLM provider. For a gateway pattern, call client.guard.evaluate_prompt() before forwarding the request to the provider.
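
The gateway pattern can be sketched as a wrapper that evaluates the prompt first and forwards it only when allowed. The evaluator and the provider call are passed in as plain callables so the example runs on its own. In practice the evaluator would wrap client.guard.evaluate_prompt(). The Decision shape and the exception name are assumptions, not the SDK's actual types.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Decision:
    # Hypothetical result shape; the real SDK's return type may differ.
    allowed: bool
    reason: str = ""


class BlockedByPolicy(Exception):
    """Raised when the guard denies the prompt (illustrative name)."""


def guarded_completion(
    prompt: str,
    evaluate_prompt: Callable[[str], Decision],
    call_provider: Callable[[str], str],
) -> str:
    """Evaluate the prompt first; forward to the provider only if allowed."""
    decision = evaluate_prompt(prompt)
    if not decision.allowed:
        raise BlockedByPolicy(decision.reason)
    return call_provider(prompt)
```

The provider is never called for a blocked prompt, which is the point of evaluating inline rather than after the fact.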


Policies & Guardrails

Where are Cedar policies stored?

Cedar policies are stored in the Highflame database and loaded into memory at startup. Use the client.debug.policies() SDK call or GET /v1/debug/policies to inspect the currently loaded policy set.

How do I test policies without enforcing them?

Set mode: "monitor" in your guard request. Shield evaluates the request fully and returns the decision, but does not block. Use this during rollout to validate policy coverage before switching to enforce.
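
A rollout toggle might look like the sketch below, which builds a guard request body in either mode. Only the mode field and its "monitor" value come from the text above. The "enforce" value is inferred from it, and the prompt field name is an assumption; check the API reference for the actual request schema.

```python
import json


def build_guard_request(prompt: str, enforce: bool = False) -> str:
    """Serialize a guard request body, defaulting to monitor mode."""
    body = {
        "prompt": prompt,  # assumed field name
        # "monitor" evaluates and reports without blocking;
        # "enforce" is the inferred name of the blocking mode.
        "mode": "enforce" if enforce else "monitor",
    }
    return json.dumps(body)
```

Defaulting to monitor means a forgotten flag fails open during rollout; flip the default once policy coverage has been validated.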

Can I write custom detectors?

Custom detector configuration is available via the template API. The built-in detector library covers prompt injection, jailbreaks, PII, toxicity, and code patterns. Contact the Highflame team for guidance on integrating custom detection models.


ZeroID

How is ZeroID different from a standard OAuth 2.0 server?

ZeroID is purpose-built for non-human workloads. It adds a first-class identity registry (agents, services, MCP servers), a WIMSE/SPIFFE URI per identity, RFC 8693 token exchange for agent delegation chains, and Continuous Access Evaluation (CAE) signals for real-time revocation. It speaks standard OAuth 2.0 grant types so existing tooling works with it.

Do I need ZeroID to use Shield?

No. Shield and ZeroID are independent products. Shield evaluates requests against your guardrail policies regardless of how the caller authenticated. ZeroID adds structured agent identity to the mix — useful when you need delegation chains, per-agent revocation, or identity-aware policy decisions.

How do I revoke an agent's access immediately?

Issue a CAE signal via client.signals.ingest() with signal_type: "session_revoked". ZeroID propagates the signal, and any subsequent token introspection for that identity returns active: false. To cut off access completely, also rotate or revoke the agent's API key.
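
A revoke-then-verify flow can be sketched as follows. The ingest and introspect callables stand in for client.signals.ingest() and a token introspection call; they are injected so the example runs on its own. The subject parameter name is an assumption. Only signal_type and the active field come from the text above.

```python
from typing import Any, Callable, Dict


def revoke_and_confirm(
    identity: str,
    ingest: Callable[..., Any],
    introspect: Callable[[str], Dict[str, Any]],
) -> bool:
    """Send a session_revoked signal, then confirm introspection reports inactive."""
    # `subject` is a hypothetical parameter name for the target identity.
    ingest(signal_type="session_revoked", subject=identity)
    return introspect(identity).get("active") is False
```

A False return means introspection still reports the identity as active, in which case retry after a short delay or fall back to revoking the API key.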


What's Next?

  • Learn how to configure your platform in the Quick Start Guide for Administrators.

  • See cloud-specific deployment instructions in the AWS, Azure, and GCP deployment guides.
