The Security Layer AI Agents Actually Need

AI agents are making real decisions: calling APIs, moving money, filing compliance reports. Most of them run with a static API key that never expires and has no scope limits. Caracal is the open-source system built to fix this, with pre-execution authority enforcement, short-lived tokens, real-time revocation, and a tamper-evident audit trail built on Merkle trees. Here is a deep technical look at how it works.

Muhammad Umar Aziz
@ Umar-Aziz
8 min read

The Problem Nobody Is Talking About

Picture this: a financial company deploys an AI agent to handle their global payout cycle. The agent autonomously calls Mercury Bank, Stripe Treasury, SAP ERP, and a compliance filing service. It processes thousands of invoices. Nobody is watching every decision it makes.

Now ask: what stops that agent from calling Mercury Bank at 3am with no authorization? What stops a compromised sub-agent from escalating its own privileges and accessing systems it was never meant to touch? What stops a rogue agent from making an irreversible wire transfer that nobody approved?

In most deployments today — nothing.

The typical setup looks like this:

AI Agent → API_KEY=sk-abc123 (never expires, full access) → Bank API

That API key sits in a .env file. It has no expiry. It has no scope. If the agent is hijacked, the attacker has full access — forever. If the agent misbehaves, there is no kill switch. If you want to know what happened, you check application logs — if they exist, if they have not been tampered with, if anyone thought to build them.

This is the state of AI agent security in 2026. And it is not good enough.

Caracal is the open-source project that fixes this.

What Caracal Is

Caracal is a pre-execution authority enforcement system for AI agents. Built by Garudex Labs, it sits at the exact boundary where agent decisions become irreversible real-world actions — API calls, database writes, financial transactions, compliance filings.

Its core philosophy is uncompromising:

No action executes unless there is a cryptographically verified, time-bound mandate issued under a governing policy.

Not "probably authorized." Not "the key looks right." Explicitly, verifiably, cryptographically authorized — or blocked. Every single time, on every single request.

Caracal is open-source under the Apache 2.0 license. It was selected among the top 50 globally in the GitHub Secure Open Source Fund and was featured on GitHub's Open Source Friday. It operates under the LF Decentralized Trust ecosystem.

The 3 Core Primitives

Three concepts anchor every decision Caracal makes.

1. Machine-Enforceable Policy

Policies in Caracal are not dashboard checkboxes. They are rules written in Rego — a declarative policy language — evaluated by the Security Token Service on every token request. A human cannot override them at runtime. A misconfigured agent cannot bypass them. If the policy returns false, no token is issued. Period.

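package caracal.policy

import rego.v1

# Deny unless a rule below explicitly allows the request.
default allow := false

# Allow write access to the Mercury Bank resource for any identified application.
allow if {
    input.app_id != ""
    input.resource_id == "lynx/mercury-bank"
    "write" in input.scopes
}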

This is the difference between security as configuration and security as code.
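For readers less familiar with Rego, the same default-deny logic can be restated in Python. This is only an illustrative sketch: the `allow` function and the dictionary shape of the request are assumptions that mirror the fields in the rule above, not Caracal's evaluation engine.

```python
def allow(request: dict) -> bool:
    """Mirror of the Rego rule: deny unless every condition holds."""
    return (
        request.get("app_id", "") != ""
        and request.get("resource_id") == "lynx/mercury-bank"
        and "write" in request.get("scopes", [])
    )


ok = {"app_id": "app-1", "resource_id": "lynx/mercury-bank", "scopes": ["read", "write"]}
read_only = {"app_id": "app-1", "resource_id": "lynx/mercury-bank", "scopes": ["read"]}

print(allow(ok))         # True
print(allow(read_only))  # False
print(allow({}))         # False: nothing matches, so the default applies
```

The point of the default is that a malformed or empty request is denied without any rule having to anticipate it.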

2. Explicit Delegation

When an orchestrator spawns a sub-agent and delegates work, the sub-agent receives a scoped token — one with strictly less authority than the parent, never more. The delegation chain is preserved in full. If an orchestrator's grant is revoked, all child agents collapse immediately. There is no grace period.
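A minimal sketch of that attenuation rule, reading "strictly less authority" literally as a proper subset of the parent's scopes. The `Token` class and `delegate` function are hypothetical names chosen for illustration and are not part of Caracal's SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Token:
    holder: str
    scopes: frozenset
    parent: Optional["Token"] = None  # keeps the delegation chain intact


def delegate(parent: Token, child: str, scopes: set) -> Token:
    """Issue a child token whose scopes are a proper subset of the parent's."""
    requested = frozenset(scopes)
    if not requested < parent.scopes:  # '<' on sets means proper subset
        raise PermissionError(f"{child} must receive strictly less authority than {parent.holder}")
    return Token(holder=child, scopes=requested, parent=parent)


def chain(token: Optional[Token]) -> list:
    """Walk back to the root so the full delegation path can be inspected."""
    path = []
    while token is not None:
        path.append(token.holder)
        token = token.parent
    return path[::-1]


root = Token("orchestrator", frozenset({"read", "write"}))
worker = delegate(root, "invoice-worker", {"read"})
print(chain(worker))  # ['orchestrator', 'invoice-worker']
```

Asking for `{"read", "write", "admin"}`, or even the parent's full scope set, raises `PermissionError` in this sketch.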

3. Fail-Closed Enforcement

If the Security Token Service is unreachable, the Gateway does not fall back to allowing requests. It fails closed. No token, no access, no exceptions. In systems making irreversible financial or compliance decisions, this is the correct and only acceptable tradeoff.
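The fail-closed behavior can be sketched in a few lines. Everything here is an assumption made for illustration: the `decide` function, the `STSUnavailable` exception, and the idea that validation is a single callable are stand-ins, not the Gateway's real interface.

```python
from typing import Callable


class STSUnavailable(Exception):
    """Raised by the stand-in validator when the token service cannot be reached."""


def decide(token: str, validate: Callable[[str], bool]) -> str:
    """Return 'allow' only on a positive answer; every error path denies."""
    try:
        return "allow" if validate(token) else "deny"
    except Exception:
        # Fail closed: an outage or unexpected error is treated as a denial.
        return "deny"


def healthy_sts(token: str) -> bool:
    return token == "valid-token"


def down_sts(token: str) -> bool:
    raise STSUnavailable("connection refused")


print(decide("valid-token", healthy_sts))  # allow
print(decide("valid-token", down_sts))     # deny
```

The design choice is that "allow" is reachable through exactly one path, so a new failure mode cannot silently become an approval.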

The 5 Components

Caracal is not a single monolithic service. It is five specialized services, each with a distinct and non-overlapping responsibility.

Management API — The Control Plane

The Management API is where administrators define the entire security model before any agent runs. It exposes four key object types (distinct from the three core primitives above):

A Zone is an isolated security boundary. Think of it as a walled city — everything inside belongs together, nothing leaks out to other zones. A company with multiple AI products gives each one its own zone.

An Application is the registered identity of an AI system within a zone. It holds a cryptographic credential used to authenticate with the STS.

A Resource is any external service or API endpoint that needs protecting. By registering a resource, you are explicitly telling Caracal that access to it must be authorized. An unregistered resource cannot be reached through the Gateway — it does not exist to the system.

A Grant is the explicit permission slip: this application is allowed to access this resource with these specific scopes. Without a grant, access is not just restricted — it is architecturally impossible.

# Building the security model in Caracal
 
caracal zone create --name "lynx"
# → id: 019e3ab8-db5c-7538-95ed-7917c03f20aa
 
caracal app create --name "lynx" --zone 019e3ab8...
# → application_id + client_secret (shown once)
 
caracal resource create \
  --zone 019e3ab8... \
  --identifier "lynx/mercury-bank" \
  --upstream-url "http://mercury-bank.internal" \
  --scopes "read,write"
# → resource_id: 019e3abe-3b48...
 
caracal grant create \
  --zone 019e3ab8... \
  --app 019e3ae9... \
  --resource 019e3abe-3b48... \
  --scopes "read,write"
# → grant active. Agent can now reach mercury-bank.
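To make the relationship between these objects concrete, here is a small in-memory model. It is a sketch under stated assumptions: the `Resource`, `Grant`, and `Registry` classes and their fields are hypothetical and do not reflect Caracal's actual schema or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    zone: str
    identifier: str
    scopes: frozenset


@dataclass(frozen=True)
class Grant:
    zone: str
    app: str
    resource: str
    scopes: frozenset


@dataclass
class Registry:
    resources: dict = field(default_factory=dict)
    grants: list = field(default_factory=list)

    def register(self, resource: Resource) -> None:
        self.resources[(resource.zone, resource.identifier)] = resource

    def grant(self, g: Grant) -> None:
        target = self.resources.get((g.zone, g.resource))
        if target is None:
            raise KeyError("cannot grant access to an unregistered resource")
        if not g.scopes <= target.scopes:
            raise ValueError("grant scopes exceed what the resource defines")
        self.grants.append(g)

    def permits(self, zone: str, app: str, resource: str, scope: str) -> bool:
        """Default deny: an unregistered resource or a missing grant yields False."""
        if (zone, resource) not in self.resources:
            return False
        return any(
            g.zone == zone and g.app == app and g.resource == resource and scope in g.scopes
            for g in self.grants
        )


registry = Registry()
registry.register(Resource("lynx", "lynx/mercury-bank", frozenset({"read", "write"})))
registry.grant(Grant("lynx", "lynx-app", "lynx/mercury-bank", frozenset({"read", "write"})))

print(registry.permits("lynx", "lynx-app", "lynx/mercury-bank", "write"))   # True
print(registry.permits("lynx", "lynx-app", "lynx/stripe", "read"))          # False
print(registry.permits("other", "lynx-app", "lynx/mercury-bank", "read"))   # False
```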

Security Token Service — The Heart

The STS is the most critical component in the entire system. It is the only entity that can issue tokens, and it does so only after completing a strict evaluation sequence.
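The exact evaluation sequence is not spelled out in the text, so the order used below (grant lookup, then policy, then signing) is an assumption. The sketch also substitutes a shared-secret HMAC from the Python standard library for whatever signature scheme Caracal uses, and the names `issue_token`, `verify_token`, and `SIGNING_KEY` are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time

SIGNING_KEY = b"demo-only-key"  # hypothetical; a real deployment would use managed keys


def issue_token(app_id, resource_id, scopes, has_grant, policy_allows, ttl_seconds=300, now=None):
    """Return a signed, expiring token, or None if any check fails."""
    if not has_grant(app_id, resource_id, scopes):
        return None
    if not policy_allows({"app_id": app_id, "resource_id": resource_id, "scopes": scopes}):
        return None
    issued = int(now if now is not None else time.time())
    claims = {"sub": app_id, "aud": resource_id, "scopes": sorted(scopes), "exp": issued + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    signature = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def verify_token(token, now=None):
    """Return the claims if the signature matches and the token has not expired."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None
    claims = json.loads(base64.urlsafe_b64decode(payload))
    current = now if now is not None else time.time()
    return claims if current < claims["exp"] else None


token = issue_token(
    "lynx-app", "lynx/mercury-bank", ["write"],
    has_grant=lambda app, res, scopes: True,
    policy_allows=lambda request: True,
    ttl_seconds=60, now=1000,
)
print(verify_token(token, now=1030) is not None)  # True: inside the 60-second lifetime
print(verify_token(token, now=1061) is not None)  # False: expired
```

A short lifetime bounds how long a leaked token is useful even before revocation is considered.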

Gateway — The Runtime Enforcer

The Gateway is the chokepoint. Every agent request to every protected resource passes through it. There is no side door.

When a request arrives, the Gateway performs three sequential checks:

Check 1 — Token present? No Authorization header → 401, event logged to Audit Ledger. Request dies here.

Check 2 — Token valid and not revoked? The Gateway verifies the cryptographic signature and simultaneously queries Redis for a revocation event. An invalid or revoked token → 401, logged. Request dies here.

Check 3 — Token scope covers this resource? Does the token's scope claim include the target resource with the required action? No match → 403, logged. Request dies here.

Only after all three pass does the Gateway forward the request upstream.
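The three checks map naturally onto a short function. This is a sketch, not the Gateway's code: `gateway_check`, the shape of the scope claim (a mapping from resource to allowed actions), and the dictionary-based stand-ins for signature verification and revocation lookup are all assumptions.

```python
from typing import Callable, Optional


def gateway_check(
    headers: dict,
    resource: str,
    action: str,
    verify: Callable[[str], Optional[dict]],
    is_revoked: Callable[[str], bool],
    audit: list,
) -> int:
    """Run the three checks in order; 200 means the request may be forwarded."""
    token = headers.get("Authorization")
    if not token:  # Check 1: token present?
        audit.append((401, "missing token"))
        return 401
    claims = verify(token)
    if claims is None or is_revoked(token):  # Check 2: valid and not revoked?
        audit.append((401, "invalid or revoked token"))
        return 401
    if action not in claims.get("scopes", {}).get(resource, []):  # Check 3: scope covers it?
        audit.append((403, "scope does not cover resource"))
        return 403
    audit.append((200, "forwarded"))
    return 200


# Stand-ins for signature verification and the revocation store.
KNOWN = {
    "tok-rw": {"scopes": {"lynx/mercury-bank": ["read", "write"]}},
    "tok-ro": {"scopes": {"lynx/mercury-bank": ["read"]}},
    "tok-revoked": {"scopes": {"lynx/mercury-bank": ["read", "write"]}},
}
REVOKED = {"tok-revoked"}
audit_log = []


def call(headers: dict, action: str) -> int:
    return gateway_check(headers, "lynx/mercury-bank", action, KNOWN.get, REVOKED.__contains__, audit_log)


print(call({}, "read"))                                   # 401
print(call({"Authorization": "tok-revoked"}, "read"))     # 401
print(call({"Authorization": "tok-ro"}, "write"))         # 403
print(call({"Authorization": "tok-rw"}, "write"))         # 200
```

Every branch appends to the audit list before returning, which matches the requirement that denials are logged as well as approvals.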

The Redis revocation check is what makes revocation genuinely real-time. When a grant is revoked by an administrator, the event propagates through Redis pub/sub to every connected Gateway instance within milliseconds. There is no window of continued access while waiting for token expiry. The agent is stopped immediately.
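The propagation pattern can be illustrated without a Redis server. The `RevocationBus` class below is an in-process stand-in for pub/sub, and `GatewayInstance` is a hypothetical name; neither uses the Redis client or reflects Caracal's channel layout.

```python
class RevocationBus:
    """In-process stand-in for a pub/sub channel."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, grant_id):
        # Deliver the revocation event to every subscribed gateway.
        for callback in self._subscribers:
            callback(grant_id)


class GatewayInstance:
    def __init__(self, name, bus):
        self.name = name
        self.revoked = set()
        bus.subscribe(self.revoked.add)  # each event lands in the local revoked set

    def accepts(self, grant_id):
        return grant_id not in self.revoked


bus = RevocationBus()
fleet = [GatewayInstance(f"gw-{i}", bus) for i in range(3)]

print(all(gw.accepts("grant-42") for gw in fleet))  # True
bus.publish("grant-42")                              # administrator revokes the grant
print(any(gw.accepts("grant-42") for gw in fleet))  # False
```

Because each gateway keeps its own revoked set, the per-request check stays a local lookup while the broadcast keeps the sets in step.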

Agent Coordinator — The Multi-Agent Brain

The Coordinator manages agent lifecycle and the delegation graph between agents. This is where Caracal's approach becomes genuinely novel compared to any access control system that came before it.

When an orchestrator spawns a sub-agent and delegates a task, the Coordinator issues the child a scoped token — a token carrying strictly fewer permissions than the parent. The child cannot expand its own scope. It cannot grant itself access to resources the parent did not have. This constraint is enforced at the cryptographic level in the token structure, not by convention or runtime checks that could be bypassed.

This directly addresses ASI03 from the OWASP Top 10 for Agentic Applications 2026: Identity and Privilege Abuse — the exploitation of excessive permissions or impersonation leading to privilege escalation in multi-agent pipelines.

The revocation cascade is equally critical. When an orchestrator's grant is revoked, every child and grandchild agent in the delegation tree loses authority simultaneously. There is no window. There is no waiting for token expiry across the tree. The entire subtree collapses.
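A sketch of the cascade over a delegation tree follows. The `DelegationGraph` class is a hypothetical illustration of the idea, assuming the Coordinator tracks parent-to-child edges; it is not the Coordinator's implementation.

```python
from collections import defaultdict


class DelegationGraph:
    def __init__(self):
        self.children = defaultdict(list)
        self.revoked = set()

    def delegate(self, parent: str, child: str) -> None:
        self.children[parent].append(child)

    def revoke(self, agent: str) -> list:
        """Revoke an agent and every descendant; return the agents newly affected."""
        affected = []
        stack = [agent]
        while stack:
            current = stack.pop()
            if current in self.revoked:
                continue
            self.revoked.add(current)
            affected.append(current)
            stack.extend(self.children[current])
        return sorted(affected)

    def has_authority(self, agent: str) -> bool:
        return agent not in self.revoked


graph = DelegationGraph()
graph.delegate("orchestrator", "payout-agent")
graph.delegate("orchestrator", "compliance-agent")
graph.delegate("payout-agent", "fx-agent")

print(graph.revoke("payout-agent"))             # ['fx-agent', 'payout-agent']
print(graph.has_authority("compliance-agent"))  # True: sibling subtree is untouched
print(graph.has_authority("fx-agent"))          # False
```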

Audit Ledger — The Court Recorder

The Audit Ledger records every single decision in the system: every token issuance, every gateway allow, every gateway deny, every delegation, every revocation. What makes it different from application logs is the underlying data structure.

In a Merkle tree, each audit entry is individually hashed. Pairs of hashes are combined into branch nodes and hashed again, propagating upward to a single root hash. If anyone modifies, deletes, or reorders any entry — even flipping a single bit — the root hash changes. The tampering is mathematically detectable without comparing every record individually.

This is the same cryptographic structure used in Bitcoin to verify transactions. Applied to an audit log, it means the record of what every AI agent did is not just stored; it is provable.
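A minimal Merkle root computation shows why modification, deletion, and reordering are all detectable. This is a generic sketch using SHA-256 from the Python standard library. The leaf and node prefixes and the handling of odd levels are choices made for this example and say nothing about how Caracal's ledger lays out its tree.

```python
import hashlib


def _leaf(entry: str) -> bytes:
    # A distinct prefix keeps leaf hashes from being confused with branch hashes.
    return hashlib.sha256(b"\x00" + entry.encode("utf-8")).digest()


def _node(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(entries: list) -> str:
    """Hash each entry, then combine pairs level by level until one root remains."""
    if not entries:
        return hashlib.sha256(b"").hexdigest()
    level = [_leaf(e) for e in entries]
    while len(level) > 1:
        paired = [_node(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            paired.append(level[-1])  # an unpaired node is carried up unchanged
        level = paired
    return level[0].hex()


log = [
    "token issued app=lynx",
    "gateway allow lynx/mercury-bank",
    "grant revoked app=lynx",
]
baseline = merkle_root(log)

edited = log.copy()
edited[1] = "gateway deny lynx/mercury-bank"

print(merkle_root(edited) == baseline)     # False: an entry was modified
print(merkle_root(log[:-1]) == baseline)   # False: an entry was deleted
print(merkle_root(log[::-1]) == baseline)  # False: entries were reordered
```

An auditor who holds only the baseline root can recompute it from the stored entries and detect any of these changes with a single comparison.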

Video: GitHub Open Source Friday, "Caracal: A runtime execution authority layer for AI agents in production." Presented by Garudex Labs.

Why This Architecture Matters in 2026

The timing of Caracal's design is not accidental. The OWASP Top 10 for Agentic Applications 2026, published in December 2025, formalized the threat taxonomy that has been building as AI agents move into production environments:

  • ASI01 (Goal Hijacking): agents manipulated into pursuing attacker objectives
  • ASI02 (Tool Misuse): agents exploited to perform unauthorized actions beyond intended scope
  • ASI03 (Identity and Privilege Abuse): excessive permissions and credential theft

Caracal's five-component architecture directly addresses the structural causes of all three:

No ambient authority. An agent cannot act simply because it exists. It must hold an explicitly issued, time-bound, scoped token for every action. There is no default access.

Delegation collapse. A compromised parent agent cannot be used to maintain child agent access. Revoke the parent, lose the entire subtree. Instantly.

Fail-closed enforcement. If the STS is unreachable, the Gateway blocks everything. Not a degraded mode. Not a fallback. A hard stop.

Cryptographic proof at every layer. Tokens are signed and verified, not trusted by convention. Audit entries are Merkle-chained, not just stored. Every claim is proven, not assumed.

Real-time revocation. Within milliseconds of a grant being revoked, every connected Gateway instance in the fleet stops accepting it. There is no waiting for the token to expire.

This is the difference between access control as a list of rules and access control as a continuous cryptographic proof of authorization required at every action boundary.

Explore Caracal

The full source code, architecture documentation, SDK references in TypeScript, Go, and Python, and a complete working demo application are available on GitHub:

github.com/Garudex-Labs/caracal

The repository includes the LynxCapital example, a fully autonomous financial agent swarm processing global payout cycles under live Caracal authority enforcement. It is the most concrete demonstration of what pre-execution authority enforcement looks like in a production-style scenario.


This is the first article in a series on Caracal. Coming next: a hands-on breakdown of the LynxCapital demo, covering how autonomous AI agents process a global payout cycle under live authority enforcement, what the real-time delegation graph looks like in the TUI, and what happens when an orchestrator is revoked mid-run.
