For FinOps and Engineering Leadership

Hound your AI bill.
Before it hounds you.

CapHound brings FinOps governance to AI spend. Attribute every dollar to the team, feature, or customer driving it. Enforce budgets in real time. End AI cost surprises at month-end.

No credit card · Up to $5K/mo on the free tier

CapHound·Control Center
Live

2 items need triage— click to jump to Open risk signals.

Decision engineToday
Blocked847
Modified2,340
Allowed38,412
Total spend · this month
Watch
$12,847
90% of $14,200
Top spend by feature
customer-support-chat
$5,142 · 40% · +18% PoP
Top spend by model
gpt-4o
54% of spend · +12% PoP
Recent decisionsStreaming
Blocked
gpt-4o → free-tier user
Budget exceeded · feature: chat
2s ago
Routed
gpt-4o → gpt-4o-mini
Dev environment · auto-downgrade
5s ago
Allowed
claude-3-5-sonnet
Customer: acme · within budget
9s ago
Blocked
gpt-4o → batch-job
Rate limit · staging
14s ago

Works with the providers you use

OpenAI
Anthropic
Gemini

The Decision Engine in production

Every LLM request flows through CapHound's decision engine.

Block, modify, or allow — recorded with full Access → Budget → Optimization reasoning. Customer-level decisions live in your dashboard; CapHound never aggregates tenant data publicly.

Decision engineA typical dayIllustrative
Blocked847
Modified2,340
Allowed38,412
Total41,599
Blockedbudget

growth-engineering hit its $3,000 monthly cap. Hard block at 429 — caller gets a clear error before any provider charge.

Modifiedoptimization

Workspace at 91% of budget. gpt-4o auto-downgraded to gpt-4o-mini for low-stakes traffic.

Allowedaccess

customer-support-chat within all guardrails. Logged with feature, customer, and team for chargeback.

AI costs don't drift.
They spike.

By the time the invoice arrives, the damage is already done.

4,000+calls / minute · undetected

Runaway requests, no early warning

A retry loop. A pagination bug. A misconfigured prompt. One mistake fires thousands of API calls before anyone sees it. Your first warning is the invoice.

cost overnight · zero alerts

Silent cost doubling on model swaps

An engineer swaps to a more capable model — for the right reasons. Per-request cost doubles overnight. No alert. No review. Just a bigger bill at month-end.

70%spend in one feature · unknown

Spend concentrated in one workflow

70% of your AI bill comes from a single feature or customer. Probably. You can't prove it — there's no per-feature attribution in your provider's dashboard.

0answers when finance asks

Finance asks. Engineering can't answer.

Your CFO sees a $40K AI line item and asks who's spending it. Engineering can't break it down. Both teams lose trust. The bill keeps growing.

Enforce. Attribute. Control.

CapHound doesn't just report on your AI usage — it governs it in real time.

Enforcement

Set hard limits per feature, team, or customer. When the budget hits, CapHound blocks the request. Not an alert — a block.

Attribution

Every request tagged by feature, team, environment, and customer. You know exactly what caused the spike — and who to talk to.

Routing

Define rules. CapHound routes automatically. Dev environments use cheap models. Free users hit the cheaper tier. No app changes.

Decision Engine

Every decision,
fully audited.

Watch each request flow through Access, Budget, and Optimization checks. See exactly what was blocked, what was modified, and why — with timestamps and reasoning that finance and engineering can both read.

Decision Log·Last 24h
Streaming
BLOCK
gpt-4o blocked for free-tier user · feature: chat
2s ago·$0.043 saved
Accessallow

API key validated

Budgetblock

Monthly limit exceeded · $5,000 / $5,000

Optimizationskipped

Pipeline halted at budget

Final action:BLOCKED
MODIFY
gpt-4o → gpt-4o-mini · dev environment · auto-downgrade
5s ago·$0.018 → $0.001
Accessallow

API key validated

Budgetallow

Within limit · $4,210 / $5,000

Optimizationmodify

Downgrade rule: env=dev → switch to gpt-4o-mini

Final action:MODIFIED
ALLOW
claude-3-5-sonnet · customer: acme-corp · feature: doc-analysis
9s ago·$0.062
Accessallow

API key validated · scope: production

Budgetallow

Within customer budget · $1,840 / $5,000

Optimizationallow

No routing rules matched

Final action:ALLOWED
Showing 3 of 1,247 decisionsOpen in Control Center
Integration

Drop in.
Two lines.

CapHound mirrors the OpenAI API exactly. Change two lines of config — your existing code stays the same.

main.py
Python · OpenAI SDK
1
from openai import OpenAI
2
 
3
client = OpenAI(
+
    api_key="caphound_live_...",
+
    base_url="https://api.caphound.ai/v1",
6
)
7
 
8
response = client.chat.completions.create(
9
    model="gpt-4o",
10
    messages=[...],
+
    extra_headers={"x-caphound-feature": "chat"},
12
)
2 lines required · 1 line optional+ 3 added · 0 removed

What changes

Two required lines. One optional.

1

Use your CapHound key

Replace your OpenAI API key with one CapHound generated for your workspace.

api_key="caphound_live_..."
2

Point base_url to CapHound

Your requests now route through us. We forward to OpenAI/Anthropic/Gemini behind the scenes.

base_url="https://api.caphound.ai/v1"
3

Tag with feature (optional)

Optional

Adds attribution metadata so you can break down spend by feature, team, or customer.

extra_headers={"x-caphound-feature": "chat"}

That's it. Same SDK calls. Same response shape. Same models. Zero refactor.

Python & Node.jsOpenAI-compatibleStreaming supported
Privacy by architecture

We never see
your prompts.

CapHound operates on metadata only — model, cost, tags, and decision context. Prompts, completions, and request bodies travel directly between your application and the LLM provider.

This isn't a policy. It's how the system is built.

No prompt content stored
Enforced by CI tests on every commit
No response body logged
Direct passthrough to your provider
Metadata only
Model, cost, tags, decision reasoning
Per-workspace isolation
Postgres RLS + app-level checks

Put your AI layer
under control.

Start free. Add enforcement as your spend grows. No credit card required.

Free tier · 100K events/moNo credit cardLive in 5 minutes