Technical Guide

One proxy between you and every AI

Learn exactly how ProxyGuard helps teams cut AI incidents, control spend, and ship faster with one gateway.

Chapter I

Foreword

Every team shipping AI features hits the same wall: growth creates chaos. One provider becomes several, API keys spread across services, and costs become hard to explain. Engineering loses velocity while finance loses confidence in the numbers.

ProxyGuard sits between your app and every provider as a single control layer. You keep your SDK and product logic, while ProxyGuard handles routing, spend policy, and request-level observability.

Swap your base URL. Keep your SDK. Gain complete control.

This guide shows how it works, what it unlocks, and how to deploy it quickly.

Chapter II

The Gateway Pattern

Instead of scattering API keys and provider logic across your codebase, ProxyGuard acts as a single gateway that all requests flow through. Your application talks to one endpoint; ProxyGuard handles the rest — routing, retries, logging, and cost tracking.

You gain reliability and control immediately, without a platform rewrite.

  • < 5ms latency overhead per request
  • 99.9% uptime availability SLA
  • 10+ providers supported today
// budget, alerts & tracking handled by proxy
import OpenAI from "openai"

const ai = new OpenAI({
  baseURL: "https://api.proxyguard.dev/v1",
  apiKey: process.env.PROXYGUARD_API_KEY,
})

export async function chat(msg: string) {
  return ai.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: msg }],
  })
}
// manual overspend protection
import Redis from "ioredis"
import OpenAI from "openai"

const redis = new Redis()
const DAILY_CAP = 50_00    // cents
const MONTHLY_CAP = 500_00 // cents

function sendAlert(msg: string) { /* wire to your alerting system */ }

async function checkBudget() {
  const today = new Date().toISOString().slice(0, 10)
  const month = today.slice(0, 7)
  const [d, m] = await Promise.all([
    redis.get(`spend:${today}`),
    redis.get(`spend:${month}`),
  ])
  const daily = parseInt(d ?? "0")
  const monthly = parseInt(m ?? "0")
  if (daily > DAILY_CAP) throw new Error("Daily budget exceeded")
  if (monthly > MONTHLY_CAP) throw new Error("Monthly budget exceeded")
  if (daily > DAILY_CAP * 0.9) sendAlert("90% daily")
}

async function trackSpend(cost: number) {
  const today = new Date().toISOString().slice(0, 10)
  const month = today.slice(0, 7)
  await redis.incrby(`spend:${today}`, cost)
  await redis.incrby(`spend:${month}`, cost)
  await redis.expire(`spend:${today}`, 86400)
}

export async function chat(msg: string) {
  await checkBudget()
  const ai = new OpenAI()
  const res = await ai.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: msg }],
  })
  const tokens = res.usage?.total_tokens ?? 0
  const cost = Math.ceil(tokens * 0.003) // rough estimate, in cents
  await trackSpend(cost)
  return res
}
Side by side: roughly 35 lines of manual budget plumbing without ProxyGuard, versus about 6 lines with it.

Because traffic flows through one gateway, you get a unified audit trail, consistent enforcement, and the freedom to switch providers without code churn.

Chapter III

Getting Started

Getting started takes three steps. No SDK migration, no lock-in, no new app-layer complexity.

01

Point your SDK

Replace your provider base URL with your ProxyGuard endpoint. Works with OpenAI-compatible SDKs in Python, Node, Go, or raw HTTP.

import OpenAI from "openai"

const client = new OpenAI({
  baseURL: "https://api.proxyguard.dev/v1",
  apiKey:  process.env.PROXYGUARD_API_KEY,
})
02

Configure your rules

Set spend caps, rate limits, and routing priorities in the dashboard. Define daily and monthly budgets per project and trigger alerts before overages.

// Budget rules are set in the dashboard,
// enforced at the proxy layer.
{
  "daily_budget":   50.00,
  "monthly_budget": 500.00,
  "rate_limit":     "100 req/min",
  "alert_at":       [75, 90]
}
03

Monitor everything

Capture full metadata for every request: tokens, latency, cost, provider, model, and status. Use live dashboards or export logs for compliance and reporting.
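As a sketch of what exported logs enable, here is a small helper that summarizes a batch of request records. The field names (`cost_usd`, `latency_ms`, `status`) are illustrative assumptions, not ProxyGuard's actual export schema.

```typescript
// Hypothetical shape of one exported request-log entry.
interface RequestLog {
  model: string
  provider: string
  tokens: number
  cost_usd: number
  latency_ms: number
  status: number
}

// Roll up total spend, average latency, and error rate for a batch.
export function summarize(logs: RequestLog[]) {
  const total = logs.reduce(
    (acc, l) => ({
      cost: acc.cost + l.cost_usd,
      latency: acc.latency + l.latency_ms,
      errors: acc.errors + (l.status >= 400 ? 1 : 0),
    }),
    { cost: 0, latency: 0, errors: 0 },
  )
  return {
    totalCostUsd: total.cost,
    avgLatencyMs: logs.length ? total.latency / logs.length : 0,
    errorRate: logs.length ? total.errors / logs.length : 0,
  }
}
```
The same rollup can feed a compliance report or a per-team chargeback sheet.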

Chapter IV

Built for Production

Everything needed to run AI reliably in production, from spend controls and routing to governance and security.

Smart Routing

Route requests by cost, latency, or availability. Automatic failover keeps your app running through provider incidents.

  • Cost-optimized routing across providers
  • Automatic failover and retries
  • Model aliasing — swap models without code changes
  • Provider-specific rate limit awareness
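The failover behavior above can be sketched in a few lines. This is a client-side model of what the gateway does server-side, using a made-up `Provider` shape rather than ProxyGuard's real routing API.

```typescript
// Try providers in priority order; fall back on failure.
type Provider = { name: string; call: (prompt: string) => Promise<string> }

export async function withFailover(
  providers: Provider[],
  prompt: string,
): Promise<{ provider: string; result: string }> {
  let lastErr: unknown
  for (const p of providers) {
    try {
      return { provider: p.name, result: await p.call(prompt) }
    } catch (err) {
      lastErr = err // provider down or rate-limited: try the next one
    }
  }
  throw lastErr
}
```
Ordering the `providers` array by cost or latency gives you cost-optimized or latency-optimized routing from the same loop.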

Cost Controls

Set budgets at project, team, or org level. Track spend in real time with configurable alert thresholds.

  • Daily and monthly spending caps
  • Per-project budget isolation
  • Automatic enforcement — hard or soft limits
  • Alerts at 75% and 90% of budget
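Hard versus soft limits can be pictured as a small policy function. The sketch below is illustrative (amounts in cents, thresholds mirroring the 75%/90% defaults above), not ProxyGuard's actual enforcement code.

```typescript
type Verdict = { allowed: boolean; alerts: number[] }

// A hard limit blocks requests at the cap; a soft limit only alerts.
export function enforceBudget(
  spentCents: number,
  capCents: number,
  hard: boolean,
): Verdict {
  const alerts = [75, 90].filter((pct) => spentCents >= (capCents * pct) / 100)
  const over = spentCents >= capCents
  return { allowed: !(hard && over), alerts }
}
```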

Real-Time Analytics

Use live dashboards with per-request detail. Track tokens, latency, error rates, and cost across every provider.

  • Per-request logging with full metadata
  • Token usage and cost breakdowns
  • Latency percentiles (p50, p95, p99)
  • Error rate monitoring and alerting
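Latency percentiles like those above are simple to compute from per-request samples. This nearest-rank sketch shows the idea, independent of ProxyGuard's actual analytics pipeline.

```typescript
// Nearest-rank percentile over raw latency samples (in ms).
export function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples")
  const sorted = [...samples].sort((a, b) => a - b)
  const rank = Math.ceil((p / 100) * sorted.length)
  return sorted[Math.max(0, rank - 1)]
}
```
Running it at p = 50, 95, and 99 over a window of request latencies yields the p50/p95/p99 figures shown in the dashboard.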

Security & Compliance

Keep provider keys out of client code with full audit trails, scoped access, and safe key rotation.

  • Secure key vault — provider keys never exposed
  • Full audit log for every request
  • API key rotation with zero downtime
  • Role-based access per project
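Zero-downtime rotation typically works by keeping two keys valid during a migration window, revoking the old one only after clients have switched. The class below is a minimal sketch of that idea, not ProxyGuard's implementation.

```typescript
// Two keys stay valid during rotation, so clients migrate gradually.
export class KeyRing {
  private active = new Set<string>()

  constructor(initialKey: string) {
    this.active.add(initialKey)
  }

  // Add a new key; the old key remains valid until explicitly revoked.
  rotate(newKey: string): void {
    this.active.add(newKey)
  }

  revoke(key: string): void {
    this.active.delete(key)
  }

  isValid(key: string): boolean {
    return this.active.has(key)
  }
}
```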
Chapter V

What Comes Next

Whether you are shipping your first AI feature or operating at enterprise scale, the gateway pattern gives your team reliable controls without slowing product velocity.

One URL change. That’s the distance between scattered keys and total visibility.

Put AI spend and reliability on autopilot

Route every model call through one policy layer for budgets, failover, and request-level analytics.