Instrumentation Guide

The Cloudshrink SDK wraps your existing LLM client. Token counts, costs, and latency are captured automatically; you just add two tags: project and actor.

Python SDK

from cloudshrink import wrap
import openai

# Wrap your client once — all calls are tracked automatically
client = wrap(openai.OpenAI(), project="doc-summarizer", actor="carol@acme.com")

# Use it exactly like before. Tokens, cost, and model are captured for you.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this document..."}],
)

# That's it. Check your dashboard.

Node.js SDK

import { wrap } from '@cloudshrink/sdk';
import OpenAI from 'openai';

// Wrap your client once
const openai = wrap(new OpenAI(), {
  project: 'doc-summarizer',
  actor: 'carol@acme.com',
});

// Use it exactly like before
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Summarize this document...' }],
});

// Done. Tokens, cost, model — all captured.

Per-call overrides

Tags set on the client are defaults. Override them per-call when needed:

# Python — override project and actor for a specific call
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[...],
    cloudshrink={"project": "search-reranker", "actor": "rerank-svc"},
)
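
The Node.js SDK presumably supports the same per-call override. A minimal sketch, assuming the wrapped client accepts the same cloudshrink option on the request that the Python example uses (this option name on the Node side is an assumption, not confirmed above):

// Node.js: hypothetical per-call override, assuming the wrapped client
// accepts the same `cloudshrink` option as the Python SDK.
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: '...' }],
  cloudshrink: { project: 'search-reranker', actor: 'rerank-svc' },
});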

REST API (advanced)

If you prefer raw HTTP, POST events directly after each LLM call:

POST /api/ingest
Content-Type: application/json

{
  "events": [{
    "provider": "OpenAI",
    "model": "GPT-4o",
    "input_tokens": 1250,
    "output_tokens": 430,
    "project": "doc-summarizer",
    "actor": "carol@acme.com",
    "env": "production"
  }]
}
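
As a concrete sketch, sending that payload from Node.js could look like the following. The base URL and the bearer-token Authorization header are assumptions; the guide above only specifies the /api/ingest path and the JSON body.

// Hypothetical example: POST a usage event with fetch (Node 18+).
// CLOUDSHRINK_URL and the bearer-token auth scheme are assumptions,
// not documented above.
const res = await fetch(`${process.env.CLOUDSHRINK_URL}/api/ingest`, {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${process.env.CLOUDSHRINK_API_KEY}`,
  },
  body: JSON.stringify({
    events: [{
      provider: 'openai',
      model: 'gpt-4o',
      input_tokens: 1250,
      output_tokens: 430,
      project: 'doc-summarizer',
      actor: 'carol@acme.com',
      env: 'production',
    }],
  }),
});
if (!res.ok) throw new Error(`ingest failed: ${res.status}`);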

Best Practices