
Kaizen SDK overview

Compress prompts, shrink latency, and decode large-model responses with the Kaizen Token Optimized Format (KTOF)—a lightweight layer that sits between your app and any LLM provider.
One-sentence promise · Kaizen detects structured data inside prompts, compresses it, and restores it losslessly so you spend fewer tokens without changing the way you build.

Key capabilities

  • Prompt compression – compress, optimize, and prompts_encode routes turn large JSON/chat payloads into compact KTOF strings with byte + token stats.
  • Response hydration – decompress, prompts_decode, and optimize_response rebuild the original structure—including metadata—for safer downstream handling (see the roundtrip sketch after this list).
  • Provider adapters – Thin wrappers for OpenAI, Anthropic, and Gemini keep your existing SDK code while adding transparent encode/decode hooks.
  • Observability hooks – Every response returns stats, optional token_stats, and echoes your metadata so you can track savings per request or workflow.
  • Enterprise-ready deployment – Default SaaS endpoint (https://api.getkaizen.io/) plus dedicated, self-hosted, or air‑gapped options on request.
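To make the encode/decode loop concrete, here is a minimal roundtrip sketch in Python. It assumes the async client is importable as kaizen_client.KaizenClient, that it reads KAIZEN_API_KEY from the environment, and that responses expose result and stats as attributes; none of those details are confirmed on this page, so treat the snippet as illustrative rather than canonical.

    import asyncio
    from kaizen_client import KaizenClient  # assumed import path for the kaizen-client package

    async def main() -> None:
        # Assumed constructor and context-manager behaviour; the released client may differ.
        async with KaizenClient() as client:  # picks up KAIZEN_API_KEY from the environment (assumption)
            payload = {"orders": [{"id": 1, "sku": "A-100", "qty": 3}] * 50}

            # Encode: structured data is compressed into a compact KTOF string.
            encoded = await client.prompts_encode(payload, metadata={"workflow": "demo"})
            print(encoded.stats)        # byte/token savings reported for this request
            print(encoded.result[:80])  # KTOF string you would send to the LLM

            # Decode: the original structure is restored losslessly.
            decoded = await client.prompts_decode(encoded.result)
            assert decoded.result == payload

    asyncio.run(main())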

Supported platforms

  • Python (kaizen-client ≥ 0.1.0) – fully typed async client used throughout this guide.
  • REST/OpenAPI – openapi.json ships with the repo for custom client generation (a raw httpx sketch follows this list).
  • Coming soon – JavaScript/TypeScript, Go, Java, and CLI tooling follow the same schema; join the preview at [email protected].
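If you generate a client from openapi.json or call the API directly, the request looks roughly like the httpx sketch below. The /v1/prompts_encode path, bearer-token auth, and request/response field names are assumptions inferred from the route names above; confirm everything against the shipped spec.

    import os
    import httpx

    def encode_via_rest(payload: dict) -> dict:
        # Path, auth scheme, and body shape are assumptions; openapi.json is authoritative.
        resp = httpx.post(
            "https://api.getkaizen.io/v1/prompts_encode",
            headers={"Authorization": f"Bearer {os.environ['KAIZEN_API_KEY']}"},
            json={"data": payload},
            timeout=30.0,
        )
        resp.raise_for_status()
        return resp.json()  # expected to include `result` plus a `stats` block

The Python client in the rest of this guide wraps these routes, so most users never need to call them directly.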

Versioning & support

  • Current SDK release: v0.1.0 (python/pyproject.toml), targeting Python 3.10+.
  • Kaizen API is versioned at /v1/...—backwards-compatible changes are additive; breaking changes trigger a new SDK minor version with migration notes.
  • Report issues via GitHub or email [email protected]. Enterprise customers receive dedicated support channels.

Typical workflow

  1. Install & configure – pip install kaizen-client[all], export KAIZEN_API_KEY, and optionally configure extras for OpenAI/Anthropic/Gemini.
  2. Encode before provider calls – run client.prompts_encode() or client.optimize_request() to generate a compressed payload plus stats.
  3. Send to any LLM – pass the returned result string to your provider SDK (or wrapper) like a normal prompt.
  4. Decode responses – hand completion.output_text (or equivalent) to client.prompts_decode() / client.optimize_response() to recover structured JSON.
  5. Track savings – log the byte/token deltas from the stats block, and forward them to observability tools for cost tracking.
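Putting the five steps together, an end-to-end sketch using the OpenAI Responses API might look like the following. The Kaizen method names come from the steps above; the kaizen_client import path, KaizenClient constructor, and the result/stats attributes are assumptions, so adjust them to the released client.

    import asyncio
    from kaizen_client import KaizenClient  # assumed import path for kaizen-client
    from openai import AsyncOpenAI

    async def run(payload: dict) -> dict:
        openai_client = AsyncOpenAI()         # reads OPENAI_API_KEY from the environment
        async with KaizenClient() as kaizen:  # reads KAIZEN_API_KEY from the environment (assumption)
            # Step 2: encode the structured payload into a compact KTOF string.
            encoded = await kaizen.prompts_encode(payload, metadata={"workflow": "orders"})

            # Step 3: send the KTOF string to the provider like a normal prompt.
            completion = await openai_client.responses.create(
                model="gpt-4o-mini",
                input=f"Summarise each order, replying in the same format:\n{encoded.result}",
            )

            # Step 4: decode the model's text output back into structured JSON.
            decoded = await kaizen.prompts_decode(completion.output_text)

            # Step 5: log the reported savings for cost tracking.
            print("encode stats:", encoded.stats)
            return decoded.result

    asyncio.run(run({"orders": [{"id": 1, "sku": "A-100", "qty": 3}]}))

The same pattern applies to the Anthropic and Gemini adapters: encode before the provider call, decode the text output afterwards.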
Head to Installation next if you want the exact commands, or skip to Quick Start for a runnable script.