Streaming endpoints are on the roadmap (docs/TODO.md). Until native streaming ships, split prompts into manageable sections and call prompts_encode per chunk.
Use the length_marker and delimiter options (see EncodeOptions) to mark chunk boundaries before sending the chunks to a provider that expects streaming tokens.
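The chunked workflow can be sketched as follows. Note that `prompts_encode` and `EncodeOptions` here are stand-ins written for illustration; the real Kaizen signatures and option defaults may differ.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class EncodeOptions:
    # Assumed option shapes, mirroring the length_marker + delimiter
    # options mentioned above; the real EncodeOptions may differ.
    length_marker: bool = True   # prefix each chunk with its byte length
    delimiter: str = "\x1e"      # record separator marking chunk boundaries

def prompts_encode(chunk: str, options: EncodeOptions) -> str:
    # Stand-in encoder: the real client call would return
    # provider-ready tokens rather than a marked-up string.
    prefix = f"{len(chunk)}:" if options.length_marker else ""
    return f"{prefix}{chunk}{options.delimiter}"

def encode_in_chunks(prompt: str, size: int, options: EncodeOptions) -> List[str]:
    """Split a long prompt into fixed-size sections and encode each one."""
    chunks = [prompt[i:i + size] for i in range(0, len(prompt), size)]
    return [prompts_encode(chunk, options) for chunk in chunks]
```

Each encoded chunk carries its own length marker and trailing delimiter, so a provider consuming the sequence can detect boundaries without native streaming support.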
If you need to apply custom logic before or after each call, wrap the client methods in your own helper, for example a decorator that logs stats, retries on HTTP 429 responses, or pushes metrics to Prometheus.
Keep middleware lightweight; heavy processing should remain in your application layer to avoid slowing Kaizen requests.
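A minimal retry-and-stats decorator in this spirit might look like the sketch below. `RateLimitError` is a hypothetical stand-in for whatever exception the client raises on a 429; substitute the real error type.

```python
import functools
import time

class RateLimitError(Exception):
    """Stand-in for the client's 429 error; the real exception type may differ."""
    status = 429

def with_retry(max_attempts: int = 3, backoff: float = 0.5):
    """Retry a client call on rate limits, logging timing for each success."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                start = time.monotonic()
                try:
                    result = fn(*args, **kwargs)
                    # Lightweight stat: one log line, no heavy processing here.
                    print(f"{fn.__name__} ok in {time.monotonic() - start:.3f}s")
                    return result
                except RateLimitError:
                    if attempt == max_attempts:
                        raise
                    # Exponential backoff between attempts.
                    time.sleep(backoff * 2 ** (attempt - 1))
        return wrapper
    return decorator
```

Applying `@with_retry()` to a client method keeps the retry policy in one place while the method body stays unchanged; heavier work (metrics aggregation, persistence) belongs in your application layer, as noted above.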