Python Client SDK
The minima_client package is the official thin, typed Python client for the hosted Minima API, plus an optional zero-code intake helper.
from minima_client import MinimaClient, AsyncMinimaClient, MinimaError, autocaptureClients
Section titled “Clients”Both MinimaClient (sync) and AsyncMinimaClient (async) share the same surface.
from minima_client import MinimaClient
with MinimaClient("https://api.minima.sh", api_key="mbt_…", timeout=10.0) as minima: rec = minima.recommend("Summarize this incident report into 3 bullets.", cost_quality_tradeoff=3) print(rec.recommended_model.model_id)base_url— the Minima API base URL (https://api.minima.sh).api_key— your Mubit API key (mbt_…), sent asAuthorization: Bearer <key>and passed through to Mubit. Required.timeout— HTTP timeout in seconds.
The async client mirrors every method with await; use it inside FastAPI/async apps:
async with AsyncMinimaClient("https://api.minima.sh", api_key="mbt_…") as minima: rec = await minima.recommend(task)recommend(task, *, ...)
Section titled “recommend(task, *, ...)”Returns a RecommendResponse.
rec = minima.recommend( task, # str | TaskInput | dict cost_quality_tradeoff=5.0, # 0..10 constraints=None, # Constraints | None user_id=None, namespace=None, allow_llm_escalation=True, explain=True,)task is flexible:
minima.recommend("plain prompt text") # strminima.recommend({"task": "…", "task_type": "code"}) # dict
from minima_client import TaskInput, Constraintsminima.recommend( TaskInput(task="…", difficulty="hard"), constraints=Constraints(min_quality=0.85, max_cost_per_call=0.02),)recommend_workflow(req)
Section titled “recommend_workflow(req)”Takes a WorkflowRequest, returns a WorkflowResponse.
from minima_client import WorkflowRequest, WorkflowStep, TaskInput
req = WorkflowRequest( steps=[ WorkflowStep(step_id="extract", task=TaskInput(task="Extract entities from …", task_type="extraction")), WorkflowStep(step_id="reason", task=TaskInput(task="Decide next action given …", task_type="reasoning", difficulty="hard")), ], cost_quality_tradeoff=4,)wf = minima.recommend_workflow(req)print(wf.total_est_cost_usd, "vs", wf.total_est_cost_if_all_premium)feedback(recommendation_id, chosen_model_id, outcome, **kwargs)
Section titled “feedback(recommendation_id, chosen_model_id, outcome, **kwargs)”Returns a FeedbackResponse. outcome is "success" | "partial" | "failure". Pass realized numbers to power the observed/rescaled cost tiers:
minima.feedback( rec.recommendation_id, rec.recommended_model.model_id, "success", quality_score=0.95, input_tokens=180, output_tokens=640, actual_cost_usd=0.0034, latency_ms=2100, verified_in_production=True, idempotency_key="…", # optional)models(...), strategies(...), health()
Section titled “models(...), strategies(...), health()”catalog = minima.models(provider="anthropic", max_cost=10.0) # ModelsResponsestrat = minima.strategies(namespace="team-payments", max_strategies=5)status = minima.health() # dictErrors
Section titled “Errors”Non-2xx responses raise MinimaError (which carries the problem+json detail):
from minima_client import MinimaError
try: rec = minima.recommend(task)except MinimaError as exc: # fall back to a default model ...Zero-code intake: autocapture
Section titled “Zero-code intake: autocapture”minima_client.autocapture is a thin wrapper over mubit.learn. Calling enable() pins a learn session to the same memory lane Minima recalls from (minima:<namespace>) and monkeypatches your OpenAI/Anthropic/LiteLLM/Google-GenAI clients, so every LLM call auto-ingests its trace — no code changes at the call site. Requires mubit-sdk.
from minima_client import autocapture
autocapture.enable( api_key="<mubit-key>", endpoint="https://api.mubit.ai", namespace="team-payments", user_id="svc-router",)
# ... your normal OpenAI/Anthropic/LiteLLM calls happen here, auto-captured ...
# mubit.learn does NOT fabricate a success signal — close the loop explicitly:autocapture.feedback(good=True) # or score in [-1, 1]autocapture.disable() # restore original client behaviorOther helpers:
autocapture.wrap(client) # enrich one client instead of global patchingautocapture.capture(messages, response) # manual ingest for raw HTTP / unsupported libsSee the zero-code intake example.