
Python Client SDK

The minima_client package is the official thin, typed Python client for the hosted Minima API, plus an optional zero-code intake helper.

from minima_client import MinimaClient, AsyncMinimaClient, MinimaError, autocapture

Both MinimaClient (sync) and AsyncMinimaClient (async) share the same surface.

from minima_client import MinimaClient
with MinimaClient("https://api.minima.sh", api_key="mbt_…", timeout=10.0) as minima:
    rec = minima.recommend("Summarize this incident report into 3 bullets.",
                           cost_quality_tradeoff=3)
    print(rec.recommended_model.model_id)
  • base_url — the Minima API base URL (https://api.minima.sh).
  • api_key — your Mubit API key (mbt_…), sent as Authorization: Bearer <key> and passed through to Mubit. Required.
  • timeout — HTTP timeout in seconds.

The async client mirrors every method with await; use it inside FastAPI/async apps:

async with AsyncMinimaClient("https://api.minima.sh", api_key="mbt_…") as minima:
    rec = await minima.recommend(task)

recommend() returns a RecommendResponse.

rec = minima.recommend(
    task,                       # str | TaskInput | dict
    cost_quality_tradeoff=5.0,  # 0..10
    constraints=None,           # Constraints | None
    user_id=None,
    namespace=None,
    allow_llm_escalation=True,
    explain=True,
)

task is flexible:

minima.recommend("plain prompt text") # str
minima.recommend({"task": "", "task_type": "code"}) # dict
from minima_client import TaskInput, Constraints
minima.recommend(
    TaskInput(task="", difficulty="hard"),
    constraints=Constraints(min_quality=0.85, max_cost_per_call=0.02),
)

recommend_workflow() takes a WorkflowRequest and returns a WorkflowResponse.

from minima_client import WorkflowRequest, WorkflowStep, TaskInput
req = WorkflowRequest(
    steps=[
        WorkflowStep(step_id="extract",
                     task=TaskInput(task="Extract entities from …",
                                    task_type="extraction")),
        WorkflowStep(step_id="reason",
                     task=TaskInput(task="Decide next action given …",
                                    task_type="reasoning", difficulty="hard")),
    ],
    cost_quality_tradeoff=4,
)
wf = minima.recommend_workflow(req)
print(wf.total_est_cost_usd, "vs", wf.total_est_cost_if_all_premium)
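
The two totals printed above can be turned into a savings figure. This is a small sketch under the assumption that both fields are floats in USD (the second field name carries no unit suffix, so that is a guess). `estimated_savings` is a hypothetical helper, and the numbers in the usage line are made up for illustration.

```python
def estimated_savings(total_est_cost_usd, total_est_cost_if_all_premium):
    """Return (absolute_usd, fraction) saved versus routing every step to premium."""
    if total_est_cost_if_all_premium <= 0:
        return 0.0, 0.0  # no baseline to compare against
    saved = total_est_cost_if_all_premium - total_est_cost_usd
    return saved, saved / total_est_cost_if_all_premium


# Illustrative numbers only; with a real response you would pass
# wf.total_est_cost_usd and wf.total_est_cost_if_all_premium.
saved_usd, saved_fraction = estimated_savings(0.25, 1.0)
print(f"saves ${saved_usd:.2f} ({saved_fraction:.0%})")
```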

feedback(recommendation_id, chosen_model_id, outcome, **kwargs)


Returns a FeedbackResponse. outcome is "success" | "partial" | "failure". Pass realized numbers to power the observed/rescaled cost tiers:

minima.feedback(
    rec.recommendation_id,
    rec.recommended_model.model_id,
    "success",
    quality_score=0.95,
    input_tokens=180,
    output_tokens=640,
    actual_cost_usd=0.0034,
    latency_ms=2100,
    verified_in_production=True,
    idempotency_key="",  # optional
)
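
The realized numbers passed to feedback() have to be measured around your own LLM call. One possible way to collect them is sketched below. `timed_call` and `build_feedback_kwargs` are hypothetical helpers, not part of minima_client. Treating latency_ms as an integer number of milliseconds is an assumption based on the example value above.

```python
import time


def timed_call(fn, *args, **kwargs):
    """Run fn and return (result, latency_ms) measured with a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    latency_ms = round((time.perf_counter() - start) * 1000)
    return result, latency_ms


def build_feedback_kwargs(input_tokens, output_tokens, latency_ms, actual_cost_usd=None):
    """Collect realized numbers, omitting anything that was not measured."""
    kwargs = {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "latency_ms": latency_ms,
    }
    if actual_cost_usd is not None:
        kwargs["actual_cost_usd"] = actual_cost_usd
    return kwargs


# The lambda is a stand-in for a real LLM call.
reply, latency_ms = timed_call(lambda prompt: prompt.upper(), "hello")
feedback_kwargs = build_feedback_kwargs(180, 640, latency_ms, actual_cost_usd=0.0034)
print(feedback_kwargs)
```

The resulting dict could then be splatted into the call shown above, for example `minima.feedback(rec.recommendation_id, rec.recommended_model.model_id, "success", **feedback_kwargs)`.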

Other client methods:

catalog = minima.models(provider="anthropic", max_cost=10.0)  # ModelsResponse
strat = minima.strategies(namespace="team-payments", max_strategies=5)
status = minima.health()  # dict

Non-2xx responses raise MinimaError (which carries the problem+json detail):

from minima_client import MinimaError
try:
    rec = minima.recommend(task)
except MinimaError as exc:
    # fall back to a default model
    ...

minima_client.autocapture is a thin wrapper over mubit.learn. Calling enable() pins a learn session to the same memory lane Minima recalls from (minima:<namespace>) and monkeypatches your OpenAI/Anthropic/LiteLLM/Google-GenAI clients, so every LLM call auto-ingests its trace — no code changes at the call site. Requires mubit-sdk.

from minima_client import autocapture
autocapture.enable(
    api_key="<mubit-key>",
    endpoint="https://api.mubit.ai",
    namespace="team-payments",
    user_id="svc-router",
)
# ... your normal OpenAI/Anthropic/LiteLLM calls happen here, auto-captured ...
# mubit.learn does NOT fabricate a success signal — close the loop explicitly:
autocapture.feedback(good=True) # or score in [-1, 1]
autocapture.disable() # restore original client behavior

Other helpers:

autocapture.wrap(client) # enrich one client instead of global patching
autocapture.capture(messages, response) # manual ingest for raw HTTP / unsupported libs

See the zero-code intake example.