Examples

Copy-paste recipes against the hosted API, from a single curl to a production routing wrapper. Every snippet is self-contained.

Exercise the core endpoints with nothing but curl and jq — the fastest way to confirm your key works.

# Service health (the only keyless endpoint)
curl -s https://api.minima.sh/v1/health | jq

# A recommendation
REC=$(curl -s https://api.minima.sh/v1/recommend \
  -H "authorization: Bearer $MUBIT_API_KEY" \
  -H 'content-type: application/json' \
  -d '{"task":{"task":"Classify this support ticket by urgency.","task_type":"classification"},
       "cost_quality_tradeoff":2}')
echo "$REC" | jq '.recommended_model.model_id, .decision_basis, .recommendation_id'

# Close the loop
curl -s https://api.minima.sh/v1/feedback \
  -H "authorization: Bearer $MUBIT_API_KEY" \
  -H 'content-type: application/json' \
  -d "{\"recommendation_id\":$(echo "$REC" | jq .recommendation_id),
       \"chosen_model_id\":\"claude-haiku-4-5\",\"outcome\":\"success\",\"quality_score\":0.92}" | jq

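The feedback call above splices jq output into a hand-escaped JSON string, which is easy to get wrong once values contain quotes. As a minimal sketch, the same request body can be built in Python with json.dumps. The field names are taken from the curl example; build_feedback_body is a hypothetical helper (not part of any SDK), and the sample response is a made-up stand-in for a parsed /v1/recommend reply.

```python
import json


def build_feedback_body(rec: dict, chosen_model_id: str, outcome: str,
                        quality_score: float) -> str:
    """Serialize a /v1/feedback body from a parsed /v1/recommend response.

    Hypothetical helper: it only mirrors the fields used in the curl example.
    """
    body = {
        "recommendation_id": rec["recommendation_id"],
        "chosen_model_id": chosen_model_id,
        "outcome": outcome,
        "quality_score": quality_score,
    }
    return json.dumps(body)


# Made-up stand-in for a parsed response; the id value is illustrative only.
rec = {"recommendation_id": "rec_example"}
body = build_feedback_body(rec, "claude-haiku-4-5", "success", 0.92)
print(body)
```

The resulting string can be passed to curl with -d, or sent with any HTTP client, without manual escaping.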
The whole value loop with the Python SDK: recommend → (you run the model) → feedback. Report realized tokens and cost so the cost ranking sharpens.

from minima_client import MinimaClient

with MinimaClient("https://api.minima.sh", api_key="mbt_…") as minima:
    rec = minima.recommend(
        "Summarize this incident report into 3 bullet points.",
        cost_quality_tradeoff=3,
    )
    print(rec.recommended_model.model_id, rec.decision_basis)

    # ... run rec.recommended_model.model_id in your own stack ...

    minima.feedback(
        rec.recommendation_id,
        rec.recommended_model.model_id,
        "success",
        quality_score=0.95,
        input_tokens=1180,
        output_tokens=320,
        actual_cost_usd=0.0028,
        verified_in_production=True,
    )

Hard constraints (provider whitelist, quality floor, cost ceiling, deny-list) restrict which models are eligible at all. Sweeping cost_quality_tradeoff from 0 to 10 then shows Minima walking the cost-vs-quality frontier for the same task.

from minima_client import MinimaClient, TaskInput, Constraints

task = TaskInput(task="Refactor this 200-line module for readability.", task_type="code")

with MinimaClient("https://api.minima.sh", api_key="mbt_…") as minima:
    rec = minima.recommend(
        task,
        constraints=Constraints(
            allowed_providers=["anthropic", "google"],
            min_quality=0.8,
            max_cost_per_call=0.02,
        ),
    )
    print("constrained pick:", rec.recommended_model.model_id)

    for cq in (0, 5, 10):
        r = minima.recommend(task, cost_quality_tradeoff=cq)
        print(f"slider {cq:>2}: {r.recommended_model.model_id} "
              f"~${r.recommended_model.est_cost_usd:.4f}")
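To make the distinction concrete, here is a toy, offline sketch of the difference between a hard constraint (a filter applied before ranking) and a trade-off preference (a weight applied during ranking). It is not Minima's algorithm, which this page does not describe. The catalog, model names, scores, and the linear scoring rule are all invented for illustration, and how the 0 to 10 slider maps onto a weight like quality_weight is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    model_id: str
    provider: str
    quality: float   # made-up score in [0, 1]
    cost_usd: float  # made-up per-call cost


# Hypothetical catalog; none of these values come from Minima.
CATALOG = [
    Candidate("small-a", "anthropic", 0.78, 0.002),
    Candidate("mid-g", "google", 0.85, 0.008),
    Candidate("large-a", "anthropic", 0.95, 0.030),
    Candidate("mid-o", "openai", 0.88, 0.010),
]


def eligible(cands, allowed_providers, min_quality, max_cost_per_call):
    """Hard constraints: drop every candidate that violates any limit."""
    return [c for c in cands
            if c.provider in allowed_providers
            and c.quality >= min_quality
            and c.cost_usd <= max_cost_per_call]


def pick(cands, quality_weight):
    """Toy ranking: quality_weight=0 is cost only, 1 is quality only."""
    q_lo = min(c.quality for c in cands)
    c_lo = min(c.cost_usd for c in cands)
    # Fall back to 1.0 when all candidates tie, to avoid dividing by zero.
    q_span = (max(c.quality for c in cands) - q_lo) or 1.0
    c_span = (max(c.cost_usd for c in cands) - c_lo) or 1.0

    def score(c):
        q = (c.quality - q_lo) / q_span
        cost = (c.cost_usd - c_lo) / c_span
        return quality_weight * q - (1 - quality_weight) * cost

    return max(cands, key=score)


allowed = eligible(CATALOG, ["anthropic", "google"], 0.8, 0.02)
print("constrained pick:", pick(allowed, 0.5).model_id)
for w in (0.0, 0.5, 1.0):
    print(f"weight {w}: {pick(CATALOG, w).model_id}")
```

With this catalog only one candidate survives the constraints, so the weight no longer matters for the constrained pick, while the unconstrained sweep moves from the cheapest candidate to the highest-quality one.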

POST /v1/recommend/workflow routes each step of a pipeline independently — a cheap model for classify/extract, a stronger one for the hard reasoning step — and reports total cost versus the all-premium baseline. Each step gets its own recommendation_id for per-step feedback.

from minima_client import MinimaClient, WorkflowRequest, WorkflowStep, TaskInput

req = WorkflowRequest(
    steps=[
        WorkflowStep(step_id="extract",
                     task=TaskInput(task="Extract entities from the email.",
                                    task_type="extraction")),
        WorkflowStep(step_id="reason",
                     task=TaskInput(task="Decide the next action given the entities.",
                                    task_type="reasoning", difficulty="hard")),
    ],
    cost_quality_tradeoff=4,
)

with MinimaClient("https://api.minima.sh", api_key="mbt_…") as minima:
    wf = minima.recommend_workflow(req)

for step in wf.steps:
    print(step.step_id, "", step.recommendation.recommended_model.model_id)
print(f"total ${wf.total_est_cost_usd:.4f} vs all-premium ${wf.total_est_cost_if_all_premium:.4f}")
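The two totals printed above are easier to compare as a single savings figure. A minimal sketch, assuming only the two fields shown in the example (total_est_cost_usd and total_est_cost_if_all_premium); workflow_savings is a hypothetical helper and the sample amounts are made up.

```python
def workflow_savings(total_est_cost_usd: float,
                     total_est_cost_if_all_premium: float) -> tuple[float, float]:
    """Return (absolute saving in USD, saving as a percentage of the baseline)."""
    saved = total_est_cost_if_all_premium - total_est_cost_usd
    if total_est_cost_if_all_premium <= 0:
        # No meaningful baseline to compare against.
        return saved, 0.0
    return saved, 100.0 * saved / total_est_cost_if_all_premium


# Made-up totals; in practice pass wf.total_est_cost_usd and
# wf.total_est_cost_if_all_premium from the workflow response.
saved, pct = workflow_savings(0.0042, 0.0180)
print(f"saved ${saved:.4f} ({pct:.1f}% below the all-premium baseline)")
```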

minima_client.autocapture auto-captures your existing LLM calls into Minima’s memory with no call-site changes — useful for backfilling history from traffic you already run. It needs a Mubit key for the underlying memory.

from minima_client import autocapture
autocapture.enable(api_key="<mubit-key>", endpoint="https://api.mubit.ai",
                   namespace="team-payments", user_id="svc-router")
# ... your normal OpenAI / Anthropic / LiteLLM / Gemini calls run here, auto-captured ...
autocapture.feedback(good=True) # learn does NOT fabricate a success signal — close it explicitly
autocapture.disable()

The shape you’d ship: recommend a model, run it via the official Anthropic SDK (streaming, real token usage), then feed the realized cost/quality back.

import anthropic
from minima_client import AsyncMinimaClient


async def routed_call(minima: AsyncMinimaClient, client: anthropic.AsyncAnthropic,
                      prompt: str, *, cost_quality_tradeoff: float = 4) -> str:
    rec = await minima.recommend(prompt, cost_quality_tradeoff=cost_quality_tradeoff)
    model = rec.recommended_model.model_id

    async with client.messages.stream(
        model=model, max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    ) as stream:
        msg = await stream.get_final_message()

    text = "".join(b.text for b in msg.content if b.type == "text")
    cost = rec.recommended_model.est_cost_usd  # or compute from msg.usage + your price table

    await minima.feedback(
        rec.recommendation_id, model, "success",
        quality_score=0.95,
        input_tokens=msg.usage.input_tokens,
        output_tokens=msg.usage.output_tokens,
        actual_cost_usd=cost,
        verified_in_production=True,
    )
    return text
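The comment in the wrapper mentions computing the cost from msg.usage and your own price table instead of echoing the estimate back. One way to sketch that: realized_cost_usd and PRICES_PER_MTOK below are hypothetical names, and the prices are placeholders to be replaced with your provider's current rates. The sketch ignores cache and batch discounts.

```python
from typing import Optional

# Placeholder USD prices per million tokens: (input, output).
# These numbers are illustrative only, not real provider pricing.
PRICES_PER_MTOK = {
    "example-small": (1.00, 5.00),
}


def realized_cost_usd(model_id: str, input_tokens: int, output_tokens: int,
                      prices: dict = PRICES_PER_MTOK) -> Optional[float]:
    """Compute the realized cost of one call, or None if the model is unpriced."""
    if model_id not in prices:
        return None
    in_price, out_price = prices[model_id]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


cost = realized_cost_usd("example-small", 1180, 320)
print(cost)
```

Inside routed_call, this could replace the cost line: call it with model, msg.usage.input_tokens and msg.usage.output_tokens, and fall back to rec.recommended_model.est_cost_usd when it returns None.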