Examples
Copy-paste recipes against the hosted API, from a single curl to a production routing wrapper. Every snippet is self-contained.
1. Quickstart with curl
Exercise the core endpoints with nothing but curl and jq. It is the fastest way to confirm your key works.
    # Service health (the only keyless endpoint)
    curl -s https://api.minima.sh/v1/health | jq

    # A recommendation
    REC=$(curl -s https://api.minima.sh/v1/recommend \
      -H "authorization: Bearer $MUBIT_API_KEY" \
      -H 'content-type: application/json' \
      -d '{"task":{"task":"Classify this support ticket by urgency.","task_type":"classification"}, "cost_quality_tradeoff":2}')
    echo "$REC" | jq '.recommended_model.model_id, .decision_basis, .recommendation_id'

    # Close the loop
    curl -s https://api.minima.sh/v1/feedback \
      -H "authorization: Bearer $MUBIT_API_KEY" \
      -H 'content-type: application/json' \
      -d "{\"recommendation_id\":$(echo "$REC" | jq .recommendation_id), \"chosen_model_id\":\"claude-haiku-4-5\",\"outcome\":\"success\",\"quality_score\":0.92}" | jq

2. The core loop
The whole value loop with the Python SDK: recommend → (you run the model) → feedback. Report realized tokens and cost so the cost ranking sharpens.
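The `actual_cost_usd` value reported in feedback has to be computed on your side. A minimal sketch of one way to derive it from realized token counts, assuming a per-million-token price table that you maintain yourself. The `PRICES_PER_MTOK` table, the `example-model` name, and its prices are hypothetical placeholders, not real pricing:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """USD per one million tokens (the values used below are hypothetical)."""
    input_per_mtok: float
    output_per_mtok: float


# Hypothetical price table: replace with your providers' real prices.
PRICES_PER_MTOK = {
    "example-model": Price(input_per_mtok=1.00, output_per_mtok=5.00),
}


def realized_cost_usd(model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one call, computed from its realized token counts."""
    price = PRICES_PER_MTOK[model_id]
    return (
        input_tokens * price.input_per_mtok
        + output_tokens * price.output_per_mtok
    ) / 1_000_000


cost = realized_cost_usd("example-model", input_tokens=1180, output_tokens=320)
print(f"{cost:.4f}")  # 0.0028
```

The result would be passed as `actual_cost_usd` in the feedback call of the SDK snippet that follows.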
    from minima_client import MinimaClient

    with MinimaClient("https://api.minima.sh", api_key="mbt_…") as minima:
        rec = minima.recommend(
            "Summarize this incident report into 3 bullet points.",
            cost_quality_tradeoff=3,
        )
        print(rec.recommended_model.model_id, rec.decision_basis)

        # ... run rec.recommended_model.model_id in your own stack ...

        minima.feedback(
            rec.recommendation_id,
            rec.recommended_model.model_id,
            "success",
            quality_score=0.95,
            input_tokens=1180,
            output_tokens=320,
            actual_cost_usd=0.0028,
            verified_in_production=True,
        )

3. Constraints and the slider
This recipe combines hard Constraints (provider whitelist, quality floor, cost ceiling, deny-list) with a sweep of cost_quality_tradeoff from 0 to 10, so you can watch Minima walk the cost-vs-quality frontier for the same task.
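As a mental model only, hard constraints can be read as a filter and the slider as a weight between cost and quality. The sketch below is not Minima's ranking logic, which this page does not describe. The `Candidate` type, the candidate list, the `denied` parameter, and the linear scoring are all assumptions made up for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    model_id: str
    provider: str
    quality: float   # 0..1, higher is better (made-up numbers)
    cost_usd: float  # estimated cost per call (made-up numbers)


# Hypothetical catalogue, invented for this example.
CANDIDATES = [
    Candidate("small", "anthropic", 0.80, 0.002),
    Candidate("medium", "google", 0.90, 0.003),
    Candidate("large", "anthropic", 0.95, 0.018),
    Candidate("huge", "openai", 0.97, 0.060),
]


def apply_constraints(cands, allowed_providers, min_quality, max_cost_per_call, denied=()):
    """Hard constraints: drop every candidate that violates any of them."""
    return [
        c for c in cands
        if c.provider in allowed_providers
        and c.quality >= min_quality
        and c.cost_usd <= max_cost_per_call
        and c.model_id not in denied
    ]


def pick(cands, cost_quality_tradeoff):
    """Toy slider: 0 weighs only cost, 10 weighs only quality."""
    w = cost_quality_tradeoff / 10
    max_cost = max(c.cost_usd for c in cands)
    # Normalise cost to 0..1 so both terms are on a comparable scale.
    return max(cands, key=lambda c: w * c.quality - (1 - w) * c.cost_usd / max_cost)


eligible = apply_constraints(CANDIDATES, {"anthropic", "google"}, 0.8, 0.02)
picks = {cq: pick(eligible, cq).model_id for cq in (0, 5, 10)}
print(picks)  # {0: 'small', 5: 'medium', 10: 'large'}
```

In this toy model the constraints remove `huge` before the slider is applied, so no slider value can bring it back. The SDK call that expresses the same idea against the hosted API follows.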
    from minima_client import MinimaClient, TaskInput, Constraints

    task = TaskInput(task="Refactor this 200-line module for readability.", task_type="code")

    with MinimaClient("https://api.minima.sh", api_key="mbt_…") as minima:
        rec = minima.recommend(
            task,
            constraints=Constraints(
                allowed_providers=["anthropic", "google"],
                min_quality=0.8,
                max_cost_per_call=0.02,
            ),
        )
        print("constrained pick:", rec.recommended_model.model_id)

        for cq in (0, 5, 10):
            r = minima.recommend(task, cost_quality_tradeoff=cq)
            print(f"slider {cq:>2}: {r.recommended_model.model_id} "
                  f"~${r.recommended_model.est_cost_usd:.4f}")

4. Multi-step workflow
POST /v1/recommend/workflow routes each step of a pipeline independently (a cheap model for classify/extract, a stronger one for the hard reasoning step) and reports total cost versus the all-premium baseline. Each step gets its own recommendation_id for per-step feedback.
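Because each step carries its own recommendation_id, feedback can be sent step by step once the pipeline has run. A sketch of such a helper, shown against stand-in objects so it runs without the SDK. The attribute path `step.recommendation.recommendation_id` and the argument order of `feedback(...)` are assumptions mirrored from the single-call example, and `send_step_feedback`, `FakeClient`, and `fake_step` are hypothetical names:

```python
from types import SimpleNamespace


def send_step_feedback(minima, wf, quality_by_step):
    """Send one feedback call per workflow step that has a quality score."""
    sent = []
    for step in wf.steps:
        if step.step_id not in quality_by_step:
            continue  # no signal for this step yet; skip rather than guess
        rec = step.recommendation
        minima.feedback(
            rec.recommendation_id,  # assumed attribute, see lead-in
            rec.recommended_model.model_id,
            "success",
            quality_score=quality_by_step[step.step_id],
        )
        sent.append(step.step_id)
    return sent


class FakeClient:
    """Stand-in that records feedback calls instead of hitting the API."""

    def __init__(self):
        self.calls = []

    def feedback(self, recommendation_id, chosen_model_id, outcome, **fields):
        self.calls.append((recommendation_id, chosen_model_id, outcome, fields))


def fake_step(step_id, rec_id, model_id):
    """Build a stub shaped like one workflow step in the response."""
    rec = SimpleNamespace(
        recommendation_id=rec_id,
        recommended_model=SimpleNamespace(model_id=model_id),
    )
    return SimpleNamespace(step_id=step_id, recommendation=rec)


fake_wf = SimpleNamespace(steps=[
    fake_step("extract", "rec-1", "cheap-model"),
    fake_step("reason", "rec-2", "strong-model"),
])
client = FakeClient()
sent = send_step_feedback(client, fake_wf, {"extract": 0.9, "reason": 0.7})
print(sent)  # ['extract', 'reason']
```

With the real SDK, the helper would be called with the client and the `wf` object returned by `recommend_workflow` in the snippet that follows.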
    from minima_client import MinimaClient, WorkflowRequest, WorkflowStep, TaskInput

    req = WorkflowRequest(
        steps=[
            WorkflowStep(
                step_id="extract",
                task=TaskInput(task="Extract entities from the email.", task_type="extraction"),
            ),
            WorkflowStep(
                step_id="reason",
                task=TaskInput(
                    task="Decide the next action given the entities.",
                    task_type="reasoning",
                    difficulty="hard",
                ),
            ),
        ],
        cost_quality_tradeoff=4,
    )

    with MinimaClient("https://api.minima.sh", api_key="mbt_…") as minima:
        wf = minima.recommend_workflow(req)
        for step in wf.steps:
            print(step.step_id, "→", step.recommendation.recommended_model.model_id)
        print(f"total ${wf.total_est_cost_usd:.4f} vs all-premium ${wf.total_est_cost_if_all_premium:.4f}")

5. Zero-code intake
minima_client.autocapture captures your existing LLM calls into Minima’s memory with no call-site changes, which is useful for backfilling history from traffic you already run. It needs a Mubit key for the underlying memory.
    from minima_client import autocapture

    autocapture.enable(
        api_key="<mubit-key>",
        endpoint="https://api.mubit.ai",
        namespace="team-payments",
        user_id="svc-router",
    )

    # ... your normal OpenAI / Anthropic / LiteLLM / Gemini calls run here, auto-captured ...

    autocapture.feedback(good=True)  # learn does NOT fabricate a success signal; close it explicitly
    autocapture.disable()

6. Production routing wrapper
The shape you’d ship: recommend a model, run it via the official Anthropic SDK (streaming, real token usage), then feed the realized cost/quality back.
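The wrapper in this recipe always reports "success". In production you would probably also want a failed model call to reach Minima as a negative signal. A minimal sketch of that pattern, run here against stand-in objects. The outcome string "failure" is an assumption (this page only shows "success"; check the API Reference for the accepted values), and `call_with_feedback`, `FakeMinima`, and `run_model` are hypothetical names:

```python
import asyncio
from types import SimpleNamespace


async def call_with_feedback(minima, rec, run_model):
    """Run the model and report the outcome, including failures."""
    model = rec.recommended_model.model_id
    try:
        text = await run_model(model)
    except Exception:
        # "failure" is an assumed outcome value, not confirmed by this page.
        await minima.feedback(rec.recommendation_id, model, "failure")
        raise  # let the caller decide how to handle the error
    await minima.feedback(rec.recommendation_id, model, "success")
    return text


class FakeMinima:
    """Stand-in that records feedback calls instead of hitting the API."""

    def __init__(self):
        self.calls = []

    async def feedback(self, recommendation_id, chosen_model_id, outcome, **fields):
        self.calls.append((recommendation_id, chosen_model_id, outcome))


async def demo():
    minima = FakeMinima()
    rec = SimpleNamespace(
        recommendation_id="rec-1",
        recommended_model=SimpleNamespace(model_id="some-model"),
    )

    async def ok(model):
        return "hello"

    async def boom(model):
        raise RuntimeError("provider error")

    text = await call_with_feedback(minima, rec, ok)
    try:
        await call_with_feedback(minima, rec, boom)
    except RuntimeError:
        pass  # the error still propagates after feedback is sent
    return text, [outcome for _, _, outcome in minima.calls]


result = asyncio.run(demo())
print(result)  # ('hello', ['success', 'failure'])
```

Re-raising after sending feedback keeps the wrapper transparent to callers: they see the same exception they would have seen without routing.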
    import anthropic
    from minima_client import AsyncMinimaClient


    async def routed_call(
        minima: AsyncMinimaClient,
        client: anthropic.AsyncAnthropic,
        prompt: str,
        *,
        cost_quality_tradeoff: float = 4,
    ) -> str:
        # Note: the recommended model is sent to the Anthropic SDK as-is, so this
        # assumes Minima returns an Anthropic model (see allowed_providers in recipe 3).
        rec = await minima.recommend(prompt, cost_quality_tradeoff=cost_quality_tradeoff)
        model = rec.recommended_model.model_id

        async with client.messages.stream(
            model=model,
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        ) as stream:
            msg = await stream.get_final_message()

        text = "".join(b.text for b in msg.content if b.type == "text")
        cost = rec.recommended_model.est_cost_usd  # or compute from msg.usage + your price table
        await minima.feedback(
            rec.recommendation_id,
            model,
            "success",
            quality_score=0.95,  # placeholder; substitute your own quality measurement
            input_tokens=msg.usage.input_tokens,
            output_tokens=msg.usage.output_tokens,
            actual_cost_usd=cost,
            verified_in_production=True,
        )
        return text

Where to go next
- The schemas behind every field: API Reference.
- Why the cost numbers move the way they do: Concepts → Cost-basis tiers.
- The full client surface: Python Client SDK.