Examples

Copy-paste recipes against the hosted API, from a single curl to a production routing wrapper. Every snippet is self-contained.

1. Quickstart with curl

Exercise the core endpoints with nothing but curl and jq — the fastest way to confirm your key works.

# Service health (the only keyless endpoint)
curl -s https://api.minima.sh/v1/health | jq

# A recommendation
REC=$(curl -s https://api.minima.sh/v1/recommend \
  -H "authorization: Bearer $MUBIT_API_KEY" \
  -H 'content-type: application/json' \
  -d '{"task":{"task":"Classify this support ticket by urgency.","task_type":"classification"},
       "cost_quality_tradeoff":2}')
echo "$REC" | jq '.recommended_model.model_id, .decision_basis, .recommendation_id'

# Close the loop
curl -s https://api.minima.sh/v1/feedback \
  -H "authorization: Bearer $MUBIT_API_KEY" \
  -H 'content-type: application/json' \
  -d "{\"recommendation_id\":$(echo "$REC" | jq .recommendation_id),
       \"chosen_model_id\":\"claude-haiku-4-5\",\"outcome\":\"success\",\"quality_score\":0.92}" | jq
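
The escaped quotes in the feedback call are easy to get wrong by hand. As a minimal sketch, the same two request bodies can be built with Python's standard json module. The helper names recommend_body and feedback_body are hypothetical, and the fields are only those that appear in the curl commands above.

```python
import json

def recommend_body(task: str, task_type: str, cost_quality_tradeoff: float) -> str:
    """Serialize a /v1/recommend body using the fields from the curl example."""
    return json.dumps({
        "task": {"task": task, "task_type": task_type},
        "cost_quality_tradeoff": cost_quality_tradeoff,
    })

def feedback_body(recommendation_id, chosen_model_id: str,
                  outcome: str, quality_score: float) -> str:
    """Serialize a /v1/feedback body; json.dumps takes care of all quoting."""
    return json.dumps({
        "recommendation_id": recommendation_id,  # passed through as received
        "chosen_model_id": chosen_model_id,
        "outcome": outcome,
        "quality_score": quality_score,
    })

if __name__ == "__main__":
    print(recommend_body("Classify this support ticket by urgency.",
                         "classification", 2))
    print(feedback_body("<recommendation_id from the response>",
                        "claude-haiku-4-5", "success", 0.92))
```

Either string can be sent as the -d payload of the matching curl command, or as the body of any HTTP client request.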

2. The core loop

The whole value loop with the Python SDK: recommend → (you run the model) → feedback. Report realized tokens and cost so the cost ranking sharpens.

from minima_client import MinimaClient

with MinimaClient("https://api.minima.sh", api_key="mbt_…") as minima:
    rec = minima.recommend(
        "Summarize this incident report into 3 bullet points.",
        cost_quality_tradeoff=3,
    )
    print(rec.recommended_model.model_id, rec.decision_basis)

    # ... run rec.recommended_model.model_id in your own stack ...

    minima.feedback(
        rec.recommendation_id,
        rec.recommended_model.model_id,
        "success",
        quality_score=0.95,
        input_tokens=1180,
        output_tokens=320,
        actual_cost_usd=0.0028,
        verified_in_production=True,
    )
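
The actual_cost_usd value has to come from somewhere. One way to derive it, sketched here under assumptions, is to multiply the realized token counts by per-million-token prices. The function name and the prices are hypothetical placeholders, not any provider's published rates.

```python
def realized_cost_usd(input_tokens: int, output_tokens: int,
                      usd_per_mtok_in: float, usd_per_mtok_out: float) -> float:
    """Cost of one call given token counts and prices per million tokens."""
    return (input_tokens * usd_per_mtok_in
            + output_tokens * usd_per_mtok_out) / 1_000_000

if __name__ == "__main__":
    # Token counts from the example above; prices are made up for illustration.
    cost = realized_cost_usd(1180, 320, usd_per_mtok_in=1.00, usd_per_mtok_out=5.00)
    print(f"{cost:.5f}")  # prints 0.00278
```

Keeping the price table in your own code lets the reported cost reflect negotiated or cached-token rates that a generic estimate would not know about.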

3. Constraints and the slider

Apply hard Constraints (provider allow-list, quality floor, cost ceiling, deny-list), then sweep cost_quality_tradeoff from 0 to 10 to watch Minima walk the cost-vs-quality frontier for the same task.

from minima_client import MinimaClient, TaskInput, Constraints

task = TaskInput(task="Refactor this 200-line module for readability.", task_type="code")

with MinimaClient("https://api.minima.sh", api_key="mbt_…") as minima:
    rec = minima.recommend(
        task,
        constraints=Constraints(
            allowed_providers=["anthropic", "google"],
            min_quality=0.8,
            max_cost_per_call=0.02,
        ),
    )
    print("constrained pick:", rec.recommended_model.model_id)

    for cq in (0, 5, 10):
        r = minima.recommend(task, cost_quality_tradeoff=cq)
        print(f"slider {cq:>2}: {r.recommended_model.model_id}  "
              f"~${r.recommended_model.est_cost_usd:.4f}")
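
To build intuition for how hard constraints and a slider interact, here is a toy model. It is not Minima's scoring. It assumes, purely for illustration, that 0 weights cost fully and 10 weights quality fully (the text above does not state the direction), and every model name, quality score, and cost in it is made up.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    model_id: str    # hypothetical model name
    provider: str
    quality: float   # 0..1, made-up score
    cost_usd: float  # per call, made-up price

def pick(candidates, slider, *, allowed_providers=None,
         min_quality=0.0, max_cost_per_call=float("inf")):
    """Filter by hard constraints, then blend quality and cheapness by slider."""
    pool = [c for c in candidates
            if (allowed_providers is None or c.provider in allowed_providers)
            and c.quality >= min_quality
            and c.cost_usd <= max_cost_per_call]
    if not pool:
        raise ValueError("no candidate satisfies the hard constraints")
    w = slider / 10                                   # assumed: 0 = cost, 10 = quality
    max_cost = max(c.cost_usd for c in pool) or 1.0   # avoid dividing by zero
    def score(c):
        cheapness = 1 - c.cost_usd / max_cost
        return w * c.quality + (1 - w) * cheapness
    return max(pool, key=score)

CANDIDATES = [
    Candidate("small", "a", 0.60, 0.001),
    Candidate("mid", "b", 0.85, 0.010),
    Candidate("large", "a", 0.95, 0.050),
]

if __name__ == "__main__":
    for cq in (0, 5, 10):
        print(cq, pick(CANDIDATES, cq).model_id)
```

The point of the sketch is the ordering: constraints remove candidates outright, and the slider only ranks whatever survives.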

4. Multi-step workflow

POST /v1/recommend/workflow routes each step of a pipeline independently — a cheap model for classify/extract, a stronger one for the hard reasoning step — and reports total cost versus the all-premium baseline. Each step gets its own recommendation_id for per-step feedback.

from minima_client import MinimaClient, WorkflowRequest, WorkflowStep, TaskInput

req = WorkflowRequest(
    steps=[
        WorkflowStep(step_id="extract",
                     task=TaskInput(task="Extract entities from the email.",
                                    task_type="extraction")),
        WorkflowStep(step_id="reason",
                     task=TaskInput(task="Decide the next action given the entities.",
                                    task_type="reasoning", difficulty="hard")),
    ],
    cost_quality_tradeoff=4,
)

with MinimaClient("https://api.minima.sh", api_key="mbt_…") as minima:
    wf = minima.recommend_workflow(req)
    for step in wf.steps:
        print(step.step_id, "→", step.recommendation.recommended_model.model_id)
    print(f"total ${wf.total_est_cost_usd:.4f} vs all-premium ${wf.total_est_cost_if_all_premium:.4f}")
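
Per-step feedback can then be sent once each step has run. The helper below is a sketch: send_step_feedback is a hypothetical name, and it assumes that step.recommendation exposes recommendation_id in the same way a single recommendation does and that feedback accepts the arguments shown in the core loop.

```python
def send_step_feedback(minima, wf, outcomes):
    """Report one feedback call per workflow step that has a recorded outcome.

    outcomes maps step_id -> (outcome, quality_score).
    Returns the step_ids that were reported, in workflow order.
    """
    reported = []
    for step in wf.steps:
        if step.step_id not in outcomes:
            continue  # step was skipped or has not finished yet
        outcome, quality_score = outcomes[step.step_id]
        rec = step.recommendation
        minima.feedback(
            rec.recommendation_id,           # assumed attribute, see lead-in
            rec.recommended_model.model_id,
            outcome,
            quality_score=quality_score,
        )
        reported.append(step.step_id)
    return reported
```

With the wf object from above, a call might look like send_step_feedback(minima, wf, {"extract": ("success", 0.9)}), leaving the "reason" step unreported until it has run.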

5. Zero-code intake

minima_client.autocapture records your existing LLM calls into Minima’s memory with no call-site changes, which is useful for backfilling history from traffic you already run. It requires a Mubit key for the underlying memory.

from minima_client import autocapture

autocapture.enable(api_key="<mubit-key>", endpoint="https://api.mubit.ai",
                   namespace="team-payments", user_id="svc-router")

# ... your normal OpenAI / Anthropic / LiteLLM / Gemini calls run here, auto-captured ...

autocapture.feedback(good=True)   # learn does NOT fabricate a success signal — close it explicitly
autocapture.disable()
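
If an exception is raised between enable and disable, capture stays on. One possible guard is a small context manager, sketched below. The capturing helper is hypothetical and not part of the SDK; it relies only on the enable and disable calls shown above and takes the autocapture module as an argument.

```python
from contextlib import contextmanager

@contextmanager
def capturing(autocapture, **enable_kwargs):
    """Enable capture for the duration of a with-block, always disabling after."""
    autocapture.enable(**enable_kwargs)
    try:
        yield autocapture
    finally:
        # Runs on normal exit and when the block raises.
        autocapture.disable()
```

Usage would be along the lines of: with capturing(autocapture, api_key=..., endpoint=..., namespace=..., user_id=...) as cap, then your LLM calls, then cap.feedback(good=True) before the block ends.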

6. Production routing wrapper

The shape you’d ship: recommend a model, run it via the official Anthropic SDK (streaming, real token usage), then feed the realized cost/quality back.

import anthropic
from minima_client import AsyncMinimaClient

async def routed_call(minima: AsyncMinimaClient, client: anthropic.AsyncAnthropic,
                      prompt: str, *, cost_quality_tradeoff: float = 4) -> str:
    rec = await minima.recommend(prompt, cost_quality_tradeoff=cost_quality_tradeoff)
    model = rec.recommended_model.model_id

    async with client.messages.stream(
        model=model, max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    ) as stream:
        msg = await stream.get_final_message()

    text = "".join(b.text for b in msg.content if b.type == "text")
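    # Placeholder: this reports the estimate as the realized cost. Prefer computing
    # the cost from msg.usage and your own price table.
    cost = rec.recommended_model.est_cost_usd
    await minima.feedback(
        rec.recommendation_id, model, "success",
        quality_score=0.95,  # placeholder; substitute your own evaluation score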
        input_tokens=msg.usage.input_tokens,
        output_tokens=msg.usage.output_tokens,
        actual_cost_usd=cost,
        verified_in_production=True,
    )
    return text
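
The wrapper above only reports on the happy path. A production version would probably also report when the model call fails. The sketch below shows one way to structure that, using plain callables so it does not depend on either SDK. run_and_report is a hypothetical helper, and failure_outcome is left as a required argument because the accepted outcome strings other than "success" are not shown here; take them from the API Reference.

```python
import asyncio
from typing import Awaitable, Callable

async def run_and_report(
    run_model: Callable[[], Awaitable[str]],
    report: Callable[[str], Awaitable[None]],
    *,
    success_outcome: str = "success",
    failure_outcome: str,
) -> str:
    """Await run_model(), report the outcome either way, re-raise on error."""
    try:
        text = await run_model()
    except Exception:
        await report(failure_outcome)  # close the loop even when the call fails
        raise
    await report(success_outcome)
    return text
```

Inside routed_call, run_model would wrap the streaming block and report would be a small closure that calls minima.feedback with rec.recommendation_id, the model id, and the outcome it receives.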

Where to go next

The schemas behind every field: API Reference.
Why the cost numbers move the way they do: Concepts → Cost-basis tiers.
The full client surface: Python Client SDK.