API Reference
Base URL: https://api.minima.sh. Base path: /v1. All request and response bodies are JSON. Interactive OpenAPI docs are served at https://api.minima.sh/docs.
Authentication
Minima uses pass-through auth: present your own Mubit API key as a bearer token. There is no separate Minima key; Minima forwards the key to Mubit to read and write your task → model → outcome history on your behalf.
Authorization: Bearer mbt_<instance>_<keyid>_<secret>

A missing or invalid key returns 401. Your Mubit key determines which Mubit instance Minima reads and writes; it is your data boundary. GET /v1/health is the only endpoint that works without a key (it returns service liveness only).
user_id and namespace are scoping fields within your Mubit instance, not auth boundaries — use them to partition recall and learning across teams, projects, or environments.
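As a minimal sketch of the pass-through scheme described above, the helper below builds request headers from a Mubit key. The function name `auth_headers` is illustrative and not part of any Minima SDK; the early check simply mirrors the documented 401 for a missing key.

```python
def auth_headers(api_key: str) -> dict[str, str]:
    """Build headers for a Minima call; the Mubit key is sent as a bearer token."""
    if not api_key:
        # A missing key is documented to return 401, so fail early on the client.
        raise ValueError("a Mubit API key is required (only GET /v1/health works without one)")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Placeholder key in the documented mbt_<instance>_<keyid>_<secret> shape.
headers = auth_headers("mbt_instance_keyid_secret")
print(headers["Authorization"])  # prints: Bearer mbt_instance_keyid_secret
```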
Errors
Errors are returned as application/problem+json (RFC 7807-style):

{ "type": "about:blank", "title": "No candidate models", "status": 422, "detail": "no models match the supplied constraints" }

| Status | Title | When |
|---|---|---|
| 400 | Invalid request | Request body fails validation. |
| 401 | Unauthorized | Missing or invalid Mubit key. |
| 422 | No candidate models | Constraints eliminated every catalog model. |
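A client will typically want to turn these problem documents into a typed value. The sketch below parses the example body shown above; the `Problem` class and the `is_caller_error` helper are hypothetical names, and treating 400/401/422 as caller-side errors is an assumption drawn from the status table, not a documented retry policy.

```python
import json
from dataclasses import dataclass


@dataclass
class Problem:
    status: int
    title: str
    detail: str
    type: str = "about:blank"


def parse_problem(body: str) -> Problem:
    """Parse an application/problem+json body into a Problem."""
    data = json.loads(body)
    return Problem(
        status=int(data["status"]),
        title=data.get("title", ""),
        detail=data.get("detail", ""),
        type=data.get("type", "about:blank"),
    )


def is_caller_error(problem: Problem) -> bool:
    # Assumption: the three documented statuses all need a changed request
    # (or key), so resending the same request unchanged would not help.
    return problem.status in (400, 401, 422)


example = '{ "type": "about:blank", "title": "No candidate models", "status": 422, "detail": "no models match the supplied constraints" }'
problem = parse_problem(example)
print(problem.status, problem.title)  # prints: 422 No candidate models
```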
POST /v1/recommend core
Recommend a model for a single task.
Request — RecommendRequest
| Field | Type | Default | Notes |
|---|---|---|---|
| task | TaskInput | required | The task to route (see below). |
| cost_quality_tradeoff | float 0–10 | 5.0 | 0 = cheapest acceptable, 10 = highest quality. Sets the quality threshold τ. |
| constraints | Constraints | {} | Hard limits on the candidate set (see below). |
| user_id | string \| null | null | Within-account actor label. Scopes recall. |
| namespace | string \| null | null | Within-account sub-scope (team / project / environment). Isolates recall and learning. |
| max_candidates | int 1–64 | 8 | Cap on candidates considered. |
| allow_llm_escalation | bool | true | Allow the cheap-LLM reasoner when evidence is thin. |
| explain | bool | true | Include evidence[] refs on each ranked model. |
TaskInput
| Field | Type | Default | Notes |
|---|---|---|---|
| task | string | required | Raw task/prompt text; embedded by Mubit for recall. |
| task_type | enum \| null | null | code \| summarization \| extraction \| qa \| reasoning \| classification \| translation \| creative \| rag \| tool_use \| other. Heuristic-classified if omitted. |
| difficulty | enum \| null | null | trivial \| easy \| medium \| hard \| expert. |
| expected_input_tokens | int ≥ 0 \| null | null | Feeds the cost estimate; a per-task-type default is used if omitted. |
| expected_output_tokens | int ≥ 0 \| null | null | Feeds the cost estimate. |
| tags | string[] | [] | Propagated to Mubit env_tags (e.g. lang:python) for version-aware recall. |
Constraints (all optional)
| Field | Type | Notes |
|---|---|---|
| allowed_providers | string[] \| null | Whitelist by provider. |
| candidate_models | string[] \| null | Restrict to these model ids. |
| excluded_models | string[] \| null | Blacklist by model id. |
| max_cost_per_call | float ≥ 0 \| null | USD hard filter on estimated cost. |
| min_quality | float 0–1 \| null | Predicted-success floor; raises τ. |
| require_prompt_caching | bool | Keep only models that support prompt caching. |
| max_latency_ms | int > 0 \| null | Reserved latency hint. |
| require_context_window | int > 0 \| null | Keep only models with at least this context window. |
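The request, task, and constraint tables above can be combined into a small builder that checks the documented ranges before anything is sent. This is a sketch under stated assumptions: `build_recommend_request` is a hypothetical helper, it covers only a subset of the fields, and the service remains the authority on validation (a body that fails validation returns 400).

```python
from typing import Any, Optional


def build_recommend_request(
    task_text: str,
    *,
    task_type: Optional[str] = None,
    cost_quality_tradeoff: float = 5.0,
    max_candidates: int = 8,
    constraints: Optional[dict[str, Any]] = None,
    namespace: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble a RecommendRequest body, checking the documented ranges locally."""
    if not 0.0 <= cost_quality_tradeoff <= 10.0:
        raise ValueError("cost_quality_tradeoff must be between 0 and 10")
    if not 1 <= max_candidates <= 64:
        raise ValueError("max_candidates must be between 1 and 64")
    constraints = dict(constraints or {})
    min_quality = constraints.get("min_quality")
    if min_quality is not None and not 0.0 <= min_quality <= 1.0:
        raise ValueError("constraints.min_quality must be between 0 and 1")

    task: dict[str, Any] = {"task": task_text}
    if task_type is not None:
        task["task_type"] = task_type  # left out, the type is classified heuristically

    body: dict[str, Any] = {
        "task": task,
        "cost_quality_tradeoff": cost_quality_tradeoff,
        "max_candidates": max_candidates,
        "constraints": constraints,
    }
    if namespace is not None:
        body["namespace"] = namespace
    return body


request_body = build_recommend_request(
    "Write a Python function that merges k sorted linked lists.",
    task_type="code",
    cost_quality_tradeoff=3,
    constraints={"min_quality": 0.8},
    namespace="team-payments",
)
```

Serializing `request_body` with `json.dumps` gives a body suitable for POST /v1/recommend.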
Response — RecommendResponse
| Field | Type | Notes |
|---|---|---|
| recommendation_id | string | Quote this back to POST /v1/feedback. |
| recommended_model | RankedModel | The chosen model. |
| ranked | RankedModel[] | Every candidate, sorted by final score. |
| fallback_model | RankedModel \| null | A more reliable retry target. |
| confidence | float 0–1 | Overall confidence in the pick. |
| decision_basis | enum | Which path produced the pick: memory \| prior \| llm. |
| threshold_used | float | The quality threshold τ applied. |
| classified_task_type | enum | Final task type used. |
| classified_difficulty | enum | Final difficulty used. |
| catalog_version | string | Catalog version that priced the candidates. |
| catalog_stale | bool | Prices older than the staleness window. |
| latency_ms | int | Minima-side recommendation latency. |
| warnings | string[] | See Warnings below. |
RankedModel
| Field | Type | Notes |
|---|---|---|
| model_id | string | |
| provider | string | |
| predicted_success | float 0–1 | Probability the model clears the task. |
| est_cost_usd | float ≥ 0 | Estimated cost for this request, per the chosen cost basis. |
| est_cost_breakdown | object | Keys depend on the basis: {rescaled, obs_output_tokens}, {observed_avg}, or {input, output}. See Cost-basis tiers. |
| score | float | Final objective score; the sorting key. |
| rationale | string | Human-readable reason (tags cost as obs or est). |
| decision_basis | enum | Per-model basis: memory \| prior \| llm. |
| evidence | EvidenceRef[] | Recalled neighbors that informed this candidate (empty if explain=false). |
| supports_prompt_caching | bool | |
| context_window | int | |
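Because `ranked` carries every candidate with its `predicted_success` and `est_cost_usd`, a caller can apply its own selection rule on top of the ranking. The sketch below picks the cheapest candidate that clears a local success floor. This is a client-side rule chosen for illustration, not Minima's scoring objective, and the model ids and numbers are made up.

```python
from typing import Optional


def cheapest_above(ranked: list[dict], floor: float) -> Optional[dict]:
    """Return the lowest-cost candidate whose predicted_success is at least `floor`."""
    eligible = [m for m in ranked if m["predicted_success"] >= floor]
    # min() with default=None covers the case where nothing clears the floor.
    return min(eligible, key=lambda m: m["est_cost_usd"], default=None)


# Illustrative data only; these are not real catalog entries.
ranked = [
    {"model_id": "model-a", "predicted_success": 0.92, "est_cost_usd": 0.0120},
    {"model_id": "model-b", "predicted_success": 0.85, "est_cost_usd": 0.0031},
    {"model_id": "model-c", "predicted_success": 0.60, "est_cost_usd": 0.0004},
]
pick = cheapest_above(ranked, floor=0.8)
print(pick["model_id"] if pick else "no candidate clears the floor")  # prints: model-b
```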
EvidenceRef
| Field | Type | Notes |
|---|---|---|
| entry_id | string | Mubit QueryEvidence.id (used for outcome attribution). |
| reference_id | string \| null | Stable reference id. |
| model_id | string | The model this past outcome was about. |
| score | float | Retrieval similarity. |
| knowledge_confidence | float 0–1 | Mubit’s reliability estimate for the entry. |
| observed_success | float 0–1 | The recorded quality of that past outcome. |
| is_stale | bool | Whether the entry is marked stale. |
Warnings
| Warning | Meaning |
|---|---|
| cold_start | No recalled outcomes; prior-only. |
| recall_timeout | Mubit recall exceeded the timeout; prior-only. |
| memory_unavailable | Recall errored; prior-only. |
| prices_stale | Catalog prices older than the staleness window. |
| no_model_meets_threshold | No candidate cleared τ; recommended the highest-success one. |
| no_model_within_cost_budget | max_cost_per_call eliminated all; constraint relaxed for ranking. |
| escalation_suggested:<reason> | Escalation criteria met (thin_evidence, low_confidence, tie, …). |
| reasoner_consulted | The cheap-LLM reasoner was consulted and changed scores. |
| reasoner_failed | The reasoner errored or returned unusable output; deterministic result used. |
| reasoner_disabled | Escalation suggested but no reasoner is configured. |
| llm_classified | The reasoner refined an ambiguous task classification. |
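Since warnings arrive as plain strings, a caller may want to fold them into a few flags. The sketch below groups the three warnings the table marks as prior-only and splits the reason out of `escalation_suggested:<reason>`. The helper name and the shape of the returned dictionary are assumptions made for this example.

```python
# Warnings the table above describes as falling back to a prior-only result.
PRIOR_ONLY = {"cold_start", "recall_timeout", "memory_unavailable"}
ESCALATION_PREFIX = "escalation_suggested:"


def summarize_warnings(warnings: list[str]) -> dict:
    """Reduce a warnings[] list to a few flags a caller can branch on."""
    reasons = [
        w[len(ESCALATION_PREFIX):]
        for w in warnings
        if w.startswith(ESCALATION_PREFIX)
    ]
    return {
        "prior_only": any(w in PRIOR_ONLY for w in warnings),
        "prices_stale": "prices_stale" in warnings,
        "escalation_reasons": reasons,
    }


summary = summarize_warnings(["cold_start", "escalation_suggested:thin_evidence"])
print(summary["prior_only"], summary["escalation_reasons"])  # prints: True ['thin_evidence']
```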
Example
curl -s https://api.minima.sh/v1/recommend \
  -H "authorization: Bearer $MUBIT_API_KEY" \
  -H 'content-type: application/json' -d '{
    "task": {"task": "Write a Python function that merges k sorted linked lists.",
             "task_type": "code", "difficulty": "hard",
             "expected_input_tokens": 180, "expected_output_tokens": 600,
             "tags": ["lang:python"]},
    "cost_quality_tradeoff": 3,
    "constraints": {"min_quality": 0.8, "excluded_models": ["some-deprecated-model"]},
    "namespace": "team-payments"
  }' | jq

POST /v1/recommend/workflow workflow
Recommend a model for each step of a multi-step workflow. Each step runs the same engine independently and gets its own recommendation_id for per-step feedback.
Request — WorkflowRequest
| Field | Type | Default | Notes |
|---|---|---|---|
| steps | WorkflowStep[] | required | The steps to route. |
| cost_quality_tradeoff | float 0–10 | 5.0 | Applied to every step. |
| constraints | Constraints | {} | Global constraints; each step may override. |
| user_id | string \| null | null | |
| namespace | string \| null | null | |
| allow_llm_escalation | bool | true | |
WorkflowStep
| Field | Type | Notes |
|---|---|---|
| step_id | string | Caller-defined id (echoed in the response). |
| task | TaskInput | The step’s task. |
| constraints | Constraints \| null | Per-step override, merged over the global constraints. |
| depends_on | string[] | Declared dependencies (informational; steps are scored independently). |
Response — WorkflowResponse
| Field | Type | Notes |
|---|---|---|
| workflow_recommendation_id | string | Id for the whole workflow. |
| steps | StepRecommendation[] | {step_id, recommendation: RecommendResponse} per step. |
| total_est_cost_usd | float | Sum of recommended-model costs across steps. |
| total_est_cost_if_all_premium | float | Sum if each step used its most expensive candidate; this is the savings baseline. |
| confidence | float 0–1 | Mean step confidence. |
See the multi-step workflow example.
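Two parts of the workflow contract lend themselves to a short illustration: per-step constraints being merged over the global ones, and the savings implied by the two cost totals. The sketch below assumes a shallow, key-by-key merge in which step values win; the reference does not spell out the exact merge semantics, so treat that as a guess. Both function names are hypothetical.

```python
from typing import Any, Optional


def merge_constraints(
    global_c: dict[str, Any], step_c: Optional[dict[str, Any]]
) -> dict[str, Any]:
    """Shallow merge: keys set on the step replace the global value (assumed semantics)."""
    merged = dict(global_c)
    merged.update(step_c or {})
    return merged


def savings_vs_premium(response: dict[str, Any]) -> tuple[float, float]:
    """Return (absolute USD saved, fraction saved) against the all-premium baseline."""
    total = response["total_est_cost_usd"]
    premium = response["total_est_cost_if_all_premium"]
    saved = premium - total
    fraction = saved / premium if premium > 0 else 0.0
    return saved, fraction


effective = merge_constraints(
    {"min_quality": 0.8, "max_cost_per_call": 0.05},
    {"min_quality": 0.9},
)
# effective == {"min_quality": 0.9, "max_cost_per_call": 0.05}

# Illustrative totals only, not real output.
saved, fraction = savings_vs_premium(
    {"total_est_cost_usd": 0.25, "total_est_cost_if_all_premium": 1.0}
)
# saved == 0.75 and fraction == 0.75
```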
POST /v1/feedback core
Report an outcome and close the learning loop. This reinforces the memories that drove the recommendation and records realized cost/token history that powers the observed and rescaled cost-basis tiers.
Request — FeedbackRequest
| Field | Type | Default | Notes |
|---|---|---|---|
| recommendation_id | string | required | From a prior /recommend (or a workflow step). |
| chosen_model_id | string | required | The model you actually ran (may differ from the recommendation). |
| outcome | enum | required | success \| partial \| failure. |
| quality_score | float 0–1 \| null | null | Caller-supplied; no LLM judge. Defaults applied per outcome if omitted (0.9 / 0.5 / 0.1). |
| input_tokens | int ≥ 0 \| null | null | Realized input tokens; populate this to enable the rescaled cost tier. |
| output_tokens | int ≥ 0 \| null | null | Realized output tokens (captures reasoning/thinking); populate this for the rescaled tier. |
| actual_cost_usd | float ≥ 0 \| null | null | Realized $/call; enables the observed cost tier. |
| latency_ms | int ≥ 0 \| null | null | |
| verified_in_production | bool | false | Marks a real production outcome; gates lesson promotion. |
| notes | string \| null | null | |
| idempotency_key | string \| null | null | Dedupe key; derived from recommendation_id + model if omitted. |
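A feedback body can be assembled so that unset optional fields are left out and the service applies its defaults. In the sketch below, `build_feedback` and `effective_quality` are hypothetical helpers, and the mapping of the documented defaults (0.9 / 0.5 / 0.1) to success / partial / failure assumes they follow the order in which the outcomes are listed.

```python
from typing import Any, Optional

# Assumption: the defaults 0.9 / 0.5 / 0.1 follow the listed outcome order.
DEFAULT_QUALITY = {"success": 0.9, "partial": 0.5, "failure": 0.1}


def build_feedback(
    recommendation_id: str,
    chosen_model_id: str,
    outcome: str,
    *,
    quality_score: Optional[float] = None,
    input_tokens: Optional[int] = None,
    output_tokens: Optional[int] = None,
    actual_cost_usd: Optional[float] = None,
    verified_in_production: bool = False,
) -> dict[str, Any]:
    """Assemble a FeedbackRequest body, leaving unset optional fields out."""
    if outcome not in DEFAULT_QUALITY:
        raise ValueError("outcome must be success, partial, or failure")
    if quality_score is not None and not 0.0 <= quality_score <= 1.0:
        raise ValueError("quality_score must be between 0 and 1")
    body: dict[str, Any] = {
        "recommendation_id": recommendation_id,
        "chosen_model_id": chosen_model_id,
        "outcome": outcome,
        "verified_in_production": verified_in_production,
    }
    optional = {
        "quality_score": quality_score,
        "input_tokens": input_tokens,        # needed for the rescaled cost tier
        "output_tokens": output_tokens,      # needed for the rescaled cost tier
        "actual_cost_usd": actual_cost_usd,  # needed for the observed cost tier
    }
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


def effective_quality(body: dict[str, Any]) -> float:
    """Local mirror of the quality score the service would presumably record."""
    return body.get("quality_score", DEFAULT_QUALITY[body["outcome"]])


# "rec-123" is a placeholder id; a real one comes from a prior /recommend call.
feedback = build_feedback(
    "rec-123", "claude-haiku-4-5", "partial", input_tokens=180, output_tokens=640
)
print(effective_quality(feedback))  # prints: 0.5
```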
Response — FeedbackResponse
| Field | Type | Notes |
|---|---|---|
| accepted | bool | false with a warning on failure. |
| record_id | string \| null | The Mubit id of the upserted outcome record. |
| reinforced_entry_ids | string[] | The neighbor entry ids credited. |
| updated_confidence | float \| null | Mubit’s updated knowledge_confidence for the primary entry. |
| reflection_triggered | bool | Whether reflection fired this call. |
| lesson_promoted | bool | Whether a durable lesson was promoted. |
| warnings | string[] | unknown_recommendation, memory_write_failed, reinforcement_failed, lesson_promotion_failed. |
Example
curl -s https://api.minima.sh/v1/feedback \
  -H "authorization: Bearer $MUBIT_API_KEY" \
  -H 'content-type: application/json' -d '{
    "recommendation_id": "…",
    "chosen_model_id": "claude-haiku-4-5",
    "outcome": "success",
    "quality_score": 0.95,
    "input_tokens": 180,
    "output_tokens": 640,
    "actual_cost_usd": 0.0034,
    "verified_in_production": true
  }' | jq

GET /v1/models
The current model catalog (cost + capability priors).
Query parameters
| Param | Type | Default | Notes |
|---|---|---|---|
| provider | string | (none) | Filter by provider (case-insensitive). |
| task_type | enum | (none) | Keep only models with a capability prior for this task type. |
| max_cost | float | (none) | Keep only models whose max(input, output) $/Mtok ≤ this. |
| include_stale | bool | true | Prefer fresh-priced models when false. |
Response
{ models: ModelCard[], catalog_version, refreshed_at, stale }, sorted by input price.
ModelCard fields include: model_id, provider, display_name, input_cost_per_mtok, output_cost_per_mtok, cache_read_cost_per_mtok, supports_prompt_caching, context_window, max_output_tokens, capability_priors, capability_by_task_type, cost_source, cost_fetched_at, cost_stale, capability_source.
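The query parameters and price fields above can be exercised without a network call. The sketch below builds a catalog URL, mirrors the documented `max_cost` rule (max of input and output price per Mtok) on the client side, and computes a plain list-price estimate from token counts. The helper names and the sample card are made up, and the estimate is only a guess at what an {input, output} cost breakdown represents; it ignores caching and the observed and rescaled tiers.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.minima.sh/v1"


def models_url(**params) -> str:
    """Build a GET /v1/models URL, dropping parameters left as None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/models" + (f"?{query}" if query else "")


def within_max_cost(card: dict, max_cost: float) -> bool:
    """Client-side mirror of the max_cost filter: max(input, output) $/Mtok <= max_cost."""
    return max(card["input_cost_per_mtok"], card["output_cost_per_mtok"]) <= max_cost


def list_price_estimate_usd(card: dict, input_tokens: int, output_tokens: int) -> float:
    """Tokens times per-million-token list prices, summed over input and output."""
    return (
        input_tokens * card["input_cost_per_mtok"]
        + output_tokens * card["output_cost_per_mtok"]
    ) / 1_000_000


# Hypothetical card; the prices are illustrative, not catalog data.
card = {"model_id": "example-model", "input_cost_per_mtok": 1.0, "output_cost_per_mtok": 5.0}

url = models_url(task_type="code", max_cost=5)
# url == "https://api.minima.sh/v1/models?task_type=code&max_cost=5"
estimate = list_price_estimate_usd(card, input_tokens=180, output_tokens=600)
# (180 * 1.0 + 600 * 5.0) / 1e6, roughly 0.00318 USD
```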
GET /v1/strategies
Surfaces the rules Mubit has promoted for a namespace: the “why” behind routing patterns.
Query parameters
| Param | Type | Default | Notes |
|---|---|---|---|
| namespace | string | (none) | Within-account sub-scope to read strategies for. |
| lesson_types | string[] | (none) | Filter by lesson type. |
| max_strategies | int 1–50 | 5 | |
Response
{ namespace, lane, strategies: Strategy[], count }, where each Strategy has strategy_id, description, supporting_lesson_count, avg_confidence, avg_reinforcement, dominant_lesson_type, dominant_scope, lesson_ids[].
GET /v1/health
Always returns 200 and reports degraded state in the body. This is the only endpoint that does not require a key: an unauthenticated probe gets service liveness, while a key-bearing probe additionally confirms that your account’s memory is reachable.

{ "status": "ok", "memory": {"reachable": true, "latency_ms": 12}, "catalog": {"version": "…", "stale": false, "models": 42}, "version": "0.1.0"}

status is degraded when the memory backend is unreachable. In that state /recommend still serves prior-only recommendations.
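Because the health endpoint always answers 200, a probe has to read the body, not the status code. The sketch below interprets a health payload of the shape shown above; `interpret_health` and the keys of its return value are hypothetical, and the degraded sample is constructed for illustration.

```python
import json


def interpret_health(body: str) -> dict:
    """Summarize a GET /v1/health body; the HTTP status is always 200."""
    data = json.loads(body)
    memory = data.get("memory") or {}
    catalog = data.get("catalog") or {}
    return {
        "degraded": data.get("status") == "degraded",
        "memory_reachable": bool(memory.get("reachable", False)),
        "catalog_stale": bool(catalog.get("stale", False)),
    }


ok_body = '{"status": "ok", "memory": {"reachable": true, "latency_ms": 12}, "catalog": {"version": "…", "stale": false, "models": 42}, "version": "0.1.0"}'
# Constructed example of a degraded response, not captured output.
degraded_body = '{"status": "degraded", "memory": {"reachable": false}}'

print(interpret_health(ok_body)["degraded"])        # prints: False
print(interpret_health(degraded_body)["degraded"])  # prints: True
```

When `degraded` is true, recommendations are still served but are prior-only, so a caller may choose to defer feedback-sensitive work until memory is reachable again.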