Now open — sign in and ship

Your own `/v1` endpoint for frontier OSS LLMs.

Kimi · MiMo · DeepSeek · Qwen · GLM behind a single OpenAI-compatible endpoint at your-org.dynoyard.app/v1. Drop-in for any OpenAI SDK. Pay-as-you-go from credits.

Built for teams shipping AI products — agent platforms, RAG pipelines, multi-tenant SaaS, LangChain and LiteLLM stacks. Swap one base URL, keep your SDK. Your subdomain, your keys, per-tenant usage.

Get started See pricing →

Python — OpenAI SDK, base_url swap

No SDK change

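```python
from openai import OpenAI

# Point the stock OpenAI client at your org's endpoint; nothing else changes.
client = OpenAI(
    base_url="https://your-org.dynoyard.app/v1",
    api_key="sk-dyno-...",
)

resp = client.chat.completions.create(
    model="kimi-k2-thinking",
    messages=[{"role": "user", "content": "ship it"}],
)
print(resp.choices[0].message.content)  # the model's reply text
```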

Frontier OSS models. Curated catalog.

We test what's good, ditch what isn't, keep the list tight. Same OpenAI-shaped API for all. Updated as new models ship.

gemini-3-5-flash

Gemini 3.5 Flash

1M context · tools · streaming · reasoning · vision · structured_output

Gemini 3.5 Flash — Google's fast multimodal model. 1M-token context, accepts text, image, audio, video, and PDF, with adjustable thinking effort for coding and agentic workloads.

$1.65 in · $9.90 out / 1M tokens

glm-5-2

GLM-5.2

195K context · tools · streaming · reasoning · structured_output

Zhipu's GLM-5.2: an open flagship for long-horizon coding and agentic tasks.

$1.65 in · $4.95 out / 1M tokens

kimi-k2-7

Kimi K2.7

262K context · tools · streaming · reasoning · structured_output

Moonshot's Kimi K2.7 — a large open model strong at agentic coding and tool use, with a 262K-token context.

$0.98 in · $4.08 out / 1M tokens

minimax-m3

MiniMax M3

1M context · tools · streaming · reasoning · structured_output

MiniMax M3 reaches frontier-level performance on specialized tasks such as coding and agentic work. It uses MSA (MiniMax Sparse Attention), a new attention architecture from the MiniMax team, and supports context windows of up to 1M tokens. It is also natively multimodal: it accepts image and video input and can operate a desktop computer.

$0.33 in · $1.32 out / 1M tokens

qwen-plus-3-7

Qwen3.7 Plus

1M context · tools · streaming · reasoning · structured_output

Alibaba's Qwen3.7 Plus — a fast, cost-efficient generalist for high-volume chat and agent workloads, with a 1M-token context.

$0.30 in · $1.21 out / 1M tokens

qwen-plus-3-6

Qwen3.6 Plus

1M context · tools · streaming · reasoning · vision

Alibaba's Qwen3.6 Plus: multimodal (image and video input), with a 1M-token context, function calling, and structured output. Non-thinking by default.

$0.55 in · $3.30 out / 1M tokens

qwen-max-3-7

Qwen3.7 Max

1M context · tools · streaming · reasoning

Alibaba's Qwen3.7 Max: the agent-tier model in the Qwen3.7 line.

$0.91 in · $2.72 out / 1M tokens

glm-5-1

GLM-5.1

203K context · tools · streaming · reasoning

Zhipu (Z.ai) GLM-5.1: roughly 200K-token context, strong tool use, and strong bilingual Chinese-English performance among open models.

$1.08 in · $3.39 out / 1M tokens

deepseek-v4-pro

DeepSeek V4 Pro

1M context · tools · streaming · reasoning

DeepSeek V4 Pro — SOTA open-weight reasoning, SWE-Bench 80.6%.

$0.48 in · $0.96 out / 1M tokens

mimo-v2-5-pro

MiMo V2.5 Pro

1M context · tools · streaming · reasoning

Xiaomi MiMo V2.5 Pro — 1T MoE / 42B active, 1M ctx, SWE-Bench 78.9%.

$0.48 in · $0.96 out / 1M tokens
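Since pricing is quoted per 1M tokens with separate input and output rates, a rough per-request estimate is simple arithmetic. The sketch below copies the rates from the catalog above. The function name `estimate_cost` is illustrative, and the estimate ignores any cached-prefix discount, so treat it as an upper-bound approximation and not a billing formula.

```python
# USD per 1M tokens as (input, output), copied from the catalog above.
PRICES = {
    "gemini-3-5-flash": (1.65, 9.90),
    "glm-5-2": (1.65, 4.95),
    "kimi-k2-7": (0.98, 4.08),
    "minimax-m3": (0.33, 1.32),
    "qwen-plus-3-7": (0.30, 1.21),
    "qwen-plus-3-6": (0.55, 3.30),
    "qwen-max-3-7": (0.91, 2.72),
    "glm-5-1": (1.08, 3.39),
    "deepseek-v4-pro": (0.48, 0.96),
    "mimo-v2-5-pro": (0.48, 0.96),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of one request at list price (no caching discount)."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    # A 200K-token prompt with a 50K-token reply on deepseek-v4-pro.
    print(f"${estimate_cost('deepseek-v4-pro', 200_000, 50_000):.4f}")
```

For example, 200,000 input tokens and 50,000 output tokens on `deepseek-v4-pro` come to 0.2 × $0.48 + 0.05 × $0.96 = $0.144.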

See full pricing →

What you get over a multi-tenant aggregator

OpenRouter is great. We're great too — and built for teams that want isolation, observability, and ownership without standing up infra.

Your own subdomain

your-org.dynoyard.app/v1 is yours. Per-org isolation, no shared-tenant noisy-neighbor surprises. White-label feel without the white-label price.

Per-app keys + caps

Issue, rotate, revoke sk-dyno-… tokens per app. Set monthly spend caps before a runaway agent eats your weekend.

Native dashboard

Per-request token counts, latency, cost, and error rate, charted per app. Debug a slow agent without grepping through your own logs.
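To cross-check the dashboard against your own numbers, the same metrics can be captured client-side. This is a minimal sketch, assuming responses carry the OpenAI-style `usage` object with `prompt_tokens` and `completion_tokens`. The wrapper name `timed_completion` and the record layout are illustrative, not part of any SDK.

```python
import time


def timed_completion(client, **request):
    """Run one chat completion and return (response, metrics record)."""
    start = time.perf_counter()
    resp = client.chat.completions.create(**request)
    latency_s = time.perf_counter() - start  # wall-clock time for the call

    usage = getattr(resp, "usage", None)  # may be absent on some responses
    record = {
        "model": request.get("model"),
        "latency_s": latency_s,
        "prompt_tokens": getattr(usage, "prompt_tokens", None),
        "completion_tokens": getattr(usage, "completion_tokens", None),
    }
    return resp, record
```

Call it as `resp, record = timed_completion(client, model="glm-5-1", messages=[...])` and send `record` to whatever logging you already use.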

Pay-as-you-go credits

Top up via card or wire, spend down per request, auto-refill when balance dips below your threshold. No API subscription, no minimum.

Big context, cheap loops

Long agent loops run cheap on cached-prefix pricing. Several catalog models carry 200K+ context windows for long-haul retrieval, codebase work, and multi-turn agents — exact context length is on each model in the catalog above.

Works with your tools

Cursor · Zed · OpenCode · Cline · Continue · Aider · LangChain · LiteLLM · OpenAI Python · OpenAI JS · curl. Anything that speaks /v1/chat/completions.
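For tools without an OpenAI SDK, a plain HTTP request against `/v1/chat/completions` should be enough. The sketch below builds one with only the Python standard library. It assumes the endpoint accepts the usual OpenAI-style `Authorization: Bearer` header and JSON body, and the function name `build_chat_request` is illustrative.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, model, prompt):
    """Build (but do not send) a POST request for /chat/completions."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed OpenAI-style auth
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_chat_request(
        "https://your-org.dynoyard.app/v1", "sk-dyno-...", "deepseek-v4-pro", "ship it"
    )
    # Sending needs a real subdomain and key in place of the placeholders above.
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```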

Switch in three steps

01

Sign in

Magic-link email. No password. First org auto-created with a slug derived from your email — pick a custom slug later.

02

Top up + create key

Add $10 from a card (auto-topup optional). Generate sk-dyno-… per app, scoped to the models you want — Cursor key, prod key, dev key.

03

Swap base_url

Change one line in your SDK init. Same model names, same response shapes. Watch token counts roll in on the dashboard.
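If you would rather not touch code at all, the swap can live in configuration. Recent versions of the OpenAI Python SDK read `OPENAI_BASE_URL` and `OPENAI_API_KEY` from the environment when they are not passed explicitly. The sketch below makes that lookup explicit so each app or environment can carry its own key. The helper name `client_settings` is illustrative, and the fallback URL is the placeholder subdomain used on this page.

```python
import os

# Placeholder from this page; replace "your-org" with your own slug.
DEFAULT_BASE_URL = "https://your-org.dynoyard.app/v1"


def client_settings(env=None):
    """Return keyword arguments for OpenAI(...) taken from environment variables."""
    env = os.environ if env is None else env
    if "OPENAI_API_KEY" not in env:
        raise KeyError("Set OPENAI_API_KEY to the sk-dyno-... key for this app")
    return {
        "base_url": env.get("OPENAI_BASE_URL", DEFAULT_BASE_URL),
        "api_key": env["OPENAI_API_KEY"],
    }


if __name__ == "__main__":
    from openai import OpenAI  # imported here so the helper works without the SDK

    client = OpenAI(**client_settings())
```

Giving dev, prod, and editor integrations different `OPENAI_API_KEY` values keeps their usage separate on the dashboard.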

Ship faster on smaller bills.

Sign in with your email. Top up $10. Send your first request. Same OpenAI SDK, same prompts, your own subdomain.

Get started hello@dynoyard.app
