
Synthetic.new Review (2026): $30/Month Drop-in for Claude Code


Editorial note: Pricing, model lineup, and rate-limit numbers in this article are pulled from the public synthetic.new pricing page and the Synthetic developer docs as of May 1, 2026. They will move as the platform ships. We paid for the $30/month subscription directly, with no free credits or affiliate relationship. The Anthropic-compatible setup steps were tested against Claude Code on macOS using a single API key; reproduce them on your own machine before shipping anything important.

Most "best LLM API" reviews in 2026 spend their word count benchmarking models against each other on per-token pricing. That is the wrong frame for the audience that actually pays for these things. If you run Claude Code or OpenCode all day, you do not care that DeepSeek V3.2 is fractionally cheaper per million tokens than Kimi K2.5. You care whether your agent can hit the API without a meter running, whether the provider will train on your private repo, and whether the integration takes ten minutes or ten hours.

Synthetic.new is a flat-fee subscription aimed exactly at that audience. $30 a month, sixteen always-on open-weight models, an Anthropic-compatible endpoint that drops into Claude Code with two environment variables, and a privacy clause that says the platform does not train on your data or store your prompts. This post is an honest hands-on review: what it is, how the setup actually works inside Claude Code, OpenCode, and Crush, where the $30 plan beats Claude Pro, and the one rate-limit number nobody else surfaces that matters more than any benchmark score.


TL;DR: Synthetic.new (Synthetic Lab, Co.) is a $30/month subscription for sixteen always-on open-weight models, including Kimi K2.5, Kimi K2.6, GLM 4.7 Flash, GLM 5.1, Qwen3-Coder-480B, DeepSeek V3.2, and Llama 3.3, hosted on private datacenters that do not train on user data or store prompts. Both an OpenAI-compatible and an Anthropic-compatible endpoint are exposed, which is why the platform reads as "Claude Code with open models behind it" once you wire it up. The catch worth knowing before you subscribe is one concurrent request per model on the subscription tier; the workaround is to run different agents against different models, which the lineup makes easy.


What Is Synthetic.new?

Synthetic.new is a private-cloud LLM subscription run by Synthetic Lab, Co. The product hosts open-weight models on private infrastructure, exposes them through both an OpenAI-compatible (https://api.synthetic.new/openai/v1) and an Anthropic-compatible (https://api.synthetic.new/anthropic) API, and bills either as a flat $30/month subscription or as pay-per-token usage. The pitch is straightforward: the cost shape of Claude Pro and the model lineup of OpenRouter, on a privacy clause your legal team can sign.

The model lineup is the differentiator inside the flat-fee category. As of May 2026, the platform lists sixteen always-on models from eight model makers, served across three hosting backends: Synthetic-hosted (Kimi K2.5, Kimi K2.6, MiniMax-M2.5, GLM 4.7, GLM 4.7 Flash, GLM 5, GLM 5.1, NVIDIA Nemotron-3 Super 120B), Fireworks-hosted (DeepSeek V3.2, gpt-oss-120b), and Together AI-hosted (DeepSeek R1-0528, DeepSeek V3, Llama 3.3 70B, Qwen3-235B-Thinking, Qwen3-Coder-480B, Qwen3.5-397B). Context windows run 128k to 256k. Four LoRA base models (Llama 3.2 1B/3B, 3.1 8B/70B at ranks 8 to 64) and one embedding model (Nomic embed-text-v1.5, 8k context, no rate-limit impact) round out the catalog.

The named integrations on the developer site are coding agents, not chatbots: Claude Code, OpenCode, Crush, Octofriend (Synthetic's own agent), GitHub Copilot, OpenClaw, and Xcode Intelligence. That is who this product is for. If you are evaluating Synthetic for a customer-support chatbot or a Notion plugin, you are reading the wrong review; this is an inference layer for AI coding work. (For an adjacent piece on the surrounding open-source agent stack, see our coverage of Stash, the self-hosted MCP memory layer, or the open-source text-to-CAD harness for Claude Code.)


How Much Does It Cost, and What Does $30 Actually Buy?

The subscription is $30 per month, billed as $1 per day. That price includes unlimited access to all sixteen always-on models, one concurrent request per model, and a window of 500 messages per five hours. The bundled Nomic embedding model is unmetered and does not consume the rate-limit budget. That matters if your agent indexes a codebase before it answers, which Claude Code, OpenCode, and Crush all do. A separate pay-per-token tier is offered for usage-based workloads that exceed the subscription window. (Source: synthetic.new/pricing.)

The honest comparison is not Synthetic versus per-token API providers, because the price shapes are different. The honest comparison is Synthetic versus Claude Pro, because both are flat-fee subscriptions used by the same buyer for the same job. Claude Pro at $20/month gives you Sonnet (and Opus on a much smaller budget) inside Anthropic's web and desktop apps; Synthetic at $30/month gives you sixteen open-weight models behind a real API key that any coding agent can hit. The $10/month delta is the price of getting an actual programmable endpoint plus the model variety. For someone who only uses Claude in a browser, that is a bad trade. For someone running Claude Code locally five hours a day, it is a good one.

The number nobody else flags is the concurrency cap. One concurrent request per model means that if you launch a second Claude Code session and it picks the same model as the first, the second one queues. Two coding agents running on Kimi K2.5 simultaneously will throttle each other; the same two agents running on Kimi K2.5 and GLM 5.1 will not. The model lineup is wide enough that this is a configuration choice, not a hard ceiling, but you have to know to make it. (Source: synthetic.new/pricing.)

[Chart: Flat-Fee LLM Subscriptions for Coding Agents, May 2026. Monthly price versus message window, programmable API highlighted. Synthetic.new: $30/mo, 500 msgs / 5 hrs, 16 models, API. Claude Pro: $20/mo, ~150 msgs / 5 hrs, 1 model, app only. ChatGPT Plus: $20/mo, ~80 msgs / 3 hrs, GPT models, app only. Per-token baseline: $30 of credit, bursty and metered, variable model access. Sources: synthetic.new/pricing; consumer plan documentation, May 2026.]
The point of comparison is not per-token cost. It is whether the $30 buys you a programmable endpoint your coding agent can hit, and whether the included model variety covers the parallel workloads you actually run.

How Do You Run Claude Code on Synthetic.new?

The headline feature for this audience is the Anthropic-compatible endpoint. Claude Code reads ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN directly, which is an officially supported pattern documented by Anthropic for routing the CLI through an internal gateway or compatible provider. (Source: Claude Code LLM gateway configuration.) Synthetic implements that pattern as a first-class endpoint, which means you do not need a router, a proxy, or a wrapping CLI to use Kimi or GLM as your Claude Code backend. You set two environment variables and run claude.

The full block, taken from the Synthetic developer docs and tested locally, is this:

export ANTHROPIC_BASE_URL=https://api.synthetic.new/anthropic
export ANTHROPIC_AUTH_TOKEN=${SYNTHETIC_API_KEY}
export ANTHROPIC_DEFAULT_OPUS_MODEL=hf:moonshotai/Kimi-K2.5
export ANTHROPIC_DEFAULT_SONNET_MODEL=hf:moonshotai/Kimi-K2.5
export ANTHROPIC_DEFAULT_HAIKU_MODEL=hf:zai-org/GLM-4.7-Flash
export CLAUDE_CODE_SUBAGENT_MODEL=hf:moonshotai/Kimi-K2.5
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1

The mapping is the interesting design choice. Claude Code internally routes work to "Opus," "Sonnet," and "Haiku" tiers, and the env vars let you decide which Synthetic-hosted model fills each tier. The recommended defaults map both Opus and Sonnet to Kimi K2.5 (the strongest general coder in the lineup) and Haiku to GLM 4.7 Flash (the fastest, cheapest model, suited to short, high-volume operations like file lookups and tool-use planning). CLAUDE_CODE_SUBAGENT_MODEL controls what subagents spawned by the main Claude Code session use; pointing it at Kimi K2.5 keeps subagent quality consistent with the parent. CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 stops the CLI from making telemetry calls back to the upstream Anthropic endpoint that would otherwise be wasted against a third-party gateway.

For repeated use, Synthetic recommends a shell function rather than persisting the env vars globally, so a regular claude invocation still hits the Anthropic mothership and only synclaude hits Synthetic:

synclaude() {
  ANTHROPIC_BASE_URL=https://api.synthetic.new/anthropic \
  ANTHROPIC_AUTH_TOKEN=${SYNTHETIC_API_KEY} \
  ANTHROPIC_DEFAULT_OPUS_MODEL=hf:moonshotai/Kimi-K2.5 \
  ANTHROPIC_DEFAULT_SONNET_MODEL=hf:moonshotai/Kimi-K2.5 \
  ANTHROPIC_DEFAULT_HAIKU_MODEL=hf:zai-org/GLM-4.7-Flash \
  CLAUDE_CODE_SUBAGENT_MODEL=hf:moonshotai/Kimi-K2.5 \
  CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 \
  claude "$@"
}

Drop the function into ~/.zshrc or ~/.bashrc, set SYNTHETIC_API_KEY from the value in your Synthetic dashboard, reload the shell, and synclaude launches Claude Code against Kimi K2.5 while plain claude still uses your Anthropic key. This is the cleanest way we've seen any open-model provider integrate with the Claude CLI. LM Studio, Ollama, and OpenRouter all require a router or wrapper to do the same job.
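
One extension worth making before moving on, given the one-concurrent-request-per-model cap covered earlier: give each parallel agent its own backend model. Below is a minimal sketch of a second shell function that maps the Opus/Sonnet slots to GLM 5.1 instead of Kimi K2.5, so two sessions never queue behind each other. The hf:zai-org/GLM-5.1 model ID is our extrapolation from the GLM 4.7 Flash ID above, not a string quoted from the Synthetic docs; confirm the exact ID against the model catalog in your dashboard.

# Hypothetical second function for a parallel agent: same endpoint, same key,
# but the heavy slots map to GLM 5.1 so it never queues behind a Kimi session.
# The GLM 5.1 model ID below is an assumption; check your dashboard's catalog.
synclaude_glm() {
  ANTHROPIC_BASE_URL=https://api.synthetic.new/anthropic \
  ANTHROPIC_AUTH_TOKEN=${SYNTHETIC_API_KEY} \
  ANTHROPIC_DEFAULT_OPUS_MODEL=hf:zai-org/GLM-5.1 \
  ANTHROPIC_DEFAULT_SONNET_MODEL=hf:zai-org/GLM-5.1 \
  ANTHROPIC_DEFAULT_HAIKU_MODEL=hf:zai-org/GLM-4.7-Flash \
  CLAUDE_CODE_SUBAGENT_MODEL=hf:zai-org/GLM-5.1 \
  CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 \
  claude "$@"
}

# Quick sanity check that the key and endpoint are wired up
# (claude -p runs one non-interactive prompt and exits):
synclaude -p "Reply with the word ok."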


Setup Tutorial: OpenCode, Crush, and Other Agents

OpenCode supports Synthetic as a first-class provider. Inside the OpenCode terminal you type /connect, choose Synthetic from the provider list, paste your API key from the dashboard, and pick a model from the dropdown the CLI populates from Synthetic's /models endpoint. There is no JSON config to edit and no base URL to remember; OpenCode handles the wiring. The same flow applies to Crush. (Source: Synthetic API overview.)

Anything else that speaks the OpenAI Chat Completions protocol works against the OpenAI-compatible endpoint at https://api.synthetic.new/openai/v1 with a standard Bearer token. That covers GitHub Copilot's "bring your own model" mode, Xcode Intelligence's external-provider configuration, OpenClaw, and the long tail of Aider, Continue, and Cline-style agents that accept any OpenAI-compatible base URL. The pattern is identical: set a base URL, set an API key, pick a model ID from the catalog, run.
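
If you want to verify the endpoint before pointing an agent at it, a quick curl smoke test is enough. This is a sketch that assumes the standard OpenAI /chat/completions path under the base URL and reuses the hf:moonshotai/Kimi-K2.5 ID from the Claude Code setup; neither detail is quoted verbatim from the Synthetic docs.

# Smoke test of the OpenAI-compatible endpoint with a standard Bearer token.
# Path and model ID are assumptions based on the OpenAI protocol convention.
curl -s https://api.synthetic.new/openai/v1/chat/completions \
  -H "Authorization: Bearer ${SYNTHETIC_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "hf:moonshotai/Kimi-K2.5",
    "messages": [{"role": "user", "content": "Reply with the word ok."}]
  }'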

Octofriend is Synthetic's own coding agent and is the lowest-friction option if you have not yet picked one. It assumes the Synthetic endpoint and skips the configuration step entirely. Worth trying as a sanity test even if you plan to use Claude Code or OpenCode in production, because it gives you a known-good baseline against the same models.

One scheduling note: Roo Code, which has historically appeared on lists of "agents that work with this kind of provider," announced an extension shutdown for May 15, 2026. If you are reading this after mid-May 2026, it is no longer an option. Cline is the standard recommended replacement and works against the OpenAI-compatible endpoint described above.


The Model Lineup, in Plain English

Sixteen models is more catalog than most people need, and the names are user-hostile. Here is the mapping that actually matters: which one to pick for which job.

- General coding agent (Claude Code Opus/Sonnet slot): hf:moonshotai/Kimi-K2.5 (256k context, current open-weight leader on agentic coding tasks)
- Fast / cheap operations (Haiku slot): hf:zai-org/GLM-4.7-Flash (192k context, optimized for low latency and tool-use loops)
- Heaviest reasoning / large refactors: hf:Qwen/Qwen3-Coder-480B-A35B-Instruct (480B parameters, 256k context, code-specialized)
- Reasoning with explicit thinking: hf:Qwen/Qwen3-235B-A22B-Thinking-2507 (235B parameters, thinking mode, 256k context)
- Reasoning, low-latency alternative: hf:deepseek-ai/DeepSeek-V3.2 (159k context, currently among the strongest open reasoners)
- Embeddings (codebase indexing): nomic-embed-text-v1.5 (8k context, bundled at no rate-limit cost)
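
The embedding row is easy to overlook. Because it is unmetered, it is the natural thing to point your agent's codebase indexing at. Here is a hedged sketch of a raw call, assuming the standard OpenAI /embeddings path and the bare model ID shown above; the Synthetic docs we quoted do not spell out either detail verbatim.

# Hypothetical embeddings call for codebase indexing; path and model ID
# follow the OpenAI convention and are assumptions, not quoted from the docs.
curl -s https://api.synthetic.new/openai/v1/embeddings \
  -H "Authorization: Bearer ${SYNTHETIC_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nomic-embed-text-v1.5",
    "input": "def parse_config(path): ..."
  }'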

For pricing context outside the Synthetic subscription, Kimi K2.5 on its native Moonshot API costs roughly $0.58 per million input tokens and $2.40 per million output tokens, and Kimi K2.6 sits at $0.60/$2.50, comparable to Claude Sonnet on input but meaningfully cheaper on output. (Source: Artificial Analysis, Kimi K2.5 pricing reference.) Inside Synthetic's subscription, none of that math applies; you pay $30 and burn through the message window. The per-token rates only matter if you exceed the subscription tier and fall back to usage-based billing.
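
For a rough sense of why the flat fee is attractive anyway, here is a back-of-envelope at those Moonshot rates, using an assumed (not measured) heavy agent day of 20 million input and 2 million output tokens:

# Assumed workload: 20M input + 2M output tokens/day at Moonshot's listed per-token rates
awk 'BEGIN { printf "$%.2f/day\n", 20 * 0.58 + 2 * 2.40 }'
# prints $16.40/day, against the $1/day the Synthetic subscription bills at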


Privacy: Real or Marketing?

Synthetic's privacy claims are unusually concrete for the category. The public terms state that the platform never trains on user data and never stores API prompts or completions, and the site references GDPR compliance. The infrastructure is described as "private, secure datacenters" without specific locations disclosed. For a US team buying inference for an internal coding workflow, this is a stronger default than the consumer-tier terms of OpenAI or Anthropic, where opt-out from training and prompt logging usually requires either explicit account settings or an enterprise contract. For a European team where data-handling defaults are the legal floor, it is the difference between "we can use this" and "legal will not approve."

Two caveats are worth raising honestly. First, "we do not store prompts" is a contract, not a guarantee, and the absence of a third-party audit means the right level of trust is operational rather than absolute. Second, the model lineup leans heavily on Chinese open-weight providers (Moonshot, Z.ai/GLM, Alibaba/Qwen, MiniMax). For most US private-sector buyers this is a feature, since those are the strongest open weights of 2026, but for federal, defense-adjacent, or regulated-healthcare buyers it is a separate compliance question worth raising with your team before signing.


Where Does Synthetic.new Fall Short?

Three honest weaknesses. First, the concurrency cap. One concurrent request per model is the binding constraint for anyone running parallel agents. The workaround is real (different models on different agents) but you have to know about it before you hit a queue and assume the platform is broken. Second, the brand-and-platform newness. Synthetic Lab, Co. is a 2026 company; there is no public SLA, no third-party uptime monitoring, and no Capterra/G2 history. Start on the $30/month subscription, do not commit to an annual plan, and keep a fallback configured in your shell.

Third, latency and quality versus the closed frontier. Open-weight models on private infrastructure are typically slower than Anthropic and OpenAI's first-party APIs, and the open frontier still trails the closed frontier on the hardest agentic coding tasks. Kimi K2.5 is genuinely good, strong enough to run as your daily driver, but if your work depends on Claude Sonnet's specific tool-use behavior or Opus's planning quality, you will feel the difference. The honest test is to set up synclaude alongside your existing claude command and run a real refactor through both. Whatever you conclude after an afternoon is more valuable than any vendor benchmark.
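
If you want that afternoon test to be slightly more structured than vibes, a minimal harness looks like the sketch below. The prompt and file path are placeholders, claude -p runs a single non-interactive turn, and you should run both commands from inside the repository you care about.

# Same task through both backends; compare the transcripts afterwards.
PROMPT="Refactor the retry logic in src/http/client.ts to use exponential backoff."
claude -p "$PROMPT" > /tmp/claude-sonnet.md      # first-party Anthropic backend
synclaude -p "$PROMPT" > /tmp/synthetic-kimi.md  # Synthetic backend (Kimi K2.5 via synclaude)
diff /tmp/claude-sonnet.md /tmp/synthetic-kimi.md | less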

The Synbad evaluation suite the company publishes (sourced from real bugs in their 1,000-plus-person Discord community) reports a 100% pass rate on Synthetic-hosted models versus as low as 66% on competitor providers running the same workload. Take a vendor's own benchmark with the appropriate skepticism, but the methodology, pulling failure cases from a live community of agent users rather than a static MMLU-style dataset, is sounder than most synthetic-eval marketing. We mention it because it exists; do not weight it heavily.


Should You Subscribe?

Yes, if any of these apply: you run Claude Code, OpenCode, or Crush at least an hour a day and have hit Claude Pro's rate limit before; you want a flat-fee bill instead of a metered API; you handle code that you would prefer not to send to OpenAI or Anthropic under their default consumer terms; you want to A/B Kimi K2.5 against Claude Sonnet on real work without setting up half a stack to do it. The cost of trying is $30 and the twenty minutes it takes to wire synclaude into your shell.

No, if any of these apply: you only ever talk to Claude in a browser and never touch the CLI; your work is heavy enough on the closed frontier that giving up Sonnet is a net regression; you cannot accept Chinese-origin model weights for compliance reasons; you need a funded vendor with an SLA today. In any of those cases, stay on Claude Pro or move to an enterprise contract with Anthropic or OpenAI.

The pattern is the same as every other infrastructure-versus-managed decision. Synthetic.new is not better than Claude Pro in absolute terms, and Claude Pro is not better than Synthetic. They are aimed at different buyers. If you have read this far, you are most likely the buyer for whom $30 a month, sixteen open-weight models, and a programmable endpoint actually moves the needle. (For an adjacent piece on the rest of the open coding-agent stack, see our coverage of Agent Zero, the open-source autonomous agent framework, or our honest reviews of other 2026 AI tools.)


Frequently Asked Questions

What is synthetic.new?

Synthetic.new is a private-cloud LLM subscription from Synthetic Lab, Co. that hosts sixteen always-on open-weight models — including Kimi K2.5, Kimi K2.6, GLM 4.7 Flash, GLM 5.1, Qwen3-Coder-480B, DeepSeek V3.2, and Llama 3.3 — on infrastructure that does not train on your data and does not store API prompts or completions. The platform exposes both an OpenAI-compatible and an Anthropic-compatible endpoint, so it works as a drop-in replacement inside Claude Code, OpenCode, Crush, and similar coding agents.

How much does synthetic.new cost?

The subscription tier is $30 per month (billed as $1 per day) and includes 500 messages per 5 hours, one concurrent request per model, OpenAI- and Anthropic-compatible API access, and unlimited use of the bundled Nomic embedding model. A pay-per-token tier exists for usage-based workloads.

Can you use synthetic.new with Claude Code?

Yes. Synthetic ships a native Anthropic-compatible endpoint at https://api.synthetic.new/anthropic. You point Claude Code at it by exporting ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN, then optionally setting ANTHROPIC_DEFAULT_OPUS_MODEL, ANTHROPIC_DEFAULT_SONNET_MODEL, and ANTHROPIC_DEFAULT_HAIKU_MODEL to specific Synthetic model IDs such as hf:moonshotai/Kimi-K2.5 and hf:zai-org/GLM-4.7-Flash. No router, no proxy, no extra software.

Does synthetic.new train on my code?

Per the company's public terms, no. Synthetic states that it never trains on user data and never stores API prompts or completions, and references GDPR compliance. This is a stronger default than the consumer-tier terms of OpenAI or Anthropic, where opting out of training and logging usually requires explicit account settings or an enterprise contract.

Is synthetic.new worth it compared to Claude Pro?

It depends on the work. Claude Pro at $20/month gives you a single closed frontier model (Sonnet) inside the Anthropic apps; Synthetic at $30/month gives you sixteen open-weight models and a real API key that any coding agent can hit. For a developer running Claude Code, OpenCode, or Crush all day, Synthetic's flat-fee API and 500-messages-per-5-hours window is the better fit. For a user who only ever talks to Claude in the browser, Claude Pro is still the cheaper and simpler choice.


The Bottom Line

Synthetic.new is the cleanest current answer to a specific question: how do I run my coding agent against open-weight models, on private infrastructure, for a flat fee my finance team will not flag? At $30 a month it costs $10 more than Claude Pro and gives you sixteen models, a real API endpoint, an Anthropic-compatible mode that drops into Claude Code with two environment variables, and a privacy clause that does not require an enterprise contract to opt out of training. The one number you have to internalize before subscribing is the one-concurrent-request-per-model cap; everything else about the product is straightforwardly good.

Subscribe, drop the synclaude function into your shell, run a real afternoon of work through Kimi K2.5, and keep your existing Claude Pro account live for the cases where Sonnet's specific behavior is what you actually need. That parallel-trial approach is the honest way to know whether the $30 belongs in your monthly stack.