Got nerfed by DeepSeek?
2 receipts logged, none critical. Below: open-source replacements you can switch to today + tracked paid competitors ranked by who’s least likely to nerf you next. Step-by-step guides walk you through the actual install + config + API-key transfer - that part’s a Personal-tier feature.
Self-hosted, free, or freemium tools that cover the same workflow. Every entry is real software with a public repo and an active community.
Easiest way to run a local LLM. Pair with Open WebUI for full ChatGPT replacement.
- Run Llama/Mistral/Gemma/DeepSeek locally
- Ollama-compatible API for any client
- Brew/winget/curl install
brew install ollama && ollama run llama3.3The default ChatGPT-replacement self-host. Massive ecosystem.
- ChatGPT-style web UI for any backend (Ollama, OpenAI-compatible)
- RAG with file uploads
- Function calling, web search plugin
docker run -p 3000:8080 ghcr.io/open-webui/open-webui:mainPermissive (MIT) open-weight model targeting ChatGPT/Claude-class chat and agentic coding without API lock-in - the strongest open option for a self-hosted stack.
- Fully-open (MIT) flagship MoE from Z.ai / Zhipu AI (weights released Jun 2026)
- ~744B total / ~40B active parameters, 1M-token context (~5x GLM-5.1)
- Strongest open-weight coding model at release - #2 on Code Arena, ~#3 on FrontierSWE behind only Fable 5 / Opus 4.8
ollama run glm-5.2 (or download weights from HuggingFace: zai-org/GLM-5.2, serve via vLLM/SGLang)Drop-in open-weight replacement for the OpenAI/Anthropic chat APIs. Flash is the locally-runnable variant; serve it behind Open WebUI.
- MIT-licensed open-weight MoE: V4-Flash (284B/13B active) + V4-Pro (1.6T/49B active)
- 1M-token context
- Near-frontier quality at a fraction of API cost
Download from HuggingFace (deepseek-ai/DeepSeek-V4-Flash); serve via vLLM/SGLangUnrestricted (Apache-2.0) open-weight model to replace the OpenAI/Anthropic/Gemini chat APIs in a self-hosted stack.
- Apache-2.0 open-weight MoE from Mistral AI: 675B total / 41B active
- 256K context, multimodal (text + vision)
- Production-grade general-purpose model
ollama run mistral-large-3 (or download weights from HuggingFace: mistralai/Mistral-Large-3-675B-Instruct-2512-BF16, serve via vLLM/SGLang)Permissive (Apache-2.0) open-weight model to back a self-hosted ChatGPT/Claude replacement. The 35B-A3B MoE runs on a single high-RAM machine.
- Apache-2.0 open-weight family from Alibaba's Qwen team
- Qwen3.6-35B-A3B MoE (35B total, 3B active) + 27B dense
- 256K context, extensible to ~1M
ollama run qwen3.6 (or pull from HuggingFace: Qwen/Qwen3.6-35B-A3B)Other paid tools we track in the same category, ranked by current Nerf Index. Lowest = least likely to nerf you next.
1 major change on record.
2 major changes on record · roughly one every 107 days.
DeepSeek → Ollama
Unlock the full Ollama migration
Step one’s on us. The rest — config port, smoke test, the buried cancel path with a prefilled refund email, and the rollback — open the second you join.
- All 6 steps, copy-paste ready
- Every migration playbook — including ones we haven’t written yet
- Refund email that’s clawed back $80+
- gotnerfed ask — grounded help through every step