AI Model Selection: Domestic Solutions and Open-Source Alternatives
Given that top-tier flagship models such as ChatGPT, Claude, and Gemini are effectively unavailable in many settings, and that corporate or team deployments must still contend with strict compliance constraints, domestic solutions and open-source alternatives are far from irrelevant. They may not yet match the all-around capability of those closed flagship systems, but across a wide range of tasks they are already good enough, and their cost structure is unusually attractive.
DeepSeek has released its latest v3.1, continuing the understated pattern it established in May, when it updated the R1 weights without changing the version number. Although v3.1 has also been criticized as a benchmark-optimized “exam taker” whose real-world behavior is less stable than its scores suggest, its low price, its demystification of the reasoning paradigm, and its speed-first engineering instinct all reveal a distinctive sense of technical taste. The question is less whether the model is universally capable than whether it is placed in the right context; for API-based daily use and general-purpose tasks, it already performs with composure.
For coding productivity, Kimi K2’s strong agentic performance may in fact make it the more compelling practical choice. A mature setup that combines the CLI interaction pattern of tools such as Claude Code with Kimi K2 already offers a robust compromise between price and performance. At the same time, the friction of a Claude Pro subscription, together with its demanding network and payment requirements, forms a de facto barrier to entry; much like Europe’s excessively cautious safety regime, “Constitutional AI” can sometimes feel rather tightly bound.
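As a sketch of what such a setup looks like in practice: Claude Code can be pointed at an Anthropic-compatible third-party endpoint via environment variables. The endpoint URL, variable names, and key format below are assumptions that should be verified against Moonshot’s current documentation.

```shell
# Hypothetical configuration: check the exact endpoint and variable
# names against Moonshot's docs before relying on this.
export ANTHROPIC_BASE_URL="https://api.moonshot.ai/anthropic"
export ANTHROPIC_API_KEY="sk-..."   # your Moonshot API key, not an Anthropic one

# Launch Claude Code as usual; requests are now served by Kimi K2.
claude
```

The appeal of this pattern is that the interaction layer (the CLI agent) and the model behind it are decoupled, so the workflow survives a change of provider.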
Another major coordinate in domestic open-source models is Qwen, which, alongside Gemini, has been regarded as something of a managerial miracle; Meta’s LLaMA, by contrast, set a precedent that is not especially worth admiring. Qwen’s embedding models have long been a reliable choice. If one is building a Retrieval-Augmented Generation (RAG) system, or a vector database for a broader class of tasks, they deserve to be considered near the top of the list.
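The core retrieval step of such a RAG system is small enough to sketch. The vectors below are toy stand-ins; in a real system they would come from a Qwen embedding model (the specific model name and loading API are assumptions to verify, e.g. via sentence-transformers or Alibaba’s DashScope service).

```python
import numpy as np

# Toy stand-ins for document embeddings. In practice, each vector would
# be produced by a Qwen embedding model over the document text.
docs = ["DeepSeek pricing notes", "Qwen embedding guide", "Gemma on mobile"]
doc_vecs = np.array([
    [0.9, 0.1, 0.0],
    [0.1, 0.9, 0.1],
    [0.0, 0.1, 0.9],
])

def retrieve(query_vec, doc_vecs, docs, k=1):
    """Return the k documents whose embeddings are most cosine-similar to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                      # cosine similarity per document
    top = np.argsort(scores)[::-1][:k]  # indices of the k best matches
    return [docs[i] for i in top]

print(retrieve(np.array([0.2, 0.8, 0.1]), doc_vecs, docs))
# -> ['Qwen embedding guide']
```

A production system would replace the brute-force dot product with a vector database index, but the ranking logic is the same.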
Another direction worth watching is performance-sensitive workloads: OpenAI has released its own open-weight models, gpt-oss, whose larger variant approaches the level of o4-mini while the smaller is comparable to o3-mini. This is no modest position. Even the weight-updated DeepSeek-R1-0528 still sits at some distance from o4-mini, another reasoning model, giving the strange impression of a floor that exceeds someone else’s ceiling. For scenarios with explicit performance requirements, purchasing cloud services to deploy this open-source model is therefore a highly pragmatic path.
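Such a self-hosted deployment can be quite short in outline. The model identifier and serving flags below are assumptions to check against the model card and the inference server’s documentation.

```shell
# Sketch of serving gpt-oss on a rented GPU instance with vLLM.
# Verify the model id and supported flags against current docs.
pip install vllm
vllm serve openai/gpt-oss-20b --port 8000
# The server then exposes an OpenAI-compatible chat completions API,
# so existing client code can switch over by changing the base URL.
```

The practical point is that the open weights plus an OpenAI-compatible server make migration from the hosted API largely a configuration change.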
For needs that are narrow in scale but entirely real, Gemma 3n offers a different answer. This mobile-first open-source model is almost counterintuitive: even its smallest, aggressively compressed variants still preserve multimodal capability. For mobile computing and real-time translation, this is a remarkably unambiguous blessing.
In conclusion, by August 2025, the debate between open-source and closed-source models had already been settled the previous year in a direct and rather overwhelming fashion. Yet this does not prevent us from using flexible engineering techniques and comparatively rigorous academic methods to construct systems of our own. For instance, the paper The Avengers: A Simple Recipe for Uniting Smaller Language Models to Challenge Proprietary Giants uses clustering algorithms to organize open-source models and achieves results surpassing GPT-4.1; in mechanism, it resembles an explicit Mixture of Experts (MoE) architecture. It continues to illuminate the old engineering lesson: there is no absolute perfection, only contextual suitability. To move along the boundary between real-world constraints and technical possibility is precisely the charm of computer science.
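The routing idea behind that recipe can be illustrated in a few lines. This is an illustrative sketch in the spirit of the paper, not its exact algorithm: queries are clustered by embedding, and each cluster is served by whichever model has scored best on that cluster historically. The cluster centroids, model names, and per-cluster scores below are all invented for the example.

```python
import numpy as np

# Hypothetical cluster centroids in embedding space.
centroids = np.array([
    [1.0, 0.0],   # cluster 0: say, coding-flavored queries
    [0.0, 1.0],   # cluster 1: say, math-flavored queries
])

# Hypothetical historical accuracy of two open-source models per cluster.
model_scores = {
    "kimi-k2":       [0.82, 0.61],
    "deepseek-v3.1": [0.74, 0.79],
}

def route(query_vec):
    """Assign the query to its nearest cluster, then pick that cluster's best model."""
    cluster = int(np.argmin(np.linalg.norm(centroids - query_vec, axis=1)))
    return max(model_scores, key=lambda m: model_scores[m][cluster])

print(route(np.array([0.9, 0.2])))  # coding-like query -> "kimi-k2"
print(route(np.array([0.1, 0.8])))  # math-like query   -> "deepseek-v3.1"
```

The resemblance to Mixture of Experts is direct: the clustering plays the role of an explicit, interpretable gating function, with whole models as the experts.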