
How to Choose the Right License for Your Large Model

In recent years, as large models have rapidly evolved, we've been amazed by technological marvels while also becoming increasingly aware: open-sourcing a model is not as simple as dumping parameters and weights onto platforms like GitHub, Hugging Face, or AtomGit.

Open source accelerates technology diffusion and fosters a more vibrant ecosystem. However, for model developers, open-source models are currently in an exploratory state of "crossing the river by feeling the stones":

  • Are model parameters considered copyrightable subject matter? If the license does not explicitly mention it, can downstream parties freely use them?
  • Does the training data give rise to data rights? If downstream parties use your model to generate infringing content, could you be named as a co-defendant?
  • Can traditional open-source software licenses (like MIT, Apache-2.0) adequately cover the authorization and liability boundaries for model parameters, data, and generated content?

These issues increasingly hang over model developers like a Sword of Damocles, and every practitioner in the field inevitably grapples with them.

To help model developers and downstream users reduce compliance risks and foster a healthier open-source model ecosystem, it's time to discuss how to choose the right license for large models.

I. The "Latent Risks" Facing Open-Source Large Models Are Becoming Apparent

Practitioners know that the key issues with model open-sourcing mainly fall into three categories:

  1. Do Model Parameters Qualify for Copyright Protection? Software is protected by copyright because code is an "expression" written line by line by developers. However, model parameters are not written; they are trained. Within the intellectual property protection system, they are fundamentally different from traditional software. There is significant uncertainty about whether they constitute an "expression" protected by copyright law. This means: when you open-source a model using a traditional open-source software license, you might not be able to license the most critical components—the model parameters and weights—through the license's copyright grant.

  2. How Should Data, Parameters, Model Architecture... Be Licensed? According to the OSI's definition of "Open Source AI," open-source AI must license three core elements simultaneously:

    • Data (Data Information)
    • Code
    • Parameters

    This is essential to grant users the four freedoms: "to use, study, modify, and share." However, traditional open-source software licenses do not address the licensing of parameters or data at all.
  3. Who Bears the Rights and Responsibilities for Model-Generated Content? If downstream users infringe rights using content generated by an upstream open-source model, can the model provider be held liable? While definitive legal protections for models may still be evolving, both model developers and downstream users clearly need explicit clarification of rights and responsibilities within the license. If these risks are not properly mitigated, open-source models may harbor uncertain legal costs.

II. Why Do Many Teams Still Insist on Using "Open-Source Software Licenses"?

Within the industry, we see many models choosing software licenses like MIT or Apache-2.0. Taking DeepSeek as an example: it initially created the custom DeepSeek License for its large models but later chose to uniformly adopt the MIT License for the following reasons:

  • Non-standard licenses might increase developers' understanding costs.
  • The MIT License is sufficiently permissive, facilitating ecosystem growth.

This reflects a reality: even if not perfectly suited, mainstream permissive open-source software licenses remain the most readily accepted, understood, and disseminated method within the community today. However, as the earlier risk discussion shows: open-source software licenses cannot fully meet the licensing requirements for "open-source large models," especially concerning the authorization of the most crucial component—model parameters.

III. When Traditional Licenses Are Insufficient: The Industry Begins Designing "Model-Specific Licenses"

Precisely because models are created differently from software, leading open-source foundations—both domestic and international—have released two sets of licenses specifically designed for large models:

  • OpenAtom Model License (mid-2024)
  • OpenMDW Model License (early 2025)

The emergence of these licenses marks the industry's official entry into the era where "models require model-specific licenses for open-sourcing."

IV. What Core Pain Points Do Model-Specific Licenses Actually Solve?

  1. Model licenses can cover the licensing of the three core elements of a model: parameters, data, and architecture. This is something traditional software licenses completely fail to do. The OpenAtom Model License explicitly includes "model parameters" within its scope. Furthermore, model developers can choose, based on their own circumstances and licensing intent, whether to open-source the following elements:

    • Model Architecture
    • Training Data
    • Training Code
    • Inference Code

    This fundamentally addresses the issue of missing authorization for the key component, the model parameters, inherent in open-source software licenses.
  2. Grants of Rights Are More Explicit and Comprehensive. Traditional software licenses typically only cover software copyright and include a grant of certain patent rights. However, models involve various intellectual property elements. Therefore, the OpenAtom Model License, while retaining grants for copyright and patent rights, adds a "catch-all" clause for "other intellectual property rights." This ensures model developers can fully license their relevant rights in the model to downstream users through this license. OpenMDW, on the other hand, adopts an enumerated approach, adding grants for datasets and trade secrets on top of copyright and patent grants.

  3. Model Providers "Do Not Assert Rights" Over Generated Content. For example, the OpenAtom Model License clearly states:

    • The model provider does not assert any rights over generated content.
    • The model user possesses full rights to the generated content and bears the associated responsibilities.

    This provides clear legal boundaries for both model developers and downstream users.
  4. Simplified Licensing for Hosted Service (MaaS) Scenarios. The OpenAtom Model License offers simplified licensing terms for Model-as-a-Service (MaaS) scenarios, making them more user-friendly. It reduces its four open-source licensing conditions:

    • Provide a copy of the license
    • Document modifications
    • Maintain attribution notices
    • Prohibit illegal use

    to only two conditions applicable in MaaS scenarios:

    • Maintain attribution notices
    • Prohibit illegal use

    This adjustment lowers the compliance burden in MaaS scenarios, enhances engineering practicality, and significantly improves the user experience of related products.
  5. Balancing Internationalization and Localization: Language-Friendly + Flexible Legal Applicability. The OpenAtom Model License:

    • Is published in both Chinese and English.
    • Does not specify a governing law.
    • Provides that the Chinese version prevails if Chinese law applies.

    This provides ample room for domestic teams, international cooperation, and overseas ecosystems.
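The condition reduction described in item 4 (simplified licensing for MaaS scenarios) can be pictured as picking a subset of a fixed set. The following is a minimal Python sketch, not part of any license text: the `Condition` labels, `MAAS_CONDITIONS`, and `applicable_conditions` are hypothetical names invented here for illustration, and the license itself remains the authority on which conditions apply in a given case.

```python
from enum import Enum


class Condition(Enum):
    """Hypothetical labels for the four conditions listed in item 4."""
    PROVIDE_LICENSE_COPY = "provide a copy of the license"
    DOCUMENT_MODIFICATIONS = "document modifications"
    MAINTAIN_ATTRIBUTION = "maintain attribution notices"
    NO_ILLEGAL_USE = "prohibit illegal use"


# The two conditions the article says remain in hosted-service (MaaS) scenarios.
MAAS_CONDITIONS = frozenset(
    {Condition.MAINTAIN_ATTRIBUTION, Condition.NO_ILLEGAL_USE}
)


def applicable_conditions(hosted_service_only: bool) -> frozenset:
    """Return the conditions to track for a given distribution mode.

    hosted_service_only=True models the MaaS case; False models the
    ordinary case where all four conditions apply.
    """
    if hosted_service_only:
        return MAAS_CONDITIONS
    return frozenset(Condition)


if __name__ == "__main__":
    for mode in (False, True):
        names = sorted(c.value for c in applicable_conditions(mode))
        print("MaaS" if mode else "Distribution", "->", names)
```

A compliance checklist built this way only has to branch once on the distribution mode, which is the practical benefit the simplified terms are meant to deliver.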

V. OpenAtom Model License vs. OpenMDW: Which One Should You Choose?

| Item | OpenAtom Model License | OpenMDW Model License |
| --- | --- | --- |
| Coverage of Licensed Subject Matter | Comprehensive: Parameters, Architecture, Data, Code | Similar but uses "enumerated" grants |
| Grant Method | Copyright + Patent + Catch-all for other IP rights | Copyright + Patent + Datasets + Trade Secrets |
| Model-Generated Content | Explicitly asserts no rights; clearly states licensee bears risk and responsibility (Clearer) | Only states licensor imposes no restrictions or obligations |
| Suitability for Hosted Services (MaaS) | Supports "Simplified License Conditions" (Most friendly) | No special simplified mechanism |
| Language / Governing Law | Bilingual (CN/EN); Flexible governing law | English text only |
| Support within Chinese Local Ecosystem | Models like Mobius, vivo BlueLM, HaiRuo-72B-Health have adopted it | Recently released; ecosystem still developing |
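For teams that want to apply this comparison mechanically, the table can be encoded as data and filtered by requirement. The sketch below is only one possible reading of the rows above: `LicenseProfile`, its field names, and `shortlist` are hypothetical, and the boolean values simply restate the table rather than any independent legal analysis.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LicenseProfile:
    """Hypothetical summary of one column of the comparison table."""
    name: str
    simplified_maas_conditions: bool  # "Suitability for Hosted Services" row
    bilingual_text: bool              # "Language / Governing Law" row
    enumerated_grants: bool           # True: rights enumerated; False: catch-all


PROFILES = [
    LicenseProfile("OpenAtom Model License", True, True, False),
    LicenseProfile("OpenMDW Model License", False, False, True),
]


def shortlist(
    need_maas_simplification: bool = False,
    need_bilingual_text: bool = False,
    prefer_enumerated_grants: Optional[bool] = None,
) -> List[str]:
    """Return the names of licenses that satisfy every stated requirement."""
    result = []
    for profile in PROFILES:
        if need_maas_simplification and not profile.simplified_maas_conditions:
            continue
        if need_bilingual_text and not profile.bilingual_text:
            continue
        if (prefer_enumerated_grants is not None
                and profile.enumerated_grants != prefer_enumerated_grants):
            continue
        result.append(profile.name)
    return result


if __name__ == "__main__":
    print(shortlist(need_maas_simplification=True))
    print(shortlist(prefer_enumerated_grants=True))
```

A filter like this can narrow the candidates, but the final choice still depends on reading the license texts against the team's actual release plan.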

VI. Final Thoughts: The Open-Source Large Model Ecosystem Needs the "Right Tool"

As model scales expand from billions to hundreds of billions of parameters, as training data grows from terabytes to petabytes, and as open-sourcing transitions from mere "technology dissemination" to a critical component of industrial collaboration, data compliance, and international operations, upstream model teams urgently need a licensing tool that is:

  • Sufficiently professional
  • Comprehensive in its coverage and grants
  • Balances legal rigor with engineering usability
  • Aligned deeply with the industrial ecosystem

The OpenAtom Model License happens to strike the balance the industry needs in these aspects. It is not the only solution, but for large model practitioners in China, it is perhaps currently one of the "tools" most worthy of serious consideration.

In today's era of rapid evolution in the large model industry, "open source" has become almost a core strategic issue for every model team, alongside technological roadmaps. Open-sourcing models can bring prestige, foster ecosystems, attract users, and create commercial opportunities. However, accompanying these benefits are non-negligible risks: Are model parameters copyrightable "expressions" that have been effectively licensed downstream? Are you liable if downstream users generate infringing content with your model? Can you assert your rights if model weights are fine-tuned and commercialized downstream? None of these questions are "easy points"; they are the challenging "final exam questions" that perplex the entire industry and cause collective anxiety.

All these questions ultimately converge at a common entry point: License Choice. The license is not a dispensable accessory for an open-source model; it is the true "secure starting point" for open-sourcing a model. An open-source model without a suitable license is like "the Emperor's New Clothes"—it may temporarily enjoy the superficial "splendor" of widespread praise, but ultimately cannot avoid the embarrassment and risk when the thin veil is pierced.