Government agencies embraced generative AI (GenAI) experimentation with surprising speed in a sector known for slow, top-down change. Pilot programs popped up like daffodils after a long winter: sudden, scattered and full of potential. As these early projects mature, agencies must make strategic decisions about how best to scale their efforts: Should they rely on public or proprietary artificial intelligence (AI) models, or adopt open-source alternatives?
Each option carries trade-offs. For agencies and their developers, the right fit balances the degree of control they need, the complexity of the project and how adaptable the solution must be over the long term.
Choosing Among Public, Open-Source and Proprietary AI Models
The first step toward the right AI strategy is understanding the differences among public, open-source and proprietary models — terms that are often conflated.
Public AI models — such as ChatGPT, Gemini and Copilot — are off-the-shelf tools hosted by third-party vendors. Agencies get quick access to scalable solutions with built-in support and updates in exchange for ongoing licensing and service fees. However, development teams have only limited visibility into the underlying algorithms and little room for customization. How these tools handle user data — where it’s stored, for how long and whether it’s used for training — raises significant privacy and data security concerns, especially for public sector users.
Open-source AI models can be hosted on a private cloud, on-premises systems or edge devices, and developers generally have more control over data and model behavior. That control starts with sourcing models only from trusted environments and verifying their authenticity and security (a practice sketched below). Although open-source solutions can be more cost-effective over time, they require skilled teams to implement and maintain.
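As a concrete illustration, the snippet below pins an open-weights model to a specific repository revision and records checksums for the downloaded weight files. It is a minimal sketch using the Hugging Face Hub client; the repository name and revision hash are placeholders, and a real deployment would compare each digest against a vetted manifest rather than printing it.

```python
import hashlib
from pathlib import Path

from huggingface_hub import snapshot_download

# Placeholder identifiers -- substitute a vetted model and a real commit hash.
MODEL_REPO = "example-org/example-7b-instruct"
PINNED_REVISION = "0123456789abcdef0123456789abcdef01234567"

def fetch_verified_model(repo: str, revision: str) -> Path:
    """Download weights at a pinned revision and compute per-file digests."""
    local_dir = Path(snapshot_download(repo_id=repo, revision=revision))
    for weight_file in sorted(local_dir.glob("*.safetensors")):
        digest = hashlib.sha256(weight_file.read_bytes()).hexdigest()
        # In practice, compare each digest against an approved manifest.
        print(f"{weight_file.name}: {digest}")
    return local_dir
```

Pinning to an exact revision matters because a repository tag such as "main" can silently change; an audited commit hash cannot.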
Proprietary AI refers to models built or heavily customized in-house, which offer a high level of control and trust. Building models from scratch is an expensive, resource-intensive pursuit, so that option is becoming rarer. Instead, many agencies tailor open-source models to their specific missions, as the sketch below illustrates.
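One common way to do that tailoring is parameter-efficient fine-tuning, which trains a small set of added weights instead of retraining the whole model. The sketch below uses the LoRA support in the Hugging Face `peft` library; the checkpoint name is a placeholder, and the target modules and hyperparameters are illustrative assumptions, not recommendations.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

# Placeholder checkpoint -- substitute the vetted open-weights model in use.
base = AutoModelForCausalLM.from_pretrained("example-org/example-7b-instruct")

# LoRA trains small low-rank adapter matrices; the base weights stay frozen.
config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                # adapter rank (illustrative)
    lora_alpha=16,      # adapter scaling factor (illustrative)
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections; model-dependent
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically well under 1% of total weights
```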
What Matters in Model Selection
Ultimately, the right model is the one that meets an agency’s operational requirements and risk posture. Four factors consistently rise to the top when evaluating options:
- Data security is often non-negotiable in government settings. Models must be deployable in secure environments and handle sensitive or classified data without external exposure.
- Bias must be examined in light of what data the model was trained on and what was intentionally excluded.
- Reliability means the model delivers accurate, dependable results across use cases.
- Consistency ensures outputs remain stable and interpretable — even when prompts or queries are slightly reworded; a simple check for this is sketched below.
These are the guiding principles that should shape procurement decisions and architectural design.
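Consistency, in particular, lends itself to automated checks. The sketch below sends paraphrased versions of the same question to a deployed model and scores how similar the answers are. The `ask` function is a hypothetical stand-in for whatever inference endpoint an agency actually runs, and lexical similarity is only a rough proxy for semantic agreement.

```python
from difflib import SequenceMatcher

def ask(prompt: str) -> str:
    """Hypothetical stand-in for the agency's deployed inference call."""
    raise NotImplementedError("wire this to the model under test")

# Paraphrases of one question; a consistent model should answer them alike.
PARAPHRASES = [
    "What documents do I need to renew a passport?",
    "Which paperwork is required for a passport renewal?",
    "List the documents required to renew my passport.",
]

def consistency_score(prompts: list[str]) -> float:
    """Average pairwise lexical similarity of the answers, from 0.0 to 1.0."""
    answers = [ask(p) for p in prompts]
    pairs = [(a, b) for i, a in enumerate(answers) for b in answers[i + 1:]]
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)
```

A score tracked over time, with a threshold agreed on by the program office, turns "consistency" from a talking point into a testable requirement.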
Some agencies are already responding to these priorities with thoughtful, phased deployments. For instance, the Department of Homeland Security initially allowed employees to use public AI tools only with publicly available information before launching DHSChat within its secure internal environment. After all, a forward-thinking organization doesn’t have to choose just one option.
Future-Proofing AI in Government
No single model type will address every use case. What matters most is building an adaptable foundation that can evolve as needs — and technologies — change.
Modular, hybrid architectures allow agencies to mix public, open-source and proprietary models depending on the task. With proper abstraction layers, teams can swap models in and out without vendor lock-in or major rewrites, as the sketch below illustrates. For DevOps teams, this means building infrastructure that supports multi-model pipelines, fine-tuning workflows and strict validation. AI should be treated like any other software component: testable, traceable and manageable within modern CI/CD pipelines.
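Here is one way such an abstraction layer might look in Python. The interface, the wrapper classes and the vendor method names are all illustrative assumptions; the point is that application code depends only on a narrow contract, so backends can be swapped without rewrites.

```python
from typing import Protocol

class TextModel(Protocol):
    """The narrow contract every backend must satisfy."""
    def generate(self, prompt: str) -> str: ...

class HostedModel:
    """Wraps a vendor-hosted API; the client and its method are hypothetical."""
    def __init__(self, client) -> None:
        self._client = client

    def generate(self, prompt: str) -> str:
        return self._client.complete(prompt)  # hypothetical vendor call

class LocalModel:
    """Wraps an open-source model served on-premises or at the edge."""
    def __init__(self, pipeline) -> None:
        self._pipeline = pipeline

    def generate(self, prompt: str) -> str:
        return self._pipeline(prompt)[0]["generated_text"]

def summarize_case_file(model: TextModel, text: str) -> str:
    # Application code sees only the TextModel contract, so swapping a
    # hosted backend for a local one requires no changes here.
    return model.generate(f"Summarize this case file for a caseworker:\n{text}")
```

Because every backend sits behind the same contract, the same unit tests and CI/CD validation can run against each model the agency deploys.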
Moreover, creating this kind of flexibility requires a team with both technical chops and subject matter expertise. For years, the technology industry has prioritized growing the ranks of coders, but AI demands a more holistic approach. Technical team members must understand systems architecture, design patterns, interfaces and software engineering principles. They can then tag in subject matter experts to validate and fine-tune AI responses. After all, an effective GenAI solution offers an accurate answer, not just any answer.
The early blooms of AI experimentation revealed promising new ways to streamline government operations. Now comes the real work of cultivating environments where AI can take root and thrive.