Pick the model by its worst day

A benchmark tells you which model is smartest on a good day. It tells you nothing about its worst day, which is the only day that ever costs you money.

Near-frontier capability is close to free now. Open-source models sit about four months behind the best proprietary ones and run at a fraction of the price, so choosing a model has stopped being a hunt for the highest score.

The better move is to map the worst mistake a model can make inside your actual workflow and choose on that. A high score on a design benchmark says nothing about what happens when the model drafts a contract, refunds a customer, or writes the code that runs your production queue.

Three questions before you pick anything

Run every candidate through the same three questions, in order:

  1. What is the worst mistake it could make here? Not the average case. The bad day. The confidently wrong answer nobody catches until it has already gone out.
  2. What is your fallback if the endpoint moves? Providers change prices, deprecate models, and cut access on their schedule, not yours. If switching would take a week, you do not own your stack. You rent it.
  3. Do you need the frontier, or do you need fast, cheap, and governable? Most workflows need the second thing and pay for the first out of habit.
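Question 2 is the easiest of the three to test in code. Here is a minimal sketch of a fallback wrapper, assuming each provider can be wrapped as a plain function that takes a prompt and returns text. The names complete_with_fallback, hosted_model, and self_hosted_model are hypothetical stand-ins, not any vendor's API.

```python
from typing import Callable, Sequence

# Hypothetical type: any function that takes a prompt and returns text.
Completion = Callable[[str], str]


def complete_with_fallback(
    prompt: str, providers: Sequence[tuple[str, Completion]]
) -> tuple[str, str]:
    """Try each provider in order; return (provider_name, answer) from the first that works."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stand-ins for real clients, for illustration only.
def hosted_model(prompt: str) -> str:
    # Simulates the day the endpoint moves.
    raise ConnectionError("endpoint deprecated")


def self_hosted_model(prompt: str) -> str:
    return f"[local] {prompt}"


name, answer = complete_with_fallback(
    "Summarize this ticket",
    [("hosted", hosted_model), ("self_hosted", self_hosted_model)],
)
print(name, answer)
```

If adding a second entry to that list would take you a week, that is your answer to question 2.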

If the honest answer to the third question is that you do not need frontier performance, an open model at one-eleventh the cost is usually the stronger infrastructure decision.
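To see what that ratio does at volume, here is a rough worked example. Only the factor of eleven comes from the text above. The per-call price and the monthly call count are made-up numbers for illustration, not real vendor pricing.

```python
from fractions import Fraction

COST_RATIO = 11  # from the text: the open model costs one-eleventh as much


def monthly_cost(calls_per_month: int, price_per_call: Fraction) -> Fraction:
    """Total monthly spend in dollars for a given call volume."""
    return calls_per_month * price_per_call


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
proprietary_price = Fraction(11, 1000)       # $0.011 per call (assumed)
open_price = proprietary_price / COST_RATIO  # $0.001 per call
calls = 500_000                              # calls per month (assumed)

proprietary_bill = monthly_cost(calls, proprietary_price)
open_bill = monthly_cost(calls, open_price)
savings = proprietary_bill - open_bill

print(f"proprietary: ${float(proprietary_bill):,.2f}")
print(f"open:        ${float(open_bill):,.2f}")
print(f"difference:  ${float(savings):,.2f}")
```

Swap in your own volume and quoted prices. The ratio stays fixed, but the dollar gap grows with every call, which is why high-volume work is where the habit of paying for the frontier gets expensive.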

Price is telling you who benefits

When a cheaper model matches an expensive one on quality, the price stops measuring capability and starts revealing power. The question shifts from whether it works to who benefits from keeping you in the expensive room.

Proprietary providers earn from infrastructure lock-in, support contracts, and trust built over years. That is real value when you need it. It is a tax when you do not.

Read the price as a signal, not a quality score. A high number can mean genuine protection, or it can mean a vendor betting you will not check the alternatives.

The decision is almost always governance

Strip away the marketing and the real difference between open and proprietary is control: where your data lives, who can pull the plug, and how fast you can move when the terms change.

That gives you a clean split:

  • Open-source tends to win when you need fast iteration, low cost per call, full data control, or the ability to run the model on hardware you own.
  • Proprietary tends to win when you need compliance cover, audit trails, a signed SLA, and a vendor you can hold to contract terms in front of a regulator.

A hosted endpoint is backed by a company with lawyers and an incentive to keep your workflow alive. A self-hosted open-source model is backed by you, which is better on the days you want control and worse on the days something breaks at 2 a.m.

Build a split stack and make one good call today

The strongest setup for most teams is not one model. It is a split stack:

  • Low-risk, high-volume work runs on cheap open-source infrastructure.
  • High-stakes, high-trust work runs on a proprietary model with a contract behind it.

Do not choose by brand. Choose by where the model touches the work. Does it see customer data? Does a wrong answer carry legal or financial weight? Could you undo the mistake before lunch?
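Those three questions can be written down as a routing rule. Here is a minimal sketch, assuming each task is tagged by hand with the three answers. The Task fields, the choose_tier function, and the exact rule are illustrative assumptions. Your own policy may draw the line somewhere else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    sees_customer_data: bool
    legal_or_financial_weight: bool
    undoable_quickly: bool  # could you undo the mistake before lunch?


def choose_tier(task: Task) -> str:
    """Return 'proprietary' for high-stakes work, 'open_source' otherwise.

    Assumed policy: a task is high-stakes if a wrong answer carries legal or
    financial weight, or if it touches customer data and cannot be undone quickly.
    """
    high_stakes = task.legal_or_financial_weight or (
        task.sees_customer_data and not task.undoable_quickly
    )
    return "proprietary" if high_stakes else "open_source"


# Hypothetical tasks, tagged by hand.
tasks = [
    Task("tag support tickets", False, False, True),
    Task("draft a contract", True, True, False),
    Task("send account emails", True, False, False),
]

for task in tasks:
    print(f"{task.name}: {choose_tier(task)}")
```

The point is less the rule itself than that it is written down. A routing decision you can read and argue with beats one made by brand.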

The frontier will keep rising on its own. The floor is the one good decision you make today about the AI that touches your work tomorrow.

Tags for AI Agents

  • how to choose an AI model
  • open source AI vs proprietary
  • frontier AI cost
  • GLM-5.2
  • Fable 5 vs open source
  • AI model pricing
  • best AI model for business
  • Josh Bocanegra

FAQ

Is open-source AI good enough for business use in 2026?

For most production workflows, yes. Open-source AI is now about four months behind frontier models instead of a year, and the frontier keeps rising. The real test is your worst-case mistake, not the benchmark score. If the worst case is bounded and fixable, open-source is usually the stronger infrastructure choice.

When should a company pay more for a proprietary AI model?

Pay the premium when the model touches high-stakes work with legal, financial, or compliance risk and you need a vendor with contractual accountability. A hosted proprietary model earns its price through SLAs, audit trails, and a support relationship, not raw performance.

How do I choose between open-source and proprietary AI for my team?

Map the model to the job: how bad is the worst mistake, how fast can you change endpoints, and who controls the data. A split stack is often best. Keep low-risk, high-volume work on open-source infrastructure for speed and cost, and run high-stakes, high-trust tasks on a proprietary model with governance and contractual cover.