The Model Is Only 10%: The Real Lesson of the New SDLC

TL;DR

A recent whitepaper from Google highlights that in AI development, the model itself is only 10% of the system. The majority of behavior depends on harness design and context engineering, shifting strategic focus for developers.

A Google whitepaper released in May 2026 argues that the model accounts for only about 10% of what determines an AI system’s behavior. The paper holds that harness design and context engineering shape most of the rest, which changes how organizations should approach AI development and deployment.

The whitepaper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, frames an agent as model plus harness: the model is a small part of the overall system, and roughly 90% of behavior comes from the surrounding harness, including prompts, tools, rules, and observability. As evidence it cites Terminal Bench 2.0, where changing only the harness moved an agent from outside the top 30 into the top 5 with the same model.
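
The "agent = model + harness" framing can be made concrete with a small sketch. This is an illustration under stated assumptions, not the paper's implementation: the Harness class, the fake_model stand-in, and the "CALL <tool> <argument>" convention are all hypothetical. It holds the model fixed and changes only the harness, so the difference in outcome comes entirely from configuration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-in for a model call: prompt text in, reply text out.
Model = Callable[[str], str]


@dataclass
class Harness:
    """Everything around the model: prompt, rules, tools, and a run log."""
    system_prompt: str
    rules: List[str]
    tools: Dict[str, Callable[[str], str]]
    log: List[str] = field(default_factory=list)  # minimal observability

    def run(self, model: Model, task: str) -> str:
        prompt = "\n".join([
            self.system_prompt,
            *self.rules,
            "Tools: " + ", ".join(sorted(self.tools)),
            "Task: " + task,
        ])
        reply = model(prompt)
        # Convention assumed for this sketch: "CALL <tool> <argument>".
        if reply.startswith("CALL "):
            name, _, argument = reply[len("CALL "):].partition(" ")
            if name not in self.tools:
                self.log.append("missing tool: " + name)
                return "error: tool '" + name + "' is not configured"
            self.log.append("tool call: " + name)
            return self.tools[name](argument)
        return reply


def fake_model(prompt: str) -> str:
    """Fixed 'model': always asks for the add tool, whatever the harness."""
    return "CALL add 2+3"


def add_tool(expression: str) -> str:
    """Toy tool that adds two integers written as 'a+b'."""
    left, right = expression.split("+")
    return str(int(left) + int(right))


bare = Harness("You are a helper.", [], {})
equipped = Harness("You are a helper.", ["Use tools for arithmetic."], {"add": add_tool})

print(bare.run(fake_model, "What is 2+3?"))      # error: tool 'add' is not configured
print(equipped.run(fake_model, "What is 2+3?"))  # 5
```

The bare harness fails because a tool is missing, not because the model is weak, which is the kind of configuration failure the paper describes. The log list is a deliberately minimal form of observability.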

Furthermore, the paper stresses that context engineering (the process of providing relevant instructions, knowledge, examples, and guardrails) is more impactful than optimizing prompts alone. This approach lets a generalist model perform as a specialist without carrying the entire knowledge base in context at once, which reduces cost and improves reliability.
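
One way to picture "not carrying the entire knowledge base at once" is a selector that picks only the snippets relevant to the task and stops at a size budget. The sketch below is an assumption-laden illustration, not the paper's method: select_context and the KNOWLEDGE list are hypothetical, keyword overlap stands in for real retrieval, and word count stands in for token count.

```python
import re
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase word tokens; a rough proxy for model tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def select_context(task: str, snippets: List[str], budget_words: int) -> List[str]:
    """Pick the snippets most relevant to the task that fit within the budget."""
    task_words = set(tokenize(task))
    scored = []
    for snippet in snippets:
        overlap = len(task_words & set(tokenize(snippet)))
        if overlap:  # drop snippets that share no words with the task
            scored.append((overlap, snippet))
    # Highest overlap first; sort is stable, so ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)

    chosen, used = [], 0
    for _, snippet in scored:
        cost = len(tokenize(snippet))
        if used + cost <= budget_words:
            chosen.append(snippet)
            used += cost
    return chosen


# Hypothetical knowledge base for illustration only.
KNOWLEDGE = [
    "Refunds are issued within 14 days of a return request.",
    "Shipping to the EU takes 3 to 5 business days.",
    "Refunds for digital goods require manager approval.",
    "The office cafeteria opens at 8 am.",
]

task = "How are refunds handled for digital goods?"
print(select_context(task, KNOWLEDGE, budget_words=12))
# ['Refunds for digital goods require manager approval.']
```

With a budget of 12 words only the most relevant snippet fits. Raising the budget to 20 admits the general refund policy as well, while the shipping and cafeteria lines never enter the context.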

Strategically, the authors argue that organizations should focus more on building and owning their harnesses and context structures, rather than chasing the latest model releases, which are only a small part of the equation.

At a glance
Report · When: published May 2026
The development: Google’s new whitepaper argues that the model accounts for only about 10% of an AI system’s behavior, and that harness and context engineering account for the rest.
The Model Is Only 10% — The New SDLC With Vibe Coding
AI Dispatch · Field Notes
Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary: the differentiator is how outputs get verified
Vibe Coding: casual prompts · “does it seem to work?” · disposable code · high risk
Structured AI-Assisted: detailed prompts + constraints · manual testing · features in real codebases
Agentic Engineering: formal specs · automated tests + evals + CI gates · production scale · low risk
Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding, however clever the prompt.
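
The split between tests and evals can be sketched as a single gate. This is a minimal illustration with hypothetical names (slugify, eval_summary, ci_gate) and a toy rubric, not a real evaluation framework: the test asserts an exact result for deterministic code, while the eval scores free-form output against a rubric and is judged by an average threshold.

```python
import re
from typing import List


def slugify(title: str) -> str:
    """Deterministic code: the same input always gives the same output."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def test_slugify() -> bool:
    """A test: exact comparison against one known-correct answer."""
    return slugify("The Model Is Only 10%") == "the-model-is-only-10"


def eval_summary(summary: str, required_terms: List[str], max_words: int) -> float:
    """An eval: score a free-form output against a rubric, from 0.0 to 1.0."""
    text = summary.lower()
    hits = sum(term in text for term in required_terms)
    coverage = hits / len(required_terms)
    within_limit = len(summary.split()) <= max_words
    return coverage if within_limit else 0.0


def ci_gate(test_results: List[bool], eval_scores: List[float], threshold: float) -> bool:
    """Pass only if every test passes and the mean eval score meets the threshold."""
    mean_score = sum(eval_scores) / len(eval_scores)
    return all(test_results) and mean_score >= threshold


# Hypothetical model outputs, as if sampled from two runs of the same prompt.
outputs = [
    "The harness, not the model, drives most agent behavior.",
    "Bigger models always win.",
]
scores = [eval_summary(text, ["harness", "model"], max_words=20) for text in outputs]

print(scores)                                    # [1.0, 0.5]
print(ci_gate([test_slugify()], scores, 0.7))    # True
print(ci_gate([test_slugify()], scores, 0.8))    # False
```

The threshold is the design choice that matters: free-form output rarely matches one exact string, so the gate accepts a score distribution instead of demanding equality.
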
The idea worth building your strategy around
Agent = Model + Harness
Model: ~10%
Harness (prompts · tools · context · hooks · sandboxes · observability): ~90%, and it is your surface area, not the provider’s
Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness, same model.
“Most agent failures, examined honestly, are configuration failures”: a missing tool, a vague rule, a noisy context.
The economics: it’s a token-cost problem (CapEx vs OpEx)
Vibe Coding: low CapEx · high OpEx. Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.
Agentic Engineering: high CapEx · low OpEx. Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success + intelligent model routing, with cheap models for the easy work.
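
The CapEx/OpEx trade-off and the routing lever reduce to simple arithmetic. The sketch below uses invented numbers throughout: the per-feature costs, the upfront investment, and the PRICE_PER_TASK table are hypothetical, and only the 3× multiplier is borrowed from the low end of the 3–10× range quoted above. The function names are likewise illustrative.

```python
import math
from typing import List, Optional


def crossover_feature(vibe_per_feature: float,
                      agentic_upfront: float,
                      agentic_per_feature: float) -> Optional[int]:
    """First feature count at which the agentic approach is cheaper in total.

    Vibe coding is modeled as zero upfront cost. Returns None if the agentic
    approach never saves anything per feature.
    """
    saving = vibe_per_feature - agentic_per_feature
    if saving <= 0:
        return None
    # Totals are equal at upfront / saving; strictly cheaper one step later.
    return math.floor(agentic_upfront / saving) + 1


# Hypothetical relative prices per task for two model tiers.
PRICE_PER_TASK = {"small": 1.0, "large": 10.0}


def route(difficulty: str) -> str:
    """Send easy work to the cheap model, everything else to the large one."""
    return "small" if difficulty == "easy" else "large"


def routed_cost(difficulties: List[str]) -> float:
    """Total cost when each task goes to the tier chosen by route()."""
    return sum(PRICE_PER_TASK[route(d)] for d in difficulties)


# Agentic: 2000 upfront, 100 per feature. Vibe: 3x that per feature, no upfront.
print(crossover_feature(300.0, 2000.0, 100.0))   # 11

tasks = ["easy"] * 8 + ["hard"] * 2
print(routed_cost(tasks))                        # 28.0
print(PRICE_PER_TASK["large"] * len(tasks))      # 100.0 if everything used the large model
```

Under these assumed numbers the upfront investment pays for itself at the eleventh feature, and routing cuts the per-batch model bill from 100 to 28. Real figures depend on how difficulty is classified, which this sketch takes as given.
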
85% of devs use AI coding agents (51% daily)
41% of all new code is AI-generated
~90% of agent behavior is the harness, not the model
+19% longer on some tasks (METR): verification is the cost
The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.
thorstenmeyerai.com

Implications for AI Strategy and Development

This shift has direct implications for how AI systems get built. It suggests that long-term competitive advantage depends on how well organizations design their harnesses and manage context, rather than solely on acquiring the newest models. By the paper’s argument, companies that invest in robust configuration, tools, and context management can get better performance at lower cost and with greater security than those that rely on constantly upgrading models.

Additionally, the focus on harness and context engineering encourages a more disciplined, structured approach to AI deployment, which can improve reliability, reduce costs, and enhance security — especially as AI becomes more embedded in critical systems.

Background on the Shift in AI Development Focus

Prior to this whitepaper, the common belief was that model improvements (larger, more capable models) were the primary driver of AI progress. However, the experiments and benchmarks the paper cites indicate that the configuration and scaffolding of AI systems play a larger role in actual performance. The paper builds on observations from industry and research that tuning prompts, tools, and rules can outperform simply upgrading models.

This perspective aligns with earlier trends towards more modular, configurable AI systems, but now emphasizes that the model is just a small component, with the harness and context being the real levers of performance and cost-efficiency.

“The model is only 10% of what determines behavior; the harness is 90%. Focus on configuration and context.”

— Addy Osmani

Unanswered Questions About Practical Implementation

While the whitepaper provides strong evidence that harness and context are crucial, it remains unclear how organizations can best standardize and scale these practices across diverse AI applications. Specific methods for measuring harness effectiveness and integrating these approaches into existing workflows are still being developed and tested.
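
Until standard methods exist, one simple starting point is to hold the model and task set fixed and compare pass rates across harness variants. The sketch below is only that: the function name, the record layout, and the sample runs are hypothetical, and the numbers are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# One record per run: (harness_name, task_id, passed). Same model in every run.
Run = Tuple[str, str, bool]


def pass_rate_by_harness(runs: List[Run]) -> Dict[str, float]:
    """Fraction of tasks passed under each harness variant."""
    totals: Dict[str, int] = defaultdict(int)
    passes: Dict[str, int] = defaultdict(int)
    for harness, _task_id, passed in runs:
        totals[harness] += 1
        passes[harness] += int(passed)
    return {harness: passes[harness] / totals[harness] for harness in totals}


# Invented results for two harness variants on the same four tasks.
runs: List[Run] = [
    ("baseline", "t1", False), ("baseline", "t2", True),
    ("baseline", "t3", False), ("baseline", "t4", False),
    ("tuned", "t1", True), ("tuned", "t2", True),
    ("tuned", "t3", True), ("tuned", "t4", False),
]

print(pass_rate_by_harness(runs))   # {'baseline': 0.25, 'tuned': 0.75}
```

Because only the harness differs between the two groups, any gap in pass rate can be attributed to configuration. A real comparison would need far more tasks and repeated runs to separate signal from sampling noise.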

Additionally, the long-term impact of focusing on harness and context versus model improvements is still being studied, and the industry is awaiting further empirical data to confirm these strategic shifts.

Next Steps for AI Teams and Developers

Organizations should prioritize investing in harness development, including tools for configuration, monitoring, and security, and refine their context engineering practices. Future research and industry experiments will likely focus on creating standardized frameworks for harness management and on measuring the impact of those frameworks.

Expect continued emphasis on cost-effective AI deployment strategies, with many companies reevaluating their AI architecture to optimize for performance, reliability, and security based on these insights.

Key Questions

Why is the model only 10% of the system’s behavior?

The whitepaper shows that the behavior of AI systems depends heavily on how they are configured, the tools and rules around them, and the context provided. The model itself is just a small component, with the rest of the system shaping outputs.

How can organizations improve AI performance without upgrading models?

By focusing on harness design (prompts, tools, rules, and observability) and on context engineering (providing relevant instructions and knowledge), companies can significantly improve AI effectiveness and reduce costs.

What are the risks of ignoring harness and context?

Ignoring these elements can lead to misbehavior, security vulnerabilities, and higher costs due to inefficient configurations and unverified outputs, regardless of model capabilities.

Will this shift affect AI model development?

While model improvements will continue, the whitepaper suggests that more value can be gained from better harness and context management, which may lead to a rebalancing of R&D efforts in the industry.

Source: ThorstenMeyerAI.com
