The Model Is Only 10%: The Real Lesson of the New SDLC

TL;DR

A new Google whitepaper on the AI-era software development lifecycle (SDLC) argues that the model accounts for only about 10% of an AI agent's behavior. The rest comes from the harness and context engineering, which drive performance, reliability, and cost. For organizations, that moves the main lever for AI integration from model selection to the scaffolding they control.

A new Google whitepaper, titled The New SDLC With Vibe Coding, states that AI models account for only about 10% of the behavior in AI-driven systems. The paper emphasizes that the harness and context engineering are the primary factors influencing system performance, cost, and reliability. This challenges the common focus on model improvements alone and shifts attention toward configuration, scaffolding, and contextual design.

The whitepaper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, reports that 85% of developers use AI coding agents regularly and 51% use them daily. With adoption that broad, the paper argues that generation (the model's raw output) is now largely solved. Attention should move to the harness, which includes prompts, tools, rules, and observability, and to context engineering: structuring instructions, knowledge, memory, and guardrails so the model is guided effectively.
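To make the four context layers concrete, here is a minimal sketch of context assembly under a token budget. It is an illustration, not the paper's method: the `ContextBundle` class, the `build_prompt` function, the four-characters-per-token estimate, and the "knowledge before memory" priority are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ContextBundle:
    """Hypothetical container for the four context layers named above."""
    instructions: list[str] = field(default_factory=list)
    knowledge: list[str] = field(default_factory=list)
    memory: list[str] = field(default_factory=list)
    guardrails: list[str] = field(default_factory=list)


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly one token per four characters.
    return max(1, len(text) // 4)


def build_prompt(bundle: ContextBundle, task: str, budget: int) -> str:
    """Always keep instructions, guardrails, and the task; spend whatever
    budget remains on knowledge first, then memory."""
    required = [("INSTRUCTIONS", s) for s in bundle.instructions]
    required += [("GUARDRAILS", s) for s in bundle.guardrails]
    required.append(("TASK", task))
    used = sum(estimate_tokens(text) for _, text in required)
    if used > budget:
        raise ValueError("budget too small for required context")

    optional = [("KNOWLEDGE", s) for s in bundle.knowledge]
    optional += [("MEMORY", s) for s in bundle.memory]
    kept = []
    for label, text in optional:
        cost = estimate_tokens(text)
        if used + cost <= budget:  # skip items that would overflow the budget
            kept.append((label, text))
            used += cost

    # The task goes last; everything else keeps its layer order.
    ordered = required[:-1] + kept + [required[-1]]
    return "\n".join(f"[{label}] {text}" for label, text in ordered)
```

The design choice worth noting is that guardrails are never dropped to make room for knowledge or memory: a prompt that fits the budget but loses its constraints is the "noisy context" failure in reverse.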

The paper cites evidence that changing the harness can improve an agent's performance more than upgrading the model. In one reported case, a coding agent moved from outside the top 30 to the top 5 on Terminal Bench 2.0 with only harness changes and the same model. The authors also argue that most agent failures are configuration failures, such as a missing tool, a vague rule, or a noisy context, rather than model deficiencies. That makes the harness the main surface for optimization and competitive advantage.
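A toy illustration of "same model, different harness" may help. In the sketch below, the model is a deterministic stub, so the harness is the only variable. Every name here (`Harness`, `run_agent`, `stub_model`, the `CALL name arg` convention, the `calc` tool) is hypothetical and invented for this example; real agent frameworks use their own tool-calling protocols.

```python
from dataclasses import dataclass
from typing import Callable

Model = Callable[[str], str]  # transcript in, reply out


@dataclass(frozen=True)
class Harness:
    system_prompt: str
    tools: dict[str, Callable[[str], str]]
    max_steps: int = 3


def run_agent(model: Model, harness: Harness, task: str) -> str:
    """Ask the model, run any tool it requests, feed the result back."""
    transcript = f"{harness.system_prompt}\nTASK: {task}"
    for _ in range(harness.max_steps):
        reply = model(transcript)
        if not reply.startswith("CALL "):
            return reply  # final answer
        _, name, arg = reply.split(" ", 2)
        tool = harness.tools.get(name)
        result = tool(arg) if tool else f"error: no tool named {name}"
        transcript += f"\n{reply}\nRESULT: {result}"
    return "gave up"


def stub_model(transcript: str) -> str:
    """Stand-in for a real model; it always wants to use a calculator."""
    last = transcript.splitlines()[-1]
    if last.startswith("RESULT: error"):
        return "I could not verify the answer."
    if last.startswith("RESULT: "):
        return f"Answer: {last.removeprefix('RESULT: ')}"
    return "CALL calc 6*7"


def calc(expr: str) -> str:
    """Tiny multiplication-only tool, to avoid calling eval()."""
    left, right = expr.split("*")
    return str(int(left) * int(right))


bare = Harness("You are a coding agent.", tools={})
equipped = Harness("You are a coding agent.", tools={"calc": calc})

print(run_agent(stub_model, bare, "What is 6*7?"))      # I could not verify the answer.
print(run_agent(stub_model, equipped, "What is 6*7?"))  # Answer: 42
```

The model function is identical in both runs; only the tool registry changes. That is the shape of the paper's claim, although a stub this small says nothing about the size of the effect on a real benchmark.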

At a glance
When: announced March 2026
The development: A Google whitepaper introduces a new software development lifecycle (SDLC), arguing that AI models drive only about 10% of system behavior and that harness and context engineering are the main drivers.
The Model Is Only 10% — The New SDLC With Vibe Coding
AI Dispatch · Field Notes
Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary — the differentiator is how outputs get verified
Vibe Coding: casual prompts · “does it seem to work?” · disposable code · high risk
Structured AI-Assisted: detailed prompts + constraints · manual testing · features in real codebases
Agentic Engineering: formal specs · automated tests + evals + CI gates · production scale · low risk
Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding — however clever the prompt.
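The split between tests and evals can be sketched in a few lines. This is an assumed, simplified setup rather than anything prescribed by the whitepaper: `slugify` stands in for deterministic code, `fake_generate` stands in for a model call, and the rubric functions and the 0.8 threshold are placeholders.

```python
from typing import Callable


def slugify(title: str) -> str:
    """Deterministic helper: the exact output is known, so a unit test fits."""
    return "-".join(title.lower().split())


def test_slugify() -> bool:
    return slugify("The New SDLC") == "the-new-sdlc"


Rubric = Callable[[str], bool]


def eval_pass_rate(generate: Callable[[str], str],
                   cases: list[tuple[str, Rubric]]) -> float:
    """Score free-form output against a rubric per case, not an exact string."""
    passed = sum(1 for prompt, rubric in cases if rubric(generate(prompt)))
    return passed / len(cases)


def ci_gate(tests_ok: bool, pass_rate: float, threshold: float = 0.8) -> bool:
    """Allow the merge only if tests pass AND the eval score clears the bar."""
    return tests_ok and pass_rate >= threshold


def fake_generate(prompt: str) -> str:
    # Stand-in for a model call: returns the first sentence as a "summary".
    return prompt.split(".")[0] + "."


CASES: list[tuple[str, Rubric]] = [
    ("The harness matters most. Models matter less.", lambda out: "harness" in out),
    ("Evals score fuzzy output. Tests check exact output.", lambda out: len(out) < 40),
    ("Context is curated. It is not dumped.", lambda out: "dumped" in out),
]

rate = eval_pass_rate(fake_generate, CASES)  # 2 of 3 rubrics pass
print(ci_gate(test_slugify(), rate))         # False: 0.67 is below 0.8
```

The gate fails here even though the unit test is green, which is the point of the panel above: passing tests say nothing about the non-deterministic part of the system.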
The idea worth building your strategy around
Agent = Model + Harness
Model: ~10%
Harness: ~90% (prompts · tools · context · hooks · sandboxes · observability)
The ~90% is your surface area, not the provider’s.
Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness — same model.
“Most agent failures, examined honestly, are configuration failures” — a missing tool, a vague rule, a noisy context.
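The three failure types in that quote (a missing tool, a vague rule, a noisy context) can be checked mechanically before an agent ever runs. The following is a rough sketch under stated assumptions: `HarnessConfig`, `lint_harness`, the `VAGUE_WORDS` list, and the keyword-overlap test for relevance are all invented for illustration and are far cruder than what a production harness would need.

```python
from dataclasses import dataclass, field

# Assumed word list; a real check would need something more careful.
VAGUE_WORDS = {"appropriately", "properly", "good", "clean", "best"}


@dataclass
class HarnessConfig:
    tools: set[str]
    rules: list[str]
    context_docs: dict[str, str]  # document name -> text
    required_tools: set[str] = field(default_factory=set)


def _words(text: str) -> set[str]:
    return {w.strip(".,").lower() for w in text.split()}


def lint_harness(cfg: HarnessConfig, task_keywords: set[str]) -> list[str]:
    """Return one finding per suspected configuration failure."""
    findings = []
    for name in sorted(cfg.required_tools - cfg.tools):
        findings.append(f"missing tool: {name}")
    for rule in cfg.rules:
        if _words(rule) & VAGUE_WORDS:
            findings.append(f"vague rule: {rule!r}")
    for name, text in cfg.context_docs.items():
        # A document sharing no keyword with the task is treated as noise.
        if not _words(text) & task_keywords:
            findings.append(f"noisy context: {name}")
    return findings


cfg = HarnessConfig(
    tools={"read_file"},
    required_tools={"read_file", "run_tests"},
    rules=["Write clean code.", "Run pytest before every commit."],
    context_docs={
        "style.md": "Use pytest fixtures for database tests.",
        "lunch.md": "The cafeteria menu changes on Friday.",
    },
)
for finding in lint_harness(cfg, task_keywords={"pytest", "database"}):
    print(finding)
```

A check like this will not catch every configuration failure, but it turns the "examined honestly" step into something that can run in CI alongside tests and evals.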
The economics: it’s a token-cost problem (CapEx vs OpEx)
Vibe Coding: low CapEx · high OpEx
Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.
Agentic Engineering: high CapEx · low OpEx
Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success + intelligent model routing (cheap models for the easy work).
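The crossover is simple arithmetic once both approaches are written as an upfront cost plus a per-feature cost. The dollar figures below are hypothetical and chosen only so that vibe coding costs 5× more per feature, inside the 3–10× range quoted above; they are not from the paper.

```python
import math


def cumulative_cost(capex: float, opex_per_feature: float, features: int) -> float:
    """Total spend after shipping `features` features."""
    return capex + opex_per_feature * features


def break_even_features(capex_a: float, opex_a: float,
                        capex_b: float, opex_b: float) -> int:
    """Smallest feature count at which option B is strictly cheaper than A."""
    if opex_b >= opex_a:
        raise ValueError("B never catches up: its per-feature cost is not lower")
    # capex_a + opex_a*n > capex_b + opex_b*n  <=>  n > (capex_b - capex_a) / (opex_a - opex_b)
    return max(0, math.floor((capex_b - capex_a) / (opex_a - opex_b)) + 1)


# Hypothetical numbers: vibe coding has no setup cost but 500 per feature;
# agentic engineering costs 8000 upfront (specs, evals, context) and 100 per feature.
VIBE = (0, 500)
AGENTIC = (8000, 100)

n = break_even_features(*VIBE, *AGENTIC)
print(n)                            # 21
print(cumulative_cost(*VIBE, n))    # 10500
print(cumulative_cost(*AGENTIC, n)) # 10100
```

With these assumed inputs the two approaches cost the same at 20 features and agentic engineering is cheaper from the 21st onward. The break-even point moves with the OpEx gap, which is why first-pass success rate matters so much to the economics.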
85% of devs use AI coding agents (51% daily)
41% of all new code is AI-generated
~90% of agent behavior is the harness, not the model
+19% longer on some tasks (METR); verification is the cost
The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.
thorstenmeyerai.com

Implications for AI Development Strategies

This shift in focus from models to harness and context engineering has major implications for how organizations invest in AI. It suggests that cost-efficiency and performance depend more on configuration, scaffolding, and contextual design than on acquiring the latest model. Companies that master harness and context engineering can achieve better results with existing models, reducing costs and improving reliability. This redefines the skills and priorities for AI teams, emphasizing system design over model experimentation.
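One cost lever the paper names is intelligent model routing: send easy work to cheap models and reserve expensive ones for hard tasks. The sketch below shows the idea only. The tier names, prices, and the keyword-based `classify` heuristic are placeholders invented for this example; they do not describe any real product or pricing, and a real router would likely use a learned classifier or a cheap model call to judge difficulty.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str                  # placeholder label, not a real product
    cost_per_1k_tokens: float  # hypothetical price
    max_difficulty: int        # hardest task level this tier is trusted with


TIERS = [
    ModelTier("small", 0.10, 1),
    ModelTier("medium", 0.50, 2),
    ModelTier("large", 2.00, 3),
]


def classify(task: str) -> int:
    """Toy difficulty heuristic (an assumption): keywords raise the level."""
    text = task.lower()
    if any(k in text for k in ("architecture", "migrate", "concurrency")):
        return 3
    if any(k in text for k in ("refactor", "debug", "test")):
        return 2
    return 1


def route(task: str, tiers: list[ModelTier] = TIERS) -> ModelTier:
    """Pick the cheapest tier that is trusted with this task's difficulty."""
    level = classify(task)
    eligible = [t for t in tiers if t.max_difficulty >= level]
    return min(eligible, key=lambda t: t.cost_per_1k_tokens)


for task in ("Rename a variable",
             "Debug the flaky login test",
             "Migrate the billing service to a new schema"):
    print(f"{route(task).name}: {task}")
```

Because the router lives in the harness rather than in any one provider's API, it is also the place where a team can route across vendors.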

Evolution of AI System Design and Industry Trends

The whitepaper builds on a surge in adoption: it reports that 85% of developers now use AI coding agents. Previously, the focus was on improving models through new architectures, larger datasets, and faster training. As models have matured and generation has become more reliable, the bottleneck has shifted to configuration and integration. The paper places vibe coding (casual prompts with minimal oversight) at one end of a spectrum and agentic engineering (formal specifications, automated tests, evals, and CI gates) at the other, with how outputs get verified as the differentiator.

This evolution reflects a broader understanding that AI success depends on how well systems are designed around the models, rather than the models alone. The paper underscores that the most impactful improvements come from refining the harness and context, not from chasing newer models every quarter.

“The biggest shift in software engineering isn’t a new language or framework; it’s moving from writing code to expressing intent and trusting machines to realize that intent.”

— Addy Osmani

Unclear Aspects of Implementation and Industry Adoption

While the whitepaper presents compelling evidence that harness and context are critical, it is still unclear how quickly organizations will adopt these practices at scale. Specific strategies for transitioning from vibe coding to disciplined engineering vary across industries, and the long-term impact on AI performance and costs remains to be validated in diverse real-world settings. Additionally, there is limited data on how this approach influences security and compliance in complex systems.

Expected Developments in AI System Design and Best Practices

Organizations are likely to invest more in developing robust harnesses and structured context engineering frameworks. Future research and case studies will clarify best practices for scaling these approaches. Industry standards and toolkits may emerge to support systematic harness and context design, making disciplined AI engineering more accessible. Monitoring and evaluation metrics will evolve to measure the effectiveness of harness and context improvements, shaping the next phase of AI development.

Key Questions

Why is the model only 10% of the system’s behavior?

According to the whitepaper, the harness and context—including prompts, tools, rules, and memory—are responsible for the majority of how AI systems behave, with the model itself playing a smaller role.

How does focusing on harness and context improve AI performance?

Adjusting the harness and context allows for better control, reliability, and cost-efficiency, often yielding more significant improvements than upgrading the model itself.

What skills should AI teams prioritize based on this framework?

Teams should prioritize system design, configuration, and context engineering, building expertise in tools, prompts, guardrails, and structured knowledge.

Does this mean model development is no longer important?

Model development remains valuable, but the whitepaper suggests that the real leverage comes from how models are integrated, configured, and guided within systems.

What are the risks of neglecting harness and context engineering?

Neglecting these aspects can lead to unreliable, insecure, or inefficient AI systems, as most failures are due to configuration issues rather than model deficiencies.

Source: ThorstenMeyerAI.com
