#35 Context Engineering: The Competitive Edge Business Leaders Can't Ignore
Plus: Directors’ Corner: When AI Agents Act, AI Models Took a Bite Out of Software in Earnings Season, Wispr Flow, and More
Dear Readers,
I recently spent a few productive days in New York speaking at the AI Leadership Forum by Corporate Board Member and meeting with colleagues and friends. The conversations about AI adoption, organizational structure shifts, ROI challenges, and risk oversight were incredibly valuable, and I look forward to sharing those insights with you in upcoming issues.
But sometimes an instructive lesson about AI can come from unexpected places. Mine came courtesy of a series of flight cancellations at Newark over two days.
United Airlines' AI chatbot could handle routine questions and point me to webpages, but when faced with my specific rebooking needs, it became stuck in conversational loops that solved nothing. What struck me wasn't just that the AI couldn't help—it was that the entire system seemed designed to prevent me from reaching a human who could. The bot wasn't a helpful first line of defense; it had become an impenetrable barrier.
I ended up making multiple trips to the airport counter and waiting in long lines for a human agent. The irony wasn't lost on me: here I was, fresh from an AI leadership conference, experiencing firsthand how poorly implemented AI can actually make customer experiences worse than having no automated system at all.
This got me thinking about a critical question many of you are likely grappling with: Where is AI a true front door to service, and where is it a gate that adds cost, delay, and risk?
In this edition, I cover:
Notable Developments
OpenAI Ships GPT‑5, Opens Weights with GPT‑OSS
Memo Time: Nadella’s Warning on Durable Moats; Simo’s Six Pillars for Responsible AI
AI Models Took a Bite Out of Software in Earnings Season
Nvidia and AMD Agree to a 15% Revenue Cut to Sell Chips to China
Context Engineering: The Competitive Edge Business Leaders Can't Ignore
Directors’ Corner: When AI Agents Act
AI Tool Spotlight: Wispr Flow
Enjoy.
1. OpenAI Ships GPT‑5, Opens Weights with GPT‑OSS
OpenAI launched GPT-5, now the default in ChatGPT and the API, with faster reasoning, fewer hallucinations, and stronger coding and multimodal performance, plus a “Thinking” mode for harder tasks. In parallel, OpenAI released GPT-OSS open‑weight models (20B, 120B), enabling local and cloud deployment with fine-tuning flexibility. Together, they signal a twin-track strategy: a unified, routed flagship model for end users and enterprises, and open weights for builders prioritizing control, cost, and privacy.
2. Memo Time: Nadella’s Warning on Durable Moats; Simo’s Six Pillars for Responsible AI
Microsoft CEO Satya Nadella called the current tension of record profits alongside layoffs “the enigma of success in an industry that has no franchise value,” stressing that progress isn’t linear and Microsoft must “unlearn” and “learn” while maintaining core businesses and creating new AI-driven categories. Fidji Simo’s first memo as OpenAI’s incoming CEO of Applications frames AI as broad empowerment across six pillars—knowledge, health, creative expression, economic freedom, time, and support—warning it could also concentrate power if not built and shared intentionally. Both leaders emphasize disciplined, mission-led AI execution.
3. AI Models Took a Bite Out of Software in Earnings Season
Collaboration platform Monday.com fell 27% post-earnings, crystallizing a broader trend: many SaaS names are wobbling even as the Nasdaq hits new highs. AI demand is strong, but enterprise buyers are pausing on what, how, and how much to buy. They’re weighing buy vs. build, partnerships, acquisitions, and hybrids.
At the same time, AI models and AI-first startups are chipping away at incumbents with rapidly improving capabilities and vertical solutions. The result: longer evaluations, pressure on per-seat pricing as AI automates user-based work, and softer retention for traditional enterprise SaaS. In short, ambiguity in AI purchasing and deployment is reshaping software economics, and raising the bar for products that can prove measurable, positive outcomes.
4. Nvidia and AMD Agree to a 15% Revenue Cut to Sell Chips to China
Nvidia and AMD reportedly accepted a 15% revenue giveback to keep selling AI chips into China. The U.S. can’t levy export taxes under the Constitution, but many chips are made abroad, opening a path for BIS to attach new license fees or frame a general-purpose surcharge to fund U.S. tech leadership. However, Beijing is pressing Alibaba, ByteDance, and others to justify their Nvidia H20 orders, adding friction just as Nvidia navigates its new export arrangement, underscoring how geopolitics, compliance, and revenue trade-offs now shape the AI supply chain end to end.
Context Engineering: The Competitive Edge Business Leaders Can't Ignore
Picture two teams asking the same model to price a new product. Team A gets averages and caveats. Team B gets a tiered plan tied to segments and margins, anticipates competitor moves, and flags legal constraints. Same model, completely different outcomes. The difference is context.
Context engineering determines what the model sees, how information gets framed, and when specific knowledge gets applied so outputs actually match your business reality. Think of it as connecting your sources of truth (documents, CRM, ERP, data warehouse) with the constraints that matter (policies, brand voice, regulatory rules) and the task signals that drive decisions (customer tier, region, timeframe, risk appetite).
This goes far beyond better prompting. You're essentially exposing the right data, defining guardrails, and packaging everything so these systems work like your best internal expert.
Why This Matters for Business Leaders
What I find fascinating is how relevance beats generic brilliance. A right-sized model with the right context wins against even frontier models operating without it. This translates to better accuracy, fewer hallucinations, and the ability to scale complex workflows that previously required human expertise.
The operational benefits compound quickly. When you operationalize context through data integration, retrieval systems, memory management, and guardrails, you're compressing both decision time and cost while enabling personalization at enterprise scale.
From a governance perspective, embedding policies, definitions, and exclusions directly into the system creates auditable answers. You can trace what sources were used, which rules applied, and why specific recommendations emerged. This aligns perfectly with the enterprise operating models boards expect for responsible deployment.
The scalability story is equally compelling. Once you codify context for key workflows like renewals, RFPs, or quality reviews, you can template and roll them out across teams and geographies. This foundation enables the reliable assistants and agents that actually move business metrics.
The Strategic Timing
Three forces make context engineering particularly urgent right now. Foundation models are becoming commoditized: cheaper, better, and increasingly interchangeable. Competitive advantage is shifting up the stack to your unique context layer.
Simultaneously, organizations are sitting on years of contracts, emails, specifications, support threads, and playbooks that represent idle cost centers. Context engineering converts this dormant knowledge into active capability.
Meanwhile, governance pressure is intensifying. Boards and regulators demand explainability and control. Context engineering creates the audit trail that satisfies these expectations while delivering business value.
Building Durable Advantage
The largest performance gains come from selecting, assembling, and governing information, tools, and memory rather than clever wording. Retrieval quality, memory management, policy guardrails, and tool orchestration drive the consistent outcomes that matter: shorter cycles, lower costs, better personalization.
Context pipelines prove harder to replicate than prompt tips. While clever prompting techniques spread quickly across industries, building robust data rights, retrieval quality, evaluation frameworks, and observability systems compounds over time and creates sustainable differentiation. Prompting is a skill; context is a system.
A Mental Model: The Context Stack
I think about effective context engineering as a stack with five layers.
The foundation establishes shared language and definitions, your taxonomy, product names, and units of measurement.
The knowledge layer curates your best sources, typically the five to ten most relevant for each workflow.
Signal integration brings in dynamic inputs like customer attributes, stage, geography, dates, and system status.
Rules enforcement handles constraints and policies around privacy, brand guidelines, legal requirements, and financial parameters.
Finally, templates codify task patterns like RFP response frameworks, renewal playbooks, and board brief formats.
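For readers who want to see the mechanics, here is a minimal sketch of how these five layers might come together before a prompt ever reaches a model. Everything in it—the glossary entries, source filenames, rules, and template text—is a hypothetical illustration, not a prescribed schema.

```python
# Hypothetical five-layer context stack: each layer contributes one block
# of text that is assembled into the model's context for a workflow run.

GLOSSARY = {"ARR": "annual recurring revenue", "NRR": "net revenue retention"}

KNOWLEDGE = {  # layer 2: curated sources per workflow (illustrative names)
    "renewal": ["pricing_policy.md", "renewal_playbook.md", "top_accounts.csv"],
}

RULES = [  # layer 4: constraints and policies
    "Never quote discounts above 20% without VP approval.",
    "Exclude personally identifiable information from outputs.",
]

TEMPLATES = {  # layer 5: codified task patterns
    "renewal": "Draft a renewal proposal covering: usage summary, "
               "recommended tier, pricing, and open risks.",
}

def assemble_context(workflow: str, signals: dict) -> str:
    """Assemble the layered context for one workflow run."""
    foundation = "Definitions: " + "; ".join(
        f"{k} = {v}" for k, v in GLOSSARY.items())            # layer 1
    knowledge = "Sources: " + ", ".join(KNOWLEDGE[workflow])  # layer 2
    signal = "Signals: " + ", ".join(
        f"{k}={v}" for k, v in signals.items())               # layer 3
    rules = "Rules: " + " ".join(RULES)                       # layer 4
    template = TEMPLATES[workflow]                            # layer 5
    return "\n".join([foundation, knowledge, signal, rules, template])

context = assemble_context("renewal", {"tier": "enterprise", "region": "EMEA"})
print(context)
```

The point of the sketch is the separation of concerns: definitions, sources, live signals, rules, and templates are owned and versioned independently, then composed per task—which is what makes the stack auditable and reusable across teams.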
Getting Started Strategically
Strategic leaders begin with a context inventory, mapping key decisions to specific sources, owners, data freshness requirements, and sensitivity levels. The integration phase focuses on standing up secure pipelines, retrieval systems, embeddings, and access controls while building observability for relevance, latency, and cost.
Orchestration brings together system instructions, tools, retrieved documents, user and session memory, and structured outputs. Continuous evaluation measures answer quality, hallucination rates, retrieval precision, and business KPIs while iterating on summaries and filters.
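As one illustration of the continuous-evaluation step, retrieval precision—the share of retrieved documents that were actually relevant—can be tracked per workflow with a few lines. This is a simplified sketch; the 0.7 threshold and the workflow names are assumptions for illustration.

```python
# Simplified sketch: score retrieval precision per workflow run and flag
# workflows that fall below an assumed review threshold.

def retrieval_precision(retrieved: list, relevant: set) -> float:
    """Fraction of retrieved documents that were actually relevant."""
    if not retrieved:
        return 0.0
    return sum(1 for doc in retrieved if doc in relevant) / len(retrieved)

THRESHOLD = 0.7  # assumed threshold below which a workflow needs review

runs = [  # hypothetical runs: what was retrieved vs. what was relevant
    {"workflow": "support", "retrieved": ["kb1", "kb2", "kb9"],
     "relevant": {"kb1", "kb2"}},
    {"workflow": "renewal", "retrieved": ["c1", "c2"],
     "relevant": {"c1", "c2"}},
]

for run in runs:
    p = retrieval_precision(run["retrieved"], run["relevant"])
    status = "ok" if p >= THRESHOLD else "review"
    print(f"{run['workflow']}: precision={p:.2f} ({status})")
```

In practice the "relevant" sets come from human-labeled evaluation samples, and the same loop extends to answer quality, hallucination rates, and latency—the operating KPIs that make iteration on summaries and filters concrete.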
For immediate impact with minimal organizational disruption, consider starting with customer support by connecting product documentation, release notes, and prior tickets. Measure first-contact resolution and deflection rates. Sales and renewals benefit from wiring CRM activity, pricing data, contract terms, and usage telemetry to generate account briefs and policy-aligned proposals.
Operations and quality teams can combine standard operating procedures with sensor and incident data to anticipate failures and standardize root-cause analysis. Finance and planning functions gain significant value by constraining models with chart of accounts, cost drivers, and approval thresholds to eliminate recommendations that miss the mark.
When Context Engineering Delivers Real Edge
Context engineering creates sustainable advantage when you have unique, high-quality data you can expose safely and quickly. Work requiring multi-step reasoning, personalization, or compliance represents the sweet spot where base models lag without tailored context. The key is building systems that continuously evaluate relevance and improve retrieval and summaries over time.
However, context engineering won't create a competitive moat when your data is generic, pipelines are thin, or governance is weak. In these cases, gains fade quickly as others adopt similar patterns. Tasks simple enough for base models to handle well also don't benefit from sophisticated context systems.
Anyone can rent access to powerful models. Competitive advantage comes from teaching those models your business. Context engineering converts the documents, decisions, and institutional knowledge you already own into faster cycles, better judgment, and sustainable differentiation. Leaders succeed by being intentional about what these systems should know and when they should know it.
When AI Agents Act
Last week, a CEO told me her company wasn’t using AI agents. A quick look at a few vendor dashboards said otherwise. Consider the recent Replit incident: an AI coding agent deleted a live production database during a code freeze, despite explicit instructions not to proceed. The CEO called it “unacceptable” and shipped fixes, but the trust hit was real. The lesson: don’t let agents act without brakes, mirrors, and a seatbelt.
The shift from copilots that suggest to agents that execute with at least some autonomy changes the board's job. We're governing operational consequences in an environment that’s changing at unprecedented speed. Boards that get this right have to change how they lead.
Beginner's Mind, Expert Judgment
The most effective boards pair beginner’s mind with deep experience. We don’t need to be technical experts; we need to model learning agility. When directors say “we don’t know yet, and we’ll learn together,” management has permission to experiment intelligently rather than pretend certainty.
Speed With Brakes: The New Rhythm
Annual AI plans are obsolete. High‑functioning boards are adopting quarterly strategy reviews and treating major pivots as normal. The discipline is in drawing the line:
Board oversight: strategic direction, risk appetite, resource allocation
Management autonomy: tool selection, pilots, sequencing
Trust as an Operating System
Trust belongs on the dashboard next to uptime and cash. Leading teams are tracking time‑to‑human, escalation success rates, post‑incident satisfaction and recurrence, and quarterly “chaos drills” that simulate AI failures. The design standard is escalation by default. Pair observability with auditability: who/what acted, when, and why.
Fall in Love With Problems, Not Solutions
A nine‑month procurement cycle for tech that changes monthly is a governance anti‑pattern. Direct management to define capability gaps (faster inquiry handling, better forecast accuracy, automated compliance reviews) and run short, budget‑bounded pilots with clear success and pre‑committed sunset criteria. Be prepared to revisit portfolio choices—buy, invest, build, partner—more often than in past technology adoption cycles.
What Directors Should Oversee
Establish guardrails before agents can act. Any AI that changes records, moves money, triggers communications, or alters configurations requires observability, escalation SLAs, and an incident playbook before go-live. Concretely: dev/stage/prod separation; least‑privilege access; human approval for destructive or high‑value thresholds; one‑click rollback and tested backups.
Mandate a quarterly AI strategy loop. Explicit horizon-switching across H1 execution, H2 scaling, and H3 bets. Require pre-mortems/post-mortems and portfolio changes as a norm, not a failure.
Put trust metrics in leadership reporting. Treat time-to-human, escalation health, post-incident outcomes, and employee sentiment as operating KPIs with thresholds and remediation plans.
Verify domain experts co-own deployments. Management should prove subject matter experts are at the wheel for design, testing, and sign-off in their functions.
Fund role-based literacy with evidence of lift. Approve competency rubrics (basic → intermediate → aspirational) by role and track usage-to-outcome deltas, not training hours.
Questions to Ask Next Meeting
Where are agents already executing actions across our systems, and what are the escalation SLAs by function?
What observability exists today: can we fully audit who/what acted, when, and why across vendors and internal systems?
Which top use cases will we scale this quarter, and what guardrails are pre-committed?
Where are we redlining (control risk) or idling (opportunity loss), and what changes in the next 90 days?
How are incentives rewarding safe escalation, learning agility, and outcome quality?
Regardless of whether ‘AI agents’ is an overused term, we can’t deny that more tasks will be delegated to AI with tools over time. The board's role is speed with brakes: insist on observability before scale, escalation by design, and trust as an operating metric so management can move faster and more safely.
AI Tool Spotlight: Wispr Flow
Wispr Flow makes voice dictation not just easy, but genuinely better than typing. Voice dictation is simple to try yet hard to do well, and Wispr nails it by removing friction with thoughtful AI. Unlike past tools, you don’t have to say “comma” or correct the output nearly as much—your thoughts just flow without interruption. I can even speak in Chinese and have it output Chinese characters or switch to English depending on context. I use voice far more now, slip into flow faster, and stay there longer. On my computer, it goes straight into email, search boxes, and other fields, so there’s no tab switching or copy-paste—just capture, refine, and move on. If you want fewer micro-interruptions and more momentum, Wispr Flow upgrades how you work.
Thanks for reading.
Joyce