How to build an AI Lab — the only 3 things that matter

Most large organisations are now two years into AI investment. The portfolio is a collection of scattered pilots, point solutions, and training programmes that haven’t compounded into anything that can be defended at board level. The pressure is real: clients are asking about it, competitors are shipping, and leadership everywhere keeps returning to the same question. We’ve been investing, so why isn’t this scaling?
The answer is almost never the technology. It is the absence of a structured system for deciding what’s worth building before anyone commits to building it. Without that system, AI investment fragments across the organisation — each team pursuing its own direction, nobody filtering for what actually creates value, and the board question still unanswered twelve months later.
This article is about what actually makes an AI Lab work. Not the theory of it. The three structural components that determine whether your Lab produces a governed pipeline of AI decisions — or becomes another layer of expensive activity with nothing to show at the end of the year.
Three patterns that should sound familiar
Before getting into the Lab, it’s worth naming the failure patterns clearly. Not to dwell on them, but because the Lab’s design is a direct response to each one.
Tools and training without direction. Most organisations start here. AI literacy programmes. Copilot workshops. Prompt crafting sessions. People learn things, which matters — but then they apply AI to what they already do. They speed up existing workflows, automate existing tasks, occasionally build something clever that nobody asked for. The broken processes they worked around for years, they now work around slightly faster. Two years of investment and the portfolio looks like a collection of point solutions with no strategic coherence.
A technical team that couldn’t speak to the business. A team of specialists — sometimes PhD-level — set up with serious technical firepower and full leadership backing. Enormous claims made about what AI could save or generate. And then, quietly, the team gets disbanded. Because they couldn’t translate their work into decisions the business trusted or workflows people actually adopted. The pattern is consistent enough that it has a name: the PhD Problem. And it is one of the most common reasons people in your position start looking for a different approach.
Distributed experimentation without a filter. No centralised team, no mandate, AI activity everywhere — trainings here, pilots there, champions in different business units each pursuing their own direction. It feels like momentum until you add it up. “For the last two years we’ve been doing a bottom-up approach — we tried many things, trainings here, experiments there. But now we know we need to be more intentional.” Hungry rabbits everywhere, nobody leading the pack.
These three patterns fail at exactly the same point: the moment someone needs to decide what is actually worth building with AI — for whom, measured how, with what data, within what constraints — and there is no structured system in place to answer it.
What an AI Lab actually is
An AI Lab is not a team. It is not a department, a centre of excellence, or a digital transformation programme. It is not a sandbox for experimentation and it is not another layer of governance.
Those things already exist in most organisations. They are not working fast enough.
An AI Lab is an exploration engine: a structured, repeatable system that runs parallel to your core business to answer one question — what is worth building with AI, and what is not — before serious resources are committed.
It helps to understand what kind of lab this is — because there is another kind that often gets confused with it.
In January 2026, Anthropic expanded Anthropic Labs — an internal incubator designed to rapidly test experimental products at the frontier of what their models can do. It started as two people in mid-2024 and is now scaling aggressively. Its job is to push emerging capabilities into new interfaces, build research previews, and test fast with early users. Claude Code, MCP, and several other products that became industry standards began inside this structure. Anthropic's core product organisation focuses on scaling what works reliably for millions of users. Labs is the protected space where uncertainty is allowed.
That is a frontier lab. Its purpose is to explore what AI can become.
What this article is describing is a different kind of lab entirely. An enterprise AI Lab is not exploring the frontier of AI capabilities. The models exist. The tools exist. The question is not what AI can do in theory — it is what AI can do inside your specific organisation, for your specific workflows, for your specific users, within your specific constraints. The frontier has already moved. Your job is to make the best of what it produced inside the context that matters to you.
This distinction is important because the two labs require completely different structures. A frontier lab needs researchers, engineers, and the freedom to experiment without a fixed outcome. An enterprise AI Lab needs cross-functional teams, structured decision-making, and the discipline to stop ideas that won't work before they consume six months of resources. One explores what's possible. The other decides what's worth building — and builds the organisational capability to do it repeatedly.
If you want to go deeper on what an enterprise AI Lab is, how it fits inside the organisation, and how to think about the system design before getting into the building blocks, this article covers that in full: The AI Lab: the system behind successful AI transformation.
The rest of this article focuses on the three things you actually need to build one.
The Lab is built on three things: AI Discovery Pods, AI Facilitators, and a structured workshop cadence. Each one is necessary. None of them works without the other two.

1. AI Discovery Pods
Most organisations try to run AI the way they run traditional delivery: business writes requirements, product scopes, engineering builds, risk reviews, then launch. That sequence breaks in AI because the unknowns don’t wait their turn.
Data constraints surface before anyone has agreed on the use case. Legal questions emerge the moment someone touches customer content. Trust problems appear before anything ships. Adoption fails even when the model works. AI doesn’t move in a straight line. It moves through messy collisions of workflow, data, risk, and human judgment.
The answer is not to sequence those collisions. It is to put the right people in the same room to work through them together, at the same time, with a clear finish line.
That is the Discovery Pod.
A Discovery Pod is a small, temporary, cross-functional team assembled around one specific AI challenge.
It forms around a problem, runs a structured session with an AI Facilitator, produces a decision, and disbands. Nobody is joining a committee. The pod's job is not to build — it is to answer four questions that most AI initiatives never properly resolve before committing resources: what workflow is actually breaking today? What measurable outcome would change if it were fixed? What data is required, and is it usable? What risks are non-negotiable? If a team can't answer these clearly, they don't have a use case — they have a hope. A Discovery Pod turns that hope into a decision the organisation can defend.
This is the critical distinction from a Centre of Excellence. A CoE is permanent and centralised. A pod is temporary and specific. You assemble it for a defined challenge, drive clarity fast, and disband once the decision is made. Pods keep the Lab from becoming either a backlog of ideas or a slow approval funnel. They create forward motion.
A typical pod includes a business owner accountable for outcomes, a domain expert who understands the workflow being addressed, an AI or ML engineer who assesses technical feasibility, a data engineer who checks data readiness and pipeline realities, a UX designer who brings the user experience lens, a research or customer success voice, a legal and compliance representative who flags constraints early, and a business or process analyst who ties ideas to real workflows and value. The AI Facilitator runs the session from outside the pod.
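To make the pod's output tangible, here is a minimal sketch of what an AI Use Case Card could look like as a structured record, written in Python purely for illustration. The field names, the decision values, and the example content are assumptions, not a prescribed template; the point is that every pod ends with the same four questions answered and a decision attached.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    BUILD = "build"      # evidence supports committing resources
    ITERATE = "iterate"  # promising, but a key question is still open
    STOP = "stop"        # documented reasons not to proceed


@dataclass
class UseCaseCard:
    """One Discovery Pod's answer to the four questions, plus the decision.

    Field names are illustrative assumptions, not a required format.
    """
    title: str
    business_owner: str                 # accountable for the outcome
    workflow_breaking_today: str        # Q1: what workflow is actually breaking?
    measurable_outcome: str             # Q2: what metric moves if it were fixed?
    data_required: list[str] = field(default_factory=list)          # Q3: data needed
    data_usable: bool = False                                        # Q3: is it usable?
    non_negotiable_risks: list[str] = field(default_factory=list)    # Q4: hard constraints
    decision: Decision = Decision.STOP
    reasoning: str = ""                 # the evidence behind the decision


# Example card produced by one pod session (all values are invented).
card = UseCaseCard(
    title="Claims triage summarisation",
    business_owner="Head of Claims Operations",
    workflow_breaking_today="Adjusters re-read full claim files for routine cases",
    measurable_outcome="Time-to-first-decision on routine claims",
    data_required=["claims history", "policy documents"],
    data_usable=False,
    non_negotiable_risks=["No customer PII leaves the claims platform"],
    decision=Decision.STOP,
    reasoning="Claims history is not accessible in usable form; revisit after data work.",
)
```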
2. AI Facilitators
Discovery Pods solve the structural problem. But pods don’t run themselves.
Even with the right people in the room, collective intelligence is not automatic. The moment you bring together people who think differently, speak different professional languages, and optimise for different goals — engineering wants robustness, sales wants speed, legal wants safety, product wants adoption — you introduce friction. Without someone whose job it is to manage that friction, it doesn’t resolve. It hardens. Siloed experts stay siloed even when they’re sitting at the same table.
The AI Facilitator is the person who makes the room work.
They are not a consultant who delivers a roadmap and disappears. They are not a domain expert or a project manager.
They are a process specialist — trained to run structured AI decision-making sessions and get a cross-functional team from a complex, ambiguous challenge to a clear, evidence-based decision in a defined number of days.
Their authority comes from the process, not from hierarchy or domain expertise. That neutrality is what allows them to surface the tensions the team is avoiding, balance voices across the room, and protect the quality of the decision environment.
In practice, AI Facilitators align leadership on where AI should focus before any practitioner work begins, assemble and orchestrate the right Discovery Pods for each challenge, run the structured workshops that move pods from vague mandate to validated decision, and ensure every session ends with a clean output and a clean handoff. Without them, Discovery Pods tend to become debates. With them, they function as decision engines.
The goal of the AI Facilitator is not to create a dependency. It is to build a capability your organisation owns. The best outcome is that your internal people are running the sessions, assembling the pods, and producing the decisions — without needing external facilitation to keep the system running. That is what internal capability actually means.
3. The structured workshop cadence
The third building block is what turns a one-off Discovery Pod into a repeatable operating capability.
Collective intelligence doesn’t show up because you booked a room and invited smart people. It shows up when the conversation is designed — when the activities are sequenced to move a group from ambiguity to a clear outcome, without getting stuck in debate or drifting into premature solutions. That design is the AI Facilitator’s job. The workshops they run are the mechanism.
The Lab is not a fixed workshop format. It is a cadence of structured sessions, each matched to where the organisation is in its AI decision-making process. The AI Facilitator chooses which session to run based on what the pod needs to accomplish. There are three.
AI Problem Framing is a one-day session that turns vague AI mandates into specific, fundable use cases. A cross-functional pod evaluates the AI opportunities on the table, stress-tests each one against desirability, feasibility, viability, and pragmatic constraints, and converges on a single AI Use Case Card — the one opportunity worth pursuing next, with the evidence and reasoning documented. This is the session to run when you have too many AI ideas and no filter. When everything is getting the AI sticker without proper evaluation.
AI Workflow Sprint is a four-day session that takes a validated use case to a working AI agent MVP tested with real employees. The Discovery Pod works together for the first two days — mapping the current employee workflow, redesigning it to identify where AI creates real value, defining a long-term goal and success metrics, and converging on a solution blueprint. A Builder constructs the working prototype on Day 3. An Interviewer runs structured sessions with real employees on Day 4 and synthesises the findings for the Decider. The session ends with a clear decision: scale, iterate, or stop. This is the session to run when the use case is defined and you need to know whether it works before committing engineering resources.
AI Design Sprint is a four-day session for customer-facing AI products and services. The Discovery Pod works together for the first two days, defining the AI challenge, mapping the customer journey, and converging on a solution direction. A Builder constructs a functional, clickable prototype on Day 3. An Interviewer runs five structured customer sessions on Day 4 and presents findings to the Decider. The session ends with a build, iterate, or stop decision. This is the session to run when you need to validate an AI-powered product before committing to development.
These three sessions are not interchangeable. AI Problem Framing answers: which use case is worth pursuing? AI Workflow Sprint answers: does this work for the employees it is designed for? AI Design Sprint answers: does this create real value for customers? Together, they cover the full decision-making arc from idea to validated prototype — before any production investment is made.
Once you have all three in your Lab's repertoire, the system can respond to any AI challenge at the right level, with the right people, in the right amount of time. And because each session produces a documented decision rather than a conversation, the Lab accumulates evidence over time — a record of what was pursued, what was stopped, and why.
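Purely as a mental model, the cadence can be restated as data: which session answers which question, when to run it, and what it produces. The sketch below, including the toy selection logic at the end, is an illustrative assumption rather than a fixed schema; the facilitator's actual judgment is richer than a two-branch rule.

```python
# The three sessions restated as data, so the "which session do we run?"
# choice can be made explicit. Structure and field names are illustrative.
SESSIONS = {
    "AI Problem Framing": {
        "duration_days": 1,
        "answers": "Which use case is worth pursuing?",
        "run_when": "Too many AI ideas and no filter",
        "output": "One AI Use Case Card with documented evidence",
    },
    "AI Workflow Sprint": {
        "duration_days": 4,
        "answers": "Does this work for the employees it is designed for?",
        "run_when": "Use case defined; employee-facing feasibility unknown",
        "output": "Working AI agent MVP tested with employees; scale/iterate/stop call",
    },
    "AI Design Sprint": {
        "duration_days": 4,
        "answers": "Does this create real value for customers?",
        "run_when": "Customer-facing AI product needs validation before development",
        "output": "Clickable prototype tested with five customers; build/iterate/stop call",
    },
}


def pick_session(facing: str, use_case_defined: bool) -> str:
    """Toy selection logic for illustration only."""
    if not use_case_defined:
        return "AI Problem Framing"
    return "AI Design Sprint" if facing == "customer" else "AI Workflow Sprint"
```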
What makes a Lab work at scale
A Lab that runs one session and stops is a one-off. A Lab that runs consistently across business units, produces decisions at a predictable cadence, and builds shared understanding of what works and what doesn't — that is infrastructure. Getting there requires measuring the right things.
The right measurement is not activity.
The most important metric is the stop rate: the percentage of AI ideas that enter the Lab and don’t survive structured evaluation. A stop rate above 60% is a sign the system is working. A stop rate below 30% is a concern — either ideas are being pre-selected for certain success, or conclusions are being softened to avoid conflict.
Decision velocity matters too — the time from a team submitting an AI idea to a confident build, iterate, or stop call. A structured process should compress this dramatically compared to the informal alternative.
Business unit reach signals whether the Lab is becoming organisational infrastructure or staying confined to the same three teams. A working system spreads.
And the number most organisations never ask for: what did the early stops save? For each idea stopped before validation, what was the likely cost of a six-month pilot that would have reached the same conclusion? That calculation transforms the conversation from “what did we build?” to “what did we avoid building?” — which is the right conversation for someone who needs to present AI ROI at ELT or board level.
These four numbers, tracked consistently, give you something concrete to point to at ELT or board level. Not a count of workshops run. A record of decisions made, ideas stopped early, resources saved, and business units engaged. A healthy AI Lab stops more ideas than it advances — and that stop rate is a metric to be proud of. In regulated industries like banking, pharma, and healthcare, it is evidence of responsible AI decision-making, not innovation failure. That framing changes the conversation entirely.
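For readers who want the arithmetic made explicit, here is a minimal sketch of how those four numbers could be computed from a simple log of Lab decisions. The records, the field names, and the per-pilot cost figure are invented for illustration; they are not benchmarks.

```python
from datetime import date
from statistics import mean

# Illustrative decision log; every value below is invented.
decisions = [
    {"unit": "Claims", "submitted": date(2025, 1, 10), "decided": date(2025, 1, 24), "outcome": "stop"},
    {"unit": "Claims", "submitted": date(2025, 2, 3), "decided": date(2025, 2, 14), "outcome": "build"},
    {"unit": "Underwriting", "submitted": date(2025, 2, 17), "decided": date(2025, 3, 3), "outcome": "stop"},
    {"unit": "Customer Service", "submitted": date(2025, 3, 5), "decided": date(2025, 3, 19), "outcome": "iterate"},
    {"unit": "Finance", "submitted": date(2025, 3, 20), "decided": date(2025, 4, 2), "outcome": "stop"},
]

AVOIDED_PILOT_COST = 250_000  # assumed cost of the six-month pilot an early stop avoids

stop_rate = sum(d["outcome"] == "stop" for d in decisions) / len(decisions)
decision_velocity_days = mean((d["decided"] - d["submitted"]).days for d in decisions)
business_unit_reach = len({d["unit"] for d in decisions})
estimated_savings = sum(d["outcome"] == "stop" for d in decisions) * AVOIDED_PILOT_COST

print(f"Stop rate:           {stop_rate:.0%}")   # healthy above 60%
print(f"Decision velocity:   {decision_velocity_days:.0f} days from idea to decision")
print(f"Business unit reach: {business_unit_reach} units engaged")
print(f"Early stops saved:   ~{estimated_savings:,} (in whatever currency the pilots cost)")
```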
The bottom line
Two years of scattered AI investment without something to show at board level is not a talent problem or a technology problem. It is a decision-making infrastructure problem. No structured way to evaluate which AI ideas are worth pursuing. No cross-functional process that surfaces constraints before they become blockers. No repeatable cadence for moving from vague mandate to grounded decision before serious investment is made.
The AI Lab is that infrastructure. Not a team, not a programme, not a centre of excellence — a system. A Discovery Pod assembles around a challenge, an AI Facilitator runs a structured session, and the pod produces a decision the organisation can act on. Then it disbands, and the next challenge gets the same treatment.
Built with the right structure and run with discipline, an AI Lab turns scattered AI investment into a governed pipeline of validated use cases. It is not a program with a start and end date. It is an operating capability that compounds — and one that your organisation owns, not one it rents.

