Olito Labs
I scored 50 consultants across top strategy firms (McKinsey, BCG, Bain), Big 4, venture capital and private equity funds, and boutique practices on three capability tiers: everyday chat, advanced prompting (giving AI detailed instructions for complex tasks), and automation and AI-agent tools (software that runs multi-step tasks with minimal human input). All use AI daily, but nearly all usage concentrates in chat. A small group builds systems they say multiply output several times over; everyone else has plateaued at basic conversation. The gap is not between AI users and non-users. It is within the users themselves, and it appears to be widening.
Everyone started early. Here’s where they actually are, and what the distance between them means.
I interviewed 50 knowledge workers across McKinsey, BCG, Bain, Deloitte, Kearney, boutique practices, and venture capital / private equity firms. These are not laggards. They use AI consistently. They are driven, high-performing members of the workforce.
I assessed them across three capability tiers, ranging from everyday chat to AI agents that run tasks independently. The result is not what I expected.
Nearly everyone lands in the lower half. Not because they are behind, but because they do one thing: everyday chat. They ask questions, draft emails, summarize documents. Then it stops. The right side of the chart is almost empty. Many of these are people their organizations would consider AI-forward.
The historian Paul David documented the same pattern with electricity: factories had electric motors for decades before productivity moved, because gains required redesigning how work was done around the new technology. The same thing is happening now. The tools are here; the reorganization of work around them has barely begun.
The composite score tells us most people land in the lower half, but not why. Break it into the three capability tiers and a hidden pattern emerges: two distinct clusters, one of people still learning and one of people who have pushed through.
Start with Everyday Chat, the tier where confidence lives. Nearly everyone scores high here. For most, “using AI” begins and ends at this level. But even within chat the distribution splits: Group A uses it intermittently or reluctantly. Group B has made it a daily reflex. The gap between them is already visible, and it is the first sign of the pattern that repeats at every level.
Advanced Prompting is where the cohort splits more sharply. One group has mastered detailed AI instructions, structured output formats, and keeping context across sessions. The other has not touched any of it. Two distinct clusters with a gap in between. This is where real separation starts, because prompting well takes deliberate practice: learning what works, iterating, building a routine. Those who invested that time are pulling away. Those who did not remain at chat, and the distance appears to compound over time.
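What "detailed AI instructions and structured output formats" look like in practice is often a template written once and reused across engagements. A minimal sketch of such a reusable prompt builder; the role, field names, and wording here are illustrative, not taken from any participant's actual system:

```python
def build_analysis_prompt(task: str, context: str, output_fields: list[str]) -> str:
    """Assemble a reusable structured prompt: a fixed role, the task,
    the supplied context, and an explicit output schema to follow."""
    schema = "\n".join(f"- {field}: <fill in>" for field in output_fields)
    return (
        "You are a senior strategy analyst.\n\n"
        f"Task: {task}\n\n"
        f"Context:\n{context}\n\n"
        "Respond using exactly this structure:\n"
        f"{schema}\n\n"
        "If information is missing, write 'insufficient data' rather than guessing."
    )

# Hypothetical usage: the same template serves every analysis of this type.
prompt = build_analysis_prompt(
    task="Summarize the competitive position of the target company.",
    context="Excerpt from the client's market scan...",
    output_fields=["Market position", "Key risks", "Open questions"],
)
print(prompt)
```

The point is not the template itself but the habit it represents: the instruction pattern is designed once, refined over iterations, and reused, rather than retyped ad hoc in every chat session.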
Each tier depends on the one below it. You cannot automate what you have not learned to prompt well. You cannot run AI agents without the scaffolding that automation provides. The tiers are a ladder, not a menu. Each one multiplies the last, and stalling at any rung means you lose everything above it.
Beyond prompting, the frontier is empty: automation, connecting tools together, command-line AI tools, autonomous agents. The layer where productivity should compound sits near zero for almost everyone. Two-thirds of the cohort score below 20 on Automation & Agentic. One outlier built an automated system connecting deal sourcing, CRM, and due diligence through software integrations, while everyone else remains at the chat level.
Yet this is where capability grew fastest in my data. One participant went from zero command-line experience to advanced usage in weeks of structured practice. The tools improve so quickly that the gap between someone who practices and someone who waits opens wider every month.
The separation is not abstract. It shows up in deals closed, hours burned, and projects that implode.
Three patterns emerged from the interviews, each a version of the same question: did you restructure your work around the tools, or not?
“We can probably see five times as many deals.”
The System Builder connected deal sourcing, CRM, and due diligence through software integrations. First-day due diligence quality now matches what used to take a full manual process. Yet the system is only half connected, and his team cannot replicate it.
“I am 99% sure no one at the firm will be using it that way.”
The Skill Builder started with zero command-line experience. Over weeks of structured practice, he built reusable AI instructions, memory that persists between conversations, and workflows that span multiple sessions. A deliverable that typically takes an experienced associate a full week, he produced in a day. Senior colleagues with years of domain expertise could not match his throughput, not because they lacked knowledge, but because they lacked the tools.
“All the projects sold relying on AI in the proposal… they imploded.”
The Skipper was comfortable with chat and never invested in advanced prompting. Like most, she was waiting for someone else to build the path. When her firm sold projects scoped to AI-accelerated timelines, the gap fell on her team: automation-tier deadlines with chat-tier skills. She absorbed the manual work and went on medical leave from burnout.
The archetypes follow the tier structure exactly. The System Builder operates across all three; his output multiplied because each tier builds on the last. The Skill Builder invested weeks of deliberate practice climbing the same ladder. The Skipper never built prompting-tier competence, so when the firm priced automation-tier productivity, the ladder broke beneath her.
The result is role compression. When one person handles 5x the deal flow and another saves 80% of the time on real deliverables, the distance from everyone else doesn’t grow by addition. It grows by multiplication. The rest are doing the same job without the same leverage.
Between these extremes sits the majority. As one participant put it: “I am the silent middle. If it is there, I will use it. If it is not there, you know.” Comfortable with chat, aware that more exists, waiting for someone else to build the path.
If you recognize yourself in the silent middle, my data suggests the gap is real and likely growing. Your peers who climbed appear to be pulling further away each month. The barrier is investment, not access or intelligence.
But the Skill Builder’s trajectory shows this is not a permanent position. He invested on his own. No firm in the sample had a structured program to move people up the tiers. The distance from chat to command-line tools closed in just a few weeks of deliberate practice, which is what makes this worth paying attention to: the gap is closable, but closing it requires investment that almost no one is making.
The gap lives inside the user base, widening because each tier appears to build on the last and almost nobody climbs. For every individual, the question is the same: invest in closing the gap, or watch it widen.
The pattern repeated across the sample. One consultant built a pipeline that reads 50 earnings transcripts, extracts specific financial metrics, and produces a comparative analysis; what once took a team two days now runs in 15 minutes. Another set up an agent that monitors regulatory publications daily, surfacing only what matters to their client. A third automated the path from raw interview notes to structured deliverable drafts with citations, cutting first-draft time by 80%. These are not theoretical workflows. They are running today, built by people who were using only basic chat six months ago.
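The skeleton of a transcript pipeline like the one above is simpler than it sounds: extract the same named metrics from every document, then line them up for comparison. A minimal sketch under stated assumptions; the company names, metric patterns, and regex extraction are illustrative stand-ins (a real pipeline would use an LLM call for the extraction step):

```python
import re

def extract_metrics(transcript: str) -> dict[str, float]:
    """Pull a few named financial metrics out of one transcript.
    The patterns are hypothetical; swap in an LLM call in practice."""
    patterns = {
        "revenue_growth_pct": r"revenue (?:grew|up) ([\d.]+)%",
        "gross_margin_pct": r"gross margin of ([\d.]+)%",
    }
    found = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, transcript, re.IGNORECASE)
        if match:
            found[name] = float(match.group(1))
    return found

def compare(transcripts: dict[str, str]) -> dict[str, dict[str, float]]:
    """Run extraction over every company to build a comparable table."""
    return {company: extract_metrics(text) for company, text in transcripts.items()}

# Hypothetical inputs standing in for real earnings transcripts.
table = compare({
    "AcmeCo": "Revenue grew 12.5% year over year, with a gross margin of 41%.",
    "BetaInc": "Revenue up 8% on the quarter; gross margin of 38.2% held steady.",
})
print(table)
```

The leverage comes from the structure, not the cleverness: once extraction and comparison are separate steps, the same scaffold scales from two transcripts to fifty without any additional human effort.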
Most firms in the sample restrict sending proprietary data to external AI tools. That constraint is rational. Client data, deal terms, and internal analyses belong behind access controls, and no productivity gain justifies compromising that. The question is not whether to have governance, but how to build governance that enables adoption rather than blocking it. In practice, this means three things: using firm-approved tools where they exist, ensuring any external AI provider meets enterprise compliance standards (SOC 2 Type II, data processing agreements, zero-retention policies), and establishing clear protocols for what data can go where. The major AI providers, Anthropic and OpenAI included, now offer enterprise tiers with these controls. The gap between “we use AI” and “we use AI responsibly” is a governance design problem, and most firms have not begun to solve it. If your organization is working through AI governance and compliance patterns, I would welcome that conversation at olitolabs.com.
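One way a "what data can go where" protocol becomes concrete is a pre-send check that blocks restricted content before it reaches an external tool. A minimal sketch; the patterns below are entirely hypothetical placeholders for a firm's real data classification policy, which would cover far more (client names, deal codes, PII):

```python
import re

# Hypothetical patterns for data that must not leave the firm's boundary.
RESTRICTED_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # SSN-like identifiers
    re.compile(r"\bProject [A-Z][a-z]+\b"),      # internal deal codenames
    re.compile(r"confidential", re.IGNORECASE),  # explicit markings
]

def safe_for_external_ai(text: str) -> bool:
    """Return False if the text matches any restricted pattern,
    meaning it must stay behind firm access controls."""
    return not any(p.search(text) for p in RESTRICTED_PATTERNS)

print(safe_for_external_ai("Summarize public earnings commentary."))      # True
print(safe_for_external_ai("Project Falcon term sheet, Confidential."))   # False
```

A check like this is a floor, not a ceiling: it makes the default path safe so that governance enables routine use instead of forcing every request through manual review.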
AI agents are confident when they are wrong. Every participant who moved beyond basic chat encountered this. The higher you climb the capability ladder, the more you depend on AI outputs, and the harder it becomes to verify them. Several participants described spending almost as long checking AI-generated work as they would have spent doing it manually. The validation burden does not disappear with better tools. It shifts shape.
This sample is 50 professionals across consulting, finance, and adjacent fields. They skew motivated: people who said yes to a research interview about AI are not a random draw from the workforce. The scores, distributions, and patterns reported here describe this cohort, not the broader population. The framework is designed to be replicable (the full pipeline is open-source), but the findings should be read as directional, not representative.
Tool builders ship improvements weekly. Features that required command-line access a year ago now have graphical interfaces. The tier boundaries in this taxonomy will shift as tools mature. What this study captures is a snapshot: where the gap stands today, among this cohort, with today’s tools. The structural observation, that each tier compounds on the last, is likely durable. The specific distances may not be.