Industry Commentary

Why AI Tools Are Making Some Teams More Tired, Not Less

By John Jansen · 7 min read


The premise sounded reasonable. Give people tools that draft emails, summarise meetings, generate code, and produce first cuts of almost anything — and the tedious parts of knowledge work would shrink. People would spend more time thinking and less time typing. Burnout would ease.

The data keeps saying otherwise. A 1News piece captured something we've been watching in our own work and in our clients' teams: the people most exposed to AI tooling often report feeling more stretched, not less. That's worth taking seriously rather than dismissing as novelty fatigue or bad change management.

We think there are four specific mechanisms at play, and most of them are fixable — but not by adding more AI.

The output ceiling moved, and so did the expectations

Before generative tools, a reasonable weekly output looked like: one strategy doc, three code reviews, a handful of customer emails, a meeting or two summarised into notes. That was the shape of the job.

Once a senior engineer can have Copilot draft scaffolding in ten seconds, or a PM can ask Claude for five variations of a roadmap narrative in under a minute, the implicit quota shifts. Not because anyone announced it, but because throughput became visible. If five variations take a minute, why didn't you produce ten?

The tooling didn't free up the hours. It re-priced them. And the new price is paid in volume. We've seen this in code review queues particularly — PR counts per engineer have climbed measurably in teams using AI assistants, but the cognitive load of reviewing AI-generated code is higher per line than reviewing human-written code, because the reviewer can't rely on the usual cues about intent.

Verification is the new bottleneck, and it's exhausting

Writing is generative. Reviewing is evaluative. These are different mental modes, and they cost different amounts.

When you write code or a document yourself, you carry the intent through the whole thing. You know why a function exists because you chose to create it. When AI produces the draft, you're now in a permanent evaluation posture — reading someone else's reasoning (a machine's, which is worse, because the reasoning isn't really there) and deciding whether it matches what you actually wanted.

This is the same kind of cognitive work that makes code review tiring. The difference is that code review used to be a bounded activity — you did it for a few hours a week. Now, if AI is producing the first draft of everything, verification is all the work. The whole day becomes review day.

We've started calling this the evaluator's tax. It's real, it's underestimated, and it gets worse as the AI gets better — because good-but-wrong output is harder to catch than obviously-wrong output. A model that's right 70% of the time is more exhausting to work with than one that's right 30% of the time, because you can't build reflexes for the failure mode.

The shape of work fragmented

The old rhythm of knowledge work had natural pauses. You'd write for a while, then wait for feedback. You'd code for an afternoon, then run tests. You'd draft a proposal, then sleep on it.

AI collapses those pauses. There's no waiting. The next iteration is always one prompt away. People describe this as productive at first, then as relentless. The pauses weren't waste — they were when the thinking consolidated. Remove them and you get more surface output and less depth.

We notice this most in design and architecture work. The best decisions we've made as a team came from someone staring at a whiteboard for forty minutes. You can't prompt your way to that. And when the cultural default becomes "just ask the model," that forty-minute stare starts feeling like slacking — even to the person doing it.

Context-switching costs went up, not down

One of the quiet lies of AI tooling is that it reduces context-switching. In theory, you can offload a small task — summarise this thread, rewrite this paragraph, explain this error — and return to your main work.

In practice, every AI interaction is itself a context switch. You have to formulate the prompt, evaluate the output, decide what to keep, integrate it back into your work. That's four mode shifts for something that used to be one.

Worse, the interruption threshold dropped. Before, you wouldn't ask a colleague to rewrite a single sentence — the social cost was too high. Now you'll ping a model twenty times an hour for trivial things, and each ping costs a few seconds of working memory. Those seconds compound.

What actually helps

We're not arguing against AI tooling. We build with it daily and wouldn't go back. But we've become much more specific about when it helps and when it quietly makes things worse.

A few things we've landed on:

Use AI for the parts where verification is cheap. Generating boilerplate, writing tests against a clear spec, producing first drafts of docs that will be heavily edited anyway — these have low evaluator's tax. The output is easy to check because the correctness criteria are explicit.

Avoid AI for the parts where verification is expensive. Architectural reasoning, security-sensitive code, anything where "looks right" and "is right" diverge. Here the model saves you typing but costs you more in review than you saved in generation.

Protect the pauses. We've deliberately built blocks of the week where nobody uses AI tooling. Not out of purity — out of observation that the consolidation work that happens in those blocks is where most of the real decisions come from.

Push back on implicit quota creep. If AI made drafting five times faster, that doesn't mean the team should produce five times the output. It might mean producing the same output with more thought behind it, or shipping earlier and sleeping more. This is a management decision, not a tool decision, and it has to be made explicitly or the quota will creep on its own.

Measure review load, not just output. Teams obsessively track lines of code, PRs merged, docs shipped. Almost nobody tracks how much of the day is spent in evaluation mode. If that number is creeping past 60%, burnout is coming whether output is up or not.
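The 60% heuristic is easy to operationalise if you keep even a rough time log. A minimal sketch, assuming a hypothetical log of (mode, hours) entries tagged as either generation or evaluation work — the log format, the helper name, and the sample numbers are illustrative, not a real tool:

```python
# Hypothetical sketch: estimate what share of tracked time is spent in
# evaluation mode (reviewing AI drafts, code review, checking summaries)
# versus generation mode. The 60% threshold follows the heuristic in
# the text; everything else here is an assumed, made-up log format.

def evaluation_share(entries):
    """entries: list of (mode, hours), where mode is 'generate' or 'evaluate'."""
    total = sum(hours for _, hours in entries)
    if total == 0:
        return 0.0
    evaluating = sum(hours for mode, hours in entries if mode == "evaluate")
    return evaluating / total

# Illustrative week: mostly reviewing other people's (and models') output.
week = [
    ("generate", 8.0),   # writing code/docs yourself
    ("evaluate", 20.0),  # reviewing AI drafts, PRs, summaries
    ("generate", 4.0),
]

share = evaluation_share(week)
if share > 0.60:
    print(f"Warning: {share:.0%} of tracked time is evaluation mode")
```

The point isn't the script — it's that this number is trivially cheap to track and almost nobody tracks it, while the dashboards for output metrics are elaborate.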

The honest framing

AI tools are a genuine productivity multiplier for some tasks and a cognitive tax on others. The trouble is that the productivity shows up in dashboards and the tax shows up in people's nervous systems. One is legible to the organisation. The other is not, until someone resigns.

The teams we've seen handle this well aren't the ones with the best tooling. They're the ones who took the question "what should we not use AI for?" as seriously as the question "where can we use more of it?" That's a less exciting conversation, but it's the one worth having.

The irony in the 1News headline is real, but it isn't really ironic. A tool that generates output faster than humans can evaluate it will, by default, produce more evaluation work than it removes generation work. That's the mechanism. Once you see it, the fatigue stops being surprising — and becomes something you can actually design around.

Want to discuss this?

We write about what we're actually working on. If this is relevant to something you're building, we'd love to hear about it.