Industry Commentary

Personal vs Canonical Skill Libraries: Where Should Coding Agents Get Their Instructions?

By John Jansen · 6 min read


Matt Pocock published his personal .claude/skills directory on GitHub and it crossed 3,800 stars inside a day. That number matters less than what it represents: a single practitioner's curated agent instructions spreading faster than anything Anthropic, the major frameworks, or most internal platform teams have shipped in the same space.

This is the moment skill libraries become a real question for engineering organisations, not a curiosity. And the question is not "should we use skills" — most teams running Claude Code or similar agents already are. The question is whose skills, curated under what rituals, with what update velocity, and where the trust boundary sits.

What a skill library actually is

For anyone who hasn't poked at this yet: a skill in the Claude Code sense is a small folder built around a markdown file that tells the agent how to do a specific kind of task. Write TypeScript like this. Run tests like that. When you see a Drizzle schema, here's the migration ritual. They're not prompts in the chat sense; they're persistent, file-system-resident instructions the agent reaches for when the task at hand matches the skill's description.

The interesting property is that skills compose. You can drop someone else's skill folder into your repo and the agent picks it up. That's why Matt's repo went viral: it's not a framework, it's a directory you can copy. Low ceremony, high leverage.

It's also why the curation question is suddenly load-bearing.
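
To make the "directory you can copy" point concrete, here is a minimal sketch of how a tool might take stock of a skills folder. It assumes one subfolder per skill, each holding a SKILL.md that opens with a small `name:` / `description:` header between `---` lines. The function names (`read_header`, `list_skills`) are ours, the parser only handles flat `key: value` lines, and none of this is meant to describe how Claude Code itself loads skills.

```python
from pathlib import Path


def read_header(text: str) -> dict[str, str]:
    """Parse flat `key: value` lines between the first pair of `---` lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    header: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return header
        key, sep, value = line.partition(":")
        if sep:
            header[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat the file as having no header


def list_skills(root: Path) -> list[dict[str, str]]:
    """Return one record per `<root>/<skill>/SKILL.md`, sorted by path."""
    skills = []
    for skill_file in sorted(root.glob("*/SKILL.md")):
        header = read_header(skill_file.read_text(encoding="utf-8"))
        skills.append({
            "folder": skill_file.parent.name,
            # Fall back to the folder name when the header omits `name`.
            "name": header.get("name", skill_file.parent.name),
            "description": header.get("description", ""),
        })
    return skills
```

Calling `list_skills(Path(".claude/skills"))` on a repo gives a flat inventory you can read in one sitting, which is a reasonable first step before deciding whose skills those are.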

Three models, three different bets

We see organisations defaulting into one of three models, usually without naming the choice.

Vendor-canonical. Use whatever Anthropic, OpenAI, or your framework vendor ships. Treat it as a base library. The bet here is that the vendor has the most signal about how their model actually behaves, and the lowest incentive to ship something weird. The cost is that vendor skills tend to be generic — they're not opinionated about TypeScript versus Python, Drizzle versus Prisma, your monorepo layout, your test runner.

Practitioner-curated. Pull from someone like Matt, or from a handful of well-known engineers whose taste you trust. The bet is that a strong individual practitioner produces better, more opinionated, more useful skills than either a vendor or a committee. The cost is provenance risk — you are now downstream of one person's availability, taste drift, and threat model.

Org-canonical. Your platform team or staff engineers curate an internal skill library. The bet is that your codebase is specific enough that generic skills produce mediocre output, and that the cost of maintaining your own is repaid in agent quality. The cost is real: someone owns it, someone reviews PRs to it, someone gets paged when a skill starts producing bad migrations.

Most teams we talk to are doing an unacknowledged mix of all three, which is the worst of every world because nobody owns any of it.
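
One way to surface the mix is simply to write it down. The sketch below assumes a hypothetical inventory in which every skill is tagged with where it came from and who owns it. The `SkillRecord` fields and the three source labels are illustrative, borrowed from the models above rather than from any tool.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

SOURCES = ("vendor", "practitioner", "org")  # the three models described above


@dataclass(frozen=True)
class SkillRecord:
    name: str
    source: str            # expected to be one of SOURCES
    owner: Optional[str]   # a named person, or None if nobody owns the skill


def summarize(records: list[SkillRecord]) -> dict:
    """Count skills per source and list the ones with no named owner."""
    unknown = [r.name for r in records if r.source not in SOURCES]
    if unknown:
        raise ValueError(f"unknown source for: {unknown}")
    counts = Counter(r.source for r in records)
    return {
        "by_source": {s: counts.get(s, 0) for s in SOURCES},
        "unowned": sorted(r.name for r in records if not r.owner),
    }
```

If `by_source` shows entries in all three buckets and `unowned` is most of the list, that is the unacknowledged mix in a form you can put in front of a staff meeting.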

The trust boundary problem

Here's what makes this different from the npm dependency conversation, even though it rhymes.

When you npm install a package, the trust boundary is execution. The package runs in your build or runtime. You worry about supply chain attacks, malicious postinstall scripts, typosquatting.

When you adopt someone's skill library, the trust boundary is your agent's behaviour on your codebase. A subtly wrong skill doesn't need to pop a reverse shell to do damage: it convinces your agent that the "right" way to add a database column is to drop and recreate the table. It tells the agent to disable strict mode to make a test pass. It encodes a pattern that's idiomatic in someone else's codebase and actively wrong in yours.

This is a quieter failure mode and a harder one to audit, because you're auditing instructions about taste, not code that does a specific thing.

The implication: reviewing a third-party skill library is not the same job as reviewing a dependency. It needs an engineer who understands both how the agent will interpret the instructions and how your codebase wants to be modified. That's a specific person, usually senior, and they need time to do it properly.
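
Because that reviewer's time is the scarce resource, a cheap text scan can help decide which skills to read first. The sketch below flags instructions that resemble the failure modes described above. The pattern list is illustrative and ours. A match is a reason to read closely, and the absence of one says nothing about whether a skill's taste fits your codebase.

```python
import re

# Illustrative patterns only; tune them to your own codebase's sharp edges.
RISKY_PATTERNS = {
    "destructive SQL": re.compile(r"\bdrop\s+table\b", re.IGNORECASE),
    "weakened type checking": re.compile(
        r'"strict"\s*:\s*false|\bdisable\s+strict\b', re.IGNORECASE
    ),
    "skipped tests or hooks": re.compile(
        r"--no-verify|\bskip\s+(the\s+)?tests?\b", re.IGNORECASE
    ),
}


def flag_risky_lines(skill_text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, label, line) for each line matching a risky pattern."""
    findings = []
    for number, line in enumerate(skill_text.splitlines(), start=1):
        for label, pattern in RISKY_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label, line.strip()))
    return findings
```

This is triage, not review: it orders the reading list for the senior engineer, it does not replace them.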

Update velocity is the sleeper issue

The other thing the Pocock repo surfaces is velocity. Matt can ship a skill change in minutes. Anthropic ships canonical guidance on its own cadence, which is slower and more conservative. An internal platform team ships at whatever speed your platform team ships at, which is usually slower than either.

This matters because models change. Instructions tuned for Claude 3.5 are not necessarily the right instructions for Claude 3.7. When Anthropic releases a new model that handles tool use differently, every skill that encodes assumptions about the old behaviour becomes subtly wrong. The practitioner-curated libraries adapt fast. The vendor libraries adapt with the vendor's release cycle. The org-canonical libraries adapt when someone notices.

So the velocity question isn't "how often do we update skills." It's "who notices when the model changes underneath us, and how long is the lag before our skills reflect that?" If the answer is "a contractor will get to it next quarter," you don't have an org-canonical library, you have an org-canonical liability.
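
That lag can be measured if two things are recorded: when each skill was last reviewed, and when the model underneath it changed. A minimal sketch, assuming a hypothetical mapping of skill names to last-review dates that you maintain yourself; the function name and the choice to count a same-day review as current are ours.

```python
from datetime import date


def review_lag(
    last_reviewed: dict[str, date], model_changed: date, today: date
) -> tuple[list[str], int]:
    """Return the skills not reviewed since the model change, and the lag in days.

    A skill reviewed on the day of the change counts as current. The lag is 0
    when nothing is stale.
    """
    stale = sorted(
        name for name, reviewed in last_reviewed.items() if reviewed < model_changed
    )
    lag_days = (today - model_changed).days if stale else 0
    return stale, lag_days
```

With made-up dates, a skill last reviewed on 10 January, a model change on 24 February, and a check run on 1 April, the function reports that skill as stale with a lag of 36 days. That number is the one to watch, more than the raw count of skill updates.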

What we'd actually recommend

We think most engineering organisations should land somewhere like this:

Pick a base. Either vendor or a single named practitioner. Don't mix sources at the base layer. Pin it to a specific commit or release. Treat updates as a deliberate event, not a git pull.

Layer org-specific skills on top. These are the ones that encode your codebase's actual conventions — your ORM, your test patterns, your deployment shape. This layer is small and high-value. It's where your platform team's effort goes.

Have a named owner. One engineer's name against the skill library. Not a team, not a Slack channel. A person who reads the agent's PRs and notices when something starts drifting.

Review skills like you review architecture, not like you review code. A PR to add a skill is a PR that changes how every agent-authored change in the repo will look from now on. That deserves more than a thumbs-up emoji from whoever happens to be online.

Watch for the model change moment. When your vendor ships a meaningful model update, that's the moment to re-audit the base layer. Calendar it.
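
Pinning the base layer can be as simple as recording a digest of it at review time and failing a check when it changes. The sketch below is one way to do that with the Python standard library, assuming the base skills are vendored into a folder in your repo. The function names are ours, and where you store the pinned digest is up to you. Pinning to a commit hash of the upstream repository achieves the same end.

```python
import hashlib
from pathlib import Path


def directory_digest(root: Path) -> str:
    """SHA-256 over every file's relative path and bytes, in sorted path order."""
    digest = hashlib.sha256()
    files = sorted(
        (p for p in root.rglob("*") if p.is_file()),
        key=lambda p: p.relative_to(root).as_posix(),
    )
    for path in files:
        rel = path.relative_to(root).as_posix().encode("utf-8")
        data = path.read_bytes()
        # Length-prefix each field so content cannot shift between files unnoticed.
        digest.update(len(rel).to_bytes(8, "big") + rel)
        digest.update(len(data).to_bytes(8, "big") + data)
    return digest.hexdigest()


def check_pin(root: Path, pinned: str) -> bool:
    """True if the vendored base layer still matches the digest recorded at review."""
    return directory_digest(root) == pinned
```

Run as a CI step, a failing `check_pin` turns a silent upstream change into the deliberate event the first recommendation asks for: someone has to re-review the base layer and record a new digest.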

The honest take

We think practitioner-curated libraries are going to win the early adoption battle and lose the long-term one inside serious engineering organisations. Matt's repo is excellent because Matt is excellent — but "depend on a specific person's taste being available and current" is not a posture an organisation can hold for years.

Vendor-canonical libraries will get better and become the sensible base. Org-canonical layers on top are where the real differentiation lives, because they're the only ones that know what your codebase actually wants.

The trending repo isn't the future of skill libraries. It's the proof that skill libraries are now important enough to argue about — and the prompt to decide, deliberately, where your trust actually sits.
