About Presient Labs

An AI skill optimization lab. We measure what works, discard what doesn't, and prove the difference.

The Problem

Most AI skills are untested. Developers write CLAUDE.md files, .cursorrules, system prompts — and never measure whether they actually work. The output “looks right” so they ship it.

This is how you end up with skills that score 96% internally and collapse to 46% under blind testing. It looks like improvement right up until it doesn't.

Our Approach

We apply the same rigor to AI skills that engineers apply to code: automated testing, blind evaluation, measurable improvement. Every skill we optimize goes through hundreds of variants, each scored by three independent judges using binary pass/fail criteria.

No vibes. No 1-10 scales. Pass or fail.

Binary evaluation eliminates two failure modes: probability noise from scaled scoring, and proxy drift where an optimizer learns to game your metrics instead of genuinely improving quality. Independent judges catch what a single evaluator misses.
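The protocol above can be sketched in a few lines. This is an illustrative toy, not Presient Labs' actual tooling: the judge functions, names, and criteria below are hypothetical stand-ins for independent judge models, each returning a strict pass/fail.

```python
# Hypothetical sketch: binary multi-judge evaluation with majority vote.
# Judge functions are illustrative stand-ins, not Presient Labs' tooling.

def majority_pass(output: str, judge_fns) -> bool:
    """An output passes only if a majority of judges vote pass (True)."""
    votes = [judge(output) for judge in judge_fns]
    return sum(votes) > len(votes) / 2

def score_variant(outputs, judge_fns) -> float:
    """Fraction of outputs passing the majority vote — no 1-10 scales."""
    passed = sum(majority_pass(o, judge_fns) for o in outputs)
    return passed / len(outputs)

# Toy judges: each checks one binary criterion.
judges = [
    lambda o: "plan" in o.lower(),    # mentions a plan at all
    lambda o: len(o.split()) >= 5,    # minimally substantive
    lambda o: not o.endswith("..."),  # doesn't trail off unfinished
]

print(score_variant(["Here is a concrete plan with five steps."], judges))
```

Because every vote is pass or fail, an optimizer can't inch a scaled score upward by gaming one judge's preferences; it has to satisfy a majority of independent binary criteria.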

The Founder

Shawn Carpenter

@ShawnDanCap on X

Builder and optimizer. Founded Presient Labs after discovering that AI skills could be systematically improved using evolutionary optimization — and that most “improvements” fail under blind testing. The writing-plans failure (96% internal → 46% blind) is why we guarantee results.

Our Guarantee

If your optimized skill doesn't beat baseline by at least 10 percentage points, automatic refund. No questions asked.

We publish our failures alongside our successes. The writing-plans result is public at github.com/willynikes2/skill-evals. We didn't hide it — it's the reason you should trust the five that passed.