Issue 01 · May 2026—Global · prompt v1.8 · tool-agnostic

Who actually delivers with AI?

An objective directory of AI specialists, ranked by what they actually ship: code, MVPs, measurable outcomes. The evaluation prompt runs on the candidate's own codebase, and the resulting evaluation is cryptographically signed. Tool-agnostic. Evidence-based. No vibes.

Submit your evaluation → · How it works

§ 03 · How it works

Three steps, one verdict.

Run the prompt

Paste a single prompt into Claude Code (or Codex / Cursor / Gemini) inside the candidate's own repository. The prompt reads code, git history, and memory files — it does not phone home.

Get a signed JSON

The prompt produces a JSON evaluation: scores, capabilities, achievements, MVP velocity, ship throughput. Signed with HMAC-SHA256, valid for 24 hours. Altered tokens are refused at submission.

Submit. Get listed.

Paste the JSON at /submit. Your public profile goes live with what you can deliver — exactly what a hiring company wants to see.

§ 04 · The prompt

Versioned. Auditable. Public.

Every score on this directory comes from the same prompt (currently core v1.8). It's published in full — read it, audit it, run it locally. Future versions are added; old evaluations are not silently rescored.

Read prompt v1.8 · Submit your eval →

Prompt fingerprint

type: core

version: 1.8

length: ~30,000 chars

scoring: Output 30 / Ship 25 / Context 25 / Iteration 20

signature: HMAC-SHA256, 24h TTL

tool-agnostic: Claude · Codex · Cursor · Gemini
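
The four scoring caps in the fingerprint add up to 100. As a rough illustration only, the sketch below assumes each category is scored from 0 up to its cap and that the total is a plain sum; the published prompt may combine them differently. The names `WEIGHTS` and `total_score`, and the sample points, are hypothetical.

```python
# Category caps as listed in the prompt fingerprint.
WEIGHTS = {"output": 30, "ship": 25, "context": 25, "iteration": 20}


def total_score(points: dict[str, float]) -> float:
    """Sum per-category points after checking each against its cap.

    Assumption: a plain sum of capped categories. This is an illustration,
    not the prompt's documented aggregation rule.
    """
    if set(points) != set(WEIGHTS):
        raise ValueError(f"expected categories {sorted(WEIGHTS)}")
    for name, value in points.items():
        if not 0 <= value <= WEIGHTS[name]:
            raise ValueError(f"{name} must be between 0 and {WEIGHTS[name]}")
    return sum(points.values())


if __name__ == "__main__":
    # Made-up sample points, not a real evaluation.
    print(total_score({"output": 24, "ship": 20, "context": 18, "iteration": 15}))  # 77
```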

§ 05 · Frequently asked

Questions, considered.

Can someone fake a high score?

The prompt is run inside Claude Code (or another agent) on the candidate's actual codebase. It reads git history, memory files, and project structure — not just self-reported claims. The output JSON is signed with HMAC-SHA256 (24h TTL); we re-verify the signature server-side and refuse altered tokens. Could someone theoretically game it? Yes — by writing high-quality code and shipping. That's the point.
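
For readers curious what the server-side re-verification described above could look like, here is a minimal sketch using Python's standard `hmac` module. It is not the directory's actual code. The secret key handling, the `issued_at` field, and the canonical JSON form are all assumptions made for illustration.

```python
import hashlib
import hmac
import json
import time
from typing import Optional

TTL_SECONDS = 24 * 60 * 60  # the 24h validity window stated above


def canonical_bytes(payload: dict) -> bytes:
    # Assumed canonical form: sorted keys, compact separators, UTF-8.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(payload: dict, key: bytes) -> str:
    """Return the hex HMAC-SHA256 of the canonical payload."""
    return hmac.new(key, canonical_bytes(payload), hashlib.sha256).hexdigest()


def verify(payload: dict, signature: str, key: bytes, now: Optional[float] = None) -> bool:
    """Accept only an unaltered payload that is at most 24 hours old."""
    now = time.time() if now is None else now
    expected = sign(payload, key)
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8")):
        return False
    issued_at = payload.get("issued_at")  # hypothetical field name
    if not isinstance(issued_at, (int, float)):
        return False
    return 0 <= now - issued_at <= TTL_SECONDS
```

Changing any field after signing changes the expected digest, so an edited score fails the comparison. A token older than 24 hours fails the age check even when its signature is intact.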

Why not have a human jury rank everyone?

Human juries don't scale, carry their own biases, and disagree across cultures and fields. A versioned prompt that is published, audited, and run on real code is closer to objective than a human committee, especially for technical evaluation. We trade some judgment for scale and transparency.

Is this Claude-Code-specific?

No. The prompt is tool-agnostic. It looks for ANY signal of good AI workflow — CLAUDE.md, AGENTS.md, GEMINI.md, .cursorrules, .github/copilot-instructions.md. The score reflects what you shipped, not which tool you used.
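
To check which of these files a repository already has, a small script along these lines may help. It is only a sketch: the file names come from the answer above, while the function name `find_workflow_files` is hypothetical. How the prompt actually weighs these files is not shown here.

```python
from pathlib import Path

# File names taken from the answer above.
WORKFLOW_FILES = (
    "CLAUDE.md",
    "AGENTS.md",
    "GEMINI.md",
    ".cursorrules",
    ".github/copilot-instructions.md",
)


def find_workflow_files(repo_root: str) -> list[str]:
    """Return the known AI-workflow files that exist under repo_root."""
    root = Path(repo_root)
    return [name for name in WORKFLOW_FILES if (root / name).is_file()]


if __name__ == "__main__":
    # Lists whichever of the files exist in the current directory.
    print(find_workflow_files("."))
```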

Who evaluates marketing / design / sales?

The current prompt (core v1.8) is built primarily for IT specialists. Industry-specific prompts (marketing, design, sales, data) are on the roadmap. Submit anyway — the core prompt still extracts capabilities and outcomes.

How do I get listed?

Run the prompt locally → paste the JSON at /submit. That's it. No application, no review queue, no fee. The evaluation is the application.

Can hiring companies search by criteria?

Industry filtering works today. Skill / outcome / location filtering is the next sprint. If you want to filter on something specific now, email [email protected] — we respond fast.

How is this different from GitHub stars or LinkedIn?

GitHub stars measure popularity, not skill. LinkedIn measures whatever people choose to write. aiers measures what the prompt extracts from your actual repository — what you can ship, in what timeframe, with what tools.