Scoring methodology

How we score · how to read percentile · benchmarks

TL;DR

Every call you upload is scored on a 7-criterion rubric. Each criterion has 2-3 sub-criteria. Claude reads the transcript, scores each sub-criterion on a 0-5 scale with a transcript quote as evidence, and computes a weighted final score out of 5.

We then compare your final score to a baseline of typical B2B SaaS Solution Engineers and tell you which percentile bucket you fall into. P50 = median. P75 = top quarter. P10 = bottom decile.

The rubric is the same scoring system Kaushik used historically in his "Demo of the Month" Excel sheets. We automated it and benchmarked it against industry data.

1. The rubric — what we score

Criterion | Weight (Demo) | What we look for
Solution Skills | 30% | Customization to prospect's pain · framing features as outcomes
Craftsmanship | 20% | Personalized demo env · prospect logo · pre-built workflows
Communication | 15% | Tone, pacing · engaging delivery, stories, analogies
Consultative Approach | 15% | Proactive insights · clear recommendations · anchoring takeaways
Presentation | 10% | Relevance of what's shown · narrative cohesion
Touchbase on Pain | 5% | Surfacing + addressing pains throughout, not just at start
Audience Engagement | 5% | Personalization (name, industry examples) · interactivity

Weights change automatically based on call type — see the "Call types" section below.

2. How the final score is computed

Same formula as the original Excel sheets:

final = Σ over criteria of (weight_pct / 100) × avg(sub_scores)

Worked example — a demo call with these sub-score averages:

  • Solution Skills: 3.75 → contributes 0.30 × 3.75 = 1.125
  • Craftsmanship: 4.0 → contributes 0.20 × 4.0 = 0.800
  • Communication: 4.0 → contributes 0.15 × 4.0 = 0.600
  • Consultative Approach: 3.33 → contributes 0.15 × 3.33 = 0.500
  • Presentation: 3.75 → contributes 0.10 × 3.75 = 0.375
  • Touchbase on Pain: 3.75 → contributes 0.05 × 3.75 = 0.188
  • Audience Engagement: 4.0 → contributes 0.05 × 4.0 = 0.200

Final = 3.79 / 5. That's the big number on your dashboard.
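
If you want to check the arithmetic yourself, here is a minimal Python sketch of the formula above. The weights come from the demo rubric table. The individual sub-scores are hypothetical, picked only so that their averages match the worked example (this assumes half-point scores are allowed), and the function name `final_score` is ours, not the production code.

```python
# Demo-call weights (percent), copied from the rubric table.
DEMO_WEIGHTS = {
    "Solution Skills": 30,
    "Craftsmanship": 20,
    "Communication": 15,
    "Consultative Approach": 15,
    "Presentation": 10,
    "Touchbase on Pain": 5,
    "Audience Engagement": 5,
}


def final_score(sub_scores, weights=DEMO_WEIGHTS):
    """Return sum over criteria of (weight_pct / 100) * avg(sub_scores)."""
    if sum(weights.values()) != 100:
        raise ValueError("weights must sum to 100")
    total = 0.0
    for criterion, weight_pct in weights.items():
        scores = sub_scores[criterion]
        total += (weight_pct / 100) * (sum(scores) / len(scores))
    return total


# ASSUMPTION: hypothetical sub-scores that reproduce the example averages.
example = {
    "Solution Skills": [4, 3.5],         # avg 3.75
    "Craftsmanship": [4, 4, 4],          # avg 4.0
    "Communication": [4, 4],             # avg 4.0
    "Consultative Approach": [3, 3, 4],  # avg 3.33
    "Presentation": [4, 3.5],            # avg 3.75
    "Touchbase on Pain": [4, 3.5],       # avg 3.75
    "Audience Engagement": [4, 4],       # avg 4.0
}

print(round(final_score(example), 2))  # 3.79
```

The weight check is there because a weight set that does not sum to 100 would silently inflate or deflate every final score.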

3. Industry percentile — what P10 / P50 / P75 actually mean

We compare your final weighted score against the distribution of SaaS SE scores from industry data. Current bands (final score out of 5):

Band | Score ≥ | What it means
P95 | 4.5 | Top 5%: world-class. SEs at this level get hired into Director / Head of SE roles.
P90 | 4.3 | Top 10%: leadership track. Trusted to demo to C-suite and run complex POCs.
P75 | 3.9 | Top 25%: strong, consistent performer. Trusted with the biggest deals.
P50 | 3.4 | Median. The "typical" B2B SaaS SE. Solid but not yet differentiated.
P25 | 2.8 | Bottom 25%: needs structured coaching. Usually weak discovery / feature-selling.
P10 | < 2.8 | Bottom 10%: likely doing feature tours instead of value-led demos.

"P10" does NOT mean "10/100". It means: out of 100 SaaS SEs scored on this rubric, roughly 90 score higher than you, and you sit in the bottom 10%. P50 means 50 above, 50 below (the median). P95 means only ~5 in 100 score higher (a top performer).

The percentile is your rank vs the entire SaaS industry, not just SurveySparrow. A P50 score puts you on par with median SEs at Salesforce, HubSpot, Atlassian, Gong — not just this team.
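
As a rough illustration of how a final score maps to a band, the sketch below walks the band table from the top and returns the first band whose floor the score meets. The floors are the ones listed above. The lookup logic and the name `percentile_band` are our assumptions about how such a table is typically applied, not a description of the production code.

```python
# Band floors copied from the band table, highest first.
BANDS = [
    (4.5, "P95"),
    (4.3, "P90"),
    (3.9, "P75"),
    (3.4, "P50"),
    (2.8, "P25"),
]


def percentile_band(final_score):
    """Return the highest band whose floor the score meets."""
    for floor, band in BANDS:
        if final_score >= floor:
            return band
    return "P10"  # anything below 2.8 falls in the lowest band


print(percentile_band(3.79))  # P50
```

A score of 3.79 lands in P50 because it clears the 3.4 floor but not the 3.9 floor for P75.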

4. Per-criterion gap vs industry median

In the call detail view, each criterion shows a "vs median" gap — your score on that one criterion vs the SaaS-industry median for that specific criterion (not the overall final score).

Criterion | Industry median (0-5)
Communication | 3.8
Presentation | 3.6
Audience Engagement | 3.4
Solution Skills | 3.5
Consultative Approach | 3.2
Touchbase on Pain | 3.3
Craftsmanship | 3.0

Craftsmanship median is lowest (3.0) because most SEs use generic demo environments. Personalizing yours with prospect logo + data is a fast way to leapfrog the median.

Consultative Approach (3.2) is the second-lowest — the industry over-indexes on product demonstration vs trusted-advisor positioning. SEs who bring proactive insights stand out fast.
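
To make the "vs median" gap concrete, here is a small sketch that subtracts the industry median from your per-criterion score and picks the largest negative gap. The medians are the ones in the table above. The function names and the sample scores are hypothetical.

```python
# Per-criterion industry medians, copied from the table above.
INDUSTRY_MEDIAN = {
    "Communication": 3.8,
    "Presentation": 3.6,
    "Audience Engagement": 3.4,
    "Solution Skills": 3.5,
    "Consultative Approach": 3.2,
    "Touchbase on Pain": 3.3,
    "Craftsmanship": 3.0,
}


def gaps_vs_median(criterion_scores):
    """Return score minus industry median for each criterion."""
    return {
        name: round(score - INDUSTRY_MEDIAN[name], 2)
        for name, score in criterion_scores.items()
    }


def biggest_negative_gap(criterion_scores):
    """Return the criterion furthest below its median, or None if none is below."""
    gaps = gaps_vs_median(criterion_scores)
    worst = min(gaps, key=gaps.get)
    return worst if gaps[worst] < 0 else None


# ASSUMPTION: hypothetical per-criterion averages for one call.
scores = {
    "Communication": 4.0,
    "Presentation": 3.5,
    "Audience Engagement": 4.0,
    "Solution Skills": 3.75,
    "Consultative Approach": 3.0,
    "Touchbase on Pain": 3.5,
    "Craftsmanship": 2.5,
}

print(gaps_vs_median(scores)["Craftsmanship"])  # -0.5
print(biggest_negative_gap(scores))             # Craftsmanship
```

In this made-up call, Craftsmanship sits 0.5 below its median, so that is where to look first.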

5. Where the benchmark data comes from

Current numbers are seeded estimates synthesized from publicly available sources:

  • Gartner SE Excellence report (annual)
  • PreSales Collective State of PreSales annual survey
  • SalesHood / Gong public demo benchmarks
  • Bain SaaS GTM benchmark surveys

These are reasonable starting estimates — not a live data feed. We refresh them quarterly.

Caveat: the percentile is directionally accurate, not surgically precise. P50 vs P75 is a real signal; don't agonize over P74 vs P76.

6. How scoring adapts to call type

Same 7 criteria, but weights shift and Claude gets type-specific guidance:

Call type | Heaviest weight | Type-specific guidance
Demo | Solution 30% + Craft 20% | Personalize env, tie features to pains, clear next step
Follow-up demo | Solution 25% + Consult 20% | MUST reference prior-call pains, otherwise 0 on Touchbase on Pain
Follow-up query | Consult 30% + Solution 20% | Be a trusted advisor, not an FAQ answerer
POC | Solution 35% + Craft 20% | Penalize "that's on the roadmap"; reward real workflow integration
Closure | Consult 35% + Pain 20% (Craft 0%) | Loop back to original pains. No new feature tours.

Picking the wrong call type scores you against the wrong lens. If it was a closure call, don't pick "demo" — closure calls don't get penalized for low Craftsmanship.
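
The sketch below shows why the call type matters, by scoring the same call under two weight sets. The demo weights are from the rubric table. For closure, only Consultative Approach 35%, Touchbase on Pain 20%, and Craftsmanship 0% come from this page. The other four closure weights are placeholders we made up so the set sums to 100, so treat the closure result as illustrative only.

```python
WEIGHTS_BY_CALL_TYPE = {
    # From the rubric table.
    "demo": {
        "Solution Skills": 30,
        "Craftsmanship": 20,
        "Communication": 15,
        "Consultative Approach": 15,
        "Presentation": 10,
        "Touchbase on Pain": 5,
        "Audience Engagement": 5,
    },
    "closure": {
        "Consultative Approach": 35,  # from the call-type table
        "Touchbase on Pain": 20,      # from the call-type table
        "Craftsmanship": 0,           # from the call-type table
        # ASSUMPTION: the four weights below are NOT published on this page.
        "Solution Skills": 15,
        "Communication": 15,
        "Presentation": 10,
        "Audience Engagement": 5,
    },
}


def weighted_final(criterion_avgs, call_type):
    """Apply the weight set for `call_type` to per-criterion averages."""
    weights = WEIGHTS_BY_CALL_TYPE[call_type]
    return sum(w / 100 * criterion_avgs[name] for name, w in weights.items())


# ASSUMPTION: a hypothetical call that is strong everywhere except Craftsmanship.
avgs = {name: 4.0 for name in WEIGHTS_BY_CALL_TYPE["demo"]}
avgs["Craftsmanship"] = 2.0

print(round(weighted_final(avgs, "demo"), 2))     # 3.6
print(round(weighted_final(avgs, "closure"), 2))  # 4.0
```

The same call drops from 4.0 to 3.6 when labeled "demo", purely because Craftsmanship carries 20% of the weight there and 0% on closure.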

7. What's in your scorecard

Every scorecard has 4 things:

  • Final weighted score + industry percentile — the headline. Use it for trend, not for ego.
  • Per-criterion scores + industry-gap deltas — the diagnostic. Biggest negative gap = highest-leverage area.
  • Top 3 strengths + top 3 gaps — qualitative, evidence-based. Each tied to a transcript moment.
  • One coaching action for the month — single concrete behavior change. This is the only thing you have to do.

One behavior change, sustained, moves you up a percentile band over a quarter. Six behavior changes attempted at once move you nowhere.

8. FAQ

Why bands (P10, P25...) not exact numbers like P67?
Because the underlying benchmark data is itself imprecise. Fake precision would mislead.

My score went down month-over-month. Did I get worse?
Maybe — or you took harder deals, or it's normal variance. Look at the 3-month trend, not month-to-month wobble.
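
One simple way to read the trend instead of the wobble is a trailing 3-month average. This is a sketch under our own assumptions: the portal may compute trend differently, and the monthly scores below are invented.

```python
from statistics import mean


def trailing_average(monthly_scores, window=3):
    """Return the mean of each trailing `window` of monthly final scores."""
    return [
        mean(monthly_scores[i - window + 1 : i + 1])
        for i in range(window - 1, len(monthly_scores))
    ]


# ASSUMPTION: hypothetical monthly final scores, oldest first.
monthly = [3.6, 3.9, 3.7, 3.8, 3.6, 4.0]

print([round(x, 2) for x in trailing_average(monthly)])  # [3.73, 3.8, 3.7, 3.8]
```

Month five looks like a drop (3.8 to 3.6), yet the smoothed series stays between 3.7 and 3.8, which is the "normal variance" case.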

I disagree with a score. What do I do?
Reply to your coaching email. Every score has a transcript quote as evidence — that's the conversation starter.

Will my AE see my scores?
No. SEs see only their own scores. Managers see the team. CEO sees aggregates. AEs do not have portal access.

9. For managers — how to use this

  • Don't lead with percentile in 1:1s. Lead with the single coaching action.
  • Watch the trend, not the snapshot. A great SE having a bad month is a coaching conversation. Three months of decline is a process intervention.
  • Use per-criterion gaps to spot team-level patterns. If 6 of 8 SEs are low on Craftsmanship, that's an enablement problem.
  • Demo of the Month rewards sustained quality. Min 2 demo-class calls to be eligible — prevents single-call winners.

Questions about your scores? Reply to your monthly coaching email (Kaushik is CC'd). Every score has a transcript quote as evidence, and we use that as the conversation starter.