SurveySparrow SE Coach

TL;DR

Every call you upload is scored on a 7-criterion rubric. Each criterion has 2-3 sub-criteria. Claude reads the transcript, scores each sub-criterion on a 0-5 scale with a transcript quote as evidence, and computes a weighted final score out of 5.

We then compare your final score to a baseline of typical B2B SaaS Solution Engineers and tell you which percentile bucket you fall into. P50 = median. P75 = top quarter. P10 = bottom decile.

Beyond the score, we extract deal-intelligence signals from each call — what product the conversation is about (SurveySparrow / ThriveSparrow / SparrowDesk), the prospect's program maturity (CX or EX scope), features they discussed vs features they explicitly requested as gaps, competitors, trial issues, AE behavior, and selling style.

The rubric is the same scoring system Kaushik used historically in his "Demo of the Month" Excel sheets; we just automated it and benchmarked it against industry data. v3 (Jun 2026) added explicit fairness rules: SE check-ins aren't flagged as pain, AE-domain topics aren't your gaps, phased rollouts are respected, and procurement/security calls have their own scoring lens. See "What we won't penalize you for" below for the full list. Visual-only signals (was your logo on screen?) are only scored when the transcript verbally describes them; see "Audio-only scoring."

1. The rubric — what we score

Criterion	Weight (Demo)	What we look for
Solution Skills	30%	Customization to prospect's pain · framing features as outcomes
Craftsmanship	20%	Personalized demo env · prospect logo · pre-built workflows
Communication	15%	Tone, pacing · engaging delivery, stories, analogies
Consultative Approach	15%	Proactive insights · clear recommendations · anchoring takeaways
Presentation	10%	Relevance of what's shown · narrative cohesion
Touchbase on Pain	5%	Surfacing + addressing pains throughout, not just at start
Audience Engagement	5%	Personalization (name, industry examples) · interactivity

Weights change automatically based on call type; see "How scoring adapts to call type" below.

2. How the final score is computed

Same formula as the original Excel sheets:

final = Σ over criteria of (weight_pct / 100) × avg(sub_scores)

Worked example — a demo call with these sub-score averages:

Solution Skills: 3.75 → contributes 0.30 × 3.75 = 1.125
Craftsmanship: 4.0 → contributes 0.20 × 4.0 = 0.800
Communication: 4.0 → contributes 0.15 × 4.0 = 0.600
Consultative Approach: 3.33 → contributes 0.15 × 3.33 = 0.500
Presentation: 3.75 → contributes 0.10 × 3.75 = 0.375
Touchbase on Pain: 3.75 → contributes 0.05 × 3.75 = 0.188
Audience Engagement: 4.0 → contributes 0.05 × 4.0 = 0.200

Final = 3.79 / 5. That's the big number on your dashboard.
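As a rough illustration, the formula and worked example above can be reproduced in a few lines of Python. This is a minimal sketch, not the production scoring code: the names DEMO_WEIGHTS, final_score, and example_averages are hypothetical, and the weights are copied from the Demo column of the rubric table.

```python
# Demo-call weights in percent, copied from the rubric table in section 1.
DEMO_WEIGHTS = {
    "Solution Skills": 30,
    "Craftsmanship": 20,
    "Communication": 15,
    "Consultative Approach": 15,
    "Presentation": 10,
    "Touchbase on Pain": 5,
    "Audience Engagement": 5,
}


def final_score(weights, criterion_averages):
    """Sum of (weight_pct / 100) * criterion average, on the 0-5 scale."""
    return sum(weights[name] / 100 * criterion_averages[name] for name in weights)


# Sub-score averages from the worked example above.
example_averages = {
    "Solution Skills": 3.75,
    "Craftsmanship": 4.0,
    "Communication": 4.0,
    "Consultative Approach": 3.33,
    "Presentation": 3.75,
    "Touchbase on Pain": 3.75,
    "Audience Engagement": 4.0,
}

print(round(final_score(DEMO_WEIGHTS, example_averages), 2))  # 3.79
```

Rounding is applied only once, at the end, so the result matches the 3.79 shown on the dashboard.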

Weight rescaling for unassessable criteria. If an entire criterion ends up unscorable (e.g. all of Craftsmanship was visual-only on a call with no screen description), it drops out and the remaining weights are rescaled to sum to 100. Your final score isn't artificially deflated by the missing criterion: you're scored on what was actually assessable from the transcript. See "Audio-only scoring" for the full rule.
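The rescaling rule can be sketched as follows. This is an illustration under stated assumptions rather than the actual implementation: the function names are hypothetical, an unscorable criterion is represented as None, and the example reuses the demo weights and worked-example averages with Craftsmanship removed.

```python
def rescale_weights(weights, criterion_averages):
    """Drop criteria whose average is None; rescale the rest to sum to 100."""
    kept = {
        name: pct
        for name, pct in weights.items()
        if criterion_averages.get(name) is not None
    }
    total = sum(kept.values())
    if total == 0:
        return {}
    return {name: pct * 100 / total for name, pct in kept.items()}


def final_score_with_rescaling(weights, criterion_averages):
    """Weighted final score over assessable criteria only (None if none)."""
    rescaled = rescale_weights(weights, criterion_averages)
    if not rescaled:
        return None
    return sum(pct / 100 * criterion_averages[name] for name, pct in rescaled.items())


demo_weights = {
    "Solution Skills": 30, "Craftsmanship": 20, "Communication": 15,
    "Consultative Approach": 15, "Presentation": 10,
    "Touchbase on Pain": 5, "Audience Engagement": 5,
}
# Same averages as the worked example, but Craftsmanship was not assessable.
averages = {
    "Solution Skills": 3.75, "Craftsmanship": None, "Communication": 4.0,
    "Consultative Approach": 3.33, "Presentation": 3.75,
    "Touchbase on Pain": 3.75, "Audience Engagement": 4.0,
}

print(rescale_weights(demo_weights, averages)["Solution Skills"])  # 37.5
print(round(final_score_with_rescaling(demo_weights, averages), 2))  # 3.73
```

In this example the remaining weights total 80, so each is multiplied by 100/80. Without rescaling, the same call would total about 2.99, which is the deflation the rule is meant to avoid.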

3. Industry percentile — what P10 / P50 / P75 actually mean

We compare your final weighted score against the distribution of SaaS SE scores from industry data. Current bands (final score out of 5):

Band	Score	What it means
P95	≥ 4.5	Top 5%: world-class. SEs at this level get hired into Director / Head of SE roles.
P90	≥ 4.3	Top 10%: leadership track. Trusted to demo to C-suite and run complex POCs.
P75	≥ 3.9	Top 25%: strong, consistent performer. Trusted with the biggest deals.
P50	≥ 3.4	Median. The "typical" B2B SaaS SE. Solid but not yet differentiated.
P25	≥ 2.8	Bottom 25%: needs structured coaching. Often heavy on product walkthrough without tying back to stated outcomes.
P10	< 2.8	Bottom 10%: likely doing feature tours instead of value-led demos.

"P10" does NOT mean "10/100". It means: out of 100 SaaS SEs scored on this rubric, roughly 90 score higher than you, and you sit in the bottom 10%. P50 means 50 above, 50 below — median. P95 means only ~5 in 100 score higher — top performer.

The percentile is your rank vs the entire SaaS industry, not just SurveySparrow. A P50 score puts you on par with median SEs at Salesforce, HubSpot, Atlassian, Gong — not just this team.
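The band lookup amounts to a threshold check against the table above. A minimal sketch, assuming those thresholds and using hypothetical names (BAND_THRESHOLDS, percentile_band):

```python
# (minimum final score, band) pairs from the band table, highest first.
BAND_THRESHOLDS = [
    (4.5, "P95"),
    (4.3, "P90"),
    (3.9, "P75"),
    (3.4, "P50"),
    (2.8, "P25"),
]


def percentile_band(final_score):
    """Return the highest band whose threshold the score meets, else P10."""
    for minimum, band in BAND_THRESHOLDS:
        if final_score >= minimum:
            return band
    return "P10"


print(percentile_band(3.79))  # P50
print(percentile_band(2.5))   # P10
```

The 3.79 from the worked example in section 2 lands in P50 because it falls short of the 3.9 needed for P75.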

4. Per-criterion gap vs industry median

In the call detail view, each criterion shows a "vs median" gap — your score on that one criterion vs the SaaS-industry median for that specific criterion (not the overall final score).

Criterion	Industry median (0-5)
Communication	3.8
Presentation	3.6
Audience Engagement	3.4
Solution Skills	3.5
Consultative Approach	3.2
Touchbase on Pain	3.3
Craftsmanship	3.0

Craftsmanship median is lowest (3.0) because most SEs use generic demo environments. Personalizing yours with prospect logo + data is a fast way to leapfrog the median.

Consultative Approach (3.2) is the second-lowest — the industry over-indexes on product demonstration vs trusted-advisor positioning. SEs who bring proactive insights stand out fast.
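The "vs median" gap is a per-criterion subtraction. The sketch below assumes the medians in the table above; the function names and the example scores are hypothetical and are not real call data.

```python
# Industry medians per criterion, copied from the table above.
INDUSTRY_MEDIANS = {
    "Communication": 3.8,
    "Presentation": 3.6,
    "Audience Engagement": 3.4,
    "Solution Skills": 3.5,
    "Consultative Approach": 3.2,
    "Touchbase on Pain": 3.3,
    "Craftsmanship": 3.0,
}


def gaps_vs_median(criterion_scores):
    """Score minus industry median per criterion, skipping unscored ones."""
    return {
        name: round(score - INDUSTRY_MEDIANS[name], 2)
        for name, score in criterion_scores.items()
        if score is not None
    }


def biggest_negative_gap(criterion_scores):
    """Criterion furthest below its median, or None if nothing is below."""
    gaps = gaps_vs_median(criterion_scores)
    below = {name: gap for name, gap in gaps.items() if gap < 0}
    return min(below, key=below.get) if below else None


# Hypothetical criterion scores for one call.
scores = {"Communication": 4.0, "Craftsmanship": 2.5, "Consultative Approach": 3.0}
print(gaps_vs_median(scores))
print(biggest_negative_gap(scores))  # Craftsmanship
```

Here Craftsmanship sits 0.5 below its median, making it the largest negative gap and therefore the first place to look for coaching leverage.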

5. Where the benchmark data comes from

Current numbers are seeded estimates synthesized from publicly available sources:

Gartner SE Excellence report (annual)
PreSales Collective State of PreSales annual survey
SalesHood / Gong public demo benchmarks
Bain SaaS GTM benchmark surveys

These are reasonable starting estimates — not a live data feed. We refresh them quarterly.

Caveat: the percentile is directionally accurate, not surgically precise. P50 vs P75 is a real signal; don't agonize over P74 vs P76.

6. How scoring adapts to call type

Same 7 criteria, but weights shift and Claude gets type-specific guidance:

Call type	Heaviest weight	Type-specific guidance
Demo	Solution 30% + Craft 20%	Personalize env, tie features to pains, clear next step
Follow-up demo	Solution 25% + Consult 20%	MUST reference prior-call pains; otherwise 0 on Touchbase on Pain
Follow-up query	Consult 30% + Solution 20%	Be a trusted advisor, not an FAQ answerer
POC	Solution 35% + Craft 20%	Penalize "that's on the roadmap"; reward real workflow integration
Closure	Consult 35% + Pain 20% (Craft 0%)	Loop back to original pains. No new feature tours.
Procurement	Consult 40% + Comm 20%	Vendor security review / SOC 2 / compliance. No demo expected. Accuracy + honesty over selling.

Picking the wrong call type scores you against the wrong lens. If it was a closure call, don't pick "demo" — closure calls don't get penalized for low Craftsmanship.

7. What we won't penalize you for

The scoring rubric exists to coach you on craft — not to nitpick about things that aren't actually your job, or normal call moves that get mis-read as gaps. Based on team feedback, these explicit fairness rules now ship with v3 of the scoring prompt:

SE check-ins are facilitation, not "unaddressed pain"

When you pause to ask "any questions?", "does that make sense?", "anything I missed?", "shall I keep going or pause here?" — that's you structuring the call. The earlier version of the prompt sometimes flagged these as pain points the SE failed to address. Fixed. Only counts as a pain point if the prospect raised a concern / blocker / requirement and you didn't loop back to it.

AE-domain topics are not your responsibility

The following topics are owned by the Account Executive by default — the scoring no longer dings you for "not addressing" them, "not pushing on" them, or "not closing on" them:

Commercial terms / pricing / discount discussion
Contracts / SOW / paperwork
Procurement / vendor onboarding / security questionnaires (as a process)
Billing, invoicing, payment terms
Legal review / MSA / DPA

If you happen to handle these gracefully, that's a small bonus. If the AE handles them while you listen, that's normal and expected. Stepping aside for AE-domain topics is a good role boundary, not a gap.

Phased rollouts respected when discovery already scoped them

If the prospect says "Phase 1 is X, Phase 2 will be Y" or references prior planning ("as we discussed, ticketing comes in phase 2"), the analysis now assumes the phases were already defined in an earlier discovery call. It won't flag "phases need to be identified" — that's mature discovery, not a gap. (Caveat: the system only sees the current call's transcript, so it can't proactively reference what was decided in that earlier discovery call. It just won't penalize you for not re-litigating it.)

Procurement / security-review calls have their own scoring lens

When the call is a vendor security review, SOC 2 walkthrough, IT compliance call, or procurement questionnaire — you're in trusted-advisor mode, not selling mode. The new "Procurement" call type shifts the weights accordingly: Consultative Approach jumps to 40%, Solution Skills drops to 10%, Craftsmanship drops to 10%. The analysis rewards accuracy, honesty (admitting when something isn't supported instead of bluffing), and pointing the prospect to the right doc / right person. It does NOT penalize you for the absence of demo content or value-selling — neither applies on this call type.

Pick the right call type at upload time. If you select "demo" for what was actually a procurement call, you'll be scored against the wrong lens — Craftsmanship and Solution Skills will look weak through no fault of yours. Granola titles with "security review", "SOC 2", "compliance", "vendor onboarding" get auto-routed to Procurement; for manual uploads, pick it from the Call Type selector.
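The title-based auto-routing described above might look roughly like this. It is a sketch only: the real matching logic is not documented here, so case-insensitive substring matching, the keyword tuple, and the function name route_call_type are all assumptions. The keywords themselves are the four quoted in the paragraph above.

```python
# Title keywords quoted above; the production list may be longer.
PROCUREMENT_TITLE_KEYWORDS = (
    "security review",
    "soc 2",
    "compliance",
    "vendor onboarding",
)


def route_call_type(title, selected_type=None):
    """Return "Procurement" for matching titles, else the manually selected type."""
    lowered = title.lower()
    if any(keyword in lowered for keyword in PROCUREMENT_TITLE_KEYWORDS):
        return "Procurement"
    return selected_type


print(route_call_type("Acme - SOC 2 walkthrough"))        # Procurement
print(route_call_type("Acme - product demo", "Demo"))     # Demo
```

Titles that match none of the keywords fall back to whatever was picked in the Call Type selector, which is why manual uploads still need the right selection.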

8. Audio-only scoring — how we handle visual signals

The analysis only ever sees a written transcript of the audio. We never have video, screenshots, or the actual demo screen. That matters because several sub-criteria in the rubric are about things on screen:

Craftsmanship → Personalization (was the prospect's logo on screen? vertical-relevant data?)
Craftsmanship → Customization (custom dashboards, role-played personas, working integrations?)
Presentation → Relevance (does each shown artifact tie to a stated need?)
Presentation → Cohesion (visual narrative arc)

For these visual sub-criteria, Claude follows a strict rule:

Search the transcript for verbal evidence of what was on screen. "Let me pull up your dashboard with your company logo" counts. "As you can see, this is your industry's data" counts. A prospect reacting to visuals ("nice, that's our brand color") counts.
If found → score normally, with the verbal evidence as the quote.
If none found → mark the sub-criterion not_assessable, score = null. The sub is excluded from the criterion average rather than getting a penalty score.
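The exclusion rule above can be sketched in Python, with None standing in for a not_assessable sub-criterion. The function names and example scores are hypothetical; only the sub-criterion names come from the list above.

```python
def criterion_average(sub_scores):
    """Average the assessable sub-scores; None marks not_assessable.

    Returns None when every sub-criterion was not assessable, so the
    whole criterion can be dropped rather than scored as zero.
    """
    scored = [score for score in sub_scores.values() if score is not None]
    if not scored:
        return None
    return sum(scored) / len(scored)


def not_assessable_subs(sub_scores):
    """Names of sub-criteria that were excluded (for the grey banner)."""
    return [name for name, score in sub_scores.items() if score is None]


# Hypothetical call: one Presentation sub had verbal evidence, Craftsmanship had none.
presentation = {"Relevance": 4, "Cohesion": None}
craftsmanship = {"Personalization": None, "Customization": None}

print(criterion_average(presentation))     # 4.0
print(criterion_average(craftsmanship))    # None
print(not_assessable_subs(presentation))   # ['Cohesion']
```

Excluding a sub-criterion is different from scoring it 0: Presentation averages 4.0 here, not 2.0.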

On the call detail page you'll see a grey banner at the top listing which sub-criteria were excluded for this reason. If most of Craftsmanship was not_assessable, that's expected for an audio-only call where you didn't verbally describe what was on screen.

The takeaway for SEs: if you want credit for a beautifully personalized demo environment, say it out loud during the call, e.g. "let me pull up the dashboard we mocked with your logo and last quarter's NPS data." That gives the analysis verbal evidence to score against. Without it, the demo craft doesn't reach the transcript, and we won't make claims about it either way.

9. Deal-intelligence signals — what we extract beyond the score

Each call also gets a structured extract of deal-context signals — visible on the call detail page and rolled up into the manager + CEO dashboards. The score is the SE's coaching loop; the intelligence is the company's deal-loop.

Signal	What it captures
Product	Which product the conversation is primarily about: SurveySparrow (CX/feedback), ThriveSparrow (EX/employee engagement), or SparrowDesk (helpdesk/support).
Use case	1-2 sentence description of what the prospect actually wants to do, with direct quotes.
Maturity	An 8-dimension 0-3 scorecard rolling into a band: Form / Basic · Low Maturity · Potential High · High. The scope (CX or EX) is set by the product. ThriveSparrow conversations get EX maturity; the others get CX.
Features discussed	Capabilities already in our product that came up — demoed, mentioned, or that the prospect asked about and we have. Most product-feature talk goes here.
Feature requests / gaps	Things we don't have that the prospect asked for, or that our team admitted are missing or on the roadmap. Tagged blocker / nice-to-have / mentioned. Only true gaps end up here, not existing capabilities.
Competitors	Other vendors named, with context: evaluated, currently using, dismissed.
Trial issues	Things that broke or were confusing during a trial, with severity.
Loss-risk signals	No-reference-customer asks, support quality concerns, pricing pushback, product-gap concerns.
AE behavior	How many times the AE interrupted the SE mid-value, plus an impact verdict.
Selling approach	Product-led vs outcome-led vs balanced — describes the approach the call lent itself to, not a label on the SE. The metric reflects framing on this specific call, not a permanent style.
Prospect engagement	Overall sentiment, buying signals, objections.

Features discussed vs feature requests — the bright line: if you demoed it or it already exists in the platform, it's discussed. If the prospect explicitly said they need something we don't have, or your team said "that's not available today" — it's a request. The split exists because product/engineering doesn't want to wade through 50 "feature requests" that are actually existing capabilities the prospect just hadn't seen yet.
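The bright line above reduces to a small decision rule. The sketch below is illustrative only: in practice the split is made by the analysis reading the transcript, and the function name and boolean inputs are hypothetical simplifications.

```python
def classify_feature_mention(demoed, exists_in_product,
                             prospect_asked, team_said_unavailable):
    """Apply the discussed-vs-request bright line to one feature mention.

    Returns "discussed", "request", or None when neither rule applies.
    """
    # Anything demoed or already in the platform is "discussed".
    if demoed or exists_in_product:
        return "discussed"
    # A missing capability the prospect asked for, or one the team
    # said is not available today, is a "request".
    if prospect_asked or team_said_unavailable:
        return "request"
    return None


# The prospect asks about something we already support: still "discussed".
print(classify_feature_mention(False, True, True, False))   # discussed
# The prospect needs something we don't have: "request".
print(classify_feature_mention(False, False, True, False))  # request
```

Checking for an existing capability first is what keeps "requests" for features the prospect simply hadn't seen yet out of the gap list.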

10. What's in your scorecard

Every scorecard has the following:

Final weighted score + industry percentile — the headline. Use it for trend, not for ego.
Per-criterion scores + industry-gap deltas — the diagnostic. Biggest negative gap = highest-leverage area.
Top 3 strengths + top 3 gaps — qualitative, evidence-based. Each tied to a transcript moment.
One coaching action for the month — single concrete behavior change. This is the only thing you have to do.
Not-assessable banner (when relevant) — lists which sub-criteria couldn't be scored from this transcript and were excluded from the weighted average.
Product + Maturity badges — at-a-glance: which product (SurveySparrow/ThriveSparrow/SparrowDesk) and what scope of maturity (CX or EX).
Deal-intelligence grid — product, use case, maturity, features discussed, feature requests/gaps, competitors, trial issues, SE selling style, AE behavior, prospect engagement.

One behavior change, sustained, moves you up a percentile band over a quarter. Six behavior changes attempted at once moves you nowhere.

11. FAQ

My procurement / security-review call got dinged for "no demo." Why?
That was the old behavior, fixed in v3. The system now has a dedicated "Procurement" call type with completely different weights (Consultative 40%, Solution Skills only 10%, Craftsmanship 10%, no demo expected). If your past procurement call was scored under "demo" by mistake, ask Kaushik to re-run the analysis after the deploy; the new call type will be applied automatically for titles containing "security review", "SOC 2", "vendor onboarding", etc. For manual uploads, pick "Procurement / Security" from the call-type selector.

I asked "any questions?" before closing — why was that flagged as unaddressed pain?
That was the old behavior, fixed in v3. SE check-ins like "any questions?", "does that make sense?", "anything I missed?" are now recognized as facilitation, not unaddressed pain. Only concerns the prospect raised count as pain points.

The scoring said I should have pushed harder on commercials / contracts. But that's the AE's job.
Agreed, and fixed in v3. The analysis now explicitly excludes AE-domain topics (pricing, contracts, paperwork, procurement process, billing, legal) from SE evaluation. Stepping aside when the AE handles those is a good role boundary, not a gap.

The system says I spoke only 5-6 lines on a call I drove the discovery for. Why?
This is a Granola transcript-fidelity issue, not analysis logic. Granola only distinguishes its account-owner's microphone from "everyone else mixed together." If you weren't the Granola account owner on that call (e.g. you joined someone else's calendar invite), your turns get lumped with the prospect/AE in a single "speaker" track and we can't recover the attribution. The fix is operational: be the Granola account owner on your own calls. For calls where this fails, paste a richer per-speaker transcript via the Upload flow instead.

Why is "Craftsmanship" missing or partial on my call?
Most of Craftsmanship is about what's on screen (your logo on the dashboard, custom data, working integrations). The analysis only sees the audio transcript — so those sub-criteria are only scored when you (or the prospect) verbally referenced them on the call. The grey banner at the top of your call detail page lists which sub-criteria couldn't be scored. Those are excluded from the weighted average — they don't drag your score down. Solution: narrate your craft out loud during demos.

I demoed feature X — why is it in "Features discussed" instead of "Feature requests"?
Because the prospect didn't say it's missing. Feature requests / gaps only contains things we don't have or that the team admitted are unavailable. Anything you demoed, or that the prospect explored and we support, belongs in features discussed. This split exists because product/engineering needs a clean list of true gaps, not 50 "requests" that are actually existing capabilities.

Why does the call say "ThriveSparrow" / "SparrowDesk" / "Unknown" as the product?
The product field is inferred from what the conversation is about. NPS / customer feedback / journeys → SurveySparrow. eNPS / engagement / 360 reviews → ThriveSparrow. Ticketing / helpdesk → SparrowDesk. Multi-product conversations get the primary one in the badge with the others in the call detail. If it looks wrong, the transcript probably didn't have strong enough signal — paste a clearer transcript via Upload or check the AE's qualifying notes.

What's the "Maturity (CX/EX)" badge?
The 8-dimension maturity framework still works the same, but the scope (CX vs EX) is now stated explicitly because ThriveSparrow conversations are about employee experience, not customer experience. SurveySparrow and SparrowDesk conversations are scoped as CX; ThriveSparrow conversations are scoped as EX. The bands are the same.

Why do Granola-sourced calls have a yellow "AE behavior is inferred" banner?
Granola records the account owner's microphone (normally the SE's) separately but mixes the AE and prospect into a single "other" audio track. So when we label speaker turns, we know which lines are the SE's, but we can't directly distinguish AE turns from prospect turns. Claude infers attribution from content patterns (AEs talk about pricing/timeline, prospects ask about features), which is accurate enough for direction but not for surgical metrics like exact interruption counts. For calls where AE quality is the question being asked, paste a richer per-speaker transcript via the Upload flow.

Why bands (P10, P25...) not exact numbers like P67?
Because the underlying benchmark data is itself imprecise. Fake precision would mislead.

My score went down month-over-month. Did I get worse?
Maybe — or you took harder deals, or it's normal variance. Look at the 3-month trend, not month-to-month wobble.

I disagree with a score. What do I do?
Reply to your coaching email. Every score has a transcript quote as evidence — that's the conversation starter.

Will my AE see my scores?
No. SEs see only their own scores. Managers see the team. CEO sees aggregates. AEs do not have portal access.

My old calls were scored under the older prompt — will they get updated?
Admins can trigger a "Re-analyze under current prompts" run from the Team page. It re-scores every call on an older prompt version using the latest rubric, the visual-evidence rule, and the features split. Score quotes and dates are preserved; only the scores + insights are refreshed.

12. For managers — how to use this

Don't lead with percentile in 1:1s. Lead with the single coaching action.
Watch the trend, not the snapshot. A great SE having a bad month is a coaching conversation. Three months of decline is a process intervention.
Use per-criterion gaps to spot team-level patterns. If 6 of 8 SEs are low on Craftsmanship, that's an enablement problem.
Demo of the Month rewards sustained quality. Min 2 demo-class calls to be eligible — prevents single-call winners.