How to Create a Scorecard From Training Calls
A training call scorecard converts what supervisors hear in call reviews into a consistent, repeatable measurement system. Without one, coaching is subjective and skill gaps get flagged based on whichever calls a supervisor happened to hear. With one, you have a structured framework that every evaluator applies the same way, making it possible to compare performance across agents, over time, and across different call types. This guide walks through how to build one that actually reflects what good performance looks like at your organization.

Step 1: Define What You're Measuring and Why

Start with your training objectives, not with generic call center categories. If your program is designed to build consultative selling skills, your scorecard should measure the behaviors that drive consultative selling. If you're building compliance habits in a regulated industry, compliance criteria should be weighted most heavily.

Common training call categories to consider:

- Introduction quality: Did the agent open the call correctly and set the right expectations?
- Active listening and engagement: Did the agent ask clarifying questions? Did they reflect back what the customer said?
- Product knowledge: Did the agent accurately describe the product, service, or process?
- Objection handling: How did the agent respond to pushback or resistance?
- Closure: Did the agent confirm next steps, summarize the outcome, and end professionally?

Limit your scorecard to five to eight criteria. More than that and evaluators will struggle to apply consistent judgment across a full call.

What criteria matter most for a training call scorecard?

Prioritize criteria that directly reflect your training curriculum. If week three of your onboarding program covers objection handling, that criterion should carry significant weight in the scorecard used during that period. The scorecard should evolve as the training program progresses.

Step 2: Weight the Criteria

Not all criteria deserve equal weight. A compliance statement in a regulated industry might be worth 30% on its own. Active listening might be worth 15%. The weights signal to agents and evaluators what matters most.

Set weights as percentages that sum to 100%. A reasonable starting distribution for a general customer service training scorecard:

Criterion | Weight
Opening and introduction | 15%
Active listening | 20%
Product knowledge | 25%
Objection handling | 20%
Closure and follow-through | 20%

Review these weights with your training leads before locking them in. The first version is always a hypothesis. You'll calibrate after scoring actual calls.

Step 3: Define What "Good" and "Poor" Look Like

This step is where most scorecards fail. Criteria names without behavioral anchors produce inconsistent scoring. Two evaluators will interpret "active listening" differently unless you've defined what it looks like at the exemplary level and what it looks like at the deficient level.

For each criterion, write a short description of both extremes. For "active listening":

- Exemplary: Agent asks at least one clarifying question, reflects back the customer's main concern in their own words before responding, and acknowledges emotional tone before pivoting to resolution.
- Deficient: Agent moves directly to resolution without confirming what the customer said, doesn't acknowledge frustration, and doesn't ask any clarifying questions.

These anchors are what allow AI-assisted QA platforms to score intent rather than just checking whether specific words were used.
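To make the weighting and anchoring steps concrete, here is a minimal sketch of how a scorecard definition and its weighted roll-up might look in code. The criterion names and weights come from the sample table above; the 0-100 per-criterion scale, the anchor text, and all function names are illustrative assumptions, not any platform's actual schema.

```python
# Minimal sketch of a weighted scorecard definition with behavioral anchors.
# Weights mirror the sample table above; the 0-100 scale is an assumption.
CRITERIA = {
    "opening_and_introduction": {"weight": 0.15,
        "exemplary": "States purpose and sets expectations for the call",
        "deficient": "Launches into the script with no expectation-setting"},
    "active_listening": {"weight": 0.20,
        "exemplary": "Asks a clarifying question and reflects the concern back",
        "deficient": "Moves straight to resolution without confirming the concern"},
    "product_knowledge": {"weight": 0.25,
        "exemplary": "Describes the product, service, or process accurately",
        "deficient": "Gives incorrect or vague product information"},
    "objection_handling": {"weight": 0.20,
        "exemplary": "Acknowledges pushback and addresses the underlying concern",
        "deficient": "Ignores or talks over resistance"},
    "closure_and_follow_through": {"weight": 0.20,
        "exemplary": "Summarizes the outcome and confirms next steps",
        "deficient": "Ends the call without agreed next steps"},
}

# Weights must sum to 100%.
assert abs(sum(c["weight"] for c in CRITERIA.values()) - 1.0) < 1e-9

def weighted_call_score(criterion_scores: dict) -> float:
    """Roll per-criterion scores (0-100) into one weighted call score."""
    return sum(CRITERIA[name]["weight"] * score
               for name, score in criterion_scores.items())

example = {"opening_and_introduction": 80, "active_listening": 55,
           "product_knowledge": 90, "objection_handling": 60,
           "closure_and_follow_through": 75}
print(round(weighted_call_score(example), 1))  # 72.5
```

Keeping anchors next to weights in one definition makes the rubric reviewable in a single place before it is handed to evaluators or an automated scorer.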
Insight7's weighted criteria system includes a "context" column where you define what great and poor look like per criterion. Without this context, automated scores diverge from human judgment. With it, the platform calibrates within four to six weeks to match how your best evaluators score calls.

How does AI scoring work with training call scorecards?

AI scoring applies your defined criteria and behavioral anchors to every call, not just the ones a supervisor had time to review. Manual QA typically covers 3 to 10% of calls. Automated scoring covers 100%, so you're making training decisions based on the full picture rather than a sample. Every score links back to the specific transcript quote that triggered it, so agents can see exactly what the evaluation is based on.

Step 4: Pilot on a Representative Sample

Before using the scorecard in official training evaluations, have two or three evaluators independently score 15 to 20 calls. Then compare scores. If your calibration gap is more than 15 points on a criterion, the criterion definition needs refinement. Ask the evaluators where they disagreed and why. The answer usually reveals that the criterion was interpreted differently because the behavioral anchors weren't specific enough.

Run at least one calibration cycle before using the scorecard for performance tracking. The goal is for two independent evaluators to arrive within 10 points of each other on most calls.

Step 5: Build in a Feedback Mechanism

The scorecard creates data. That data is only useful if it flows back to agents in a way that drives improvement. Each scored call should generate a report the agent can review: which criteria scored low, what transcript moments triggered those scores, and what they could have done differently. Insight7's agent scorecard system clusters multiple calls into one view per rep per period, showing average performance with drill-down into individual calls.

For training programs specifically, this feedback loop closes the gap between classroom learning and live call application. An agent who completed a module on objection handling last week can see whether that skill is appearing in their actual calls.

If/Then Decision Framework

Situation | Action
Two evaluators consistently disagree on a criterion | Rewrite the behavioral anchors to be more specific
Scores are high but customer outcomes are poor | Review whether criteria are measuring the right behaviors
Scores improved in training calls but not in live calls | Check whether scenarios are sufficiently close to real call conditions
Agents improve on scored criteria but miss unscored behaviors | Add criteria or rebalance weights in the next scorecard version

Common Mistakes to Avoid

Scoring too many criteria. A scorecard with 12 criteria is difficult to apply consistently. Focus on the behaviors that most directly predict the outcomes you're training toward.

Static scorecards. Training programs evolve. Scorecards should be reviewed and updated when the training curriculum changes. A scorecard that doesn't match what you're currently teaching gives agents conflicting signals about what to prioritize.
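To make the Step 4 calibration check above concrete, here is a minimal sketch of comparing two evaluators' pilot scores. The 15-point refinement threshold and 10-point target come from the article; the 0-100 scale, data shapes, and function names are illustrative assumptions.

```python
# Minimal sketch of the Step 4 calibration check: average absolute gap per
# criterion between two evaluators across the pilot calls (0-100 scale assumed).
from statistics import mean

def calibration_gaps(evaluator_a: dict, evaluator_b: dict) -> dict:
    """Each argument maps call_id -> {criterion: score} for the same pilot calls."""
    gaps = {}
    for call_id, scores_a in evaluator_a.items():
        for criterion, a in scores_a.items():
            b = evaluator_b[call_id][criterion]
            gaps.setdefault(criterion, []).append(abs(a - b))
    return {criterion: round(mean(diffs), 1) for criterion, diffs in gaps.items()}

def needs_refinement(gaps: dict, threshold: float = 15.0) -> list:
    """Criteria whose behavioral anchors should be rewritten before go-live."""
    return [criterion for criterion, gap in gaps.items() if gap > threshold]
```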
How to Create a Scorecard From Onboarding Calls
Most onboarding scorecards measure the wrong things. They capture whether a rep followed the agenda, but they miss whether the new hire can actually do the job independently after week one. AI roleplay training for onboarding changes that by turning scorecard data into practice simulations before problems show up in real calls.

Why Onboarding Scorecards Fall Short Without Roleplay

A scorecard tells you what happened. It does not tell you what to do next. When a new hire scores 62% on objection handling in their first week of onboarding calls, that number is useful only if it triggers a practice session before the rep takes more live calls.

The gap between assessment and reinforcement is where onboarding fails. Traditional scorecards document performance, but the coaching response is delayed by schedule constraints, manager bandwidth, or simple inertia. Reps who struggle in onboarding calls repeat the same errors until a manager has time to run a practice session, which can take days or weeks.

AI roleplay training closes this loop by converting scorecard gaps directly into simulation scenarios. The moment a criterion fails, a targeted practice session can be generated and assigned automatically.

How does AI roleplay improve onboarding outcomes?

AI roleplay accelerates onboarding by giving new hires unlimited low-stakes practice before they handle real customer interactions. Instead of learning by making mistakes on live calls, reps rehearse objections, pricing conversations, and escalation scenarios in a simulated environment. Research from Virtway shows that AI-powered simulations reduce time-to-competency by enabling reps to repeat scenarios until they pass a configured threshold, rather than waiting for manager-led sessions.

Step 1 — Build the Scorecard Criteria from Real Onboarding Calls

The most effective onboarding scorecards are built from actual call data, not from guesses about what good looks like. Pull 20 to 30 completed onboarding calls and identify the criteria that separate strong performers from struggling ones.

Common onboarding scorecard criteria include:

Dimension | What to Measure | Weight
Introduction and agenda-setting | Rep confirms purpose and sets expectations | 15%
Product knowledge accuracy | Correct answers to product questions | 25%
Objection handling | Addresses concerns without escalating | 25%
Next step clarity | Clear action items agreed before call ends | 20%
Tone and pace | Confidence, not scripted or rushed | 15%

Keep criteria tied to observable behaviors. Avoid vague dimensions like "professionalism" unless you can define exactly what a score of 3 versus 5 looks like.

Insight7 uses a weighted criteria system where each dimension includes a "what good looks like" and "what poor looks like" context column. This eliminates ambiguity and allows automated evaluation to align with human judgment. Criteria tuning to match human QA typically takes four to six weeks.

Step 2 — Automate Evaluation Across All Onboarding Calls

Reviewing every onboarding call manually is not scalable. QA teams typically review three to ten percent of calls manually. Automated evaluation extends coverage to 100% of onboarding calls, which matters especially during high-volume hiring periods.

Set up automated scoring with evidence-backed outputs. Every criterion should link back to the exact quote in the transcript that drove the score, so coaches can review the moment rather than re-listening to the full call.
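As an illustration of what an evidence-backed output might look like, here is a minimal sketch of a criterion score record that carries its transcript quote and timestamp. The structure, field names, and example values are illustrative assumptions, not any platform's actual schema.

```python
# Hypothetical structure for an evidence-backed criterion score: every score
# carries the transcript quote and timestamp that produced it, so a coach can
# jump to the moment instead of re-listening to the whole call.
from dataclasses import dataclass

@dataclass
class CriterionScore:
    call_id: str
    rep: str
    criterion: str
    score: float            # 0-100, an assumed scale
    evidence_quote: str     # exact transcript excerpt that drove the score
    timestamp_seconds: int  # where in the recording the quote occurs

low = CriterionScore(
    call_id="onboarding-0412", rep="new_hire_07",
    criterion="objection_handling", score=48.0,
    evidence_quote="I understand, let me just send over the contract anyway.",
    timestamp_seconds=512,
)
print(f"{low.criterion}: {low.score} -> review at "
      f"{low.timestamp_seconds // 60}m{low.timestamp_seconds % 60:02d}s")
```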
For onboarding, focus alerts on two triggers:

- Scorecard threshold alerts: Any new hire scoring below a configured threshold on a key criterion triggers a coaching notification.
- Compliance alerts: Required disclosures or policy statements missed in an onboarding call flag immediately, not at the next weekly review.

Insight7's alert system supports delivery via email, Slack, or Teams, so managers receive real-time notifications without checking the platform manually.

Step 3 — Map Scorecard Gaps to Roleplay Scenarios

This is where the scorecard becomes actionable. Each criterion that consistently scores below target should have a corresponding roleplay scenario that the rep can practice immediately.

What should a roleplay scenario include for onboarding?

A well-configured onboarding roleplay scenario includes a customer persona, a specific challenge the rep must navigate, and evaluation criteria that match the scorecard. For example, if "objection handling" is the failing criterion, the persona should be a skeptical buyer who raises the most common objection your new hires encounter.

Platforms like Second Nature and Mindtickle offer roleplay simulation tools. Insight7 takes a different approach: roleplay scenarios can be generated directly from real call transcripts, so the hardest actual closes from your top reps become the objection-handling templates new hires practice against. This grounds training in reality rather than hypothetical scenarios.

Persona configuration matters. Effective onboarding simulations include:

- Customer name, job title, and communication style
- Emotional tone (skeptical, friendly, impatient)
- Specific objections pre-loaded into the simulation
- A pass threshold that new hires must reach before the scenario is considered complete

Reps can retake sessions unlimited times, with scores tracked over time to show improvement trajectory.

Step 4 — Use Auto-Suggested Training to Reduce Manager Overhead

Manual training assignment creates bottlenecks. Managers reviewing QA scores and deciding which rep needs which scenario is a manual process that delays coaching by days. Auto-suggested training removes this bottleneck. When QA scoring identifies a criterion gap, the platform generates a practice scenario and queues it for manager approval. The manager reviews and approves, the rep receives the assignment, and the loop closes without requiring the manager to design the training.

Insight7 builds this into the coaching workflow: supervisors approve auto-suggested sessions before deployment, maintaining human oversight while eliminating the design burden. Fresh Prints, an existing Insight7 customer, expanded from QA to the coaching module precisely because of this connection: "When I give them a thing to work on, they can actually practice it right away rather than wait for the next week's call," according to their QA lead.

Step 5 — Track Score Improvement Over Onboarding Milestones

A scorecard without trend data tells you where a rep is today. Trend data tells you whether the onboarding program is working. Set milestones for onboarding cohorts: week one baseline, week two after first roleplay sessions, week four before independent call handling. Track average scores per criterion across each milestone to see which training interventions moved the needle.
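Here is a minimal sketch of the Step 5 milestone tracking: average per-criterion scores for an onboarding cohort at each milestone. The milestone labels come from the article; the nested data shape and function name are illustrative assumptions.

```python
# Sketch of Step 5: average per-criterion scores for an onboarding cohort at
# each milestone, so you can see which interventions moved which dimension.
from statistics import mean

MILESTONES = ["week_1_baseline", "week_2_post_roleplay", "week_4_pre_independent"]

def cohort_trend(scores: dict) -> dict:
    """scores: {milestone: {rep: {criterion: score}}} -> {milestone: {criterion: avg}}"""
    trend = {}
    for milestone in MILESTONES:
        per_criterion = {}
        for rep_scores in scores.get(milestone, {}).values():
            for criterion, value in rep_scores.items():
                per_criterion.setdefault(criterion, []).append(value)
        trend[milestone] = {c: round(mean(v), 1) for c, v in per_criterion.items()}
    return trend
```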
Best AI Tools for Evaluating Sales Training Impact
Best AI Tools for Evaluating Sales Training Impact in 2026

Sales managers who invest in training programs without a measurement system are essentially flying blind. You can run role-plays, deliver workshops, and build playbooks, but if you cannot connect those activities to rep behavior on calls, you have no way to know what worked. This guide covers the AI tools that actually measure whether training translates to behavior change on real calls, not just quiz scores or completion rates.

The question most people arrive at this page with is: which AI tools measure whether sales training actually changed how reps sell? That is a different question than "what are the best sales training tools." Most tools teach. Few measure outcomes. This guide focuses on the measurement side.

What does AI-assisted sales training impact evaluation actually measure?

Effective evaluation measures three things: behavior change on real calls after training (did reps start handling price objections differently?), skill progression over time (are scores improving session over session?), and business outcome correlation (do reps who completed training close at higher rates?). Tools that only measure completion rates or post-training quiz scores miss the first two entirely.

What are the top AI platforms with call-based training impact measurement?

Insight7 analyzes 100% of recorded calls against configurable evaluation criteria, then connects those scores to individual reps and tracks changes before and after training interventions. The AI coaching module lets reps practice scenarios built from real objection data extracted from your actual call library, not generic templates. TripleTen, an AI education company, processes 6,000+ learning coach calls per month through Insight7 for the cost equivalent of one US-based project manager. Reps can retake practice sessions unlimited times, with score trajectories tracked over time to show actual improvement curves.

Gong focuses on B2B enterprise sales cycles, surfacing deal intelligence and rep-level call scorecards. Strong for complex sales environments with longer cycles. Less suited for one-call-close or high-volume consumer scenarios, where Insight7 positions itself as the stronger fit.

Chorus (ZoomInfo) offers conversation intelligence with coaching features tied to its CRM integrations. Its evaluation criteria are less configurable for custom rubrics compared to platforms built specifically for QA-first use cases.

Salesloft Coaching embeds coaching workflows inside its sales engagement platform. Good if your team already runs cadences in Salesloft. Weaker as a standalone measurement tool for training impact.

MindTickle is purpose-built for sales readiness: training content delivery, role-play scoring, and field readiness certifications. Stronger on the structured training side than on real-call analytics.

Highspot combines content management, training delivery, and analytics. Its analytics layer measures content engagement and training completion but does not provide deep call-level behavioral scoring.

Lessonly (Seismic) focuses on learning delivery and knowledge checks. It is not a call analytics tool; it requires a separate speech analytics platform to close the evaluation loop.

If/Then Decision Framework

If your primary problem is knowing whether reps apply training on actual calls, then use a platform with call analytics and QA scoring like Insight7 or Gong that compares pre- and post-training call scores.
If you run B2B enterprise sales with long cycles and need deal intelligence alongside coaching, then use Gong for its revenue intelligence features.

If you need a full readiness curriculum with certifications and structured learning paths, then use MindTickle as your training layer and pair it with a call analytics tool for outcome measurement.

If your team is already on Salesloft for outreach and you need lightweight coaching, then use Salesloft Coaching to avoid adding another tool.

If you run a high-volume contact center with QA requirements alongside training (sales and service), then use Insight7 for unified QA and coaching on the same platform.

If you have a small team (under 20 reps) and need affordable per-user pricing, then note that Insight7's AI coaching starts at approximately $9/user/month at scale, with QA on a minutes-based plan from approximately $699/month.

How to Use AI to Measure Sales Training Impact: A Practical Sequence

Step 1: Establish a baseline before training. Pull 30 days of call scores per rep before you run any training intervention. Save the average per-rep score on the specific criteria you plan to improve. Without a baseline, you cannot measure change.

Step 2: Connect training content to specific evaluation criteria. If your training addresses price objection handling, configure your call evaluation rubric to score that criterion explicitly. Vague criteria like "handles objections well" produce scores that do not isolate the thing you trained.

Step 3: Assign practice sessions built from your real call library. Insight7's AI coaching module generates practice scenarios from actual calls, so reps rehearse the specific objections they will encounter in your market. This closes the gap between generic training content and real-world situations.

Step 4: Score reps for 30 days post-training. Compare per-criterion scores before and after. A well-configured rubric will show whether the specific skill you trained improved, stayed flat, or regressed.

Step 5: Track score trajectories, not just snapshots. Fresh Prints expanded from call QA to AI coaching on Insight7. Their QA lead noted: "When I give them a thing to work on, they can actually practice it right away rather than wait for the next week's call." Score tracking showed rep improvement over successive practice sessions, giving managers evidence of genuine skill development rather than one-time performance.

How do you measure the ROI of sales training?

ROI measurement requires connecting training activity to pipeline or revenue data. The most defensible method: segment reps who completed training versus those who did not, control for tenure and territory, and compare close rates, average deal size, and call-to-meeting conversion over 90 days. Call analytics platforms provide the behavioral data layer; your CRM provides the outcome data. Combining both is how you move from "training feels useful" to "training produced X% lift in close rate."

What is the difference between training evaluation and performance management?

Training evaluation measures whether a specific intervention changed a specific behavior. Performance management measures outcomes over time and drives accountability. They use the same data but serve different decisions. Training evaluation informs what to train next and whether a program earned its budget.
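Here is a minimal sketch of the ROI comparison described above: close rate for reps who completed training versus those who did not, over the same 90-day window. Field names and the record shape are assumptions; controlling for tenure and territory, as the article recommends, would require segmenting these groups further before comparing.

```python
# Sketch of the ROI segmentation: percentage-point close-rate lift for trained
# reps versus an untrained control group, from a simple CRM deal export.
def close_rate(deals: list) -> float:
    """deals: list of {'rep': str, 'won': bool} records from the CRM export."""
    return sum(d["won"] for d in deals) / len(deals) if deals else 0.0

def training_lift(deals: list, trained_reps: set) -> float:
    trained = [d for d in deals if d["rep"] in trained_reps]
    control = [d for d in deals if d["rep"] not in trained_reps]
    return close_rate(trained) - close_rate(control)  # percentage-point lift
```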
Training to Improve Sales Performance for Smarter Results
Sales training programs that rely on scheduled sessions miss the most valuable intervention window: the moment immediately after a rep's performance signal appears in call data. This guide covers how to build automated nudge systems that trigger targeted practice from actual call performance, and what metrics to track to show the ROI.

Step 1: Establish Automated QA Scoring Across All Calls

Performance-signal-based training starts with measurement. Manual QA review at 5-10% call coverage misses too many signals to be a reliable trigger mechanism. You need consistent scoring data across a high percentage of calls to detect individual performance patterns.

Insight7's automated QA scoring evaluates every call against configurable weighted criteria. Each criterion generates a score with an evidence link: the specific quote and timestamp that supports the score. That evidence is what converts a generic score into actionable coaching feedback.

The configurable criteria matter because generic scoring produces generic training nudges. If your scoring criteria reflect what actually matters for your sales process (objection handling rate, script adherence, solution matching), your performance signals will point to the right training interventions. A criterion measuring "did the rep pivot to alternatives when the customer objected on price" generates more useful signals than a general "communication quality" rating.

What is the 70/30 rule in sales?

The 70/30 rule in sales refers to the proportion of time a rep should listen versus talk: 70% listening, 30% speaking. AI platforms can measure this ratio from call recordings and trigger a nudge when a rep consistently over-talks, routing the rep to an active listening scenario. Performance-signal-based training is most powerful when criteria map directly to observable conversation behaviors that cause or prevent sales outcomes.

Step 2: Define the Signal-to-Nudge Routing Logic

A performance signal without a routing rule is just a data point. The routing logic determines what happens when a signal crosses a threshold: which practice scenario is triggered, who approves the assignment, and how urgency is communicated to the rep.

Define thresholds: For each scoring criterion, set a threshold that triggers a nudge. A useful starting point: if a rep scores below 60% on a specific criterion across three consecutive calls, assign the targeted scenario for that criterion. Thresholds should be calibrated to your team's baseline, not set arbitrarily.

Map criteria to scenarios: Build a library of practice scenarios that correspond to each scoring criterion. The mapping should be specific: a low score on "open-ended questioning" routes to an open-ended questioning scenario, not a generic communication scenario.

Build supervisor approval into the flow: Insight7's auto-suggested training workflow routes scenario recommendations to supervisors for one-click approval. This keeps supervisors in the loop without requiring them to manually identify what each agent needs. The key is reducing the manual step between "system identified a gap" and "agent receives a practice assignment."

According to Forbes on micro-learning in sales training, short targeted practice sessions integrated into the workflow outperform scheduled group training for skill development in sales roles. The mechanism is timing: practice immediately after a performance signal is more effective than practice weeks later in a scheduled session.
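Here is a minimal sketch of the Step 2 routing rule: a rep scoring below 60% on a criterion across three consecutive calls gets the scenario mapped to that criterion, queued for supervisor approval. The threshold, consecutive-call count, and criterion-to-scenario mapping idea mirror the text above; the specific names, data shapes, and status values are illustrative assumptions.

```python
# Sketch of the signal-to-nudge routing rule: below 60% on a criterion across
# three consecutive calls triggers the mapped scenario, pending approval.
SCENARIO_MAP = {
    "open_ended_questioning": "scenario_open_ended_questioning",
    "price_objection_pivot": "scenario_price_objection_alternatives",
}
THRESHOLD = 60.0
CONSECUTIVE_CALLS = 3

def nudges_for_rep(recent_scores: dict) -> list:
    """recent_scores: {criterion: [scores, newest last]} -> scenarios to queue."""
    queued = []
    for criterion, scores in recent_scores.items():
        last = scores[-CONSECUTIVE_CALLS:]
        if (criterion in SCENARIO_MAP
                and len(last) == CONSECUTIVE_CALLS
                and all(s < THRESHOLD for s in last)):
            queued.append({"criterion": criterion,
                           "scenario": SCENARIO_MAP[criterion],
                           "status": "awaiting_supervisor_approval"})
    return queued
```

Requiring three consecutive low scores rather than a single miss keeps one bad call from triggering a nudge, which matches the article's advice to calibrate thresholds to the team's baseline.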
What is the 3 3 3 rule in sales training?

The 3 3 3 rule is a practice spacing framework: reps practice three key scenarios, three times each, across three time periods. AI training platforms support this naturally. Insight7 tracks scores across unlimited session retakes, showing the improvement trajectory from first attempt to proficiency threshold. The spacing builds retention through repetition rather than massed practice in a single session.

Step 3: Deploy Scenarios and Track Completion

An assigned scenario that does not get completed is a failed intervention. Three things drive completion: the rep understands why the scenario is relevant to their specific gap, the scenario is short enough to complete between calls, and there is a clear completion goal rather than open-ended practice.

Insight7's scenarios can be built from real call recordings, so the practice situations reflect actual customer language rather than generic scripts. Reps recognize the scenario as relevant because it resembles the calls they actually handle. Completion rates increase when practice scenarios feel like real preparation rather than training for its own sake.

Track completion, not just assignment. Knowing a scenario was assigned tells you about training administration. Knowing it was completed and scored tells you about skill development. Insight7's dashboard shows completion status, session scores, and improvement trajectories per rep and per scenario.

Fresh Prints activated Insight7's AI coaching module to connect QA feedback to immediate practice scenarios. Their QA lead described the core benefit: agents practice the specific feedback they received the same day rather than waiting until next week's scheduled coaching session.

Step 4: Measure Impact on Call Performance Metrics

Nudge systems need outcome measurement to justify continued investment. The three metrics most directly connected to automated training nudges are QA score improvement, close rate on targeted call types, and ramp time for new reps.

QA score improvement speed. Track the time from first scenario assignment to score improvement on the targeted criterion. Manual training programs typically show improvement over months. Performance-signal training with immediate practice should compress that to weeks.

Close rate segmented by skill area. If nudges target price objection handling, track close rates on calls featuring price objections before and after the training intervention. This is the most direct attribution measure.

Common mistake to avoid: tracking scenario completion rate as the primary KPI. Completion is a leading indicator, not an outcome metric. The outcome metric is call performance improvement on the criteria that triggered the scenario assignment.

If/Then Decision Framework

If your QA data shows recurring gaps on the same criteria for the same reps, but coaching sessions are not moving the numbers, then the problem is delay between signal and practice, not coaching quality.

If your sales managers spend most of their coaching time identifying what to work on rather than practicing skills, then automated signal-routing shifts the manager role from diagnostician to development partner.

If your training program relies primarily on scheduled group sessions, then
Software for Sales Performance That Improves Outcomes
Sales performance software that actually improves outcomes is built on one capability most buyers overlook during evaluation: the ability to connect individual rep behavior patterns to revenue outcomes at scale. Generic tracking dashboards produce reports. Behavior-linked performance analytics produce coaching decisions. This guide covers what distinguishes outcome-improving sales performance software from reporting tools, and which platforms to evaluate based on your specific workflow.

What Makes Sales Performance Software Actually Improve Outcomes?

Most sales performance software fails to improve outcomes because it measures activity (calls made, emails sent, meetings booked) rather than behavior quality on those activities. Activity metrics tell you how much a rep did. Behavioral data from call analytics tells you how they did it and whether the approach produced a close.

The four capabilities that drive measurable outcome improvement are: automated behavioral scoring across the full call population, not just sampled calls; criterion-level breakdowns that tell managers exactly which behavior is underperforming; coaching workflow automation that routes reps to practice sessions without manager scheduling; and progress tracking that shows whether coaching produced score improvement on the coached criterion.

According to Gartner research on sales performance management, organizations using behavioral call analytics in coaching see faster rep ramp times than those relying on activity tracking. SQM Group benchmarks show behavior-specific coaching produces improvements that persist significantly longer than general skills training. The mechanism is coaching that targets the specific behavioral gap, not a general skill category.

How We Evaluated These Platforms

Criterion | Weighting | Why it matters
Behavioral analytics depth | 35% | Activity metrics do not explain why outcomes differ between reps; behavior data does
Coaching workflow integration | 30% | Analytics without a coaching path produces insights that do not change behavior
Outcome correlation | 20% | Software that links behavior patterns to revenue outcomes enables ROI demonstration
Ease of criterion configuration | 15% | Custom criteria aligned to your sales process outperform pre-built generic models

What are the best corporate training outcomes analytics platforms?

The best platforms for tracking corporate training outcomes combine automated behavioral scoring with before-and-after performance comparisons at the criterion level. Insight7 scores every call against custom criteria and generates per-rep scorecards that show which behaviors improved after coaching interventions. Salesforce provides activity tracking and pipeline analytics but requires integration with a call analytics tool for behavioral data.

Platform Profiles

Four platforms represent the main approaches to sales performance improvement: behavioral call analytics, revenue intelligence, CRM-native activity tracking, and content-first training delivery.

Insight7 — Best for behavioral call analytics linked to coaching

Insight7 scores 100 percent of sales calls against configurable weighted criteria, producing per-rep scorecards that show behavioral performance at the dimension level. Revenue intelligence dashboards surface close-rate drivers, objection patterns, and rep performance tiers from actual call content.
Insight7 is best suited for sales teams and contact centers where call volume is high enough that manual QA misses most behavioral patterns, typically 20+ calls per day per team.

- Weighted scoring: Each criterion is configurable with behavioral anchors defining what passing and failing look like
- Revenue intelligence: Identifies which behaviors correlate with closed deals and which patterns appear in lost opportunities
- Coaching integration: Auto-generates practice scenarios from calls that triggered low scores; reps retake until they reach threshold
- Integrations: Zoom, RingCentral, Salesforce, HubSpot, Google Meet, Microsoft Teams

Pro: The connection from QA score to practice scenario to improved score is managed in one platform without manual handoff. This closes the loop that most sales performance tools leave open. Fresh Prints used Insight7 to give reps behavior-specific practice immediately after a coaching session, enabling real-time skill development rather than waiting for the next scheduled interaction.

Con: Initial scoring calibration requires 4 to 6 weeks to align AI scores with your company's definition of "good." Teams expecting out-of-box accuracy on custom sales criteria will need to budget for that calibration period.

Insight7 is the strongest option when behavioral call data needs to drive both QA documentation and coaching content in the same workflow.

Salesforce Sales Cloud — Best for CRM-native activity tracking

Salesforce Sales Cloud provides pipeline visibility, activity logging, and forecasting at the deal level. Its Einstein AI layer surfaces conversation insights and deal health signals for Salesforce-native sales teams. Salesforce is best suited for enterprise B2B sales teams where pipeline management, deal progression, and CRM hygiene are the primary performance use cases.

Pro: Native CRM integration means activity data, deal outcomes, and forecast accuracy are all visible in one system without data export. For teams already on Salesforce, adding Einstein Conversation Insights requires no new infrastructure.

Con: Salesforce Einstein conversation features provide summaries and keywords, not behavioral rubric scoring. Teams needing criterion-level coaching data will need a dedicated call analytics integration.

Salesforce is the best choice for pipeline visibility; it is not a behavioral coaching platform without additional tooling.

Gong — Best for B2B enterprise revenue intelligence

Gong analyzes sales calls and emails to surface deal risk signals, coaching moments, and rep performance trends across complex multi-call B2B deals. Gong is best suited for enterprise B2B sales teams where deal intelligence and forecast accuracy are as important as individual rep behavioral coaching.

Pro: Deal intelligence correlates call behavior to pipeline stage progression, identifying which conversation patterns predict deal advancement rather than just measuring individual scores.

Con: Per-seat pricing at enterprise scale makes Gong expensive for high-volume contact center or transactional sales environments with short calls.

Gong is the right tool for complex B2B sales; the pricing model and feature set are not optimized for contact center environments.

HubSpot Sales Hub — Best for SMB sales teams needing CRM and basic analytics

HubSpot Sales Hub provides pipeline management, email tracking, and basic call recording with AI-generated call summaries. For teams already in HubSpot CRM, it removes the need for a separate sales performance tool for reporting purposes.
HubSpot is best suited for SMB sales teams where CRM hygiene, pipeline visibility, and basic call logging are the primary performance requirements, not deep behavioral coaching analytics.

Pro: All-in-one CRM, email, and call tracking for teams that need consolidated visibility without enterprise pricing.
Sales Performance Training That Delivers Measurable Results
Sales training directors and revenue leaders who want training to produce measurable results face a consistent problem: programs are designed around content delivery rather than behavioral outcomes, so the results are hard to verify and even harder to attribute. This six-step guide covers how to structure sales performance training so that behavioral improvement is measurable before, during, and after the program runs.

How do you measure sales training effectiveness?

Measuring sales training effectiveness requires three things: a behavioral baseline before training starts, post-training behavioral scores using the same criteria, and a connection between behavioral improvement and a revenue or operational outcome. Most programs skip the baseline, which makes the post-training measurement uninterpretable. Without knowing where reps started, a post-training score of 72 could represent a large improvement or no change at all. The measurement method has to be defined before the training runs, not after.

What is the 70/30 rule in sales?

The 70/30 rule in sales describes the ideal talk-listen ratio during a discovery or consultative selling call: reps should aim to speak approximately 30% of the time while the prospect speaks 70%. It is a behavioral guideline, not a guaranteed formula, and its value depends on call type. For one-call-close consumer scenarios, the ratio often looks different than for complex B2B deals. The point of the rule is that reps who dominate the airtime tend to miss the customer signals that would help them close. Call scoring tools can measure actual talk ratios across a rep's full call history, making it possible to see which reps are following the 70/30 principle in practice rather than only in training role plays.

Step 1: Define What "Measurable Results" Means Before Training Starts

The first decision in any sales training program is not what to teach. It is what you will measure. Revenue metrics like quota attainment and win rate matter, but they are lagging indicators that can take ninety days or more to reflect behavior change, and they are influenced by pipeline quality, pricing, and market conditions that have nothing to do with rep behavior.

Behavioral criteria are the leading indicators. Define three to five specific behaviors that, if improved, should produce better revenue outcomes. Examples include:

- Discovery question usage rate in the first five minutes of a call
- Objection reframe rate when a price objection is raised
- Next-step commitment close rate at the end of a call
- Compliance with required script elements

Write each criterion in observable, scoreable terms. A well-formed criterion can be evaluated from a call recording without ambiguity. A vague criterion like "builds rapport" cannot be scored consistently across evaluators. A specific criterion like "uses the prospect's stated priorities to frame the recommendation before quoting price" can be scored.

Avoid this common mistake: designing training content first and then trying to identify metrics that prove it worked. The measurement criteria have to drive the content design, not the other way around.

Step 2: Score Rep Call Behavior Before Training to Establish Baseline

Run pre-training call scoring for a minimum of two weeks before the program begins. Score the same criteria you defined in Step 1, using the same scoring method you will use post-training. The baseline serves two purposes.
First, it tells you which behaviors are already strong (do not spend training time on them) and which are below standard (those are the training priorities). Second, it gives you the denominator for your post-training delta calculation. Without a baseline, post-training scores have no context.

Insight7 scores 100% of calls automatically against configurable evaluation criteria, generating per-rep behavioral baselines across every scored call in the pre-training window. Manual QA programs typically cover 3-10% of calls, which means rep baselines are often drawn from a small and potentially unrepresentative sample. A complete call dataset produces a more reliable pre-training picture of where each rep actually stands.

Store baseline scores at the rep level and at the cohort level. Both are useful: rep-level baselines enable personalized training prioritization, and cohort-level baselines enable before/after comparison for the program overall.

Step 3: Design Training Content Targeting the Behaviors Below Baseline

With baseline data in hand, you can prioritize training content by behavioral need rather than by what is easiest to teach or most recently developed.

Group reps by their baseline profiles. Reps who score high on discovery questioning but low on objection handling need a different training focus than reps who score low across all criteria. Content designed for the average trainee addresses no one's actual gap. Content designed for documented behavioral gaps addresses everyone's specific need.

For each below-baseline criterion, design the training content and the practice scenario together. The practice scenario is where behavior change happens, not the instruction. According to RAIN Group's sales training research, training programs that include deliberate practice linked to specific behavioral criteria produce significantly higher skill transfer rates than lecture-and-observe formats.

Insight7 generates AI roleplay scenarios directly from real call data: the hardest objections a rep encountered in the baseline period become the scenario content for practice. This connects training directly to the specific behavioral gaps identified in Step 2.

Step 4: Run the Training with Practice Scenarios Aligned to Those Behaviors

Execute the training program. For each behavioral criterion being developed, the session structure should include: direct instruction (what the behavior is and why it matters), a modeled example (what good looks like on a real call), deliberate practice (reps execute the behavior in a structured scenario), and feedback (specific to the criterion, not general performance).

Track which reps completed each practice scenario and how many repetitions they took. Completion rate and repetition count are process metrics: they tell you whether the training was received as designed. A rep who attended the session but did not complete the practice component has not had the same training experience as one who completed it multiple times until they passed the scenario threshold.

The Brandon Hall Group's learning and development research consistently shows that practice repetition is the primary predictor of behavioral transfer from training sessions to live calls.
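Here is a minimal sketch of the comparison the Step 2 baseline makes possible: average per-criterion score in the pre-training window versus a post-training window, per rep. The windowed data shape and function name are illustrative assumptions.

```python
# Sketch of the baseline-to-post-training delta: average per-criterion score
# before training versus after, for one rep's scored calls.
from statistics import mean

def per_criterion_delta(baseline_calls: list, post_calls: list) -> dict:
    """Each argument is a list of {criterion: score} dicts for one rep's calls."""
    def averages(calls):
        totals = {}
        for call in calls:
            for criterion, score in call.items():
                totals.setdefault(criterion, []).append(score)
        return {c: mean(v) for c, v in totals.items()}
    before, after = averages(baseline_calls), averages(post_calls)
    return {c: round(after[c] - before[c], 1) for c in before if c in after}
```

Run the same calculation at the cohort level to see whether the program, not just individual reps, moved the targeted behaviors.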
Sales Performance Assessment Steps Explained
Sales performance assessments produce better outcomes when they are built on call data rather than manager impressions. This guide walks through the five steps sales managers use to run a rigorous assessment cycle: defining the right criteria, calibrating scores, identifying coaching priorities, delivering targeted feedback, and measuring whether the assessment actually changed behavior.

Why Most Sales Assessments Produce No Behavior Change

The typical sales performance assessment follows this pattern: the manager observes a handful of calls, completes a form, delivers feedback in a one-on-one, and the rep returns to their normal behavior within two weeks. The problem is not the assessment itself. It is that the assessment is based on a small, unrepresentative sample and delivers feedback that is too general to act on.

An assessment that says "work on your discovery questions" gives a rep nothing specific to practice. An assessment that says "in your last 8 calls, you asked product-focused questions in the first 5 minutes rather than business-impact questions, which correlates with your 22% close rate versus the team's 34%" gives the rep a specific behavior to change and a benchmark to beat.

What are the steps in a sales performance assessment?

A rigorous sales performance assessment follows five steps. First, define 4 to 6 criteria with behavioral anchors at each score level. Second, score a representative sample of calls against those criteria (minimum 10 calls per rep). Third, identify the 2 criteria with the widest score gaps between top and bottom performers. Fourth, deliver criterion-specific feedback tied to transcript evidence. Fifth, re-score calls 30 days later to measure whether the targeted behavior changed. Assessments that skip step five cannot tell whether they worked.

Step 1 — Define Criteria That Connect to Sales Outcomes

Most sales assessment rubrics measure activity (did the rep ask discovery questions?) rather than quality (did the rep use the answers to advance the deal?). Activity-based criteria are easier to score but poor predictors of close rates.

Build your rubric with 4 to 6 criteria that map directly to your sales outcomes. For a one-call-close environment, the highest-weight criteria are typically: urgency creation (did the rep establish why buying now matters?), objection handling depth (did the rep address the real concern behind the stated objection?), and close attempt quality (did the rep ask for the business with a specific next step?). For complex B2B sales, criteria shift to economic buyer identification, technical fit qualification, and multi-stakeholder alignment.

Assign weights that reflect relative importance, and make them sum to 100%. Calibrate against your last 50 won deals: criteria that appear consistently in wins deserve higher weights than criteria showing no correlation.

Common mistake: treating all criteria as equally important. Equal-weight rubrics produce average scores that hide the specific behaviors actually driving deal outcomes. Weight by win-rate correlation.

Step 2 — Score a Representative Call Sample per Rep

A performance assessment based on 2 observed calls per rep is not statistically meaningful. A rep who had two good calls this month has not been assessed. Score a minimum of 10 calls per rep per assessment cycle to identify patterns rather than outliers. For teams using automated call scoring, this is a configuration change. For teams using manual review, 10 calls per rep per quarter is achievable with 20-minute call reviews.
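Before moving to sampling, here is a minimal sketch of the Step 1 weight calibration against recent won and lost deals. The idea of weighting by win-rate correlation comes from the article; using Pearson correlation against a 0/1 win flag is one simple way to do it, and the statistic choice, normalization, and names are assumptions of this sketch.

```python
# Sketch of weight calibration: correlate each criterion's call score with
# whether the deal was won, then normalize positive correlations into weights.
from statistics import correlation  # Python 3.10+

def win_correlations(deals: list) -> dict:
    """deals: [{'won': bool, 'scores': {criterion: score}}, ...]"""
    outcomes = [1.0 if d["won"] else 0.0 for d in deals]
    criteria = deals[0]["scores"].keys()
    return {c: round(correlation([d["scores"][c] for d in deals], outcomes), 2)
            for c in criteria}

def suggest_weights(correlations: dict) -> dict:
    """Normalize positive correlations so suggested weights sum to 100%."""
    positive = {c: max(r, 0.0) for c, r in correlations.items()}
    total = sum(positive.values()) or 1.0
    return {c: round(100 * r / total, 1) for c, r in positive.items()}
```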
Pull a random sample from each week rather than cherry-picking recent calls, which tend to be unrepresentative.

Target at least 80% inter-rater reliability before using any rubric for assessment decisions. Have two managers independently score the same 5 calls. If they disagree by more than 1 point on a 5-point scale for any criterion, that criterion's behavioral anchors are too vague. Refine the anchors before the assessment cycle begins.

Insight7 applies your custom weighted rubric to every call automatically, generating rep-level scorecards with dimension breakdowns and transcript evidence for each score. Assessment cycles that previously took a manager 8 hours of call review take 30 minutes of scorecard review.

Step 3 — Identify the Top Two Coaching Priorities from Assessment Data

An assessment that surfaces 8 areas for improvement produces no improvement. Coaching bandwidth is finite. After scoring, identify the 2 criteria with the largest performance gap between each rep and the team benchmark.

For each rep, the coaching priorities are the criteria where their score falls most below the team average, weighted by the criterion's impact on close rates. A score of 2.5 out of 5 on urgency creation (weight 25%) is a higher-priority gap than a 3.0 on process adherence (weight 10%), even though the absolute score differences are similar.

Document the specific call moments that produced the low scores. "Your urgency creation score was 2.5 out of 5 in this assessment cycle" is insufficient. "In calls on March 3rd and March 17th, you offered discounts before establishing why the prospect's current situation was costing them more than the subscription price" is actionable.

What is the best way to assess sales representative performance?

The best way to assess sales representative performance is to score a random sample of 10 or more calls per cycle against a weighted rubric with behavioral anchors, identify the 2 criteria with the largest gap versus team benchmarks, and deliver feedback tied to specific transcript moments. Assessments that rely on manager observation of a small sample miss the patterns that only emerge across multiple calls. Call analytics platforms that score 100% of interactions automatically produce more reliable assessment data than manual sampling processes.

Step 4 — Deliver Criterion-Specific Feedback with Transcript Evidence

Feedback sessions are more effective when the rep and manager review the same call moment. Before the assessment meeting, pull the 3 calls that most clearly illustrate the primary coaching priority. Timestamp the specific moments to review.

In the meeting, play or read the relevant transcript moment before explaining the scoring. "At 4:20 in your March 17th call, the prospect said they had been using their current vendor for 3 years. You moved to pricing. The scoring rubric at level 4 for urgency creation would have you explore what staying with that vendor is costing them before you quote a price."
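Here is a minimal sketch of the Step 3 prioritization: for one rep, rank criteria by how far the rep sits below the team average, weighted by each criterion's weight, and keep the top two as coaching priorities. The weights and score values echo the examples above; data shapes and function names are illustrative assumptions.

```python
# Sketch of Step 3: weighted gap versus the team benchmark, top two kept.
def coaching_priorities(rep_scores: dict, team_avg: dict, weights: dict,
                        top_n: int = 2) -> list:
    gaps = {c: max(team_avg[c] - rep_scores[c], 0.0) * weights[c]
            for c in rep_scores}
    return sorted(gaps, key=gaps.get, reverse=True)[:top_n]

weights = {"urgency_creation": 0.25, "objection_handling": 0.25,
           "close_attempt_quality": 0.20, "process_adherence": 0.10,
           "discovery_depth": 0.20}
rep = {"urgency_creation": 2.5, "objection_handling": 3.8,
       "close_attempt_quality": 3.2, "process_adherence": 3.0,
       "discovery_depth": 3.9}
team = {"urgency_creation": 3.6, "objection_handling": 3.9,
        "close_attempt_quality": 3.5, "process_adherence": 3.4,
        "discovery_depth": 4.0}
print(coaching_priorities(rep, team, weights))
# ['urgency_creation', 'close_attempt_quality']
```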
Performance Sales Training Systems
Sales enablement managers and L&D leaders building sales training programs face a version of the same problem: their training system delivers content, but it cannot tell them whether that content changed how reps sell. A performance-connected sales training system closes that gap by linking training activities to the conversation behaviors and deal outcomes that training is supposed to improve. This article covers what separates a performance-connected system from a content delivery platform, the three main architectural approaches, and how to select the right system for your team's current stage.

What is a performance sales training system?

A performance sales training system is any training infrastructure where training inputs (scenarios, assessments, coaching sessions, certifications) are connected to performance outputs (call quality scores, deal stage progression, win rate, ramp time). The connection can be direct (training completion unlocks a performance report showing change over time) or indirect (training content is derived from performance data such as real call recordings).

The distinction matters because most enterprise learning management systems are built around content delivery and completion tracking. A rep who completes a module is recorded as "trained." Whether their next ten calls look different is a separate system's problem, or nobody's problem. Performance-connected training systems treat those as the same problem.

How do you measure whether a sales training system is working?

The most reliable measurement approach connects three data layers: training activity (what did the rep do, and when), behavioral change (did their call quality scores, objection handling frequency, or conversation structure change after training), and outcome change (did close rates, deal velocity, or ramp time change in the period after training). Any system that only measures completion rates is measuring training volume, not training impact.

Programs with the highest measured impact consistently share one characteristic: they use real call data to identify specific behavioral gaps before building training content. Generic training content built without that diagnostic step produces completion without behavioral change.

What Makes a Sales Training System "Performance-Connected"

The defining characteristic is bidirectional data flow between training and performance systems. Training informs performance data (reps who completed scenario X should show improvement on call criteria Y), and performance data informs training content (reps who are struggling with objection type Z should receive the scenario built from calls where top performers handled Z well).

Three elements are required for this to work. First, the training system must have access to actual performance data, whether from call recordings, CRM win/loss rates, or QA scorecards. Second, the training content must be mapped to specific performance dimensions, not just topic areas. Third, the system must be able to attribute performance change to specific training activities, even at a correlational level. Without all three, you have a training system that tracks completion. With all three, you have a system that tracks impact.

Avoid this common mistake: building training content from what your top trainers believe best practices to be, rather than from what your top performers actually do on recorded calls.
The gap between the two is often significant, and training built from belief rather than evidence tends to produce completion without behavioral change.

Three Approaches to Performance-Connected Sales Training

Coaching-Analytics-Led (Insight7 model)

In this approach, conversation analytics is the foundation. Call recordings are analyzed against configurable QA criteria, behavioral gaps are identified at the rep and team level, and coaching scenarios are generated from the actual calls where those gaps appear. Training is derived from performance data rather than from a separate content library.

Insight7 follows this model. The platform analyzes completed calls using weighted criteria scoring, surfaces behavioral trends across the call corpus, and generates voice-based roleplay scenarios from the calls themselves. A manager reviewing a QA dashboard can see that 60% of reps are failing an objection-handling criterion, then trigger a coaching scenario built from the calls where top performers handled that objection well. Fresh Prints, a staffing company, used this workflow to give reps immediate practice on specific gaps: "When I give them a thing to work on, they can actually practice it right away rather than wait for the next week's call."

Limitation: Insight7 does not integrate with LMS platforms via SCORM. Training data stays in the Insight7 platform rather than flowing into external LMS completion tracking.

Enablement-Platform-Led (Mindtickle / Allego model)

In this approach, a dedicated sales enablement platform handles content delivery, certification paths, and readiness scoring. Training is structured around predefined competency frameworks, and performance data (often from CRM win rates or manager assessments) is connected to readiness scores.

Mindtickle is strongest for organizations that need structured certification paths, defined competency frameworks, and manager-visible readiness dashboards. It is well suited to larger sales organizations where training standardization across regions is a priority.

Allego combines video practice with real-call analysis, allowing reps to record practice scenarios and submit them for manager or peer review alongside actual call analysis. It bridges the enablement-platform approach with some of the call-analytics depth of the coaching-analytics model.

LMS-Led (Lessonly/Seismic / Docebo model)

In this approach, a learning management system is the primary training infrastructure, with sales-specific content modules built on top of a general LMS foundation. Training completion, certification, and compliance tracking are strong. Connection to real call performance data is typically limited unless additional integrations are built.

Lessonly, now Seismic Learning, is a strong fit for teams that need training tightly integrated with sales enablement content, where the same platform manages both the playbook and the training built from it. The LMS layer is solid; the connection to call-level performance data requires additional tooling.

Docebo is an AI-powered LMS suited to large-scale training programs across complex organizations. Its AI features focus on content recommendation and learning path personalization rather than call analytics or scenario generation from real call data. It is strong for organizations that need to train large, distributed sales teams against a standardized curriculum.
Comparison Table

System | Training Derived From | Performance Connection | Best For
Insight7 | Real call recordings, QA scores | Direct: behavioral trends drive scenario content | Gap-based coaching from actual call data
How to Improve Individual Sales Performance Strategies
AI tools have changed how managers identify and close individual performance gaps in sales. Instead of waiting for quarterly reviews or relying on gut instinct, teams can now get a data-backed view of each rep's strengths and weaknesses after every call. This guide covers how AI assesses individual performance and how to adjust training content to match what each rep actually needs.

Why Individual Assessment Beats Group Training

Group training programs address the average skill gap across the team. Most reps do not have average gaps. They have specific, individual gaps in discovery, objection handling, qualification, or closing that group training never directly addresses. AI performance management research confirms that individualized feedback loops outperform cohort-based training for skill development because they close the gap between what a rep is told to do and what they actually do on calls.

AI-powered call analysis makes individual-level assessment scalable for the first time. Insight7's platform scores every call against configurable criteria and generates per-rep scorecards showing which specific criteria fail most often. This is the input individual training content needs.

How AI Assesses Individual Sales Performance

What is the AI tool to measure performance?

AI performance measurement tools fall into two categories: QA scoring platforms that evaluate calls against defined criteria, and revenue intelligence tools that correlate conversation behavior with pipeline outcomes. For individual training content, QA scoring platforms provide the most actionable data because they show which specific behaviors are failing for which rep, not just aggregate conversion rates. Insight7 covers both: QA scoring at the criterion level and revenue intelligence that identifies which behaviors predict close rates for your specific deal type.

- Call scoring: The platform scores every call against a weighted set of criteria, each with a "what great looks like" and "what poor looks like" context definition. Scores link back to the specific quote in the transcript that drove them. Managers can verify any score without re-listening to the full call.
- Behavioral pattern extraction: Across multiple calls, the platform identifies which criteria fail most consistently for each rep. A rep who fails discovery question depth 70% of the time needs different training content than a rep who fails objection acknowledgment 60% of the time.
- Improvement trajectory tracking: Insight7 tracks criterion-level scores over time per rep, showing whether coaching is producing measurable improvement or whether the rep is regressing after an initial uptick.

Adjusting Training Content to Individual Gaps

Match content to the failing criterion, not the failing rep. The instinct is to build a training plan "for the underperforming rep." The more effective approach is to build training content targeting the specific criterion that rep is failing most often. A rep failing discovery needs discovery practice content. The same rep practicing closing scripts gets no closer to what they need.

Use the rep's own call data to build scenarios. Generic practice scenarios describe conversations the rep may not recognize. AI-generated scenarios built from the rep's actual call failures mirror the exact situations they encounter. Insight7 generates role-play personas from call transcripts, including the emotional tones, objection types, and conversation moments that drove low scores.
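To make "match content to the failing criterion" concrete, here is a minimal sketch that computes each rep's failure rate per criterion across scored calls and surfaces the one that fails most often. The pass/fail threshold of 70, the data shape, and the function name are illustrative assumptions.

```python
# Sketch of per-rep criterion failure rates: the most-failed criterion is the
# one to target with practice content first.
FAIL_BELOW = 70.0  # assumed pass/fail cutoff on a 0-100 scale

def most_failed_criterion(calls: list) -> tuple:
    """calls: list of {criterion: score} dicts for one rep's scored calls."""
    counts, fails = {}, {}
    for call in calls:
        for criterion, score in call.items():
            counts[criterion] = counts.get(criterion, 0) + 1
            fails[criterion] = fails.get(criterion, 0) + (score < FAIL_BELOW)
    rates = {c: fails[c] / counts[c] for c in counts}
    worst = max(rates, key=rates.get)
    return worst, round(rates[worst], 2)
```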
Verify content transfer, not just content completion

A rep completing a training module is not evidence that the behavior changed. Score calls from the week following training on the criterion that was targeted. Movement on that criterion, even a 2 to 3 point improvement, confirms the content transferred. No movement means the content did not connect to the actual behavior gap. According to Training Industry's assessment research, pre- and post-training behavioral assessment is the most reliable measure of individual skill transfer, outperforming quiz-based assessments or manager observations.

Individual Performance Strategies That Work with AI Data

Criterion-priority coaching sessions: Each session focuses on one criterion based on call data. The evidence (transcript quote or audio clip) opens the session. The coach and rep discuss the specific moment and why the behavior misfired. Practice follows immediately.

Self-review with evidence: Share scored call data directly with reps. When reps can see the specific moments that drove low scores, self-awareness improves without requiring a manager present. Insight7's rep-facing scorecard view supports this workflow.

Competition scoring boards: Surface criterion-level scores across the team as a leaderboard, focused on improvement rate rather than absolute score. A rep who improved their discovery score by 15 points in two weeks has achieved more than a rep who maintained a consistent 80 overall (see the sketch at the end of this section).

If/Then Decision Framework

If AI scores are improving but outcome metrics are flat: Check whether the criteria being improved predict the outcome being measured. Compliance criteria and conversion criteria are not the same thing. Improve the criteria that correlate with close rates.

If different reps fail the same criterion: This is a systemic training gap, not an individual one. Run a group session targeting that criterion for the affected reps before returning to individual development plans.

If a rep resists AI-based feedback: Start with the evidence rather than the score. A transcript quote showing what the rep said and when is harder to dispute than a number. Build trust in the data before introducing score-based coaching.

If training content is not moving scores: The content may not be addressing the specific failure mode. Review the criterion context description and the scenarios used. Adjust both before concluding the rep is not coachable.

FAQ

Can AI write a performance evaluation for sales reps? AI platforms can generate criterion-level performance summaries from call data, but the evaluation itself should be produced by a manager who interprets the data in context. AI-generated summaries are starting points, not final assessments. Insight7 generates AI coaching summaries after each scored session, surfacing behavioral patterns for manager review.

How do you use AI for performance analysis in small sales teams? Small teams (under 10 reps) benefit most from AI analysis because individual score differences become more visible without aggregate data masking them. Run the full call population through a QA platform, score against consistent criteria, and use per-rep criterion failure rates to build individual coaching plans.
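The improvement-rate ranking mentioned in the strategies above can be computed with a short sketch like the one below, assuming each rep's scores for a criterion are stored as dated points. The data shape, two-week window, and function name are illustrative assumptions rather than a specific Insight7 feature.

```python
from datetime import date, timedelta


def improvement_leaderboard(history: dict[str, list[tuple[date, float]]],
                            window_days: int = 14) -> list[tuple[str, float]]:
    """Rank reps by criterion-score movement: recent-window average minus prior average."""
    cutoff = date.today() - timedelta(days=window_days)
    board = []
    for rep, points in history.items():
        recent = [score for day, score in points if day >= cutoff]
        prior = [score for day, score in points if day < cutoff]
        if recent and prior:  # skip reps without data on both sides of the cutoff
            delta = sum(recent) / len(recent) - sum(prior) / len(prior)
            board.append((rep, round(delta, 1)))
    return sorted(board, key=lambda item: item[1], reverse=True)
```

Ranking by the score delta rather than the raw score keeps the board from simply rewarding whoever started strongest.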
High Performance Sales Training Benefits
Sales training investments rarely fail because the content was wrong. They fail because the outcome wasn't measured. A program that can't connect training to behavior change and then to revenue impact can't justify budget, can't identify what's working, and can't improve over time. This guide covers the measurable benefits of high-performance sales training and how to capture them using a call analytics and coaching framework.

The Limits of Traditional Sales Training Measurement

Most sales training programs are evaluated against quota attainment six to twelve months after the program ends. That window is too long and too noisy to isolate training impact. Reps' territories change. Market conditions shift. New product lines launch. By the time quota numbers come in, it's impossible to separate what the training contributed from everything else that happened.

High-performance programs build in shorter feedback loops. They measure behavioral change within 30 days of a training intervention, before external factors can contaminate the signal. This requires criterion-level data: not just whether a rep hit quota, but whether they're using the specific skills that were trained. According to Sales Management Association research on training effectiveness, organizations that measure behavioral change within 60 days of a training event report 2x higher training ROI than those measuring only revenue outcomes.

5 Measurable Benefits of High-Performance Sales Training

Benefit 1: Faster ramp time for new hires. Structured training with a clear competency model and a curated call library reduces the period between hire date and full productivity. New reps who study examples of what good looks like, practice through AI roleplay, and receive structured feedback on early calls reach proficiency faster than those who learn by doing without a framework. The industry benchmark for B2B SaaS ramp time is 90 to 120 days. Teams that pair call analytics with regular AI coaching practice consistently come in below that benchmark.

Benefit 2: Coaching that targets real gaps. Without call analytics, coaching is based on what managers remember from spot-check reviews. With call analytics, coaching targets what the data actually shows: which criteria are consistently below benchmark across a rep's last 30 calls, and in which specific call moments those failures occur. This shifts coaching from "do better at discovery" to "your problem identification questions are leading – you're suggesting the problem before confirming it."

Benefit 3: Practice that happens before live calls. Traditional coaching feedback sits unused until a rep encounters the relevant situation on a live call, which might not happen for two weeks. Insight7's AI coaching module lets reps practice specific scenarios immediately after a coaching session. Managers build scenarios from real call transcripts. Reps practice unlimited times with scores tracked over time. Fresh Prints expanded to this module because reps wanted to practice right away rather than wait for the next live call.

Benefit 4: QA data that validates training ROI. When QA criteria are configured to match what was trained, scores for those criteria before and after a training event show whether behavior actually changed. This is the measurement infrastructure most programs lack. Insight7 provides evidence-backed scoring linked to the exact quote and call location, so managers can click through to verify any score and track criteria trends over time. A minimal sketch of this before/after comparison follows this list.

Benefit 5: A self-reinforcing coaching culture. Teams that treat QA as a coaching input rather than a compliance audit create a self-reinforcing system. QA findings feed coaching priorities. Coaching priorities feed training design. Training content gets validated against new QA data. The loop shortens the time between identifying a problem and seeing behavior change.
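For the before/after check behind Benefit 4, here is a minimal sketch that assumes the targeted criterion's scores are available as dated values. The 30-day windows and the 2-point movement threshold are assumptions drawn from the measurement windows and 2-to-3 point movement guidance discussed above, not fixed rules or platform behavior.

```python
from datetime import date, timedelta
from statistics import mean


def training_transfer(scores: list[tuple[date, float]],
                      training_day: date,
                      window_days: int = 30,
                      min_movement: float = 2.0) -> dict:
    """Compare average scores on the trained criterion before and after a training event."""
    before = [s for d, s in scores
              if training_day - timedelta(days=window_days) <= d < training_day]
    after = [s for d, s in scores
             if training_day <= d <= training_day + timedelta(days=window_days)]
    if not before or not after:
        return {"status": "insufficient data"}
    movement = mean(after) - mean(before)
    return {
        "before_avg": round(mean(before), 1),
        "after_avg": round(mean(after), 1),
        "movement": round(movement, 1),
        "transferred": movement >= min_movement,
    }
```

No movement on the targeted criterion is the signal to revisit the content and scenarios, not to conclude the rep is uncoachable.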
Why is quality assurance training important in sales?

QA training creates a shared definition of what good looks like. Without it, managers evaluate calls against subjective standards that vary by reviewer, making feedback inconsistent and unmeasurable. When QA criteria are trained explicitly, reps know what's being measured, coaching conversations use a shared vocabulary, and performance trends can be tracked against a stable baseline. Insight7's weighted criteria system supports both script-based and intent-based evaluation, giving teams flexibility to define quality standards that match their actual sales motion.

If/Then Decision Framework

Training objective | What to measure | Timeline
Reduce ramp time | Criterion-level QA scores at 30, 60, 90 days vs. baseline | 90 days post-hire
Improve discovery quality | Discovery criteria scores per rep per week | 4 to 6 weeks post-training
Increase close rate | Close criteria scores + deal outcome linkage | 60 days post-training
Reduce compliance errors | Compliance criterion pass rate | 2 to 4 weeks post-training

What Makes Training Stick

Training that doesn't transfer to live call behavior is the most common failure mode. Kirkpatrick Model research consistently shows that knowledge acquisition does not predict behavior transfer without reinforcement in the job context. Reinforcement mechanisms that work:

Spaced practice. AI roleplay sessions distributed over 3 to 4 weeks reinforce new behaviors better than a single intensive training event.

Immediate feedback. Scores available within hours of a practice session, not days, preserve the behavioral connection between action and consequence.

Manager reinforcement. When managers reference training behaviors in 1:1s using call data, reps understand the behavior is being observed. That visibility increases transfer rates.

How do you measure the ROI of a QA partner with a training-focused culture?

The measurement approach compares criterion-level behavioral change before and after the engagement, then links behavioral improvement to deal outcomes over 60 to 90 days. A QA partner with a training-focused culture integrates scoring data directly into coaching workflows. Look for these indicators: Does the QA team provide evidence-backed scores with call timestamps? Do low scores automatically trigger coaching assignments? Are scores tracked over time so improvement is visible? Insight7 provides all three layers, with auto-suggested training from QA feedback and progress tracking across unlimited practice sessions.

FAQ

How do you measure the ROI of sales training? The most defensible method is to measure criterion-level behavioral change using call analytics before and after training, then link behavioral improvement to deal outcomes over 60 to 90 days. Comparing ramp time for a cohort trained with structured call analytics versus a historical baseline is another clean measurement, as it controls for