#2 revops ← all field notes

HubSpot lead scoring told us who was engaged. Not who would pay.

Replacing weighted-sum lead scoring with reasoning-traced AI prioritization in HubSpot. Five custom properties, one nightly batch, reps that actually trust the score.

Subject: HubSpot revenue prioritization
Industry: B2B SaaS / DACH mid-market
Stack: HubSpot workflows, Claude Sonnet, webhook + nightly batch

The case

HubSpot native lead scoring is a weighted sum of engagement behaviors: email opened, pricing page visited, demo requested. It tells you who is curious. It does not tell you who will pay.

Sales reps figure that out within a month and stop trusting the score. The actual prioritization moves to a Tuesday-morning meeting where the manager picks ten accounts from instinct, the team dials them, and the other 10,000 contacts sit untouched.

The fix is not a better weighted sum. It is replacing the score with an LLM that reads the contact’s HubSpot fields, scores them against an explicit ICP, and writes back a tier, a confidence, the matched signals, and a one-sentence rationale — all as HubSpot properties next to the contact.

The numbers

Live state on one contact (Lena Bergmann, VP Digital Transformation, novatech-ag.com):

HubSpot property	Value
`ai_icp_fit`	Strong
`ai_score`	85
`ai_pipeline_rank`	12 (of 247 open contacts)
`ai_next_action`	“Personal note from AE within 24h; reference scaling signal”
`ai_reasoning_trace`	“B2B SaaS, 200 employees, VP role, repeat pricing-page visits in last 7 days, no deal-breakers.”

Operational metrics: nightly batch over 12k contacts costs ~€180/month on Claude Sonnet, ~€18/month on Haiku. Re-score latency: ~15 min for full base. Rep adoption: scores read on 94% of contacts touched in week 4, vs 11% for the native HubSpot score in the month before.

What worked

Reasoning trace as a HubSpot property, not a CloudWatch log. Reps trust scores they can audit at the moment of decision. If the trace lives in another system, it does not exist.
Five properties, not one. ai_icp_fit, ai_expected_revenue, ai_next_action, ai_reasoning_trace, ai_pipeline_rank. Each is a different question; reps reach for different ones.
Nightly batch over per-event scoring. 10x cheaper, accurate enough for a daily pipeline review. Per-event would have been ~€1,800/month for marginal benefit.
ICP as a versioned system-prompt line. Single file, version number, prompt loads at runtime. Whenever sales updates the ICP, the model immediately uses the new one.

What failed

Per-event re-scoring on every engagement. Cost blew through the monthly cap in 11 days. The model was no more accurate, just more expensive.
Letting the model choose its own cost ceiling. It does not. Kill switch in the workflow that halts at €500/month and pings the owner.
Surfacing the rank without the rationale. Reps in the first 2 weeks said “I do not trust this.” Surfacing ai_reasoning_trace next to ai_score flipped adoption inside a fortnight.
A single weighted formula, no floor rule. Initial version greenlit a high-value contact who had churned out twice. Floor rule (no Tier A if any dimension < 3) prevents single-signal dominance.

The architecture

HubSpot contact event ───► Trigger (workflow or nightly batch)
                              │
                              ▼
                       AI scoring endpoint
                       (Claude Sonnet + ICP file)
                              │
                              ▼
                   Structured JSON output:
                   { icp_fit, expected_revenue,
                     next_action, reasoning_trace,
                     pipeline_rank_inputs }
                              │
                              ▼
                  Webhook writes 5 properties
                       back to HubSpot
                              │
                              ▼
        Reps see ranked queue + reasoning trace
        in HubSpot's standard contact view

Cost gate: monthly cap as workflow halt + owner ping. Eval set: 50 won + 50 lost deals from the last 12 months. AUC threshold: ≥ 0.80 vs ground truth.

Next-step checklist

Write the ICP as a single line (industry, company size, role). One file, versioned.
List fit indicators (positive) and deal-breakers (negative). 4-6 each.
Pick HubSpot object (contact / company / deal). For B2B SaaS, default to company.
Create the 5 properties in HubSpot property setup. Type each to match the JSON schema.
Write the prompt. Schema for output. Test on 5 known-won + 5 known-lost contacts.
Set up nightly batch. Cap the monthly budget. Pipe owner alerts on cap-breach.
Eval AUC on 50 won + 50 lost deals. Threshold ≥ 0.80 before flipping live.
Wire pipeline_rank into the rep’s queue view. Sort DESC. Top 20-30 per rep per week.

Full case study: AI Revenue Prioritization in HubSpot CRM. Free templates: HubSpot Revenue Prioritization Template, AI Lead Scoring Pack.

Does this shape match what you're building?

If you want me to scope a similar system for you — I respond in 24 hours.

Request a scope