Field Note #4: Replacing LeanIX failed. Sitting on top of it worked.

AI decision-support layer over LeanIX. Five-dimension scoring, reasoning trace as auditable property, floor rule that stops single-signal dominance. What broke when we tried to be the EAM.

Subject: AI decision-support platform
Industry: DACH mid-market enterprise / transformation programs
Stack: LeanIX integration, Claude Sonnet, weighted-composite scoring, custom dashboard

The case

Mid-market enterprises run transformation portfolios out of Excel. Twenty to fifty initiatives per quarter. The steering committee meets monthly, argues about priority, decides nothing, leaves no audit trail. The CIO of one DACH manufacturer described it as “an expensive way to forget what we agreed to last Tuesday.”

They already had LeanIX. The capability catalog was clean, application lifecycle data was structured, the import-export worked. The piece missing was a decision layer — scoring, prioritization, the reasoning behind a Greenlight or Defer — that wrote back to LeanIX as auditable properties.

The numbers

Metric                                                   | Before    | After (Q1 live)
Initiatives scored before steering committee             | ~30%      | 100%
Time-from-list to ranked priorities                      | 2-3 weeks | 2 days
Steering committee meetings spent on prioritization      | 2.5/month | 0.5/month
Initiatives moved from Defer → Greenlight after re-score | n/a       | 7 of 47 in Q1
Eval AUC (model rank vs CIO ranking on 50 historical)    | n/a       | 0.83

Composite formula (live): score = (value × 30 + (6−complexity) × 15 + maturity × 15 + ROI × 20 + fit × 20) / 100.
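
As a worked illustration of the composite formula, here is a minimal Python sketch. The weights and the (6 − complexity) inversion come from the formula above; the function name, the dictionary keys, and the range check are assumptions made for this example, not the production code.

```python
# Weights taken from the composite formula; they sum to 100.
WEIGHTS = {"value": 30, "complexity": 15, "maturity": 15, "roi": 20, "fit": 20}


def composite(scores: dict) -> float:
    """Return the weighted composite for one initiative (result lies in 1.0-5.0)."""
    for dim in WEIGHTS:
        if not 1 <= scores[dim] <= 5:
            raise ValueError(f"{dim} must be between 1 and 5, got {scores[dim]}")
    adjusted = dict(scores)
    # High complexity is a cost, so the formula inverts it: 1 -> 5, 5 -> 1.
    adjusted["complexity"] = 6 - scores["complexity"]
    return sum(WEIGHTS[dim] * adjusted[dim] for dim in WEIGHTS) / 100


# Hypothetical initiative: (5*30 + 4*15 + 4*15 + 4*20 + 3*20) / 100 = 4.1
example = {"value": 5, "complexity": 2, "maturity": 4, "roi": 4, "fit": 3}
print(composite(example))
```

Because every dimension is scored 1-5 and the weights sum to 100, the composite also stays on a 1-5 scale, which keeps it comparable with the individual dimension scores.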

What worked

  • Decision layer on top of the EAM, not a replacement. LeanIX keeps the catalog. We add scoring, write tier + composite back as initiative properties. Customer sees one tool, not two.
  • Reasoning trace as an auditable property. Every score has a one-paragraph rationale stored next to it. Steering committee can ask “why is this Greenlight?” and read the answer in 10 seconds.
  • Floor rule on Greenlight. No dimension below 3, no matter how high the composite. Stops single-dimension brilliance from drowning a real risk. Caught two initiatives in Q1 that would have been greenlit on raw score alone.
  • Eval against historical CIO rankings. Built confidence in the model by showing it agreed with the CIO at 0.83 AUC on 50 past initiatives. After that, the steering committee trusted re-rankings on new ones.
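
The floor rule from the list above can be sketched as a small check that runs independently of the composite. This is a hedged illustration: the function names are hypothetical, and applying the floor to the inverted complexity value (6 − complexity) rather than the raw score is an assumption, since the note does not say which one the floor uses.

```python
FLOOR = 3
# Assumption: complexity is checked after inversion, as in the composite formula.
INVERTED = {"complexity"}


def floor_violations(scores: dict, floor: int = FLOOR) -> list:
    """Return the dimensions whose effective score falls below the floor."""
    violations = []
    for dim, raw in scores.items():
        effective = 6 - raw if dim in INVERTED else raw
        if effective < floor:
            violations.append(dim)
    return violations


def greenlight_eligible(scores: dict) -> bool:
    """An initiative may be Greenlight only if no dimension is below the floor."""
    return not floor_violations(scores)


# Hypothetical initiative: the composite would be 4.55, but maturity is 2.
strong_but_immature = {"value": 5, "complexity": 1, "maturity": 2, "roi": 5, "fit": 5}
print(floor_violations(strong_but_immature))
print(greenlight_eligible(strong_but_immature))
```

Keeping the floor as a separate gate, rather than folding it into the weights, means a high composite can never mask a weak dimension. What tier a blocked initiative falls back to is not specified here and would need its own rule.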

What failed

  • Trying to replace LeanIX. First proposal was “we own the catalog too.” Got nowhere. The customer’s data steward already kept LeanIX clean. We were duplicating work to lose adoption.
  • Per-stakeholder dashboards. Built one for the CIO, one for the CFO, one for the head of transformation. Nobody used them after week 3. Replaced with one dashboard with role-based filters; usage went up.
  • One priority number, no weights. First version asked Claude for “a single priority score 1-100.” It drifted across quarters. Splitting into 5 dimensions with explicit weights stopped the drift.
  • Letting the LLM rank without anchor descriptions. “What does ‘4’ for complexity mean?” Without a rubric, two scorers gave different answers, and the model gave different answers across runs. Anchor descriptions per level locked it in.
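
One way to make the anchor descriptions concrete is to keep the rubric as data, refuse to score when any level lacks an anchor, and render the same text into every scoring prompt. The sketch below assumes that approach. The function names are hypothetical, and the anchor sentences are invented placeholders, not the rubric used in this project.

```python
LEVELS = (1, 2, 3, 4, 5)


def validate_rubric(rubric: dict, dimensions: list) -> None:
    """Raise if any dimension is missing an anchor sentence for any level."""
    missing = [
        (dim, level)
        for dim in dimensions
        for level in LEVELS
        if not rubric.get(dim, {}).get(level, "").strip()
    ]
    if missing:
        raise ValueError(f"missing anchors: {missing}")


def render_rubric(rubric: dict) -> str:
    """Render the rubric as plain text so every scoring run sees identical anchors."""
    lines = []
    for dim, anchors in rubric.items():
        lines.append(f"{dim}:")
        for level in LEVELS:
            lines.append(f"  {level} = {anchors[level]}")
    return "\n".join(lines)


# Placeholder anchors for illustration only.
EXAMPLE_RUBRIC = {
    "complexity": {
        1: "Single team, no integration work.",
        2: "Single team, one existing integration touched.",
        3: "Two or three teams, known integration patterns.",
        4: "Cross-department, new integrations required.",
        5: "Organization-wide, new platform or vendor.",
    }
}

validate_rubric(EXAMPLE_RUBRIC, ["complexity"])
print(render_rubric(EXAMPLE_RUBRIC))
```

The rendered text would be pasted into the scoring prompt unchanged for every run, so both human scorers and the model read the same definition of a "4".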

The architecture

LeanIX (capability catalog + initiatives) ◄────────┐
    │                                              │
    ▼ read 20-50 initiatives                       │
                                                   │
AI scoring (Claude Sonnet)                         │
    │ rubric: 5 dimensions × 1-5 levels            │
    │ floor rule: no Greenlight if any dim < 3     │
    ▼                                              │
Composite + reasoning trace                        │
    │                                              │
    ▼                                              │
Write back to LeanIX as initiative properties: ────┘
   • ai_value_score (1-5)
   • ai_complexity_score (1-5)
   • ai_maturity_score (1-5)
   • ai_roi_score (1-5)
   • ai_fit_score (1-5)
   • ai_composite (number)
   • ai_tier (Greenlight / Pilot / Defer / Drop)
   • ai_reasoning_trace (paragraph)
   • ai_score_refreshed_at (datetime)
    │
    ▼
Steering committee dashboard (read-only on LeanIX)
   • Sort by ai_composite DESC
   • Filter by ai_tier
   • Expand for reasoning trace

Re-score on quarterly cadence, or on demand when an initiative’s scope changes materially.
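
The write-back step in the diagram can be sketched as a function that assembles the nine properties for one initiative. The property names and tier values are taken from the diagram; the function name, the ISO 8601 timestamp format, and the validation are assumptions. The sketch deliberately stops short of calling LeanIX, because the note does not describe the API calls used.

```python
from datetime import datetime, timezone

ALLOWED_TIERS = {"Greenlight", "Pilot", "Defer", "Drop"}
DIMENSIONS = ("value", "complexity", "maturity", "roi", "fit")


def build_properties(scores, composite, tier, reasoning, refreshed_at=None):
    """Assemble the initiative properties listed in the architecture diagram."""
    if tier not in ALLOWED_TIERS:
        raise ValueError(f"unknown tier: {tier}")
    if refreshed_at is None:
        refreshed_at = datetime.now(timezone.utc)
    # One ai_<dimension>_score property per scored dimension.
    props = {f"ai_{dim}_score": scores[dim] for dim in DIMENSIONS}
    props.update(
        {
            "ai_composite": composite,
            "ai_tier": tier,
            "ai_reasoning_trace": reasoning,
            # Assumption: timestamps are stored as ISO 8601 strings in UTC.
            "ai_score_refreshed_at": refreshed_at.isoformat(),
        }
    )
    return props


# Hypothetical initiative, with a fixed timestamp so the output is reproducible.
props = build_properties(
    {"value": 5, "complexity": 2, "maturity": 4, "roi": 4, "fit": 3},
    composite=4.1,
    tier="Greenlight",
    reasoning="High value and ROI; no dimension below the floor.",
    refreshed_at=datetime(2025, 1, 1, tzinfo=timezone.utc),
)
for name, value in props.items():
    print(name, "=", value)
```

Building the full property set in one place makes it easy to confirm that the dashboard's sort key (ai_composite), filter (ai_tier), and expandable rationale (ai_reasoning_trace) are always written together.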

Next-step checklist

  • Pick 20-50 initiatives. The full portfolio if it fits.
  • Weight the 5 dimensions so they sum to 100. Defaults: 30/15/15/20/20.
  • Write the rubric — what 1-5 means for each dimension. One sentence per anchor.
  • Pull 50 historical initiatives where you know the outcome (greenlit + delivered, deferred + still relevant, dropped + correctly so). This is your eval set.
  • Run the scoring against the eval set and compute AUC against the actual outcome ranking. Target ≥ 0.80.
  • Add the floor rule (no Greenlight if any dimension below 3). Verify on eval set.
  • Write 8-9 new properties to the EAM tool. Read-only dashboard on top.
  • Re-score quarterly. Or on initiative scope change.
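
For the eval step in the checklist, AUC can be computed without extra libraries using its pairwise definition: the share of (positive, negative) pairs in which the positive initiative received the higher score, with ties counted as half. The sketch below assumes the historical outcomes have been collapsed to a binary label (1 = should have been prioritized, 0 = should not), which is a simplification of the outcome categories listed above. The function name and the sample data are hypothetical.

```python
def pairwise_auc(scores: list, labels: list) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical eval set: composite scores and binary outcome labels.
model_scores = [4.0, 3.0, 3.5, 2.0]
outcomes = [1, 1, 0, 0]
# Pairs: 4.0>3.5, 4.0>2.0, 3.0<3.5, 3.0>2.0, so 3 of 4 are ranked correctly.
print(pairwise_auc(model_scores, outcomes))  # 0.75
```

At 50 initiatives the quadratic pair loop is negligible. The same function can be re-run after adding the floor rule to check that the target of 0.80 still holds.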

Full case study: AI Decision Support Platform. Free tool: Portfolio Scoring Matrix.
