The problem with RICE on internal platforms is not that it is a bad framework. It is that teams ask it to settle arguments it was never designed to settle.
At Zendesk, I ran RICE across roughly 40 compliance-adjacent ideas and watched a critical DPA (data processing agreement) handling workflow land in the middle of the pack. The math looked respectable. Reach was decent. Impact looked moderate. Effort was not outrageous. If I had treated the score as truth, we would have under-prioritised a workflow where the cost of being wrong was far higher than the user count suggested.
That is the real failure mode. Not “RICE is useless.” More like: raw RICE gets dangerously persuasive when the user base is small, the risk profile is uneven, and the work crosses multiple internal contexts.
I still use RICE. I just use it for the job it is actually good at.
RICE is a conversation tool, not a verdict
At its best, RICE gives product, engineering, compliance, support, and operations a shared language for trade-offs:
- Reach asks who actually feels the problem
- Impact asks what changes if we solve it
- Confidence asks what evidence we really have
- Effort asks what the organization has to spend to get the outcome
That is valuable on internal platforms because the first problem is rarely arithmetic. It is usually translation. Product is talking about adoption. Engineering is talking about risk. Security is talking about blast radius. Operations is talking about manual work. RICE can act like a Rosetta stone between those perspectives because it forces the assumptions into the open.
That is why I do not think internal teams should throw RICE away. I think they should stop pretending the score is the whole decision.
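For reference, the arithmetic behind the score is simple, which is part of why it sounds so authoritative. A minimal sketch of standard RICE, assuming the usual Intercom-style formula (reach times impact times confidence, divided by effort); the item names and numbers are illustrative, not from any real backlog:

```python
def rice(reach: float, impact: float, confidence: float, effort: float) -> float:
    """Standard RICE score: (reach * impact * confidence) / effort."""
    return (reach * impact * confidence) / effort

# Illustrative numbers only: a broad, visible annoyance vs. a narrow, risky workflow.
print(f"{rice(reach=300, impact=0.5, confidence=0.8, effort=1):.1f}")  # notification noise cleanup: 120.0
print(f"{rice(reach=80, impact=2, confidence=0.7, effort=1):.1f}")     # DPA-handling workflow: 112.0
```

On a small internal user base, the visible annoyance edges out the risky workflow on raw reach alone, which is exactly the mid-pack problem described above.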
Where raw RICE starts lying
Consumer teams get more out of standard RICE because reach can genuinely differentiate between opportunities. On internal platforms, the math gets flatter and the consequences get weirder.
The same pattern shows up over and over:
- the user base is small enough that almost everything touches “a lot” or “not many”
- some work is low-frequency but high-consequence
- the roadmap compares unlike things: workflow integrity, compliance, permissions, developer experience, and feature requests all in one pile
- the real cost of delay sits in policy, trust, or downstream operational drag rather than visible adoption numbers
That is how you end up over-ranking the broad annoyance everyone notices and under-ranking the narrow workflow that can create audit pain, broken approvals, or manual work for months.
| Layer | Where it helps | How it plays out on internal platforms | How I use it |
|---|---|---|---|
| Standard RICE | Makes assumptions explicit across teams | Reach can dominate even when the real risk is hidden in the workflow | Use it as a discussion language, not as the final answer |
| Domain pass | Forces the team to name what type of risk is actually present | Prevents low-reach, high-consequence work from disappearing in the averages | Treat the domain as a weighting input before scoring |
| Decision record | Preserves why the score made sense this quarter | Stops the team from re-litigating the same exception every planning cycle | Write the rationale down and keep it close to the work |
Define the domain before you score it
This is where DDD matters more than most prioritisation discussions admit.
If product, engineering, legal, and support are using the same word to mean different things, the score is already fake precision. It does not matter how carefully you rate impact if nobody agrees what an “agreement,” “request,” “environment,” or “approval” actually is. The model looks tidy while the underlying workflow stays fuzzy.
So my rule is simple: if the nouns are unstable, scoring comes second.
That is the DDD step in practical terms. Get the bounded context and shared language straight enough that the team is scoring the same thing. Only then does the RICE conversation become useful.
The two-pass model I actually use
DRICE is not a term I invented. In a 2023 Lenny’s Newsletter piece, Darius C and Alexey Komissarouk used it for a more investigative extension of RICE built around a hypothesis, an impact estimate, an engineering estimate, and a return-on-engineering-investment view.
I am borrowing the label more narrowly here. On internal platforms, my version is a domain-weighted, two-pass RICE workflow. The extra pass is not growth experimentation. It is domain risk.
Pass 1: Domain risk. Before anyone touches the score, I ask what kind of work this actually is. Does it protect compliance, security, revenue, workflow integrity, developer flow, or general productivity? The goal is not to moralize about importance. It is to surface the cost of being wrong.
Pass 2: Standard RICE. Once the domain is visible, I use standard RICE to make the trade-offs explicit. That part is still useful. The conversation gets better because the hidden risk is no longer pretending to be neutral.
The weighting I have used looks roughly like this:
| Work type | Typical starting weight | Why it gets extra weight |
|---|---|---|
| Compliance / Security | around 1.5x | The downside of being wrong is usually disproportionate to the raw user count |
| Revenue protection | around 1.3x | Small workflow failures can still create meaningful churn or commercial drag |
| Developer experience | around 1.2x | The value is often multiplicative because it improves the contribution flow for other teams |
| Internal productivity | 1.0x baseline | Useful as the default when the work matters but carries no unusual hidden risk |
Those are not universal numbers. They reflected the context I was operating in, where the cost of a compliance or security miss was materially higher than the raw user count suggested. The point is not the multiplier. The point is to force the organization to say out loud what kind of loss it is actually trying to avoid.
That is what moved the DPA workflow from “mid-pack” to “obviously needs attention.” The score did not become smarter. The conversation became more honest.
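To make the two passes concrete, here is a minimal sketch that reuses the illustrative numbers from the earlier snippet and the starting weights from the table above; the domain labels and multipliers reflect one context and are not universal constants:

```python
def rice(reach: float, impact: float, confidence: float, effort: float) -> float:
    """Standard RICE score: (reach * impact * confidence) / effort."""
    return (reach * impact * confidence) / effort

# Pass 1: name the kind of work before anyone scores it. These weights are context-specific.
DOMAIN_WEIGHTS = {
    "compliance_security": 1.5,
    "revenue_protection": 1.3,
    "developer_experience": 1.2,
    "internal_productivity": 1.0,  # baseline
}

# Pass 2: standard RICE, multiplied by the weight the domain pass surfaced.
def weighted_rice(domain: str, reach: float, impact: float,
                  confidence: float, effort: float) -> float:
    return DOMAIN_WEIGHTS[domain] * rice(reach, impact, confidence, effort)

print(f"{weighted_rice('internal_productivity', reach=300, impact=0.5, confidence=0.8, effort=1):.1f}")  # 120.0
print(f"{weighted_rice('compliance_security', reach=80, impact=2, confidence=0.7, effort=1):.1f}")       # 168.0
```

Same inputs, same arithmetic. The only new information is an explicit statement of what kind of loss the work protects against, and that is what reorders the two items.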
Why engineering trusted the roadmap more
This was not just a PM neatness exercise.
Once the model accounted for domain risk, the conversation shifted from “why does compliance work keep jumping the queue?” to “which compliance or workflow-integrity items actually deserve the weight?” That is a much better argument. It is specific. It is legible. Engineering can challenge it without challenging the entire premise of prioritisation.
That is also why I keep calling RICE a conversation tool. It gives different contributors one shared frame for the trade-off. The domain pass makes sure the frame is not lying about what the work protects.
KCS is what stops this from resetting every quarter
The missing piece in a lot of scoring systems is memory.
If the team decides that a class of workflow deserves extra weight, but that reasoning lives only in a planning meeting or a Slack thread, the organization relearns the same lesson every quarter. That is where KCS earns its place for me.
I want the prioritisation rationale captured in the same motion as the decision:
- what the team counted as reach
- why the domain weight existed
- which assumptions lowered or raised confidence
- what evidence would change the score later
That way the framework becomes reusable knowledge instead of a performance in a planning meeting. KCS does not make the model more sophisticated. It makes the judgment portable.
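As a sketch of what that captured rationale can look like when it sits next to the score itself; the field names are illustrative, and in practice this could just as easily be a template in whatever knowledge base the team already runs:

```python
from dataclasses import dataclass, field

@dataclass
class PrioritisationRecord:
    """Decision rationale captured in the same motion as the score."""
    item: str
    reach_definition: str  # what the team counted as reach
    domain: str            # which domain weight applied
    domain_rationale: str  # why that weight existed
    confidence_assumptions: list[str] = field(default_factory=list)  # what raised or lowered confidence
    revisit_triggers: list[str] = field(default_factory=list)        # evidence that would change the score

record = PrioritisationRecord(
    item="DPA-handling workflow",
    reach_definition="Teams touching agreement approvals each quarter, not raw seat count",
    domain="compliance_security",
    domain_rationale="Cost of a missed or broken approval is disproportionate to user count",
    confidence_assumptions=["Effort estimate came from a spike, not a full design"],
    revisit_triggers=["Approval volume drops after the upstream process change"],
)
```

The exact shape matters less than the habit: the record travels with the work, so the next planning cycle starts from the reasoning instead of re-deriving it.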
Yes, you can even use the same instinct at home
The same instinct can help outside work, especially when you are choosing between something visibly annoying and something quietly risky.
If a household or homelab task affects very few people but creates outsized pain when it breaks, the raw “how many people does this touch?” logic is not enough there either. But I would not force a formal score onto every domestic decision. If the trade-off is obvious, skip the framework. If the quiet-risk task keeps losing to the visible annoyance, a lightweight version of the same conversation can help.
That is the line I care about: use the framework to surface trade-offs, not to create bureaucracy.
When plain RICE is enough
Standard RICE is still fine when:
- the audience is large enough that reach genuinely differentiates
- the work shares roughly the same risk profile
- the domain language is already clear
- the cost of being wrong is mostly reversible
And sometimes no scoring model is worth the overhead at all. If the answer is already obvious, the framework is just theater.
I am not anti-RICE. I am anti pretending that a clean score has already resolved a messy domain argument.
On internal platforms, the mistake is rarely that a team used RICE at all. It is using raw RICE before the team agrees on what the work means, what kind of risk it carries, and how it will remember the call later.
If your internal roadmap keeps overvaluing visible nuisance work and undervaluing quiet risk, the issue is usually not that the team needs better decimals. It usually needs a better conversation. Get in touch.
Related thinking
- Domain Bugs Cost More Than Code Bugs - the DDD discipline I use to get the nouns straight before anyone starts scoring
- Documentation Is a Product Surface - the KCS habit that stops prioritisation rationale from disappearing into Slack
- The Loudest Request Is Rarely the Most Important - how I decide which incoming signals are worth putting into a scoring conversation in the first place
- Your AI Demo Is Not Production Ready - another example of low-reach, high-consequence product work that raw reach math can badly misread
Further Reading
- Intercom on RICE - the original RICE framework
- Introducing DRICE: A modern prioritization framework - the DRICE framing by Darius C and Alexey Komissarouk that shaped how I use the term
- Martin Fowler - Domain-Driven Design - concise grounding on why domain language matters
- KCS Principles and Core Concepts - the knowledge-management loop behind reusable decision rationale