The problem with RICE on internal platforms is not that it is a bad framework. It is that teams ask it to settle arguments it was never designed to settle.
At Zendesk, I ran RICE across roughly 40 compliance-adjacent ideas and watched a critical DPA (data processing agreement) handling workflow land in the middle of the pack. The math looked respectable. Reach was decent. Impact looked moderate. Effort was not outrageous. If I had treated the score as truth, we would have under-prioritised a workflow where the cost of being wrong was far higher than the user count suggested.
That is the real failure mode. Not “RICE is useless.” More like: raw RICE gets dangerously persuasive when the user base is small, the risk profile is uneven, and the work crosses multiple internal contexts.
I still use RICE. I just use it for the job it is actually good at.
RICE is a conversation tool, not a verdict
At its best, RICE gives product, engineering, compliance, support, and operations a shared language for trade-offs:
- Reach asks who actually feels the problem
- Impact asks what changes if we solve it
- Confidence asks what evidence we really have
- Effort asks what the organization has to spend to get the outcome
That is valuable on internal platforms because the first problem is rarely arithmetic. It is usually translation. Product is talking about adoption. Engineering is talking about risk. Security is talking about blast radius. Operations is talking about manual work. RICE can act like a Rosetta stone between those perspectives because it forces the assumptions into the open.
That is why I do not think internal teams should throw RICE away. I think they should stop pretending the score is the whole decision.
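For reference, the arithmetic behind the score is simple, which is part of why it sounds so authoritative. A minimal sketch of standard RICE, assuming the usual Intercom-style formula (reach times impact times confidence, divided by effort); the item names and numbers are illustrative, not from any real backlog:

```python
def rice(reach: float, impact: float, confidence: float, effort: float) -> float:
    """Standard RICE score: (reach * impact * confidence) / effort."""
    return (reach * impact * confidence) / effort

# Illustrative numbers only: a broad, visible annoyance vs. a narrow, risky workflow.
print(f"{rice(reach=300, impact=0.5, confidence=0.8, effort=1):.1f}")  # notification noise cleanup: 120.0
print(f"{rice(reach=80, impact=2, confidence=0.7, effort=1):.1f}")     # DPA-handling workflow: 112.0
```

On a small internal user base, the visible annoyance edges out the risky workflow on raw reach alone, which is exactly the mid-pack problem described above.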
Where raw RICE starts lying
Consumer teams get more out of standard RICE because reach can genuinely differentiate between opportunities. On internal platforms, the math gets flatter and the consequences get weirder.
The same pattern shows up over and over:
- the user base is small enough that almost everything touches “a lot” or “not many”
- some work is low-frequency but high-consequence
- the roadmap compares unlike things: workflow integrity, compliance, permissions, developer experience, and feature requests all in one pile
- the real cost of delay sits in policy, trust, or downstream operational drag rather than visible adoption numbers
That is how you end up over-ranking the broad annoyance everyone notices and under-ranking the narrow workflow that can create audit pain, broken approvals, or manual work for months.
| Layer | Where it helps | How it plays out on internal platforms | How I use it |
|---|---|---|---|
| Standard RICE | Makes assumptions explicit across teams | Reach can dominate even when the real risk is hidden in the workflow | Use it as a discussion language, not as the final answer |
| Domain pass | Forces the team to name what type of risk is actually present | Prevents low-reach, high-consequence work from disappearing in the averages | Treat the domain as a weighting input before scoring |
| Decision record | Preserves why the score made sense this quarter | Stops the team from re-litigating the same exception every planning cycle | Write the rationale down and keep it close to the work |
Define the domain before you score it
This is where DDD matters more than most prioritisation discussions admit.
If product, engineering, legal, and support are using the same word to mean different things, the score is already fake precision. It does not matter how carefully you rate impact if nobody agrees what an “agreement,” “request,” “environment,” or “approval” actually is. The model looks tidy while the underlying workflow stays fuzzy.
So my rule is simple: if the nouns are unstable, scoring comes second.
That is the DDD step in practical terms. Get the bounded context and shared language straight enough that the team is scoring the same thing. Only then does the RICE conversation become useful.
The two-pass model I actually use
DRICE is not a term I invented. In a 2023 Lenny’s Newsletter piece, Darius C and Alexey Komissarouk used it for a more investigative extension of RICE built around a hypothesis, an impact estimate, an engineering estimate, and a return-on-engineering-investment view.
I am borrowing the label more narrowly here. On internal platforms, my version is a domain-weighted, two-pass RICE workflow. The extra pass is not growth experimentation. It is domain risk.
Pass 1: Domain risk. Before anyone touches the score, I ask what kind of work this actually is. Does it protect compliance, security, revenue, workflow integrity, developer flow, or general productivity? The goal is not to moralize about importance. It is to surface the cost of being wrong.
Pass 2: Standard RICE. Once the domain is visible, I use standard RICE to make the trade-offs explicit. That part is still useful. The conversation gets better because the hidden risk is no longer pretending to be neutral.
The weighting I have used looks roughly like this:
| Work type | Typical starting weight | Why it gets extra weight |
|---|---|---|
| Compliance / Security | around 1.5x | The downside of being wrong is usually disproportionate to the raw user count |
| Revenue protection | around 1.3x | Small workflow failures can still create meaningful churn or commercial drag |
| Developer experience | around 1.2x | The value is often multiplicative because it improves the contribution flow for other teams |
| Internal productivity | 1.0x baseline | Useful as the default when the work matters but carries no unusual hidden risk |
Those are not universal numbers. They reflected the context I was operating in, where the cost of a compliance or security miss was materially higher than the raw user count suggested. The point is not the multiplier. The point is to force the organization to say out loud what kind of loss it is actually trying to avoid.
That is what moved the DPA workflow from “mid-pack” to “obviously needs attention.” The score did not become smarter. The conversation became more honest.
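To make the two passes concrete, here is a minimal sketch that reuses the illustrative numbers from the earlier snippet and the starting weights from the table above; the domain labels and multipliers reflect one context and are not universal constants:

```python
def rice(reach: float, impact: float, confidence: float, effort: float) -> float:
    """Standard RICE score: (reach * impact * confidence) / effort."""
    return (reach * impact * confidence) / effort

# Pass 1: name the kind of work before anyone scores it. These weights are context-specific.
DOMAIN_WEIGHTS = {
    "compliance_security": 1.5,
    "revenue_protection": 1.3,
    "developer_experience": 1.2,
    "internal_productivity": 1.0,  # baseline
}

# Pass 2: standard RICE, multiplied by the weight the domain pass surfaced.
def weighted_rice(domain: str, reach: float, impact: float,
                  confidence: float, effort: float) -> float:
    return DOMAIN_WEIGHTS[domain] * rice(reach, impact, confidence, effort)

print(f"{weighted_rice('internal_productivity', reach=300, impact=0.5, confidence=0.8, effort=1):.1f}")  # 120.0
print(f"{weighted_rice('compliance_security', reach=80, impact=2, confidence=0.7, effort=1):.1f}")       # 168.0
```

Same inputs, same arithmetic. The only new information is an explicit statement of what kind of loss the work protects against, and that is what reorders the two items.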
Why engineering trusted the roadmap more
This was not just a PM neatness exercise.
Once the model accounted for domain risk, the conversation shifted from “why does compliance work keep jumping the queue?” to “which compliance or workflow-integrity items actually deserve the weight?” That is a much better argument. It is specific. It is legible. Engineering can challenge it without challenging the entire premise of prioritisation.
That is also why I keep calling RICE a conversation tool. It gives different contributors one shared frame for the trade-off. The domain pass makes sure the frame is not lying about what the work protects.
KCS is what stops this from resetting every quarter
The missing piece in a lot of scoring systems is memory.
If the team decides that a class of workflow deserves extra weight, but that reasoning lives only in a planning meeting or a Slack thread, the organization relearns the same lesson every quarter. That is where KCS earns its place for me.
I want the prioritisation rationale captured in the same motion as the decision:
- what the team counted as reach
- why the domain weight existed
- which assumptions lowered or raised confidence
- what evidence would change the score later
That way the framework becomes reusable knowledge instead of a performance in a planning meeting. KCS does not make the model more sophisticated. It makes the judgment portable.
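As a sketch of what that captured rationale can look like when it sits next to the score itself; the field names are illustrative, and in practice this could just as easily be a template in whatever knowledge base the team already runs:

```python
from dataclasses import dataclass, field

@dataclass
class PrioritisationRecord:
    """Decision rationale captured in the same motion as the score."""
    item: str
    reach_definition: str  # what the team counted as reach
    domain: str            # which domain weight applied
    domain_rationale: str  # why that weight existed
    confidence_assumptions: list[str] = field(default_factory=list)  # what raised or lowered confidence
    revisit_triggers: list[str] = field(default_factory=list)        # evidence that would change the score

record = PrioritisationRecord(
    item="DPA-handling workflow",
    reach_definition="Teams touching agreement approvals each quarter, not raw seat count",
    domain="compliance_security",
    domain_rationale="Cost of a missed or broken approval is disproportionate to user count",
    confidence_assumptions=["Effort estimate came from a spike, not a full design"],
    revisit_triggers=["Approval volume drops after the upstream process change"],
)
```

The exact shape matters less than the habit: the record travels with the work, so the next planning cycle starts from the reasoning instead of re-deriving it.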
Yes, you can even use the same instinct at home
The same instinct can help outside work, especially when you are choosing between something visibly annoying and something quietly risky.
If a household or homelab task affects very few people but creates outsized pain when it breaks, the raw “how many people does this touch?” logic is not enough there either. But I would not force a formal score onto every domestic decision. If the trade-off is obvious, skip the framework. If the quiet-risk task keeps losing to the visible annoyance, a lightweight version of the same conversation can help.
That is the line I care about: use the framework to surface trade-offs, not to create bureaucracy.
When plain RICE is enough
Standard RICE is still fine when:
- the audience is large enough that reach genuinely differentiates
- the work shares roughly the same risk profile
- the domain language is already clear
- the cost of being wrong is mostly reversible
And sometimes no scoring model is worth the overhead at all. If the answer is already obvious, the framework is just theater.
I am not anti-RICE. I am anti pretending that a clean score has already resolved a messy domain argument.
On internal platforms, the mistake is rarely that a team used RICE at all. It is using raw RICE before the team agrees on what the work means, what kind of risk it carries, and how it will remember the call later.
If your internal roadmap keeps overvaluing visible nuisance work and undervaluing quiet risk, the issue is usually not that the team needs better decimals. It usually needs a better conversation. Get in touch.
Related thinking
- Domain Bugs Cost More Than Code Bugs - the DDD discipline I use to get the nouns straight before anyone starts scoring
- Documentation Is a Product Surface - the KCS habit that stops prioritisation rationale from disappearing into Slack
- The Loudest Request Is Rarely the Most Important - how I decide which incoming signals are worth putting into a scoring conversation in the first place
- Your AI Demo Is Not Production Ready - another example of low-reach, high-consequence product work that raw reach math can badly misread
Further Reading
- Intercom on RICE - the original RICE framework
- Introducing DRICE: A modern prioritization framework - the DRICE framing by Darius C and Alexey Komissarouk that shaped how I use the term
- Martin Fowler - Domain-Driven Design - concise grounding on why domain language matters
- KCS Principles and Core Concepts - the knowledge-management loop behind reusable decision rationale