Goodhart's Law (when a measure becomes a target, it ceases to be a good measure) describes the most reliable failure mode in DEI measurement. The pattern is consistent across our customer base. A team commits to "increase representation of underrepresented groups in engineering to 30 percent by end of year." The number is hit, yet the composition of senior engineering remains unchanged. Hiring did the work; retention did not. The metric was satisfied, but the underlying problem was not addressed.
The teams we have seen produce durable DEI outcomes share a measurement habit: they track two metrics for every commitment — one that captures the headline outcome, and one that captures the underlying mechanism that, if neglected, would make the headline metric a Goodhart failure.
Six rules for choosing DEI metrics
The framework below is what we recommend to mid-market HR teams beginning a DEI measurement practice. It is not exhaustive, and individual companies will need to adapt it. The rules are based on our analysis of 47 mid-market deployments, and specifically on which programs produced sustained change and which produced headline movement that reversed within 18 months.
1. Measure stocks and flows separately
Representation is a stock: a point-in-time snapshot of who works at the company. Hiring, internal mobility, and exit are flows: what changed in a given period. Mixing them produces misleading metrics. A company can hold representation flat while hiring a healthy mix that exits are offsetting (good hiring, with a retention problem underneath) or while stagnant hiring is matched by stagnant attrition (a different problem). Always report stock and flow as separate views.
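The flat-representation case can be made concrete with a little arithmetic. The sketch below is illustrative only: the `PeriodCounts` schema, the function names, and the headcount figures are all invented for the example, and internal mobility is left out to keep it short.

```python
from dataclasses import dataclass


@dataclass
class PeriodCounts:
    """Headcount movement over one reporting period (hypothetical schema)."""
    start_headcount: int
    hires: int
    exits: int

    @property
    def end_headcount(self) -> int:
        # Internal mobility is ignored here to keep the sketch small.
        return self.start_headcount + self.hires - self.exits


def stock_share(group: PeriodCounts, everyone: PeriodCounts) -> float:
    """Stock view: the group's share of headcount at period end."""
    return group.end_headcount / everyone.end_headcount


def flow_shares(group: PeriodCounts, everyone: PeriodCounts) -> dict:
    """Flow view: the group's share of hires and of exits in the period."""
    return {
        "hire_share": group.hires / everyone.hires if everyone.hires else 0.0,
        "exit_share": group.exits / everyone.exits if everyone.exits else 0.0,
    }


# Two invented companies, each given as (group counts, company-wide counts).
churning = (PeriodCounts(200, 60, 60), PeriodCounts(1000, 200, 200))
static = (PeriodCounts(200, 10, 10), PeriodCounts(1000, 50, 50))

for name, (group, everyone) in {"churning": churning, "static": static}.items():
    print(name, round(stock_share(group, everyone), 2), flow_shares(group, everyone))
```

Both invented companies end the period at a 20 percent stock share. Only the flow view shows that one is hiring at a 30 percent share and losing people at the same rate, while the other is barely moving at all.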
2. Pair every representation metric with a retention metric
If you cannot answer the question "what is the 12-month retention rate for this group versus the company average," you do not yet have a DEI measurement practice. You have a hiring-funnel measurement practice. Retention is where most DEI programs succeed or fail invisibly.
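One way to answer that question, sketched under assumptions: retention is defined here as the share of people employed at the start of a 12-month window who are still employed at its end. The `Employee` record, the group labels, and the dates are hypothetical, and a real HRIS export will need its own mapping.

```python
from datetime import date
from typing import NamedTuple, Optional


class Employee(NamedTuple):
    group: str                 # hypothetical label, e.g. "underrepresented" or "other"
    hire_date: date
    exit_date: Optional[date]  # None means still employed


def retention_rate(employees, start, end, group=None):
    """Share of people employed on `start` who are still employed on `end`.

    Returns None when nobody in scope was employed on `start`.
    """
    cohort = [
        e for e in employees
        if e.hire_date <= start
        and (e.exit_date is None or e.exit_date > start)
        and (group is None or e.group == group)
    ]
    if not cohort:
        return None
    retained = [e for e in cohort if e.exit_date is None or e.exit_date > end]
    return len(retained) / len(cohort)


staff = [
    Employee("underrepresented", date(2023, 1, 1), None),
    Employee("underrepresented", date(2023, 1, 1), date(2024, 6, 1)),
    Employee("other", date(2022, 1, 1), None),
    Employee("other", date(2022, 1, 1), None),
    Employee("other", date(2024, 3, 1), None),  # hired after the window opened, so excluded
]
start, end = date(2024, 1, 1), date(2025, 1, 1)
group_rate = retention_rate(staff, start, end, group="underrepresented")
company_rate = retention_rate(staff, start, end)
print(f"group {group_rate:.0%} vs company {company_rate:.0%}")  # group 50% vs company 75%
```

Fixing the cohort at the start of the window keeps new hires from inflating the rate, which is the reason for choosing this definition over a simple end-of-year headcount comparison.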
3. Disaggregate to the level where action is taken
Company-wide DEI metrics are useful for board reporting but are not the level at which decisions get made. The action level for DEI is usually the function (engineering, sales, product) or even the department. Aggregating away from that level hides the variation that matters.
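A small sketch of what aggregation can hide, using invented rows and hypothetical labels: the company-wide share looks on target while one function is far below it.

```python
from collections import defaultdict


def share_by_function(rows, target_group):
    """rows: iterable of (function, group) pairs, one per employee."""
    totals = defaultdict(int)
    in_group = defaultdict(int)
    for function, group in rows:
        totals[function] += 1
        if group == target_group:
            in_group[function] += 1
    return {f: in_group[f] / totals[f] for f in totals}


# Invented roster: 10 people in engineering, 10 in sales.
rows = (
    [("engineering", "underrepresented")] * 1
    + [("engineering", "other")] * 9
    + [("sales", "underrepresented")] * 5
    + [("sales", "other")] * 5
)

company_share = sum(1 for _, g in rows if g == "underrepresented") / len(rows)
by_function = share_by_function(rows, "underrepresented")
print(round(company_share, 2), by_function)
```

In this made-up roster the company-wide figure is 30 percent, but engineering sits at 10 percent and sales at 50 percent, so the board-level number says nothing about where action is needed.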
4. Choose metrics that survive Goodhart
Before committing to a metric, ask: how would a team hit this metric without producing the underlying change you actually want? If the answer is "easily," the metric is fragile. Common Goodhart failures we have seen: hiring targets met by lowering the bar in one role family; promotion targets met by reclassifying existing levels; engagement targets met by survey gaming.
5. Track exit reasons with the same rigor as exit rates
Exit interviews are notoriously unreliable, but they are still the best available signal for why people leave. Code them consistently, aggregate them at the function level, and track the trend over time. The signal you are looking for is the rate at which underrepresented groups cite specific systemic reasons (career growth, manager fit, inclusion) versus generic ones (compensation, role fit).
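As a rough sketch of that signal, assuming exit interviews have already been coded into a fixed set of reason labels: the code below computes the share of exits citing a systemic reason for one group within one function. The record layout and the label spellings are hypothetical; the systemic and generic categories follow the examples above.

```python
# Reason codes treated as systemic, following the categories named in the text.
SYSTEMIC = {"career_growth", "manager_fit", "inclusion"}


def systemic_share(exits, function, group):
    """Share of coded exits for (function, group) that cite a systemic reason.

    Returns None when there are no exits in scope.
    """
    reasons = [
        e["reason"] for e in exits
        if e["function"] == function and e["group"] == group
    ]
    if not reasons:
        return None
    return sum(r in SYSTEMIC for r in reasons) / len(reasons)


# Invented exit records for one period.
exits = [
    {"function": "engineering", "group": "underrepresented", "reason": "career_growth"},
    {"function": "engineering", "group": "underrepresented", "reason": "inclusion"},
    {"function": "engineering", "group": "underrepresented", "reason": "compensation"},
    {"function": "engineering", "group": "other", "reason": "compensation"},
    {"function": "engineering", "group": "other", "reason": "role_fit"},
    {"function": "engineering", "group": "other", "reason": "manager_fit"},
    {"function": "engineering", "group": "other", "reason": "compensation"},
]

urg_share = systemic_share(exits, "engineering", "underrepresented")
other_share = systemic_share(exits, "engineering", "other")
print(urg_share, other_share)
```

Running the same calculation each period and comparing the two shares gives the trend. With counts this small the numbers are noisy, so the direction over several periods matters more than any single value.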
6. Report internally before you report externally
External DEI reports are downstream of internal measurement maturity. Companies that publish external reports before they have a working internal practice tend to overcommit and undermeasure. Build the internal practice first. The external report follows naturally from a working internal one.
What this looks like in practice
A working DEI measurement practice we audited in 2025 at a 1,800-employee tech company looked like this. Headline metric: representation of underrepresented groups in engineering, reported quarterly at the team and function level. Paired metric: 12-month retention of underrepresented groups in engineering versus the engineering average, reported quarterly with year-over-year trend. Mechanism metric: exit reason coding for engineering exits, with anomaly flagging when any group's exit reason mix shifts materially.
The three metrics together produced a complete picture. The headline showed whether the company was making progress on the visible outcome. The paired retention metric showed whether the hiring side of the equation was being undermined by the retention side. The mechanism metric flagged where to investigate when the picture changed.
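The audit description does not say how a material shift in exit reason mix was defined. One possible approach, offered as an assumption rather than as the method that company used, is to compare two periods' reason distributions with total variation distance and flag when it crosses a chosen threshold. The threshold, the minimum exit count, and the function names below are all illustrative.

```python
from collections import Counter


def reason_mix(reasons):
    """Turn a list of coded exit reasons into a {reason: share} distribution."""
    counts = Counter(reasons)
    total = sum(counts.values())
    return {r: c / total for r, c in counts.items()} if total else {}


def mix_shift(previous, current):
    """Total variation distance between two reason mixes (0 = identical, 1 = disjoint)."""
    p, q = reason_mix(previous), reason_mix(current)
    return 0.5 * sum(abs(p.get(r, 0.0) - q.get(r, 0.0)) for r in set(p) | set(q))


def flag_shift(previous, current, threshold=0.2, min_exits=5):
    """Flag a material shift; both the threshold and min_exits are assumed values."""
    if len(previous) < min_exits or len(current) < min_exits:
        return False  # too few exits to say anything
    return mix_shift(previous, current) >= threshold


# Invented coded reasons for one group in two consecutive periods.
last_year = ["compensation"] * 6 + ["career_growth"] * 2 + ["role_fit"] * 2
this_year = ["compensation"] * 3 + ["career_growth"] * 5 + ["inclusion"] * 2

shift = mix_shift(last_year, this_year)
flagged = flag_shift(last_year, this_year)
print(round(shift, 2), flagged)
```

The minimum-count guard is there because a group with two or three exits in a period will swing wildly on any distance measure. A flag is a prompt to investigate, not a finding.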
What we are still learning
The hardest open question in DEI measurement is how to track inclusion as distinct from representation. Representation is measurable. Inclusion (whether people from underrepresented groups feel they belong, can speak up, and are promoted on equivalent timelines) is harder. We are not yet confident in the survey-based instruments most often used to measure inclusion, and we suspect the practice will mature over the next few years.
Sources
- Goodhart, C. (1975). "Problems of monetary management: the UK experience." Papers in Monetary Economics, Vol. I, Reserve Bank of Australia.
- Williams, J. C., & Multhaup, M. (2018). "For Women and Minorities to Get Ahead, Managers Must Assign Work Fairly." Harvard Business Review, March 2018.
- Kestrel internal cohort analysis of DEI program outcomes at 14 mid-market customers, 2024–2025.
- McKinsey & Company. (2023). "Diversity Matters Even More: The Case for Holistic Impact."