What Tukey outlier deletion is

Tukey outlier deletion is a statistical step in how CMS sets Medicare Star Ratings cut points. Before CMS clusters contract scores into the 1 to 5 star bands for a measure, it removes the contracts whose scores are statistical outliers. It identifies those outliers using the Tukey outer-fence rule, a standard method that flags any score more than three interquartile ranges below the first quartile or above the third quartile, which is to say far outside the bulk of the distribution.
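A minimal sketch of the outer-fence rule, using the textbook Tukey definition (fences at three interquartile ranges beyond the quartiles). The function names, the sample scores, and the quartile interpolation method are assumptions for illustration; CMS's production calculation may compute quartiles differently.

```python
from statistics import quantiles

def tukey_outer_fences(scores, k=3.0):
    """Return (lower, upper) outer fences: Q1 - k*IQR and Q3 + k*IQR."""
    # Quartile method is an assumption; other methods shift the fences slightly.
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def drop_outliers(scores, k=3.0):
    """Keep only the scores that fall inside the outer fences."""
    lower, upper = tukey_outer_fences(scores, k)
    return [s for s in scores if lower <= s <= upper]

# Hypothetical contract scores for one measure (percent), not real CMS data.
scores = [12, 70, 72, 74, 75, 76, 78, 80, 81, 83, 85]
fences = tukey_outer_fences(scores)
kept = drop_outliers(scores)
print(fences, kept)
```

In this made-up sample the contract scoring 12 falls below the lower fence and is dropped before any clustering would run; the other ten scores are kept.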

The sequence matters. CMS removes the Tukey outliers first, then runs mean resampling and the hierarchical clustering algorithm on the remaining scores to draw the cut points. Pulling the outliers out before clustering stops a handful of extreme contracts from dragging the thresholds toward themselves.

When CMS started using Tukey

CMS applied Tukey outlier deletion to the Star Ratings cut points for the first time with the 2024 Star Ratings. It was finalized through rulemaking and layered on top of the cut point guardrails and mean resampling CMS had already adopted to make cut points more stable from year to year. Tukey applies to the non-CAHPS measures that use clustering, not to the patient experience measures collected through CAHPS, which use a separate methodology.

How Tukey changes cut points

The effect runs mainly in one direction. In most measures there are more low-performing outlier contracts than high-performing ones, so deleting the outliers before clustering takes more contracts out of the bottom of the distribution than the top. That pulls the cut points upward.

  • 2024: First Star Ratings year CMS applied Tukey outlier deletion
  • Up: General direction of cut points once outliers are removed
  • Non-CAHPS: Measures the Tukey outer-fence rule applies to

When Tukey took effect, many cut points rose, which made it harder for contracts to reach or hold 4 stars and above. The same method also makes cut points steadier across years, because a few extreme contracts can no longer swing where the thresholds land. The two effects are linked: the cut points are both higher and more stable.
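The direction of the effect can be illustrated with a toy distribution. This sketch does not reproduce CMS's mean resampling or hierarchical clustering; it only shows which side of a left-skewed score set loses contracts under the outer-fence rule. The scores are invented and the quartile method is an assumption.

```python
from statistics import mean, quantiles

def outer_fences(scores, k=3.0):
    """Tukey outer fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical scores with a long low tail, as the text describes for most measures.
scores = [20, 25, 68, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 82]

lower, upper = outer_fences(scores)
low_outliers = [s for s in scores if s < lower]
high_outliers = [s for s in scores if s > upper]
kept = [s for s in scores if lower <= s <= upper]

print("low outliers:", low_outliers)
print("high outliers:", high_outliers)
print("mean before:", round(mean(scores), 1), "mean after:", round(mean(kept), 1))
```

Only low-side contracts are removed here, so the scores that remain sit higher on average. Any thresholds drawn from the remaining scores would have less low-end data pulling them down, which is consistent with the upward shift described above.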

Why Tukey matters for a quality team

Tukey raised the bar without raising any single team's measure rate. A contract can post the same raw rate it posted last year and still lose a star if the cut point moved above it. That is the trap. The thresholds are not published until after the measurement year closes, so a team that targets last year's cut points is aiming at a line that has likely already moved up.
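The trap can be shown with a small worked example. The cut points and the contract rate below are hypothetical, and `stars_for` is an illustrative helper, not a CMS function; it assumes a higher-is-better measure where each cut point is the lowest rate that earns the next star.

```python
from bisect import bisect_right

def stars_for(rate, cut_points):
    """Map a rate to 1-5 stars.

    cut_points holds the ascending lower bounds for 2, 3, 4 and 5 stars.
    A rate equal to a cut point earns that star level.
    """
    return 1 + bisect_right(cut_points, rate)

# Hypothetical thresholds, for illustration only.
last_year_cuts = [60, 70, 80, 88]
this_year_cuts = [62, 73, 83, 90]

rate = 82  # the contract posts the same raw rate in both years

print(stars_for(rate, last_year_cuts))  # 4 stars against last year's cut points
print(stars_for(rate, this_year_cuts))  # 3 stars once the 4-star cut point moves to 83
```

The contract's performance did not change; only the 4-star threshold moved past it.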

The defensible response is to forecast where cut points are likely to land and manage to that forecast during the year, rather than reconciling against fixed historical thresholds after the year is over.

Common mistakes teams make with Tukey

  • Targeting last year's cut points. Tukey tends to push thresholds up, so matching last year's rate can still drop a star this year.
  • Assuming stability means easier. Tukey makes cut points more predictable, but predictable does not mean lower. The stable level is often a higher one.
  • Treating Tukey as a one-time event. It applies every Star Ratings year now, not just the year it was introduced.
  • Forecasting too late. Cut points are set retrospectively, so the work to clear a higher bar has to happen during the measurement year, before the thresholds are known.

How Pelica handles cut point risk

Pelica's Quality and Stars Copilot runs glide-path forecasting for HEDIS and Star Ratings measures, projecting where cut points are likely to land under the current methodology so teams close gaps before the thresholds are set rather than after. On the three triple-weighted Part D adherence measures, customers hold 96 percent medication adherence.

Related terms

Tukey is one step in how cut points are set. See Cut Points for the thresholds Tukey feeds into, and Triple-Weighted for why a star lost on an outcome measure costs three times as much in the overall rating. PDC is the adherence measure most exposed to a rising cut point.

Sources