Discoveries in Health Policy: CMS Posts New Idea for Colorectal Screening Biomarkers

Update - I walk through the old and new SENS-SPEC spaces, graphically, here.

###

On March 10, 2026, CMS released a new proposal for covering non-invasive CRC screening tests. Currently, CMS uses a sensitivity threshold of 74% (picking up about 3/4 of all colon cancers relative to colonoscopy) and a specificity threshold of 90% (sending about 1 in 10 patients to colonoscopy because of a false-positive (FP) biomarker result). CMS uses one NCD for DNA FIT testing (Cologuard) and another NCD for blood-based CRC screening.

In its opening of the NCD revision six months ago, CMS proposed to change the title to "Non-Invasive Biomarker Tests," suggesting they could merge coverage of blood-based and stool-based tests. That is what they are in fact attempting to do. CMS expects to issue its final version June 8, 2026.

However, it looks like CMS is making a cognitive error. Although they seem to understand there is a continuous tradeoff between SENS and SPEC (just by sliding the cut point up and down), they proposed to allow only two particular "bins" for coverage: option one, SENS ≥ 90 and SPEC ≥ 87, or else option two, SENS ≥ 79 and SPEC ≥ 90.

(Pick up 90% of cancers, while sending 13 of every 100 cancer-free patients to a false-positive (FP) colonoscopy; or pick up 79% of the cancers, while sending only 10 of every 100 to a FP colonoscopy.)

Clearly, you should be able to pick up 89% of cancers but send 11 patients to colonoscopy. But that would fail. You'd fail the 90% rule of option 1, and you'd fail the ≤10 rule of option 2.

The problem is, companies can get preliminary data, set predetermined cutpoints to meet one or the other bin, and then "miss" both bins, although actually having a more accurate test than the NCD requires. That is, the test performs well against a continuous quality frontier.

This is not hard to express algebraically, and CMS could use a simple formula by which anyone could tell in 30 seconds if a test meets the true (frontier-based) performance or not. This is also much less wasteful, since you don't have to discard super-costly trials that miss a "bin" while exceeding the implied accuracy frontier.

I'll let ChatGPT explain it.

###

Current state.

CMS currently covers non-invasive colorectal cancer (CRC) biomarker screening tests if they meet a performance threshold roughly equivalent to ≥74% sensitivity and ≥90% specificity for detecting colorectal cancer when compared with colonoscopy. In simple terms, the test must detect most cancers (few false negatives) while keeping the false-positive rate low enough that colonoscopies are not overused. Because any diagnostic test has a tunable threshold that shifts the balance between sensitivity (catching more cancers) and specificity (avoiding unnecessary colonoscopies), developers can move along a tradeoff curve: improving one usually worsens the other.

Nature of the tradeoff.

This relationship is the familiar receiver-operating characteristic (ROC) tradeoff. If a test is tuned to be very sensitive, it will capture more cancers but also send more patients to colonoscopy unnecessarily. If tuned to be very specific, it will avoid colonoscopies but miss more cancers. The acceptable balance depends on the clinical context of screening. For CRC screening, CMS is attempting to strike a balance between the harms of missed cancers (false negatives) and the harms and costs of unnecessary follow-up colonoscopies (false positives).
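To make the sliding-cutpoint idea concrete, here is a minimal Python sketch. Everything numeric in it is an assumption for illustration only: the two Gaussian score distributions, their means, and the cutoffs are invented and do not describe any real CRC test.

```python
from statistics import NormalDist

# Hypothetical biomarker score distributions (illustrative assumptions,
# not data from any real test).
NO_CANCER = NormalDist(mu=0.0, sigma=1.0)
CANCER = NormalDist(mu=2.2, sigma=1.0)


def operating_point(cutoff):
    """Return (sensitivity, specificity) when scores above `cutoff` are called positive."""
    sensitivity = 1.0 - CANCER.cdf(cutoff)   # cancers correctly flagged
    specificity = NO_CANCER.cdf(cutoff)      # cancer-free patients correctly cleared
    return sensitivity, specificity


# Raising the cutoff trades sensitivity for specificity.
for cutoff in (0.8, 1.0, 1.2, 1.4):
    sens, spec = operating_point(cutoff)
    print(f"cutoff={cutoff:.1f}  sensitivity={sens:.3f}  specificity={spec:.3f}")
```

Each cutoff is one operating point on the same underlying test. Nothing about the test itself changes between rows; only the threshold moves.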

CMS proposal: two “bins.” (Uh-oh)

In the current proposed framework, CMS offers two acceptable combinations of sensitivity and specificity. A test qualifies if it satisfies either:

Criterion 1: sensitivity ≥90% and specificity ≥87%
Criterion 2: specificity ≥90% and sensitivity ≥79%

The intent is to allow some flexibility: one test might emphasize sensitivity, another specificity. However, mathematically these rules simply define two rectangular regions on the ROC plane, each anchored at a single corner point. A test must land inside one of the two boxes to qualify.
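The two-bin rule can be written as a short check. This is a sketch of the proposed criteria as described above, not CMS's own wording; the function name `meets_bin_rule` is hypothetical, and inputs are in percent.

```python
def meets_bin_rule(sensitivity, specificity):
    """Return True if a test lands inside either proposed CMS bin (values in percent)."""
    criterion_1 = sensitivity >= 90 and specificity >= 87
    criterion_2 = specificity >= 90 and sensitivity >= 79
    return criterion_1 or criterion_2


print(meets_bin_rule(92, 88))  # True: inside Criterion 1
print(meets_bin_rule(80, 91))  # True: inside Criterion 2
print(meets_bin_rule(89, 89))  # False: misses both bins
```

The last case is the 89% sensitivity, 11-in-100 false-positive example: it misses the 90% sensitivity floor of Criterion 1 and the 90% specificity floor of Criterion 2.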

Why a continuous equivalence frontier makes more sense.

In reality, diagnostic performance lies along a continuous tradeoff frontier, not two discrete bins. The two CMS criteria implicitly define the endpoints of that frontier. The acceptable combinations between them can be described with a simple linear relationship—essentially the line connecting those two CMS points:

Specificity ≥ 1.115 - 0.273*(Sensitivity)

That's not an equation that should blow up anyone's brain.

This equation simply expresses the same tradeoff CMS already recognizes: as sensitivity increases, some decrease in specificity is acceptable, and vice versa. The mathematics involved is no more complicated than high-school algebra.
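For readers who want to see where the coefficients come from, here is the two-point derivation. The symbols Se and Sp (sensitivity and specificity as fractions) are shorthand introduced here, and the coefficients are rounded to three decimals.

```latex
% Corner points of the two CMS criteria: (Se, Sp) = (0.90, 0.87) and (0.79, 0.90)
\begin{align*}
m &= \frac{0.87 - 0.90}{0.90 - 0.79} = \frac{-0.03}{0.11} \approx -0.273 \\
b &= 0.90 - m \cdot 0.79 \approx 0.90 + 0.2155 \approx 1.115 \\
\mathrm{Sp} &\ge b + m \cdot \mathrm{Se} \approx 1.115 - 0.273 \cdot \mathrm{Se}
\end{align*}
% Check at Se = 0.89: required Sp = 0.90 - (0.03/0.11)(0.89 - 0.79) \approx 0.873,
% so a test with Sp = 0.89 clears the line.
```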

Using a continuous equivalence frontier would better reflect the underlying biology and statistics of screening tests. Developers designing a test typically target a particular operating point on the ROC curve, but the final result in a large, expensive pivotal trial may fall slightly above or below the intended sensitivity or specificity threshold due to sampling variation. Under a two-bin rule, a test could fail coverage because it lands between the bins—even if its overall diagnostic accuracy lies squarely between, and possibly above, the performance of tests that CMS approves. For example, a test with intermediate sensitivity and specificity might fall just outside both bins yet still lie well within the reasonable tradeoff frontier implied by the CMS endpoints.
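A frontier check is just as short as a bin check. The sketch below is one possible reading, not a CMS formula: it assumes the line is applied only between the two corner points, with floors of 79% sensitivity and 87% specificity, so that the covered region is the two bins plus the triangle between them. The function names are hypothetical and inputs are in percent.

```python
# Corner points of the two proposed CMS criteria, in percent.
SENS_1, SPEC_1 = 90.0, 87.0  # Criterion 1 corner
SENS_2, SPEC_2 = 79.0, 90.0  # Criterion 2 corner


def frontier_specificity(sensitivity):
    """Specificity required by the straight line through the two corners."""
    slope = (SPEC_1 - SPEC_2) / (SENS_1 - SENS_2)  # -3/11, about -0.273
    return SPEC_2 + slope * (sensitivity - SENS_2)


def meets_frontier(sensitivity, specificity):
    """Return True if the test is on or above the assumed equivalence frontier."""
    tolerance = 1e-9  # guards against floating-point noise at the corners
    return (
        sensitivity >= SENS_2      # assumed floor: lowest sensitivity CMS accepts
        and specificity >= SPEC_1  # assumed floor: lowest specificity CMS accepts
        and specificity >= frontier_specificity(sensitivity) - tolerance
    )


# 89% sensitivity with 89% specificity falls between the bins but clears the line.
print(meets_frontier(89, 89))
# 85% sensitivity with 88% specificity sits just under the line.
print(meets_frontier(85, 88))
```

Under these assumptions, anything that passes either bin also passes the frontier, so the frontier only adds the tests stranded between the two boxes.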

Adopting an equivalence frontier would therefore provide a fairer and more scientifically coherent standard. It would preserve CMS’s intended balance between missed cancers and unnecessary colonoscopies while avoiding arbitrary exclusions of tests whose overall performance is comparable—or even superior—to those that happen to fall inside one of the discrete bins.

###

Tuesday, March 10, 2026

CMS Posts New Idea for Colorectal Screening Biomarkers - But Should Use an Equivalence Frontier