Sunday, December 21, 2025

Two Fascinating AI Tasks: A Software Repair; A New Viewpoint on Point-of-Care Testing

I had two unusually interesting experiences with AI at work this week.  In one, ChatGPT diagnosed a quite arcane computer problem involving a commercial site, my laptop, and my router.

In the other, I asked ChatGPT to take a weird, unexpected position on the value of Point of Care Testing, and it came back with some interesting ideas.

### SOFTWARE DEBUGGING

Header: ChatGPT fixed a complicated multi-part software problem that had sidelined my work for half a day.

Nowadays we use a lot of cloud-based software (Google Drive, etc.).   I use a database and information management system called NOTION.  For a day or two it started to perform really badly - web pages refused to open, or took 60 seconds.  Even after simple pages loaded, a progress symbol would keep spinning, as if trying to download more.

My first guess was a (rare) server problem at NOTION, but their system was up.  ChatGPT advised me to try running my desktop off my iPhone wifi - and NOTION ran far better.   ChatGPT announced the problem was not NOTION, not my office computer, but my ... router.   After some back and forth, we disabled the 6 GHz band on my (2.4/5/6 GHz) router, which, ChatGPT says, also unloads the router software called "Wi-Fi 7."   NOTION now worked.

While this solution may sound simple when explained, getting there from a mysterious glitch was an impressive pathway.   I never could have guessed that my hard-wired (Ethernet) computer could be messed up by a Wi-Fi setting (6 GHz) I had never even heard of.

(Boring, but long dialog here.)


###  NEW IDEAS IN POCT

Header:  ChatGPT came up with "interesting new ideas" by applying a novel theory to a well-studied problem.

I was reading some recent review articles about Point of Care Testing (POCT).   For some reason, I had a stray thought about "Shannon Information Theory."  I barely know what that is, but I gave ChatGPT several recent review articles and asked whether it could work out new, interesting value implications by looking through the lens of Shannon information theory.   It came up with a number of genuinely interesting ideas.   To my eye, it was an example of AI doing some pretty interesting "thinking."

Note: I am "NOT" saying it came up with publishable, ground-breaking ideas, or that anybody would do a PhD thesis on this.  Rather, it was doing interesting or surprising work.

For more detail on what it actually said, see here.

Summary of its ideas here:

Conventional health economic and outcomes research (HEOR) analyses of point-of-care testing (POCT) emphasize turnaround time, workflow efficiency, and operational convenience. In the present project, a paired conceptual analysis applies Shannon information theory and modern decision-theoretic frameworks to re-examine POCT as an information-processing system rather than a logistics innovation. 

Using qualitative reasoning and simplified mathematical formulations, the essays model the diagnostic encounter as a communication channel linking latent disease states to clinical action. The analysis demonstrates that POCT fundamentally restructures this channel by reducing information loss (“erasures”), preserving clinical context, enabling feedback-driven sequential decision-making, and aligning diagnostic information with disease dynamics. Even when analytic sensitivity and specificity are identical to central laboratory testing, POCT increases effective information yield, improves physician–patient communication fidelity, and raises the expected utility of diagnostic information. At the system level, same-visit clarification reduces care-pathway entropy, improves triage accuracy, and stabilizes downstream resource allocation. 

This new information-theoretic framing provides a unifying explanation for observed POCT advantages reported in the [conventional] HEOR and implementation literature and offers a complementary theoretical foundation for evaluating POCT value beyond speed, cost, or assay performance alone.
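As an aside, the "erasure" point can be made concrete with a toy calculation (the numbers here are mine, not from the essays).  Treat the test as a channel from disease state to an acted-upon result, and treat results that never reach a same-visit decision (e.g., lost to follow-up after a send-out) as erasures.  Even with identical sensitivity and specificity, the channel with fewer erasures carries more information:

```python
# Toy sketch of the erasure argument; assumed numbers, not from the essays.
from math import log2

def mutual_information(prev, sens, spec, erasure):
    """I(disease; acted-upon result) for a binary test plus an erasure symbol."""
    joint = {}
    for d, p_d in ((1, prev), (0, 1 - prev)):
        p_pos = sens if d == 1 else 1 - spec
        joint[(d, "pos")] = p_d * p_pos * (1 - erasure)
        joint[(d, "neg")] = p_d * (1 - p_pos) * (1 - erasure)
        joint[(d, "erased")] = p_d * erasure      # erasure independent of disease
    p_r = {r: sum(v for (d2, r2), v in joint.items() if r2 == r)
           for r in ("pos", "neg", "erased")}
    p_d_marg = {d: sum(v for (d2, r2), v in joint.items() if d2 == d) for d in (0, 1)}
    return sum(v * log2(v / (p_d_marg[d] * p_r[r]))
               for (d, r), v in joint.items() if v > 0)

# Same assay either way: prevalence 10%, sensitivity 90%, specificity 95%.
print(mutual_information(0.10, 0.90, 0.95, erasure=0.20))  # send-out: ~0.21 bits
print(mutual_information(0.10, 0.90, 0.95, erasure=0.02))  # POCT:     ~0.26 bits
```

The information delivered to the point of decision scales with (1 - erasure rate), which is the formal version of the essays' claim that same-visit results raise effective information yield even when assay performance is unchanged.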

Brief Blog: CMS Unleashes Major Drug Pricing Proposals

If you want to bury something, the saying goes, release it on a Friday night.   If you really want to bury it, use the Friday night before Christmas week.

See open access coverage at Axios.

The proposals peg US drug prices to benchmark prices in other countries.  There are two separate rules.  One, GLOBAL BENCHMARK or "GLOBE," covers Part B injectables.  The other, GUARD, covers Part D drugs.

See GLOBE press release here.  Innovation website here.   Fed Reg here.  The proposal is 279 pages; see paginated version Dec. 23.

See GUARD press release here.  Innovation website here.  Fed Reg here.   The proposal is 281 pages long.

Here's the TLDR note:

The Trump administration proposed two Medicare drug pricing pilots tying U.S. payments to international benchmarks. A Part B model (2026–2031) and a Part D model (2027–2031) could save Medicare about $26 billion total and reduce seniors’ outpatient out-of-pocket costs by $6.2 billion. However, prescription drug costs for seniors could rise $3.6 billion longer term as manufacturers offset losses by raising other prices. Both models exclude drugs already under Medicare negotiation. Pharma groups strongly oppose the plans, warning of higher costs and reduced R&D.

Brief Blog: CMS Updates Price Transparency

CMS is proposing important updates to its Price Transparency rules.

See:

  • News at Fierce Healthcare here.
  • CMS press release here.
  • CMS FAQ here.
  • CMS proposed rule here.
    • See paginated Fed Reg version Dec 23.
  • The work follows Executive Order 14221, “Making America Healthy Again by Empowering Patients with Clear, Accurate, and Actionable Healthcare Pricing Information.” (2-25-2025)

###
AI CORNER
###

I asked ChatGPT to review the documents and also to let us know if there were any interesting emphases or omissions in the press release relative to the proposed rule.   Comments are due in 60 days (about February 21).
##

Below is a blog-ready review of the Transparency in Coverage Proposed Rule (CMS-9882-P), written with your policy-savvy readership in mind. I focus first on what the actual proposed rule does, then compare it with the press-release framing, and finally flag notable provisions that are largely absent or understated in the press messaging.


CMS Proposes Major Reset of Payer Price Transparency Rules (CMS-9882-P): What Actually Changes

On December 19, 2025, CMS, jointly with DOL and Treasury, released a sweeping proposed update to the 2020 Transparency in Coverage (TiC) rules. While publicly framed as a “burden-reducing” cleanup, the proposed rule is more accurately described as a structural redesign of how payer price data must be generated, organized, contextualized, and consumed.

The core policy diagnosis is blunt: TiC succeeded in forcing disclosure, but failed in usability. CMS identifies three systemic problems—file size, ambiguity, and misalignment with hospital price transparency—and the proposal is aimed squarely at fixing those defects.

1. From “Plan-Level” to “Network-Level” In-Network Rate Files

Perhaps the most consequential change is the requirement that In-Network Rate Files be reported at the provider-network level, not separately for each plan or product. CMS recognizes that dozens (or hundreds) of plans often share identical networks and rates, and current plan-level reporting is massively duplicative.

Why it matters:
This is a fundamental data-model shift, not a formatting tweak. It aligns payer disclosures with hospital price transparency conventions and implicitly treats provider networks—not plans—as the economic unit of price formation.
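To see why this is a data-model change, here is a hypothetical sketch (field names are mine, not the actual TiC schema): ten plans that share one network stop republishing identical rate lists and instead point to a single network-level file.

```python
# Illustrative only; not the CMS schema.

# Today: each plan file repeats the same negotiated rates.
plan_level_files = [
    {"plan_id": f"PLAN-{i:03d}",
     "rates": [{"code": "99213", "negotiated_rate": 83.25},
               {"code": "80053", "negotiated_rate": 10.10}]}
    for i in range(10)
]

# Proposed: rates live once at the network level; plans reference the network.
network_level_file = {
    "network_id": "NET-ACME-PPO",
    "rates": [{"code": "99213", "negotiated_rate": 83.25},
              {"code": "80053", "negotiated_rate": 10.10}],
}
plan_index = [{"plan_id": f"PLAN-{i:03d}", "network_id": "NET-ACME-PPO"}
              for i in range(10)]

before = sum(len(p["rates"]) for p in plan_level_files)
after = len(network_level_file["rates"])
print(f"rate rows published: {before} -> {after}")  # 20 -> 2
```

Multiply that deduplication across hundreds of plans and tens of thousands of billing codes, and the file-size claims become plausible.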

2. Mandatory Exclusion of “Unlikely Provider–Service” Combinations

Plans would be required to strip out negotiated rates for services a provider would never realistically perform, using the plan’s internal provider taxonomy already used for claims adjudication.

CMS goes further and requires public disclosure of:

  • The taxonomy mapping file itself

  • A Utilization File listing providers who actually billed for services over a defined lookback period

  • A Change-Log File showing what changed quarter-to-quarter

Why it matters:
This quietly moves TiC from a “dump everything” regime to a curated, defensible dataset—and exposes plans’ internal logic for determining who is paid for what.

3. Substantially More Out-of-Network (OON) Data

CMS explicitly acknowledges that Allowed Amount Files have been nearly useless due to the 20-claim threshold and short lookback window. The proposal would:

  • Aggregate OON data by insurance market type (large group, small group, individual, self-insured)

  • Lower the threshold from 20 to 11 claims

  • Extend the reporting window from 90 days to 6 months, with a 9-month lookback

Why it matters:
This is not cosmetic. CMS expects materially more OON pricing data to enter the public domain, enabling analysis of balance-billing exposure and market-type pricing behavior.
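To make the new disclosure floor concrete, here is a minimal sketch of the claim-count test as described above (my simplification: I collapse the 6-month window and 9-month lookback into a single window; the regulatory text will control):

```python
# Illustrative paraphrase of the proposed Allowed Amount threshold, not rule text.
from datetime import date, timedelta

def reportable(claim_dates, as_of, min_claims=11, window_days=182):
    """True if a provider-service pair has at least min_claims in the window."""
    cutoff = as_of - timedelta(days=window_days)
    return sum(1 for d in claim_dates if d >= cutoff) >= min_claims

# A provider billing one OON claim per week for 15 weeks:
claims = [date(2025, 12, 1) - timedelta(days=7 * k) for k in range(15)]
print(reportable(claims, as_of=date(2025, 12, 19)))                  # True  (new 11-claim floor)
print(reportable(claims, as_of=date(2025, 12, 19), min_claims=20))   # False (old 20-claim floor)
```

Lowering the floor from 20 to 11 claims pulls many mid-volume provider-service pairs, like the one above, into the public files for the first time.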

4. Quarterly (Not Monthly) Reporting

CMS proposes to move In-Network and Allowed Amount files from monthly to quarterly updates, while keeping prescription drug files monthly.

Why it matters:
This is framed as burden reduction, but it also implicitly acknowledges that rates are sticky, and that analytic value does not increase with frenetic refresh cycles.

5. New Contextual Data Elements

Each file would newly include:

  • Product type (HMO, PPO, etc.)

  • Enrollment counts

  • Common network names

This is CMS conceding that raw prices without context are analytically misleading.

6. “Findability” Requirements

Plans must post:

  • A plain-text (.txt) file in the website root identifying file locations and contacts

  • A standardized “Price Transparency” footer link on the homepage

This mirrors hospital transparency enforcement and is clearly designed to support automated crawling and third-party analytics.

7. Single File Format Under Consideration

CMS explicitly floats the idea of mandating a single file format (JSON or CSV) via future technical guidance.

Why it matters:
This is a signal that CMS sees standardization—not flexibility—as the next phase of transparency policy.

8. Phone-Based Price Transparency (No Surprises Act Alignment)

Plans would be required to provide the same price-comparison information by phone as online or in print, and CMS explicitly states that satisfying this requirement also satisfies No Surprises Act §114.

Why it matters:
This quietly expands operational obligations and training costs—and pulls grandfathered plans into certain disclosure duties.


How the Press Release Frames It (and What It Downplays)

The CMS press release emphasizes:

  • Smaller files

  • Easier consumer use

  • Empowerment and competition

  • Burden reduction

  • Alignment with Trump-era transparency goals

All accurate—but selectively so.

What’s Missing or Underplayed in the Press Messaging

Here are important features in the rule that are largely absent or muted in the press release:

  1. Disclosure of Internal Provider Taxonomy Logic
    The rule requires plans to expose how they internally decide which providers can be paid for which services. This is new transparency into payer decision-making, not just prices.

  2. Utilization Files as a De-Facto Validation Layer
    The Utilization File requirement effectively creates a ground-truth layer against negotiated rates—something the press release barely mentions, despite its analytical importance.

  3. Magnitude of New Compliance Costs
    The regulatory impact analysis estimates over $900 million in one-time compliance costs, driven largely by Utilization Files and Change-Logs—numbers absent from public messaging.

  4. Shift Toward Machine-First, Not Consumer-First, Transparency
    Despite consumer-friendly rhetoric, the rule is optimized for developers, researchers, employers, and enforcement, not casual patient shopping. It may also serve as input for AI and other advanced data analyses.

  5. Quiet Expansion of Federal Authority via No Surprises Act Crosswalk
    CMS explicitly states that satisfying TiC phone-based disclosure requirements satisfies NSA obligations—even for some grandfathered plans. This is a regulatory consolidation move not highlighted publicly.


Bottom Line for Policy Watchers

This proposed rule is not just a cleanup. It represents CMS’s tacit admission that first-generation transparency failed at usability, and that the second generation will be more structured, more enforceable, and more analyzable—even if that means exposing payer logic and imposing nontrivial new compliance costs.

For data users, this is a clear win.
For plans and TPAs, it is a quiet but significant operational escalation.
For CMS, it signals a pivot from “disclosure as ideology” to disclosure as infrastructure.

Comment deadline: February 21, 2026.

####

I work in clinical genomics; how do I summarize this in 30 seconds for my boss?

####

Two bullets for your lab industry boss (executive-level)

  • CMS is hardening price-transparency data into infrastructure, not optics.
    The TiC rewrite forces payers to publish cleaner, contextualized, machine-usable datasets (networks, utilization, enrollment), signaling that CMS increasingly expects pricing data to support analytics, benchmarking, and enforcement—not just consumer browsing.

  • This strengthens payer leverage and normalization pressure—eventually affecting advanced diagnostics pricing.
    Better out-of-network and network-level data improves payer ability to benchmark negotiated rates and utilization patterns, which over time increases pressure toward price rationalization even in complex service categories.


Why this matters (if your boss asks “so what?”)

1. It’s upstream of molecular pricing, not adjacent to it
The rule doesn’t touch labs directly—but it improves payer visibility into how prices behave across markets and networks. Once that infrastructure exists, it becomes easier for CMS and commercial payers to ask uncomfortable questions later about:

  • why certain services sit outside norms,

  • why OON utilization persists,

  • why network participation differs by market type.

That matters for molecular labs because complexity has historically protected pricing. CMS is slowly dismantling opacity as a structural defense.

2. It aligns with CMS’s broader “data-first” regulatory arc
This rule fits the same pattern you see in:

  • Hospital Price Transparency enforcement tightening

  • No Surprises Act operationalization

  • CMS interest in AI/analytics using administrative data

For diagnostics companies, the takeaway isn’t “price transparency is coming for us tomorrow,” but rather:

CMS is building the data rails first; policy follows later.

3. It subtly favors scale and sophistication
Smaller plans and TPAs will struggle more with utilization files, taxonomy logic, and change logs. Large payers and integrated systems will adapt faster—and those are the same actors that increasingly shape lab network inclusion and reimbursement terms.


What I would not overstate

  • ❌ No immediate impact on CPT pricing, PLA rates, or MolDx

  • ❌ No new disclosure requirements for labs

  • ❌ No short-term revenue effect

This is strategic weather, not tactical risk.

ChatGPT closed by offering follow-up analyses:

  • Draft comment-letter themes for payers, employers, or analytics firms

  • Compare this rule to hospital price transparency enforcement trajectories

  • Analyze implications for AI-driven price comparison tools and data licensing




Friday, December 19, 2025

AMA Posts Agenda for Palm Springs AMA CPT Meeting (Feb 5-7, 2026)

Header: AMA has posted the complete agenda for the February 5-7, 2026, AMA CPT editorial meeting in Palm Springs.

Note that pathology/laboratory codes were posted on an earlier comment timeline, which is now closed.  (You can still get the info, but it won't get to the subcommittees.)  For other topics, the deadline for comments is January 6.  You have to apply for AMA permission to comment at least a few days ahead of that.  See more at the first page of the agenda PDF:

https://www.ama-assn.org/system/files/cpt-panel-february-2026-agenda.pdf

CPT AI POLICY ON DISPLAY: "APPENDIX S" (tab 68)

AMA is working on changes to its AI policies all across the CPT.  One feature is updates to "Appendix S," which covers AI.

By December 19, AMA had already posted "option A" and "option B."  Appendix S is Agenda Item 68 on page 11 of the agenda PDF.

Finding Parts of Tab 68.  In the CPT app that you use to become a commenter, note that there are four columns for the row "Tab 68," Appendix S.  The first column's symbol gives you a Word document that explains the committee's rationale for the Appendix S revisions.   The second column has a symbol you tap to open a structured box-by-box comment portal.   The fourth column is an extra document containing Ballots A and B.  These are a first version and a modestly edited second version of the new, heavily edited Appendix S.

ROLE OF APPENDIX S

I mentioned that Item 68 gives you a Word document that explains the committee's rationale.  This is the standard CPT code change document with many questions.  It provides interesting insights into how many uses Appendix S gets when one is filling out, or reviewing, a CPT application.

In the agenda for Feb 5-7, I see references to AI about 11 times (informal count).

In the CPT application template (I used the Appendix S version), I see references to AI or to Appendix S twice - at III.2 and V (AI-specific section).

While Appendix S says AMA does not define AI, Code Change Section V classifies software as AI "if software has performed more than data processing."  (!)  Data processing is then defined in a parenthetical:  "(data processing includes helping to aggregate, organize/arrange, transmit, develop, or otherwise visually enhance the data.)"
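For fun, the Section V rule reads almost like executable logic.  Here's a minimal sketch with the task list paraphrased from the quoted parenthetical (the function and category names are mine, not AMA's):

```python
# Hypothetical sketch of the CPT Section V test as quoted above; illustrative only.

DATA_PROCESSING = {"aggregate", "organize", "arrange", "transmit",
                   "develop", "visually_enhance"}

def classified_as_ai(software_tasks: set[str]) -> bool:
    """Per the quoted rule: software is AI if it does more than data processing."""
    return bool(software_tasks - DATA_PROCESSING)

print(classified_as_ai({"aggregate", "transmit"}))         # False: just data processing
print(classified_as_ai({"aggregate", "classify_lesion"}))  # True: one task beyond the list
```

Note how much weight the enumerated list carries: anything not on it, however mundane, flips the classification to AI.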

FINAL WORD ON APPENDIX S

It's hard to evaluate Appendix S without knowing its purpose.  Yes, in part it defines "assistive, augmentative, autonomous" AI (except it doesn't define AI!).  But why?   And later, autonomous AI is broken into three parts.  There's not the faintest clue why, so it's hard to know whether that table is fit for purpose, there being no stated purpose.

I would prefer they include a page about why they have Appendix S, how Appendix S is "used" in practice at CPT, and maybe 10 problems with the current system, 10 points where Appendix S addresses a problem, and how Appendix S fixes the problem.  This kind of structured end-oriented thinking is wholly absent from Appendix S and its change documents.  

AI CODES

While there are nearly a dozen AI codes, nearly all in Category III, one is "AI algorithm generated synthetic nerve potentials" (0278T), which the applicant hopes to elevate to Category I ("97XX5").  Note that this is the applicant's hope; the CPT Panel may or may not buy the upgrade.

LAB TESTS

Of particular interest, the PLA committee has been declining to take AI-augmented lab tests, at least in digital pathology.  In Tab 44, an applicant asks for two codes for "algorithmic image analysis on cancer tissue," I think on H&E sections.  As to why two codes - I'm guessing perhaps one starts from a glass slide being imaged, and one starts from a DICOM file.  The proposals are for X548T and X549T.

Also with a nod to the PLA committee, which now seems to exclude WSI on IHC because IHC often includes physician work, there is an application for synucleinopathy detection on immunofluorescent slides - Tab 46, for X547T.   Stained skin slides predict the presence or absence of Parkinson's disease.

The above are of interest because there have been very, very few Cat III codes for lab tests up until now.  Apparently Cat III is accepting refugees from PLA.


##

MECHANICS & DATES

Note that if you comment, you are sent to a "CPT Smart App Portal," where you have to select "Interested Party" in a top-of-page field.


Dates: comments for most topics are due January 6, 2026 (apply for AMA commenter access at least a few days earlier); the meeting itself runs February 5-7, 2026, in Palm Springs.





More Big News: HHS Request for Information - Artificial Intelligence Reimbursement!

Header:  HHS Issues Important New RFI on AI Reimbursement - Connections to Digital Pathology.

It's been a big several weeks for news related to digital pathology and healthcare AI more generally.  I'll quickly link to several very recent blogs, then jump into today's big news: an AI RFI from HHS that explicitly includes reimbursement.

###

  • Is Digital Pathology Outrunning the AMA CPT Process?  December 13, 2025.
  • CMS Current Approaches to AI Reimbursement (MedPAC 2024) December 16, 2025.
  • HHS RFI - AI REIMBURSEMENT.   This blog today.

###

  • For an entry point, see the LinkedIn posting on December 19 from Steven Posnack, HHS.
    • He is the Principal Deputy Assistant Secretary for Technology Policy at the HHS Assistant Secretary for Technology Policy/Office of the National Coordinator for Health IT.  Posnack holds a master's in health policy from Hopkins and has been in government health IT policy since 2018.
  • See Steven Posnack's standalone article on December 19 at Health IT Buzz here.
  • See the HHS Press Release on December 19 here.
  • And see the RFI itself here - 7 pages in pre-publication format. 

I was excited to see a whole section under the title REIMBURSEMENT.

Reimbursement 

HHS’s payment policies and programs have massive effects on how health care is delivered in the United States, often times with unintended consequences. Hypothetically, if a payer is taking financial risk for the long-term health and health costs of an individual, that payer will have an inherent incentive to promote access to the highest-value interventions for patients. 

Under government designed and dictated fee-for-service regimes, however, coverage and reimbursement decisions are slow. Rarely does covering new innovations reduce net spending; and waste, fraud, and abuse is difficult to prevent, often times leading to massive spending bubbles on concentrated items or services that are not commensurate with the value of such products. 

Given the inherent flaws in legacy payment systems, we seek to ensure that the potential promises of AI innovations are not diminished through inertia and instead such payment systems are modernized to meet the needs of a changing healthcare system. We seek feedback on payment policy changes that ensure payers have the incentive and ability to promote access to high-value AI clinical interventions, foster competition among clinical care AI tool builders, and accelerate access to and affordability of AI tools for clinical care.

Wednesday, December 17, 2025

A Little More About WISeR: Medicare's Rapidly-Arriving Pre-Authorization by AI

 WISeR is the recently-announced, rapidly-arriving demo program for AI-Prior Auth in Medicare Part B.

None of the target services are labs, like genomics.  But that could happen, so this policy zone is worth watching.

See a current update, December 10, 2025, by Corner et al. at the Sheppard Mullin law firm.   Find it at National Law Review:

https://natlawreview.com/article/new-wiser-model-aims-leverage-ai-technology-reduce-costs-and-inefficiencies




Rakha et al. 2026: New International Consensus Criteria for HER2 Low & Ultralow

For several years, there's been steady news about using drugs or drug combinations in ultra-low HER2 categories.

See a new consensus article by Rakha et al., "International Expert Consensus Recommendations for HER2 Reporting in Breast Cancer: Focus on HER2-LOW and ULTRALOW Categories."

It's open-access at Modern Pathology.

https://www.modernpathology.org/action/showPdf?pii=S0893-3952%2825%2900223-6

The provenance is:

On behalf of the UK National Coordinating Committee of Breast Pathology, the Association of Breast Pathology, the European Working Group for Breast Screening Pathology and the International Society of Breast Pathology.

The abstract states in part,

  • Updated clinical management guidelines now, therefore, incorporate the spectrum of HER2 results into treatment selection algorithms in the metastatic setting. 
  • To align histopathologic practice with these developments, the College of American Pathologists has issued a new biomarker-reporting template that recommends explicit distinction between IHC 0/absent membrane staining and IHC 0+/with membrane staining. 
    • Key concerns among pathologists include assay variability, scoring reproducibility, and quality assurance standards for accurately detecting such low levels of HER2 expression. 
  • This manuscript provides expert consensus, evidence-based practical recommendations for identifying and reporting tumors with HER2-low and HER2-ultralow expression. 
  • We emphasize standardized testing protocols, validated assays, robust internal and external controls, and focused training for pathologists. A universal structured pathology report is proposed to highlight the accurate distinction between IHC 0 (null), IHC 0+ (ultralow), and HER2-low expressions.

Tuesday, December 16, 2025

MolDx on the Podcast Circuit: Gabriel Bien-Willner @ DIASORIN

The DIASORIN podcast, "Beyond the Result," brings us an interview with Dr. Gabriel Bien-Willner.  It's titled "The Payer Perspective: Driving Diagnostic Stewardship and Smarter Coverage Policies."

Find it at YouTube, here:

https://www.youtube.com/watch?v=KfXZAd9N5qc

[AI-generated illustration]

Here's an AI summary of the 40-minute interview.

###

MolDx’s Dr. Gabriel Bien-Willner on
Building Smarter Medicare Coverage for Molecular Diagnostics

In a recent episode of Beyond the Result, DiaSorin’s Luminex division podcast, host Angelo Rago interviewed Dr. Gabriel Bien-Willner, Chief Medical Officer of MolDx, the Medicare program that has become a central force in U.S. molecular diagnostics coverage. The conversation offered a rare inside look at how MolDx has reshaped reimbursement policy to better align innovation, evidence, and diagnostic stewardship.

Dr. Bien-Willner explained that MolDx was launched in 2012 by Palmetto GBA, a Medicare Administrative Contractor, to address a fundamental problem: traditional CPT coding could not adequately describe what modern molecular tests actually do. In response, MolDx introduced a test registry and Z-codes, requiring laboratories to precisely define each test—its analytes, methods, intended use, and clinical context—before Medicare reimbursement. This infrastructure allows payers to understand not just that a test was billed, but what was tested and why.

Over time, MolDx has expanded from human DNA and RNA testing to a broader range of molecular diagnostics and is now used across 28 states, with multiple Medicare contractors adopting its shared policies and tools. Bien-Willner emphasized that MolDx’s approach is not about limiting innovation, but about ensuring that new tests demonstrate they are “reasonable and necessary”—Medicare’s statutory standard—through credible evidence.

A major theme of the discussion was diagnostic stewardship: using the right test for the right patient at the right time. Bien-Willner described how MolDx policies have evolved away from rigid, test-specific coverage toward foundational, future-proof frameworks. These policies define evidence thresholds rather than naming specific products, allowing coverage to expand naturally as science matures. He cited next-generation sequencing in oncology and minimal residual disease testing as examples where MolDx policies have successfully accommodated rapid innovation without repeated rewrites.

Addressing concerns that reimbursement can stifle adoption, Bien-Willner argued the opposite: well-designed policy enables innovation by separating high-value diagnostics from waste and abuse. MolDx, he said, aims to act not as a blunt instrument, but as a scalpel, protecting the Medicare trust fund while ensuring that effective, evidence-based diagnostics can thrive.

For laboratories and industry, his advice was clear: engage payers as partners, understand their evidentiary standards, and recognize that sustainable innovation depends as much on policy design as on scientific breakthroughs.

SIDEBAR

5 Interesting, Slightly Surprising Takeaways

  1. MolDx exists partly because CPT is  [mathematically] “degenerate.” Bien-Willner’s blunt point: for molecular testing, a CPT code often can’t tell a payer what was actually tested—so the claim can’t be meaningfully adjudicated without extra structure.

  2. The “Z-code” is basically Medicare’s “tell me exactly what this test is” tag. It’s not just paperwork—it’s how MolDx ties coverage decisions to a specific lab’s specific LDT, rather than to a broad, ambiguous billing code.

  3. He reframed diagnostic stewardship as “separation of powers.” Providers treat; payers pay. In his analogy, the payer is a built-in “sanity check” like the friend who tells you the dealership’s $4,000 repair might be overkill.

  4. MolDx changed its philosophy on panels—more permissive, but also more logical. He described moving away from “if any analyte isn’t necessary, the whole panel fails” toward “a panel should include everything reasonably needed—avoid partial testing + stacking.”

  5. He’s trying to write coverage policy like software. His “foundational policy” approach uses conditional logic (“if X and Y, and one of A/B/C…”) to make policies future-proof, so new tests can qualify as evidence evolves without rewriting the whole LCD every year.

MedPAC 2024: How Medicare Thinks About AI Reimbursement

We have more and more emphasis in genomics on AI and bioinformatics, so the way CMS thinks about software is relevant.

A current article by Scott Gottlieb about healthcare and AI contains a link back to a 2024 chapter by MedPAC on how CMS thinks about software and AI.

Find it here; it's an up-to-date 32-page PDF.

https://www.medpac.gov/wp-content/uploads/2024/06/Jun24_Ch4_MedPAC_Report_To_Congress_SEC.pdf



From ESMO: Requirements for AI-based Biomarkers in Oncology

At LinkedIn, Josie Hayes flagged an important new consensus article from the European Society for Medical Oncology.   Her note is here.

Hayes writes:

What if you could screen 100,000 patients for a low prevalence biomarker with histology slides, then confirm only the positives with molecular tests?

That's the promise of Class B AI biomarkers—and ESMO's new framework just gave us the roadmap to get there.

Class B biomarkers use AI as an indirect measure of known biomarkers through alternative methods. 

Example: Analyzing H&E slides to predict MSI status using AI, then confirming positives with molecular testing.

Tempus is already doing this with their prostate MSI screen (p-MSI) - AI on pathology images flags likely MSI-high cases before running expensive sequencing.

The economics are stunning: 

→ Screen broadly at low cost 

→ Confirm selectively with precision tests 

→ Deploy globally without breaking the budget

She points us to a new article, Aldea et al., in Annals of Oncology:

https://www.annalsofoncology.org/article/S0923-7534(25)06267-2/fulltext
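As an aside, the two-stage economics Hayes describes can be made concrete with a toy calculation (all numbers invented; not from Hayes or the paper):

```python
# Toy "screen with AI, confirm with molecular" economics; assumed numbers.
n = 100_000                      # patients screened
prevalence = 0.03                # low-prevalence biomarker
ai_sens, ai_spec = 0.95, 0.90    # assumed AI screen performance
cost_ai, cost_molecular = 10, 1_000   # assumed dollars per test

positives = n * prevalence
flagged = positives * ai_sens + (n - positives) * (1 - ai_spec)

print(f"flagged for confirmation: {flagged:,.0f} of {n:,}")
print(f"molecular testing for all: ${n * cost_molecular:,.0f}")
print(f"AI screen, then confirm:   ${n * cost_ai + flagged * cost_molecular:,.0f}")
# ~12,550 confirmations; ~$13.6M vs ~$100M - with ~5% of true positives missed.
```

The missed-positives line is the catch, and it is exactly why the ESMO framework discussed below treats Class B tools as screens, not replacements.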

###

AI CORNER

###

ChatGPT 5 discusses Aldea et al. at length.   Then ChatGPT compares this paper, a framework for AI biomarkers, with the GigaTIME paper in Cell this month (Valanarasu et al.), which reports actual data with a particular technology.

###

What the paper is and why it matters

Aldea et al. present EBAI (ESMO Basic Requirements for AI-based Biomarkers in Oncology), a consensus framework intended to close the widening gap between AI biomarker development and routine clinical adoption. The authors’ starting premise is pragmatic: regulatory authorization alone does not resolve the issues that actually determine whether an AI biomarker is used in practice. These include generalisability across sites, calibration stability, integration into clinical workflows, accountability when models err, and downstream implications for reimbursement and clinical trust.

The paper arrives at a moment when oncology faces a paradox. AI biomarkers are proliferating rapidly, often with striking performance metrics, yet clinical adoption remains cautious. Aldea et al. argue that this hesitation reflects not conservatism but category error: AI biomarkers are being treated as a single class when, in fact, they differ fundamentally in risk profile, evidentiary burden, and intended role in care. EBAI is meant to provide a shared vocabulary that aligns technical development with clinical expectations.

The framework was developed using a modified Delphi process involving 37 experts across oncology, pathology, radiology, biostatistics, ethics, regulation, and patient advocacy. Consensus was reached over four voting rounds, emphasizing the paper’s role as a field-level alignment document rather than a single-author position.


The central contribution: a use-based classification of AI biomarkers

The paper’s most important innovation is its three-class taxonomy (A, B, C), which explicitly links what an AI system does to the level and type of evidence required for its use. This classification reframes validation as context-dependent rather than universal.

Class A systems automate or standardize measurement of an already accepted biomarker using the same data modality that a human would interpret. These tools are conceptually closest to traditional pathology automation and therefore carry the lowest incremental risk. Because the output is directly auditable and maps onto existing clinical practice, the central validation question is concordance rather than discovery. Examples include automated PD-L1 or HER2 scoring on immunohistochemistry slides, tumor-infiltrating lymphocyte quantification, or residual tumor burden estimation following neoadjuvant therapy.

For Class A systems, Aldea et al. emphasize analytical validation and agreement with expert readers. Replacement of human scoring is considered reasonable when AI error rates fall within known inter-observer variability, rather than demanding unattainable perfection.

Class B systems represent a more disruptive—and economically powerful—category. These models predict a known biomarker using a different input modality, most commonly using H&E histology to infer molecular or transcriptomic features. Crucially, the intended use is usually pre-screening or triage, not full replacement. This is the category highlighted by Josie Hayes: AI can screen very large populations at low marginal cost, reserving expensive molecular testing for those most likely to benefit.

The paper draws a sharp conceptual line here. Using AI to enrich then confirm is treated as a fundamentally different—and lower-risk—proposition than replacing molecular testing outright. Validation expectations reflect this distinction. Analytical validation against a gold-standard reference test is mandatory, and high-quality real-world or retrospective trial data are acceptable. Many experts favor additional retrospective clinical validation, particularly if AI output could influence treatment decisions. Prospective “silent trials,” in which AI runs in the workflow without affecting care, are discussed as a trust-building step but are not universally required.

A key limitation is stated explicitly: when therapy selection depends on mutation subtype rather than gene-level status, current image-based predictors often lack sufficient granularity. In such cases, Class B systems should remain screening tools rather than aspirational replacements.

Class C systems are the most conceptually ambitious. These models derive novel biomarkers directly from clinical outcomes rather than predicting existing markers. The authors divide Class C into prognostic (C1) and predictive (C2) systems. Prognostic tools estimate outcomes such as recurrence or survival independent of treatment, while predictive tools aim to identify differential benefit from one therapy versus another.

For predictive Class C systems, the evidentiary bar is especially high. Demonstrating treatment interaction requires comparison across treatment arms or against an established predictive biomarker. The paper points to examples that have undergone randomized trial validation and have begun to enter clinical guidelines, underscoring that such adoption is possible—but demanding.


What ESMO says must be demonstrated

Across all classes, the framework converges on three essential requirements that cannot be waived. These are best understood not as technical formalities but as safeguards against misplaced confidence.

First, ground truth must be clearly defined. This includes how labels were generated, who performed them, whether readers were blinded, and how disagreements were adjudicated. Second, performance must be evaluated in a way that matches clinical intent, rather than relying on generic accuracy metrics. Third, generalisability must be demonstrated, with stability shown across institutions, scanners, laboratory protocols, and patient populations.

Beyond these core elements, the paper strongly encourages fairness auditing within validated populations and practical explainability checks. Importantly, explainability is framed not as philosophical transparency but as a diagnostic tool to detect shortcut learning or spurious correlations, using techniques such as occlusion testing or confounder stress-tests.


Moving beyond headline metrics

Aldea et al. are explicit in discouraging the field’s fixation on single summary statistics such as AUC. Instead, they advocate multi-dimensional performance reporting aligned to clinical use. This includes discrimination metrics, calibration assessment, incremental value over existing standards, and explicit evaluation of clinical utility.

Calibration receives particular emphasis. A well-calibrated model that clinicians can trust at specific decision thresholds is treated as more valuable than a marginally higher AUC with unstable probabilities. Decision curve analysis is highlighted as a practical way to connect model performance to real clinical trade-offs.
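For readers new to decision curve analysis, here is a minimal sketch of the net-benefit quantity it rests on, run on synthetic data (not from the paper):

```python
# Net benefit at threshold pt: TP/n - FP/n * pt/(1 - pt). Synthetic data only.
import numpy as np

def net_benefit(y_true, y_prob, pt):
    """Benefit of treating everyone the model flags at probability threshold pt."""
    n = len(y_true)
    treat = y_prob >= pt
    tp = np.sum(treat & (y_true == 1))
    fp = np.sum(treat & (y_true == 0))
    return tp / n - (fp / n) * pt / (1 - pt)

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 500)                             # toy outcomes
p = np.clip(y * 0.6 + rng.normal(0.2, 0.2, 500), 0, 1)  # toy risk scores

for pt in (0.1, 0.2, 0.3):
    print(f"pt={pt:.1f}  net benefit={net_benefit(y, p, pt):+.3f}")
```

The threshold pt encodes the clinical trade-off (how many false positives one accepts per true positive), which is why a well-calibrated model at the clinically relevant pt can beat a higher-AUC model with unstable probabilities.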

The authors also stress the importance of a priori sample size justification and independent validation cohorts. Models should not be validated on data that overlap—directly or indirectly—with training sources, and performance claims should be scoped to the populations actually studied.


Generalisability as a prerequisite, not an aspiration

One of the paper’s strongest messages is that AI biomarkers should not be casually “ported” across cancer types, specimen preparations, scanners, or institutions. Each such shift represents a new operating environment that requires evidence. Generalisability is treated as a first-class requirement, not a post-marketing hope.


Replacement versus pre-screening: an explicit risk calculus

Throughout the paper, intended use remains the organizing principle. For pre-screening applications, the relevant benchmark is whether human plus AI outperforms human judgment alone. For replacement, error rates must match or exceed the reliability of the existing standard of care, including accepted variability in molecular testing.

Notably, even when Class B systems approach molecular-test-level performance, many experts still favor confirmatory testing in the near term. This reflects sociotechnical realities—trust, liability, and workflow—rather than purely statistical concerns.


After adoption: monitoring is expected, not optional

EBAI treats post-deployment monitoring as part of validation, not an afterthought. Continuous performance surveillance, drift detection, and bias monitoring are recommended, with predefined remediation pathways. When retrospective validation is insufficient, limited prospective evaluation within real workflows is encouraged.


Ethics, regulation, and human oversight

While not a regulatory manual, the framework aligns closely with emerging AI governance principles. It emphasizes clinician oversight for high-impact decisions, transparency to patients when AI materially influences care, and careful alignment between validated populations and real-world use. Fairness is framed pragmatically as fitness for the intended population rather than abstract demographic parity.


Strategic interpretation in the context of gigaTIME and Hayes’ commentary

If gigaTIME represents the technological leap of foundation models applied to H&E pathology, Aldea et al. provide the governance and deployment logic. The paper formalizes Class B AI biomarkers as a legitimate and powerful category, particularly for population-scale screening and trial enrichment, while warning against prematurely claiming replacement.

The unifying message is not that AI biomarkers work, but that they must be classified, validated, deployed, and monitored in ways that match their intended role. This framing is what allows impressive AI performance to translate into clinically adoptable and economically scalable biomarker infrastructure.

###

ESMO PAPER (Aldea) vs CELL PAPER (GigaTime, Valanarasu)

You’re dealing with two fundamentally different genres of paper, serving complementary but non-overlapping purposes:


Aldea (ESMO): policy, framing, and clinical governance

The Aldea / ESMO EBAI paper is not trying to advance the science of AI models. Instead, it is doing something rarer and arguably harder: stabilizing the conceptual ground so that AI biomarkers can move from impressive demonstrations into clinical systems without constant category errors.

Its unit of analysis is intended use, not architecture or performance ceilings. The paper assumes that powerful models already exist (and will continue to improve) and asks:

  • What kind of AI biomarker is this, really?
  • What evidence is proportionate to the clinical risk it introduces?
  • When is screening acceptable, and when is replacement a bridge too far?
  • What does “validation” actually mean once you leave the lab?

In that sense, Aldea is closer to clinical doctrine, health policy, and systems engineering than to computer science. It is explicitly normative: it tells the field how to behave if it wants trust, adoption, and scale.


Valanarasu et al. (Cell): scientific discovery and technical proof

By contrast, Valanarasu et al. (gigaTIME) is a pure science research paper, published in Cell for exactly that reason. Its goal is to show that something previously thought infeasible is, in fact, possible.

Its core scientific claims are:

  • H&E morphology contains enough latent signal to reconstruct spatial proteomic patterns.
  • A multimodal, foundation-style model can learn a cross-modal translation from H&E to multiplex immunofluorescence.
  • Once that translation exists, you can generate virtual populations at unprecedented scale, enabling discoveries that were previously blocked by data scarcity.

The unit of analysis here is capability:

  • Can we do this at all?
  • Does it generalize?
  • What new biological insights fall out once we can?

This is not a clinical deployment paper, and it is not trying to be. There is no pretense that gigaTIME is a “biomarker” in the regulatory or ESMO sense. It is an enabling scientific instrument.


Why they are different — and why they belong together

Seen clearly, the papers are not in tension at all. They sit at different layers of the same stack:

  • gigaTIME (Cell) lives at the capability layer:
    What can foundation models extract from routine pathology that humans cannot?
  • Aldea / ESMO (Annals of Oncology) lives at the deployment layer:
    Once such capabilities exist, how do we classify, validate, and safely use them in medicine?

A useful way to say it bluntly:

gigaTIME expands the possibility space.
EBAI constrains the permission space.

That is exactly why your instinct to review them together — but not conflate them — is correct.


The key connective insight (without collapsing the categories)

What gigaTIME enables, Aldea helps discipline.

  • gigaTIME makes Class B–style economics (image-first, molecular-scale inference) plausible at population scale.
  • Aldea explains why those same tools should initially be framed as screening, enrichment, or discovery instruments, not instant replacements for molecular assays.
  • gigaTIME shows that H&E can be a gateway to multiplex biology.
  • Aldea explains when and how such gateways can be allowed to influence care.

Put differently:
Cell papers create new worlds; ESMO papers decide which doors you’re allowed to open, and under what supervision.

 


Pierre Arsene: LBX Stocks Raise $50B in Market Cap This Year

I haven't tried to replicate the math, but at LinkedIn, Pierre Arsene reports that liquid biopsy stocks generated $50B in market cap this year.  He's including companies whose market is only partially in liquid biopsy, but still.

His article is here:

https://www.linkedin.com/posts/pierrearsene_would-you-believe-that-liquid-biopsy-stocks-activity-7406596554528247808-nLe7/

His market basket is given as:

  • Exact Sciences
  • Natera
  • Tempus AI
  • Guardant Health
  • Adaptive Biotechnologies
  • Grail
  • GeneDx
  • BillionToOne
  • Caris Life Sciences
"Daniel G" made an interesting remark, quoting in part:

  • Over the same period, many liquid biopsy and precision diagnostics companies have either disappeared from public markets, been acquired at distressed valuations, or quietly failed to translate scientific promise into sustainable, reimbursed clinical businesses. [aka survivor bias]

  • Capital is not rewarding liquid biopsy per se — it is rewarding reimbursement visibility, guideline inclusion, operating scale, and disciplined execution. The winners built infrastructure, not just technology.




Sunday, December 14, 2025

Unprecedented: SENATE Wants to Hear Your Coding & Innovation Problems

Update.  Senator Cassidy has exchanged letters (and press releases) with the AMA on a number of coding topics - RUC, CPT membership, etc.  I used that in this blog.  It turns out this newly-opened portal is tuned to questions about AMA fees and code-use contracting.  However, they also provide an email for the project, which is coding@help.senate.gov.


My blogs here on October 29 and again on December 2 flagged little-known U.S. Senate investigations into the U.S. healthcare coding system.

Now the Senate has opened a website for concerns about the coding system and innovation.  I cut and paste from the Senate below.  The submission webpage gives a due date of January 16, 2026.

###

https://www.help.senate.gov/rep/newsroom/press/chair-cassidy-continues-ama-inquiry-seeks-stakeholder-feedback

12.09.2025

Chair Cassidy Continues AMA Inquiry, Seeks Stakeholder Feedback

WASHINGTON – U.S. Senator Bill Cassidy, M.D. (R-LA), chair of the Senate Health, Education, Labor, and Pensions (HELP) Committee, is seeking information from stakeholders regarding the American Medical Association’s (AMA) monopoly of Current Procedural Terminology (CPT®) codes and its impact on patients, providers, and health care costs. Cassidy is asking stakeholders with relevant experience and knowledge of CPT® coding contracts with the AMA to inform the Committee’s inquiry by responding to this questionnaire. [If any problem, start at the "Newsroom" link above.]



As chair of the HELP Committee, Cassidy is using all tools at his disposal to lower costs for American patients. Thus far, the AMA evaded questions and failed to cooperate with Cassidy’s inquiry. If the AMA does not respond in a fulsome and transparent manner by December 15, 2025, the Chairman is committed to finding answers by other means.

“The federal government mandated the use of CPT codes. This creates the potential for abuse in that if someone has to buy your product, you can charge them what you want,” said Dr. Cassidy. “There may be nothing wrong here, but we should get answers to make sure the CPT system is working for the American patient and for the American health care system.”

Saturday, December 13, 2025

Scott Gottlieb in JAMA: AI Can Solve Productivity Challenges

I enjoyed this article a lot, and it's very germane to current hot topics at CMS (and AMA).   Here's an AI picture, an AI summary, and a link to Gottlieb's open-access article.

###

https://jamanetwork.com/journals/jama-health-forum/fullarticle/2842893

[AI-generated illustration]


Scott Gottlieb, JAMA Health Forum (2025): 

How AI Will Help Solve Medicine’s Productivity Challenges

His core argument:
Health care appears “unproductive” not because medicine hasn’t improved, but because productivity is measured incorrectly—by counts of visits and procedures, not by health outcomes. This creates a classic case of Baumol’s cost disease, where human-intensive professions (medicine, education, performing arts) improve quality but fail to show output-per-hour gains, suppressing compensation growth and driving costs upward.

What AI actually changes (beyond hype):
Gottlieb’s key claim is that AI must move from augmentation to substitution. Merely making physicians “smarter” does not solve the productivity problem. True productivity gains require AI systems that safely replace discrete physician tasks—diagnostic interpretation, triage, routine decision-making—so that one clinician can effectively care for more patients per unit time.

Why regulation is the bottleneck:
FDA’s framework assumes static devices, not learning systems. Gottlieb argues for a regulatory model that allows controlled algorithmic drift, with predefined performance thresholds and re-review triggers. Without this, developers avoid FDA entirely by labeling tools as “research use only,” sharply limiting real-world deployment.

Why reimbursement is the bigger bottleneck:
CMS does not pay software vendors directly and ties physician payment to labor inputs. Ironically, AI that reduces physician labor can lower RVUs, triggering cuts to physician income.  Physicians must buy productivity-enhancing AI out of their own reimbursement, often with no financial upside.

Policy solutions proposed:

  • Bundle AI into episode payments

  • Expand OPPS pass-through mechanisms

  • Create new benefit categories (e.g., via the Health Tech Investment Act of 2025)

The unifying goal is to align AI adoption with physician compensation growth, finally breaking the cost-disease trap.

Bottom line:
If AI is allowed to replace physician effort (not just decorate it), medicine could finally follow the same productivity curve as other tech-enabled industries—improving outcomes, stabilizing costs, and restoring physician earnings growth.

##

##

Hidden as an italicized hotlink rather than a full bibliographic citation, Gottlieb cites MedPAC 2024, an up-to-date 32-page review of CMS issues with software pricing.


What the Hell is MAHA ELEVATE?

You've tried to at least recognize what the new programs ACCESS and TEMPO mean (at CMS and FDA, respectively - but they interact...).

Here's yet another big CMS demonstration/innovation program, MAHA ELEVATE.

I'll give a summary of Trey Rawles' article below.   If you want to start with Trey's whole article, here it is.

https://www.onhealthcare.tech/p/cms-just-opened-a-100m-door-for-lifestyle

##

And here's the cut-to-the-chase AI summary:

Trey Rawles argues that CMS’s new MAHA ELEVATE Model represents a quiet but consequential shift in Medicare policy that most investors will overlook. While the program’s $100 million budget—spread across up to 30 three-year cooperative agreements—is small relative to Medicare’s scale, its significance lies in what CMS is choosing to fund. For the first time, the CMS Innovation Center is explicitly testing functional and lifestyle medicine interventions—nutrition, physical activity, stress, sleep, and social connection—that are not currently covered by Original Medicare.

Rawles emphasizes that MAHA ELEVATE is not a coverage decision but an evidence-generation exercise designed to determine whether whole-person lifestyle interventions can improve outcomes and reduce costs for Medicare beneficiaries with chronic disease. Every proposal must include nutrition or physical activity, and three awards are reserved for dementia-focused interventions, signaling CMS’s priorities. Awards average roughly $1.1M per year, with CMS actively involved in program design, data collection, and quality oversight.

The strategic value extends far beyond the grant dollars. Winning an award provides CMS validation, access to Original Medicare beneficiaries, and—most importantly—a potential pathway to future Medicare coverage if strong cost and quality evidence emerges. Rawles argues that companies with documented scientific evidence, real-world outcomes data, HIPAA-ready infrastructure, and integrated, multi-modal interventions are best positioned to win, favoring mature digital health and lifestyle medicine platforms over early-stage wellness apps.

For investors, MAHA ELEVATE de-risks evidence generation for a historically under-reimbursed category and signals a broader CMS pivot toward prevention and lifestyle medicine as complements to conventional care.

Here's Fierce Healthcare.
Here's a time-travel article, back to May 2025, about what HHS wanted to achieve in chronic care innovation.  

Will FDA and Other Digital Pathology Advances Outrun AMA CPT Rules in 2026?

Header:  AMA CPT may soon face a crossroads for modernizing coding of digital pathology.

##

Update:  The U.S. Senate wants to hear your concerns about the coding system and blocked innovation - here.

##

AMA CPT has had limited means of coping with digital pathology and AI, and in some ways has even regressed.   There is a single Category I CPT code for making one immunostained slide and interpreting it with computer assistance (88361).

For several years, AMA also made about 10 codes involving whole slide imaging in the PLA series, but labs that have tried to replicate this have told me that the rules have changed.   If the biomarker requires immunohistochemistry, it may be judged to include "physician work" and therefore be ineligible for PLA coding.  (I worked with one lab whose service didn't involve any physician work, and it was rejected anyway for "requiring physician work.")  But if the service is based on conventional stains, like H&E, it may be judged as not having a bona fide biomarker, and therefore ineligible for PLA.   If some of the service is performed offsite - e.g., a cloud algorithm allowed by CAP distributed testing (here), or a separately prepared DICOM file - it may be judged a multi-lab, not single-lab, service and therefore ineligible for PLA.   If it involves "AI" (and NGS already involves massive amounts of bioinformatics) it may get shunted into another coding system, a sort of registry for SaMD.  It's like a game of Chutes and Ladders, where there are lots of chutes on every side and not a ladder in sight.
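To make the "chutes" concrete, here's a tongue-in-cheek sketch of the eligibility logic as I've experienced it; the rule names and reasons paraphrase this blog post, not any official AMA document:

```python
# Satirical sketch of the PLA "chutes"; paraphrased from this post, not AMA text.

def pla_eligible(test: dict) -> tuple[bool, str]:
    if test.get("uses_ihc"):
        return False, "judged to include physician work"
    if test.get("stain") == "H&E":
        return False, "judged to lack a bona fide biomarker"
    if test.get("offsite_step"):
        return False, "judged multi-lab, not single-lab"
    if test.get("uses_ai"):
        return False, "shunted toward a separate SaMD-style registry"
    return True, "eligible - no chute found"

print(pla_eligible({"stain": "H&E", "uses_ai": True}))   # first chute wins
print(pla_eligible({"stain": "trichrome"}))              # survives, for now
```

Note that for a modern AI-on-pathology test, nearly every branch is triggered by design features the market actually demands.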

##

Meanwhile, outside of AMA CPT policy, the world is moving fast.   Here are just four examples:

  1. Rakaee's report of deep-learning immunotherapy guidance from digital imaging of H&E, in JAMA Oncology.  Here.
  2. FDA authorizes a clinical trial biomarker (which may well become a companion diagnostic), used in NASH drug trials and based on software reading of trichrome and H&E sections.  See FDA announcement, here. See the project's webpage for associated documents here.  There, see particularly the FDA integrated review (PDF 52pp).   See discussion at Linked In here.
  3. See similarly Valanarasu et al., in Cell, on GigaTIME's sophisticated proteomic reproduction from AI-assisted readings of H&E slides.  Here.  See insights from two authors at LinkedIn, here and here. Also Hayes.
  4. Meanwhile, and also based on FDA drug trials, see the novel "QCS" AI-assisted immunomarker reading, which can do things that humans simply can't do.   It's a collaborative project of Roche, AstraZeneca, and FDA; entry point here.
###
AI CORNER
###
ChatGPT 5.2 reads the essays and articles above, and summarizes.
###

Is scientific reality outrunning CPT structures?