On February 4, 2025, Friends of Cancer Research held a half-day meeting in DC about diagnostics and regulatory innovation.
Find their summary here:
https://api.neonemails.com/emails/content/B60mE3Jx82Jht6EkTFcSukdCdM6R-8jeHaTXRcapjwo=
Find the YouTube stream here:
https://www.youtube.com/live/wFHolEBYx48
The FOCR summary notes:
- Discussions at the Friends public meeting, titled “Advancing the Future of Diagnostics and Regulatory Innovations,” reflect a pivotal moment at the intersection of artificial intelligence (AI) and regulatory frameworks for diagnostics, precision medicine, and healthcare. The three sessions emphasized the dual objectives of ensuring patient safety while fostering an environment conducive to innovation—a balance that is becoming increasingly critical in an era marked by rapid technological advancements.
- Key takeaways and meeting resources, including Session 1 presentation slides by Mark Stewart (Friends of Cancer Research) and Session 2 presentation slides by Megan Doyle (Eli Lilly & Company), are available below.
FOCR Meeting Highlights AI’s Role in Diagnostics and the Need for Standardization
Washington, D.C. | February 4, 2025 – The Friends of Cancer Research (FOCR) convened its first public meeting of the year, focusing on the future of diagnostic testing and the evolving role of artificial intelligence (AI) in pathology. The event brought together experts from academia, industry, and regulatory agencies to examine the challenges and opportunities of AI-driven diagnostics, particularly in the assessment of biomarkers like HER2 in breast cancer.
Ensuring Diagnostic Consistency in the AI Era
A major discussion centered on variability in diagnostic testing, particularly in biomarker assessment. The meeting featured data from the Digital Pathology Harmonization Project, a collaborative effort analyzing the performance of AI-based pathology tools. The project assessed HER2 scoring by comparing seven AI models with three human pathologists. The data showed strong agreement in identifying HER2-positive (3+) cases but notable variability in distinguishing HER2-low (1+ and 2+) cases.
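The kind of multi-rater comparison described here—seven AI models and three pathologists each assigning a HER2 category to the same slides—is commonly quantified with an agreement statistic such as Fleiss' kappa. A minimal sketch with invented scores (the slide data below is illustrative, not from the FOCR study), assuming each rater assigns one of the four HER2 IHC categories (0, 1+, 2+, 3+):

```python
from collections import Counter

def fleiss_kappa(ratings):
    """Fleiss' kappa for `ratings`: a list of per-slide lists of
    category labels, one label per rater (same rater count per slide)."""
    categories = sorted({c for slide in ratings for c in slide})
    n = len(ratings[0])            # raters per slide
    N = len(ratings)               # number of slides
    counts = [Counter(slide) for slide in ratings]  # n_ij per slide
    # Per-slide observed agreement P_i
    P = [(sum(c[j] ** 2 for j in categories) - n) / (n * (n - 1)) for c in counts]
    P_bar = sum(P) / N
    # Chance agreement from marginal category proportions
    p = [sum(c[j] for c in counts) / (N * n) for j in categories]
    P_e = sum(x ** 2 for x in p)
    return (P_bar - P_e) / (1 - P_e)

# Invented example: 5 slides, each scored by 10 raters (7 models + 3 pathologists).
# Agreement is perfect at the extremes but splits at the HER2-low boundaries,
# mirroring the pattern the project reported.
slides = [
    ["3+"] * 10,                 # unanimous HER2 3+
    ["0"] * 6 + ["1+"] * 4,      # split at the 0 vs 1+ boundary
    ["1+"] * 5 + ["2+"] * 5,     # split at the 1+ vs 2+ boundary
    ["2+"] * 8 + ["3+"] * 2,
    ["0"] * 10,
]
print(round(fleiss_kappa(slides), 3))  # moderate overall agreement
```

Disagreement concentrated in the 1+/2+ cases drags the statistic down even when 3+ calls are unanimous, which is exactly the pattern that makes HER2-low classification the hard part.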
As HER2-low has become an actionable biomarker with new targeted therapies, accurate classification is increasingly important. The variability in scoring raised concerns about how AI models are trained, validated, and ultimately integrated into clinical practice.
Reference Data Sets: A Critical Need for AI Validation
Speakers emphasized the importance of common reference data sets in evaluating AI models, noting that consistency across different tools remains a challenge. The data set used in the FOCR study, curated from a single institution in Belgium, allowed for a controlled comparison of AI models and pathologists. However, participants discussed whether a more diverse dataset—including multiple scanners, staining techniques, and patient demographics—would provide a more robust validation framework.
The study also highlighted that AI models showed levels of variability similar to those of human pathologists. This finding raised a key question: Should AI aim to replicate human interpretation, or should it establish a new standard of accuracy based on objective quantitative measures?
Regulatory and Implementation Challenges
The discussion also touched on regulatory hurdles in deploying AI-driven diagnostics in clinical settings. Current FDA review processes require clear definitions of ground truth when assessing AI models. However, variability among expert pathologists complicates this effort. The traditional method of using a panel of three pathologists for adjudication remains standard, but panelists debated whether new methods are needed to establish a more consistent framework for validation.
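The three-pathologist adjudication convention amounts to a majority vote with a fallback when all three disagree. A minimal sketch—the tie-handling policy here (routing no-majority cases to consensus review) is an illustrative assumption, not a prescribed FDA procedure:

```python
from collections import Counter

def adjudicate(scores):
    """Derive a ground-truth label from a panel of pathologist scores.
    A strict majority wins; with no majority, the case gets no label
    and is flagged for consensus review instead."""
    top_label, top_count = Counter(scores).most_common(1)[0]
    if top_count > len(scores) / 2:
        return top_label
    return None  # no majority: route to a consensus conference

print(adjudicate(["2+", "2+", "3+"]))  # majority of the panel
print(adjudicate(["1+", "2+", "3+"]))  # full disagreement, no label
```

With only three raters a single dissent still yields a majority, which is why panel variability among experts—rather than ties—is the bigger threat to a stable ground truth.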
Another barrier is infrastructure readiness. Today, only a small percentage of pathology labs have the necessary digital pathology infrastructure, including high-resolution scanners and cloud-based computational tools, to support AI-driven workflows. The transition to digital pathology and AI-assisted diagnostics will require significant investment in technology and training.
Operational Considerations in AI Deployment
Panelists also addressed operational concerns related to model development and reproducibility. AI models must be trained on diverse datasets to ensure generalizability across different lab conditions, tissue processing methods, and scanner types. The study’s post-hoc analysis of discordant cases suggested that staining artifacts, heterogeneous expression patterns, and sample type (biopsy vs. resection) contributed to variability in HER2 scoring.
The potential for AI to augment, rather than replace, human pathologists was another key point. While AI can process large datasets rapidly and provide quantitative assessments, panelists discussed the importance of AI-pathologist collaboration rather than full automation.
Next Steps: Expanding AI’s Role in Cancer Diagnostics
Building on the insights from this study, FOCR announced plans for a new initiative, AI-RECIST, which will apply AI validation strategies to imaging-based tumor response assessments. This effort aims to improve consistency in radiological evaluations, another area where human interpretation varies widely.
As FOCR President Jeff Allen emphasized, continued collaboration between researchers, industry, and regulators is necessary to ensure that AI-driven diagnostics enhance patient care without introducing new sources of error. The meeting reinforced the need for standardized validation frameworks, robust reference datasets, and regulatory clarity as AI takes a greater role in precision medicine.
The event marked a significant step forward in aligning technological innovation with clinical and regulatory realities, ensuring that AI-driven diagnostics contribute to improved patient outcomes.
###
Conference Documents. Each session (Digital Biomarkers and Rare Biomarkers) had an online PPT document and an online white paper (WP). I gave the four documents to ChatGPT in a separate session to summarize.
###
Summary and Analysis of Four Conference Documents - for Experts Who Could Not Attend
Document 01: Digital and Computational Pathology Tool Harmonization (PATH) Project
This presentation focused on the role of AI-driven digital pathology in improving HER2 biomarker scoring for breast cancer. Key points included:
- Clinical Relevance of HER2: HER2 scoring guides treatment, particularly with new therapies targeting low and ultra-low HER2 expression.
- Challenges in HER2 Assessment: Variability in manual scoring has led to inconsistencies in treatment selection, making AI-based solutions attractive.
- AI’s Potential Contributions:
  - Reproducibility: AI models reduce observer variability.
  - Efficiency: High-throughput capability enables large-scale analysis.
  - Granularity: AI can detect subtle biomarker variations beyond human visual capabilities.
- Study Design:
  - A common dataset of 1000+ whole-slide images (WSIs) was created and shared among AI developers.
  - AI models applied HER2 scoring, and results were compared against pathologists’ assessments.
- Findings:
  - Greatest agreement in HER2 3+ cases.
  - Significant variability in 1+ and 2+ cases, making AI refinement crucial.
  - Sample attributes (Ki-67, PR, ER status) may influence AI-pathologist discordance.
  - Post-review, AI models improved accuracy, but challenges remain for complex cases like Paget’s disease and sparse tumor cells.
- Next Steps:
  - Developing best practices for AI validation.
  - Expanding reference datasets for AI training.
Implications for Experts: This research supports AI-assisted standardization in pathology, highlighting the need for collaborative validation efforts between AI developers, regulatory bodies, and clinicians.
Document 02: Considerations for Developing Reference Data Sets for Digital Pathology Biomarkers
This discussion document from Friends of Cancer Research outlines the need for publicly available, standardized datasets for AI-driven computational pathology. Key takeaways:
- The Problem:
  - Computational pathology lacks universal validation datasets, leading to inconsistent results across AI models.
  - PD-L1 assay variability serves as a precedent—different antibodies and scoring methodologies led to regulatory challenges.
- The Role of Reference Data Sets:
  - Support comparability across AI models.
  - Enhance regulatory evaluation.
  - Enable more efficient FDA review.
- Considerations for Building Reference Data Sets:
  - Intended Use: Should datasets validate AI’s accuracy, precision, or clinical validity?
  - Data Diversity: Ensure demographic, clinical, and specimen representativeness.
  - Interchangeability: AI outputs should align with pathologist consensus.
- Infrastructure & Governance:
  - Datasets must be securely stored, version-controlled, and validated.
  - Federated learning models may allow AI testing without sharing raw WSIs.
- Defining a Reference Standard:
  - Traditional pathologist-scored benchmarks have limitations.
  - Alternative reference methods include molecular assays (mRNA, in situ hybridization) or clinical treatment outcomes.
- Metadata Considerations:
  - Clinical and patient data should be de-identified but well-documented.
Implications for Experts: AI models will need standardized validation benchmarks, particularly as digital pathology becomes integral to precision oncology.
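The federated approach noted under Infrastructure & Governance applies to evaluation as well as training: each site scores its own slides locally and shares only aggregate statistics, never raw WSIs. A minimal sketch of sample-weighted metric pooling (site names and counts below are invented):

```python
def federated_accuracy(site_reports):
    """Pool per-site concordance into one estimate without moving any
    image data: each site reports only (slides_evaluated, concordant_calls)."""
    total = sum(n for n, _ in site_reports)
    correct = sum(c for _, c in site_reports)
    return correct / total

# Invented per-site reports: (slides evaluated locally, AI-pathologist concordant)
reports = [
    (120, 102),  # site A
    (80, 60),    # site B
    (200, 178),  # site C
]
print(round(federated_accuracy(reports), 3))
```

Only two integers leave each institution, which sidesteps both data-transfer logistics and the privacy and governance constraints on sharing whole-slide images.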
Document 03: Innovative Processes for Validating Diagnostic Tests for Rare Biomarkers
This Eli Lilly-led presentation tackled the challenges of validating diagnostics for rare biomarkers, given limited clinical samples. Key themes:
- Defining Rare Biomarkers:
  - Prevalence ≤1% in a given cancer type.
  - Limited availability of well-characterized clinical samples.
- Regulatory Challenges:
  - Traditional validation relies on clinical trial samples, but rare biomarkers lack sufficient specimens.
  - Compressed drug development timelines exacerbate validation difficulties.
- Alternative Validation Strategies:
  - Clinical trial samples: Prioritize for clinical validation.
  - Representative clinical approaches: Use data from similar genomic contexts.
  - Real-world evidence (RWE): Consider post-market validation.
  - Procured human specimens: Use archival tissue or commercially available samples.
  - Cell lines & contrived samples: Useful for analytical validation (accuracy, stability, reproducibility).
  - In silico datasets: Validate bioinformatics pipelines.
- Regulatory Engagement:
  - Developers should engage with FDA early.
  - Consider a pre-submission “snapshot” to outline sample justifications.
Implications for Experts: Given rare biomarker challenges, a flexible regulatory approach is needed, balancing rigor and practicality in CDx validation.
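The ≤1% prevalence threshold translates directly into the sample-scarcity problem: observing even a modest number of biomarker-positive specimens requires screening thousands of samples. A quick binomial calculation makes the scale concrete (the 30-positive target and 95% confidence level are illustrative assumptions, not figures from the presentation):

```python
from math import comb

def min_screened(prevalence, needed, confidence=0.95):
    """Smallest N such that P(at least `needed` positives) >= confidence,
    where positives ~ Binomial(N, prevalence)."""
    N = needed
    while True:
        # P(fewer than `needed` positives among N screened samples)
        p_short = sum(
            comb(N, k) * prevalence**k * (1 - prevalence) ** (N - k)
            for k in range(needed)
        )
        if 1 - p_short >= confidence:
            return N
        N += 1

print(min_screened(0.01, 1))   # ~300 screens for even one positive
print(min_screened(0.01, 30))  # roughly 4,000 screens for 30 positives
```

At 1% prevalence the expected yield is one positive per hundred samples, so any validation plan built on traditional specimen counts quickly becomes infeasible—hence the push toward the alternative sample sources listed above.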
Document 04: Innovative Validation and Regulatory Processes for Companion Diagnostic Tests for Rare Biomarkers
This Friends of Cancer Research white paper expands on regulatory flexibilities for rare biomarker companion diagnostics (CDx). Key points:
- The Core Challenge:
  - Precision therapies rely on companion diagnostics to identify patients.
  - Rare biomarkers lack ample clinical samples for traditional validation.
- Regulatory Flexibility Considerations:
  - Prevalence Threshold: Biomarkers ≤1% prevalence may qualify.
  - Sample Accessibility: Factors like tumor location, biopsy feasibility, and sample degradation impact validation feasibility.
  - Unmet Need & FDA Prioritization: Expedited pathways (e.g., Breakthrough Therapy) may warrant validation flexibility.
- Proposed Validation Approaches:
  - Prioritize clinical trial samples for validation.
  - Use alternative samples where necessary, ensuring biological and technical justification.
  - Consider RWE to supplement clinical validity assessments.
- Framework for Regulatory Discussions:
  - Structured justification for sample selection.
  - FDA pre-submission meetings for alignment on validation strategies.
- Recommendations for Future Efforts:
  - Proactively bank clinical trial samples for future diagnostics.
  - Establish aligned reference datasets for standardization.
  - Consider shared regulatory frameworks for efficient biomarker validation.
Implications for Experts: CDx validation must evolve to accommodate rare biomarkers, requiring strategic regulatory engagement and innovative validation methodologies.
Conclusion
This forum provided a comprehensive examination of AI-driven pathology and rare biomarker validation challenges. Key takeaways:
- AI in Digital Pathology: AI models show promise for HER2 biomarker scoring, but require further refinement—especially for HER2-low cases.
- Standardized Reference Data Sets: Lack of publicly available validation datasets hinders AI model standardization, requiring collaborative efforts.
- Rare Biomarker Validation: Limited clinical samples necessitate regulatory flexibility. Alternative sample sources and real-world data can help bridge the gap.
- Regulatory Engagement: Developers should proactively engage with the FDA to align on alternative validation strategies.
For experts in oncology, computational pathology, and regulatory science, these discussions underscore the importance of data harmonization, AI validation, and regulatory adaptability in the evolving landscape of digital pathology and rare biomarker diagnostics.