Discoveries in Health Policy: AI Corner: Comparing a Paper's Abstract to an AI Summary

Two ways to get the gist of a medical publication are to (1) read the abstract, or (2) get an AI summary of the paper.

This week I had the opportunity to compare the two on a new paper, and to blog the results.

The paper, which was highlighted on LinkedIn by Harvard's Joe Lennerz, is titled "Initial interactions with the FDA on developing a validation dataset as a medical device development tool," by Hart et al. Although the front page mentions "copyright by the authors," the back page says it is a work product of the FDA and not subject to copyright. Therefore, I've felt able to both process it through AI and publish the result.

I'll show the authors' abstract first, and the AI summary (by ChatGPT) next.

Note that there are modest gains for an AI summary relative to a good full abstract from the authors. But AI will also provide a summary for articles that have no abstract of their own.

AUTHORS' ABSTRACT [238 words]

Quantifying tumor-infiltrating lymphocytes (TILs) in breast cancer tumors is a challenging task for pathologists. With the advent of whole slide imaging that digitizes glass slides, it is possible to apply computational models to quantify TILs for pathologists. Development of computational models requires significant time, expertise, consensus, and investment. To reduce this burden, we are preparing a dataset for developers to validate their models and a proposal to the Medical Device Development Tool (MDDT) program in the Center for Devices and Radiological Health of the U.S. Food and Drug Administration (FDA). If the FDA qualifies the dataset for its submitted context of use, model developers can use it in a regulatory submission within the qualified context of use without additional documentation. Our dataset aims at reducing the regulatory burden placed on developers of models that estimate the density of TILs and will allow head-to-head comparison of multiple computational models on the same data. In this paper, we discuss the MDDT preparation and submission process, including the feedback we received from our initial interactions with the FDA and propose how a qualified MDDT validation dataset could be a mechanism for open, fair, and consistent measures of computational model performance. Our experiences will help the community understand what the FDA considers relevant and appropriate (from the perspective of the submitter), at the early stages of the MDDT submission process, for validating stromal TIL density estimation models and other potential computational models.

CHAT GPT SUMMARY [450 words]

In this comprehensive publication, the authors present a detailed account of their project, which revolves around the development of a reference dataset for assessing stromal Tumor-Infiltrating Lymphocytes (sTIL) densities in triple-negative breast cancers (TNBC). This initiative addresses a crucial need for accurate sTIL density assessment in TNBC, a parameter of significant importance for prognosis and treatment decisions. Moreover, it seeks to provide a standardized dataset for evaluating computational models used in digital pathology, particularly for TNBC diagnosis.

The authors begin by highlighting the challenges involved in creating computational models for sTIL density assessment, emphasizing the importance of a standardized reference dataset. They introduce the concept of the Medical Device Development Tool (MDDT) program, administered by the U.S. FDA, as the framework for qualifying datasets of this nature.

The development process of the dataset is outlined in detail, taking into consideration various factors essential for its accuracy and reliability. These factors include patient demographics, cancer stage, and the type of scanning equipment used. Notably, the authors elaborate on the decision to include only TNBC cases in the dataset, taking into account the disparities between TNBC and non-TNBC cases. Their rationale reflects a meticulous approach to dataset construction, addressing various considerations and limitations.

The paper also discusses the invaluable feedback received from the FDA, which covers multiple facets of the dataset development process. The FDA's input encompasses aspects such as the context of use, validation of devices employed for reference standard annotations, selection of Regions of Interest (ROIs), and the exclusion of the statistical analysis plan and software from the MDDT proposal. The authors provide comprehensive explanations of their responses to the FDA's feedback, highlighting the collaborative nature of their interactions with regulatory authorities.

The concluding section of the publication underscores the overarching goals of the project. These objectives include enhancing the quality of the dataset, alleviating the burden on model developers, facilitating direct model comparisons, and promoting transparency and standardization in assessing model performance.

In summary, this publication presents a cohesive and collaborative endeavor aimed at establishing a reference standard dataset for sTIL density estimation in TNBC, aligning closely with the FDA's MDDT program. The dataset's development process is meticulous, accounting for essential considerations and incorporating valuable feedback from regulatory bodies. The authors aspire not only to benefit their specific field but also to set a precedent for the creation of similar datasets in the realm of digital pathology. Their work exemplifies the significance of regulatory cooperation and standardized evaluation in advancing medical device development, particularly within the domain of laboratory medicine.

The comprehensive dataset resulting from this project holds substantial promise for enhancing the accuracy and reliability of computational models in TNBC diagnosis, ultimately contributing to improved patient care and outcomes.

###

I asked CHAT GPT 3.5 to compare the authors' abstract with the "paper summary" it generated.

CHAT

Both DOCUMENT ONE (Author's Abstract) and DOCUMENT TWO (GPT Summary) serve different purposes and cater to different types of readers.

DOCUMENT ONE (Author's Abstract):

This document provides a concise and straightforward summary of the key points in the research paper.

It outlines the primary goal, which is to create a dataset for validating computational models related to Tumor-Infiltrating Lymphocytes (TILs) in breast cancer.

It mentions the involvement of the FDA's Medical Device Development Tool (MDDT) program and the potential benefits of a qualified validation dataset.

It is relatively brief and direct in its communication, focusing on essential details without delving into extensive explanations or elaborations.

DOCUMENT TWO (GPT Summary):

This document offers a more comprehensive and detailed summary of the research paper.

It provides a broader context by explaining the importance of assessing stromal Tumor-Infiltrating Lymphocyte (sTIL) densities in triple-negative breast cancers (TNBC) and its relevance for prognosis and treatment decisions.

It delves into the challenges involved in developing computational models for sTIL density assessment and how the Medical Device Development Tool (MDDT) program plays a role.

It discusses the meticulous dataset development process, including considerations like patient demographics and equipment used.

It highlights the feedback received from the FDA and the collaborative nature of interactions with regulatory authorities.

It concludes by emphasizing the overarching goals of the project and its potential impact on improving patient care and outcomes.

Which is better depends on the reader's specific needs and preferences:

DOCUMENT ONE is better for readers who want a quick, concise overview of the research paper's main points without going into extensive detail. It is suitable for those looking for a brief summary.

DOCUMENT TWO is better for readers who desire a more comprehensive understanding of the research paper, its context, and the intricacies of the dataset development process. It is suitable for readers who want a detailed, well-explained summary.

[BQ: I might add, once PAPER is uploaded to Chat GPT, you can continue to ask it multiple questions about the paper, which you can't do with an abstract online.]

In summary, DOCUMENT ONE is preferable for a quick and straightforward summary, while DOCUMENT TWO is better for readers who seek a more in-depth and comprehensive overview of the research paper.

###

NOTES

Chunking. ChatGPT takes about 2000 words at a time. I used free online software to split the Hart paper into three parts and uploaded each separately into ChatGPT by cut/paste. Although a little tedious, this requires only a few seconds. https://chatgpt-prompt-splitter.jjdiaz.dev/
3.5 for Summaries. ChatGPT comes in version 3.5, which is free (though possibly slow or overloaded), and version 4.0, which requires a subscription. I find that 3.5 *consistently* provides better summaries, especially from multi-part documents.
SIDEBAR - SCITE.AI. Although I don't use it often, I also subscribe to SCITE.AI. This does some remarkable things beyond PubMed. Let's say you have Paper X. You select Paper X inside SCITE with search (like PubMed). Once selected, it shows the bibliography of every paper cited by Paper X, and it shows every publication it can find that cites Paper X. Now, wait for the next part: for the latter, for every paper with full text, it automatically pulls several sentences around the place where each paper cites "Paper X."

For example, the first citation to Paper X may actually say "We strongly support the findings of Paper X" and the next citation may actually say "Our results differ sharply from the findings of Paper X."
You can see all these citations and each context in a second, rather than looking up each citing paper one by one for two hours.
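
For readers who would rather script the chunking step described in the notes above than use an online splitter, here is a minimal Python sketch. It is only an illustration under stated assumptions: the 2000-word limit is the rough figure mentioned above, the function names `split_into_chunks` and `label_chunks` are hypothetical, and this is not necessarily how the linked splitter tool works.

```python
def split_into_chunks(text: str, max_words: int = 2000) -> list[str]:
    """Split text into consecutive chunks of at most max_words words.

    Assumption: words are separated by whitespace. Splitting this way
    collapses line breaks inside each chunk into single spaces.
    """
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def label_chunks(chunks: list[str]) -> list[str]:
    """Prefix each chunk with a 'Part n of N' line so the pieces
    can be pasted in order and told apart."""
    total = len(chunks)
    return [
        f"[Part {n} of {total}]\n{chunk}"
        for n, chunk in enumerate(chunks, start=1)
    ]


if __name__ == "__main__":
    # A stand-in for the pasted paper text: 4500 words.
    sample = "word " * 4500
    parts = label_chunks(split_into_chunks(sample))
    print(len(parts))  # 3 (chunks of 2000, 2000, and 500 words)
```

Each labeled part can then be pasted into the chat in turn, as with the manual cut/paste approach.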

Friday, October 6, 2023
