Table 1.
Outcomes | Definitions | Score |
---|---|---|
Answer Returned | The ability of the LLM to return a meaningful answer to each instance of the submitted question, rather than returning an error or declining to answer, independent of the accuracy of the response. | Recorded as Boolean True/False |
Reproducibility | The ability of the LLM to return a generally similar series of answers across the three separate queries, with no fundamental differences or inconsistencies between the three answers. | Recorded as Boolean True/False |
Accuracy | The ability of the LLM to provide accurate and correct information that addresses the question asked and includes all major or critical points required in such an answer. Responses were not adversely marked here for extraneous or irrelevant information, as long as this information was correct. | Recorded numerically from 1 to 3 |
Readability | The ability of the LLM to return comprehensible and coherent natural language text in English, including appropriate syntax, formatting, and punctuation, independent of the accuracy of the response. | Recorded numerically from 1 to 3 |
Relevance | The ability of the LLM to return information that was relevant and specific to the question asked or to immediately adjacent topics, without extraneous, unrequested, or tangential information. Accuracy was not specifically assessed here, though the response was adversely marked if it included immaterial information while neglecting to address the specific question asked. | Recorded numerically from 1 to 3 |
Note: For scoring of Relevance, the answer returned was not adversely marked for including disclaimers stating that the LLM cannot provide medical advice and that such advice should be sought from a clinician, or that anyone with a cancer diagnosis and/or receiving systemic therapy with potential toxicity should contact their treating clinician(s). Such disclaimers were deemed to represent appropriate and medically sound advice rather than irrelevant or extraneous material.
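
For readers who want to record scores against this rubric programmatically, the following is a minimal sketch of one possible record type. It is not the authors' tooling: the class name `RubricScore`, its field names, and the validation rules are illustrative assumptions derived only from the "Score" column of Table 1. The table does not state what each value from 1 to 3 denotes, nor how a non-answer is scored on the remaining outcomes, so the sketch leaves both unspecified and only checks types and ranges.

```python
from dataclasses import dataclass

# Assumption: the three graded outcomes take the integer values 1, 2, or 3,
# as stated in the "Score" column of Table 1.
ORDINAL_RANGE = range(1, 4)


@dataclass(frozen=True)
class RubricScore:
    """Hypothetical record of one question's scores under Table 1."""

    answer_returned: bool   # Boolean True/False
    reproducibility: bool   # Boolean True/False, judged across the three queries
    accuracy: int           # 1 to 3
    readability: int        # 1 to 3
    relevance: int          # 1 to 3

    def __post_init__(self):
        # The two Boolean outcomes must be real booleans, not 0/1 integers.
        for name in ("answer_returned", "reproducibility"):
            if not isinstance(getattr(self, name), bool):
                raise TypeError(f"{name} must be True or False")
        # The three graded outcomes must be integers in 1..3.
        # bool is a subclass of int in Python, so it is rejected explicitly.
        for name in ("accuracy", "readability", "relevance"):
            value = getattr(self, name)
            if isinstance(value, bool) or not isinstance(value, int):
                raise ValueError(f"{name} must be an integer from 1 to 3")
            if value not in ORDINAL_RANGE:
                raise ValueError(f"{name} must be an integer from 1 to 3")


# Example with made-up values, for illustration only.
example = RubricScore(
    answer_returned=True,
    reproducibility=True,
    accuracy=3,
    readability=2,
    relevance=3,
)
```

Rejecting out-of-range or mistyped values at construction time keeps data-entry errors from reaching any later aggregation step.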