Annals of Medicine and Surgery. 2022 Jun 16;79:103993. doi: 10.1016/j.amsu.2022.103993

Terminal digit preference and the accuracy of breast cancer diameter reporting based on Benford's law

Shahzaib Ahmad a, Amber Latif b, Mehwish Mehmood c, Ramisha Aslam a, Zain Ul Abiddin a, Hassan Mumtaz d,e, Khadija Ahmed f, Waqas Mehdi a,g, Waheeda Begum h
PMCID: PMC9289319  PMID: 35860050

Abstract

Background

Breast cancer is the most frequent cancer in women all over the world, and it is one of the leading causes of cancer-related deaths in women. A pathologist's partiality for a particular last digit can lead to errors in the measurement of malignancies. This means that, rather than recording the exact measurement of a tumor, a pathologist might round it off to his preferred terminal digit.

Methods

This is a retrospective cross-sectional study in which data on primary tumor resection for 1000 breast cancer patients were obtained from KRL Hospital's patient directory from November 2016 to December 2020. The tumors were measured in cm to one decimal place along their longest dimension. Ki-67 markers were used to categorize the tumors into nine categories. Terminal digit preference was evaluated using Benford's law.

Results

The recording of the Ki-67 index revealed evidence of pentameric preference. The numbers three, five, and six appeared more frequently in the histogram of the Ki-67 index distribution measured in percentage. The frequency of nine dropped dramatically. However, the influence of tumor size terminal digits on Ki-67 staining scores (low proliferative vs high proliferative) assessed using the Mann–Whitney U Test demonstrated that tumor size terminal digits had no significant effect on Ki-67 staining scores (p = 0.114).

Conclusion

The Ki-67 index shows evidence of pentameric preference for digits three, five, and six, while the frequency of nine dropped dramatically. The influence of tumor size terminal digits on Ki-67 staining scores (low proliferative vs. high proliferative), assessed using the Mann–Whitney U test, was not significant.

Highlights

  • Breast cancer is the most common cancer in women worldwide, and it is one of the main causes of cancer-related deaths among women.

  • One of the commonly encountered errors in the measurement of these tumor sizes is the pathologist's partiality for the last digit. This means that rather than recording the exact measurement of a tumor, a pathologist might round it off to his preferred terminal digit.

  • The recording of the tumor size and Ki-67 index revealed evidence of pentameric preference. The numbers three, five, and six appeared more frequently in the histogram of Ki-67 index distribution. The frequency of digit nine dropped dramatically.

  • The influence of tumor size terminal digits on Ki-67 staining scores (low proliferative vs high proliferative) assessed using the Mann–Whitney U test demonstrated that tumor size terminal digit preference had no significant effect on Ki-67 staining scores.

1. Introduction

Breast cancer represents the most prevalent form of cancer in women all over the world, making it one of the leading causes of cancer-related deaths in women [1]. However, considering recent advances in the field of research, the prognosis and treatment plan for breast cancer have changed dramatically. Hence, physicians must choose from a wide variety of treatment options according to the condition of the patient. TNM staging is often used to classify a breast tumor, which in turn determines the management of the patient [3]. The TNM staging system is based on the size of the tumor and its extent of dissemination along with other factors [4].

Hence, the size of a tumor as well as its cytology have a major impact on its staging and subsequently guide the physician in its management. However, errors can be made during the measurement of these tumors. One of these errors is the terminal digit preference of the pathologist. This means that instead of recording the exact measurement of a tumor, a pathologist might round off the measurement to his preferred terminal digit, usually 0 or 5. This error has been documented in previous studies for different kinds of carcinomas [5]. In the case of breast carcinoma, however, more research is required. A previous investigation established terminal digit preference in breast cancer measurements using histograms [6]. This terminal digit preference can also occur during a sample's Ki-67 indexing. The Ki-67 index assesses the expression of the Ki-67 marker on cancer cells, which indicates their proliferative ability [7].

This study aims to evaluate terminal digit preference among pathologists using Benford's Law. Benford's Law has previously been used to evaluate COVID-19 data [8,9] and other fabrications in the data [10]. It states that in a large dataset containing naturally recorded measurements, the incidence of digits 1 to 9 is not distributed equally. Instead, the distribution of these digits follows a counter-intuitive trend in which smaller digits occur more frequently than larger ones [11]. As a result, rather than following a uniform frequency distribution, in which each digit occurs at a frequency of 11.11%, the digit "1" occurs at 30.1%, "2" at 17.6%, "3" at 12.5%, and so on [12].

By comparing the frequencies of different digits (especially 0 and 5) with the expected frequencies according to Benford's Law, we can establish if terminal digit preference occurs in breast cancer size measurements and the Ki-67 index.

2. Materials and methods

This is a retrospective cross-sectional study. All the recorded data were obtained from the patient directory of KRL Hospital, where primary tumor resection was performed in 1000 breast cancer patients from November 2016 to December 2020. Data related to these specimens were recorded and later interpreted by a histopathologist.

The sample size calculated with the WHO sample size calculator was 1000 breast cancer specimens. Hence, the sizes and Ki-67 indexes of 1000 breast carcinoma tumors were included in the study. Fine needle aspiration tests and benign breast lesions were not included in the study. The expected frequencies of different digits in the data set were calculated according to Benford's Law via the mathematical equation:

F(d) = log10(1 + 1/d)

where F is the expected frequency and d is the digit in question (1, 2, 3, 4, 5, 6, 7, 8, or 9).
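As an illustration, the expected proportions given by this formula can be tabulated in a few lines of Python. This is a minimal sketch, not the study's own code; it assumes the logarithm is base 10, and the function name `benford_expected` is hypothetical.

```python
import math

def benford_expected(digits=range(1, 10)):
    """Expected proportion of each digit d under Benford's law: log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in digits}

expected = benford_expected()
for digit, proportion in expected.items():
    # Print each digit with its expected frequency as a percentage.
    print(f"{digit}: {proportion * 100:.1f}%")
```

The first three values reproduce the 30.1%, 17.6%, and 12.5% quoted in the Introduction, and the nine proportions sum to 1.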

CAP protocols were followed for all the specimens included in the dataset. The recorded specimens were obtained during breast-conserving procedures and modified radical mastectomies. The tumors were measured along their greatest dimension and recorded in cm to one decimal place. The tumor diameter was calculated from microscopic slides by measuring the outermost borders of the invasive lesion and measuring to the nearest millimeter using a transparent ruler. If no slides were available, the measurement was based on a macroscopic inspection (formalin-fixed material), either of a single tissue slice or as an average of all tissue slices containing microscopically validated invasive tumor tissue. The study proposal was approved by the ethical review committee of KRL Hospital. Our study is fully compliant with the STROCSS 2021 guidelines [13]. A complete STROCSS 2021 checklist has been provided as a supplementary file. Our study has been registered on Research Registry with the following UIN: researchregistry7969 [14]. Our study complies with the Declaration of Helsinki.

The tumors were also stained for Ki-67 markers and were divided into nine groups depending upon the percentage of staining, i.e., 10–19%, 20–29%, 30–39%, 40–49%, 50–59%, 60–69%, 70–79%, 80–89%, and 90–99%.
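The grouping step might be sketched as follows. The sketch assumes nine decade-wide groups running from the 10s to the 90s, so that the group number coincides with the leading digit of the percentage; the function name `ki67_group` and the treatment of out-of-range scores are assumptions, not details taken from the study.

```python
def ki67_group(pct):
    """Return the group number (1-9) for a Ki-67 percentage, or None if outside 10-99%."""
    if not 10 <= pct <= 99:
        return None  # assumption: scores outside the nine groups are left unclassified
    return int(pct) // 10

# Hypothetical scores, for illustration only.
for score in (15, 30, 67, 99, 5):
    print(score, ki67_group(score))
```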

A Pearson's chi-square goodness-of-fit test was performed. The null hypothesis is that the observed distribution of digits in the data set for breast tumor size and Ki-67 index follows Benford's law; hence a distribution having a p-value > 0.05 is considered to adhere to Benford's distribution. The Mann–Whitney U test was used to compare the calculated frequencies of digits 1 to 9 with the expected frequencies calculated according to Benford's Law.
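A goodness-of-fit comparison of this kind can be sketched in plain Python. The digit counts below are invented for illustration and are not the study's data; all helper names are hypothetical, and the closed-form p-value used here is valid only for an even number of degrees of freedom (8 in this case, for nine digit classes).

```python
import math

def benford_expected():
    """Expected proportions for digits 1-9 under Benford's law."""
    return [math.log10(1 + 1 / d) for d in range(1, 10)]

def chi_square_statistic(observed, expected_props):
    """Pearson chi-square statistic for observed counts against expected proportions."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, expected_props))

def chi2_sf_even_df(x, df):
    """Upper-tail probability of the chi-square distribution for even df.

    Uses the exact identity exp(-x/2) * sum_{i < df/2} (x/2)^i / i!.
    """
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2
    term = total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

# Invented counts for digits 1..9 (sum = 1000); NOT the study's data.
observed = [280, 170, 150, 90, 110, 95, 50, 40, 15]
stat = chi_square_statistic(observed, benford_expected())
p_value = chi2_sf_even_df(stat, df=len(observed) - 1)
verdict = "adheres to" if p_value > 0.05 else "departs from"
print(f"chi-square = {stat:.2f}, p = {p_value:.4g}: {verdict} Benford's distribution")
```

In practice a statistics library would usually be used instead of the hand-written tail probability; the closed form is chosen here only to keep the sketch dependency-free.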

3. Results

Data from 1000 patients with primary malignant breast tumors were retrieved from KRL Hospital and analyzed. We found no pentameric preference for histopathological measurement of tumor size. On computing the last digit frequency for histopathological measurements of tumor size recorded in centimeters to one decimal place, the histogram displayed a higher frequency of digits three and nine. Frequencies were lower for the digits around the digit-nine peak, namely seven, eight, and one.
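The last-digit tally described above might be computed along these lines. The sizes are hypothetical examples rather than study data, and `terminal_digit` is an illustrative helper name.

```python
from collections import Counter

def terminal_digit(size_cm):
    """Last recorded digit of a size given in cm to one decimal place (e.g. 2.3 -> 3)."""
    # Convert to whole millimeters first; round() absorbs floating-point noise.
    return round(size_cm * 10) % 10

# Hypothetical tumor sizes in cm; NOT the study's data.
sizes_cm = [2.3, 1.9, 3.0, 2.5, 4.3, 1.9, 0.9, 2.3]
counts = Counter(terminal_digit(s) for s in sizes_cm)

for digit in range(10):
    # A crude text histogram of terminal-digit frequencies.
    print(digit, "#" * counts[digit])
```

Note that terminal digits range over 0 to 9, whereas the Benford proportions are defined for digits 1 to 9, so a tally like this one has ten classes.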

Evidence of pentameric preference was found in the recording of the Ki-67 index. The histogram of the distribution of the Ki-67 index recorded in percentage revealed that the digits three, five, and six were in higher frequency. There was a marked drop in the frequency of digit nine. The rest of the terminal digits, although differing in frequency from that predicted by Benford's law, somewhat followed Benford's distribution curve, as shown in Fig. 1.

Fig. 1. Benford's law breakdown of Ki-67%.

The effect of terminal digits of tumor size on Ki-67 staining scores (low proliferative vs high proliferative) was evaluated using the Mann–Whitney U test. The test revealed no significant effect of terminal digits of tumor size on the Ki-67 staining scores (p = 0.114), as shown in Fig. 2.

Fig. 2. Benford's distribution curve.

4. Discussion

Terminal digit preference in breast cancer reporting has not been thoroughly researched. Our study, using 1000 specimens of primary malignant breast tumors collected from November 2016 to December 2020 at KRL Hospital, aimed to assess terminal digit preference in breast tumor size and Ki-67 indices using Benford's law.

Studies have shown that terminal digit preference occurs in pathological reporting of malignant tumors, including colorectal adenocarcinomas and breast tumors [5,6]. This type of preference can affect T size categorization and thus have serious implications in patient management and prognosis, particularly in tumor types where macroscopic tumor size affects staging, such as breast and thyroid tumors.

The histograms showing the distribution of the first digits of the largest diameter of tumor size and that of Ki-67% showed that some digits were reported at a higher frequency than others. The preferred terminal digits were 3 and 9 in the reporting of histopathological breast tumor size and 3, 5, and 6 in Ki-67 staining scores. The result of digit preference in tumor diameter is inconsistent with the previous nationwide study done in Norway, in which there was clear evidence of terminal digit preference for 0 and 5 (pentameric preference) in the measurements of mammographic and histopathologic tumor diameter [6].

The increased frequency of digits 3 and 9 in tumor size measurement found in our study does not affect T category classification according to the TNM system as 30 mm and 90 mm values do not lie at the border values. For example, a tumor diameter of 28 mm rounded off to 30 mm due to a digit preference of 3 would fall into the T2 category either way. Similarly, tumor diameters falling in the range of 80 mm–89 mm rounded off to 90 mm owing to digit preference would be classified as T3. We can say that histopathologists adopted a practical approach by avoiding T category border values, a finding consistent with a recent Dutch study [15].
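The border-value argument can be checked with a small sketch. It assumes the size-only TNM cut-offs for invasive breast tumors (T1 up to 20 mm, T2 above 20 mm and up to 50 mm, T3 above 50 mm) and ignores T4, which depends on chest wall or skin involvement rather than diameter; the function name `t_category_by_size` is hypothetical.

```python
def t_category_by_size(diameter_mm):
    """Size-based T category (T1-T3 only); T4 is not determined by diameter."""
    if diameter_mm <= 0:
        raise ValueError("diameter must be positive")
    if diameter_mm <= 20:
        return "T1"
    if diameter_mm <= 50:
        return "T2"
    return "T3"

# Rounding 28 mm up to 30 mm, or 85 mm up to 90 mm, leaves the category unchanged.
for measured, reported in [(28, 30), (85, 90)]:
    print(measured, t_category_by_size(measured), "->", reported, t_category_by_size(reported))

# By contrast, rounding across a border value would change the category.
print(19, t_category_by_size(19), "->", 21, t_category_by_size(21))
```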

The Ki-67% score is defined as the percentage of positively stained cancer cells among the total number of malignant cells evaluated. It is used as a prognostic marker in breast cancer. A higher Ki-67% score shows that the tumor is aggressive, more likely to spread, and is associated with a worse prognosis. In our study, there was a preference in the reporting of Ki-67 staining scores for digits 3, 5, and 6. As there is no common consensus on a cut-off value of Ki-67, the effects of terminal digit preference in the reporting of these scores remain unpredictable and need to be investigated further.

The Mann–Whitney U test was used to assess the effect of tumor size on Ki-67 staining scores, with 10% set as the cut-off value for low and high Ki-67 scores. It has been reported that a higher Ki-67 index is correlated with increasing tumor size [16]. However, in our study, the relationship between tumor size and Ki-67 values was not statistically significant.
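A comparison of this kind could be sketched as below. The records are invented and are not the study's data; treating a Ki-67 score of 10% or more as high proliferative is an assumption about how the cut-off was applied; and the p-value uses the normal approximation with tie correction and no continuity correction, which is only one of several common variants of the test.

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    r1 = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    n = n1 + n2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return min(u1, u2), 1.0  # all values tied: no evidence of a difference
    z = (u1 - n1 * n2 / 2) / math.sqrt(var_u)
    return min(u1, u2), math.erfc(abs(z) / math.sqrt(2))

def terminal_digit(size_cm):
    """Last recorded digit of a size given in cm to one decimal place."""
    return round(size_cm * 10) % 10

# Invented (tumor size in cm, Ki-67 %) pairs; NOT the study's data.
records = [(2.3, 5), (1.9, 8), (3.0, 35), (2.5, 60), (4.3, 7), (1.9, 45), (0.9, 20), (2.7, 3)]
CUTOFF = 10  # assumption: Ki-67 >= 10% is treated as high proliferative

low = [terminal_digit(s) for s, k in records if k < CUTOFF]
high = [terminal_digit(s) for s, k in records if k >= CUTOFF]
u, p = mann_whitney_u(low, high)
print(f"U = {u}, p = {p:.3f}")
```

With samples as small as this toy example, an exact test would normally be preferred over the normal approximation.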

Using immunohistochemistry, researchers from Anyang Tumor Hospital have found evidence that TNBC subtypes with varying degrees of aggressiveness and prognosis can be distinguished using the p53 and Ki-67 proteins. The prognosis is significantly worse for patients with high p53 or Ki-67 indices or a family history of cancer. Prognostic markers in TNBC include p53, Ki-67, and family history [17].

5. Limitations

The main limitation of our study is that it included data from a single hospital with a small sample size. The study did not include information about the individual histopathologists who performed the measurements, so it was not possible to study intra-observer or inter-observer trends. With no knowledge of the specific cases affected by terminal digit preference, it is not possible to postulate the clinical implications of the measurement error. The current TNM staging guidelines do not consider terminal digit preference. More explicit guidelines are required to minimize errors in the pathological reporting of malignant tumors. To make our findings generally applicable, there should be uniformity in using the cut-off value for Ki-67. Further studies are required to probe the consequences resulting from this source of measurement error.

6. Strengths

It is essential to measure the impact of terminal digit preference especially when treatment guidelines can change the course of management. This study quantifies the impact of this covert tendency to alter the exact dimensions of the tumors. Hence, this study is an approach to unveiling the impact of terminal digit preference.

7. Conclusion

The recording of the Ki-67 index revealed evidence of pentameric preference. The numbers three, five, and six appeared more frequently in the histogram of the Ki-67 index distribution measured in percentage. The frequency of nine has dropped dramatically. The influence of tumor size terminal digits on Ki-67 staining scores (low proliferative vs. high proliferative) was assessed using the Mann–Whitney U test, which demonstrated that tumor size terminal digits did not affect Ki-67 staining scores.

Ethical approval

Ethical approval was obtained from the ethical review committee of Shifa International Hospital Islamabad.

Reference number: 09-ERC/17/10/2015.

Sources of funding

No funding received.

Author contribution

All the authors contributed to the outlines of the research proposal, data collection, analysis, writing, editing, proofing, and final approval of the paper.

Availability of data and material

Yes.

Registration of research studies

Name of the registry: Not Applicable.

Unique Identifying number or registration ID: Not Applicable.

Hyperlink to your specific registration (must be publicly accessible and will be checked):

Consent

Not Applicable.

Guarantor

Hassan Mumtaz.

Provenance and peer review

Not commissioned, externally peer-reviewed.

Declaration of competing interest

None to disclose.

Acknowledgements

Nil.

Footnotes

Appendix A

Supplementary data to this article can be found online at https://doi.org/10.1016/j.amsu.2022.103993.

Contributor Information

Shahzaib Ahmad, Email: shahzaib.ahmad@kemu.edu.pk.

Amber Latif, Email: amberlatif332@gmail.com.

Mehwish Mehmood, Email: mehwishmehmood62@gmail.com.

Ramisha Aslam, Email: Ramishaaslam17@gmail.com.

Zain Ul Abiddin, Email: Zawan031@gmail.com.

Hassan Mumtaz, Email: Hassanmumtaz.dr@gmail.com.

Khadija Ahmed, Email: khadijaahmed907@gmail.com.

Waqas Mehdi, Email: Dr.waqasmehdi@gmail.com.

Waheeda Begum, Email: Waheedaamin1996@gmail.com.

Appendix A. Supplementary data

The following is the Supplementary data to this article:

Multimedia component 1
mmc1.docx (30.8KB, docx)

References

  • 1. Libson S., Lippman M. A review of clinical aspects of breast cancer. Int. Rev. Psychiatr. 2014 Feb;26(1):4–15. doi: 10.3109/09540261.2013.852971. PMID: 24716497.
  • 2. Fahad Ullah M. Breast cancer: current perspectives on the disease status. Adv. Exp. Med. Biol. 2019;1152:51–64. doi: 10.1007/978-3-030-20301-6_4. PMID: 31456179.
  • 3. Maughan K.L., Lutterbie M.A., Ham P.S. Treatment of breast cancer. Am. Fam. Physician. 2010 Jun 1;81(11):1339–1346. PMID: 20521754.
  • 4. Cserni G., Chmielik E., Cserni B., Tot T. The new TNM-based staging of breast cancer. Virchows Arch. 2018 May;472(5):697–703. doi: 10.1007/s00428-018-2301-9. Epub 2018 Jan 27. PMID: 29380126.
  • 5. Hayes S.J. Terminal digit preference occurs in pathology reporting irrespective of patient management implications. J. Clin. Pathol. 2008;61:1071–1072. doi: 10.1136/jcp.2008.059543.
  • 6. Tsuruda Kaitlyn M., Hofvind Solveig, Akslen Lars A., Hoff Solveig R., Veierød Marit B. Terminal digit preference: a source of measurement error in breast cancer diameter reporting. Acta Oncol. 2020;59(3):260–267. doi: 10.1080/0284186X.2019.1669817.
  • 7. Cheang Maggie C.U., Chia Stephen K., et al. Ki67 index, HER2 status, and prognosis of patients with luminal B breast cancer. JNCI: J. Natl. Cancer Inst. 20 May 2009;101(10):736–750. doi: 10.1093/jnci/djp082.
  • 8. Campolieti M. COVID-19 deaths in the USA: Benford's law and under-reporting. J. Public Health. 2021 May;24:fdab161. doi: 10.1093/PubMed/fdab161. Epub ahead of print. PMID: 34027553; PMCID: PMC8194618.
  • 9. Kennedy A.P., Yam S.C.P. On the authenticity of COVID-19 case figures. PLoS One. 2020;15(12). doi: 10.1371/journal.pone.0243123.
  • 10. Li Feifei, Han Shuqing, Zhang Hongyv, Ding Jiaojiao, Zhang Jianhua, Wu Jianzhai. Application of Benford's law in data analysis. J. Phys. Conf. 2019;1168. doi: 10.1088/1742-6596/1168/3/032133.
  • 11. Beer T.W. Terminal digit preference: beware of Benford's law. J. Clin. Pathol. 2009;62:192. doi: 10.1136/jcp.2008.061721.
  • 12. Benford Frank. The law of anomalous numbers. Proceedings of the American Philosophical Society. 1938;78(4):551–572. http://www.jstor.org/stable/984802
  • 13. Mathew G., Agha R., for the STROCSS Group. STROCSS 2021: strengthening the reporting of cohort, cross-sectional and case-control studies in surgery. Int. J. Surg. 2021;96. doi: 10.1016/j.ijsu.2021.106165.
  • 14. https://www.researchregistry.com/browse-the-registry#home/registrationdetails/629989568acf11001fac6ee4/
  • 15. den Bakker M.A., Damhuis R.A. Pentameric last‐digit preference and stage border avoidance in pathology measurement. Histopathology. 2018 Sep;73(3):510–513. doi: 10.1111/his.13640.
  • 16. Nigam J.S., Kumar T., Bharti S., Sinha R., Bhadani P.P. Association of ki-67 with clinicopathological factors in breast cancer. Cureus. 2021 Jun 13;13(6). doi: 10.7759/cureus.15621.
  • 17. Pan Y., Yuan Y., Liu G., Wei Y. P53 and Ki-67 as prognostic markers in triple-negative breast cancer patients. PLoS One. 2017;12(2). doi: 10.1371/journal.pone.0172324.


Articles from Annals of Medicine and Surgery are provided here courtesy of Wolters Kluwer Health
