Degree of Awareness Level of Making a Misdiagnosis as Detected by the Need to Use Immunohistochemistry for P16ink4a and Ki67 among Mexican Pathologists in Cervical Biopsies
- 1. Department of Anatomic Pathology, Grupo Diagnóstico pathology laboratory, Mexico
- 2. Center for research and advanced studies (CINVESTAV), Universidad Nacional Autónoma de México (UNAM), Mexico
- 3. Pathologist Hospital Juarez de México CDMX, Mexico
- 4. Laboratorio Grupo Diagnóstico México, Mexico
Abstract
Context: The criteria for diagnosing cervical intraepithelial neoplasia (CIN) are well described. However, agreement between observers on the diagnosis remains unacceptably low. Immunohistochemistry (IHC), primarily with p16INK4A, is used to correct this discordance. Most studies have analyzed interobserver diagnostic concordance.
Objective: To assess whether observers are aware of the possibility of committing a diagnostic error with H&E staining, as measured by the immediate request and use of IHC. No previous studies have analyzed this variable.
Design: Sixty-six Mexican pathologists attending a congress in Mérida, Yucatán, completed a survey on two cases of low-grade lesion and one case of epidermoid metaplasia. Each case was presented with H&E staining, with p16INK4A and Ki67 IHC available at the same time, so that observers could recognize a possible wrong diagnosis and correct it immediately using immunohistochemistry. Diagnostic agreement was also measured.
Results: Overall, 45.4% of the observers chose to evaluate the immunohistochemistry, which we took as the degree of awareness of possible error, and 59.6% had an incorrect diagnosis. Only 4.5% of the pathologists corrected their diagnosis. The Kendall concordance coefficient (W) across the three cases was 0.355 (p < 0.001), which corresponds to a very poor level of diagnostic agreement.
Conclusions: In cervical intraepithelial neoplasia, observers show little awareness of the possibility of a wrong diagnosis, and there is substantial overdiagnosis of high-grade intraepithelial lesion. This leads to unnecessary treatment, whereas a low-grade intraepithelial lesion is only observed without any further procedure.
Citation
Curiel Valdés JJ, Meza JC, Cortes ME, González Hernández AM (2020) Degree of Awareness Level of Making a Misdiagnosis as Detected by the Need to Use Immunohistochemistry for P16ink4a and Ki67 among Mexican Pathologists in Cervical Biopsies. Med J Obstet Gynecol 8(1): 1133.
Keywords
• Immunohistochemistry
• P16ink4a
• Ki67
• Cervical Biopsies
INTRODUCTION
The criteria for diagnosing intraepithelial lesions in cervical biopsies have been described and uniformly accepted [1-4]. In 2012, the consensus known as the Lower Anogenital Squamous Terminology (LAST) Standardization Project [4] proposed that diagnoses in uterine cervical biopsies be divided into two groups: high-grade squamous intraepithelial lesions (HGSIL) and low-grade squamous intraepithelial lesions (LGSIL). However, the terminology of cervical intraepithelial neoplasia (CIN 1, 2 and 3) is still used [1-4]. The main diagnostic criteria with the usual hematoxylin and eosin (H&E) staining are based on nuclear alterations, nucleus-to-cytoplasm ratio, loss of polarity, altered maturation, and basal or suprabasal mitosis [1-4]. The most important criterion is nuclear atypia, and it is accepted that any degree of CIN can affect all three thirds of the cervical epithelium [1-3]. Galgano et al. [5] comment that “The CIN nomenclature is mainly the subjective measure of thickness of the affected epithelium, the percentage of replacement of differentiated or mature epithelial cells by abnormal or dysplastic. At least 2/3 of the epithelium are replaced in CIN3, between one third and two thirds in CIN2, and one third or less in CIN1”, giving more importance to the percentage of epithelium replaced by atypical cells. In contrast with the established criteria [1-3], the LAST consensus emphasizes assessing maturation and mitosis in the upper third of the cervical epithelium, and proposes the use of p16 immunostaining to rule out metaplasia [4].
A request for IHC implies a desire to confirm or amend the diagnosis [5,6] and likely reflects acknowledgement of possible diagnostic error. The low levels of interobserver agreement in cervical pathology may be a reflection of this [3-6]. McCluggage et al. [7] (1998) reported a kappa value of 0.30 (range 0.22-0.59) among six expert observers analyzing 125 cervical biopsies. This is discouraging, since biopsy is the gold standard for diagnosis and treatment decisions.
Given the variability in interpreting cervical cases, IHC markers are used. p16INK4A (p16) is an indirect and reliable indicator of damage caused by the E7 oncogene of the human papillomavirus [6], and is particularly useful in the diagnosis of HGSIL [5,6]. Klaes et al. [6] showed that p16 significantly improves the evaluation of cervical lesions among international experts, from 45% with H&E to 95% with p16. In Mexico, in 2006, 64 pathologists were asked to evaluate two cases (one LGSIL and one HGSIL) using H&E; diagnostic agreement was 60% and improved significantly to 100% with the use of p16 [8].
Moreover, the cell proliferation index assessed with the Ki67 antibody reflects the cell cycle alterations caused by the HPV E7 oncogene [5]. In cervical epithelium without injury or with squamous metaplasia, Ki67 positivity is observed discontinuously in the basal layer, and positivity in parabasal or superficial layers is an indicator of intraepithelial lesion [1-5]. When evaluating p16, Ki67 and the HPV L1 capsid protein (which indicates the end of the viral reproductive cycle), diagnoses by general pathologists and experts showed 85% concordance in cases without lesion. Concordance for CIN1, CIN2, CIN3 and invasive carcinoma was 61.9%, 47.6%, 75% and 83.3%, respectively [5]. Darragh and Nucci [4,9] do not advise routine use of p16, although it has recently been established that its use improves interobserver concordance from 0.58 to 0.73 [10]. In daily practice, subjectivity in interpretation is a common source of diagnostic error, even when the histological criteria are applied [1-4,9,10]. Despite the efficacy of IHC markers, they are often not used, which suggests that pathologists may not be aware of a possible wrong diagnosis. No previous work has addressed whether pathologists are aware of possible incorrect diagnoses when evaluating cervical biopsies by histology alone [5,8,10-12]. This study aims to evaluate the use of p16 and Ki67 as an indicator of the pathologist’s awareness of possible diagnostic error with H&E. Our hypothesis was that the immediate availability of IHC would allow pathologists to confirm or amend their diagnosis in three cases of cervical biopsies. No other study has provided observers with the immediate opportunity to consider whether they had established an incorrect diagnosis.
An accurate diagnosis in cervical biopsies impacts treatment. LGSIL (CIN 1) is only followed clinically, while most patients with HGSIL (CIN 2 and CIN 3) require treatment [4]. Galgano et al. [5] highlight the “clinical challenge” of differentiating squamous pre-cancerous lesions and the overtreatment that results from interpretation errors.
MATERIALS AND METHODS
During the joint congress of the Federation of Pathological Anatomy of the Republic of Mexico and the Mexican Association of Pathologists in Mérida, Yucatán, in May 2017, three cases of cervical biopsies stained with H&E were shown to 66 pathologists. Approval of the ethics of the protocol was obtained from the Federation of Pathological Anatomy of the Republic of Mexico. After the histological evaluation, IHC slides (p16 and Ki67 for each case) were available so that participants could correct or confirm their initial diagnosis. Pathologists who reported routinely evaluating cases of cervical pathology were included in the study. Of these, 10 pathologists were classified as experts, according to their practice in a specialized gynecological hospital and / or their experience in performing biopsies in colposcopy clinics or in publications on cervical biopsy. The remaining pathologists were considered general pathologists.
Cervical biopsies
Three cases were selected by one of the authors (JJCV) during clinical practice and submitted for review to two other authors (MEC, AMGH), and a final consensus diagnosis was made. One case corresponded to HPV-negative cervical squamous metaplasia and two cases had LGSIL (CIN1). The cases underwent real-time polymerase chain reaction for HPV (case 1, HPV type 33; case 3, HPV types 52 and 58; all high-risk). We chose these cases because of the greater discrepancy in interobserver evaluation for this group [2-5]. We used the diagnostic criteria of the LAST consensus [4]. Samples were stained with H&E and with monoclonal antibodies for p16 and Ki67 (Bio SB, Santa Barbara, California, USA), with their respective appropriate controls.
Questionnaire
For each case, participants were asked to provide the following: 1) number of years and geographical location of practice; 2) histological diagnosis with routine H&E staining; 3) need for IHC (p16 and Ki67 slides were available for immediate review); 4) after IHC review, confirmation or modification of the initial diagnosis. The objective of the study was explained to all participants, and answering the questionnaire was considered consent to participate.
Statistical analysis
We used descriptive statistics to analyze the answers given by the pathologists. We performed subgroup analyses according to the following variables: expert (yes / no) and years of professional practice (<15 vs. ≥15 years). We used the Kruskal-Wallis rank test to evaluate the difference in diagnosis before and after the use of IHC, and the Wilcoxon rank-sum test to assess differences between initial and final diagnosis by expertise and years of professional practice. Interobserver agreement was evaluated with the Kendall concordance coefficient (W), since these were ordinal variables. An alpha level of 0.05 was considered statistically significant. We performed all statistical analyses in Stata v12 (College Station, TX) and Prism v7 (GraphPad Software, Inc. CA).
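As an illustration of the agreement measure named above, the following is a minimal sketch of Kendall's W with the standard correction for ties, written in plain Python. It is not the Stata routine used in the study. The ordinal coding (metaplasia = 0, CIN1 = 1, CIN2 = 2, CIN3 = 3, CIS = 4), the function names, and the example ratings are all assumptions made for illustration only.

```python
from collections import Counter


def average_ranks(scores):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the window while the next value ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def kendalls_w(ratings):
    """Kendall's W with tie correction.

    ratings: one list per rater, each holding an ordinal score per item.
    """
    m = len(ratings)      # number of raters
    n = len(ratings[0])   # number of items rated
    rank_rows = [average_ranks(row) for row in ratings]
    rank_sums = [sum(row[j] for row in rank_rows) for j in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    # Tie correction: sum of (t^3 - t) over each group of tied scores per rater.
    ties = sum(t ** 3 - t for row in ratings for t in Counter(row).values())
    denom = m ** 2 * (n ** 3 - n) - m * ties
    return 12 * s / denom if denom else float("nan")


# Hypothetical ratings: four raters, three cases, coded 0 (metaplasia) to 4 (CIS).
example = [
    [1, 0, 1],
    [2, 1, 2],
    [3, 0, 1],
    [1, 1, 2],
]
print(round(kendalls_w(example), 3))
```

W ranges from 0 (no agreement in the rank ordering) to 1 (identical rankings by all raters), which is the scale on which the values reported in the Results are read.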
RESULTS
A total of 67 pathologists agreed to answer the survey; one did not complete it and was excluded, leaving 66 pathologists in the final analysis. Thirty-one (47%) pathologists practice in Mexico City and the rest in 19 cities in other regions that cover most of the country. Ten (15.2%) pathologists were considered experts in gynecological pathology. The average number of years of professional practice was 16.6 years (standard deviation, 13.35; range, 1-45 years). Considering the years of practice, the sample was subdivided into the following two groups: 1-14 years (n = 31) and ≥15 years (n = 35). All pathologists provided their initial diagnosis for each of the three cases.
Case analysis
Case 1: The consensus diagnosis of the first case was CIN 1 (Table 1) (Figure 1). Twenty-one pathologists (31.8%) diagnosed it correctly, while only 25 pathologists (37.9%) considered it necessary to use IHC. Of the whole group, 55 pathologists (83.3%) evaluated p16 (Figure 1C) and Ki67 (Figure 1D). Using p16, 22 pathologists (33.3%) diagnosed the case correctly; using Ki67, 24 (36.4%) did so. After IHC analysis, 22 of 61 pathologists diagnosed CIN 1 (36.1%), of whom only 5 (8.2%) had modified their initial diagnosis. Of the 21 pathologists who initially diagnosed the case correctly, two did not give a final answer and two changed their diagnosis to metaplasia (6%). On the other hand, four pathologists who initially answered CIN2 and one who answered CIN3 changed their diagnosis to the correct one (7.6%; Table 2). The change in diagnosis after the use of IHC was statistically significant (Kruskal-Wallis rank test = 27.895, df = 3; p = 0.0001). There was no difference according to expertise (Wilcoxon rank-sum test = 0.028; p = 0.98) or years of professional practice (Wilcoxon rank-sum test = -0.754; p = 0.45).
Case 2: The consensus diagnosis was squamous metaplasia (Table 1) (Figure 2). Twenty-five (37.9%) pathologists responded correctly, while 32 (48.5%) considered it necessary to use IHC. Of the whole group, only 12 (18.2%) modified their initial diagnosis. Evaluation of p16 yielded 21 (31.8%) correct diagnoses, while 25 (37.9%) pathologists had a correct diagnosis using Ki67 (Figure 2 C-D). After IHC review, 28 of 63 (42.4%) pathologists diagnosed metaplasia, of whom 6 (9.1%) had actually modified their previous diagnosis. Of the 25 pathologists who initially diagnosed metaplasia, one did not give a final answer and two (4.5%) changed to CIN1 and CIN2, respectively. Five pathologists initially diagnosed CIN1 and another CIN3; all six (9.1%) amended to the correct diagnosis. The change in diagnosis after the use of IHC was statistically significant (Kruskal-Wallis rank test = 30.822, df = 4; p = 0.0001) (Table 2).
Case 3: The consensus diagnosis of the third case was CIN1 (Table 1) (Figure 3). Thirty-four (51.5%) pathologists responded correctly, while half of the participants (n = 33; 50%) requested IHC. Fifty percent (n = 33) of the whole group modified their initial diagnosis. Fifty-seven (86.4%) pathologists evaluated p16, and only 15 (22.7%) had a correct diagnosis. Of the 58 (88%) pathologists who evaluated Ki67, only 15 (23%) reached the correct diagnosis. After analyzing the IHC slides, 16 of 62 pathologists diagnosed CIN 1 (25.8%), of whom only 4 (6.1%) had modified their previous diagnosis. Of the 34 pathologists who answered correctly at the beginning, 22 (64.7%) subsequently changed their diagnosis (Metaplasia = 1; CIN2 = 13; CIN3 = 4; CIS = 1; Did not answer = 3). In contrast, four pathologists (6.1%) initially diagnosed CIN2 and finally modified their diagnosis to CIN1. We did not find significant differences in diagnosis after the use of IHC (Kruskal-Wallis rank test = 4.565, df = 3; p = 0.2) (Table 2).
Interobserver agreement
At the initial evaluation with H&E, the Kendall concordance coefficient (W) was 0.355 (p < 0.001), which corresponds to an unacceptable level of agreement. Moreover, this figure decreased with the use of p16 (W = 0.129; p < 0.001) and Ki67 (W = 0.205; p < 0.001). After complete evaluation with IHC, interobserver concordance was W = 0.267 (p < 0.001), which still represents a very low level of agreement.
Table 1: Pathologist responses.
| | Diagnosis (%) | | | | | Immunohistochemistry (%) | | |
| | Metaplasia | CIN1 | CIN2 | CIN3 | CIS | Yes * | No | No answer |
| Case 1 | | | | | | 25 (37.9) | 40 (60.6) | 1 (1.5) |
| Initial diagnosis (n = 66) | 0 | 21 (31.8) | 24 (36.4) | 13 (19.7) | 8 (12.1) | | | |
| p16 (n = 55) | 2 (3.0) | 22 (33.3) | 15 (22.7) | 13 (19.7) | 3 (4.6) | | | 11 (16.7) |
| Ki67 (n = 55) | 2 (3.0) | 24 (36.4) | 19 (28.8) | 7 (10.6) | 3 (4.6) | | | 11 (16.7) |
| PostIHC diagnosis (n = 61) | 3 (4.6) | 22 (33.3) | 21 (31.8) | 10 (15.1) | 5 (7.6) | | | 5 (7.6) |
| Case 2 | | | | | | 32 (48.5) | 33 (50.0) | 1 (1.5) |
| Initial diagnosis (n = 66) | 25 (37.9) | 26 (39.4) | 8 (12.1) | 4 (6.1) | 3 (4.5) | | | |
| p16 (n = 53) | 21 (31.8) | 13 (19.7) | 18 (27.3) | 1 (1.5) | 13 (19.7) | | | 13 (19.7) |
| Ki67 (n = 54) | 25 (37.9) | 18 (27.3) | 5 (7.6) | 4 (6.1) | 2 (3.0) | | | 12 (18.2) |
| PostIHC diagnosis (n = 63) | 28 (42.4) | 20 (30.3) | 7 (10.6) | 5 (7.6) | 3 (4.5) | | | |
| Case 3 | | | | | | 33 (50.0) | 31 (47.0) | 2 (3.0) |
| Initial diagnosis (n = 66) | 1 (1.5) | 34 (51.5) | 27 (40.9) | 4 (6.1) | 0 | | | |
| p16 (n = 57) | 0 | 15 (22.7) | 30 (45.5) | 8 (12.1) | 4 (6.1) | | | 9 (13.6) |
| Ki67 (n = 58) | 2 (3.0) | 15 (22.7) | 25 (37.9) | 12 (18.2) | 4 (6.1) | | | 8 (12.1) |
| PostIHC diagnosis (n = 62) | 1 (1.5) | 16 (24.2) | 29 (43.9) | 12 (18.2) | 4 (6.1) | | | 4 (6.1) |
Table 2: Variation between initial and final diagnosis after IHC evaluation.
Case 1: Correct CIN1
| Initial diagnosis | Final diagnosis (%) | | | | | | |
| | Metaplasia | CIN1 | CIN2 | CIN3 | CIS | Did not answer | Total |
| Metaplasia | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| CIN1 | 2 (3.0) | 17 (25.8) | 0 | 0 | 0 | 2 (3.0) | 21 (31.8) |
| CIN2 | 1 (1.5) | 4 (6.1) | 17 (25.8) | 0 | 0 | 2 (3.0) | 24 (36.4) |
| CIN3 | 0 | 1 (1.5) | 2 (3.0) | 9 (13.6) | 0 | 1 (1.5) | 13 (19.7) |
| CIS | 0 | 0 | 2 (3.0) | 1 (1.5) | 5 (7.6) | 0 | 8 (12.1) |
| Total | 3 (4.5) | 22 (33.3) | 21 (31.8) | 10 (15.1) | 5 (7.6) | 5 (7.6) | 66 (100) |
Case 2: Correct Metaplasia
| Initial diagnosis | Final diagnosis (%) | | | | | | |
| | Metaplasia | CIN1 | CIN2 | CIN3 | CIS | Did not answer | Total |
| Metaplasia | 22 (33.3) | 1 (1.5) | 1 (1.5) | 0 | 0 | 1 (1.5) | 25 (37.9) |
| CIN1 | 5 (7.6) | 18 (27.3) | 1 (1.5) | 0 | 0 | 2 (3.0) | 26 (39.4) |
| CIN2 | 0 | 1 (1.5) | 5 (7.6) | 2 (3.0) | 0 | 0 | 8 (12.1) |
| CIN3 | 1 (1.5) | 0 | 0 | 3 (4.5) | 0 | 0 | 4 (6.1) |
| CIS | 0 | 0 | 0 | 0 | 3 (4.5) | 0 | 3 (4.5) |
| Total | 28 (42.4) | 20 (30.3) | 7 (10.6) | 5 (7.6) | 3 (4.5) | 3 (4.5) | 66 (100) |
Case 3: Correct CIN1
| Initial diagnosis | Final diagnosis (%) | | | | | | |
| | Metaplasia | CIN1 | CIN2 | CIN3 | CIS | Did not answer | Total |
| Metaplasia | 0 | 0 | 1 (1.5) | 0 | 0 | 0 | 1 (1.5) |
| CIN1 | 1 (1.5) | 12 (18.2) | 13 (19.7) | 4 (6.1) | 1 (1.5) | 3 (4.5) | 34 (51.5) |
| CIN2 | 0 | 4 (6.1) | 14 (21.2) | 5 (7.6) | 3 (4.5) | 1 (1.5) | 27 (40.9) |
| CIN3 | 0 | 0 | 1 (1.5) | 3 (4.5) | 0 | 0 | 4 (6.1) |
| CIS | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 1 (1.5) | 16 (24.2) | 29 (43.9) | 12 (18.2) | 4 (6.1) | 4 (6.1) | 66 (100) |
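To make the reading of Table 2 concrete, the sketch below tallies how many observers kept, lost, or gained the correct diagnosis for Case 3. The counts are transcribed from the Case 3 block of Table 2, taking the CIN2-to-metaplasia cell as 0 (the value implied by the row total of 27). The names `CASE3` and `summarize` are hypothetical and serve only this illustration.

```python
LABELS = ["Metaplasia", "CIN1", "CIN2", "CIN3", "CIS", "Did not answer"]

# Case 3 counts from Table 2: rows = initial diagnosis, columns = final diagnosis.
CASE3 = {
    "Metaplasia": [0, 0, 1, 0, 0, 0],
    "CIN1": [1, 12, 13, 4, 1, 3],
    "CIN2": [0, 4, 14, 5, 3, 1],   # first cell assumed 0 (row total is 27)
    "CIN3": [0, 0, 1, 3, 0, 0],
    "CIS": [0, 0, 0, 0, 0, 0],
}


def summarize(table, correct):
    """Count observers who kept, lost, or gained the correct diagnosis."""
    col = LABELS.index(correct)
    initially_correct = sum(table[correct])
    kept = table[correct][col]
    # "Lost" includes observers who gave no final answer.
    lost = initially_correct - kept
    gained = sum(row[col] for name, row in table.items() if name != correct)
    return {
        "initially_correct": initially_correct,
        "kept": kept,
        "lost": lost,
        "gained": gained,
        "finally_correct": kept + gained,
    }


result = summarize(CASE3, "CIN1")
print(result)
print(f"share lost: {100 * result['lost'] / result['initially_correct']:.1f}%")
```

Under these counts, 22 of the 34 initially correct observers (64.7%) moved away from CIN1 and only 4 moved to it, which matches the figures given in the case analysis.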
DISCUSSION
In the present study, 66 Mexican pathologists evaluated three cervical biopsies and were given the immediate option of complementing the diagnosis with the IHC markers p16 and Ki67. The main objective of the study was to evaluate awareness of diagnostic error, which we defined as the decision to evaluate IHC slides in order to establish a definitive diagnosis. This is the first study to assess awareness of diagnostic error among pathologists. Our results show that diagnostic accuracy for CIN was very poor in a cohort of Mexican pathologists, even when using IHC. We also found no difference between those considered experts and general pathologists (Figure 4).
The percentage of correct diagnoses for each case was low (31.8%, 37.9%, and 51.5%, respectively), with an average of 40.4%. The evaluation of IHC did not improve diagnostic certainty, which only rose 4.5%. More pathologists evaluated the IHC slides than had requested them: 45.4% initially requested IHC, but 83.3% and 84.3% examined the p16 and Ki67 slides, respectively, although most did not modify their initial diagnosis. This suggests that the criteria for p16 evaluation, with or without Ki67, are not applied correctly, despite their description in several studies of cervical pathology [1,5,7,8,10-12]. A meta-analysis on the use of p16 in biopsy and cervical cytology found that interpretation criteria are not uniform [12]. We can assume that there is little (or no) experience in p16 evaluation. This IHC marker is not widely used in Mexico; therefore, its use and interpretation may not be adequate. The Ki67 antibody is widely used to evaluate different entities in surgical pathology, and our results show better interpretation scores for it (36.4%, 37.9%, and 23%, respectively) than for p16.
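The averages quoted above can be checked directly from the counts in Table 1. The following is a small sketch that assumes the denominator is the 66 participants for every case; the variable and function names are illustrative only.

```python
N = 66  # pathologists included in the final analysis

# Counts taken from Table 1 (initial H&E diagnosis and IHC request).
initial_correct = {"Case 1": 21, "Case 2": 25, "Case 3": 34}
ihc_requested = {"Case 1": 25, "Case 2": 32, "Case 3": 33}


def pct(count, total=N):
    """Percentage of the cohort represented by a count."""
    return 100 * count / total


def mean_pct(counts):
    """Unweighted mean of the per-case percentages."""
    return sum(pct(c) for c in counts.values()) / len(counts)


for case, count in initial_correct.items():
    print(f"{case}: {pct(count):.1f}% correct")
print(f"mean correct: {mean_pct(initial_correct):.1f}%")
print(f"mean incorrect: {100 - mean_pct(initial_correct):.1f}%")
print(f"mean IHC requested: {mean_pct(ihc_requested):.2f}%")
```

With these counts the mean correct rate comes out at about 40.4% (so about 59.6% incorrect), and the mean IHC request rate at roughly 45%.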
The level of agreement among pathologists in this field generally improves with the adjunctive use of IHC slides [4-9]. Stoler et al. (2018) [10] found a kappa index of 0.58 with H&E that increased to 0.73 with p16. In our study, interobserver agreement, as measured by the Kendall concordance coefficient, yielded an initial value of 0.355 with H&E. Moreover, after IHC interpretation this coefficient decreased to 0.267, which indicates a poor level of agreement and highlights that the criteria are not being applied correctly.
The clinical implications of these results are important: both squamous metaplasia and LGSIL were frequently upgraded to HGSIL. LGSIL, which denotes a benign lesion, is only followed by clinical observation [5]. The excessive diagnosis of HGSIL seen in our study would translate into unnecessarily aggressive treatment [4,5]. In addition, the degree of distress that patients attribute to a diagnosis of HPV-related lesions is considerable [13]. Why does the pathologist fail to make a correct diagnosis? The interpretation of the criteria is subjective in daily practice [1-4], and even experts cannot achieve unanimous consensus [1]. As mentioned before, the main H&E diagnostic criteria rest on nuclear alterations, and grading nuclear atypia is very subjective [1-4]. There is also a difference in criteria on how to grade CIN: whether the atypia may affect all three thirds of the cervical epithelium or just one third or two thirds [1,3,5]. The latter approach gives more importance to the percentage of epithelium replaced by atypical cells than the LAST consensus does; LAST emphasizes assessing maturation and mitosis in the upper third of the cervical epithelium and encourages the use of the two-tier (high-grade and low-grade) terminology instead of CIN.
The economic repercussions for the health system are not minor. In Mexico, 20 million of approximately 68 million women [14] are candidates for cervical-vaginal cytology screening. Between 4% and 6% may have abnormal results [4]. As the results of this study suggest, if we biopsy 50% of them (800,000 biopsies per year), more than half (480,000 biopsies) would have an incorrect diagnosis (mostly overdiagnosis of HGSIL). Moreover, if a treatment costs $10,000 (ten thousand Mexican pesos) per patient, our institutions would pay $4,800,000,000.00 (4.8 billion pesos) as a result of diagnostic error. This figure represents 0.8% of the budget approved for the health sector in 2019 [15].
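The cost estimate above is a chain of multiplications, sketched below for transparency. The 800,000 and 480,000 figures are taken as stated in the text rather than derived, and all variable names are illustrative assumptions.

```python
# Figures as stated in the text (Mexican pesos, per year).
screening_candidates = 20_000_000
abnormal_pct_range = (4, 6)        # percent of screened women with abnormal results
biopsies_per_year = 800_000        # as stated
misdiagnosed_biopsies = 480_000    # as stated ("more than half")
cost_per_treatment_mxn = 10_000

# Expected number of abnormal cytology results at each end of the range.
abnormal_results = tuple(
    screening_candidates * p // 100 for p in abnormal_pct_range
)

# Share of biopsies assumed to be misdiagnosed, and the resulting cost.
misdiagnosis_share = misdiagnosed_biopsies / biopsies_per_year
total_cost_mxn = misdiagnosed_biopsies * cost_per_treatment_mxn

print(f"abnormal results: {abnormal_results[0]:,} to {abnormal_results[1]:,}")
print(f"misdiagnosis share: {misdiagnosis_share:.0%}")
print(f"estimated cost: ${total_cost_mxn:,} MXN")
```

The misdiagnosis share implied by the two stated biopsy figures is 60%, in line with the proportion of incorrect initial diagnoses observed in this study, and the product with the per-treatment cost gives the 4.8 billion pesos quoted.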
It is known that the diagnostic categories with the lowest degree of interobserver agreement are the distinction between squamous metaplasia and LGSIL (CIN1), and between LGSIL (CIN1) and HGSIL (CIN 2-3) [3-5,7,10]. However, most courses and literature on cervical pathology focus on the evaluation of high-grade lesions and invasive neoplasia (unpublished personal communication).
Therefore, it is necessary to raise pathologists' awareness so that established histological criteria are applied more effectively and useful IHC markers are evaluated correctly.
CONCLUSIONS
Our study shows that a sample of 66 Mexican pathologists had a 45.4% degree of awareness of diagnostic error, as measured by the evaluation of p16 and Ki67, in three cases of cervical biopsies. Only 38.8% had the correct diagnosis, with low concordance after H&E review (Kendall’s W = 0.355) that dropped further to 0.267 after IHC evaluation. There was no difference between experts and general pathologists. Training programs in cervical pathology need to be emphasized and updated to improve diagnostic accuracy and the pathologist’s awareness of the clinical implications for patients.