Background: Although reproducibility in reading MRI images amongst radiologists and clinicians has been studied previously, no studies have examined the reproducibility of inexperienced clinicians in extracting pathoanatomic information from magnetic resonance imaging (MRI) narrative reports and transforming that information into quantitative data. However, this process is frequently required in research and quality assurance contexts. The purpose of this study was to examine inter-rater reproducibility (agreement and reliability) among an inexperienced group of clinicians in extracting spinal pathoanatomic information from radiologist-generated MRI narrative reports.
Methods: Twenty MRI narrative reports were randomly extracted from an institutional database. A group of three physiotherapy students independently reviewed the reports and coded the presence of 14 common pathoanatomic findings using a categorical electronic coding matrix. Decision rules were developed after initial coding in an effort to resolve ambiguities in narrative reports. This process was repeated a further three times using separate samples of 20 MRI reports until no further ambiguities were identified (total n=80). Reproducibility between trainee clinicians and two highly trained raters was examined in an arbitrary coding round, with agreement measured using percentage agreement and reliability measured using unweighted Kappa (k). Reproducibility was then examined in another group of three trainee clinicians who had not participated in the production of the decision rules, using another sample of 20 MRI reports.
Results: The mean percentage agreement for paired comparisons between the initial trainee clinicians improved over the four coding rounds (97.9-99.4%), although the greatest improvement was observed after the first introduction of coding rules. High inter-rater reproducibility was observed between trainee clinicians across 14 pathoanatomic categories over the four coding rounds (agreement range: 80.8-100%; reliability range k=0.63-1.00). Concurrent validity was high in paired comparisons between trainee clinicians and highly trained raters (agreement 97.8-98.1%, reliability k=0.83-0.91). Reproducibility was also high in the second sample of trainee clinicians (inter-rater agreement 96.7-100.0% and reliability k=0.76-1.00; intra-rater agreement 94.3-100.0% and reliability k=0.61-1.00).
Conclusions: A high level of radiological training is not required in order to transform MRI-derived pathoanatomic information from a narrative format to a quantitative format with high reproducibility for research or quality assurance purposes.
This abstract is reproduced with the permission of the publisher. Click on the above link for free full text.