Comparing a new test with a standard involves measuring disagreement. In the case of measuring carotid artery stenosis, some of the disagreement between different tests is because of inherent differences in how the stenosis is demonstrated (test characteristics). This is what we are most interested in when assessing a new technology. However, some of the disagreement simply reflects variability in how we physically make the measurement with the standard technique. Choosing the point of maximum stenosis, choosing the point in the common carotid artery for use as a denominator, measuring from an eyepiece, or measuring from calipers all introduce variation when measuring carotid stenosis. The resulting observer variability in reporting contributes to disagreement between methods but to some extent is independent of the method used to generate the angiogram in the first place.
In the medical literature, disagreement between methods is often attributed entirely to test characteristics, with little appreciation of the role of observer variability in reporting. When one method is compared with another and disagreements emerge, it is not readily apparent how much of the disagreement is caused by the method used and how much by the process of measurement, unless observer variability data are also presented. In the recent paper from Patel et al, interobserver variability data are presented but their significance in relation to overall agreement does not appear to have been appreciated.1
Using the data from Patel et al (tables 2 and 4) for symptomatic carotid arteries, we note that when 34 carotid digital subtraction angiograms (DSA) were measured by one radiologist and the same films were then reported by a second radiologist, there was disagreement in seven cases. Therefore if only DSA had been used, seven patients would have had “inappropriate” surgery depending on which radiologist read the angiogram. This is not surprising, and such disagreement is a consistent finding in observer variability studies.2,3 Observer variability in reporting DSA therefore produced disagreement in approximately 20% of cases (7 of 34) in this particular series of angiograms. This sets a limit on the maximum agreement that any alternative method can demonstrate when compared with DSA. It is clearly not reasonable to expect better agreement from another method than can be obtained by re-reporting the DSAs themselves. In Patel’s table 2, when the same arteries were assessed by computed tomographic angiography (CTA) there was disagreement with DSA in seven cases, while with magnetic resonance angiography (MRA) and ultrasound there was disagreement in six and seven cases, respectively. The three alternatives thus disagree with DSA to the same extent as can be attributed to observer disagreement in reporting DSA. Put simply, the same number of missed or unnecessary operations would have occurred (roughly 20% in this series) whatever method was used, including DSA alone. Observer variability is not confined to DSA, and the scatter plots from Patel et al (fig 2) would suggest, in keeping with other studies, that observer variability is greater for MRA and CTA than for DSA.1 It is surprising that this did not translate into more clinically important disagreements when MRA and CTA were compared with DSA. This is probably accounted for by the fact that in this study, for MRA and CTA, consensus views were taken for any disagreements greater than 10% between observers.
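The proportions behind this comparison can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted above; it assumes that the same 34 symptomatic arteries form the denominator for every method, and the function and variable names are illustrative only.

```python
def disagreement_rate(disagreements: int, total: int) -> float:
    """Return the fraction of cases in which two readings disagreed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return disagreements / total


# Counts quoted in the letter (from Patel et al, tables 2 and 4).
# Assumption: all four comparisons share the same 34 symptomatic arteries.
TOTAL_ARTERIES = 34
disagreements_with_dsa = {
    "DSA re-read by a second radiologist": 7,
    "CTA": 7,
    "MRA": 6,
    "Ultrasound": 7,
}

rates = {
    method: disagreement_rate(count, TOTAL_ARTERIES)
    for method, count in disagreements_with_dsa.items()
}

for method, rate in rates.items():
    print(f"{method}: {rate:.1%}")
```

On these counts, each alternative method disagrees with DSA in roughly the same proportion of arteries as a second reading of the DSA films does.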
This highlights the important point that combining multiple observations made on the same data will reduce observer variability, and ultimately improve agreement with other methods. Partly for this reason, but also because to some extent the strengths and weaknesses of CTA, MRA, and duplex ultrasound are complementary, we would suggest that a combination of tests (we use the combination of ultrasound and MRA) should be used in preference to DSA.
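Why combining observations reduces observer variability can be illustrated with a standard statistical result: if reading errors are independent and unbiased, the standard deviation of the mean of n readings falls as 1/sqrt(n). The sketch below is a simplified illustration under that assumption. It models simple averaging, not the consensus procedure used in the study, and the single-reader standard deviation of 10 percentage points is a hypothetical value chosen for illustration, not a figure from Patel et al.

```python
import math


def sd_of_mean_reading(single_reader_sd: float, n_readers: int) -> float:
    """Standard deviation of the mean of n independent, unbiased readings."""
    if n_readers < 1:
        raise ValueError("n_readers must be at least 1")
    return single_reader_sd / math.sqrt(n_readers)


# Hypothetical single-reader variability, in percentage points of stenosis.
HYPOTHETICAL_SD = 10.0

for n in (1, 2, 4):
    combined = sd_of_mean_reading(HYPOTHETICAL_SD, n)
    print(f"{n} reader(s): SD of combined measurement = {combined:.2f}")
```

Under these assumptions, two readers reduce the spread by about 29% and four readers halve it; correlated errors between readers would reduce the benefit.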
What is clear from this study is that most of the disagreement between the different methods of measuring carotid stenosis can be attributed to observer variability in reporting rather than to the test characteristics of the individual methods themselves. The 10% of patients injured as a result of DSA in this study, and those who continue to be put at risk from catheter angiography in these circumstances, would be quite entitled to ask why they are exposed to a procedure which appears to offer no great advantage over safer alternatives. We suggest that more studies are not required, simply a more thorough understanding of presently available information.
Doctors Young and Humphrey highlight that differences between tests arise from several factors, some of which are inherent in the test and some of which are attributable to observer variation. Some aspects of observer variation apply to the interpretation of all tests and some are specific to certain tests. In our study we were endeavouring to quantify the effect on patient management if non-invasive tests were used instead of intra-arterial angiography to assess carotid stenosis. Our study has several limitations, including a relatively small sample size and the fact that we were not able to have all scans read by all observers; instead, pairs of observers each concentrated on reading only CTA, or MRA, or DSA. A better design would have been to keep the same workers together in pairs but randomly assign the CTA, MRA, or DSA films to each pair. As it is, it is possible that some of the apparent difference between imaging modalities is specific to the pair of observers, not to the modality. However, imaging studies are difficult to fund and expensive to do, and the design and results of our study reflect a compromise involving all these factors.
We identified that the observer reliability of CT angiography or MR angiography was worse than that for digital subtraction angiography, as highlighted by Drs Young and Humphrey. Also, in general there was more scatter between the observers for the reading of asymptomatic stenoses than for symptomatic stenoses (emphasising the importance of considering patient characteristics, not just the imaging technique). In determining the effect that this disagreement might have on patient management, we used nomograms derived from the European carotid surgery trial, which were based on intra-arterial angiographic measurement of stenosis. We therefore had to use the comparison of non-invasive test reading with DSA rather than being able to use the individual observers’ readings of non-invasive tests. Thus, as Drs Young and Humphrey point out, the actual effect of using non-invasive tests may be worse than we have estimated.
Finally, Drs Young and Humphrey suggest that more studies are not required, but we are not entirely sure that this is true. Non-invasive imaging tests are continually undergoing modifications, many of which may be improvements in accuracy or practicality, but this cannot be assumed to be the case. Much of this tinkering with technology is driven by the manufacturers’ desire to encourage purchase of new machines. Improvements have also occurred in intra-arterial angiography, with smaller and more manoeuvrable catheters and greater awareness of the risks, which may have helped to reduce the risk of angiography. Our “snapshot” of CTA, MRA, and ultrasound is already out of date because contrast MRA is now increasingly used. While we would hope that non-invasive tests (probably in combination rather than alone) would eventually replace intra-arterial angiography in the majority of patients being considered for carotid intervention, we feel it likely that there will always be a need for some intra-arterial angiography in specific cases, or depending on local resources. In any case, DSA did not appear less popular than MRA among the patients in our study. There is certainly room for much more in-depth examination of existing data, but we should not close the door on the need for further studies.