We read with interest the article by Stocchetti et al on the accuracy and feasibility of the ellipsoid and the Cavalieri method in assessment of the volume of intracranial mass lesions in patients with severe head injury.1 We agree with the authors that the volume of intracranial lesions, and its change over time, is important in the diagnosis and management of patients with head injury and in the evaluation of clinical trials.2
However, the methodology used in the study raised our concern. We have several comments on their statements, because they are potentially misleading.
(1) The statement that computer based reading of mass lesions is the method of choice when accurate volume estimation is necessary is insufficiently founded. Tracing CT lesions on a digitised screen, with automatic calculation of area and hence volume, is a hazardous task: hyperdense and hypodense lesions cannot always be delineated reliably from the normal surrounding intracranial structures, because normal brain tissue is isodense at some edges and because of partial volume effects. Moreover, lesion tracing is equivalent to area estimation with a simple device such as a point counting grid with sufficient grid points, and is in fact not superior at all.3
(2) Volume estimations based on Cavalieri's principle have to fulfil one absolute requirement: randomness.4 5 The volume of any object may be estimated from randomised and parallel sections separated by a known distance by summing up the areas of all cross sections of the object and multiplying this sum by the known intersection distance. The total area of all cross sections may be estimated by a stereological point counting method.3 A systematic array of grid intersection points is superimposed on each section. Given random positioning of the test array on each section, the total number of grid intersection points hitting the object of interest affords an unbiased estimate of the total area. In the study of Stocchetti et al randomness has most probably not been accounted for, as it is not mentioned in the text and as the grid was not placed randomly onto the CT slices.1
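The estimator described in point (2) can be illustrated with a minimal sketch. It assumes a square grid, so that each grid intersection point represents an area equal to the grid spacing squared, and equally spaced parallel slices. The function names, the `inside` predicate standing in for the lesion outline, and all numerical values are hypothetical and are not taken from the study under discussion.

```python
import random


def random_grid_offset(grid_spacing_mm, rng=None):
    """Draw a uniform random (x, y) offset for the test grid on one slice.

    Random positioning of the grid is the requirement stressed in the text.
    """
    rng = rng or random.Random()
    return (rng.uniform(0.0, grid_spacing_mm), rng.uniform(0.0, grid_spacing_mm))


def count_grid_hits(inside, field_mm, grid_spacing_mm, offset_mm):
    """Count grid intersection points that fall inside the lesion on one slice.

    `inside(x, y)` is a hypothetical predicate describing the lesion outline.
    """
    width, height = field_mm
    offset_x, offset_y = offset_mm
    hits = 0
    y = offset_y
    while y < height:
        x = offset_x
        while x < width:
            if inside(x, y):
                hits += 1
            x += grid_spacing_mm
        y += grid_spacing_mm
    return hits


def cavalieri_volume_ml(points_per_slice, grid_spacing_mm, slice_distance_mm):
    """Volume = slice distance x area per grid point x total points counted."""
    if grid_spacing_mm <= 0 or slice_distance_mm <= 0:
        raise ValueError("grid spacing and slice distance must be positive")
    area_per_point_mm2 = grid_spacing_mm ** 2  # square grid assumed
    total_points = sum(points_per_slice)
    volume_mm3 = slice_distance_mm * area_per_point_mm2 * total_points
    return volume_mm3 / 1000.0  # 1 ml = 1000 mm^3


# Illustrative counts on five slices, 5 mm grid, 5 mm slice distance.
example_volume = cavalieri_volume_ml([3, 8, 12, 9, 4], 5.0, 5.0)
print(example_volume)  # 36 points x 25 mm^2 x 5 mm = 4500 mm^3 = 4.5 ml
```

In this illustration only 36 points on five slices are counted, which falls short of the 50 points and 10 planes discussed in point (3).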
(3) When Cavalieri's principle is applied, it becomes mathematically possible to calculate the coefficient of error of the individual lesion volume estimate. It declines as the total number of CT planes and the total number of grid intersection points increase. Generally, a coefficient of error of less than 0.05 is obtained if the number of CT planes used is 10 or more, and the number of grid intersection points is 50 or more.3 5 From our own practical experience we and others know that CT at standard settings (5 or 8 mm slice thickness) almost never renders sufficient sections through the lesion mass, except for very large subdural or extradural haematomas. Spiral CT with a 3 mm section distance may overcome this problem. Another advantage of Cavalieri's principle is its applicability to any mass lesion irrespective of size and form.
(4) The average difference between the applied technique and the reference computer based value is 0.57 (SD 9.99!) ml for the Cavalieri method and 0.20 (SD 15.48!) ml for the ellipsoid method, suggesting on average acceptable agreement. However, what really matters is the accuracy, validity and reliability of the individual volume measurements. That these are not very high can be derived from the huge standard deviations of the average differences and from the considerable limits of agreement in the graphical depiction of the results.
Accuracy of the individual measurements has to be high as in the trauma coma data bank (TCDB) classification a volume of greater than 25 ml is defined as a mass lesion.2
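The arithmetic behind point (4) can be made explicit with a short sketch. It assumes the conventional 95% limits of agreement, that is the mean difference plus or minus 1.96 times the SD of the differences, which presumes roughly normally distributed differences. The function name is hypothetical, and the limits actually reported in the original paper may have been computed differently.

```python
def limits_of_agreement(mean_diff_ml, sd_diff_ml, z=1.96):
    """Return (lower, upper) limits of agreement: mean difference +/- z * SD."""
    half_width = z * sd_diff_ml
    return mean_diff_ml - half_width, mean_diff_ml + half_width


# Mean differences and SDs as quoted in the text above.
cavalieri_limits = limits_of_agreement(0.57, 9.99)
ellipsoid_limits = limits_of_agreement(0.20, 15.48)

for name, (lower, upper) in [("Cavalieri", cavalieri_limits),
                             ("Ellipsoid", ellipsoid_limits)]:
    print(f"{name}: {lower:.1f} to {upper:.1f} ml")
```

Under these assumptions the limits are roughly -19.0 to 20.2 ml for the Cavalieri method and -30.1 to 30.5 ml for the ellipsoid method, a spread of the same order as the 25 ml threshold that defines a mass lesion in the TCDB classification.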
(5) Although three examiners read the scans, the interobserver variability was calculated with analysis of variance (ANOVA) on the mean volumes. No intraobserver variability studies were conducted, which can be considered an omission.
(6) The TCDB CT classification, being the resultant of the status of the mesencephalic cisterns, the degree of midline shift, and the presence of a mass lesion, provides a ranking order of the severity of the initial injury (I=normal, II=diffuse injury, III=diffuse injury with swelling, IV=diffuse injury with shift, V=operated mass lesion, and VI=non-operated mass lesion). Intracranial mass lesion volume, although important, is but one of the measured indices in the TCDB classification. We showed recently that the TCDB CT classification system for patients with severe head injury has in fact a high interobserver and intraobserver reliability when used by clinicians without special training in neuroradiology.6
Stocchetti and Colombo reply:
We are grateful to Vos et al for their comments. Because the main reason for our research was the fact that intracranial mass lesions are rarely measured, it is reassuring that some centres have documented expertise on such measurement.
We do not think, however, that our data, and the conclusions drawn from them, are potentially misleading, and we will try to clarify our arguments.
As indicated in the paper, we compared two pragmatic methods and a computer based method. There are, of course, limitations to each method, and tracing on the computer screen can be tricky; however, a careful tracing is feasible in expert hands and we think that the resulting calculation still gives a very acceptable reference point. If this reference method is to be questioned, an appropriate, preferably superior, method should be identified. We cannot think of any pragmatic method that would be the best choice.
Regarding the other points that aroused the concern of Vos et al, we agree on many and will try to clarify them.
(1) Randomness is an important prerequisite; it was not mentioned in the paper but the grid was placed on the CT slices at each reader’s convenience and choice. Whether this was random enough can be debated, but it seems to ensure an adequate guarantee against systematic error.
(2) Our data did not obscure the beauties of the Cavalieri method. The direct estimator performed better than the Ellipsoid method, particularly for irregularly shaped lesions. We agree that the method performs even better with bigger lesions reconstructed using thin slices. As one of our goals was to describe feasibility, however, the comments of Vos et al further stress that the best can be obtained from the Cavalieri method at the price of more time and work, as we verified and reported in the paper. Counting more than 50 points in more than 10 slices adds to the precision, but seriously increases the burden of measurements.
(3) We did not base our comparison only on mean data. We share the concerns of Vos et al on the analysis of mean data to the extent that we have used another method, based on that of Bland and Altman. This method compares every single lesion, obtained with each method, against the corresponding reference value. The results of this detailed comparison are illustrated in figs 1 and 2 of our paper. Numerical data, summarising the analysis according to Bland and Altman, are reported in the text and table. Both in the results section and in the discussion we stated that the mean data were not able to describe the discrepancies found in single cases.
(4) Accordingly, we assessed interobserver variability by ANOVA of individual measurements, and not of the mean data. In other words, we asked whether the measurement of any specific lesion by one examiner was significantly different from the other examiners' results.1-1 The ANOVA on the readings by the three examiners using the Ellipsoid method gave a p value of 0.86, and the same analysis applied to the Cavalieri direct estimator gave a p value of 0.81; we therefore concluded that this analysis excluded significant differences. It seems that the intraobserver variability was not omitted, but was in fact logically considered, and we must apologise if the text was not clear.
(5) We agree with the final paragraph of Vos et al on the structure of the Marshall classification, in which volume is one part of the grading. That was correctly indicated in our paper. From our experience in multicentre, international clinical trials, we are less optimistic about the proper application of the TCDB CT classification, but that is another point in favour of improving the methods for CT readings.
In conclusion we have applied a methodology that seems solid enough to substantiate our conclusion and, we hope, to fulfil the requirements of careful and competent readers.