
Research paper
Interobserver agreement and validity of bedside ‘positive signs’ for functional weakness, sensory and gait disorders in conversion disorder: a pilot study
  1. Corinna Daum1,
  2. Fulvia Gheorghita1,
  3. Marianna Spatola1,
  4. Vesna Stojanova1,
  5. Friedrich Medlin1,
  6. François Vingerhoets1,
  7. Alexandre Berney2,
  8. Mehdi Gholam-Rezaee3,
  9. Giorgio E Maccaferri2,
  10. Monica Hubschmid2,
  11. Selma Aybek1
  1. 1Department of Neurology and Clinical Neurosciences, University Hospital (CHUV), Lausanne, Switzerland
  2. 2Psychiatric Liaison Service, Psychiatry Department, University Hospital (CHUV), Lausanne, Switzerland
  3. 3Department of Psychiatry, Lausanne University Hospital (CHUV), Lausanne, Switzerland
  1. Correspondence to Dr Corinna Daum, Department of Neurology, University Hospital (CHUV), Rue du Bugnon 46, Lausanne 1011, Switzerland; corinna.daum{at}


Background Conversion disorder (CD) is no longer a diagnosis of exclusion. The new DSM-V criteria highlight the importance of ‘positive signs’ on neurological examination. Only a few signs have been validated, and little is known about their reliability.

Objective The aim was to examine the clinical value of bedside positive signs in the diagnosis of CD presenting with weakness, gait or sensory symptoms by assessing their specificity, sensitivity and their inter-rater reliability.

Patients and methods Standardised video recorded neurological examinations were performed in 20 consecutive patients with CD and 20 ‘organic’ controls. Ten previously validated sensory and motor signs were grouped in a scale. Thirteen additional motor/sensory ‘positive signs’, 14 gait patterns and 1 general sign were assessed in a pilot validation study. In addition, two blinded independent neurologists rated the video recordings to assess the inter-rater reliability (Cohen's κ) of each sign.

Results A score of ≥4/14 on the sensory motor scale showed a 100% specificity (CI 85 to 100) and a 95% sensitivity (CI 85 to 100). Among the additional tested signs, 10 were significantly more frequent in CD than controls. The interobserver agreement was acceptable for 23/38 signs (2 excellent, 10 good, 11 moderate).

Conclusions Our study confirms that six bedside ‘positive signs’ are highly specific for CD with good-excellent inter-rater reliability; we propose to consider them as ‘highly reliable signs’. In addition 13 signs could be considered as ‘reliable signs’ and six further signs as ‘suggestive signs’ while all others should be used with caution until further validation is available.

  • Hysteria



Several attempts have been made to identify positive features supporting the diagnosis of conversion symptoms, with contributions from both psychiatry and neurology. Traditionally, for patients presenting to the neurology department with unexplained somatic symptoms, doctors use several tests to capture non-organic features of possible neurological symptoms. The recently published fifth revision of the DSM1 has further emphasised the role of neurologists in making a positive diagnosis of conversion disorder (CD). A new criterion (B) requires that “clinical findings provide evidence for incompatibility between the symptom and recognised neurological or medical conditions”. This incompatibility is not established by negative findings and exclusion but relies on the presence of positive signs such as the Hoover's sign2,3 in motor weakness, the ictal eye closure sign in seizure4 or an entrainment phenomenon in tremor.5 The neurological examination is very rich and most of these numerous bedside clinical signs are traditionally transmitted through generations of neurologists during their training. In the era of evidence-based medicine, however, there is a need for objective criteria regarding the validity of these signs and in particular a need to verify whether they are reproducible by different examiners. A recent systematic review6 identified 14 clinical signs (for motor, sensory or gait functional symptoms) which had been validated in a controlled sample. These 14 signs had good specificity (from 93% to 100%), suggesting their use should be recommended in the diagnostic process of functional disorders. As patients often present with more than one such ‘positive sign’, combining them in a scale could increase the discriminative power and help in differentiating functional from ‘organic’ causes of the symptoms at the bedside.
The first aim of our study was thus to develop a scale, based on previously validated signs, and test its specificity and sensitivity in a sample of functional patients (with weakness and/or sensory deficits) compared with patients suffering from similar symptoms in whom organic lesions of the central nervous system have been proved by paraclinical examinations (further referred to as ‘organic’). The second aim of our study was to validate additional signs described in the literature, which were never properly evaluated in controlled samples. The systematic review6 also revealed that none of the 38 positive signs found in the medical literature had a proper estimation of inter-rater reliability. Our third aim was thus to assess the interobserver agreement of these signs in a blinded design.



Twenty patients with a diagnosis of CD (according to DSM-IV-TR)7 and a control group of 20 patients with an organic lesion (14 ischaemic, 2 haemorrhagic, 1 inflammatory, 1 infectious and 2 tumoral) were included in the study using a prospective study design. Patients were selected from consecutive admissions to our inpatient and outpatient University tertiary care Neurology Department between 1 March 2011 and 1 March 2012. Subjects were included if they had a measurable sensorimotor deficit with at least 1 point on the NIHSS item 5 (motor arm) or 6 (motor leg). This study focused on patients with a functional motor/sensory or gait disorder. Patients with psychogenic non-epileptic seizures, tremor and other abnormal movements were not included, nor were those with speech or swallowing problems or those with visual, olfactory or hearing disturbances. Exclusion criteria were age <18 years or >85 years, severe aphasia, dementia, neglect, acute confusional state, severe pain, and a history or active comorbidity of alcohol or substance abuse and/or psychosis. All patients had a detailed neurological examination, brain imaging (CT and/or MRI) and, if necessary, other evaluations (electroneuromyography, lumbar puncture, spine MRI, others) for diagnostic workup. All subjects with CD were assessed by a trained liaison psychiatrist to confirm the diagnosis (during the inclusion period, DSM-IV-TR criteria were applied, under which an associated psychological factor was mandatory). All subjects gave their written informed consent and the study was approved by the local ethics committee (Commission cantonale d’éthique de la recherche sur l’être humain, Université de Lausanne, protocole 03/11).


All subjects were examined by one of two authors (CD or SA) and a video recording was simultaneously taken. The neurological examination was standardised in order to assess 38 clinical signs (1 general, 18 motor, 5 sensory and 14 gait signs) (see online supplementary appendices 1 and 2 for details). Each individual sign was judged by the examining neurologist (CD or SA) in a dichotomised manner as being present or absent. No comments were made during the video-recorded examination, but immediately after the examination the examiner filled in a form indicating whether each sign was judged present or not; if necessary, part of the examination could be reviewed on the video recording. In addition, two independent neurologists who were kept blind to the diagnosis watched the video recordings and made the same judgement, filling in exactly the same forms (see online supplementary appendix 1 for detailed interpretation of the signs). These neurologists had at least 3 years of neurological training. For the purpose of this study they attended one training session explaining how to rate each sign and were provided with written information they could consult when performing their ratings. To ensure blinding, all patients were examined in the same room, wearing a hospital gown. The video recordings were cut into two sections (motor/sensory and gait examination) and presented in a pseudorandom order to the blinded neurologists.


Functional neurological symptom scales

Signs selection

We built a scale combining sensory and motor ‘positive signs’ that had already been validated in the literature6 and that were observable on a video recording. Among the seven validated motor signs,6 we thus excluded the abductor sign8 and the abductor finger sign,9 as well as motor inconsistency,10 as we felt the short standardised recording would not allow a proper judgment on these signs. We kept the Hoover's sign,2 as it is the best validated sign and as it could reveal co-contractions11,12 (another positive sign) observable on the video recording. In addition, we included the Spinal injury test13 and co-contraction sign.11,12 Regarding collapsing weakness and give-way weakness2,10,14 (sometimes described as a single sign in the literature), we defined them as two different signs; give-way weakness can be seen when testing segmental strength and collapsing weakness is defined by a sudden loss of tone during the arm stabilisation test, when being touched. We also added the recently validated drift without pronation sign.15 Among the five validated sensory signs,6 we excluded the changing pattern of sensory loss,10,12 again because we felt the short design of our examination would not allow detecting such changes.

Scale scoring

We attributed two points to signs which had a robust validation (either validated in several studies or validated in a controlled sample with well-defined measures): the Hoover's sign,2,3 the give-way sign,2,10,14 the drift without pronation sign15 and the splitting the midline sign.2,10,14,16 All other signs were attributed one point (see scale in table 2).
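As a rough illustration of the scoring rule described above, the following Python sketch sums the points for the signs judged present in one patient. The sign identifiers and the function name are hypothetical labels chosen for this example. Only the four two-point signs are named in the text, so any other sign passed in is simply assumed to be one of the one-point signs of the scale.

```python
# Signs given two points in the text; the identifiers are hypothetical labels.
TWO_POINT_SIGNS = frozenset({
    "hoover",
    "give_way_weakness",
    "drift_without_pronation",
    "splitting_the_midline",
})


def scale_score(present_signs):
    """Sum the points of the scale signs judged present in one patient.

    Assumes every name in `present_signs` is one of the 10 scale signs;
    signs outside TWO_POINT_SIGNS are scored one point each.
    """
    return sum(2 if sign in TWO_POINT_SIGNS else 1 for sign in set(present_signs))


# Hypothetical patient: Hoover's sign, co-contraction and collapsing weakness.
example_score = scale_score(["hoover", "co_contraction", "collapsing_weakness"])
print(example_score)  # 2 + 1 + 1 = 4
```

With four two-point signs and six one-point signs, the maximum of this sum is 14, which matches the denominator of the scale.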

Scale analysis

Choosing a threshold to dichotomise a scale variable is difficult, especially because dichotomisation induces a loss of information. We decided to monitor the predictive properties of each scale variable by controlling the specificity and sensitivity of the dichotomised version. The optimal threshold reflects a trade-off between these two characteristics; thus for each scale variable we chose the threshold with the highest possible specificity for a high enough sensitivity (more than 90% whenever possible).
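The threshold search described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names, the tie-breaking rule and the fallback when no threshold reaches the sensitivity target are assumptions, and the scores are invented for the example.

```python
def sensitivity_specificity(case_scores, control_scores, threshold):
    """Treat a score >= threshold as a positive test."""
    sens = sum(s >= threshold for s in case_scores) / len(case_scores)
    spec = sum(s < threshold for s in control_scores) / len(control_scores)
    return sens, spec


def choose_threshold(case_scores, control_scores, min_sensitivity=0.90):
    """Return (threshold, sensitivity, specificity) with the highest
    specificity among thresholds whose sensitivity exceeds min_sensitivity.

    Assumptions of this sketch: ties are broken towards higher sensitivity
    and then the lower threshold; if no threshold reaches the sensitivity
    target, all thresholds are considered.
    """
    max_score = max(max(case_scores), max(control_scores))
    rows = []
    for t in range(0, max_score + 2):
        sens, spec = sensitivity_specificity(case_scores, control_scores, t)
        rows.append((t, sens, spec))
    eligible = [r for r in rows if r[1] > min_sensitivity] or rows
    return max(eligible, key=lambda r: (r[2], r[1], -r[0]))


# Invented scores, for illustration only.
case_scores = [6, 5, 8, 4, 7, 9, 4, 6, 5, 10]
control_scores = [0, 1, 0, 2, 3, 0, 1, 0, 2, 1]
best = choose_threshold(case_scores, control_scores)
print(best)
```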

Controlled unblinded validation study analysis

Besides the 10 previously validated motor/sensory signs included in the functional scale, we wanted to validate 13 additional motor/sensory signs, 14 gait patterns and one general sign described in the literature but not yet validated (see table 2: additional signs). We compared the observed frequency of each individual sign between patients with CD and the ‘organic’ control group using the two-tailed Fisher's exact test (p<0.05 was considered statistically significant) and calculated their individual specificity and sensitivity.
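For readers who wish to reproduce this kind of per-sign comparison, the sketch below implements a two-tailed Fisher's exact test on a 2×2 table using only the Python standard library, together with sensitivity and specificity. The function names are hypothetical and the counts are invented for illustration; they are not taken from table 2.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x sign-positive subjects in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so that equally likely tables are included.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(p, 1.0)


def sign_sensitivity_specificity(pos_cd, n_cd, pos_ctrl, n_ctrl):
    """Sensitivity: sign present among CD; specificity: sign absent among controls."""
    return pos_cd / n_cd, (n_ctrl - pos_ctrl) / n_ctrl


# Hypothetical counts: sign present in 9/20 patients with CD and 0/20 controls.
p_value = fisher_exact_two_tailed(9, 11, 0, 20)
sens, spec = sign_sensitivity_specificity(9, 20, 0, 20)
print(p_value < 0.05, sens, spec)
```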

Controlled blinded interobserver study analysis

In a third step we assessed the interobserver agreement between the two blinded neurologists by calculating the Cohen's κ inter-rater reliability of each ‘positive sign’. The agreement was considered poor (<0.20), fair (0.21–0.40), moderate (0.41–0.60), good (0.61–0.80) or excellent (0.81–1).
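The agreement statistic and the categories above can be illustrated with a short Python sketch for two raters giving present/absent ratings. The function names and the example ratings are hypothetical; values falling between the printed bands (for example 0.205) are assigned here to the lower category, which is an assumption of this sketch.

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters giving present/absent ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of agreement.
    observed = sum(bool(x) == bool(y) for x, y in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    p_a = sum(map(bool, rater_a)) / n
    p_b = sum(map(bool, rater_b)) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters give the same single rating throughout")
    return (observed - expected) / (1 - expected)


def agreement_label(kappa):
    """Map kappa onto the categories used in the text."""
    if kappa >= 0.81:
        return "excellent"
    if kappa >= 0.61:
        return "good"
    if kappa >= 0.41:
        return "moderate"
    if kappa >= 0.21:
        return "fair"
    return "poor"


# Hypothetical ratings of one sign in 10 videos (1 = present, 0 = absent).
rater_a = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1]
rater_b = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
kappa = cohen_kappa(rater_a, rater_b)
print(round(kappa, 2), agreement_label(kappa))
```

A sign that both raters score identically in every video (all present or all absent) yields no defined κ, which the sketch signals with an error rather than a number.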


Clinical and demographic data are presented in table 1. Patients with CD were significantly younger than ‘organic’ patients. Disease duration was significantly longer in patients with CD than in ‘organic’ patients. This reflects the fact that all ‘organic’ patients were recruited from the inpatient setting, where patients present with acute symptoms, whereas patients with CD were recruited from inpatient and outpatient clinics, where more chronic patients are referred. Both groups presented similar types and distributions of symptoms. Patients with CD had a slightly more severe physical handicap on the Rankin scale, which did not reach statistical significance, and both groups had similar degrees of paresis as measured by items 5 (motor arm) and 6 (motor leg) of the NIHSS.

Table 1

Demographic and clinical data

Results for the functional neurological symptom scale

The frequency of the positive signs in our total sample is presented in table 2. The search for a cut-off score on the sensorimotor scale revealed that a score of ≥4 out of 14 yielded a 95% sensitivity (CI 85 to 100) and 100% specificity (CI 85 to 100).

Table 2

Results: (1) the sensorimotor scale and (2) the additional signs

Results for the unblinded controlled validation study

Among the 10 previously validated motor/sensory signs included in the functional scale, 8 showed a significant difference between the CD and the ‘organic’ groups. The Spinal Injury or SIC test and the systematic failure test failed to reach statistical significance. Of note, these two signs had low frequencies of 10% and 7.5%, respectively.

Five of the 13 additional motor/sensory signs were significantly more frequent in patients with CD: non-concavity of the hand, inconsistence of direction during the arm stabilisation test, the sternocleidomastoid test, a non-pyramidal distribution of paresis (NOT distal>proximal) and irregular drift (during the leg stabilisation test). We also examined 14 gait signs and one general sign (expressive behaviour) (also see table 2 for detailed results).

Only dragging monoplegic gait (or leg dragging)2,17,18 has already been validated in the literature. The other gait patterns have been described in patients with CD, but have never been examined in a controlled study design.17,19–21 Only four of the 14 gait signs showed a statistically significant difference between the two groups in our study: hesitation, flailing arms, bizarre excursion of the trunk and psychogenic Romberg. Of note, the others had an overall frequency of ≤15%.

Expressive behaviour19 (observed as general sign during the whole neurological examination) also showed a statistically significant difference between patients with CD and ‘organic’ patients.

Results for the blinded interobserver study

Overall, the inter-rater agreement was excellent in 2 signs (sternocleidomastoid test and always falling towards support during walking), good in 10 signs (give-way weakness, drift without pronation, co-contraction, splitting the midline, splitting of vibration sense, irregular drift (during arm stabilisation), non concavity of the palm, expressive behaviour, hesitation and tremulousness) and moderate in 11 signs (collapsing weakness, SIC Test, inconsistence of direction (during arm stabilisation), non digiti quinti sign, irregular drift (during Mingazzini), excessive slowness, psychogenic Romberg, leg dragging, non-economic posture, sudden knee buckling, bizarre excursion of trunk). Altogether, 23 of 38 signs had acceptable (κ≥0.41) inter-rater reliability, while the remaining 15 did not (fair or poor agreement, or agreement not assessable).

Among the 10 sensorimotor signs in our scale, 5 had a good inter-rater reliability, 2 a moderate, 1 a fair, 1 a poor and 1 (Hoover's sign) was not applicable (for details please refer to table 2).


The current validation pilot study replicates previous findings for 8 out of 10 previously validated sensorimotor bedside clinical signs, by confirming that they are significantly more often observed in conversion patients than in ‘organic’ patients. When combining these 10 previously validated signs in a scale, we were able to show that a low cut-off score yielded excellent results: with four points or more a 100% specificity could be achieved (CI 85 to 100). These findings suggest that for diagnosing functional sensorimotor disorders, the use of a combined scale can be clinically useful. We found other useful clinical ‘positive signs’ for diagnosing CD (in a controlled unblinded validation study analysis) and for the first time examined the inter-rater reliability of the different ‘positive signs’ of functional weakness, and sensory and gait disorders (in a controlled blinded interobserver study analysis).

Our study has some limitations. First, it was designed to assess a large number of clinical signs in a single neurological examination, but it included signs of very different frequencies. For example, the Hoover's sign can be considered a frequent sign (found in 45% of subjects in our sample and in 41% in another sample,2 that is, 60 patients and 1 control out of 147 subjects), whereas leg dragging is rare (found in 5% of subjects in our sample and 9% in another2). Thus, negative findings in our current study should not suggest these signs are of no clinical value, but are rather due to a lack of power and warrant further investigations.

Second, this pilot study provides useful information on specificity and sensitivity, but one cannot infer the positive predictive value of the signs, as this depends on the prevalence of the disease in the tested population. Our population was highly selected (in a case-control design) and is thus not representative of the distribution of functional symptoms in the general population. To address this issue, further studies with a different design should be performed, including ‘de novo’ patients (ie, before the diagnosis is definitely established).

Third, our patients differed in terms of disease duration (shorter time for the organic control group) and one cannot exclude that it played a role in the frequencies of the signs we report.

Finally, one should stress that the small sample size of this pilot study did not allow for subgroup analysis (including only motor or only sensory clinical presentation) and did not allow further refinement of the individual weight of each sign on the scale. Indeed, the low cut-off score we found on our scale (4/14) suggests that having only between 2 and 4 positive signs is specific enough for a diagnosis of functional neurological disorders, but further studies should aim at defining which signs should be routinely tested in a shorter scale. The current study, however, still brings valuable clinical information as it provides novel data on the interobserver agreement as well as pilot data on additional signs that were never previously validated.

When looking at the interobserver agreement values, our study confirms good to excellent values for 5 out of 10 previously validated motor/sensory signs. It also demonstrated good to excellent values for three additional motor/sensory signs and three gait signs (see table 2). Here again, one should be careful in interpreting poor values for signs that are rare, in light of power issues.

The results of the study, along with the considerable body of literature, underline the fact that many clinical signs, despite a lack of strong evidence to support their use, are very specific and reliable and can be clinically very helpful. In order to help clinicians deal with this issue and sort out which signs are of greater value, we propose to classify them in three categories. We propose to consider as ‘highly reliable signs’ the signs that had (1) been previously validated in other samples AND at the same time (2) shown significant differences between the two groups in the current sample (Fisher p<0.05) AND (3) shown a good to excellent inter-rater reliability (κ>0.6) (table 3). We would consider signs as ‘reliable signs’ when they had (1) been previously validated in other samples OR (2) shown significant differences between the two groups in the current sample (Fisher p<0.05) AND (3) shown a moderate to excellent inter-rater reliability (κ>0.4). Finally, we propose to consider as ‘suggestive signs’ the signs that had (1) a high individual specificity >95% AND (2) a moderate to excellent inter-rater reliability (κ>0.4). All other signs should be used with caution until further validation is available (see table 4).
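The proposed three-tier classification can be written as a small decision function. The sketch below is one possible reading of the criteria, not a definitive implementation: the function and argument names are hypothetical, and the ‘reliable’ rule is interpreted as (previously validated OR significant) AND κ>0.4, which is an assumption about how the criteria combine.

```python
def classify_sign(previously_validated, significant, kappa, specificity):
    """Return the proposed category for one sign.

    previously_validated: validated in other samples (True/False)
    significant: Fisher p < 0.05 in the current sample (True/False)
    kappa: inter-rater kappa, or None when it could not be computed
    specificity: individual specificity as a proportion (0 to 1)
    """
    # A sign without a computable kappa cannot satisfy any kappa criterion.
    k = kappa if kappa is not None else float("-inf")
    if previously_validated and significant and k > 0.6:
        return "highly reliable"
    # Assumed reading: (validated OR significant) AND kappa > 0.4.
    if (previously_validated or significant) and k > 0.4:
        return "reliable"
    if specificity > 0.95 and k > 0.4:
        return "suggestive"
    return "use with caution"


# Hypothetical sign: not previously validated, significant, moderate agreement.
print(classify_sign(False, True, 0.5, 0.90))  # reliable
```

Checking the rules in this order means a sign always receives the strongest category whose criteria it meets.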

Table 3

Proposed classification of positive signs

Table 4

Signs that need further validation and should currently be interpreted with caution

In the future, the use of these validated signs and/or scale may also prove useful when assessing a patient whose chief complaint is not motor or sensory deficit, but non-epileptic seizures or involuntary movements, as these symptoms frequently overlap in patients with CD.

In conclusion, our study provides additional objective data on bedside ‘positive’ clinical signs, showing that for sensory and motor deficits several ‘highly reliable signs’ are available and should be used when diagnosing functional neurological symptoms, following the DSM-V recommendation. Gait signs are less robust, because they are rarer, even though ‘reliable or suggestive signs’ are available.



  • Contributors CD conceived the study, acquired the data (examination of patients, performing of video recordings), performed parts of statistical analysis and wrote the draft of the manuscript. FG, MS, VS and FM participated in data acquisition and reviewed the manuscript. MG-R performed and supervised the statistical analysis and reviewed the manuscript. FV, AB, GEM and MH reviewed the manuscript. SA conceived the study, acquired the data, performed and supervised the statistical analysis and reviewed the manuscript.

  • Competing interests SA was supported by the Swiss National Research Foundation (Advanced researcher grant) and the Bourse Pro-Femme for Lausanne University.

  • Patient consent Obtained.

  • Ethics approval Commission cantonale d’éthique de la recherche sur l’être humain, Université de Lausanne, protocole 03/11.

  • Provenance and peer review Not commissioned; externally peer reviewed.
