The performance of different classification criteria sets for spondyloarthritis in the worldwide ASAS-COMOSPA study

Background In this study, we sought to compare the performance of spondyloarthritis (SpA) classification criteria sets in an international SpA cohort with patients included from five continents around the world.
Methods Data from the Assessment of SpondyloArthritis international Society (ASAS) COMOrbidities in SPondyloArthritis (ASAS-COMOSPA) study were used. ASAS-COMOSPA is a multinational, cross-sectional study with consecutive patients diagnosed with SpA by rheumatologists worldwide. Patients were classified according to the European Spondyloarthropathy Study Group (ESSG), modified European Spondyloarthropathy Study Group (mESSG), Amor, modified Amor, ASAS axial spondyloarthritis (axSpA), ASAS peripheral spondyloarthritis (pSpA) and ClASsification criteria for Psoriatic ARthritis (CASPAR) criteria. Overlap between the classification criteria sets was assessed for patients with and without back pain. Furthermore, patients fulfilling different arms of the ASAS axSpA criteria (imaging arm, clinical arm, both arms) were compared on the presence of SpA features.
Results A total of 3942 patients (5 continents, 26 countries) were included. The mean age was 43.6 years, 65.0% were male, 56.2% were human leucocyte antigen B27-positive and 64.4% had radiographic sacroiliitis (based on modified New York criteria). Of the patients, 85.5% were classified by the ASAS SpA criteria (87.7% ASAS axSpA, 12.3% ASAS pSpA). Fulfilment of the Amor, ESSG and CASPAR criteria was present in 83.3%, 88.4% and 21.6% of patients, respectively. Of the patients with back pain (n = 3227), most were classified by all three of Amor, ESSG and ASAS axSpA criteria (71.4%). Patients fulfilling the imaging arm and the clinical arm of the ASAS axSpA criteria had similar presentations of SpA features. In patients without back pain, overlap between classification criteria sets was seen, although to a lesser extent.
Conclusions Most patients with a clinical diagnosis of axial SpA in the worldwide ASAS-COMOSPA study fulfil several classification criteria sets, and a substantial overlap between different criteria sets is seen, which suggests a high level of credibility of the criteria. Large inter-regional differences in the fulfilment of classification criteria were not found. Patients fulfilling the clinical arm were remarkably similar to patients fulfilling the imaging arm with respect to the presence of most SpA features.
Electronic supplementary material The online version of this article (doi:10.1186/s13075-017-1281-5) contains supplementary material, which is available to authorized users.


Background
Spondyloarthritis (SpA) encompasses a group of interrelated rheumatic conditions: ankylosing spondylitis (AS), including earlier forms of the disease that do not yet exhibit definitive structural damage on radiographs; psoriatic arthritis (PsA); arthritis associated with inflammatory bowel disease (IBD); and reactive arthritis [1]. Because SpA may have a heterogeneous presentation, a correct diagnosis is challenging. Rheumatologists make a diagnosis on the basis of what they have been taught during rheumatology training. The 'art of diagnosing' starts with a list of potential differential diagnoses, from among which the trained clinician deduces the most appropriate disease based upon the recognition of the 'Gestalt' and exclusion of other diagnoses.
Classification serves a completely different purpose, and several classification criteria sets of SpA are available. These classification criteria should be applied only in patients who have been diagnosed with SpA by a rheumatologist, and they cannot be used as a check box to be ticked in order to make the diagnosis. But the components of classification criteria may remind the clinician of the clinical picture of the disease. Different criteria sets put an emphasis on different features, and we do not know to what extent different criteria sets have penetrated different parts of the world. Therefore, we do not know which sets have influenced clinicians in particular regions most or to what extent these various sets of criteria describe more or less similar patients. Consequently, we do not know if rheumatologists around the world diagnose patients with a similar clinical picture of the disease.
The European Spondyloarthropathy Study Group (ESSG) criteria and the Amor criteria were developed to classify patients with SpA as a whole [2,3]. In clinical practice, rheumatologists tend to distinguish patients with SpA according to their primary clinical presentation as patients with predominantly axial or predominantly peripheral complaints (with some overlap between these subtypes). The Assessment of SpondyloArthritis international Society (ASAS) has developed new criteria to better accommodate this distinction [4,5]. These criteria sets can classify patients with predominantly axial symptoms as having axial spondyloarthritis (axSpA) and patients with predominantly peripheral symptoms as having peripheral spondyloarthritis (pSpA). The ASAS axSpA criteria consist of two arms: the imaging arm classifies patients who have sacroiliitis visualised on conventional radiographs and/or bone marrow oedema on magnetic resonance imaging (MRI), and the clinical arm classifies patients with normal imaging results. In 2006, a specific classification criteria set for PsA was developed, known as the ClASsification criteria for Psoriatic ARthritis (CASPAR) [6].
Classification criteria are used to include patients in clinical trials, cohort studies and other types of research. These criteria are frequently validated in restricted patient populations. We took the opportunity to investigate if rheumatologists worldwide diagnosed similar types of patients as having SpA by testing if patients fulfil similar criteria sets in the Assessment of SpondyloArthritis international Society COMOrbidities in SPondyloArthritis (ASAS-COMOSPA) study. Our assumption was that the more criteria sets a patient fulfils, the higher the likelihood that a patient with a diagnosis of SpA truly has SpA. The ASAS-COMOSPA study provides a unique opportunity to investigate this research question because it is, to our knowledge, the first observational study with such a large, worldwide population of patients with SpA, with axial and/or peripheral symptoms included [7].

Study population
The ASAS-COMOSPA study is an observational, cross-sectional, multicentre study which has been introduced elsewhere [7]. Participating rheumatologists were asked to include consecutive patients with a diagnosis of SpA from routine care. These patients had to fulfil the ASAS axSpA or pSpA criteria, but fulfilment of the ASAS criteria was not checked before inclusion. All information required to judge the fulfilment of various criteria sets, including the ASAS criteria, was collected in a random order (not grouped by criteria set) in the case report form.
Patients from 26 participating countries in 6 regions across the world (Western Europe, Central Europe, North America, Latin America, North Africa and Asia) were included. Western Europe was represented by Belgium, France, Germany, Hungary, Italy, the Netherlands, Portugal, Spain and the United Kingdom. Poland, Russia, Turkey and Ukraine were grouped into Central Europe. North America encompasses Canada and the United States, and Argentina, Brazil, Colombia and Mexico were summarized as Latin America. North Africa comprised Egypt and Morocco. China, Japan, Korea, Singapore and Taiwan were grouped and referred to as Asia. Approval by the local medical ethics committees, as well as written informed consent from all patients, was obtained before inclusion.

Classification criteria
Patients were classified according to the following criteria sets: ESSG, Amor, ASAS SpA, ASAS axSpA, ASAS pSpA, imaging arm of ASAS axSpA, clinical arm of ASAS axSpA and CASPAR criteria [8]. The presence of either inflammatory back pain (IBP) or peripheral arthritis is a mandatory entry criterion of the ESSG criteria. According to the ESSG criteria, patients with at least one of the entry criteria in combination with one other minor criterion, such as enthesitis or psoriasis, are classified as having SpA [2]. Human leucocyte antigen B27 (HLA-B27) is not incorporated in this criteria set. The Amor criteria include a list of features with different weights, none of which is essential to classify a patient as having SpA; instead, a classification of SpA depends on the sum of weights [3]. Because patients in the COMOSPA study were not asked about the presence of balanitis, night pain and buttock pain, these items have not been taken into account, and therefore patients cannot collect points on these items in the Amor and ESSG criteria. The ESSG and Amor criteria were developed before MRI became widely available. In the present analysis, we also investigated the possibility of including inflammatory findings on MRI (ASAS definition [9]) as a feature in both the ESSG and Amor criteria, resulting in the modified Amor (mAmor) and modified European Spondyloarthropathy Study Group (mESSG) criteria.
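The ESSG rule as summarised above (an entry criterion plus at least one further criterion) can be sketched in Python. This is a minimal illustration and not the study's analysis code, which was run in SPSS. The function name `fulfils_essg` and its arguments are hypothetical, and the complete list of ESSG items is given in the original publication [2] rather than reproduced here.

```python
def fulfils_essg(ibp, peripheral_arthritis, minor_criteria):
    """Return True if the ESSG rule as summarised in the text is met.

    ibp, peripheral_arthritis: entry criteria (True, False or None).
    minor_criteria: iterable of True/False/None flags, one per minor
    criterion (for example enthesitis or psoriasis).
    None means "not recorded" and is counted as absent.
    """
    # At least one entry criterion must be present.
    has_entry = bool(ibp) or bool(peripheral_arthritis)
    # Count the minor criteria recorded as present.
    n_minor = sum(1 for flag in minor_criteria if flag)
    return has_entry and n_minor >= 1
```

For example, `fulfils_essg(True, False, [False, True])` is true, whereas a patient with minor criteria but neither entry criterion is not classified.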
The ASAS axSpA criteria consist of two arms, the imaging arm and the clinical arm, and can be applied only to patients with back pain of ≥3 months' duration and an age of onset <45 years [10]. In patients with sacroiliitis visualised on pelvic radiographs or MRI, at least one other SpA feature should be present in order to be classified as axSpA according to the imaging arm [4]. In HLA-B27-positive patients, at least two other SpA features should be present in order to be classified as axSpA according to the clinical arm [4]. In patients without current back pain but with current peripheral manifestations, the classification for peripheral SpA can be applied. If a patient satisfies the entry criterion (current arthritis, enthesitis or dactylitis), the patient should have at least one other SpA feature if this is a specific SpA feature or at least two SpA features for less specific features [5]. Altogether, when current back pain (as defined above) is the presenting symptom, the ASAS axial SpA criteria should be applied. If arthritis/enthesitis/dactylitis is the presenting symptom, the peripheral SpA criteria should be applied. Together, these two sets form the ASAS SpA criteria.
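The two-arm logic of the ASAS axSpA criteria described above can be written as a short Python sketch. It is illustrative only. The function name, argument names and feature labels are hypothetical, and we assume, in line with the ASAS axSpA criteria [4], that HLA-B27 positivity counts as an SpA feature for the imaging arm but not as one of the two further features required by the clinical arm.

```python
HLA_B27 = "HLA-B27"  # hypothetical label for HLA-B27 positivity


def asas_axspa_arms(back_pain_3_months, age_at_onset,
                    sacroiliitis_on_imaging, spa_features):
    """Return the ASAS axSpA arms fulfilled, as a subset of
    {"imaging", "clinical"}; an empty set means "not classified".

    spa_features: names of SpA features recorded as present, excluding
    imaging findings. HLA-B27 positivity is listed as HLA_B27.
    """
    features = set(spa_features)
    arms = set()
    # Entry criterion: back pain >= 3 months and age at onset < 45 years.
    if not back_pain_3_months or age_at_onset is None or age_at_onset >= 45:
        return arms
    # Imaging arm: sacroiliitis (radiograph or MRI) plus >= 1 SpA feature.
    if sacroiliitis_on_imaging and len(features) >= 1:
        arms.add("imaging")
    # Clinical arm: HLA-B27 positive plus >= 2 other SpA features.
    if HLA_B27 in features and len(features - {HLA_B27}) >= 2:
        arms.add("clinical")
    return arms
```

Returning the set of arms, rather than a single yes/no answer, makes it straightforward to separate patients fulfilling the imaging arm only, the clinical arm only, or both arms.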
A separate classification criteria set has been developed for PsA: the CASPAR criteria [6]. To meet the CASPAR criteria, the stem of the criteria demands first the presence of inflammatory articular disease and a score of at least 3 points derived from the presence of features such as skin psoriasis, dactylitis, nail lesions or juxta-articular bone formation visualised on radiographs (each feature is assigned a certain number of points). All above-described criteria sets are depicted in Additional file 1.
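The stem-plus-points structure of the CASPAR criteria can be sketched as follows. This is a minimal illustration under stated assumptions: the function and key names are hypothetical, the dictionary covers only the features named in the paragraph above (not the full CASPAR item list), and the point values are those commonly cited for CASPAR (2 points for current psoriasis, 1 point for each other item), which should be verified against the original criteria [6] and Additional file 1.

```python
# Partial, illustrative point table; verify against reference [6].
CASPAR_POINTS = {
    "current_psoriasis": 2,
    "nail_lesions": 1,
    "dactylitis": 1,
    "juxta_articular_bone_formation": 1,
}


def fulfils_caspar(inflammatory_articular_disease, features_present,
                   points=CASPAR_POINTS):
    """Stem (inflammatory articular disease) plus a score of >= 3 points."""
    # The stem is mandatory; without it no score is computed.
    if not inflammatory_articular_disease:
        return False
    # Each feature is counted once, even if listed twice.
    score = sum(points[name] for name in set(features_present))
    return score >= 3
```

With this table, current psoriasis plus dactylitis reaches 3 points, whereas dactylitis plus nail lesions alone reaches only 2.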

Data analysis
Disease characteristics were described using descriptive statistics. The fulfilment of classification criteria was calculated for the cohort as a whole and thereafter per region. Subsequently, overlap between the different classification criteria was investigated and presented in Venn diagrams. This was done for patients with back pain and patients without back pain separately. Next, we looked in detail at the fulfilment of the ASAS axSpA criteria, comparing patients fulfilling only the clinical arm, patients fulfilling only the imaging arm and patients fulfilling both the clinical and imaging arms with regard to demographics and the presence of SpA features. Information on HLA-B27 must be available to be able to classify patients in the 'imaging arm-only' group, and information on imaging must be available to be able to classify patients in the 'clinical arm-only' group. IBM SPSS Statistics version 20.0 software (IBM, Armonk, NY, USA) was used for statistical analysis.
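The overlap analysis amounts to counting, for each patient, which combination of criteria sets is fulfilled; each combination corresponds to one region of a Venn diagram. The Python sketch below illustrates this counting step. It is not the SPSS procedure used in the study, the function name `overlap_counts` is hypothetical, and the records are made up purely for illustration.

```python
from collections import Counter


def overlap_counts(patients, criteria):
    """Count patients per combination of fulfilled criteria sets.

    patients: list of dicts mapping criteria-set name to True/False/None.
    criteria: names of the criteria sets to compare.
    Returns a Counter keyed by frozenset of fulfilled criteria names.
    """
    counts = Counter()
    for patient in patients:
        # None or a missing key is treated as "not fulfilled".
        fulfilled = frozenset(
            name for name in criteria if patient.get(name)
        )
        counts[fulfilled] += 1
    return counts


# Made-up records for illustration only (not study data).
toy_patients = [
    {"Amor": True, "ESSG": True, "ASAS axSpA": True},
    {"Amor": True, "ESSG": True, "ASAS axSpA": True},
    {"Amor": False, "ESSG": True, "ASAS axSpA": False},
    {"Amor": False, "ESSG": False, "ASAS axSpA": None},
]
counts = overlap_counts(toy_patients, ["Amor", "ESSG", "ASAS axSpA"])
```

Dividing each count by the number of patients in the subgroup (with or without back pain) gives the percentages reported for each region of the diagram.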

Results
In total, 3984 patients were included in the COMOSPA study, with a mean number of SpA features of 5.5 (SD 1.8). The most common missing items were MRI of the sacroiliac joints (missing in 1951 patients), the presence of juxta-articular bone formation (missing in 999 patients) and HLA-B27 status (missing in 882 patients). There were 251 patients (6.4%) for whom both sacroiliac joint MRI and radiographs were not performed and 180 patients (4.6%) for whom HLA-B27 in addition was missing.
On the other hand, information on extra-articular manifestations was missing in none of the cases. Arbitrarily, a maximum of 6 missing items (total number of items 18) per patient was accepted. Patients with 7 or more missing items (n = 42) were left out of the analysis, which brings the total number of patients for this analysis to 3942. To define SpA features as present or absent, in order to apply the classification criteria, missing items were regarded as absent.
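The handling of missing items described above (exclusion at 7 or more missing items out of 18, otherwise missing regarded as absent) can be illustrated with a short Python sketch. The function name, the item names and the example record are hypothetical; the thresholds are those stated in the text.

```python
MAX_MISSING_ITEMS = 6  # patients with 7 or more missing items are excluded


def prepare_items(items):
    """Apply the missing-data rule to one patient's items.

    items: dict mapping item name to True, False or None (missing).
    Returns None if the patient is excluded; otherwise a dict in which
    every missing item has been set to False (absent).
    """
    n_missing = sum(1 for value in items.values() if value is None)
    if n_missing > MAX_MISSING_ITEMS:
        return None
    return {name: bool(value) for name, value in items.items()}


# Hypothetical patient with 18 items, 6 of them missing.
example = {f"item_{i}": True for i in range(18)}
for i in range(6):
    example[f"item_{i}"] = None
kept = prepare_items(example)

# One more missing item (7 in total) leads to exclusion.
example["item_6"] = None
excluded = prepare_items(example)
```

Because missing items are counted as absent, the fulfilment percentages computed afterwards are conservative with respect to items that were never assessed.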
Demographics and disease characteristics are depicted in Table 1. Patients had a mean age of 44 years, and 65% were male. The percentages below are given for the total cohort, followed in parentheses by the percentage among patients with available data. HLA-B27 positivity was seen in 56% (73.0%) of the patients, and 54% (57.7%) had an elevated C-reactive protein level. Regarding the presence of sacroiliitis visualised on imaging, 64% (70.0%) presented with sacroiliitis seen on radiographs and 34% (94.8%) with sacroiliitis seen on MRI.

Fulfilment of classification criteria
Most (92.6%) of the 3942 patients fulfilled the mESSG criteria. Fulfilment of Amor, mAmor, ESSG and ASAS criteria was all above 80% (Table 2). A minority (12.3%) of the patients fulfilled the ASAS pSpA criteria, whereas 21.6% of the patients fulfilled the CASPAR criteria. We emphasise that the criteria were applied to all patients; only the patients with seven or more missing values were left out, and missing items were regarded as absent.
Most patients (n = 1507) were included in Western Europe (85 centres from 26 countries), followed by 1073 patients in Asia, 438 patients in Central Europe, 337 patients in Latin America, 337 patients in North Africa and 239 patients in North America. Regional differences in fulfilment of classification criteria are depicted in Table 3. When we looked in detail at the ASAS SpA criteria, we found that in Central Europe, 84% of the patients fulfilled the ASAS axial SpA criteria (ASAS peripheral criteria 5.3%), whereas in North America, 51% of the patients fulfilled the axial SpA criteria (ASAS peripheral criteria 22.6%). In both Asia and Central Europe, a small minority of the patients fulfilled the ASAS pSpA criteria, and axial complaints were by far the predominant symptoms. A relatively high percentage of patients fulfilled the CASPAR criteria in North America compared with the other regions. Less pronounced regional differences were seen regarding criteria sets that cover the whole spectrum of SpA, namely the Amor and ESSG criteria.

Overlap in classification criteria
Venn diagrams representing the overlap between the different criteria sets are shown in Figs. 1 and 2. Regarding the patients without current back pain (peripheral complaints), again substantial overlap between the criteria was seen (ASAS pSpA, Amor, ESSG, CASPAR) (Fig. 2). The largest group of patients fulfilled all four criteria sets (n = 224 [31.3%]). A further 125 patients (17.5%) fulfilled all criteria sets except the PsA-specific CASPAR criteria, which is not surprising, because the CASPAR criteria are focussed on the clinical disease PsA and not on other forms of pSpA. Only six patients (0.8%) fulfilled only the CASPAR criteria, and only four patients (0.6%) fulfilled only the ASAS pSpA criteria. Regarding overlap between the different criteria sets in the different regions, the same trends were seen, and no substantial interregional differences were found (data not shown).

Comparison between patients fulfilling the ASAS imaging arm split by presence of HLA-B27 and the clinical arm only
Disease characteristics of patients fulfilling the imaging arm or the clinical arm only are depicted in Table 4. In addition, characteristics of patients fulfilling the imaging arm are presented on the basis of the presence or absence of HLA-B27. Only patients who have data available on HLA-B27 and imaging are included in Table 4. There were more male patients in the HLA-B27-positive imaging arm (74.1%) than in the HLA-B27-negative imaging arm (50.4%) and the clinical arm (53.1%). Psoriasis was seen more frequently in the group of HLA-B27-negative patients fulfilling the imaging arm. On the contrary, enthesitis and dactylitis were relatively more common in the patients who fulfilled only the clinical arm. A positive family history was also more frequently seen in the clinical arm than in the imaging arm (independent of HLA-B27 status).

Discussion
Appropriate diagnostic criteria for axSpA and pSpA do not exist and, in the absence of an unequivocal gold standard, will never be developed, but various classification criteria are available. These classification criteria have in common that they have been developed using the external standard 'expert opinion'. But expert opinion is not an unequivocal and homogeneous construct and may potentially integrate different pictures of the disease SpA. The present study reveals that, in our cohort, most patients diagnosed as having SpA fulfilled multiple classification criteria sets, which adds to the credibility of the construct of SpA as a recognizable entity. Although the substantial overlap between the different criteria sets for patients with both axial and with peripheral symptoms could be expected, the fact that different criteria sets have been developed for different target populations (e.g., the ESSG, focussed on the whole concept of SpA; the ASAS axial SpA criteria for patients with SpA axial symptoms) could have precluded overlap in different regions of the world. In the present study, we have shown that the significant overlap was consistent all over the world, thus suggesting that rheumatologists worldwide use similar 'pictures' of what SpA is. In other words, they operationalise the construct of SpA approximately similarly. In addition, the huge overlap (e.g., 74.1% of the patients fulfilled all three criteria sets, and only 7.6% fulfilled one set only) confirms that the criteria for SpA are highly credible. As mentioned already, large interregional differences in the fulfilment of classification criteria were not found. This is remarkable in the light of all genetic and environmental differences, as well as differences in resources and health care systems around the world.
In fact, it appears that the clinical picture, and consequently the diagnosis, of SpA is remarkably homogeneous around the world, despite all possible differences in, for example, genetic background, prevalence and medical training.
Of course, there were some notable differences. The most important one was that more patients with PsA and fewer patients with axial disease were included in North America than in other regions. We do not think this reflects a true difference in the prevalence of the different subtypes of the disease. This is supported by a recent systematic review that pooled population prevalence estimates for SpA, AS and PsA in geographic areas [11]. The prevalence of both the axial and peripheral subtypes was, on average, comparable in North America to other parts of the world. More likely, the difference could be due to local factors, such as a difference in areas of interest of the doctors including patients or referral centres for a certain disease. One reason may be the perception of PsA as belonging to SpA or not. It is well known that some rheumatologists view PsA as a separate entity and others view PsA as a subtype of SpA. Apparently, more doctors in North America than in other parts of the world consider patients with PsA as having SpA.
Regarding the inclusion criteria of the study, doctors were required to include patients with SpA only if they thought the patient would fulfil the ASAS SpA criteria (either peripheral or axial). However, fulfilment of the ASAS criteria was not formally checked before inclusion, as described in the Methods section above. When analysing the data, it became clear that only 85.5% of the patients actually did fulfil the ASAS SpA criteria, ranging from 73.6% in North America to 91.4% in North Africa. This implies that the large majority of patients with SpA are indeed covered by the criteria, pointing to high sensitivity but also indicating that doctors diagnose SpA in patients who do not fulfil the ASAS criteria. However, we would like to make a critical comment which relates to a limitation of the present study. The fact that rheumatologists were initially asked to include ASAS SpA patients (although the ASAS criteria were not met in all patients) could very well have led to an 'a priori' high percentage of ASAS classification criteria fulfilment. This could have led to an overestimation of the sensitivity of the criteria.
The ASAS classification criteria were developed relatively recently. The criteria were validated in an international study of more than 600 patients with chronic back pain of unknown origin. In the ASAS study population, the ASAS criteria compared favourably with other previously established criteria sets with regard to sensitivity and specificity. In our study, if patients with axial symptoms were picked up by one criteria set only, of all sets tested, the ASAS axSpA criteria were most sensitive (although the others performed well, too). The latter could be due to the fact that the ASAS-COMOSPA study is not a cohort of early disease (as reflected by 65% modified New York criteria positivity). Prior studies have shown that the performance of, for example, the ESSG and Amor criteria in early disease was (slightly) worse than that of the ASAS criteria [12]. A more likely explanation is that the rheumatologists were asked to include patients fulfilling the ASAS criteria.
Although the imaging arm of the ASAS classification criteria is broadly recognized as highly specific, there has been debate on the validity of the clinical arm of the ASAS criteria, which has not been well received by different national and international health care systems. In the literature, it has been argued that patients fulfilling only the clinical arm of the ASAS axSpA criteria should not be considered as having 'true axSpA'. A reason why the clinical arm of the ASAS axSpA criteria has been developed is that MRI is not universally available. In our cohort, in which a large proportion of patients did not undergo MRI, our results demonstrate the value of the clinical arm of the ASAS criteria for scientific research. We found that patients fulfilling the clinical arm were remarkably similar to patients fulfilling the imaging arm with respect to the presence of many SpA features.
Strengths of the study are the multinational cohort and the large number of patients included, which is unique, to our knowledge. Unfortunately, no control group was available, and therefore true specificity of the different classification criteria sets could not be calculated. Another limitation of the study is the relatively high number of missing values, especially when it comes to key items such as HLA-B27 and MRI. Unfortunately, this is a direct consequence of normal clinical practice: if sufficient information has been collected to make a diagnosis, further testing is often not performed (e.g., to save expenses).
We can conclude that, despite the heterogeneous character and varying prevalence of SpA as a disease across the world, similar patients are identified as having SpA by rheumatologists worldwide. Moreover, patients with the diagnosis of SpA usually fulfil multiple criteria sets, providing validity to the criteria, including the relatively new ASAS SpA criteria, as well as to the concept of SpA. We emphasize that classification criteria for SpA were developed for use in epidemiological and clinical research and are not suitable for use as diagnostic tools in clinical practice.

Abbreviations
HLA-B27: Human leucocyte antigen B27; IBP: Inflammatory back pain; IBD: Inflammatory bowel disease; NSAID: Non-steroidal anti-inflammatory drug; CRP: C-reactive protein