The agreement between ultrasound-determined joint inflammation and clinical signs in patients with rheumatoid arthritis

Background Ultrasound (US) is sensitive for detecting joint and tendon inflammation in patients with rheumatoid arthritis (RA). So far, which grade of abnormalities on US corresponds to clinical manifestations is unclear. This study aimed to investigate the agreement between US-detected joint inflammation and clinical signs (joint swelling and tenderness). Methods In this cross-sectional study, 22 joints of the wrists and hands were, respectively, evaluated by physical examination (PE) and ultrasound in RA patients. Gray scale (GS) and power Doppler (PD) of synovitis, detected by ultrasound, were graded by semi-quantitative scoring systems (0–3). Tenosynovitis and peritendinitis were assessed qualitatively (0/1). Results A total of 258 consecutive RA patients were included, with median disease duration of 57 months and mean Disease Activity Score based on 28 joints (DAS28)-ESR/DAS28-CRP of 4.47/3.99. In a total of 5676 joints assessed, the overall concordance rate between positive clinical signs and ultrasound-determined joint inflammation was fair (κ = 0.365, p < 0.01). In wrists, joint tenderness showed higher κ coefficient (κ = 0.329, p < 0.01) with ultrasound-determined joint inflammation than swelling (κ = 0.263, p < 0.01); however, swelling showed higher κ coefficient (κ = 0.156–0.536, p < 0.01) with ultrasound-determined joint inflammation than tenderness (κ = 0.061–0.355, p < 0.01) in metacarpophalangeal (MCP) and proximal interphalangeal (PIP) joints. Synovitis had consistently higher agreement with tenderness and swelling than tenosynovitis/peritendinitis. Tenderness and swelling had the highest κ coefficient with GS ≥ 1 synovial hyperplasia in most MCP and PIP joints, while with GS ≥ 2 synovial hyperplasia in wrists. For all 22 joints, PD ≥ 1 synovitis had the highest κ coefficient with clinical tenderness and swelling. Conclusions Synovitis had better agreement with clinical signs than tenosynovitis/peritendinitis. 
Joint swelling showed better agreement with US-determined inflammation than tenderness for MCP and PIP joints, while the opposite for wrists. Both tenderness and swelling are more likely to correspond to GS ≥ 2 for wrists, GS ≥ 1 for MCP and PIP joints, and PD ≥ 1 for any joint.


Background
Rheumatoid arthritis (RA) is an inflammatory disease characterized by chronic intra-articular and peri-articular synovial inflammation associated with joint destruction and functional impairment [1]. Patients with RA have polyarthritis presenting as joint swelling and tenderness. These signs are identified as joint inflammation by clinicians through physical examination (PE). Swollen and tender joint counts are essential parameters for assessing clinical disease activity and setting the treatment target in RA patients; they are components of the Disease Activity Score based on 28 joints (DAS28) [2], Clinical Disease Activity Index (CDAI), Simplified Disease Activity Index (SDAI) [3], American College of Rheumatology (ACR) response criteria [4], and the new Boolean-based remission criteria [5], in both clinical practice and trials. However, joint tenderness is to some extent subjective, depending on the judgment of the individual patient and clinician. In addition, it is sometimes difficult to determine joint swelling accurately by PE alone when confounding factors such as obesity or edema are present. Ultrasound (US) is a non-invasive, inexpensive, radiation-free imaging technique that allows quick and sensitive assessment of soft tissue inflammation. US shows superior sensitivity and inter-observer reliability in reflecting joint inflammation in RA patients compared to PE [6][7][8], and accuracy equivalent to that of magnetic resonance imaging (MRI) in detecting pathological abnormalities at finger level [9]. US examination can be used to visualize anatomically involved joints with synovial hypertrophy/effusion using the gray scale (GS) mode, and to assess the degree of synovial inflammation and predict subsequent joint damage using the power Doppler (PD) mode [10]. At present, the most frequently employed semi-quantitative scoring system for grading synovial hypertrophy/synovitis is that proposed by Szkudlarek et al. [11]. 
Both GS and PD are graded on a scale of 0-3 according to the severity of synovial hypertrophy and vascularization. Several single-center studies and meta-analyses have demonstrated a predictive value of PD positivity for flare and progressive bone erosion in patients with RA [12][13][14][15]. However, it remains unclear which grade of GS indicates a pathological finding [16][17][18][19][20].
Wrists, metacarpophalangeal (MCP), and proximal interphalangeal (PIP) joints are the most frequently affected joints in RA [21]. Pathologic findings in these joints are considered to be representative of, and precursors to, overall joint damage. The aim of this study was to investigate the agreement between clinically detected signs and US features of joint inflammation in the wrists and hands, and further to determine the grades of GS synovial hyperplasia and PD synovitis that correspond to the presence of tenderness and swelling in an individual joint in RA patients.

Methods
Patients
This cross-sectional study consecutively enrolled 258 patients with RA who visited the rheumatology clinic of Peking University First Hospital between February 2014 and May 2017. All patients fulfilled the 2010 ACR/European League Against Rheumatism (EULAR) classification criteria [22] and were more than 18 years old. All patients had at least 1 tender or swollen joint out of 22 joints (bilateral wrists, MCP1-5, and PIP1-5 joints). Both clinical and US data of all patients were collected and analyzed. The use of conventional synthetic disease-modifying anti-rheumatic drugs (csDMARDs) (methotrexate, leflunomide, hydroxychloroquine, sulfasalazine); glucocorticoids; biological (b) DMARDs (adalimumab, etanercept, abatacept, rituximab); and non-steroidal anti-inflammatory drugs (NSAIDs) was recorded. Patients with other conditions, for instance psoriatic arthritis, gout, a history of trauma and/or joint replacement of a wrist/finger, or obvious joint deformity or mutilation, were excluded from this study. This study was conducted in accordance with the Declaration of Helsinki and was approved by the Institutional Medical Ethics Review Board of Peking University First Hospital. Informed consent was obtained from each patient on entry.

Joint and laboratory assessment
Independent joint assessment for tenderness and swelling was performed by a rheumatologist who was blinded to both clinical and ultrasound data. Tender and swollen joints among 28 areas (bilateral shoulders, elbows, wrists, knees, MCPs, and PIPs) were counted. The patient's global assessment (PGA; 0-100 mm visual analog scale) and the evaluator's global assessment (EGA; 0-100 mm visual analog scale) were rated for each patient by the patient and the evaluator, respectively. Serum C-reactive protein (CRP) concentration and erythrocyte sedimentation rate (ESR) were measured. Disease activity was assessed by DAS28-ESR, DAS28-CRP, CDAI, and SDAI.
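The four composite indices named above can be sketched as follows. The text does not spell out the formulas, so this sketch assumes the standard published definitions of DAS28-ESR, DAS28-CRP, CDAI, and SDAI. The function and parameter names are illustrative only. It is also an assumption that the 0-100 mm PGA is used as the global health term in DAS28, and that the visual analog scales are converted to 0-10 cm for CDAI and SDAI.

```python
import math


def das28_esr(tjc28: int, sjc28: int, esr_mm_h: float, gh_mm: float) -> float:
    """DAS28-ESR from tender/swollen counts (0-28), ESR (mm/h, > 0), global health (0-100 mm)."""
    return (0.56 * math.sqrt(tjc28) + 0.28 * math.sqrt(sjc28)
            + 0.70 * math.log(esr_mm_h) + 0.014 * gh_mm)


def das28_crp(tjc28: int, sjc28: int, crp_mg_l: float, gh_mm: float) -> float:
    """DAS28-CRP; CRP is in mg/L here."""
    return (0.56 * math.sqrt(tjc28) + 0.28 * math.sqrt(sjc28)
            + 0.36 * math.log(crp_mg_l + 1) + 0.014 * gh_mm + 0.96)


def cdai(tjc28: int, sjc28: int, pga_mm: float, ega_mm: float) -> float:
    """CDAI; the 0-100 mm scales are converted to 0-10 cm (assumed convention)."""
    return tjc28 + sjc28 + pga_mm / 10 + ega_mm / 10


def sdai(tjc28: int, sjc28: int, pga_mm: float, ega_mm: float, crp_mg_dl: float) -> float:
    """SDAI = CDAI + CRP, with CRP in mg/dL."""
    return cdai(tjc28, sjc28, pga_mm, ega_mm) + crp_mg_dl


# Hypothetical patient: 4 tender joints, 4 swollen joints, ESR 20 mm/h, PGA 50 mm.
example_das28_esr = das28_esr(4, 4, 20, 50)
```

Note that the tender joint count enters both DAS28 formulas with a coefficient of 0.56, against 0.28 for the swollen joint count.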

US assessment
A LOGIQ E9 ultrasound machine with an ML 6-15 MHz linear probe was used to scan 22 joints (bilateral wrists, MCP1-5, and PIP1-5 joints) in all patients, in both GS and PD modes. The US examinations were performed in a standardized manner according to the EULAR guidelines for musculoskeletal US in rheumatology [23]. The PD settings included a pulse repetition frequency (PRF) of 500-750 Hz, a low wall filter, and Doppler gain, which were adjusted to give the highest sensitivity while avoiding visualization of random noise. The interpretation of lesions was based on the published Outcome Measures in Rheumatology Clinical Trials (OMERACT) definitions [24]. US-determined joint inflammation was defined as synovitis and/or tenosynovitis/peritendinitis. Synovitis was assessed with the semi-quantitative scoring system (0-3) proposed by Szkudlarek et al. [11]; tenosynovitis (wrists and flexor tendons in fingers)/peritendinitis (extensor tendons in fingers) was scored qualitatively (0/1). For synovitis, the maximum GS and PD grades found on the volar and dorsal aspects of a given joint region were recorded as the GS and PD grades for that joint. Synovitis was defined as GS ≥ 1 and/or PD ≥ 1. Tenosynovitis was evaluated in the six extensor compartments and the flexor tendons within the wrist region; involvement of at least one extensor compartment or any flexor tendon in the wrist was considered positive. Peritendinitis was defined as inflammation surrounding the extensor tendons of the fingers, which lack tendon sheaths. Both tenosynovitis and peritendinitis were defined by the presence of GS or PD signal. Either tenosynovitis or peritendinitis at the level of the MCPs and PIPs was regarded as a positive finding. All US scanning was done by one of three ultrasonographers with over 5 years of experience in musculoskeletal ultrasound. In all patients, the clinical examination and US were performed blindly on the same day. 
Five patients were randomly selected to test the inter-observer reliability of the US evaluation between the operators, which was analyzed by intra-class correlation coefficient (ICC). The ICC was 0.986 (95% CI 0.981-0.990) for GS and 0.988 (95% CI 0.983-0.991) for PD, indicating excellent reliability.
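The joint-level definitions above can be expressed compactly in code. The following is a minimal sketch of those rules only; the `JointScan` record and its field names are hypothetical and do not come from the study's data set.

```python
from dataclasses import dataclass


@dataclass
class JointScan:
    """One scanned joint (hypothetical record layout)."""
    gs_volar: int          # gray-scale grade, volar aspect (0-3)
    gs_dorsal: int         # gray-scale grade, dorsal aspect (0-3)
    pd_volar: int          # power Doppler grade, volar aspect (0-3)
    pd_dorsal: int         # power Doppler grade, dorsal aspect (0-3)
    tendon_positive: bool  # tenosynovitis or peritendinitis present (0/1)


def joint_grades(scan: JointScan) -> tuple[int, int]:
    """Joint GS and PD grades: the maximum over the volar and dorsal aspects."""
    return max(scan.gs_volar, scan.gs_dorsal), max(scan.pd_volar, scan.pd_dorsal)


def has_synovitis(scan: JointScan) -> bool:
    """Synovitis is defined as GS >= 1 and/or PD >= 1."""
    gs, pd = joint_grades(scan)
    return gs >= 1 or pd >= 1


def us_inflammation(scan: JointScan) -> bool:
    """US-determined inflammation: synovitis and/or tenosynovitis/peritendinitis."""
    return has_synovitis(scan) or scan.tendon_positive


# A joint with grade 1 GS on the dorsal aspect only, and no Doppler signal.
example = JointScan(gs_volar=0, gs_dorsal=1, pd_volar=0, pd_dorsal=0, tendon_positive=False)
```

Under these definitions, the example joint counts as inflamed on US even though its PD grade is 0.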

Statistical analysis
Statistical analysis was performed with SPSS 21.0. For the descriptive analyses, continuous variables were presented as mean and standard deviation (SD) if normally distributed and as median and interquartile range (IQR) if non-normally distributed. The independent t test and the Wilcoxon signed-rank test were applied, as appropriate. Dichotomous variables were presented as frequencies and percentages and were compared by the χ² test. Absolute agreement and Cohen's kappa (κ) between clinical and sonographic findings were calculated. The κ coefficients were interpreted as follows: 0-0.20 = poor, 0.20-0.40 = fair, 0.40-0.60 = moderate, 0.60-0.80 = good, and 0.80-1.00 = excellent. Kappa measures how much the observed agreement exceeds the agreement expected by chance. Kappa values tend to be low if the data are skewed, even if agreement is very good. P values < 0.05 were considered statistically significant.
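The agreement statistics described above can be illustrated with a short sketch. The analysis itself was run in SPSS, so the code below is a plain re-implementation of Cohen's kappa for two binary ratings, not the authors' procedure. The function names and the toy data are assumptions. Because the interpretation bands share their boundary values, the sketch also assumes that each upper bound is inclusive and that negative values count as poor.

```python
def cohen_kappa(pe: list[int], us: list[int]) -> tuple[float, float]:
    """Return (observed agreement, Cohen's kappa) for two binary ratings."""
    n = len(pe)
    a = sum(1 for p, u in zip(pe, us) if p and u)          # PE+/US+
    b = sum(1 for p, u in zip(pe, us) if p and not u)      # PE+/US-
    c = sum(1 for p, u in zip(pe, us) if not p and u)      # PE-/US+
    d = sum(1 for p, u in zip(pe, us) if not p and not u)  # PE-/US-
    observed = (a + d) / n
    # Chance agreement from the marginal totals of the 2x2 table.
    chance = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    if chance == 1:
        return observed, 0.0  # kappa is undefined here; 0.0 is a convention
    return observed, (observed - chance) / (1 - chance)


def interpret(kappa: float) -> str:
    """Map kappa to the bands used in the text (upper bounds assumed inclusive)."""
    for upper, label in [(0.20, "poor"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "good"), (1.00, "excellent")]:
        if kappa <= upper:
            return label
    return "excellent"


# Toy, skewed data: 100 joints, of which 90 are negative on both PE and US.
pe = [1] * 2 + [1] * 4 + [0] * 4 + [0] * 90
us = [1] * 2 + [0] * 4 + [1] * 4 + [0] * 90
observed, kappa = cohen_kappa(pe, us)
```

In this toy example the two ratings agree on 92 of 100 joints, yet kappa is only about 0.29 (fair), because nearly all of the agreement is on negative joints. This is the skew effect noted in the text.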

Results
Demographics and clinical characteristics of patients
The characteristics of the 258 enrolled patients are shown in Table 1. Their median age was 51.2 years and median disease duration was 57 months; 83.33% were female. The mean (SD) DAS28-ESR and DAS28-CRP were 4.47 ± 1.62 and 3.99 ± 1.51, respectively. The median (IQR) [...] (Fig. 1a, b).

The concordance between tenderness or swelling by PE and US-determined inflammation
The agreement between clinical signs and US-determined inflammation in all joints was fair (78.38%, κ = 0.365, p < 0.01) (Table 3). The highest κ coefficient between clinical signs and ultrasound-determined inflammation [...] Compared with PIP joints, MCP joints showed higher agreement and concordance rates. In PIPs, there were more tender/swollen joints without US-determined inflammation (PE+/US−) than non-tender/non-swollen joints with US-detected inflammation (PE−/US+). The differences between clinical signs and US-determined inflammation in both the wrists and the MCP joints were not significant. In wrists, joint tenderness showed a higher κ coefficient (κ = 0.329, p < 0.01) with ultrasound-determined inflammation than swelling did (κ = 0.263, p < 0.01), whereas in MCP and PIP joints swelling showed a higher κ coefficient (κ = 0.156-0.536, p < 0.01) with ultrasound-determined inflammation than tenderness did (κ = 0.061-0.355, p < 0.05) (Fig. 2).

Concordance (κ coefficient) between clinical tender or swollen joint and GS or PD grades for each joint region
Both GS and PD grades corresponded to clinical tenderness and swelling, with different coefficients in different joint regions (Fig. 3). Joint tenderness and swelling had the highest κ coefficient with GS ≥ 1 synovial hyperplasia in most MCP and PIP joints, but with GS ≥ 2 synovial hyperplasia in wrists. For all 22 joints, PD ≥ 1 synovitis had the highest κ coefficient with both tenderness and swelling of joints by PE.
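The cut-off comparison reported above amounts to dichotomizing the US grade at each candidate threshold and keeping the threshold with the highest κ against the clinical sign. The following is a minimal sketch of that idea. The helper names and the ten-joint toy data set are assumptions and are not taken from the study.

```python
def cohen_kappa(x: list[bool], y: list[bool]) -> float:
    """Cohen's kappa for two binary ratings of the same joints."""
    n = len(x)
    observed = sum(bool(a) == bool(b) for a, b in zip(x, y)) / n
    px = sum(bool(a) for a in x) / n
    py = sum(bool(b) for b in y) / n
    chance = px * py + (1 - px) * (1 - py)
    # Kappa is undefined when chance agreement is 1; 0.0 is used as a convention.
    return (observed - chance) / (1 - chance) if chance < 1 else 0.0


def best_cutoff(grades: list[int], sign: list[bool],
                cutoffs: tuple[int, ...] = (1, 2, 3)) -> tuple[int, dict[int, float]]:
    """Dichotomize grades at each cut-off (grade >= cut-off) and return the best one."""
    scores = {c: cohen_kappa([g >= c for g in grades], sign) for c in cutoffs}
    return max(scores, key=scores.get), scores


# Toy wrist-like data: only joints with GS grade 2 or 3 are clinically swollen.
gs_grades = [0, 0, 0, 1, 1, 1, 2, 2, 3, 3]
swollen = [False] * 6 + [True] * 4
cutoff, kappas = best_cutoff(gs_grades, swollen)
```

With this toy data the cut-off GS ≥ 2 gives κ = 1, while GS ≥ 1 and GS ≥ 3 give lower values, so `cutoff` is 2. The same sweep can be run with PD grades or with tenderness as the clinical sign.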

Discussion
Eradication of joint inflammation is the prerequisite for retarding the progression of bone destruction in RA; therefore, accurate assessment of disease activity has become critical. All clinical indicators for the evaluation of disease activity include the tender joint count (TJC) and swollen joint count (SJC), which are obtained by PE. However, the perception of pain is highly subjective and may be influenced by a number of issues, including socio-cultural factors [25][26][27]. Moreover, Basu et al. were the first to provide objective neuroimaging evidence that RA is a mixed pain state displaying characteristics of central sensitization [28]. Previous studies showed that as many as 50% of patients continue to report clinically significant levels of pain despite excellent control of their peripheral inflammation [29,30]. Joint swelling may be due to synovitis or to non-vascular soft tissue, such as bony joint swelling in osteoarthritis, fat tissue, and subcutaneous edema. Additionally, high intra- and inter-observer variability in PE is inevitable [31][32][33]. It is therefore of concern that joint inflammation in some RA patients may be judged inaccurately because of the limitations of PE. So far, the relative importance of individual clinical signs, and whether it depends on joint region, remains unclear. Ultrasound assessment of the joints shows superior sensitivity and inter-observer reliability in detecting joint inflammation compared to PE in the RA population. However, which grade of abnormalities on US corresponds to clinical manifestations and is detrimental needs to be clarified.
In the calculation of DAS28, the tender joint count is weighted twice as heavily as the swollen joint count [34]. Similarly, other clinical scores imply the same hierarchy of importance. However, in recent years, researchers have proposed the opposite view. Ceponis et al. showed that the agreement between intra-articular PD signal and joint swelling was better than that with joint tenderness to palpation [35]. Similarly, Krabben et al. indicated that the association of inflammation on MRI with swollen joints was stronger than with tender joints, suggesting that the presence of swelling might be more significant than tenderness [36]. In addition, recent studies confirmed that swelling, rather than joint tenderness, is the true predictor of subsequent radiographic progression in RA [37][38][39]. Some patients exhibit persistent chronic synovitis, which manifests as joint swelling and may not be accompanied by pain. Thus, the value of DAS28 as an assessment tool of disease activity has been challenged [37]. In our study, joint swelling showed better agreement with ultrasound-determined inflammation than tenderness for MCP and PIP joints, which is consistent with the view proposed in recent studies that swelling reflects joint inflammation better than tenderness. In contrast, tenderness showed better consistency than swelling with inflammatory lesions on ultrasound in wrists. To our knowledge, this has not been previously described. This discrepancy could be attributed to the fact that the wrist, as a complex joint, is composed of multiple bones and joints, which intercommunicate through a common synovial cavity. The joint capsule is loose and thin on the dorsal side and can contain synovial folds. The relatively scattered distribution of synovial tissue may make swelling of the wrists difficult to judge. Conversely, MCP and PIP joints are small and superficial with a relatively closed joint capsule; therefore, swelling is easily found. 
In wrists, synovial hyperplasia evaluated by GS and synovitis evaluated by PD on ultrasound were more sensitive than tenderness and swelling in reflecting joint inflammation. For PIPs, in contrast, tenderness and swelling detected by PE were much more frequent than ultrasound-determined inflammation. PIPs showed lower agreement and concordance rates between clinical signs and US-determined inflammation than MCPs and wrists. Several possible reasons may contribute to this result. Firstly, compared to wrist and MCP joints, PIP joints are much smaller and more superficial, with a closed joint capsule; therefore, clinical signs can be more easily detected in PIP joints. Secondly, PIPs are more often involved in osteoarthritis, in which joint tenderness or swelling is mainly caused by osteophytes rather than synovitis or tendon inflammation. Thirdly, the most severely affected part of the joint may be missed by US, as it produces a two-dimensional image.
Besides synovitis, a swollen or tender joint can also be due to tenosynovitis/peritendinitis, either coexisting or alone. Previous results indicated better agreement of clinical signs with synovitis than with tenosynovitis/peritendinitis [36]. Compared to MCPs and PIPs, tenosynovitis/peritendinitis of the wrists not only had the highest positive ratio, but also had higher consistency with PE. This may be because tenosynovitis/peritendinitis is more easily detected by ultrasound in the wrist than in the relatively small MCP and PIP joints.
Currently, a cut-off defining active disease from a GS point of view is not available, and the optimum cut-off value of synovial thickness for distinguishing RA patients from healthy individuals varies between joints [40]. Compared to MRI, the sensitivity and specificity of GS (cut-off ≥ 1) for detecting synovitis also vary greatly among the different joint locations [41]. In the present study, tenderness and swelling showed the best consistency with GS ≥ 2 synovial hyperplasia in the wrists. Ogishima et al. reported that the wrists were more prone to developing subclinical synovitis than PIP and MCP joints [13], indicating inaccurate clinical evaluation and the need for US as a complement, particularly in relatively larger joints such as wrists. Another possible explanation is that grade 1 GS synovial hyperplasia should be considered a non-pathologic finding. Witt et al. illustrated that grade 1 GS synovial hyperplasia can be detected in up to 15% of the joints in healthy people, indicating that it is not specific to RA [42]. Generally, the range of capsule distension in healthy individuals shows broad variations that may overlap with the pathologic findings on GS. With a borderline GS finding, it is difficult to distinguish between a pathologic and a physiologic state. In contrast, grade 2 and grade 3 GS findings are more likely to be clinically significant. Unlike in the wrists, GS ≥ 1 synovial hyperplasia showed the best consistency with tenderness and swelling in MCP and PIP joints, indicating that grade 1 synovial hyperplasia may be pathological in these small and superficial joints.
One interesting finding in the present study was that PD ≥ 1 synovitis showed the best consistency with clinical tenderness and swelling for all 22 joints, implying that any grade of PD should be considered clinically significant. Compared with GS, findings on PD are usually more clearly defined. Previous studies reported a high correlation between PD positivity and inflammatory cell infiltration or vascularity in synovial tissues (r = 0.84, p < 0.01) [43]. PD positivity is thus clearly related to histopathological activity in patients with RA. The existence of PD signal predicted clinical relapse and further radiographic progression at both the patient level and the joint level [15].
Only a small proportion of patients received NSAIDs and corticosteroids. NSAIDs and corticosteroids can decrease inflammatory parameters such as tender and swollen joint counts. Furthermore, Zayat et al. have confirmed that the use of NSAIDs may mask the GS and PD signal and result in lower scoring despite continuing disease activity [44]. Previous studies also suggested that clinical parameters, including SJC and TJC, GS and PD synovitis [45,46], and tenosynovitis [47], can improve dramatically after treatment with corticosteroids. Taken together, these agents could improve both the clinical signs and the inflammation detected by ultrasound, so their use is unlikely to have substantially influenced the results of this study.
This study has some limitations. Firstly, due to the absence of follow-up data, it is unclear whether grade 1 GS is associated with radiological progression. The clinical significance of grade 1 GS needs to be addressed in further prospective studies. Secondly, we did not analyze the association of joint tenderness and swelling with other conditions that may contribute to these signs, such as edema, effusion, or osteophytes. Thirdly, neither clinical signs nor ultrasound examination is a direct way to assess joint inflammation accurately. Nevertheless, imaging, especially ultrasound, is currently a sensitive and convenient tool that is widely used in patients with RA.

Conclusions
Joint swelling showed better agreement with US-determined inflammation than tenderness for MCP and PIP joints, while the opposite for wrists. Ultrasound synovitis had better agreement with clinical signs than tenosynovitis/peritendinitis. Both tenderness and swelling are more likely to correspond to GS ≥ 2 for wrists, GS ≥ 1 for MCP and PIP joints, and PD ≥ 1 for any joint.