Skin thickening evaluated by the mRSS is a measurement often used in the clinic to assess patients with either lSSc or dSSc [7, 8, 24]. Unfortunately, relatively large intra-observer and inter-observer variability decreases its applicability for individual patients in the clinic. The coefficient of variation is about 20%, indicating that a change needs to be greater than 20% to exceed the variability of the measure (similar to the joint counts in rheumatoid arthritis). The mRSS is further confounded by edema and by the need for a 6–9-month trial [25].
In our study, using an 18-MHz ultrasonic probe, we calculated inter-observer and intra-observer correlation coefficients of 0.88 and 0.93, respectively. This corresponds to an intra-observer coefficient of variation of 2.2%, substantially lower than that for the mRSS.
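To make the comparison of coefficients of variation concrete, the following is a minimal sketch of how such a value could be computed from repeated readings, and of how it bounds the change that can be distinguished from measurement noise. The function names and the repeated readings are hypothetical and are not taken from the study data; the rule that a change must exceed the coefficient of variation simply restates the reasoning given above for the mRSS.

```python
from statistics import mean, stdev


def coefficient_of_variation(measurements):
    """Return the coefficient of variation (sample SD / mean) as a percentage."""
    return stdev(measurements) / mean(measurements) * 100.0


def exceeds_measurement_noise(change_percent, cv_percent):
    """Return True if an observed relative change is larger than the CV."""
    return abs(change_percent) > cv_percent


# Hypothetical repeated readings (mm) of one site by one observer; not study data.
repeat_readings_mm = [1.30, 1.33, 1.27, 1.31, 1.29]
cv_percent = coefficient_of_variation(repeat_readings_mm)

# A 15% change would be lost in the noise of a measure with a CV of about 20%
# (the figure quoted for the mRSS) but not in one with a CV of 2.2%.
detectable_with_mrss = exceeds_measurement_noise(15.0, 20.0)
detectable_with_hfu = exceeds_measurement_noise(15.0, 2.2)
```

Under these assumptions, the same 15% change in an individual patient would be interpretable with the ultrasound measure but not with the mRSS.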
Scheja et al. demonstrated in a study of 41 patients with SSc that inter-observer variability when using ultrasound with a 20-MHz probe to assess skin thickness in the phalanx, hand and forearm was only 1%, 4.2% and 0.0016%, respectively [26]. In addition, Moore et al., who established a 17-point skin HFU scoring system, calculated correlation coefficients of 0.93 and 0.95 for inter-observer and intra-observer dermal measurements, respectively [14]. These findings support our study results.
ROC analysis defined a minimum detectable difference (MDD) for HFU assessment of skin thickness in SSc in this data set. The AUC was 0.831 (P < 0.001), with the best deflection point at 7.4 mm. Sensitivity was 77.4% and specificity was 87.1% to separate active from inactive disease using the EUSTAR-DAI at a cut point of 2.5. Thus, we identified a skin thickness of 7.4 mm as the MDD that best separated normal skin from skin affected by SSc when we used phalanx/hand/forearm/leg/chest, previously defined as the five sites of ultrasound assessment [21, 22], as a composite measure.
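As an illustration of how a cut point and its sensitivity and specificity can be read from an ROC analysis, the sketch below selects the threshold that maximizes Youden's J (sensitivity + specificity - 1) and computes the AUC from pairwise ranks. The use of Youden's J is an assumption made for this example, since the criterion behind the deflection point is not specified here, and the thickness values are hypothetical rather than study data.

```python
def sensitivity_specificity(values, is_case, cutoff):
    """Classify value >= cutoff as positive; return (sensitivity, specificity)."""
    tp = sum(1 for v, c in zip(values, is_case) if c and v >= cutoff)
    fn = sum(1 for v, c in zip(values, is_case) if c and v < cutoff)
    tn = sum(1 for v, c in zip(values, is_case) if not c and v < cutoff)
    fp = sum(1 for v, c in zip(values, is_case) if not c and v >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff_youden(values, is_case):
    """Return (cutoff, sensitivity, specificity) maximizing J = sens + spec - 1."""
    best = None
    for cutoff in sorted(set(values)):
        sens, spec = sensitivity_specificity(values, is_case, cutoff)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


def auc_rank(values, is_case):
    """AUC as the fraction of case/control pairs in which the case is thicker.

    Tied pairs count as one half.
    """
    cases = [v for v, c in zip(values, is_case) if c]
    controls = [v for v, c in zip(values, is_case) if not c]
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in cases
        for b in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical composite skin thickness values (mm); not the study's data.
thickness_mm = [6.1, 6.5, 6.8, 7.0, 7.8, 7.2, 7.5, 7.9, 8.3, 8.8]
has_ssc = [False, False, False, False, False, True, True, True, True, True]

cutoff, sens, spec = best_cutoff_youden(thickness_mm, has_ssc)
auc = auc_rank(thickness_mm, has_ssc)
```

With these illustrative values the selected cut point is 7.2 mm; a different criterion, such as the point closest to the top-left corner of the ROC curve, could select a different threshold from the same data.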
In view of the fact that more patients with lSSc were recruited into the study, we used the composite phalanx/hand/forearm site and the local phalanx site separately for further ROC analyses (see Additional file 5: Figure S5 and Additional file 6: Figure S6). For skin thickness at the phalanx/hand/forearm sites, the AUC was 0.869 and the cutoff value was 4.3 mm, with the same sensitivity as the five-site composite (77.4%) but with higher specificity (93.5%). For the local phalanx site, the AUC was 0.946 and the cutoff value was 1.3 mm, with 87.1% sensitivity and much higher specificity (96.8%) to discriminate thickened skin from normal skin. These figures suggest that a better distinction can be shown using fewer ultrasound sites in patients with lSSc, especially if the sites assessed are tailored to the type of patient and to the sites of clinical disease.
In the present study, we demonstrated that TST in patients with SSc is significantly greater than in healthy controls (P < 0.001), similar to other research [15, 16]. There were also statistically significant differences locally, at the forearm, hand, phalanx and chest. There was no difference in skin thickness on the legs between patients with SSc and normal controls. This may have been because few patients with lower extremity involvement were recruited into the study, but further research will be needed to examine this finding.
The TST correlated positively with the mRSS and negatively with disease duration, similar to data reported by others [22]. Skin thickness decreased as the skin progressed from interstitial edema through thickening to atrophy [21]. This point is illustrated by the correlation between the mRSS and TST. In a longitudinal study by Hesselstrand et al., the coefficient for correlation between the mRSS and HFU decreased over time from 0.63 (in patients with disease duration of less than 1 year) to 0.40 (in patients with disease duration of 1–3 years) [22]. We found that patients with a normal clinical skin score (mRSS = 0) still had some increased thickening identified by HFU (P < 0.01), as have others [27]. This finding may help explain the weaker correlation between the mRSS and HFU over time.
The mRSS is considered by many to be one of the most important aspects in the classification of different SSc subtypes [28]. Sedky et al. showed that ultrasound TST in patients with dSSc was thicker than in those with lSSc (P = 0.002), especially in the chest wall [16]. We did not find differences between dSSc and lSSc using HFU-TST. However, the number of patients with dSSc was small (N = 4), so the power to identify differences between dSSc and lSSc was very low.
The skin thickness measured by HFU in the chest wall of patients with SSc was greater than in normal controls (P < 0.05), despite the fact that 87.1% of our patients had lSSc and thus, by definition, did not have clinical skin thickening on the chest (see Table 2). The meaning of this relative skin thickening at the chest in patients with lSSc as assessed by HFU will require further research in longitudinal studies, although it is congruent with the finding that even when mRSS = 0 in dSSc, the skin is measured as thicker when assessed by HFU.
There was a low-to-moderate positive correlation between HFU-TST and the EUSTAR-DAI (r = 0.436, P = 0.014) in patients with SSc. This suggests that thickened skin could predict more active disease, although this degree of correlation indicates either the need for many more patients or that active disease is predicted by more than skin thickness alone. HFU-TST was also only moderately positively correlated with the mRSS (r = 0.416, P = 0.020), and the correlation remained in the multivariate regression model (t = 0.335, P = 0.044). This supports the possibility that the mRSS might reflect the increased skin thickness primarily in the early disease phase and that skin thickness and the mRSS could become disconnected over time [17, 22, 29].
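The relationship between a correlation coefficient, the number of patients and statistical significance can be sketched with the standard t statistic for a Pearson correlation, t = r·sqrt(n − 2)/sqrt(1 − r²). The sample size of 31 used below is an assumption made for illustration (it is inferred from the percentages reported in this section rather than stated here), and the function name is hypothetical.

```python
import math


def pearson_t_statistic(r, n):
    """Return the t statistic for testing a Pearson correlation r against zero.

    The statistic has n - 2 degrees of freedom.
    """
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Assumed sample size for illustration only; not stated in this section.
n_patients = 31

t_dai = pearson_t_statistic(0.436, n_patients)   # HFU-TST vs EUSTAR-DAI
t_mrss = pearson_t_statistic(0.416, n_patients)  # HFU-TST vs mRSS
```

Because the statistic grows with sqrt(n − 2) for a fixed r, a correlation of this size reaches a given significance level more easily in a larger cohort, which is one reason a study of more patients would help clarify these findings.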
It may be important to recognize and treat SSc in its edematous phase, during which patients are more responsive to medication than in the sclerotic or atrophic phase [29]. Hesselstrand et al. reported that patients in the edematous phase of their disease (usually of short duration) had increased skin thickness assessed by HFU, but with low echogenicity (which represents high water content, i.e. more interstitial fluid). It is intriguing that, in that study, skin thickness correlated negatively with skin echogenicity measured by ultrasound (P = 0.001). That relationship implies that edema results in increased skin thickness and decreased skin echogenicity. Over time, echogenicity increased, correlating with increasing fibrosis [29]. Although our patients had early disease, the finding that TST correlated negatively with disease duration in our data set indicates that these patients had a range of degrees of edema and fibrosis, going from the edematous to the fibrotic/atrophic phase as disease duration increased. Overall, this suggests that objective HFU techniques may be able to differentiate early edematous disease from early fibrosis and may even be able to detect thickening that is not clinically evident. These findings will need corroboration and significant research to ascertain their meaning.
Overall, the mRSS is simple to perform for the assessment of skin changes, can estimate skin thickness and samples large areas of the skin, but observer bias and low sensitivity detract from its accuracy [30]. HFU may be able to examine both skin thickness and interstitial edema, which may mean that this modality is better able to examine prognosis and determine drug response [31]. However, standardized imaging, reduction of operator variability, definition of appropriate skin sites and examination of a large range of patients all need significant further research [21].
This study has some limitations. First, it is cross-sectional; a longitudinal study would be beneficial and is being planned. Second, it involves relatively few patients from a single center. However, it is an exploratory effort that has given insights into the MDD and has identified some degree of correlation with a measure of disease activity. A study of more patients is needed, involving a broader set of clinical measures such as the Clinical Response Index for SSc (CRISS) or including gastrointestinal, specific myocardial and lung involvement. Third, we needed to use total skin measurement using HFU because our data precluded separation of the epidermis and dermis. An analysis in a group of patients with earlier, more diffuse disease may help address this issue. Further, we found no difference between limited and diffuse disease using HFU and the mRSS and did not find differences when assessing skin on the legs; thus, inclusion of more patients with diffuse disease and lower extremity involvement would have been helpful.