Therapeutic benefit of balneotherapy and hydrotherapy in the management of fibromyalgia syndrome: a qualitative systematic review and meta-analysis of randomized controlled trials

Introduction In the present systematic review and meta-analysis, we assessed the effectiveness of different forms of balneotherapy (BT) and hydrotherapy (HT) in the management of fibromyalgia syndrome (FMS). Methods A systematic literature search was conducted through April 2013 (Medline via Pubmed, Cochrane Central Register of Controlled Trials, EMBASE, and CAMBASE). Standardized mean differences (SMDs) and 95% confidence intervals (CIs) were calculated using a random-effects model. Results Meta-analysis showed moderate-to-strong evidence for a small reduction in pain (SMD −0.42; 95% CI [−0.61, −0.24]; P < 0.00001; I² = 0%) with regard to HT (8 studies, 462 participants; 3 low-risk studies, 223 participants), and moderate-to-strong evidence for a small improvement in health-related quality of life (HRQOL; 7 studies, 398 participants; 3 low-risk studies, 223 participants) at the end of treatment (SMD −0.40; 95% CI [−0.62, −0.18]; P = 0.0004; I² = 15%). No effect was seen at the end of treatment for depressive symptoms and tender point count (TPC). BT in mineral/thermal water (5 studies, 177 participants; 3 high-risk and 2 unclear risk studies) showed moderate evidence for a medium-to-large size reduction in pain and TPC at the end of treatment: SMD −0.84; 95% CI [−1.36, −0.31]; P = 0.002; I² = 63% and SMD −0.83; 95% CI [−1.42, −0.24]; P = 0.006; I² = 71%. After sensitivity analysis, and excluding one study, the effect size for pain decreased: SMD −0.58; 95% CI [−0.91, −0.26]; P = 0.0004; I² = 0%. Moderate evidence is given for a medium improvement of HRQOL (SMD −0.78; 95% CI [−1.13, −0.43]; P < 0.0001; I² = 0%). A significant effect on depressive symptoms was not found. The improvements for pain could be maintained at follow-up with smaller effects. Conclusions High-quality studies with larger sample sizes are needed to confirm the therapeutic benefit of BT and HT, with focus on long-term results and maintenance of the beneficial effects.


Introduction
Fibromyalgia syndrome (FMS) is a debilitating condition of largely unknown etiology and pathogenesis that is characterized by widespread musculoskeletal pain and tenderness, as well as secondary symptoms like fatigue, depression, irritable bowel syndrome and sleep disturbances. A standard therapy regimen is lacking and the condition causes high direct and indirect costs (for example, health care use, sick leave) [1]. In a survey of the German population using the modified American College of Rheumatology (ACR) 2010 preliminary diagnostic criteria for FMS [2], the overall prevalence of FMS was found to be 2.1% to 2.4% in women and 1.8% in men; however, the difference was not statistically significant [3]. Adequate treatment recommendations are therefore needed both in the interests of the welfare of the patient and for economic reasons. Current evidence-based guidelines are built on the fact that there is no single ideal treatment for FMS. Patient-tailored approaches are emphasized, with non-pharmacological and pharmacological interventions recommended according to individual symptoms (for example, pain, sleep problems, fatigue, and depression). In particular, self-management strategies (for example, exercise, psychological techniques) involving active patient participation should be an integral component of the therapeutic plan [4].
In this context, balneotherapy (BT) and hydrotherapy (HT) offer interesting treatment alternatives and are commonly used additional interventions in the management of FMS, despite ongoing debate about their efficacy. Prior research (an Internet survey of 2,596 people with FMS) found that around 26% of individuals suffering from FMS use pool therapy and 74% heat modalities (warm water, hot packs). The interventions perceived to be most effective (effectiveness rating ≥6.0 on a scale of 0 to 10, with 10 being most effective) were rest (6.3 ± 2.5; mean ± SD), heat modalities (6.3 ± 2.3), pain medication (6.3 ± 2.4), sleep medication (6.5 ± 2.7) and pool therapy (6.0 ± 3.0) [5].
However, the mechanisms by which immersion in mineral or thermal water or application of mud alleviates the symptoms of FMS are largely unknown. Pain, the key symptom of FMS, may be relieved by the hydrostatic pressure and the effects of temperature on the nerve endings, as well as by muscle relaxation [6]. Furthermore, it has been shown that thermal mud baths increase plasma levels of beta-endorphin, thus explaining their analgesic and antispastic effect, which is particularly important in patients with FMS [7]. The beneficial effects of water treatments are probably the result of a combination of specific (for example, buoyancy, aquatic resistance, heat) and unspecific effects (for example, change of environment, spa-scenery).
However, the definitions of BT, HT and spa therapy are frequently confused and the terms tend to be used interchangeably [8]. In contrast to HT, which generally employs normal tap water, BT uses thermal mineral water from natural springs, but also natural gases (CO2, iodine, sulfur, radon, et cetera), peloids (mud) and other edaphic remedies (for example, hay) for medical treatment. BT is usually practiced in spas with their special therapeutic atmosphere as part of a complex therapy program, which is why the term is often used synonymously for spa therapy. Thalassotherapy is a special form of BT or spa treatment that uses seawater and the seaside climate. New definitions, such as health resort medicine, rather than BT and spa therapy, have not reached general acceptance [9].
Prior systematic reviews and meta-analyses covering BT (spa therapy) and HT in FMS have respectively covered the literature up to May 2011 [6], and December 2008 [10]. The systematic review by Terhorst et al. (2011) [11] on complementary and alternative medicine analyzed, among others, 11 studies on BT up to December 2010.
The network meta-analysis by Nüesch et al. (2013) [12], which investigated pharmacological and non-pharmacological interventions (land- and water-based aerobic exercise, multicomponent treatment (MCT), BT and cognitive behavioral therapy (CBT)), covered the literature up to 2011. In summary, these reviews found some evidence of beneficial effects arising from BT and HT; however, due to methodological flaws, their efficacy remains unclear.
Despite these limitations, German and Israeli guidelines recommend temporary use of BT and HT (grade B/C) [13,14]. Furthermore, BT and HT are often part of MCT (at least one exercise and one psychological component) but they are not analyzed separately. In several evidence-based guidelines and reviews, MCT and aerobic exercises (land-based or water-based) are strongly recommended [12-15]. The aim of the present review is to offer an update of the literature on BT and HT in FMS, with special focus on separate analyses of the different treatment modalities.

Methods
This systematic review was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [16] and the recommendations of the Cochrane Collaboration [17].

Literature search
Electronic bibliographic databases (Medline via Pubmed, Cochrane Central Register of Controlled Trials, EMBASE, and CAMBASE) were screened up to April 2013. The search strategy was constructed around a broad range of balneotherapeutic and hydrotherapeutic treatments: BT, HT, thalassotherapy, spa therapy, cryotherapy, thermotherapy, and phytothermotherapy combined with FMS. The search filter was restricted to randomized controlled trials (RCTs). Reference lists of relevant articles and reviews were examined for additional studies.
The search strategy for Pubmed was as follows: ("FMS" OR "fibromyal*") AND "RCT" AND ("BT" OR "HT" OR "thalassotherapy" OR "spa therapy" OR "thermotherapy" OR "phytothermotherapy" OR "aquatic" OR "hydrogalvanic" OR "cryo" OR "pool exercise" OR "water-based" OR "pool-based" OR "stanger" OR "mud" OR "thermal water" OR "bath" OR "peloid" OR "natural therapeutic gas" OR "radon"). The search strategy applied a combination of text and keywords (medical subject heading (MeSH) terms) and was adapted for each database if necessary.
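The Boolean structure of the search string above can be sketched in a few lines of Python. This is only an illustration of how the three term groups combine; the function name `build_query` and the grouping into condition, design and intervention terms are assumptions made for this sketch, and the terms are copied as printed in the text rather than expanded.

```python
def build_query(condition_terms, design_term, intervention_terms):
    """Combine term groups as (A OR B) AND "design" AND (C OR D OR ...)."""
    def quote(term):
        return f'"{term}"'

    def any_of(terms):
        return "(" + " OR ".join(quote(t) for t in terms) + ")"

    return " AND ".join(
        [any_of(condition_terms), quote(design_term), any_of(intervention_terms)]
    )


# Terms exactly as printed in the search strategy above.
condition = ["FMS", "fibromyal*"]
interventions = [
    "BT", "HT", "thalassotherapy", "spa therapy", "thermotherapy",
    "phytothermotherapy", "aquatic", "hydrogalvanic", "cryo", "pool exercise",
    "water-based", "pool-based", "stanger", "mud", "thermal water", "bath",
    "peloid", "natural therapeutic gas", "radon",
]

query = build_query(condition, "RCT", interventions)
print(query)
```

Keeping the term groups as separate lists makes it easier to adapt the string for each database, as described in the text.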

Inclusion and exclusion criteria
The criteria were as follows: 1) types of study: RCTs were eligible only if they were published as full paper articles. No language restrictions were made; 2) types of participants: patients of any age diagnosed with FMS based on recognized criteria were included; 3) types of intervention: studies that compared any kind of BT (mineral/thermal water, spa treatment, thalassotherapy, thermotherapy, peloids, natural therapeutic gas) or HT (treatment in plain water with or without exercise) with no treatment or any active treatment were included. Studies were excluded if BT/HT treatments were not the main intervention or if the intervention in the treatment and control groups was the same and only the co-therapies differed; and 4) types of outcome: studies assessing at least one symptom-specific outcome of the major FMS symptoms [18], such as pain (for example, tender point count (TPC), visual analog scale (VAS)), fatigue, sleep disturbances, depressive symptoms, health-related quality of life (HRQOL) and/or relevant pain-related psychological issues such as self-efficacy pain and/or objective tests of physical fitness, were included.

Data extraction
Two authors of the present review (JN, CS) independently extracted relevant study information (for example, participants, characteristics of the intervention and control, outcome measures, results) using predefined data fields, including risk-of-bias indicators. Where necessary, inconsistencies were resolved by discussion until consensus was achieved. For quantitative analysis the mean post-test values, or change scores when available, were used.

Risk of bias assessment
The risk of bias for each study was determined independently by the same two authors (assessment of information in study reports) using the criteria of the Cochrane risk-of-bias tool. Disagreements were resolved by discussion to achieve consensus.
Summary assessment of risk-of-bias key domains (selection, performance, detection, attrition and reporting bias) was based on the three-tiered rating style as proposed by Higgins et al. [19]. Performance bias was not considered a key domain due to the required participatory nature of BT and HT. Studies with a high risk of bias in one of the key domains or unclear risk in at least two key domains were considered to be at high risk of bias. Studies with unclear risk in one of the key domains were considered to have unclear risk of bias. Only studies with low risk of bias in all key domains were graded as having low risk of bias. Analysis was done with the Review Manager (RevMan) version 5.2 risk-of-bias tool from the Cochrane Collaboration [21].
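The three-tiered summary rule described above can be written as a small decision function. The following Python sketch is illustrative only: the function name `overall_risk`, the domain labels and the treatment of each key domain as a single judgement are assumptions made here, whereas the review itself relied on the RevMan risk-of-bias tool and judged some domains on more than one item.

```python
# Key domains used for the summary rating; performance bias is left out,
# as stated in the text.
KEY_DOMAINS = ("selection", "detection", "attrition", "reporting")


def overall_risk(judgements):
    """Summarize per-domain judgements ('low', 'unclear', 'high') for one study."""
    ratings = [judgements[domain] for domain in KEY_DOMAINS]
    n_unclear = ratings.count("unclear")
    # High risk: any key domain at high risk, or at least two unclear domains.
    if "high" in ratings or n_unclear >= 2:
        return "high"
    # Unclear risk: exactly one unclear key domain.
    if n_unclear == 1:
        return "unclear"
    # Low risk: all key domains at low risk.
    return "low"


# Hypothetical study with a single unclear judgement.
example = {"selection": "unclear", "detection": "low",
           "attrition": "low", "reporting": "low"}
print(overall_risk(example))
```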

Missing data
In the case of reported median, low and high end of range and sample size only, we estimated the mean and variance using the appropriate formula as mentioned by Hozo et al. [20].
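As an illustration, the estimate can be sketched as follows. The text does not state which of the formulas proposed by Hozo et al. was applied, so this sketch assumes the commonly cited estimates of the mean and variance from the median and the two ends of the range; the function name `hozo_mean_variance` and the example values are hypothetical. Hozo et al. also discuss simpler range-based approximations for larger samples, which are not shown here.

```python
def hozo_mean_variance(low, median, high):
    """Estimate mean and variance from the range (low, high) and the median.

    Assumed formulas (after Hozo et al.):
        mean     ~ (low + 2 * median + high) / 4
        variance ~ ((low - 2 * median + high)**2 / 4 + (high - low)**2) / 12
    """
    mean = (low + 2 * median + high) / 4
    variance = ((low - 2 * median + high) ** 2 / 4 + (high - low) ** 2) / 12
    return mean, variance


# Hypothetical pain scores reported only as median 5 with range 2 to 10.
mean, variance = hozo_mean_variance(low=2, median=5, high=10)
sd = variance ** 0.5
print(mean, variance, sd)
```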

Data analysis and assessment of heterogeneity
RevMan version 5.2 [21] was used to analyze the data and perform testing of heterogeneity, using the I² statistic, with the following categories: I² = 25%, no heterogeneity; I² = 50%, moderate heterogeneity; I² = 75%, strong heterogeneity [22], and P ≤0.1 for the Chi² test showing significant heterogeneity. We used Cohen's categories to evaluate the magnitude of the effect size, calculated by standardized mean difference (SMD), with g >0.2 to 0.5, small effect size; g >0.5 to 0.8, medium effect size; and g >0.8, large effect size. We used the following modified levels of evidence descriptors to classify the results: (1) strong, if there were consistent findings among multiple (≥3) RCTs with low risk of bias; (2) moderate, if there were consistent findings among multiple high-risk RCTs and/or one low-risk RCT; (3) limited, with one high-risk RCT; (4) conflicting, with inconsistent findings among multiple RCTs; and (5) no evidence, no RCTs [23]. Whenever possible we used the results from intention-to-treat analysis. Negative SMDs indicate a beneficial effect of the experimental intervention.
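The quantities named above (SMD, random-effects pooling, I² and Cohen's categories) can be sketched in Python as follows. This is not the RevMan implementation: it assumes Hedges' g with a common approximation of its variance and the DerSimonian-Laird estimator of between-study variance, and the function names, the label "negligible" for effects of 0.2 or below, and the three example studies are hypothetical.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """SMD (Hedges' g) and an approximate variance, treatment vs control."""
    df = n_t + n_c - 2
    sd_pooled = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    g = j * (mean_t - mean_c) / sd_pooled
    variance = (n_t + n_c) / (n_t * n_c) + g**2 / (2 * (n_t + n_c))
    return g, variance


def random_effects(effects, variances):
    """DerSimonian-Laird pooled effect, 95% CI and I² (in percent)."""
    w = [1 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), i2


def cohen_category(smd):
    """Magnitude category using the cut-offs given in the text."""
    size = abs(smd)
    if size > 0.8:
        return "large"
    if size > 0.5:
        return "medium"
    if size > 0.2:
        return "small"
    return "negligible"  # label assumed here; not named in the text


# Hypothetical post-treatment pain scores: (mean, SD, n) for treatment, then control.
studies = [
    (4.1, 2.0, 30, 5.0, 2.1, 30),
    (3.8, 1.9, 25, 4.9, 2.2, 24),
    (4.5, 2.3, 40, 5.1, 2.0, 38),
]
effects, variances = zip(*(hedges_g(*s) for s in studies))
pooled, ci, i2 = random_effects(effects, variances)
print(pooled, ci, i2, cohen_category(pooled))
```

With lower pain scores in the treatment group, the pooled SMD is negative, which matches the sign convention stated in the text.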

Subgroup and sensitivity analysis
Where at least two studies were available, subgroup analyses were pre-specified for different types of intervention. Additionally, control groups were compared (no treatment/active treatment). Waiting list or treatment-as-usual were classified as non-intervention control. The subgroup analyses were also used to examine potential sources of heterogeneity. Sensitivity analyses were performed for studies with high versus low risk of bias, for studies with serious flaws in one or more key domains, and for sample size per treatment arm.
Of the 37 articles that were assessed, four were excluded because of insufficient data reporting [7,44-46]. A further three studies were excluded because the main treatment (BT/HT) was the same in both the treatment and control group (Altan et al. [47]: baths in mineral water with and without exercise; Ammer and Melnizky [48]: whirl baths with and without etheric oils; Calandre et al. [49]: baths with two different kinds of exercise). The remaining 30 articles included 2 reporting follow-up data to already included studies [50,51], and a further 4 reporting on the same study publication but with different outcome measures [52-55].

Description of included trials
The characteristics of the included studies are detailed in the following tables (see Additional files 1 and 2). The studies were separated according to treatment modalities: Additional file 1: HT with the subgroups, HT with exercise (n = 10) and hydrogalvanic (Stanger) bath (n = 2). Additional file 2: BT with the subgroups mineral water (n = 3), spa therapy (n = 3), sulfur bath (n = 2), thalassotherapy (n = 1), phytothermotherapy (n = 1), mud (n = 1), acratothermal water (n = 1). Study characteristics for all trials included in qualitative synthesis are summarized below.

Inclusion and exclusion criteria
In all the studies, FMS was diagnosed according to the ACR criteria [80].

Reporting of adverse events
Adverse events were reported in four studies [56,65,68,79]. In all cases the adverse events were not indicated as a cause of interruption or dropouts. Seven studies [57,58,[72][73][74][75]78] clearly reported that there were no adverse events. The remaining 13 studies gave no information on adverse events. No serious adverse events were reported (for details see Additional files 1 and 2).

Risk of bias
Only 5 of the 24 studies included had low risk of bias [56,57,64,67,68]; a further 5 were assigned as having unclear risk (studies with one unclear judgement; unclear allocation: [65,71,77]; selective reporting: [66]; unclear outcome assessment blinding: [79]). The remaining 14 studies were at high risk of bias, as they had two or more unclear judgements in the key domains, including 5 studies with serious flaws in one or more key domains [59,60,63,75,76]. For details see categorization of risk of bias at the individual study level (see Additional file 3).

Sequence generation and treatment allocation
Of the 24 studies, 10 had unclear risk of selection bias in both domains, and 2 were considered to be at high risk because of serious randomization flaws [59,75]. Half the studies reported adequate randomization, but only seven reported adequate allocation concealment [4,6,21,47,52,69,80].

Similar baseline
All studies had low risk of selection bias with the exception of two: one with unclear risk (unclear reporting [76]) and one with high risk due to significant differences in baseline characteristics in a major FMS symptom (TPC) [63].

Blinding of participants and personnel
Performance bias was not considered a key domain, because blinding is not feasible given the participatory nature of BT and HT.

Incomplete outcome data
Of the 24 studies, 19 were assigned low risk of attrition bias (criteria: attrition rate reported, not exceeding 20% or intention-to-treat analysis). Of the remaining five studies, two were assigned high risk of bias because of high dropout rates [60,63], and three were assigned unclear risk because the dropout rate was not clearly reported [70,76,78].

Selective reporting
Two studies were assigned high risk of bias [59,75] because reporting was insufficient and not consistent with the values presented in the tables. A further five had unclear risk of reporting bias due either to double reporting [66,70] or incomplete/inconsistent outcome reporting [60,61,63].

Blinding of outcome assessment
Fifteen of the 24 studies had low risk of detection bias for outcome assessment, eight had unclear risk [58,59,61,62,69,72,75,79], and one was assigned a high risk of bias [76] (see Additional file 4).

Comparison group
Subgroup analysis by type of comparison group suggests that HT had a significant effect in RCTs comparing it to no treatment (usual care) or to other types of active control, but not in RCTs comparing it to land-based exercise (see Additional file 5).

Analysis of overall effects
Taking into account all available studies, regardless of treatment modality, meta-analysis provided moderate evidence for a medium reduction of pain at the end of treatment: SMD −0.57; 95% CI [−0.77, −0.38]; P <0.00001; I² = 45%. Results are shown for HT, BT and diverse treatments: hydrogalvanic bath (Stanger), mud therapy, sulfur bath and thalassotherapy (see Additional file 6).
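As a plausibility check, the reported P value can be approximately recovered from the SMD and its 95% CI. The sketch below assumes a symmetric normal-approximation interval (standard error equal to the CI width divided by 2 × 1.96) and a two-sided test; the function name `z_and_p_from_ci` is hypothetical, and small discrepancies are expected because the published figures are rounded.

```python
import math


def z_and_p_from_ci(smd, lower, upper, z_crit=1.96):
    """Recover the standard error, z statistic and two-sided P value from a 95% CI."""
    se = (upper - lower) / (2 * z_crit)
    z = smd / se
    # Two-sided P value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Overall effect on pain as reported above.
se, z, p = z_and_p_from_ci(-0.57, -0.77, -0.38)
print(f"SE = {se:.3f}, z = {z:.2f}, P = {p:.1e}")
```

The resulting P value lies well below 0.00001, which is consistent with the value reported for the overall analysis.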

Sensitivity analyses
Sensitivity analysis according to potential risks of bias showed no significant difference between the effect size of pain (HT) at the end of treatment and risk of bias (see Additional file 7). Analysis according to sample size (<25, >25) shows a slightly larger effect size and broader CIs in small studies (P = 0.54) (see Additional file 8).
Statistical heterogeneity of analysis for the effect size of pain in the BT group (I² = 63%) was substantially decreased (I² = 0%) by removing the study of Ardiç et al. [69] (pharmacological co-therapies not allowed; non-intervention control group). The magnitude of the effect size was decreased to SMD −0.58; 95% CI [−0.91, −0.26]; P = 0.0004, corresponding to a medium effect.

Publication bias
Visual analysis of the funnel plot shows a symmetric picture, with one outlier study already identified by sensitivity analysis [69]. This indicates that the results of the meta-analysis can be regarded as robust against potential reporting bias (see Additional file 9).

Summary of evidence
The primary aim of this systematic review and meta-analysis was to determine the therapeutic benefit of BT and HT in the management of FMS, with special focus on separate analyses of the different treatment modalities. For HT with exercise we found moderate-to-strong evidence (consistent findings among ≥3 RCTs with low risk of bias) for a small improvement in pain (eight studies, 462 participants; including three low-risk studies, 223 participants) and HRQOL (seven studies, 398 participants; including three low-risk studies, 223 participants). Follow-up data provided moderate evidence (consistent findings among multiple high-risk RCTs and/or one low-risk RCT) for maintenance of improvement, at least with regard to pain (four studies, 254 participants; including one low-risk study, 125 participants). However, no evidence was found for improvement of depressive symptoms (BDI) and TPC. Furthermore, no group difference was found when comparing water-based exercise to land-based exercise. This is in accordance with the review by Häuser et al. from 2010 [81]. We found moderate evidence of a medium-to-large effect on pain and TPC for BT with mineral/thermal water (five studies, 177 participants; including three high-risk and two unclear-risk studies), a medium effect on HRQOL, and no significant effect on depressive symptoms (BDI). Moderate evidence for maintenance of these improvements was found at follow-up. However, the effects were smaller. The results confirm the conclusions of other reviews on BT [6,82].
Besides these two larger groups, further subgroup analyses were not possible due to the limited number of available studies and/or provided data. This is also true of the follow-up data provided, where only a few studies remained for statistical analyses. The evidence on the long-term effects that can be concluded from this meta-analysis is limited.
No conclusions can be drawn on hydrogalvanic/Stanger baths, thalassotherapy, mud baths, phytothermotherapy or sulfur baths, which were only represented by one study each. So as not to lose the information provided by these studies, we pooled all the available studies in an overall analysis, which showed similar effects (reduction of pain) to HT or BT.
Concerning safety, only preliminary conclusions can be drawn, because reporting of adverse events and the reasons for dropouts was poor. The data suggest that HT and BT are safe and well-accepted treatments, which is in line with other recommendations [10,83], and we should not forget the daily experience of patients and the general population practising some kind of BT or HT.
Male participants were rarely included in the study populations, and separate gender comparisons were not reported. Evidence for treatment effects in the management of FMS in men is limited. Furthermore, it has to be taken into account that the population of FMS patients participating in a trial is selected. Generalisability may be restricted [84].

Limitations
As so often in evidence-based approaches to nonpharmacological modalities, limitations are inherent and inevitable. This is especially true for BT, which depends on local conditions such as climate or water composition and provides a large variety of treatment modalities. Absence of blinding is also inevitable wherever treatment requires active participation on the part of the study subjects and clinicians.
There are also several methodological limitations. The analyses were underpowered due to the small number of studies and patients included. Analysis according to sample size (<25, >25) showed a slightly larger effect size and broader CIs in small studies (P = 0.54). The methodological quality (risk of bias) of the included studies varied, and was slightly better in HT studies than BT studies. Although some studies had low risk of bias, the majority, especially older studies, were associated with unclear or high risk of bias. Nevertheless, sensitivity analyses could show, at least in HT studies, that the effect sizes were not affected by methodological bias. Due to the limited number of BT studies, sensitivity analyses could not be performed here. Furthermore, the sample sizes in the BT studies were very small (<25 per treatment arm), except for one study [79]. Unfortunately, in this study, no results were collected for the control group after treatment. Thus, the data were not analyzed and only follow-up data were used.
Heterogeneity was not present in the HT studies, in contrast to considerable heterogeneity in the BT studies. This could be explained by the fact that co-therapies were not allowed in one study, which also had a noninterventional control group [69]. As far as selection bias is concerned, it is not possible to assess the extent to which the results may be influenced. Most of the studies reported unclear randomization methods as well as insufficient allocation concealment. The studies that allowed co-therapies did not control their effects for dosage or changes in concomitant therapies.
A strength of this review is the homogeneous pool of treatment approaches selected for subgroup analyses, based on the professional expertise in the field of balneology of one of the authors (JN). The evidence of the integrated effect sizes seems robust, especially since publication bias is not plausible after visual analysis of the funnel plot, which shows a symmetric picture except for one outlier study [69] already identified by sensitivity analysis. Based on a systematic and thorough search of the literature (CS), we are confident that we have not missed any large, important study.

Conclusions
In summary, based on the limited number of studies analyzed, small sample sizes and risk of bias attributed to the studies, it appears difficult to determine the overall benefit of BT and HT. There is a risk of overestimating the evidence on the efficacy of HT and even more so BT. However, although evidence is limited, recommendations in recent evidence-based interdisciplinary guidelines emphasize a patient-tailored approach with aerobic exercises, CBT and MCT according to the key symptoms of FMS [4]. In this context, BT and HT offer a wide variety of treatment opportunities, which can be perfectly adapted to the patients' abilities and preferences. Unlike pharmacological treatments with questionable clinical relevance and frequent side effects [12], the results of this review underline the potential value of BT and HT as supplementary therapy in the management of major symptoms of FMS.
In order to provide a better database for meta-analyses (internal validity), the use of a core set of outcome measures (outcome measures in rheumatology (OMERACT) [85]) including response rates is desirable. Future authors should use the consolidated standards of reporting trials (CONSORT) checklist [86] to report study results. Major interest should focus on long-term results and maintenance of beneficial effects. Given the popularity of BT and HT among patients with FMS, further studies with robust methodology are warranted to demonstrate and confirm the therapeutic benefits.