Discriminant validity, responsiveness and reliability of the arthritis-specific Work Productivity Survey assessing workplace and household productivity within and outside the home in patients with axial spondyloarthritis, including nonradiographic axial spondyloarthritis and ankylosing spondylitis

Introduction The arthritis-specific Work Productivity Survey (WPS) was developed to evaluate productivity limitations associated with arthritis within and outside the home. There is an unmet need for an instrument assessing similar productivity limitations in axial spondyloarthritis (axSpA), including nonradiographic axSpA and ankylosing spondylitis. Following its validation in rheumatoid and psoriatic arthritis, we aimed to assess psychometric properties of WPS in adult-onset active axSpA in this analysis. Methods Psychometric properties were assessed using data from the RAPID-axSpA trial (NCT01087762) in which researchers investigated certolizumab pegol efficacy and safety in axSpA. WPS was completed at baseline and every 4 weeks until week 24. Validity was evaluated at study baseline via known-groups defined by the first and third quartile cutoffs of patient scores to Ankylosing Spondylitis Disease Activity Score (ASDAS), Bath Ankylosing Spondylitis Disease Activity Index (BASDAI), back pain, Bath Ankylosing Spondylitis Functional Index (BASFI), Short Form 36 health survey (SF-36) and Ankylosing Spondylitis Quality of Life Scale (ASQoL). Responsiveness and reliability were assessed by comparing WPS mean changes in ASAS 20% improvement criteria (ASAS20), BASDAI50, ASDAS clinically important improvement/major improvement (CII/MI) and BASFI minimum clinically important difference (MCID) responders versus nonresponders at week 12. All comparisons were conducted on observed cases in the randomized set using a nonparametric bootstrap-t method. Results The results confirmed the psychometric properties of WPS. AxSpA patients with a worse health state had significantly more days of household work lost, household work with reduced productivity, social activities missed and outside help hired, as well as a higher interference rate of arthritis, than patients with a better health state. 
Similarly, employed patients with a worse health state had significantly more work days lost or with productivity reduced, and a higher interference of arthritis on work productivity. Similar findings were also observed in the nonradiographic (nr) axSpA and AS subpopulations. The WPS was responsive to clinical changes, with responders reporting larger improvements at week 12 in WPS scores versus nonresponders. Effect sizes in responders were generally moderate to large (standardized response mean >0.5). Conclusions These analyses demonstrate that WPS is a valid, responsive and reliable instrument for the measurement of productivity within and outside the home in adult-onset axSpA, as well as in the subpopulations of AS and nr-axSpA. Electronic supplementary material The online version of this article (doi:10.1186/ar4680) contains supplementary material, which is available to authorized users.


Introduction
Axial spondyloarthritis (axSpA) refers to spondyloarthritis with predominantly axial involvement and comprises the well-known disease subgroup ankylosing spondylitis (AS), as well as a disease subgroup with little or no changes on plain radiographs, referred to as nonradiographic axial spondyloarthritis (nr-axSpA). Nr-axSpA and AS can be considered opposite ends of the same disease spectrum [1]. According to this concept, the presence of radiographic changes in the sacroiliac joints (and the presence of syndesmophytes in the spine) should be regarded as markers of disease progression and severity rather than as essential diagnostic criteria.
AS, the most frequently investigated subset of axSpA, is a chronic inflammatory rheumatic disease that affects approximately twice as many men as women and has a disease onset usually in the second and third decades of life. The prevalence of AS worldwide ranges from 0.1% to 1.4% [2][3][4]. The prevalence of spondyloarthritis in the United States was recently shown to be 1.4% [5]. Disability in AS is related to the degree of inflammatory activity causing pain, stiffness, fatigue and poor quality of sleep, as well as to the degree of bony ankylosis causing loss of spinal mobility. During early disease stages, disability is determined mostly by inflammatory activity, whereas in long-standing disease, both inflammation and ankylosis contribute to disability. The average time between onset of symptoms and definite disease diagnosis of AS has been reported to be up to 9 years [6]. At least 30% of patients have severe disease which is often associated with considerable loss of function.
Two of the common symptoms associated with AS, pain and fatigue, are expected to impact work-related performance. Fatigue in patients with AS has been reported to be associated with limitations in daily life, functioning, pain and stiffness, as well as with global well-being and mental health [7]. AS patients in one study ranked "impact on work" as the area of their life most affected by their condition [8].
It has been reported that the costs associated with work disability or productivity losses at paid work (indirect costs) of AS are higher in some countries than the direct medical costs [9]. In a recently reported study conducted in the Netherlands of patients with AS under the care of rheumatologists, 11.6% of patients with paid work had an episode of AS-related sick leave in the previous 2 weeks (absenteeism) and just over 50% felt their work was adversely influenced by AS, suggesting a significant impact on presenteeism [10]. In the entire sample, 71% experienced restrictions in different types of unpaid tasks. Limitations in physical function were consistently associated with work outcome.
The key goals of treatment in AS include control of pain and stiffness, as well as reducing damage, disability and loss of function. Given that AS tends to occur in the second and third decades of life, it is expected that many people initially diagnosed with AS are in the midst of their working careers. In a 2001 review, Boonen et al. summarized findings on work participation among AS patients in different countries [11]. The proportion of patients in employment ranged from 34% to 96%, and the proportion of patients with work disability ranged from 3% to 50%.
In order to fully quantify the impact of an intervention on productivity, it is crucial to consider the entire productivity continuum, both within the work environment and within the home [12]. Preventing disability and loss of function may improve a patient's ability to stay in the workforce or maintain the ability to live independently at home. Therefore, there is interest in understanding the impact of axSpA and potential axSpA treatments on work-related productivity, including paid work as well as household work.
Historically, there has been an unmet need for an instrument designed to assess presenteeism and absenteeism in both the work and home environments [13][14][15][16]. The arthritis-specific Work Productivity Survey (WPS) was developed to fulfill the unmet need for an arthritis-specific instrument to assess the impact of an intervention on productivity within the work and home environments, in addition to daily activities during the preceding month [17]. Details of the development of the WPS are reported elsewhere [17]. The WPS has demonstrated properties of discriminative validity, reliability and responsiveness for the measurement of productivity within and outside the home in patients with active rheumatoid arthritis (RA) and psoriatic arthritis (PsA) [17][18][19].
There is no gold standard measure for assessing productivity in axSpA. During the Outcome Measures in Rheumatology (OMERACT) 9 meeting, the WPS was one of six instruments identified by the OMERACT Worker Productivity group as a possible candidate for assessing worker productivity changes in rheumatology, based on the available filter evidence (truth, discrimination and feasibility) [20].
The WPS was selected to measure the impact of axSpA on workplace and household productivity, as well as on participation in daily activities, because of the ease of use and positive response in terms of psychometric properties seen in rheumatoid arthritis (RA) [17], and the similarity in terms of work disability associated with RA and axSpA.
Our objective in writing this article was to assess the discriminant validity, responsiveness and reliability of the Work Productivity Survey in adult-onset active axSpA, as well as in AS and nr-axSpA subpopulations.

Patients and study design
Data from the double-blind period of the RAPID-axSpA (efficacy and safety of certolizumab pegol (CZP) in axSpA) trial (double-blind and placebo-controlled to week 24, dose-blind to week 48 and then open-label to week 204) were used to conduct the psychometric validation of WPS [21]. In the first 24 weeks of RAPID-axSpA, CZP 200 mg every 2 weeks (Q2W), 400 mg every 4 weeks (Q4W) or placebo were investigated. The trial was conducted at 83 centers across North America, Latin America, Western Europe and Central/Eastern Europe from March 2010 to October 2011. Institutional review boards or ethics committees approved the protocol at each center (see Additional file 1). All patients gave written consent, and the study was conducted in accordance with the Declaration of Helsinki [21].
The primary efficacy endpoint was the ASAS20 response at week 12 [22,23]. Secondary and exploratory endpoints included the ASAS40, ASAS50, ASAS70, Total and Nocturnal Spinal Pain, physical functioning assessed by Bath Ankylosing Spondylitis Functional Index (BASFI), Bath Ankylosing Spondylitis Disease Activity Index (BASDAI) and Ankylosing Spondylitis Disease Activity Score (ASDAS), health-related quality of life (HRQoL), assessed using the Short Form 36-item health survey (SF-36), EuroQoL 5 dimensions (EQ-5D) and the Ankylosing Spondylitis Quality of Life Scale (ASQoL), and productivity measured using the WPS.

Questionnaires
The WPS is a disease-specific questionnaire used to assess the impact of arthritis on workplace and household productivity, as well as daily activities during the preceding month. It is interviewer-administered and self-reported by the patient and has a 1-month recall period [17].
The first item of the WPS addresses current labor market participation (employment outside the home), as well as providing normative and comparative data on employment status. This is a strong indicator of ability to work, because not working implies complete loss of paid productivity. Two items capture self-reported absenteeism (days of work missed) and presenteeism (days with productivity reduced by at least half) due to arthritis, and two items capture the same concepts but apply to nonpaid (household) work. Additional items capture the respondent's estimate of the extent to which arthritis has interfered with work productivity (paid and nonpaid) on a scale of 0 to 10 (0 = no interference and 10 = complete interference), the number of days in the past month that outside help was hired because of arthritis, and the number of days in the past month family, social or leisure activities were missed because of arthritis [17].
The ASDAS is a composite score derived from a number of assessments, which are scored by the subject and physician and multiplied by a proven formula, with lower scores indicating low disease activity [24]. The BASFI comprises 10 items assessing physical function over the preceding week [25]. The summary score from this scale is the mean of the 10 items and ranges from 0 to 10, with 0 representing the best state (lower disease activity) and 10 the worst state. The minimum clinically important difference (MCID) for BASFI is one point.
The BASDAI is the most commonly used instrument to measure the disease activity of ankylosing spondylitis over the preceding week, and ranges from 0 to 10, with 0 representing the best state and 10 the worst state. The BASDAI50 is defined as an improvement of at least 50% in the BASDAI compared to baseline. A response criterion for the BASDAI is defined by an MCID decrease of at least one [26].
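The two simple response rules described above (BASDAI50 and a one-point MCID on a 0 to 10 scale) can be illustrated with a minimal Python sketch. The function names are hypothetical, and treating a baseline score of 0 as "no response possible" is an assumption made here, not a rule stated in the text.

```python
def basdai50_response(baseline: float, followup: float) -> bool:
    """True if the BASDAI improved (decreased) by at least 50% versus baseline.

    Assumption: a baseline of 0 cannot improve, so it is scored as no response.
    """
    if baseline <= 0:
        return False
    return (baseline - followup) >= 0.5 * baseline


def mcid_response(baseline: float, followup: float, mcid: float = 1.0) -> bool:
    """True if the score decreased by at least the MCID (one point by default)."""
    return (baseline - followup) >= mcid


if __name__ == "__main__":
    print(basdai50_response(6.0, 3.0))  # improvement of exactly 50%
    print(mcid_response(5.0, 4.5))      # improvement of only half a point
```

Both scales run from 0 (best) to 10 (worst), so improvement is computed as baseline minus follow-up.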
Total and Nocturnal Spinal Pain are assessed by two questions rated on a 0 to 10 numerical rating scale (NRS), where 0 = no pain and 10 = most severe pain. The ASQoL is an 18-item questionnaire, each item of which is used to assess the patient's current opinion on his or her quality of life [27]. Each item is scored as 1 = yes or 0 = no. The summary score is the total of the yes and no scores, thus ranging from 0 (best HRQoL) to 18 (worst HRQoL).
The SF-36 is a widely used generic HRQoL instrument used to evaluate eight health domains: physical functioning, role physical, bodily pain, general health, vitality, social functioning, role emotional and mental health [28]. The eight domains are summarized in two component summaries: the Physical Component Summary (PCS) and the Mental Component Summary (MCS) [26]. Scores on the SF-36 range between 0 and 100, with higher scores indicating a better HRQoL.
The EQ-5D questionnaire is comprised of a five-item health status measure and a Visual Analogue Scale (VAS). Each of the five dimensions is divided into three levels: no problem, some or moderate problems and extreme problems, scored as 1, 2, and 3, respectively. The EQ-5D VAS records the respondent's self-rated health status on a vertical 20-cm scale, 0 to 100 graduated (0 = worst imaginable health status and 100 = best imaginable health status).
The ASAS20 response is defined as an improvement of at least 20% and an absolute improvement of at least one unit on a 0 to 10 NRS in at least three of four domains (patient's global assessment of disease activity, total spinal pain NRS score, BASFI, and the mean of BASDAI questions 5 and 6 concerning morning stiffness intensity and duration), together with the absence of deterioration in the potential remaining domain, with deterioration defined as a relative worsening of at least 20% and an absolute worsening of at least one unit [29].
The ASAS criteria for 40%, 50% or 70% improvement are defined as relative improvements of at least 40%, 50% or 70% and absolute improvement of at least two units on a 0 to 10 NRS in at least three of the four domains and no worsening at all in the remaining domain.
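As an illustration only, the ASAS response rules described above can be written as a short Python function. The function name, argument names and the handling of a baseline score of 0 (which can never count as improved, and counts as deteriorated once the absolute threshold is met) are assumptions of this sketch, not part of the published criteria.

```python
from typing import Sequence


def asas_response(
    baseline: Sequence[float],
    followup: Sequence[float],
    rel: float = 0.20,
    min_units: float = 1.0,
    strict_remaining: bool = False,
) -> bool:
    """Evaluate an ASAS-type response over four 0-10 NRS domains (lower = better).

    rel / min_units: relative and absolute improvement required per domain.
    strict_remaining: if True, ANY worsening in a non-improved domain
    disqualifies (rule described for ASAS40/50/70); if False, only a
    worsening of >=20% and >=1 unit does (rule described for ASAS20).
    """
    if len(baseline) != 4 or len(followup) != 4:
        raise ValueError("expected scores for exactly four domains")

    n_improved = 0
    remaining_ok = True
    for b, f in zip(baseline, followup):
        gain = b - f  # positive values are improvements
        if gain >= min_units and gain >= rel * b:
            n_improved += 1
            continue
        loss = f - b  # positive values are worsening
        if strict_remaining:
            if loss > 0:
                remaining_ok = False
        elif loss >= 1.0 and loss >= 0.20 * b:
            remaining_ok = False
    return n_improved >= 3 and remaining_ok


if __name__ == "__main__":
    base = [6, 6, 6, 6]
    print(asas_response(base, [4, 4, 4, 6]))  # ASAS20-type check
    print(asas_response(base, [3, 3, 3, 6], rel=0.40, min_units=2.0,
                        strict_remaining=True))  # ASAS40-type check
```

The same function covers ASAS50 and ASAS70 by changing `rel` while keeping `min_units=2.0` and `strict_remaining=True`.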
Further details of the questionnaires assessed are included in Additional file 2.

Data handling and statistical analysis
The assessment of the psychometric properties (discriminant validity, responsiveness and reliability) of the WPS was performed on the overall axSpA randomized set (RS) population, regardless of the randomization group. Analyses were also performed on the nr-axSpA and AS subpopulations separately.

Discriminant validity
Given the nature of the WPS questionnaire, which is composed of several single-global questions, scored and interpreted separately, and the length of the recall period of these questions, the construct validity of the WPS questionnaire was evaluated by means of discriminant validity using correlations and the known-groups validation method. The association between the responses to the WPS questions (Q)2 to Q9 and scores of the different measures of disease activity, physical functioning or HRQoL was assessed using Kendall correlation coefficients. Given the difference between the concepts assessed by the WPS questions and the other measures considered, the correlation coefficients are expected a priori to be low to moderate (low = 0 to 0.3, moderate ≥0.3 to <0.5), thus indicating a divergent validity of the measures compared. High correlations could imply a low discriminant validity and suggest that two items are measuring similar concepts. The Kendall association coefficients were evaluated between WPS Q2 to Q9 and the following selected measures: ASDAS, BASDAI, BASFI, total/nocturnal spine pain, SF-36 MCS, PCS and domains, ASQoL, fatigue NRS (from BASDAI) and EQ-5D VAS.
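For readers who want to reproduce this kind of association analysis, the following is a minimal sketch of a Kendall coefficient and of the low/moderate banding described above. The text does not state which Kendall variant or software was used; the tie-adjusted tau-b computed here and the "high" label for values of 0.5 or more are assumptions, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall tau-b for two equal-length numeric sequences (handles ties)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from both factors
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denom if denom else float("nan")


def strength_band(tau):
    """Band the absolute coefficient: low < 0.3, moderate 0.3 to < 0.5."""
    size = abs(tau)
    if size < 0.3:
        return "low"
    if size < 0.5:
        return "moderate"
    return "high"  # assumed label for coefficients of 0.5 or more


if __name__ == "__main__":
    days_missed = [0, 0, 2, 5, 10, 0, 3]            # made-up WPS-style answers
    basdai = [3.1, 6.0, 4.2, 5.5, 7.4, 6.8, 2.9]    # made-up scores
    tau = kendall_tau_b(days_missed, basdai)
    print(round(tau, 3), strength_band(tau))
```

A rank-based coefficient suits these data because the WPS answers are counts of days with many ties at 0.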
The known-groups validity method was used to compare the productivity scores between patients with a worse health state versus patients with a better health state. A patient with a worse health state was considered to be a patient with a higher disease activity, a worse HRQoL level or a lower physical functioning level, whereas a patient with a better health state was defined as having either a lower disease activity, a better HRQoL or a higher physical functioning level, respectively. The assumption tested through the known-groups validity method was that patients with a worse health state were expected a priori to have higher losses in paid and household work productivity (that is, higher WPS scores to Q2 to Q9) due to their disease, compared with patients in a better health state. For this purpose, known groups were formed using the first- and third-quartile scores for each outcome as cutoff points in order to avoid comparison of unbalanced groups [17]. Patients with baseline SF-36 scores at or above the third quartile, or ASQoL or fatigue NRS (from BASDAI) scores at or below the first quartile, were considered to have a "better" HRQoL, and those with SF-36 scores at or below the first quartile or ASQoL or fatigue NRS scores at or above the third quartile were defined as having "worse" HRQoL. Similarly, "better" and "worse" physical function were defined as BASFI scores at or below the first quartile and at or above the third quartile, or SF-36 physical function or SF-36 PCS scores at or above the third quartile and at or below the first quartile, respectively. Patients with ASDAS, BASDAI, or total/nocturnal spine pain score at or below the first quartile were considered to have "low" disease activity/severity, whereas ASDAS, BASDAI, or total/nocturnal spine pain score at or above the third quartile indicated "high" disease activity/severity.
The discriminant validity of the WPS was assessed using baseline observed data. To test the validity of productivity at paid work (WPS Q2-Q4), cutoff points were computed only on the patients employed outside the home, whereas the thresholds were computed on all patients for productivity within the home (Q5-Q9). Sensitivity analyses were performed using a median cutoff threshold.
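The quartile-based grouping described above can be sketched as follows. The quartile estimation method is not stated in the text, so the "inclusive" method of Python's `statistics.quantiles` is an assumption, as are the function and parameter names.

```python
from statistics import quantiles


def known_groups(scores, higher_is_worse=True):
    """Label each score 'better', 'worse' or None using Q1/Q3 cutoffs.

    higher_is_worse=True suits BASDAI, BASFI, ASDAS, pain and ASQoL;
    higher_is_worse=False suits SF-36 scores, where higher is better.
    Patients strictly between the quartiles are left out (None).
    """
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    low_label, high_label = (
        ("better", "worse") if higher_is_worse else ("worse", "better")
    )
    labels = []
    for s in scores:
        if s <= q1:
            labels.append(low_label)
        elif s >= q3:
            labels.append(high_label)
        else:
            labels.append(None)
    return labels


if __name__ == "__main__":
    basfi = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up baseline scores
    print(known_groups(basfi))
```

For paid-work questions (Q2 to Q4), the cutoffs would be computed on employed patients only, by passing only their scores to the function.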
A nonparametric bootstrap-t method was used to compare the mean WPS question responses between the known groups [30]. This method was favored because of the highly skewed distribution of the WPS scores. Bootstrap analyses were performed with 10,000 replications. A variance stabilizing transformation was used in order to adjust for dependence between the bootstrap values and the corresponding standard error.
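To make the idea concrete, the sketch below shows a plain two-sample bootstrap-t test for a difference in means. It is a simplified illustration and not the authors' procedure: the variance stabilizing transformation mentioned above is omitted, the null distribution is built by recentering both groups on the pooled mean, and all names are hypothetical.

```python
import random
from math import sqrt


def _mean(values):
    return sum(values) / len(values)


def _t_statistic(a, b):
    """Studentized difference in means (a minus b), unequal variances."""
    ma, mb = _mean(a), _mean(b)
    va = sum((v - ma) ** 2 for v in a) / (len(a) - 1)
    vb = sum((v - mb) ** 2 for v in b) / (len(b) - 1)
    se = sqrt(va / len(a) + vb / len(b))
    return (ma - mb) / se if se > 0 else 0.0


def bootstrap_t_pvalue(a, b, n_boot=10_000, seed=0):
    """Two-sided bootstrap-t p-value for equal means in groups a and b."""
    rng = random.Random(seed)
    t_obs = _t_statistic(a, b)
    pooled = _mean(list(a) + list(b))
    ma, mb = _mean(a), _mean(b)
    # Recenter both groups so that the null hypothesis holds in the resamples.
    a0 = [v - ma + pooled for v in a]
    b0 = [v - mb + pooled for v in b]
    extreme = 0
    for _ in range(n_boot):
        ra = rng.choices(a0, k=len(a0))  # resample with replacement
        rb = rng.choices(b0, k=len(b0))
        if abs(_t_statistic(ra, rb)) >= abs(t_obs):
            extreme += 1
    return extreme / n_boot


if __name__ == "__main__":
    better = [0, 0, 1, 0, 2, 0, 1, 0, 0, 1]          # made-up days lost
    worse = [10, 12, 15, 9, 20, 11, 14, 13, 10, 16]  # made-up days lost
    print(bootstrap_t_pvalue(better, worse, n_boot=2000))
```

Resampling the studentized statistic rather than the raw mean difference is what distinguishes the bootstrap-t from a percentile bootstrap, and it avoids assuming normality for skewed day counts.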

Responsiveness to clinical changes and reliability
The responsiveness of the WPS to clinical changes in a patient's condition over time was evaluated by comparing the changes from baseline in productivity scores between clinical responders versus nonresponders at week 12 (as measured by ASAS20 criteria). The assumption tested was that clinical responders would have higher improvements in productivity at work outside the home and within the household versus nonresponders, reflected by higher negative changes (in absolute value) in WPS scores.
According to the primary analysis, patients were considered a "responder" if they met the criteria of ASAS20 improvement from baseline at week 12. Any patient who did not meet the criteria for ASAS20 was considered a "nonresponder".
The reliability of the WPS was tested in conjunction with the responsiveness to the ASAS20 clinical response by comparing the changes in WPS scores in patients achieving ASAS40, ASAS50, BASDAI50, ASDAS major improvement (MI), ASDAS clinically important improvement (CII), total/nocturnal back pain MCID and BASFI MCID responses at week 12 versus nonresponders.
WPS score changes from baseline at week 12 were compared between week 12 clinical responders versus nonresponders using a nonparametric bootstrap-t method. A variance stabilizing transformation was used in order to adjust for dependence observed between bootstrap values and the corresponding standard error. Bootstrap analyses were performed with 10,000 replications.
In addition to the comparison of the changes in WPS scores between the clinical responders and nonresponders, the standardized response mean (SRM) was calculated. The SRM is one of the most widely used measures of the effect size of the response, indicating whether the change was large relative to the variability of the measurements. The SRM is estimated as the mean change in scores between two visits divided by the standard deviation of that change in scores. Thresholds for the SRM (absolute values) were proposed by Cohen [31] to interpret the size of the effects: "small" from 0.2 to 0.5, "moderate" from 0.5 to 0.8 and "large" greater than 0.8.
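The SRM calculation and the thresholds quoted above can be sketched in a few lines of Python. The function names are hypothetical, and the handling of exact boundary values and the "negligible" label for absolute values below 0.2 are assumptions of this sketch.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, followup):
    """Mean change between two visits divided by the SD of that change."""
    changes = [f - b for b, f in zip(baseline, followup)]
    return mean(changes) / stdev(changes)


def effect_size_label(srm):
    """Label the absolute SRM using the thresholds quoted in the text."""
    size = abs(srm)
    if size > 0.8:
        return "large"
    if size >= 0.5:
        return "moderate"
    if size >= 0.2:
        return "small"
    return "negligible"  # assumed label below the 'small' threshold


if __name__ == "__main__":
    baseline_days = [10, 8, 6, 4]  # made-up WPS answers at baseline
    week12_days = [7, 6, 5, 0]     # made-up WPS answers at week 12
    srm = standardized_response_mean(baseline_days, week12_days)
    print(round(srm, 3), effect_size_label(srm))
```

Because lower WPS scores mean better productivity, an improvement gives a negative SRM, which is why the thresholds are applied to the absolute value.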
The responsiveness and reliability of the WPS were assessed at week 12 on all RS patients, regardless of the randomization group.

Patient characteristics
A total of 325 patients were randomized, and 298 (91.7%) patients completed the 24-week phase. In the overall axSpA population, RS patients had a mean age of approximately 39.6 years, with 78.8% of patients between ages 25 and 54 years. Over half (61.5%) of the patients were male, and most (90.2%) were white (Table 1). In the AS subpopulation, the mean age (41.5 years) was higher compared to the nr-axSpA subpopulation (37.4 years), and AS patients were also more likely to be male compared to nr-axSpA patients (72.5% versus 48.3%, respectively) (Table 1).
Patients in the overall axSpA population reported a median time since disease diagnosis of 3.9 years. In the AS subpopulation, the median time since diagnosis was 5.5 years, and for the nr-axSpA subpopulation it was 2.5 years (Table 1). The majority (78.5%) of patients in the overall axSpA population tested positive for human leukocyte antigen (HLA) B27; this was also true for the AS and nr-axSpA subpopulations (81.5% and 74.8%, respectively). In general, BASDAI scores were similar, whereas BASMI and BASFI scores were lower, in the nr-axSpA subpopulation relative to the AS subpopulation, indicating comparable disease burden but less limitation in function and mobility in patients with nr-axSpA (Table 1).
At baseline, 69.2% of patients in the overall axSpA population were employed outside the home, 12.3% were unable to work due to axSpA, 5.8% were students and 5.5% were retired. The rest were homemakers (3.1%), unable to work due to non-axSpA health problems (1.9%) or had other nonemployment status (2.2%) (Table 2). Generally similar employment rates were noted in the nr-axSpA and AS subpopulations, although in the nr-axSpA subpopulation there were slightly more patients employed outside the home, homemakers or students, as well as fewer patients who were unable to work due to arthritis or were retired compared to the AS subpopulation (Table 2).

Baseline productivity within and outside the home
The burden of axSpA at study baseline was high, impacting workplace absenteeism and presenteeism as well as household productivity and participation in daily activities (Table 3).
Patients in jobs with some manual component had a higher number of workplace days missed per month than those in exclusively nonmanual jobs (mean 2.5 versus 1.4 days). Additionally, these patients reported more days per month with patient workplace productivity reduced by at least half compared to those with exclusively nonmanual jobs (mean 6.1 versus 4.3 days, respectively). In terms of household work, employed patients reported a high impact of axSpA symptoms, but the impact was lower compared to nonemployed patients and to patients unable to work due to arthritis (mean household work days missed per month: 4.6 versus 8.4 versus 14.6; mean days per month with household productivity reduced by ≥50%: 6.6 versus 9.4 versus 11.2, respectively).

Completion rates of Work Productivity Survey at baseline
At baseline, all patients in the axSpA RS population answered at least one of the WPS questions. The completion rates of each of the WPS questions at baseline in the RS population were very high, indicating that the instrument was clear, acceptable and representative of the study population, and therefore that the results can be generalized to a larger axSpA population. There was only one (0.3%) missing answer to WPS Q5 to Q9 at baseline. Among all employed axSpA RS patients who were required to answer WPS Q2 through Q4, the completion rates at baseline were also high, with only two (0.9%) missing answers for Q2 and three (1.3%) missing answers for Q3 and Q4.
Similarly high completion rates of WPS questions were noted in the nr-axSpA and AS subpopulations. At baseline, there was no presence of a ceiling effect, as shown by the very small number of patients with a maximal answer. In the overall axSpA population, two (0.9%) to four (1.8%) of the RS employed patients had an answer ≥30 days to WPS Q2 and Q3, respectively, and ten (4.5%) had a maximum answer of 10 to Q4. Out of all RS patients, 5 (1.5%) to 21 (6.5%) had an answer ≥30 days to WPS Q5 to Q8 or a maximal score of 10 to Q9. Similarly, no ceiling effects were noted in the nr-axSpA and AS subpopulations.
As expected, in terms of floor effect, the percentage of patients with a minimal response varied between the WPS questions, with a higher number of patients answering 0 to Q2 (work days missed in the past month, 69.5% of the employed RS population in the overall axSpA population) and to Q8 (days with outside help hired, 78.1% of the entire RS population), and ranging from 11.3% to 50.5% for the other questions. Similar results were observed in the nr-axSpA and AS subpopulations.
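The floor and ceiling checks reported above amount to counting the share of respondents at the minimum and maximum possible answers. A minimal sketch follows; the function name and the convention of dropping missing answers (`None`) before computing proportions are assumptions.

```python
def floor_ceiling_rates(answers, minimum, maximum):
    """Return (floor_rate, ceiling_rate) among non-missing answers.

    floor_rate: share of answers at or below the minimum (e.g. 0 days).
    ceiling_rate: share at or above the maximum (e.g. 30 days, or 10 on Q4/Q9).
    """
    valid = [a for a in answers if a is not None]
    if not valid:
        raise ValueError("no non-missing answers")
    n = len(valid)
    floor_rate = sum(a <= minimum for a in valid) / n
    ceiling_rate = sum(a >= maximum for a in valid) / n
    return floor_rate, ceiling_rate


if __name__ == "__main__":
    q2_days_missed = [0, 0, 5, 30, None]  # made-up answers, one missing
    print(floor_ceiling_rates(q2_days_missed, minimum=0, maximum=30))
```

A high floor rate on a count item such as work days missed is expected, because many employed patients miss no days in a given month.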

Discriminant validity
The association coefficients between all WPS questions and different continuous measures assessing the disease activity, physical functioning and HRQoL, were low (<0.3) to moderate (≥0.3 to 0.5), as expected, indicating divergent validity between the individual WPS questions and the other measures considered (Figure 1). The level and the sign (positive or negative) of the Kendall association coefficients indicated that better productivity at work and within the home (as assessed by lower scores to WPS Q2 to Q9) was associated with better HRQoL, less fatigue, better physical functioning or less pain (Figure 1). The range of the association coefficients between the individual WPS questions and the clinical and HRQoL assessments was similar in the overall axSpA, nr-axSpA and AS populations. Nevertheless, higher correlation coefficients were observed in nr-axSpA compared to AS between WPS Q4 (arthritis interference with work productivity (outside home)) and all clinical and HRQoL measures, as well as between Q8 (days with outside help hired) and certain HRQoL measures (SF-36 and ASQoL) (Figure 1).
The known-groups validity analysis indicated that there was a higher burden of arthritis on productivity at both paid work and within the home in patients with a worse health state compared to patients with a better health state (Tables 4 and 5). Among employed patients in the overall axSpA population, patients with a worse health state had higher workplace productivity losses, with significantly more work days lost and more work days with productivity reduced by half, and a statistically higher interference of arthritis on work productivity compared to patients with a better health state (Table 4). In the overall axSpA population, compared with patients with a better health state, patients with a worse health state had larger productivity losses within the household, with, on average, significantly more days of household work lost; more days with household productivity reduced by at least half; more days missed of family, social or leisure activities; more days with outside help hired; and a significantly higher interference of arthritis (all per month) (Table 5).
Similar findings were observed in the nr-axSpA and AS subpopulations, where patients with a worse health state had a higher burden of arthritis on productivity, at both paid work and within home, compared to patients with a better health state (Tables 4 and 5).

Responsiveness and reliability
Work Productivity Survey changes from baseline by Assessment of SpondyloArthritis international Society (ASAS) 20% improvement criteria response at week 12
Significantly larger improvements in productivity within and outside the home (corresponding to higher negative mean changes in WPS responses) were reported by ASAS20 responders at week 12 compared to ASAS20 nonresponders in the overall axSpA population, except with regard to absenteeism, presenteeism and days missed of family, social or leisure activities, where only numerical differences were seen (Figure 2). Although differences in changes in absenteeism were noted between ASAS20 responders and nonresponders, the level of improvements in the nonresponders was numerically greater than the level of changes in responders, which might be explained by the difference in the baseline scores between the two groups (mean 3.1 days per month missed at baseline versus 1.4 days per month, respectively).
Similarly, in the AS and nr-axSpA subgroups, patients achieving an ASAS20 response at week 12 reported greater improvements in productivity, both within and outside the home, compared to baseline. As in the overall axSpA population, differences in absenteeism seemed to favor nonresponders over responders in the AS and nr-axSpA populations; however, this may be explained by the differences in baseline productivity loss.
For axSpA patients, the effect sizes of the changes in productivity, measured by the SRM, in ASAS20 responders were small (SRM < 0.5) for absenteeism, presenteeism and days with outside help, but moderate to large for the other WPS questions. In nonresponders, the magnitude of change was negligible (SRM < 0.1) or small (SRM < 0.5) (Figure 3). Similar findings were observed in the nr-axSpA and AS subpopulations.

Work Productivity Survey changes from baseline by other response measures at week 12
Assessing the responsiveness of the WPS using the BASDAI50 response criteria yielded similar findings in the overall axSpA, nr-axSpA and AS populations. In all three populations, BASDAI50 responders at week 12 reported significantly or numerically greater improvements compared to nonresponders, except in absenteeism (in axSpA and nr-axSpA populations) and days missed of social activities (nr-axSpA subpopulation), where slightly higher improvements were noticed in nonresponders versus responders; however, this may be explained by the differences in baseline productivity losses (data not shown). The pattern of productivity change effect sizes observed in BASDAI50 responders and nonresponders was similar to that observed in ASAS20 responders and nonresponders.
Table 4 (continued). Work Productivity Survey baseline scores by defined known-groups: workplace productivity (randomized set, observed cases). Footnotes: Cutoff points represent the first and third quartiles of baseline scores. "Worse" state is defined for each individual measure as BASDAI score at or above the third quartile; BASFI score at or above the third quartile; Total Spine Pain score at or above the third quartile; ASQoL at or above the third quartile; SF-36 MCS at or below the first quartile; SF-36 PCS at or below the first quartile. "Better" state is defined for each individual measure as BASDAI score at or below the first quartile; BASFI score at or below the first quartile; Total Spine Pain score at or below the first quartile; ASQoL score at or below the first quartile; SF-36 MCS, PCS at or above the third quartile. (c) WPS Q4 is scored on a 0 to 10 scale, where 0 = no interference and 10 = complete interference. (d) P ≤ 0.001; (e) P ≤ 0.01; (f) P ≤ 0.05; nonparametric bootstrap-t method with a variance stabilizing transformation, 10,000 replications.
With regard to the responsiveness of WPS to more stringent clinical responses, such as ASAS40 or ASAS50, the effect sizes in mean changes in productivity within and outside the home in the responder groups were moderate to large, except for two questions (absenteeism and days with outside help) for which the effect sizes were small (SRM < 0.5). Results were similar in the overall axSpA population, nr-axSpA and AS subpopulations, except for absenteeism and days missed of social activities, which indicated different behaviors in the nr-axSpA and AS subpopulations (data not shown).
The results based on the total and nocturnal back pain MCID response were inconclusive because of a large imbalance in the sample sizes of the two groups compared.
When productivity changes were compared between responders and nonresponders at week 12 defined using other clinical response criteria, such as the ASDAS CII, ASDAS MI or the BASFI MCID response (Figure 4), clinical responders reported significantly or numerically larger improvements in productivity within and outside the home across all WPS questions in all three populations. In the overall axSpA population, the effect sizes of the changes in productivity in ASDAS CII, ASDAS MI or BASFI responders (Figure 5) were small (SRM < 0.5) for absenteeism (WPS Q2) and days with outside help (WPS Q8), and generally moderate to large for the other WPS questions. In nonresponders, the magnitude of change was negligible (SRM < 0.1) or small (SRM < 0.5). Similar effect sizes were observed in the nr-axSpA and AS subpopulations (data not shown).
Table footnotes: Cutoff points represent the first and third quartiles of baseline scores: "Worse" state is defined for each individual measure as BASDAI score at or above the third quartile; BASFI score at or above the third quartile; Total Back Pain score at or above the third quartile; ASQoL at or above the third quartile; SF-36 MCS at or below the first quartile; SF-36 PCS at or below the first quartile; "Better" state is defined for each individual measure as BASDAI score at or below the first quartile; BASFI score at or below the first quartile; Total Back Pain score at or below the first quartile; ASQoL score at or below the first quartile; SF-36 MCS, PCS at or above the third quartile. c WPS Q4 is scored on a 0 to 10 scale, where 0 = no interference and 10 = complete interference. d P ≤ 0.001; e P ≤ 0.01; f P ≤ 0.05; nonparametric bootstrap-t method with a variance stabilizing transformation, 10,000 replications.

Figure 2
Mean changes from baseline in Work Productivity Survey by Assessment of SpondyloArthritis international Society 20% improvement criteria clinical response at week 12. Change from baseline in Work Productivity Survey (WPS) by Assessment of SpondyloArthritis international Society 20% improvement criteria (ASAS20) clinical response at week 12 in overall axial spondyloarthritis (axSpA) population (randomized set, observed cases). *P ≤ 0.05, **P ≤ 0.01, #P ≤ 0.001; responders versus nonresponders. P-values were obtained using the nonparametric bootstrap-t method. Rate of interference is a score on a scale of 0 to 10 points (0 = no interference and 10 = complete interference). WP: Work productivity.
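The responder versus nonresponder P-values in this section were obtained with a nonparametric bootstrap-t method. The Python sketch below shows one common form of such a test for a difference in means: each group is shifted to the pooled mean so that the null hypothesis holds, both groups are resampled with replacement, and the studentized difference is recomputed in each replicate. It is a simplified illustration under stated assumptions: it omits the variance stabilizing transformation used in the analysis, and the function names, the seed and the example data are hypothetical.

```python
import math
import random
from statistics import mean, variance


def studentized_diff(a, b):
    """Difference in means divided by its unpooled standard error."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        raise ValueError("both groups have zero variance")
    return (mean(a) - mean(b)) / se


def bootstrap_t_pvalue(responders, nonresponders, n_boot=10_000, seed=0):
    """Two-sided bootstrap-t P-value for a difference in mean change.

    Simplification (assumption): no variance stabilizing transformation.
    """
    a, b = list(responders), list(nonresponders)
    rng = random.Random(seed)
    t_obs = studentized_diff(a, b)

    # Shift both groups to the pooled mean so resampling happens under the null.
    pooled_mean = mean(a + b)
    mean_a, mean_b = mean(a), mean(b)
    a0 = [x - mean_a + pooled_mean for x in a]
    b0 = [x - mean_b + pooled_mean for x in b]

    extreme = valid = 0
    for _ in range(n_boot):
        try:
            t_star = studentized_diff(
                rng.choices(a0, k=len(a0)), rng.choices(b0, k=len(b0))
            )
        except ValueError:
            continue  # skip degenerate resamples with no variability
        valid += 1
        extreme += abs(t_star) >= abs(t_obs)
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (valid + 1)


# Hypothetical week 12 changes from baseline for one WPS question.
resp = [-5, -4, -6, -5, -4, -6, -5, -3]
nonresp = [0, -1, 1, 0, -1, 0, 1, -1]
print(bootstrap_t_pvalue(resp, nonresp, n_boot=2_000))
```

Studentizing each replicate, rather than bootstrapping the raw mean difference, is what distinguishes the bootstrap-t approach; it tends to behave better for skewed count data such as days lost per month.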

Discussion
This work assessed the initial psychometric properties of the WPS, originally developed in a population of patients with RA [17], in patients with axSpA, nr-axSpA and AS. The disease-specific WPS is a tool developed to estimate arthritis-related productivity limitations in the workplace and in household activities [17], and its psychometric properties have already been demonstrated in patients with active RA [17]. Previous work demonstrated that the WPS could efficiently evaluate both the burden of the disease and clinical interventions on work outcomes in patients with RA [17,32,33]. The discriminant validity, responsiveness to clinical changes and reliability of the survey were evaluated in patients enrolled in a clinical trial for the treatment of active axSpA. The OMERACT meetings 6 and 7 [34,35] highlighted how important it is to patients that the impact of arthritic conditions on paid and unpaid work outcomes be considered, as these factors represent an important component of the health and well-being of rheumatology patients. Similar thinking should apply to patients with axSpA. Patient-reported outcomes have long been included in rheumatology trials, as they capture the patient's perspective of the disease process and the impact of treatments on the disease. Despite being of interest to patients and employers, the impact of axSpA or AS on work outcomes is not currently a core component of rheumatology clinical trials.
The spondyloarthritis treatment guidelines of the Canadian Rheumatology Association and the Spondyloarthritis Research Consortium of Canada indicate that disease monitoring should include assessments of function, disability and handicap and further noted that "[p]articipation in social, leisure, education, community and work activities must be an integral measure used to evaluate outcomes by health professionals, educators, policymakers and researchers" (p. 2279) [36]. Furthermore, the ASAS has indicated the importance of worker productivity in its educational slides. OMERACT has reinforced the importance of work productivity as an outcome measure in rheumatology through the Worker Productivity Special Interest Group, which has reviewed specific productivity instruments and continues to evaluate concepts and methodological and interpretation issues in work productivity [37].
The present findings indicate that the WPS instrument was generally well understood by patients, as indicated by the high completion rates. As in rheumatoid arthritis, the WPS demonstrated good discriminant validity, in terms of both association coefficients and known-groups analyses, evaluated against a range of different continuous measures used to assess disease activity, physical functioning and HRQoL. The association coefficients indicate the divergent validity between the individual WPS questions and the other measures considered, which was further confirmed by the known-groups analyses. Findings in the nr-axSpA and AS subpopulations were similar to those in the overall axSpA population.
The known-groups analyses based on the first and third quartiles of the instrument scores at baseline were further confirmed using a median cutoff of the score. However, the responsiveness of the WPS was assessed using clinically recognized thresholds and supports the discriminant validity analysis.
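The quartile-based known-groups definition described in the table footnotes can be sketched in Python as follows. Patients at or above the third quartile of a measure where higher scores are worse (for example, BASDAI or BASFI) form the "worse" group and those at or below the first quartile form the "better" group; the direction is reversed for SF-36 component scores. The function name, the choice of quantile method and the example scores are assumptions made for illustration only.

```python
from statistics import quantiles


def known_groups(scores, higher_is_worse=True):
    """Label each baseline score as "worse", "better" or None (middle 50%).

    Cutoffs are the first and third quartiles of the supplied scores.
    The "inclusive" quantile method is an assumption; the article does not
    state how quartiles were computed.
    """
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    labels = []
    for score in scores:
        if score >= q3:
            labels.append("worse" if higher_is_worse else "better")
        elif score <= q1:
            labels.append("better" if higher_is_worse else "worse")
        else:
            labels.append(None)  # between the cutoffs: not in either group
    return labels


# Hypothetical baseline scores on a measure where higher means worse.
example_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(known_groups(example_scores))
# Same scores read as a measure where higher means better (SF-36 style).
print(known_groups(example_scores, higher_is_worse=False))
```

Patients between the two cutoffs are left out of the comparison, which sharpens the contrast between groups at the cost of a smaller sample; a median split, as used in the confirmatory analysis, keeps every patient.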
The WPS was also responsive to clinical changes, as measured by the ASAS20 and BASDAI50 responses at week 12. Findings were also similar when using a range of different clinical response measures (ASAS40, ASAS50, ASDAS MI, ASDAS CII and BASFI MCID), supporting the responsiveness and reliability of the WPS. Similar results were reported in all three populations (axSpA, nr-axSpA and AS).
All WPS questions showed a certain level of responsiveness to clinical changes. However, the responsiveness of the questions concerning the number of work days missed due to axSpA and the number of days with outside help hired was not as large or consistent as that of the other WPS questions. The number of respondents who actually reported full days of work missed, or days with outside help, was quite small relative to the entire study sample. The results for absenteeism might suggest that the impact of axSpA on productivity more likely manifests as daily interference with normal working practices, without resulting in full disability. However, it should also be noted that because 12.4% of the sample reported being unable to work due to arthritis at baseline, patients who might otherwise have reported missing a high number of days of work did not report their level of absenteeism. The low level of responsiveness for the question assessing days with hired outside help would appear to suggest that axSpA patients might not necessarily hire outside help, but the possibility that patients receive external help from relatives or friends cannot be excluded.

Figure 4
Mean changes from baseline in Work Productivity Survey by Bath Ankylosing Spondylitis Functional Index response at week 12. Change from baseline in Work Productivity Survey (WPS) by Bath Ankylosing Spondylitis Functional Index (BASFI) response at week 12 in overall axial spondyloarthritis (axSpA) population (randomized set, observed cases). *P ≤ 0.05, #P ≤ 0.001, responders versus nonresponders. P-values were obtained using the nonparametric bootstrap-t method. Rate of interference is a score on a scale of 0 to 10 points (0 = no interference and 10 = complete interference). WP: Work productivity.

Given the intent of using the WPS across a variety of rheumatic conditions, including those where higher levels of disability might be anticipated, all questions of the WPS should remain to ensure an accurate assessment of the