In the phase 3 studies OPAL Broaden and OPAL Beyond, patients with active PsA receiving tofacitinib 5 and 10 mg BID showed improvements versus placebo throughout the 3-month placebo-controlled period for the composite endpoints assessed. These improvements were subsequently maintained to month 6 in OPAL Beyond and month 12 in OPAL Broaden. Adalimumab had comparable efficacy to tofacitinib across the composite endpoints in OPAL Broaden.
OPAL Broaden and OPAL Beyond involved two distinct populations of patients with PsA: csDMARD-IR/TNFi-naïve patients in OPAL Broaden and TNFi-IR patients in OPAL Beyond. Despite the difference in patient populations, baseline values for the composite endpoints were broadly similar across studies and treatments. Generally, LS mean changes from baseline were greater, and the effect size and standardized response mean were higher, in the OPAL Broaden study compared with OPAL Beyond. This suggests that the TNFi-naïve patients in OPAL Broaden showed more marked treatment responses than the TNFi-IR patients in OPAL Beyond, similar to previous reports for PsA treatment [18,19,20].
PASDAS baseline scores in OPAL Broaden were comparable with values reported in an equivalent study population [3]; however, along with the PASDAS baseline scores in OPAL Beyond, they were somewhat higher than those reported in a study of standard care [21] and patients in clinical practice [22]. In the GRACE (GRAPPA Composite Exercise) study, designed to develop composite disease activity and responder measures for PsA, a mean score of 5.30 for PASDAS was reported for patients changing treatment and this was taken as a surrogate for high disease activity [11]. The mean baseline PASDAS levels reported in this study were therefore suggestive of high disease activity in both OPAL Broaden and OPAL Beyond, and following 3 months of treatment, PASDAS levels dropped below this threshold. In addition, the GRACE study defined a good response as a PASDAS score of less than or equal to 3.2, following a decrease in score of greater than or equal to 1.6 from baseline [17]; in this study, this was achieved at month 12 in OPAL Broaden by 44.2% and 47.5% of patients receiving tofacitinib 5 and 10 mg BID, respectively, and at month 6 in OPAL Beyond by 28.5% and 28.9% of patients receiving tofacitinib 5 and 10 mg BID, respectively. Of note, a PASDAS score of less than or equal to 3.2 has been defined as low disease activity [17] and less than or equal to 1.9 as very low disease activity [23].
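The PASDAS cutoffs quoted above can be expressed as simple threshold checks. The following is a minimal sketch in Python based only on the values cited in this paragraph (good response: score of 3.2 or less after a decrease of at least 1.6; low disease activity: 3.2 or less; very low disease activity: 1.9 or less). The function names and the example scores are hypothetical illustrations and are not taken from the OPAL analyses.

```python
def pasdas_good_response(baseline: float, current: float) -> bool:
    """Return True if a PASDAS 'good response' is met, using the cutoffs
    quoted in the text: current score <= 3.2 AND a decrease from baseline
    of >= 1.6. (Hypothetical helper, not the study's analysis code.)"""
    decrease = baseline - current
    return current <= 3.2 and decrease >= 1.6


def pasdas_state(score: float) -> str:
    """Map a PASDAS score to the disease activity states quoted in the
    text (<= 1.9 very low, <= 3.2 low). Scores above 3.2 are simply
    labeled as above the low disease activity cutoff."""
    if score <= 1.9:
        return "very low disease activity"
    if score <= 3.2:
        return "low disease activity"
    return "above low disease activity cutoff"


# Hypothetical patient: baseline 5.9, follow-up 3.0 (decrease of 2.9)
print(pasdas_good_response(5.9, 3.0))
print(pasdas_state(3.0))
```

Note that both conditions must hold: a patient who starts at 4.0 and reaches 3.0 is in low disease activity but does not meet the good response definition, because the decrease is smaller than 1.6.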
OPAL Broaden DAPSA baseline scores were slightly lower than baseline scores in an equivalent study population [3] but higher than reported in clinical practice [24]. In the GRACE study, patients changing treatment (considered to have high disease activity) had a mean DAPSA score of 41.91 [11], suggesting that patients in OPAL Broaden and OPAL Beyond had high levels of disease activity. Indeed, in a recent study analyzing data from 30 patients with PsA in an observational database, the cutoff for a DAPSA score indicating high disease activity was greater than 28 [25]. In this study, mean DAPSA scores were below the high disease activity score reported in the GRACE study after 3 months of active treatment in all groups [11].
In contrast to the findings with the other composite measures, the baseline CPDAI scores reported for OPAL Broaden and OPAL Beyond were somewhat lower than the mean CPDAI score of 11.65 reported for patients changing treatment (surrogate for high disease activity) in the GRACE study [11]; thus, CPDAI scores did not appear to indicate patients with high baseline disease activity in these patient populations. However, another study has suggested a high disease activity threshold of greater than 7 for CPDAI [26]; mean CPDAI scores were below this threshold after 3 months of active treatment across all groups and both studies.
The DAS28–3(CRP) was included for comparative purposes only. Baseline DAS28–3(CRP) scores were somewhat higher than the mean DAS28–3(CRP) score of 3.96 observed for patients changing treatment (a surrogate for high disease activity) in the GRACE study [11]; however, DAS28–3(CRP) scores in this study were reduced below this level following 3 months of treatment. Notably, this measure was developed and validated for rheumatoid arthritis, and there are several reasons why it is inappropriate as a composite measure for assessing PsA, particularly that it measures only articular outcomes and excludes joints of the foot and ankle, potentially missing important inflammatory disease [27].
All reported effect size and standardized response mean values were greater than 0.80, the value generally taken to indicate a large treatment effect or response [3]. Across all time points and treatments, the largest effect size was observed for PASDAS; this is consistent with findings reported for golimumab [3]. Effect size and standardized response mean generally increased with time on treatment, indicating that the composite endpoints demonstrated time-dependent improvement, as might be expected. Analysis of the percentage of PASDAS responders over time also demonstrated the ability of the PASDAS instrument to detect treatment-related changes in PsA disease activity.
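For readers less familiar with these responsiveness statistics, the sketch below shows how they are conventionally computed. It assumes the usual definitions (effect size: mean change from baseline divided by the standard deviation of baseline scores; standardized response mean: mean change divided by the standard deviation of the change scores); the exact formulas used in the OPAL analyses are not stated here, so this is an assumption. The function names and the patient scores are hypothetical.

```python
from statistics import mean, stdev


def _changes(baseline, follow_up):
    """Per-patient change from baseline (follow-up minus baseline)."""
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow_up must be paired, equal-length lists")
    return [f - b for b, f in zip(baseline, follow_up)]


def effect_size(baseline, follow_up):
    """Magnitude of mean change divided by the SD of baseline scores
    (conventional definition, assumed here)."""
    return abs(mean(_changes(baseline, follow_up))) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """Magnitude of mean change divided by the SD of the change scores
    (conventional definition, assumed here)."""
    changes = _changes(baseline, follow_up)
    return abs(mean(changes)) / stdev(changes)


# Hypothetical paired composite scores for four patients
baseline = [6.0, 5.0, 7.0, 6.0]
month_3 = [4.0, 3.5, 4.0, 4.5]

es = effect_size(baseline, month_3)
srm = standardized_response_mean(baseline, month_3)
print(f"effect size = {es:.2f}, SRM = {srm:.2f}")
# Both values are compared against the 0.80 threshold for a large effect
print(es > 0.80, srm > 0.80)
```

Magnitudes are used because lower scores indicate improvement on these composites, so the raw mean change is negative. Both statistics require complete paired data, which is why missing values reduce the number of evaluable patients.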
The definition of MDA using the criteria applied in this analysis and in previous tofacitinib publications [9, 10] has utility for identifying treatment response and as such may be used as a target to guide treatment decisions [16]. When the standardized slope coefficients (STBs) of the composite endpoints from a multiple logistic regression model were compared, the change in PASDAS had the largest magnitude of association with MDA response among all the composite endpoints examined, suggesting that it had the strongest predictive ability compared with DAPSA and CPDAI; CPDAI had the lowest predictive ability of the endpoints.
The differing findings with respect to tofacitinib treatment for the three disease-specific composite endpoints considered in this analysis could have resulted from the different composition of the endpoints evaluated. The PASDAS and CPDAI both include assessment of the skin manifestations of PsA (the PASDAS by inclusion of the patient’s global “arthritis and psoriasis” VAS) and the severity of enthesitis and dactylitis as well as TJC and SJC. DAPSA, however, is focused on TJC and SJC, with no consideration of skin disease, enthesitis, or dactylitis and an arthritis-focused global score. The PASDAS and CPDAI also both incorporate PROs; the PASDAS incorporates the PCS score of the SF-36v2 acute, and the CPDAI the DLQI and ASQoL. In this analysis, the PASDAS appeared to be the most sensitive to improvements in the signs and symptoms of PsA related to treatment with tofacitinib and adalimumab; the effect size observed with the PASDAS was higher than for any other endpoint at all time points in both studies. The ability of the PASDAS to detect change in these two studies might reflect the components of the measure; skin manifestations, enthesitis, dactylitis, and PROs all appeared to be sensitive to treatment-related changes in OPAL Broaden and OPAL Beyond, although the adoption of a hierarchical testing scheme for key secondary endpoints precluded demonstration of significance for all measures and time points [9, 10]. The CPDAI also incorporates skin, enthesitis, dactylitis, and PROs but appeared less sensitive to treatment differences than the PASDAS, although its effect size and standardized response mean were generally higher than those of DAPSA.
Inclusion of the axial disease domain in CPDAI (which does not feature in the other composite endpoints assessed) could offer an explanation as to why tofacitinib had the least impact on this composite; it may be that axial disease responds to a lesser extent than the other domains to treatment with tofacitinib and this may have impacted the final composite score. The CPDAI may also be less responsive because of the way it is constructed: the CPDAI is essentially a categorical measure re-expressed as a continuous scale and the hierarchical thresholds may blunt responsiveness. As previously discussed, the utility of DAS28–3(CRP) is limited because of the small number of components included in the composite and the lack of inclusion of measures of skin disease, enthesitis, dactylitis, or PROs.
It is clear from these analyses that PASDAS has superior performance in this context, and the consensus view has already been reported to be that PASDAS should be the outcome measure of choice in PsA clinical trials [28]. The DAPSA is easier to evaluate but there are arguments against this measure; PsA is a complex multifaceted disease which requires appropriate evaluation across domains, and measures such as the DAPSA, though easy to perform in practice, do not fulfill this function. In terms of clinical practice, the PASDAS does present a challenge in both acquiring the data and processing the result: the first challenge is common to clinical assessment in PsA generally; the second is easily overcome by the use of predefined spreadsheets and web-based resources.
This analysis had a number of limitations. The OPAL Broaden and OPAL Beyond studies were not designed for evaluation of the composite endpoints’ longitudinal validity and sensitivity to change. In addition, the calculation of effect size and standardized response mean included only patients with greater than or equal to 3% psoriasis BSA affected at baseline and no missing values of the composite endpoints across multiple visits. Consequently, patient numbers were relatively low in some cases; CPDAI data were available for only 63% and 52% of patients receiving tofacitinib 10 mg BID in OPAL Broaden at month 12 and OPAL Beyond at month 6, respectively, and effect size and standardized response mean were calculated in only 47–64% of patients. Also, there was no adjustment for multiplicity; therefore, the P values reported for comparison with placebo should be considered nominal.