Rheumatoid arthritis patients on persistent moderate disease activity on biologics have adverse 5-year outcome compared to persistent low-remission status and represent a heterogeneous group

Background The long-term outcome of rheumatoid arthritis (RA) patients who in clinical practice exhibit persistent moderate disease activity (pMDA) despite treatment with biologics has not been adequately studied. Herein, we analyzed the 5-year outcome of the pMDA group and assessed for within-group heterogeneity. Methods We included longitudinally monitored RA patients from the Hellenic Registry of Biologic Therapies with persistent (cumulative time ≥ 50% of a 5-year period) moderate (pMDA, 3.2 < DAS28 ≤ 5.1) or remission/low (pRLDA, DAS28 ≤ 3.2) disease activity. The former was further classified into persistent lower-moderate (plMDA, DAS28 < 4.2) and higher-moderate (phMDA, DAS28 ≥ 4.2) subgroups. Five-year trajectories of functionality (HAQ) were the primary outcome in comparing pRLDA versus pMDA and assessing heterogeneity within the pMDA subgroups through multivariable mixed-effect regression. We further compared serious adverse events (SAEs) occurrence between the two groups. Results We identified 295 patients with pMDA and 90 patients with pRLDA, the former group comprising of plMDA (n = 133, 45%) and phMDA (n = 162, 55%). pMDA was associated with worse 5-year functionality trajectory than pRLDA (+ 0.27 HAQ units, CI 95% + 0.22 to + 0.33; p < 0.0001), while the phMDA subgroup had worse 5-year functionality than plMDA (+ 0.26 HAQ units, CI 95% 0.18 to 0.36; p < 0.0001). Importantly, higher persistent disease activity was associated with more SAEs [pRLDA: 0.2 ± 0.48 vs pMDA: 0.5 ± 0.96, p = 0.006; plMDA: 0.32 ± 0.6 vs phMDA: 0.64 ± 1.16, p = 0.038]. Male gender (p = 0.017), lower baseline DAS28 (p < 0.001), HAQ improvement > 0.22 (p = 0.029), and lower average DAS28 during the first trimester since treatment initiation (p = 0.001) independently predicted grouping into pRLDA. Conclusions In clinical practice, RA patients with pMDA while on bDMARDs have adverse long-term outcomes compared to lower disease activity status, while heterogeneity exists within the pMDA group in terms of 5-year functionality and SAEs. Targeted studies to better characterize pMDA subgroups are needed, in order to assist clinicians in tailoring treatments.


Background
Rheumatoid arthritis (RA) disease activity state is associated with long-term prognosis [1]. According to the widely accepted treat-to-target (T2T) strategy, the aim of treatment in RA is to improve patients' health-related quality of life by abrogation of inflammatory burden [2]. T2T approach in RA aims remission or low disease activity, evaluated by composite indexes such as the disease activity score of 28 joint counts (DAS28) [3]. Biologic disease-modifying antirheumatic drugs (bDMARDs) are potent agents which control inflammation and improve prognosis. Their clinical effectiveness has been extensively reported in controlled clinical trials and registries. Data from registries have shown that 15-35% of patients treated with TNFα inhibitors (TNFis) achieve remission at 12 months [4][5][6][7] and figures are comparable with non-TNFis such as tocilizumab [8]. Interestingly, approximately 30-50% of patients although improve disease activity, they still have moderate disease activity (MDA) after treatment with the first bDMARD [6].
Outcomes of patients with residual MDA while on treatment are available only from trials focusing on early-onset disease treated with conventional synthetic DMARDs (csDMARDs) [9,10] and a pooled analysis of randomized controlled clinical trials of TNF is reported on early (1-year) outcomes [1]. Moreover, most of the registries report on disease activity status at prespecified time-points after treatment, while data indicative of long-term longitudinal disease activity course (i.e., timeaveraged disease activity) are limited. To this end, data for RA patients on bDMARDs having persistently MDA are scant. Rheumatologists in clinical practice often confront with patients who have residual disease activity even after switching biologic disease-modifying drugs. Therefore, and given the lack of data for this group, we sought to assess the long-term outcome of RA patients on bDMARDs who exhibit persistent moderate disease activity (pMDA) in clinical practice context.
Herein, we analyzed data from the Hellenic Registry of Biologic Therapies to evaluate the long-term (5 years) outcome of RA patients on bDMARDs who exhibit pMDA. We aimed to compare the functional status (HAQ) at 5 years and its longitudinal course (trajectories) in patients with pMDA versus those on persistent lower inflammatory burden (pRLDA). Additionally, we compared serious adverse events (SAEs) occurrence during the course of the follow-up between the two groups. We finally assessed for clinical heterogeneity within pMDA and looked for early predictors of patients' classification into the distinct persistent disease activity groups.

Data source
The Hellenic Registry of Biologic Therapies (HeRBT) was established in 2004 as a multicenter (7 centers) prospective observational cohort of patients with inflammatory arthritides [6]. RA and spondyloarthritis patients are included in the HeRBT at the initiation of first bDMARD and are followed-up for as long as they receive bDMARDs. Detailed analysis of protocol is given elsewhere [6,11]. Treatment decisions (bDMARD selection, co-medication, dosage adjustments/switches) were made at the discretion of treating rheumatologists based on clinical assessments, national guidelines, and patient's preferences. All adverse events were reported using a separate form in every visit by patients and treating physicians and details regarding their seriousness, healthcare utilization, and outcome were recorded. Ethical approval was obtained from the Institutional Review-Board of the University Hospital of Heraklion, Crete (decision number 1476/20-03-2012), along with participants' informed consents.

Patients
We analyzed all adult patients diagnosed with RA who were registered for the first time in HeRBT until 31 May 2013 and had continuous follow-up for at least 5 years, irrespective of bDMARD treatment switches. Data were censored when patients completed 5 years of follow-up.

Longitudinal clinical data
Data harmonization was performed to summarize longitudinal disease activity (assessed by DAS28 index) and functionality (assessed by Health Assessment Questionnaire (HAQ)) assessments in clinically relevant therapy time (TT) intervals. Specifically, we defined a total of eight TT intervals that spanned sequentially over the 5 years follow-up period, every 6 months for the first 2 years and every year afterwards. DAS28 and HAQ averages were computed for each patient, in each TT interval, from all clinical assessments conducted in the specific time period. In addition, the patients' serious adverse events (SAEs) were computed during the 5 years of follow-up. SAEs were defined as adverse events that required hospitalization and/or intravenous antibiotics, resulted in significant disability, and/or were lifethreatening or fatal.

Definitions and clustering by persistent disease activity
Patients were classified into persistent low (pRLDA) and persistent moderate (pMDA) disease activity groups, if they correspondingly had DAS28 ≤ 3.2 and 3.2 < DAS28 ≤ 5.1 for cumulative time percentage ≥ 50% of the 5-year follow-up time. Cumulative time percentage (CTP) of a DAS28 range was defined as the ratio of TT intervals that have DAS28 in the specific range. A patient was clustered in a persistent disease activity group when a specific DAS28-range occurred for CTP ≥ 50% or equivalently for at least 4 out of the 8 TT intervals (any of them).
To assess for within-group heterogeneity, patients with pMDA were further sub-divided into persistent lowermoderate (plMDA) and higher-moderate (phMDA) disease activity groups, when they fulfilled the additional criterion of having persistent DAS28 < 4.2 and DAS28 ≥ 4.2, respectively, for CTP ≥ 50%, as defined above. Any conflicting cases, where patients could be classified in two different persistent disease activity groups, were resolved by the preference policy of the worst-case scenario (classification in the higher disease activity group).

Statistical analysis
Summary descriptive measures were used on baseline characteristics of the persistent disease activity groups. Non-parametric hypothesis tests (Kruskal-Wallis, Wilcoxon rank-sum, and chi-squared, as appropriate) were used to compare groups' differences at baseline characteristics (Bonferroni-corrected p value = 0.0019 to account for multiple comparisons) and also to compare groups' 5-year outcomes regarding functional status (HAQ) and cumulative serious adverse events (SAEs). To estimate the required sample size to compare functionality, we formulated the null hypothesis that functionality (HAQ) at 5 years would not differ significantly between pRLDA and pMDA groups (i.e., difference in HAQ ≤ 0.22 units). To reject this null hypothesis with 80% power (α = 0.05) and assuming 0.5 mean HAQ (0.5 std) in the pMDA and 0.28 in the pRLDA group and patient allocation ratio pRLDA to pMDA to be 1:2, the estimated required sample size was at least 61 and 122 patients in the pRLDA and pMDA groups, respectively.
Multivariable mixed-effect regression analysis was used to associate the persistent disease activity groups pRLDA and pMDA with different 5-year functionality trajectories. Additionally, multivariable mixed-effect regression was performed for pMDA group to assess for clinical heterogeneity (5-year functionality trajectories) between plMDA and phMDA subgroups. In the analyses, HAQ was modeled as the dependent outcome variable while the persistent disease activity group (or subgroup) was modeled as fixed-effect variable (categorical, dummy-coded with reference-category the lower persistent disease activity group). We accounted for individual patient variability with a random effect on the patient level (random intercept for each patient) and adjusted for possible confounding effects of gender, age, and disease duration (fixed-effects). Time in treatment course was modeled as fixed-effect categorical variable (dummy-coded, 9-values, reference-category baseline, remaining values represent the 8 TT intervals). This time representation was selected to model the change in HAQ in each TT interval compared to baseline. Alternative representations of groups' interactions with time did not yield significantly better results on comparison metrics of AIC, BIC, and negative log-likelihood.
Multivariable logistic regression was also performed to analyze for early predictors that classify in pMDA (versus pRLDA) group. The model was adjusted for gender, age (per year), disease duration (per year), and previous csDMARDs at baseline (binary, true for csDMARDs < 2). The model also included the following patient's characteristics from baseline and first therapy semester: baseline disease activity (DAS28), baseline functionality (HAQ), anti-TNF treatment initiation, average disease activity in first semester (DAS28-average), and functionality difference (ΔHAQ) in first semester's average (HAQ-average) from baseline (binary, true for difference < − 0.22). Predictive efficiency of the model was evaluated using the 10-fold cross-validation process (90% training set, 10% test set, and 10 repetitions without resubstitution) in the samples of the analysis cohort, in order to avoid overfitting and selection bias. The efficiency metrics, accuracy (ACC), sensitivity or true positive rate (TPR), specificity or true negative rate (TNR), and the area under the receiver-operating-characteristic curve (AUC), were averaged from 10 independent datasets extracted from the cross-validation process.

Missing data management
Patients who had proportionally more missing than existing DAS28-assessments were excluded in order to maintain data quality. Missing DAS28-data in the rest of the patients were imputed with a multivariable mixedeffect regression model, due to the time-dependent nature of the longitudinal DAS28-data (repeated correlated DAS28-measurements in the same patients). The model was adjusted on gender, age, and disease duration (fixedeffects) and included a random effect on the patient level and a fixed-effect variable for time (as in the aforementioned mixed-effect model). All analyses were performed in MATLAB 9.2 statistical toolbox.

Cohort characteristics
The analysis cohort included 385 patients from the HeRBT registry that fulfilled two criteria, (a) they had at least 5 years follow-up and (b) they exhibited persistent low or moderate disease. This cohort was selected out of the 1466 RA patients included in the registry until May 2013, after excluding 763 patients due to < 5 years follow-up, 166 patients having missing longitudinal DAS28-data > 50%, 142 exhibiting persistent high inflammatory burden, and 10 patients not exhibiting any persistent disease activity level. This was a cohort mostly with established RA (mean disease duration 9.2 years), 70 patients had disease duration < 2 years, and the mean monitoring duration was 7.5 years (Table 1). Patients were treated with an average of 2.34 (± 1.13) csDMARDs prior to first bDMARD, while 279 (72%) were on combination with methotrexate. At the end of 5 years, 208 (54%) patients received a second and 99 (26%) received a third sequential bDMARD.
pMDA group was associated with worse 5-year functionality than pRLDA One of the main aims of this study was to investigate whether patients on pMDA have adverse long-term prognosis compared to patients on lower chronic inflammatory burden. The 5-year functionality (HAQ) trajectories (average DAS28 in the 8 TT intervals) of the pRLDA and pMDA groups are presented in Fig. 3, showing a clear distinct trajectory for each group. In multivariable mixed-effect regression analysis (Table 2), the pMDA group was associated with worse 5-year functionality trajectory than the pRLDA group (+ 0.28 higher HAQ trajectory in pMDA, 95% CI + 0.18 to + 0.39, p < 0.0001). Analysis was adjusted for possible confounding effects on gender, age and disease duration. Interestingly, the 5-year functionality was also significantly worse in females than males (+ 0.13 higher HAQ trajectory in females, 95% CI + 0.04 to + 0.23, p = 0.008) and in older patients (+ 0.009 higher HAQ trajectory per 1 year, 95% CI + 0.006 to + 0.012, p < 0.0001).
Subgroup phMDA was associated with worse 5-year functionality than plMDA Clinical heterogeneity has been reported for the MDA group of patients on csDMARDs while limited data are available concerning bDMARDs. We assessed whether our cohort of RA patients on bDMARDs having persistent moderate disease activity represents a heterogeneous group. For this, we compared lower and higher pMDA subgroups, the plMDA and phMDA, respectively. The 5-year functionality trajectories (average HAQ in the 8 TT intervals) of the patients classified in plMDA and phMDA subgroups are presented in Fig. 3, revealing a clear distinct trajectory for each subgroup. In multivariable mixed-effect regression analysis (Table 3), the phMDA subgroup was associated with worse 5-year functionality trajectory than plMDA subgroup (+ 0.26 higher HAQ trajectory in pMDA, 95% CI + 0.17 to + 0.36, p < 0.0001). Analysis was adjusted for possible confounding effects on gender, age, and disease duration. The 5-year functionality was also significantly worse in females than males (p = 0.04) and in older patients (p < 0.0001). At 5 years, the phMDA group had worse functionality status than plMDA (HAQ 0.41 ± 0.38 vs 0.66 ±

Patient (sub)groups are associated with different serious adverse events occurrence
We also assessed SAEs occurrence during the course of the follow-up to estimate for any differences in the longterm outcome between distinct patients groups. The 5year cumulative SAEs trajectories of pRLDA and pMDA groups as well as of the plMDA and phMDA subgroups are presented in Fig. 4, showing a clear distinct trajectory for each group and subgroup, respectively. At 5 years, pMDA group had higher occurrence of SAEs than pRLDA group (0.2 ± 0.48 in pRLDA vs 0.5 ± 0.96 in pMDA, p = 0.006). In addition, the phMDA subgroup had also higher occurrence of SAEs than the plMDA subgroup (0.32 ± 0.6 in plMDA vs 0.64 ± 1.16 in phMDA, p = 0.038). The differentiation between subgroups plMDA and phMDA in serious adverse events occurrence further supports the heterogeneity within the pMDA group.
Early predictors for classification between pRLDA and pMDA groups In view of the aforementioned clinically meaningful differences between pRLDA and pMDA patient groups, we developed a predictive model for the early classification of patients using data from patients' early therapy

Discussion
We characterized the long-term prognosis of RA patients with persistent moderate disease activity (MDA) under bDMARD treatment in the context of clinical practice. We found that a substantial proportion of patients improved disease activity status and function after treatment, yet a substantial proportion exhibited pMDA irrespective of treatment modifications. Most importantly, pMDA was associated with worse long-term (5year) outcomes (functional limitation and serious adverse events) than persistent lower inflammatory burden. Interestingly, the subgroup with plMDA had better long-term outcomes than those with phMDA. Efficiency of multivariable analysis: RMSE (Root mean square error) = 0.352, R 2 (R-squared) = 0.573 pRLDA persistent remission/low disease activity group; pMDA persistent moderate disease activity group Fig. 3 HAQ trajectories of patient groups pRLDA (remission-low) and pMDA (moderate) and subgroups plMDA (lower-moderate) and phMDA (higher-moderate). RA, rheumatoid arthritis; HAQ, functionality measured by Health Assessment Questionnaire; pRLDA, patients with persistent remission or low disease activity; pMDA, patients with persistent moderate disease activity; plMDA, patients with persistent lower-moderate disease activity; phMDA, patients with persistent higher-moderate disease activity An important finding of this study was that persistent MDA was linked to significantly worse functionality trajectory during 5 years of bDMARDs therapy, as compared to pRLDA (Fig. 3 and Table 2). Interestingly, residual disease activity was the major contributor to HAQ increase over time (Table 2). These data are comparable to those from early RA cohorts on csDMARDs assessing the cumulative effect of disease activity on RA-related outcomes [9,10,12]. Although one might argue that this finding is an expected one, we considered it clinically important and novel to focus for the first time on patients treated with bDMARDs, which might exert differential immunomodulatory effects as compared to csDMARDs. For example, it has been shown that both TNFi and tocilizumab may inhibit joint destruction effectively even when residual disease activity exists, which is not the case for methotrexate [13][14][15].
Another key finding of our study is that MDA patients comprise a heterogeneous group in terms of outcome. Patients with plMDA have significantly better 5-year functionality trajectory than those in phMDA ( Fig. 3 and Table 3). This agrees with studies focusing on early-onset RA and csDMARDs treatments, showing that patients on the lower end of MDA have significantly better outcomes than those on the higher end [10,16]. Notwithstanding the fact that observational studies cannot provide direct support for management strategies, we consider our results to be of clinical importance and relevant to the T2T concept. Firstly, our data provide further support to the validity of T2T in clinical practice, since there was a clear superiority for all long-term outcomes in the pRLDA as compared to pMDA group. Moreover, the heterogeneity of outcomes in lower and higher MDA patients can assist T2T strategies to tailor treatments for these subgroups in order to improve outcomes. Fig. 4 Serious adverse events trajectories of groups pRLDA (remission-low) and pMDA (moderate) and subgroups plMDA and phMDA. RA, rheumatoid arthritis; pRLDA, patients with persistent remission or low disease activity; pMDA, patients with persistent moderate disease activity; plMDA, patients with persistent lowermoderate disease activity; phMDA, patients with persistent highermoderate disease activity An interesting finding was that patients on pMDA accumulated more SAEs compared to patients on pRLDA during 5 years of bDMARDs therapy (Fig. 4). Additionally, the analysis within pMDA showed that subgroup phMDA had more SAEs than plMDA at 5 years of therapy (Fig. 4). A major contributor in SAE is serious infections. The correlation between disease activity level and serious infections has been shown by several cohort studies [17][18][19]. Nevertheless, only our study and the analysis from the CORRONA registry by Accortt et al. are those analyzing "cumulative" disease activity levels, revealing the significant "dose effect" of inflammatory burden on the risk for serious infections [20]. Moreover, data from the Nijmegen early RA inception cohort have shown that time-averaged disease activity burden contributes to the risk of cardiovascular events in RA patients on different background therapies [21,22]. These findings combined with the finding of higher functional decline of pMDA group underline the importance of cumulative residual disease activity as an important contributor in RA long-term prognosis.
Long-term prognosis of RA largely depends on the disease inflammatory burden and associated comorbidities. One of the limitations in the literature is that the majority of short-and even long-term studies evaluate RArelated inflammation cross-sectionally. However, values representative of longitudinal course of disease activity and its effect over time are considered to provide more valuable information. One such approach is the average disease activity from multiple years of treatment [10], while another is the area under the curve of DAS28 course (AUC) which was associated with both radiographic progression [23] and the risk for cardiovascular diseases (CVD) [21,22]. In the present study, we applied another approach using the cumulative time percentage (CTP) that DAS28 falls within a specified range during follow-up (CTP of DAS28-range), as indicative of longterm longitudinal disease activity course. We considered the AUC method not appropriate since it may not be able to distinguish a persistent moderate disease trajectory from one fluctuating equally between low and high disease activity levels which may exhibit approximately equivalent AUC values.
Studies analyzing persistent disease activity status from cohorts of biologics have focused on persistent remission [24][25][26][27][28][29][30]. Herein, we focused on persistent residual disease activity (pMDA), and we found that from patients in persistent moderate or lower inflammatory burden treated on bDMARDs, 23% were classified in a persistent low or remission status while 77% still exhibited substantial inflammatory burden (pMDA) after 5 years of therapy. Even though this seems as a rather high number, yet available data from registries have shown that only 8.2-21% of bDMARDs treated patients are classified as being in persistent remission [24][25][26][27]29]. Of note, pMDA patients in our cohort differ from pRLDA even from baseline and could be divided further in two heterogeneous subgroups.
Early predictors for patient classification in pRLDA compared to pMDA group identified in multivariable predictive modeling were male gender, lower baseline, and lower first semester disease activity (DAS28) and functionality improvement in first semester compared to baseline (ΔHAQ < − 0.22) ( Table 4). Comparable to our data, a meta-analysis of six studies for factors associated to sustained remission in patients treated with TNF inhibitors showed that greater baseline disease activity, age, disease duration, baseline functional impairment, and female sex were associated with reduced likelihood of achieving sustained remission [30]. Nevertheless, One of the limitations of this study is patients' missing follow-up data. In order to address missing data and also maintain data quality, we excluded patients with large percentage of missing DAS28-data (> 50%) and imputed missing DAS28-data in the rest of the patients. Of note, the imputed-dataset (385 patients) compared to the non-imputed (292 patients) included additionally 93 (24%) patients, and results were similar in both datasets (groups pRLDA and pMDA differentiated in functionality trajectory and SAEs, data not shown).
Another limitation of this study can be considered the merging of remission and low disease activity groups. The pRLDA group included 52 patients in persistent remission, 20 in persistent low disease activity, and 18 with fluctuations between low and remission disease activity. Future studies in larger datasets could focus in persistent strictly low (2.6 ≤ DAS28 ≤ 3.2) and persistent moderate (3.2 < DAS28 ≤ 5.1) disease activity comparison.
In order to evaluate the robustness of the methodological approach, we performed sensitivity analysis in shorter (3-year) therapy duration with similar results (groups pRLDA and pMDA differentiated in functionality trajectory). Furthermore, analysis on groups' differences regarding patients' inclusion-year (year ≤ 2007, 2007 < year ≤ 2010, year > 2010) did not yield significant variation between the groups.
Conclusions Our analysis revealed that a considerable proportion of RA patients on bDMARDs in clinical practice, although improve disease activity status, still manifests persistent moderate disease activity. This state was associated with adverse 5-year outcomes and was also found to present internal heterogeneity, while predictors were analyzed to assist for early patient classification. These findings further support the value of T2T strategy in order to improve long-term outcome and highlight the need for further targeted studies on persistent MDA state and its heterogeneous subgroups.