Study design and data source
We conducted a cohort study using data derived from patients in the Swiss Clinical Quality Management in Rheumatic Diseases (SCQM) registry. The SCQM registry was established in 1997 and is used to prospectively follow RA patients [11]. RA diagnoses are made by board-certified rheumatologists. Follow-up for the SCQM registry involves annual physical examination (i.e., tender joint count, swollen joint count), disease activity scores (e.g., DAS28), laboratory tests (i.e., esr), and radiographs. Other relevant health issues such as diagnosis of osteoporosis and its treatment are reported by the patient and recorded by the rheumatologist into SCQM. Clinical information is usually updated every time a patient changes anti-rheumatic therapy but at least once a year during a regular visit. Ethical approval for this study was obtained from the local ethics committees. All patients provided written informed consent.
Study population
We included patients with clinically diagnosed RA who had at least two eligible hand radiographs taken between January 1997 and October 2014. Radiographs were considered eligible if all eight DIP joints could be scored. Patients entered the study on the date of their first eligible radiograph. Each individual’s observation period lasted until the outcome or the final eligible radiograph. We divided the study population into two cohorts based on whether DIP OA (assessed by modified KLS) was present or absent at cohort entry (cohorts 1 and 2, respectively).
Exposures
We defined the following mutually exclusive treatment groups: csDMARD monotherapy (comparator group given the biggest size), bDMARD monotherapy, cs/bDMARD combination therapy, past use of any DMARD, and previously unexposed to DMARD therapy (called never-use). csDMARDs included methotrexate, leflunomide, sulfasalazine, and chloroquine. bDMARDs included TNFis (e.g., adalimumab, certolizumab, etanercept, infliximab, and golimumab) and non-TNFis (e.g., abatacept, tocilizumab and rituximab), by which we additionally stratified in cohort 2 given the observed increased risks of DIP OA progression of bDMARD users. The targeted synthetic DMARD tofacitinib was available during the observation period but was not used by any of the eligible patients.
The current exposure included 31 days after the end of supply. Thus, past users were defined as having been off treatment for at least this period.
Exposures of interest were assessed at baseline for time-invariant analyses and additionally at every other radiograph or visit for time-varying analyses. The date of the radiographic evaluation and visit date coincided in 88.8% of cases, and when we allowed a 1-month time window to assign radiographic information to the patient’s visit, this number increased to 93.3% (i.e., 6.7% of radiographs did not have accompanying visit information). Patient visits were recorded in SCQM on an average of once per year, and this interval was similar between the treatment groups (Additional file 1: Table S1). In time-varying analyses, we accounted for patients changing exposure groups during their observation period. Examples of exposure evaluation for time-varying analyses can be seen in Additional file 1: Figure S1.
Outcome
A trained radiology resident (CAL) blinded to patient data assessed conventional postero-anterior hand radiographs from each patient with known time order. The reader was trained by a set of 100 radiographs that have been evaluated by two senior rheumatologists (TH, UAW). As the standard for reading DIP OA, we used the Osteoarthritis Research Society International (OARSI) atlas [12]. DIP joints were scored for osteophyte severity (range 0–3), joint space narrowing (range 0–3), the presence of subchondral sclerosis, and the presence of central erosions, and these evaluations were used to formulate the modified KLS (range 0–4) [13].
Following the initial assessment of 7499 hand radiographs from 2870 RA patients, groups of 200 radiographs with an average of 60 patients were reassessed until Cohen’s κ coefficient, an indicator of intra-rater reliability, reached the pre-defined value of κ ≥ 0.70 [14]. This value was exceeded after 800 radiographs were assessed. The re-evaluated scores were used for the final analysis. In the final 200 rescored radiographs, kappa values were recorded as follows: K/L grade 0.90, osteophyte severity 0.77, joint space narrowing 0.83, subchondral sclerosis 0.91, and central erosions 0.92. The percentage of erosive joint surface destruction (in 19% increments) of bilateral metacarpophalangeal (MCP) joints 2–5 (with an unknown chronology) was assessed by the SCQM foundation using a similar scoring method to that described by Rau et al. In brief, the Ratingen score is based on the amount of joint surface destruction of each MCP and PIP joint. For this analysis, percentages of erosive joint surface destruction in all 8 MCP joints were summed (range 0–800%) and categorized. The intra-class correlation coefficient was 0.98.
Our primary outcome, assessed in cohort 1, was OA progression defined as an increase of ≥ 1 point in the summed KLSs of all eight DIP joints (range 0–32). The reliability of change was estimated by the κ value of 0.84; the smallest detectable change was a KLS of 0.13.
Our secondary outcome, assessed in cohort 2, was the rate of incident OA defined as a KLS ≥ 2 in ≥ 1 DIP joint (K/L grade 1 [i.e., questionable OA] was not rated as OA).
Covariates
Based on the literature review, we included the following a priori covariates in our adjusted analyses since they are confounders or risk factors for incident/progressive HOA: age (continuous), sex (binary), body mass index (BMI, continuous), RA duration (continuous), rheumatoid factor (RF) status (binary), DAS28-esr score (continuous), prednisone use (binary), cardiac disorders (i.e., myocardial infarction, ischemic heart disease, congestive heart failure, binary), hypertension (diagnosis or treatment, binary), osteoporosis (diagnosis or treatment, binary), hand surgery (binary), or large joint OA or hip/knee arthroplasty (binary). All variables were assessed at each radiograph/visit except for sex and RF status, which were given per patient (i.e., their values could not change over time).
In time-invariant analyses (baseline model), we performed a sensitivity analysis in which we additionally adjusted for the use of csDMARDs for ≥ 1 year prior to cohort entry and the use of bDMARDs for ≥ 1 year prior to cohort entry to account for prevalent use of cs/bDMARDs at cohort entry. This analysis yielded slightly higher HRs of DIP OA incidence/progression in bDMARD and cs/bDMARD combination therapy users than did the overall baseline analysis, data not shown.
A total of four variables had missing data (BMI, RA duration, RF status, DAS28-esr score). All variables for which we aimed to adjust for as well as the exposure and the outcome variable were part of the model with which we performed the two-level multiple imputation (by patient) of the missing values. We used the fully conditional specification imputation method and the Gibbs Markov chain Monte Carlo algorithm (because of some small clusters) in the BLIMP software 2.2 [15]. Although the overall missingness was only around 20%, we imputed 40 datasets. The imputation model was tested using the potential scale reduction (PSR) and performed at a PSR of < 1.02 which indicates model convergence. Detailed information about data missingness and the multiple imputation method can be found in Additional file 1: Tables S2–S4.
Statistical analysis
Statistical analyses were performed separately for the two cohorts 1 and 2. We described patient characteristics per exposure group at cohort entry (i.e., information changing over time as used in time-varying analyses was not described). Using Cox proportional hazard (PH) regression analysis, we estimated crude and adjusted hazard ratios (HRs) and 95% confidence intervals (CI) of DIP OA progression (cohort 1) and of incident DIP OA (cohort 2) using patient information at cohort entry only in all exposure groups (i.e., bDMARD, cs/bDMARD combination, past DMARD use, never DMARD use) when compared to csDMARD use (called baseline model). Cox PH assumptions were tested using the Martingale residual method and did not hold for csDMARD use or hypertension in cohort 1, or for never DMARD use in cohort 2. Therefore, since exposure and covariates changed over time, we additionally used time-varying Cox regression analyses. We additionally estimated the crude incidence rates as absolute risks based on the numbers of events and determined the follow-up times per exposure group. To test the model specifications, we chose at random one imputed dataset. The lowest Akaike Information Criterion (AIC) was reached with linear terms (i.e., introduction of interaction terms, quadratic or cubic terms did not lower the AIC value). Furthermore, for robustness assessment, we repeated our overall time-varying analyses using generalized estimating equation (GEE) analysis. Examples of data management for each analysis (Cox PH regression analysis, time-varying Cox regression analysis, GEE analysis) can be seen in Additional file 1: Tables S5–S7.
In exploratory subgroup analyses, we assessed subgroups of age (≤ 55 years, > 55 years) and RF status (positive, negative) at cohort entry. Moreover, because of their potential influence on bone turnover, we assessed subgroups of concomitant osteoporosis and concurrent prednisone in time-varying analyses. Since we aimed to assess the impact of concomitant prednisone and bDMARD use, bDMARD use without prednisone use was the comparator in this subgroup analysis.
DAS28-esr may be seen as a mediator variable; therefore, we performed a sensitivity analysis without adjusting for it in time-varying Cox regression analyses. Furthermore, since the detection of the outcome is only possible if a radiograph is taken, in sensitivity analyses using time-varying Cox regression, we led the outcome by 6 months (183 days).
Given the observed increased risks of DIP OA progression of bDMARD users, in a post hoc analysis in cohort 1, we descriptively and analytically assessed the progression of individual KLS components (osteophytes, joint space narrowing, sclerosis, and erosion). Progression of osteophyte and joint space narrowing were defined by an increase of ≥ 1 score in ≥ 1 DIP (continuous measurement). Sclerosis and erosion were measured binary, and progression was defined as an increase of ≥ 1 in frequency. In a further post hoc analysis, we adjusted our Cox regression analyses of DIP OA for fewer variables (i.e., age, sex, rheumatoid arthritis duration, hypertension, osteoporosis) because some of the smaller exposure groups ran the risk of overfitting.
The GEE analyses were performed in Stata/IC 16 while all other analyses were performed using the SAS statistical software version 9.4 (NC, USA).