Using longitudinal data with CVD outcomes from rheumatology outpatient clinics in the total ATACC-RA cohort, we found that various risk age models have comparable ability to rank RA patients in terms of time to CVD events. Interestingly, the cardiovascular risk age chart, which is based on quite wide CVD-RF intervals (e.g., sBP 120/140/160/180 mmHg), had only a slightly lower c-index (0.68) than the other risk age models, indicating comparable performance in correctly ranking individuals in terms of future CVD risk. Thus, although risk age estimations frequently differ by 5 years or more [15], their discriminative ability is very similar.
Among the included RA patients, the concordance was about 0.7 for all risk age models and similar to that of the SCORE algorithms they are based on, which had c-indices of 0.71–0.72. These c-indices were somewhat lower than those reported for the general European population in the original SCORE paper, in which Conroy et al. found c-indices of 0.81 and 0.74 for high- and low-risk countries, respectively [8].
When the risk age models were evaluated by time to event using a composite of CVD events, the cardiovascular risk age model and the vascular age models all had comparable c-statistics of around 0.7. For comparison, a concordance of 0.5 implies a discriminative ability no better than pure chance, whereas c-statistics of 0.60 to 0.75 are sometimes described as demonstrating “possibly helpful discrimination” and those > 0.75 as “clearly useful discrimination” (although this practice has been criticized) [20].
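As an illustration of how a concordance of this kind can be computed from time-to-event data, the following is a minimal sketch of Harrell's c-index for right-censored observations. It is not the analysis code used in this study: the function name `harrell_c_index`, the handling of tied event times (skipped here for simplicity), and the five example patients are all assumptions made for the illustration.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Simplified Harrell's c-index for right-censored data (illustrative).

    times       -- observed follow-up time per patient
    events      -- 1 if a CVD event was observed, 0 if censored
    risk_scores -- model output where higher means higher risk (e.g. risk age)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped
        # a is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: follow-up in years, event indicator, estimated risk age
times = [2.0, 4.0, 5.0, 7.0, 9.0]
events = [1, 0, 1, 0, 1]
risk_age = [72, 60, 56, 55, 58]

c_index = harrell_c_index(times, events, risk_age)
```

In this toy example, six pairs are comparable and five of them are ranked correctly, giving a c-index of 5/6. A model that assigned the same risk age to everyone would score 0.5, corresponding to chance-level discrimination.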
External validations of the SCORE algorithm, on which the cardiovascular risk age model is based and from which the vascular age models are calculated, have revealed wide-ranging c-statistics. In a review by Damen et al., reported c-indices ranged from 0.62 to 0.91 in different European and non-European study populations [6]. Comparisons of c-indices across populations are hampered by factors such as differences in age distributions. However, for the analyses in these cohorts, several additional explanations of the observed suboptimal concordance are plausible. The SCORE algorithm was developed for the European population, whereas our analyses included both European and non-European cohorts. Additional analyses restricted to data from European cohorts were performed, but comparable c-indices were found. We also performed CVD risk estimations using both algorithms for high- and low-risk countries. In the main analyses, we pooled data across centers to increase the number of participants and the observation time (total person years at risk), but a limitation is the heterogeneity between the various cohorts. The c-indices and their standard errors varied widely across the individual centers included in the analyses.
In RA, inflammatory disease activity, disease duration, and use of GCs, sDMARDs, and bDMARDs are all factors that may influence the overall risk of CVD [21,22,23,24,25,26,27,28]. In a recent study, Crowson et al. demonstrated that although conventional CVD-RFs accounted for half (49%) of CVD events in RA, high-grade inflammation and RA characteristics explained about 30% of the CVD risk [29]. However, the prediction models we evaluated only assess CVD risk related to conventional CVD-RFs. Furthermore, RA patients without known CVD have a high occurrence of atherosclerotic plaques even when the estimated absolute risk is only moderate, justifying the use of carotid ultrasound as a supplement in CVD risk stratification [30].
The latest EULAR recommendations on CVD risk management underline that rheumatic disease activity should be controlled to lower the overall risk of CVD [11]. RA-related characteristics may also complicate the interpretation of conventional CVD-RFs. The lipid paradox denotes the phenomenon in which low lipid levels due to elevated inflammation are associated with an increased risk of CVD [31]. Thus, a future RA-specific CVD risk algorithm should possibly weight lipid levels according to disease activity. In CVD prediction models, concordance will be impaired if important CVD-RFs are left out or not weighted appropriately.
In this paper, we aimed to evaluate the influence of RA disease characteristics on the performance of risk age models in ranking individuals correctly as high(er)- or low(er)-risk individuals. However, our findings were inconclusive due to the lack of statistical power resulting from the small number of participants included and/or the short observation time with few events occurring. Underreporting of CVD events during follow-up is also possible, especially since RA patients may suffer from asymptomatic CVD events [32]. The results on the association between RA disease characteristics and the c-index values, which were inconclusive because of large standard errors, could have been different with a longer follow-up time and more participants. In time-to-event analyses, consideration of informative and interval censoring is also required. Data on disease activity and on sDMARD and bDMARD treatment were available only at baseline. Surprisingly, a high proportion of RA patients were not using sDMARDs or bDMARDs at study inclusion. Although this should be considered before extrapolating our results to other RA cohorts, it may be partly explained by the fact that a high proportion of the RA patients included in these analyses had short disease duration (explaining why some were methotrexate naïve) and by differences across nations (explaining why some were bDMARD naïve). Another limitation of this multi-center study is that we could not verify that BP measurements were conducted similarly across centers. Data on family history of premature CVD were also lacking. Among eligible patients, estimation of risk age was not possible in 338–357 individuals when using prediction models without or with HDL-c, respectively, due to missing data on sBP (n = 114), TC (n = 205), HDL-c (n = 219), and current smoking (n = 119).
The c-index also has limitations, since it reports concordance based on ranks and not on the magnitude of risk differences. Consequently, when risk ages are very similar and differ only slightly across subjects, the CVD prediction model’s discriminatory ability will be impaired. Moreover, concordance describes only one aspect of the predictive ability of a risk model. Calibration, a comparison of the number of expected events to the number of observed events, is another important property in the validation of prediction models [33]. However, in contrast to models predicting absolute risk, calibration cannot be assessed for prediction models using the risk age concept.
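The distinction between calibration and rank-based discrimination can be sketched as follows. The example compares expected with observed events for a model that outputs absolute risks, and shows that doubling every predicted risk changes the expected/observed ratio while leaving the ranking of patients untouched. The function names, the predicted risks, and the outcomes are hypothetical and serve only to illustrate the idea; a risk age in years provides no expected event count, so the first function has no direct counterpart for risk age models.

```python
def expected_observed_ratio(predicted_risks, observed_events):
    """Ratio of expected events (sum of predicted absolute risks)
    to observed events. A value of 1.0 indicates perfect calibration-in-the-large."""
    expected = sum(predicted_risks)
    observed = sum(observed_events)
    if observed == 0:
        raise ValueError("no observed events")
    return expected / observed


def ranking(scores):
    """Patient indices ordered from highest to lowest predicted risk."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical absolute 10-year risks and observed outcomes (1 = event)
predicted = [0.02, 0.05, 0.10, 0.20, 0.03]
observed = [0, 0, 1, 0, 0]

# The same model with every predicted risk doubled (overestimating risk)
doubled = [2 * p for p in predicted]

eo_original = expected_observed_ratio(predicted, observed)
eo_doubled = expected_observed_ratio(doubled, observed)
same_ranking = ranking(predicted) == ranking(doubled)
```

Because the ranking is identical for both sets of predictions, any rank-based measure such as the c-index would be the same for the two models, even though their calibration differs by a factor of two.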
The risk age models we have validated are derived from the SCORE algorithms, which calculate the absolute risk of fatal CVD. In the original SCORE publication, the authors argued that developing a CVD prediction model based on non-fatal CVD events is prone to errors due to misclassification. Non-fatal events are also of clinical importance, and it has been suggested to convert SCORE with a multiplier to estimate fatal and non-fatal events [7, 34]. However, Jørstad et al. found that the ratio of the risk of fatal CVD to that of fatal plus non-fatal CVD was largely dependent on age and sex and, consequently, that a fixed multiplication factor was not applicable [35]. Since risk age communicates the detrimental effects of modifiable CVD-RFs on overall CVD risk and/or life expectancy, it is an attractive concept that also informs patients about the benefit of optimizing CVD-RFs. In the review by Groenewegen et al., it is argued that the perspicuous risk age concept might improve communication about CVD risk and possibly patient adherence to CVD preventive strategies (e.g., lifestyle changes and/or cardio-protective medication) [12]. CVD risk assessment is especially important in RA due to the high prevalence of modifiable CVD-RFs [36,37,38]. However, whether risk age is an intuitive concept and whether it has incremental value beyond absolute and relative risk calculation need further evaluation.
To our knowledge, this is the first study comparing the discriminative ability of the risk age models proposed for use as supplements to CVD risk evaluation in the ESC guidelines for the general population [7, 12]. Not surprisingly, our study supports the notion that risk age models perform similarly to the SCORE algorithms in ranking individuals correctly as being at high or low risk of CVD events. Although risk age estimations frequently differ by 5 years or more, the current risk age models based on SCORE perform almost equivalently in terms of concordance.