Quantitative prediction of radiographic progression in patients with axial spondyloarthritis using neural network model in a real-world setting
Arthritis Research & Therapy volume 25, Article number: 65 (2023)
Predicting radiographic progression in axial spondyloarthritis (axSpA) remains limited because of the complex interaction between multiple associated factors and individual variability in real-world settings. Hence, we tested the feasibility of artificial neural network (ANN) models to predict radiographic progression in axSpA.
In total, 555 patients with axSpA were split into training and testing datasets at a 3:1 ratio. A generalized linear model (GLM) and ANN models were fitted based on the baseline clinical characteristics and treatment-dependent variables to predict the modified Stoke Ankylosing Spondylitis Spinal Score (mSASSS) of the radiographs at follow-up time points. The mSASSS prediction was evaluated, and explainable machine learning methods were used to provide insights into the model predictions.
The R2 values of the fitted models were in the range of 0.90–0.95, and the ANN that received baseline mSASSS as the number of each segment score performed better (root mean squared error (RMSE) = 2.83) than the GLMs or the ANN that received mSASSS as summed scores (RMSE = 2.99–3.57). The ANN also effectively captured complex interactions among variables and their contributions to the transition of mSASSS over time in the fitted models. Structural changes constituting the mSASSS scoring system were the most important contributing factors, and the absence of detectable structural abnormalities at baseline was the most significant factor suppressing mSASSS change.
Clinical and radiographic data-driven ANN allows precise mSASSS prediction in real-world settings. Correct evaluation and prediction of spinal structural changes could be beneficial for monitoring patients with axSpA and developing a treatment plan.
Axial spondyloarthritis (axSpA), including ankylosing spondylitis (AS), is a chronic progressive disease characterized by inflammation of the entheses, leading to new bone formation and ankylosis of joints, primarily in the axial skeleton [1, 2]. Radiographic progression of the spine has been reported to occur in approximately 20–50% of patients with AS after 2 years [3,4,5]. Progressive structural deformity of the spine and ankylosis of the sacroiliac joints lead to functional impairments, resulting in decreased physical activity and worsened quality of life.
Current treatment strategies have been validated to control the symptoms and disease activity of axSpA [6,7,8]. However, it remains inconclusive whether any currently available medications for axSpA have a significant effect on spinal radiographic progression. Several factors predicting spinal radiographic progression have been identified, including male sex, smoking, presence of syndesmophytes at baseline, high degree of sacroiliitis on magnetic resonance imaging (MRI), and positivity for HLA-B27 [1, 3,4,5, 9,10,11,12]. Long-term use of tumor necrosis factor (TNF) inhibitors and effective suppression of inflammation also contribute to suppressing spinal radiographic progression in patients with AS [8, 11, 13, 14]. The modified Stoke Ankylosing Spondylitis Spinal Score (mSASSS) is a validated outcome measure for evaluating the effect of treatment on spinal radiographic progression in AS, and radiographs at 2-year intervals are usually required to ensure sufficient sensitivity to change. These results were obtained from well-designed controlled trials and cohort studies. However, they are difficult to apply to individual patients in a real-world setting because the number of risk or protective factors differs across patients, and the weights of these factors and the interactions among them are complex and cannot be quantified in a single formulated metric. Moreover, each patient’s visit schedule to the hospital varies according to lifestyle, work environment, and disease status. The time intervals between follow-up radiographs are therefore also variable rather than fixed at 2 years.
In previous studies, a novel subgroup of axSpA with a high risk for spinal radiographic progression was identified using machine learning (ML) algorithms and the ensemble method, and radiographic progression was predicted by a combination of clinical and radiographic variables [12, 16]. However, radiographic progression was defined as a dichotomous outcome [a change of ≥ 2 mSASSS units in 2 years (yes/no) or at least one new syndesmophyte formation in 2 years (yes/no)], which is a qualitative determination. If radiographic progression could be predicted precisely and quantitatively, the prediction would be more useful for monitoring the disease course of patients and assessing the treatment response. In this study, using a longitudinal observational cohort of patients with axSpA and linear regression and deep neural network models, we aimed to develop a fitted model to quantitatively predict the mSASSS at a specific follow-up time point from baseline clinical characteristics, radiographic damage indices, time-adjusted inflammatory burden, and exposure to treatment [non-steroidal anti-inflammatory drugs (NSAIDs) and TNF inhibitors].
A total of 682 patients with axSpA who fulfilled the Assessment of Spondyloarthritis International Society (ASAS) classification criteria for axSpA and had received care at St. Vincent’s Hospital, Catholic University of Korea (Suwon, Republic of Korea), between 2005 and 2021 were identified. Clinical and laboratory data and radiographic images were retrieved from medical records. At baseline, sex, age at diagnosis, time since diagnosis, HLA-B27 status, smoking status, and history of extra-articular manifestations (uveitis, psoriasis, inflammatory bowel disease, peripheral arthritis, and enthesitis) were recorded. Disease activity was assessed according to the ankylosing spondylitis disease activity score (ASDAS) using the C-reactive protein (CRP) level. Dose and duration of NSAID intake, TNF inhibitor use, and treatment duration were determined. Records of interleukin (IL)-17 inhibitor use were excluded from this analysis because the number of patients treated with IL-17 inhibitors was too small to train the models. Of the 682 patients identified, 555 underwent radiographic evaluation at more than two time points. Using an age- and sex-matched approach, the dataset was divided into training and testing datasets at a 3:1 ratio, and the training and testing datasets with the highest similarity in follow-up time points and radiographic progression were selected from 1000 simulations. The ML models were trained on the training data and validated on the testing data. In total, 2034 follow-up radiographic time points were identified in the 555 patients with axSpA. We retained only the follow-up radiographic time points of more than 12 months, leaving 1297 and 420 follow-up radiographic time points in the training and testing datasets, respectively. The study was conducted in accordance with the Helsinki Declaration and was approved by the Institutional Review Board of St. Vincent’s Hospital, The Catholic University of Korea (No. VC22RISI0237).
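The splitting procedure described above can be sketched as follows. This is a minimal illustration, not the authors' code: the strata (sex and age decade), the field names, and the similarity criterion (difference in mean follow-up months plus difference in mean mSASSS change) are all assumptions, since the text does not specify how matching or similarity was computed.

```python
import random
from collections import defaultdict
from statistics import mean


def stratified_split(patients, seed, test_fraction=0.25):
    """Split within (sex, age-decade) strata so both sets share a similar mix."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for patient in patients:
        strata[(patient["sex"], patient["age"] // 10)].append(patient)
    train, test = [], []
    for key in sorted(strata):
        members = strata[key][:]
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


def dissimilarity(train, test):
    """Hypothetical criterion: gap in mean follow-up plus gap in mean mSASSS change."""
    gap_followup = abs(mean(p["followup_months"] for p in train)
                       - mean(p["followup_months"] for p in test))
    gap_change = abs(mean(p["msasss_change"] for p in train)
                     - mean(p["msasss_change"] for p in test))
    return gap_followup + gap_change


def best_split(patients, n_simulations=1000):
    """Repeat the random stratified split and keep the most similar pair of sets."""
    best, best_gap = None, float("inf")
    for seed in range(n_simulations):
        train, test = stratified_split(patients, seed)
        if not train or not test:
            continue
        gap = dissimilarity(train, test)
        if gap < best_gap:
            best, best_gap = (train, test), gap
    return best
```

Splitting within strata keeps the age and sex distributions comparable by construction, while the repeated draws address the variables that are not stratified on.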
Radiographs and scoring
Radiographs of the sacroiliac joints and the cervical and lumbar spine were obtained at baseline and after follow-up. All available radiographs per patient were scored simultaneously according to the mSASSS by two experienced readers working independently, blinded to all other data except radiograph chronology. The interobserver reliability was assessed by calculating the intraclass correlation coefficient, which was 0.946 (95% confidence interval [CI] 0.940–0.952). If the difference between the scores measured by the two readers was > 5 units (defined as major disagreement), the same assessors rescored these radiographs. In case of persistent major disagreement after rescoring, an independent adjudicator assigned a final score. Radiographic sacroiliitis (SI) was scored according to the modified New York criteria, and radiological hip involvement was graded based on the Bath Ankylosing Spondylitis Radiology Index (BASRI)-hip scoring system.
Calculation of NSAIDs intake and exposure to TNF inhibitors
Data on NSAID intake (dose and frequency) were retrieved from medical records. An index of NSAID intake, as recommended by the Assessment of SpondyloArthritis International Society (ASAS), accounting for both dose and duration/regimen of drug intake (0: no NSAIDs intake at all; 100: daily NSAIDs intake at a dose equivalent to diclofenac 150 mg over the whole period of interest) was calculated. Exposure to TNF inhibitors was indicated as 0 if the patient did not receive anti-TNF therapy and as duration (months) if the patient was treated with TNF inhibitors.
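As an illustration of how such an index behaves, the sketch below scales a diclofenac-equivalent daily dose by the fraction of days on which the drug was taken. It is a simplified reading built only from the two anchor points stated above (0 and 100); the function name and parameters are hypothetical, and the dose-equivalence table for other NSAIDs in the ASAS recommendation is not reproduced.

```python
def nsaid_index(daily_dose_mg, days_taken, period_days, full_dose_mg=150.0):
    """Simplified NSAID intake index on a 0-100 scale.

    daily_dose_mg is assumed to be already converted to diclofenac-equivalent mg.
    """
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    if not 0 <= days_taken <= period_days:
        raise ValueError("days_taken must lie within the period of interest")
    dose_fraction = daily_dose_mg / full_dose_mg      # 1.0 at diclofenac 150 mg/day
    exposure_fraction = days_taken / period_days      # 1.0 for daily intake
    return 100.0 * dose_fraction * exposure_fraction


# Half the full dose taken on half of the days gives an index of 25.
print(nsaid_index(75.0, 180, 360))
```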
Calculation of time-integrated CRP levels
The inflammatory burden over the disease course was estimated using time-integrated CRP, calculated by the area under the curve method.
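A minimal sketch of an area-under-the-curve calculation, assuming the trapezoidal rule over irregularly spaced visits. The units (months, mg/dL) and the function name are illustrative assumptions; the text does not state which integration rule or units were used.

```python
def time_integrated_crp(months, crp_values):
    """Trapezoidal area under the CRP curve across visits (CRP unit x months)."""
    if len(months) != len(crp_values) or len(months) < 2:
        raise ValueError("need at least two paired measurements")
    area = 0.0
    for i in range(1, len(months)):
        interval = months[i] - months[i - 1]
        if interval <= 0:
            raise ValueError("visit times must be strictly increasing")
        # Area of one trapezoid: interval width times the mean of its two CRP values.
        area += interval * (crp_values[i] + crp_values[i - 1]) / 2.0
    return area


visits = [0, 6, 12]
crp = [2.0, 1.0, 1.0]
auc = time_integrated_crp(visits, crp)
print(auc)                                # 15.0
print(auc / (visits[-1] - visits[0]))     # 1.25, a time-averaged CRP
```

Dividing the area by the observation time, as in the last line, gives a time-adjusted burden that stays comparable between patients with different follow-up lengths.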
Supervised ML algorithms for regression
The scheme of the supervised ML is illustrated in Fig. 1. Two ML models were applied to predict the mSASSS at a specific follow-up time point: a generalized linear model (GLM) and an artificial neural network (ANN) model [25, 26]. The GLM is the simplest ML algorithm, specifying the relationship between a weighted sum of the feature inputs and a single numeric target. An ANN consists of units arranged in layers that convert an input vector into an output. The layers between the input and output layers are called hidden layers. Each unit receives an input, applies a function, and passes the result to the next layer. Weights are applied to the signals passing from one unit to another and are modified during the training phase. Backpropagation allows the model to self-learn. A multi-layered ANN was trained with a backpropagation algorithm; a total of 1000 iterations of the ANN with three, five, seven, or nine hidden layers were simulated, and the model with the highest performance was selected. We used the neuralnet function in the R package neuralnet with the default settings [27, 28]. The neuralnet function uses a globally convergent algorithm (grprop) based on resilient backpropagation without weight backtracking and additionally modifies one learning rate. The logistic function, f(u) = 1/(1 + e^(−u)), a bounded, nondecreasing, nonlinear, and differentiable function, was used as the activation function, and the learning rates in the grprop algorithm are limited to the boundaries from the lower 0.5 to the upper 1.2.
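To make the layer-by-layer computation concrete, the following is a minimal Python sketch of a forward pass with the logistic activation. It is not the neuralnet implementation used in the study: the weight layout is hypothetical, and the linear (identity) output unit is an assumption that is common for regression targets such as the mSASSS.

```python
import math


def logistic(u):
    """Logistic activation f(u) = 1 / (1 + e^(-u)), bounded between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-u))


def forward(inputs, layers):
    """Propagate an input vector through the network.

    layers is a list of (weights, biases); weights[j] holds the incoming
    weights of unit j. Hidden layers use the logistic function and the last
    layer is left linear (an assumption for a regression output).
    """
    activations = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        sums = [
            sum(w * a for w, a in zip(unit_weights, activations)) + bias
            for unit_weights, bias in zip(weights, biases)
        ]
        is_output_layer = index == len(layers) - 1
        activations = sums if is_output_layer else [logistic(s) for s in sums]
    return activations


# One hidden unit and one output unit: logistic(0) = 0.5, then 2 * 0.5 + 1 = 2.0.
network = [([[1.0, 1.0]], [0.0]), ([[2.0]], [1.0])]
print(forward([0.0, 0.0], network))
```

Training then consists of adjusting the weights and biases so that the output matches the observed follow-up score, which is the role of the resilient backpropagation procedure described above.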
Explainable ML model interpretation
Two methods were used to interpret the model: (1) variable importance measured by the model-agnostic method and (2) Shapley additive explanations (SHAP). In the model-agnostic method, if a variable is important, the model’s performance is expected to worsen after the variable’s values are permuted. The greater the drop in performance, the more important the variable. SHAP explains any model’s prediction by computing each feature’s contribution to the prediction [31, 32]. This method is based on Shapley values from coalitional game theory, which are the average marginal contributions across all possible coalitions. The SHAP value of a clinical variable V (e.g., NSAIDs intake index) is computed as the average of this variable’s contributions across all possible combinations of clinical variables, including V. The SHAP value of a clinical variable can be positive or negative, suggesting an increased or decreased likelihood of developing a particular outcome. Our study investigated the impact and interaction among clinical variables by visualizing SHAP values in global (cohort level) forms.
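Both ideas can be illustrated on a toy model. The sketch below computes exact Shapley values by enumerating every coalition and a permutation importance based on the change in RMSE. It is not the iml or SHAP implementation: replacing "absent" variables with a fixed baseline value is a simplifying assumption, and the function names are hypothetical.

```python
import random
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values; variables outside a coalition take the baseline value."""
    n = len(instance)

    def value(coalition):
        mixed = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {i}) - value(coalition))
        contributions.append(total)
    return contributions


def permutation_importance(predict, rows, targets, column, seed=0):
    """Increase in RMSE after shuffling one column; larger means more important."""
    def rmse(data):
        return (sum((predict(r) - t) ** 2 for r, t in zip(data, targets)) / len(targets)) ** 0.5

    shuffled = [row[column] for row in rows]
    random.Random(seed).shuffle(shuffled)
    permuted = [row[:column] + [v] + row[column + 1:] for row, v in zip(rows, shuffled)]
    return rmse(permuted) - rmse(rows)


# Toy model with an interaction between the first and third variables.
def toy_model(x):
    return 2 * x[0] + 3 * x[1] + x[0] * x[2]


print(shapley_values(toy_model, [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]))
```

In the toy model the interaction term is shared equally between the two variables involved, and the contributions sum to the difference between the prediction for the instance and the prediction for the baseline. Exact enumeration grows exponentially with the number of variables, which is why practical tools rely on approximations.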
Evaluation of predictive performance for the regression model
Four metrics were used to evaluate the performance of the regression models: (1) mean squared error (MSE), (2) root mean squared error (RMSE), (3) mean absolute error (MAE), and (4) the coefficient of determination (R2). The MSE of an estimator measures the average squared difference between estimated and true values. The RMSE is the square root of the MSE, a monotonic transformation that expresses the error in the units of the outcome. The MAE is the average of the absolute differences between the observed and predicted values. The coefficient of determination is the proportion of variation in the dependent variable that is predictable from the independent variables.
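The four quantities can be computed from paired observed and predicted values as in the sketch below. The R2 here uses the definition 1 − SS_res/SS_tot, which is an assumption; the text does not state whether this form or the squared correlation was used.

```python
import math


def regression_metrics(observed, predicted):
    """Return MSE, RMSE, MAE, and R2 for paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_observed = sum(observed) / n
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    ss_residual = sum(e * e for e in errors)
    # Undefined when all observed values are identical (ss_total == 0).
    r2 = 1.0 - ss_residual / ss_total
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2}


print(regression_metrics([0, 2, 4, 6], [1, 2, 4, 5]))
```

Because squaring weights large errors more heavily, the RMSE is never smaller than the MAE, and a wide gap between the two points to a few poorly predicted cases.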
For continuously distributed data, the results are shown as means with standard deviation; between-group comparisons were performed using Student’s t-test or analysis of variance (ANOVA). Categorical or dichotomous variables were presented as frequencies and percentages and were compared using the chi-squared test or Fisher’s exact test. Correlation analysis between two continuous variables was performed using Pearson’s method. A two-sided P-value less than 0.05 was considered statistically significant. All statistical analyses were performed using R (version 4.2.0, R Project for Statistical Computing, www.r-project.org).
Baseline characteristics of the study population
In total, 555 patients with axSpA were enrolled and split into training (n = 416, 75%) and testing (n = 139, 25%) groups in an age- and sex-matched stratified manner. The baseline characteristics of the study participants (n = 555) are presented in Table 1. All baseline characteristics, except for a history of enthesitis, were comparable between the groups. In total, 310 patients with axSpA (55.8%) received TNF inhibitors. The numbers of follow-up time points in the training and testing datasets were 1297 and 420, respectively (Fig. 2).
Linear regression models for mSASSS prediction
Known and potential factors affecting radiographic progression were included in formulating the linear regression model: sex, age at diagnosis, disease duration, body mass index (BMI), HLA-B27, peripheral involvement, uveitis, enthesitis, inflammatory bowel disease, psoriasis, smoking, baseline CRP level, baseline ASDAS-CRP, grade of sacroiliitis, grade of hip joint involvement, and baseline mSASSS. Treatment-dependent variables included time after baseline evaluation, time-integrated CRP level, and exposure to TNF inhibitors. If the patient did not receive a TNF inhibitor, exposure to the TNF inhibitor was set to zero. Baseline mSASSS was converted into two formats before being entered into the models: (1) the C-spine and L-spine mSASSS and (2) the number of segments with each mSASSS score (0, 1, 2, and 3). Finally, two GLMs (designated as GLM-1 and GLM-2) were built, one for each mSASSS format, and evaluated separately (Fig. 1).
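The two input formats can be derived from the same segment-level scores, as in this minimal sketch. It assumes the standard mSASSS layout of 12 cervical and 12 lumbar vertebral corners, each scored 0 to 3; the function and key names are hypothetical.

```python
def encode_msasss(cervical_scores, lumbar_scores):
    """Return (regional totals, counts of each score) from 24 segment scores."""
    if len(cervical_scores) != 12 or len(lumbar_scores) != 12:
        raise ValueError("expected 12 cervical and 12 lumbar segment scores")
    all_scores = list(cervical_scores) + list(lumbar_scores)
    if any(score not in (0, 1, 2, 3) for score in all_scores):
        raise ValueError("each segment score must be 0, 1, 2, or 3")
    # Format 1: summed score per spinal region.
    regional = {"c_spine": sum(cervical_scores), "l_spine": sum(lumbar_scores)}
    # Format 2: how many segments carry each score.
    counts = {f"n_score_{k}": all_scores.count(k) for k in range(4)}
    return regional, counts


# Two hypothetical patients with the same total (6) but different damage patterns.
patient_a = encode_msasss([1] * 6 + [0] * 6, [0] * 12)
patient_b = encode_msasss([3, 3] + [0] * 10, [0] * 12)
print(patient_a)
print(patient_b)
```

Both hypothetical patients have a cervical total of 6, yet the count format separates six minor lesions from two bridged segments, information that a summed score discards.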
The prediction results of mSASSS in the testing dataset are shown in Fig. 3. For GLM-1, R2 and RMSE values were 0.9093 and 3.5796, respectively. The most important variables for prediction were baseline mSASSS of the L-spine and C-spine, followed by the time after the initial evaluation. For GLM-2, R2 and RMSE values were 0.9356 and 3.1409, respectively. The number of mSASSS segment scores 0, 1, and 2 were counted as important variables, but the number of mSASSS segment scores 3 was not. The time after the initial evaluation was also an important variable.
ANN models for mSASSS prediction
As with the GLMs, the baseline mSASSS was converted into two formats, and two ANN models (designated as ANN-1 and ANN-2, respectively) were built, one for each mSASSS format. A multi-layered ANN with a backpropagation algorithm and three, five, seven, or nine hidden layers was fitted, and the model with the highest performance was selected. In both ANN-1 and ANN-2, models with five hidden layers showed the best performance by MSE compared to models with three, seven, or nine layers (Fig. 4A).
For ANN-1 with five hidden layers, the R2 and RMSE values were 0.9468 and 2.9943, respectively (Fig. 4B). The two most important variables for prediction were the same as for GLM-1 (baseline mSASSS of the L-spine and C-spine), but ANN-1 showed better performance than GLM-1. The third most important variable was the time after the initial evaluation in GLM-1, whereas it was a positive history of uveitis in ANN-1.
For ANN-2, with five hidden layers, the R2 and RMSE values were 0.9537 and 2.8358, respectively (Fig. 4C). This model showed the best performance. The number of mSASSS segment scores of 3 and 2 were considered the most important variables, followed by the number of mSASSS segment scores of 0 and 1. Time after the initial evaluation, history of uveitis, and smoking status were also important variables. Exposure time to TNF inhibitors was identified as having some contribution to ANN-2.
Figure 5 shows the SHAP summary plot for the top 10 features contributing to the ANN-2 model’s prediction of follow-up mSASSS in patients with axSpA. The absence of scores of 3 and 2 in the mSASSS segments (i.e., no bridged syndesmophytes) and zero scores for all 24 segments in the mSASSS (i.e., total mSASSS = 0) at baseline evaluation strongly pushed the predicted mSASSS change downward. Short-term follow-up (indicated as 13 months after the initial evaluation in this analysis) also lowered the predicted increase in mSASSS. Smoking and being overweight (indicated as BMI = 30.6 kg/m2) contributed to an increase in the predicted mSASSS at follow-up. Overall, the contribution of minor factors was detected more distinctly by the ANN than by the GLM.
When subdivided into three subgroups by follow-up time points (less than 2 years, 2–4 years, and over 4 years), the RMSE tended to decrease as the follow-up time increased (Table 2). The RMSE was much smaller in patients without syndesmophytes at baseline than in those with syndesmophytes at baseline (Table 2).
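A subgroup error analysis of this kind can be sketched as follows. The bin boundaries in months (fewer than 24, 24 to 48, more than 48) and the record layout are assumptions made for illustration; the values in the example are invented and are not study data.

```python
import math
from collections import defaultdict


def followup_group(months):
    """Assign a follow-up time point to one of three assumed bins."""
    if months < 24:
        return "<2 years"
    if months <= 48:
        return "2-4 years"
    return ">4 years"


def rmse_by_group(records):
    """records holds (follow-up months, observed mSASSS, predicted mSASSS) tuples."""
    squared_errors = defaultdict(list)
    for months, observed, predicted in records:
        squared_errors[followup_group(months)].append((observed - predicted) ** 2)
    return {group: math.sqrt(sum(v) / len(v)) for group, v in squared_errors.items()}


# Invented example records, for illustration only.
example = [(13, 10, 13), (18, 5, 1), (30, 8, 8), (60, 20, 22)]
print(rmse_by_group(example))
```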
In this study, we demonstrated the feasibility of ML models in predicting mSASSS using baseline clinical characteristics and treatment-dependent variables, which were obtained in clinical practice but were quite diverse. The mSASSS was predictable beyond the limit of the simplified binary definition of radiographic progression in 2 years. The performance was excellent in that the R2 values of the fitted models were in the range of 0.91–0.95. In particular, the ANN performed better than the GLM and effectively captured the complex interactions among variables and their contributions to the transition of mSASSS over time in the fitted models.
In our analysis, entering the mSASSS as the number of segments with each score gave better predictive power than entering it as a total score, indicating that fractionized scoring data are more suitable for building the mSASSS prediction model than a single summed value. Radiographic damage in axSpA progresses linearly at a variable rate and is scored in the range of 0–72 by mSASSS. Each score (0, 1, 2, and 3) represents its own structural abnormalities and a distinct pathophysiological background. Even the same total score could indicate different structural damage. Each lesion could also respond differently to treatment according to the adjacent internal tissue status, such as fat deposition or metaplasia on MRI [35, 36]. The total score of the mSASSS and the dichotomous definition of radiographic progression are useful for easy recognition and prompt assessment of spinal structural damage in clinical practice. However, they might be too simplified to represent the actual condition. Categorizing continuous variables by an arbitrary cutoff point can lead to the loss of important information or to overestimation or underestimation. The presence of syndesmophyte(s) at baseline was a powerful established predictor for radiographic progression within 2 years [1, 38]. However, total mSASSS was a more important feature for predicting radiographic progression than the presence of syndesmophyte(s) in most ML algorithms, which effectively deal with high-dimensional complex data, including multiple heterogeneous factors contributing to the disease [39, 40]. More detailed and fragmented data could be more informative for building a predictive model with better performance in ML processing.
In the ANN, models with five hidden layers showed the best performance compared to models with three, seven, or nine layers. This indicates that a deeper ANN did not necessarily perform better. Simple algorithms can perform just as well as or even better than more complex ones in some circumstances: when the underlying relationship between features and output is simple and additive or when the number of training examples is relatively low. In such settings, more complex models are likely to overfit and generalize poorly. Clinical data are not as highly complex as radiographic images, magnetic resonance images, or multi-omics data and might not suit a deeper or more sophisticated ANN [39, 40]. In the subgroup analysis, the mSASSS prediction was more accurate with a longer follow-up period or in the absence of syndesmophytes at baseline. The short-term follow-up data may not have been sufficiently learned because there were relatively few data points (Fig. 2A); moreover, the complexity of the interaction between variables could be lower in the long-term stable stage. Syndesmophytes result from new bone formation that develops after an initiating inflammatory event [42, 43]. The presence of syndesmophyte(s) indicates that the bone-forming potential might exceed the control threshold of inflammation. In our analysis, laboratory data and treatment strategies largely depended on the inflammatory process (e.g., NSAID intake index, exposure to TNF inhibitor, and time-integrated CRP levels) and did not include any specific information regarding bone formation such as bone formation biomarkers and sequential MRI findings. Thus, the decreased accuracy of mSASSS prediction in the presence of syndesmophyte(s) might be attributable to insufficient information.
The ANN showed better performance than the GLM and better discerned the complex interactions among variables and their contributions to the outcome. Radiographic progression is structural damage that results from the aggregated interactions between clinical, molecular, and environmental factors and cannot be fully explained by simple and additive models. ANNs have traditionally raised a concern known as the black-box problem: the problem-solving process in artificial intelligence is opaque and not interpretable to humans in a straightforward manner. Feature importance and SHAP analyses are solutions from the field of explainable ML and are used to gain insight into model performance and the contribution of various risk factors. Structural changes constituting the mSASSS scoring system were the most important contributing factors, and the absence of detectable structural abnormalities at baseline was the most significant factor suppressing the mSASSS change. This finding corroborates the importance of early diagnosis and initiation of effective treatment before spinal structural changes begin.
This study had some limitations. First, the data were retrospectively collected. Retrospective data collection is susceptible to misclassification and information bias. Second, this study lacked bone formation markers or MRI findings, which could be informative for new bone formation in axSpA. Third, mSASSS has inherent limitations: it cannot assess involvement of the thoracic spine and facet joints, which are the most frequently affected sites in axSpA.
In conclusion, interventions that slow or halt the progression of irreversible structural damage in axSpA are expected to confer clinical benefits in terms of delaying loss of function and improving the quality of life. Correct estimation of disease status and prediction of treatment response should help in evaluating treatment and planning future care. Our study showed that constructing predictive models for radiographic progression is feasible in a real-world setting and that the models displayed good performance. Prospective studies examining the use of ML in mSASSS prediction in a larger multicenter cohort are needed to validate the use of such models. The discovery of clinically active biomarker(s) of new bone formation and the development of exact assessment tools could also boost the development of a better predictive model for radiographic progression in axSpA.
Availability of data and materials
The data underlying this article cannot be shared publicly for the protection of the privacy of individuals that participated in the study. The data may be shared upon reasonable request to the corresponding author.
Abbreviations
ANN: Artificial neural network
ANOVA: Analysis of variance
ASAS: Assessment of Spondyloarthritis International Society
ASDAS: Ankylosing spondylitis disease activity score
BASRI: Bath Ankylosing Spondylitis Radiology Index
BMI: Body mass index
GLM: Generalized linear model
MAE: Mean absolute error
MRI: Magnetic resonance imaging
mSASSS: Modified Stoke Ankylosing Spondylitis Spinal Score
MSE: Mean squared error
NSAIDs: Non-steroidal anti-inflammatory drugs
R2: The coefficient of determination
RMSE: Root mean squared error
SHAP: Shapley additive explanations
TNF: Tumor necrosis factor
Sieper J, Braun J, Dougados M, Baeten D. Axial spondyloarthritis. Nat Rev Dis Primers. 2015;1:15013.
Sieper J, Poddubnyy D. Axial spondyloarthritis. Lancet. 2017;390(10089):73–84.
Poddubnyy D, Haibel H, Listing J, Marker-Hermann E, Zeidler H, Braun J, Sieper J, Rudwaleit M. Baseline radiographic damage, elevated acute-phase reactant levels, and cigarette smoking status predict spinal radiographic progression in early axial spondylarthritis. Arthritis Rheum. 2012;64(5):1388–98.
Baraliakos X, Listing J, von der Recke A, Braun J. The natural course of radiographic progression in ankylosing spondylitis–evidence for major individual variations in a large proportion of patients. J Rheumatol. 2009;36(5):997–1002.
Baraliakos X, Listing J, von der Recke A, Braun J. The natural course of radiographic progression in ankylosing spondylitis: differences between genders and appearance of characteristic radiographic features. Curr Rheumatol Rep. 2011;13(5):383–7.
Baraliakos X, Gensler LS, D’Angelo S, Iannone F, Favalli EG, de Peyrecave N, Auteri SE, Caporali R. Biologic therapy and spinal radiographic progression in patients with axial spondyloarthritis: a structured literature review. Ther Adv Musculoskelet Dis. 2020;12:1759720x20906040.
Wang R, Bathon JM, Ward MM. Nonsteroidal antiinflammatory drugs as potential disease-modifying medications in axial spondyloarthritis. Arthritis Rheumatol. 2020;72(4):518–28.
Karmacharya P, Duarte-Garcia A, Dubreuil M, Murad MH, Shahukhal R, Shrestha P, Myasoedova E, Crowson CS, Wright K, Davis JM 3rd. Effect of therapy on radiographic progression in axial spondyloarthritis: a systematic review and meta-analysis. Arthritis Rheumatol. 2020;72(5):733–49.
Ramiro S, Stolwijk C, van Tubergen A, van der Heijde D, Dougados M, van den Bosch F, Landewé R. Evolution of radiographic damage in ankylosing spondylitis: a 12 year prospective follow-up of the OASIS study. Ann Rheum Dis. 2015;74(1):52–9.
van Tubergen A, Ramiro S, van der Heijde D, Dougados M, Mielants H, Landewe R. Development of new syndesmophytes and bridges in ankylosing spondylitis and their predictors: a longitudinal study. Ann Rheum Dis. 2012;71(4):518–23.
Sari I, Lee S, Tomlinson G, Johnson SR, Inman RD, Haroon N. Factors predictive of radiographic progression in ankylosing spondylitis. Arthritis Care Res (Hoboken). 2021;73(2):275–81.
Joo YB, Baek IW, Park YJ, Park KS, Kim KJ. Machine learning-based prediction of radiographic progression in patients with axial spondyloarthritis. Clin Rheumatol. 2020;39(4):983–91.
Park JW, Kim MJ, Lee JS, Ha YJ, Park JK, Kang EH, Lee YJ, Song YW, Lee EY. Impact of tumor necrosis factor inhibitor versus nonsteroidal antiinflammatory drug treatment on radiographic progression in early ankylosing spondylitis: its relationship to inflammation control during treatment. Arthritis Rheumatol. 2019;71(1):82–90.
Sepriano A, Ramiro S, Wichuk S, Chiowchanwisawakit P, Paschke J, van der Heijde D, Landewé R, Maksymowych WP. Tumor necrosis factor inhibitors reduce spinal radiographic progression in patients with radiographic axial spondyloarthritis: a longitudinal analysis from the alberta prospective cohort. Arthritis Rheumatol. 2021;73(7):1211–9.
van der Heijde D, Braun J, Deodhar A, Baraliakos X, Landewé R, Richards HB, Porter B, Readie A. Modified stoke ankylosing spondylitis spinal score as an outcome measure to assess the impact of treatment on structural progression in ankylosing spondylitis. Rheumatology (Oxford). 2019;58(3):388–400.
Joo YB, Baek IW, Park KS, Tagkopoulos I, Kim KJ. Novel classification of axial spondyloarthritis to predict radiographic progression using machine learning. Clin Exp Rheumatol. 2021;39(3):508–18.
Rudwaleit M, van der Heijde D, Landewe R, Listing J, Akkoc N, Brandt J, Braun J, Chou CT, Collantes-Estevez E, Dougados M, et al. The development of Assessment of SpondyloArthritis international Society classification criteria for axial spondyloarthritis (part II): validation and final selection. Ann Rheum Dis. 2009;68(6):777–83.
Molto A, Gossec L, Meghnathi B, Landewe RBM, van der Heijde D, Atagunduz P, Elzorkany BK, Akkoc N, Kiltz U, Gu J, et al. An Assessment in SpondyloArthritis International Society (ASAS)-endorsed definition of clinically important worsening in axial spondyloarthritis based on ASDAS. Ann Rheum Dis. 2018;77(1):124–7.
Creemers MC, Franssen MJ, van’t Hof MA, Gribnau FW, van de Putte LB, van Riel PL. Assessment of outcome in ankylosing spondylitis: an extended radiographic scoring system. Ann Rheum Dis. 2005;64(1):127–9.
van der Linden S, Valkenburg HA, Cats A. Evaluation of diagnostic criteria for ankylosing spondylitis. A proposal for modification of the New York criteria. Arthritis Rheum. 1984;27(4):361–8.
MacKay K, Brophy S, Mack C, Doran M, Calin A. The development and validation of a radiographic grading system for the hip in ankylosing spondylitis: the bath ankylosing spondylitis radiology hip index. J Rheumatol. 2000;27(12):2866–72.
Dougados M, Simon P, Braun J, Burgos-Vargas R, Maksymowych WP, Sieper J, van der Heijde D. ASAS recommendations for collecting, analysing and reporting NSAID intake in clinical trials/epidemiological studies in axial spondyloarthritis. Ann Rheum Dis. 2011;70(2):249–51.
Matthews JN, Altman DG, Campbell MJ, Royston P. Analysis of serial measurements in medical research. BMJ. 1990;300(6719):230–5.
Arnold KF, Davies V, de Kamps M, Tennant PWG, Mbotwa J, Gilthorpe MS. Reflection on modern methods: generalized linear models for prognosis and intervention—theory, practice and implications for machine learning. Int J Epidemiol. 2020;49(6):2074–82.
LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436–44.
Molnar C, Casalicchio G, Bischl B. iml: an R package for interpretable machine learning. J Open Source Software. 2018;3(26):786.
Günther F, Fritsch S. Neuralnet: training of neural networks. R J. 2010;2(1):30.
Magoulas GD, Plagianakos VP, Vrahatis MN. Globally convergent algorithms with local learning rates. IEEE Trans Neural Networks. 2002;13(3):774–9.
Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.
Lundberg SM, Lee S-I. A unified approach to interpreting model predictions. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R, editors. Adv Neural Inf Process Syst vol. 30. 2017. p. 4765–74.
Molnar C. Interpretable machine learning: Lulu.com. 2020.
Shapley LS, Roth AE. The Shapley value: essays in honor of Lloyd S. Shapley: Cambridge University Press; 1988.
Steurer M, Hill RJ, Pfeifer N. Metrics for evaluating the performance of machine learning based automated valuation models. J Propert Res. 2021;38(2):99–129.
Chiowchanwisawakit P, Lambert RG, Conner-Spady B, Maksymowych WP. Focal fat lesions at vertebral corners on magnetic resonance imaging predict the development of new syndesmophytes in ankylosing spondylitis. Arthritis Rheum. 2011;63(8):2215–25.
Baraliakos X, Kruse S, Auteri SE, de Peyrecave N, Nurminen T, Kumke T, Hoepken B, Braun J. Certolizumab pegol treatment in axial spondyloarthritis mitigates fat lesion development: 4-year post-hoc MRI results from a phase 3 study. Rheumatology (Oxford). 2022;61(7):2875–85.
Ranganathan P, Pramesh CS, Aggarwal R. Common pitfalls in statistical analysis: logistic regression. Perspect Clin Res. 2017;8(3):148–51.
Baraliakos X, Listing J, Rudwaleit M, Haibel H, Brandt J, Sieper J, Braun J. Progression of radiographic damage in patients with ankylosing spondylitis: defining the central role of syndesmophytes. Ann Rheum Dis. 2007;66(7):910–5.
Kim KJ, Tagkopoulos I. Application of machine learning in rheumatic disease research. Korean J Intern Med. 2019;34(4):708–22.
Kingsmore KM, Puglisi CE, Grammer AC, Lipsky PE. An introduction to machine learning and analysis of its use in rheumatic diseases. Nat Rev Rheumatol. 2021;17(12):710–30.
Deo RC. Machine learning in medicine. Circulation. 2015;132(20):1920–30.
Poddubnyy D, Sieper J. Mechanism of new bone formation in axial spondyloarthritis. Curr Rheumatol Rep. 2017;19(9):55.
Maksymowych WP, Elewaut D, Schett G. Motion for debate: the development of ankylosis in ankylosing spondylitis is largely dependent on inflammation. Arthritis Rheum. 2012;64(6):1713–9.
We would like to thank Editage (www.editage.co.kr) for English language editing.
The authors have not declared a specific grant for this research from any funding agency in the public, commercial, or not-for-profit sectors.
Ethics approval and consent to participate
This study was approved by the Institutional Review Board of St. Vincent’s Hospital, The Catholic University of Korea (No. VC22RISI0237). The requirement for informed consent was waived due to the retrospective design of the study.
Consent for publication
The authors declare no competing interests.
Baek, IW., Jung, S.M., Park, YJ. et al. Quantitative prediction of radiographic progression in patients with axial spondyloarthritis using neural network model in a real-world setting. Arthritis Res Ther 25, 65 (2023). https://doi.org/10.1186/s13075-023-03050-6