
Machine learning identification of thresholds to discriminate osteoarthritis and rheumatoid arthritis synovial inflammation



We sought to identify features that distinguish osteoarthritis (OA) and rheumatoid arthritis (RA) hematoxylin and eosin (H&E)-stained synovial tissue samples.


We compared fourteen pathologist-scored histology features and computer vision-quantified cell density (147 OA and 60 RA patients) in H&E-stained synovial tissue samples from total knee replacement (TKR) explants. A random forest model was trained with disease state (OA vs RA) as the class label and histology features and/or computer vision-quantified cell density as inputs.


Synovium from OA patients had increased mast cells and fibrosis (p < 0.001), while synovium from RA patients exhibited increased lymphocytic inflammation, lining hyperplasia, neutrophils, detritus, plasma cells, binucleate plasma cells, sub-lining giant cells, fibrin (all p < 0.001), Russell bodies (p = 0.019), and synovial lining giant cells (p = 0.003). Fourteen pathologist-scored features allowed for discrimination between OA and RA, producing a micro-averaged area under the receiver operating characteristic curve (micro-AUC) of 0.85±0.06. This discriminatory ability was comparable to that of computer vision cell density alone (micro-AUC = 0.87±0.04). Combining the pathologist scores with the cell density metric improved the discriminatory power of the model (micro-AUC = 0.92±0.06). The optimal cell density threshold to distinguish OA from RA synovium was 3400 cells/mm², which yielded a sensitivity of 0.82 and specificity of 0.82.


H&E-stained images of TKR explant synovium can be correctly classified as OA or RA in 82% of samples. A cell density threshold of 3400 cells/mm² (higher in RA) and the presence of mast cells and fibrosis (more common in OA) are the most important features for making this distinction.


Joint damage in the knee can be severe in both osteoarthritis (OA) and rheumatoid arthritis (RA) such that total knee replacement (TKR) is often the only management option [1]. More than 700,000 TKRs are performed annually in the USA, and explanted tissue is often stained with hematoxylin and eosin (H&E) and evaluated by a pathologist as the standard of care. The physical exam of the knees of patients with OA can be similar to that of patients with RA, that is, both conditions can be characterized by joint swelling, warmth, and effusion. Pathology reports regarding the extent of synovial inflammation can be another useful piece of information for the managing clinician to discriminate ongoing RA-related disease activity from coincident primary OA in patients with longstanding RA. Therefore, establishing a precise level of synovial tissue inflammation for future investigators could provide a fast, inexpensive, and clinically meaningful benchmark for patient assessment.

Several investigators have sought to optimize methods to score synovial inflammation using H&E-stained synovial tissue samples to distinguish OA from RA. For example, Krenn et al. [2,3,4] developed a widely cited scoring algorithm that includes semi-quantitative assessments of three synovial features identifiable on H&E-stained synovium: inflammatory infiltrates, lining hyperplasia, and stromal activation, a measure of cellularity that encompasses fibroblasts, endothelial cells, and giant cells. It is challenging to distinguish macrophages from fibroblasts in H&E-stained images, and as a result, some groups have modified the Krenn scoring system, adopting assessments of inflammatory infiltrates and lining hyperplasia, but not stromal activation, to score synovitis [5]. In an effort to further improve sensitivity and specificity, assessments of five immunohistochemistry-stained features (CD31, CD3, CD68, CD20, and Ki67) were recently added to the Krenn score [6]. Since immunohistochemistry is not as widely available and is more expensive than H&E, our group has been studying whether assessing additional histological features in H&E-stained sections, such as plasma cells, Russell bodies, binucleate plasma cells, neutrophils, mast cells, and lining and sub-lining giant cells as well as extracellular features such as fibrin, detritus, fibrosis, and mucoid degeneration, might be useful for discriminating various types of synovial inflammation. We previously reported that plasma cells, binucleate plasma cells, Russell bodies, fibrin, neutrophils, and synovial lining giant cells were predictive of high inflammatory gene expression subsets in RA [7].

Another challenge in using semi-quantitative assessments of synovitis is the disagreement between human pathologist scores of the same sample due to the subjective grading of synovial features. Since synovial inflammation tends to be patchy, it is likely that one source of inter- and intra-rater variability is that human pathologists make a subjective choice to assess certain high-power fields in any given whole slide image. Automated computer vision quantification of cell density on whole slide images removes the requirement for subjective selection of a field of interest, is reproducible, is scalable because it does not require the technical expertise of a pathologist, and captures granular information about the number of cells in a synovial sample, which is onerous for a pathologist to count manually. We previously developed and validated a computer vision algorithm to automatically count each cell nucleus in an H&E-stained synovial whole slide image in 170 RA patient synovial samples [8]. This algorithm uses classical computer vision techniques to identify synovial tissue and nuclei and yields a value of cell density, defined as the mean stained nuclei count per mm² of tissue. Using this approach, we found that mean whole slide image synovial cell density in RA is strongly correlated with human pathologist scores and with the inflammatory subset defined by bulk tissue RNA-seq gene expression. We hypothesized that computer vision quantification of cell density, in addition to human pathologist scores, would be useful in discriminating OA from RA. Here, we employed machine learning to calculate optimal thresholds to discriminate OA from RA-related synovial inflammation using human pathologist scores of fourteen histology features as well as computer vision quantification of mean cell density in a cohort of 147 OA patients and 60 RA patients undergoing knee arthroplasty.


Study design and cohort

We compared knee synovial histologic features from two different cohorts of patients undergoing TKR for OA or RA at a high-volume, tertiary care hospital. This was a secondary analysis of OA and RA patients that were identified via electronic medical records or physician referral and enrolled during their preoperative screening visit.

The OA patients were enrolled in the OA subtypes cohort from November 2018 through October 2019. Patients over the age of 45 that met ACR Clinical/Radiographic Criteria, ACR Clinical/Laboratory Criteria [9], or Kellgren-Lawrence (KL) Radiographic Criteria (grades 2–4) for knee OA [9, 10] were included in the study. Patients who had a fracture in the operative knee, a diagnosis of a systemic rheumatic disease such as RA, or any disease other than OA as an indication for TKR were excluded from the study. In addition, three patients were excluded from the study sample after TKR because the pathologist assessment of the arthroplasty explant revealed a rheumatic disease diagnosis masked as OA.

As previously described, RA patients were enrolled in the RA Perioperative FLARE Study from October 2013 to October 2021 [7, 11, 12]. Inclusion criteria for this cohort were patients above the age of 18 who met the American College of Rheumatology (ACR)/European League Against Rheumatism 2010 classification criteria for RA [13] and/or the ACR 1987 criteria for RA [14]. Patients who had any other systemic rheumatic disease or crystalline arthropathy were excluded.

Written informed consent was obtained for all participants. Patients meeting the inclusion/exclusion criteria were enrolled in the respective OA and RA cohorts. Demographic characteristics such as patient age, race, sex, and body mass index (BMI) were collected. Erythrocyte sedimentation rate (ESR), C-reactive protein (CRP), rheumatoid factor (RF), and cyclic citrullinated peptide (CCP) were measured on all OA and RA patients. RF and CCP were measured as part of the standard of care in RA patients, or if unavailable, were performed by serum ELISA as in OA patients.

As per institutional policy, ethical approval for this study was provided by the Institutional Review Board at the Hospital for Special Surgery (IRB #2018-0895 and #2014-233), and the research was performed in accordance with the relevant guidelines and regulations. The study methods and results are described in accordance with the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines for cohort studies [15].

Tissue processing and histologic scoring

Synovial samples were obtained intra-operatively from 147 OA patients and 60 RA patients. As per the study protocol, orthopedic surgeons were requested to preferentially obtain a research sample from grossly abnormal-looking synovium. Tissue for histological examination was chosen by a pathologist on the basis of gross features including the smoothness and granularity of the synovial surface, red or brown discoloration, and the clarity, dullness, or opacity of the synovial layer, preferentially avoiding regions of electro-cautery effect.

Synovial samples were preferentially obtained from the most grossly inflamed (dull and opaque) area of the synovium. If there was no obviously inflamed synovium, samples were obtained from standard locations: the femoral aspects of the medial and lateral gutters and the central supratrochlear region of the suprapatellar pouch. OA synovial tissue samples were formalin-fixed and paraffin-embedded, and the RA tissues were fresh-frozen in optimal cutting temperature compound. Each tissue biopsy was sectioned at 5-μm thickness and stained with Harris-modified hematoxylin solution and eosin Y (H&E) manufactured by Epredia in Kalamazoo, MI. An expert musculoskeletal pathologist (ED) scored fourteen synovial histologic features in a single section for each patient: lymphocytic inflammation, mucoid change, fibrosis, fibrin, germinal centers, lining hyperplasia, neutrophils, detritus, plasma cells, binucleated plasma cells, Russell bodies, sub-lining giant cells, synovial lining giant cells, and mast cells. Detailed methods for scoring these features are included in the Appendix, some of which are described in prior studies [8] and available at

Computer vision analysis of cell density

Pathology slides were digitized using an Aperio AT Turbo Scanner manufactured by Leica Biosystems in Deer Park, IL, USA, with a 20× resolution of the whole slide image. As previously described [8], we applied computer vision techniques on the whole slide images to count the cell nuclei and quantify the amount of tissue present. The whole slide images were deconstructed into smaller image tiles, each covering an area of approximately 0.25 mm2. These tiles were transformed into grayscale, analyzed for different intensity levels, and assigned a metric based on the proportion of the tile determined to contain tissue. Using a combination of techniques—including Otsu’s method [16], the watershed algorithm, and local adaptive thresholding—the cell nuclei were isolated from the tissue within the image. Final nuclei counts were refined using shape filtering and nuclei density was calculated by normalizing the total count of individual nuclei by the tissue area. This method yields a continuous value of mean cell count per mm2 of tissue. Pre-processing the whole slide image into tiles takes an average of 40 min, which enables the computation of nuclei density in under a minute. The open-access code can be downloaded here: See Fig. 1 for representative histological images of varying nuclei densities.
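The final normalization step described above can be sketched as follows. This is a minimal illustration and not the authors' released code: the `Tile` record, the function name `nuclei_density_per_mm2`, the example counts, and the choice to weight each tile's area by its tissue fraction are all assumptions. Only the approximate tile area of 0.25 mm² comes from the text.

```python
from dataclasses import dataclass
from typing import Iterable

TILE_AREA_MM2 = 0.25  # approximate tile area stated in the text


@dataclass
class Tile:
    nuclei_count: int       # nuclei detected in this tile (hypothetical field)
    tissue_fraction: float  # proportion of the tile judged to contain tissue, 0..1


def nuclei_density_per_mm2(tiles: Iterable[Tile],
                           tile_area_mm2: float = TILE_AREA_MM2) -> float:
    """Total nuclei count divided by total tissue area across all tiles."""
    tiles = list(tiles)
    total_nuclei = sum(t.nuclei_count for t in tiles)
    # Assumption: tissue area of a tile = tile area x tissue fraction.
    tissue_area = sum(t.tissue_fraction * tile_area_mm2 for t in tiles)
    if tissue_area <= 0:
        raise ValueError("no tissue detected on the slide")
    return total_nuclei / tissue_area


# Made-up slide with three tiles; the last tile contains no tissue.
slide = [Tile(800, 1.0), Tile(450, 0.5), Tile(0, 0.0)]
density = nuclei_density_per_mm2(slide)
```

Pooling counts and areas over the whole slide, rather than averaging per-tile densities, keeps sparsely covered tiles from dominating the result.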

Fig. 1 Representative images of varying nuclei densities

Data analysis

Demographic characteristics of the OA and RA patients are reported as frequencies, means, standard deviations (SD), medians, and interquartile ranges (IQR). Chi-square tests were used to compare the fourteen pathologist-graded histology scores between OA and RA patients. Logistic regression models with disease state (OA vs RA) as the outcome were used to assess fibrosis and mast cell scores after adjusting for lymphocytic infiltrates.
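As a worked illustration of a chi-square comparison between the two cohorts, the sketch below computes a Pearson chi-square test for a 2×2 table using only the standard library. The counts are back-calculated from the percentages of female patients reported in the patient characteristics (50 of 60 RA, 90 of 147 OA) and are therefore an assumption. The source does not state which test produced its reported p-value for sex, so any agreement is illustrative only. The function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom, the upper-tail
    probability of the chi-square distribution equals erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: RA, OA. Columns: female, male. Counts inferred from percentages.
statistic, p_value = chi_square_2x2(50, 10, 90, 57)
```

The multi-level histology scores in the study would need the general r×c form of the test; the 2×2 case is shown because its p-value has a closed form.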

Supervised machine learning analysis

A supervised machine learning model was built to classify OA vs RA samples using random forests (Fig. 2). The model inputs were either all fourteen pathologist scores, the computer vision score alone, or both sets of scores combined. Models were selected according to the area under the receiver operating characteristic curve (AUC). The random forest hyperparameters we tuned were the number of trees and the depth of each tree, which were optimized with a nested 5-fold cross-validation process (5-fold for the outer loop and 5-fold for the inner loop) [17] from candidate values [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200] and [5, 6, 8, 10, 12, 14, 16, 18, 20], respectively. The outer loop separated the data into 5 equal folds with stratified partitioning. In each iteration, one fold was used for testing and the remaining 4 folds for training. A second 5-fold cross-validation was then performed on the training set to estimate the optimal hyperparameters. The final results were reported as macro-AUC and micro-AUC on the testing data. For micro-AUC, we computed the AUC of each fold and reported the mean and standard deviation (SD) across folds. For macro-AUC, we concatenated the testing-data predictions from all folds and computed a single AUC [18]. Nested cross-validation of this kind helps obtain a robust estimate of the model's generalization performance [17].
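The nested cross-validation procedure can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the study's code: the data are synthetic (`make_classification`), the hyperparameter grid is shortened to keep the run fast, the random seeds are arbitrary, and the pooled AUC reflects one reading of the macro-AUC definition, in which test-fold predictions are concatenated before a single AUC is computed.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic stand-in: 207 samples, 15 features, roughly 29% positive class.
X, y = make_classification(n_samples=207, n_features=15, n_informative=6,
                           weights=[0.71, 0.29], random_state=0)

# Shortened grid (assumption); the text lists 20 tree counts and 9 depths.
param_grid = {"n_estimators": [10, 50], "max_depth": [5, 10]}

outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
fold_aucs, pooled_true, pooled_score = [], [], []

for train_idx, test_idx in outer.split(X, y):
    # Inner loop: choose hyperparameters using the training folds only.
    inner = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
    search = GridSearchCV(RandomForestClassifier(random_state=0),
                          param_grid, scoring="roc_auc", cv=inner)
    search.fit(X[train_idx], y[train_idx])

    # Outer loop: score the refit best model on the held-out fold.
    scores = search.predict_proba(X[test_idx])[:, 1]
    fold_aucs.append(roc_auc_score(y[test_idx], scores))
    pooled_true.extend(y[test_idx])
    pooled_score.extend(scores)

# "micro-AUC" in the paper's terminology: mean and SD of per-fold AUCs.
micro_auc_mean = float(np.mean(fold_aucs))
micro_auc_sd = float(np.std(fold_aucs))  # population SD (ddof=0), an assumption
# "macro-AUC" in the paper's terminology: one AUC over pooled predictions.
macro_auc = float(roc_auc_score(pooled_true, pooled_score))
```

Because the held-out fold never participates in hyperparameter selection, the outer-loop AUCs are not inflated by tuning.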

Fig. 2 Overview of the analysis pipeline. OA osteoarthritis, RA rheumatoid arthritis, AUC area under receiver operating characteristic curves. Created with

Additionally, to determine the discriminative power of each individual pathology feature in distinguishing OA vs RA, we treated the feature values themselves as prediction scores for generating the receiver operating characteristic (ROC) curve, based on which the AUC value was calculated. Then, to determine the optimal threshold for a given feature to distinguish OA vs RA, Youden’s J statistic was calculated to obtain the optimal point on the ROC curve, the optimal threshold, sensitivity, and specificity [19]. Finally, feature importance was calculated for the model combining all fourteen pathologist scores and computer vision-generated cell density.
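The threshold search with Youden's J statistic can be sketched in plain Python. The function name `youden_threshold`, the rule that a sample is called positive when its value is at or above the threshold, and the ten density values are all hypothetical and chosen only to make the example easy to follow. Here RA is coded as 1 and OA as 0.

```python
from typing import List, Tuple


def youden_threshold(values: List[float],
                     labels: List[int]) -> Tuple[float, float, float, float]:
    """Return (threshold, sensitivity, specificity, J) maximizing
    J = sensitivity + specificity - 1 over all observed values.

    A sample is predicted positive when value >= threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    best = None
    for t in sorted(set(values)):
        tp = sum(1 for v, l in zip(values, labels) if l == 1 and v >= t)
        tn = sum(1 for v, l in zip(values, labels) if l == 0 and v < t)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[3]:
            best = (t, sens, spec, j)
    return best


# Made-up cell densities (cells/mm^2): five OA (0) and five RA (1) samples.
densities = [2500, 2800, 3000, 3300, 3600, 3450, 3500, 4200, 4600, 5000]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
threshold, sensitivity, specificity, j_stat = youden_threshold(densities, labels)
```

For a feature that is higher in OA than in RA, such as fibrosis or mast cells, the same search applies after negating the values or swapping the class coding.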

A p-value less than 0.05 was considered statistically significant. Python 3.6 with Scikit-Learn 0.24.2 was used for the machine learning analysis, Python Scikit-image 0.17.2 was used for the computer vision analysis, and Stata version 14.0 was used for descriptive statistics and logistic regression models [20].


Patient characteristics

A total of 147 OA patients and 60 RA patients were included in the analysis (Table 1). A greater proportion of RA patients were female (83.3%) compared to OA patients (61.2%) (p = 0.002). OA patients had a higher median BMI than RA patients (p = 0.006). More RA patients reported a history of cigarette use (60.0%) than OA patients (37.4%) (p = 0.003). RA patients had elevated CRP values compared to OA (p < 0.001). The median duration since diagnosis was lower in OA patients compared to RA patients (p = 0.032). We measured serum RF and CCP by ELISAs on all OA patients and found that no OA patients in our cohort harbored CCP antibodies and only one had RF positivity (2.5 times upper limit of normal) without any signs and symptoms of RA. A total of 50.0% of the RA patients had positive RF, and 78.4% had positive anti-CCP.

Table 1 Patient characteristics

Comparison of OA and RA synovial histologic features

Fibrosis (p < 0.001) and mast cell presence (p < 0.001) were significantly more common in OA (Table 2). In fact, these two features were almost universally present in OA (95.2% and 99.3%, respectively). There was no statistically significant difference in mucoid change (which was common in both diseases) and germinal centers (which were very rare in both diseases) between patients with OA and RA. To test the hypothesis that fibrosis and mast cells were more commonly observed in OA because there are fewer lymphocytic infiltrates in OA than RA and these features are thus more easily observed, we ran adjusted logistic regression models. Fibrosis (all grades) and mast cells remained statistically significantly associated with the outcome after adjusting for lymphocytic infiltrates in these models. Histologic features of the synovium that were increased in RA compared to OA included lymphocytic inflammation (p < 0.001), lining hyperplasia (p < 0.001), neutrophils (p < 0.001), detritus (p < 0.001), plasma cells (p < 0.001), Russell bodies (p = 0.019), binucleate plasma cells (p < 0.001), sub-lining giant cells (p < 0.001), synovial lining giant cells (p = 0.003), and fibrin (p < 0.001) (Table 2). Computer vision quantification of mean cell density per mm2 of tissue was significantly lower in patients with OA (2900) compared to those with RA (4196) (p < 0.001).

Table 2 Synovial histologic features of osteoarthritis vs rheumatoid arthritis

Supervised machine learning to distinguish OA vs RA

Using disease state (OA versus RA) as the class label and histology scores as inputs, we calculated thresholds to optimally distinguish the two disease states for the fourteen pathologist-scored histology features and the computer vision quantification of cell density, and we evaluated the discriminative power of the features according to the area under the curve (AUC) generated from tuning the cutoff threshold (Fig. 3).

Fig. 3 Discovery of optimal thresholds for the top four most predictive histology features to discriminate synovial tissue samples from patients with OA from those with RA. A Raw histology feature scores in patients with OA and RA. B AUC curves extracted from Random Forest machine learning model. C Distribution of raw OA and RA histology feature scores and optimal threshold values extracted from Random Forest machine learning model. D Percent of OA and RA samples above or below optimal thresholds identified in C. OA osteoarthritis, RA rheumatoid arthritis, AUC area under the receiver operating characteristic curves

Together, the 14 pathologist-scored features yielded a micro-AUC of 0.85±0.06 and a macro-AUC of 0.85 for distinguishing OA and RA. By comparison, computer vision-generated cell density scores alone yielded a similar micro-AUC of 0.87 (macro-AUC 0.88). Finally, combining the computer vision score of cell density with the 14 pathologist scores further improved the micro-AUC to 0.92±0.06 (macro-AUC 0.91). Micro- and macro-precision, recall, and F1 scores, along with the out-of-bag error for each model, are provided in Supplemental Table 1. Feature importance scores for the combined model were calculated and are shown in Table 3: the four most important features that distinguished OA from RA were mast cells, followed by cell density, fibrosis, and lining hyperplasia.
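A feature importance ranking for a combined model could be obtained along the following lines. The text does not say which importance measure was used, so the impurity-based `feature_importances_` attribute of scikit-learn's random forest is an assumption here, as are the hyperparameter values. The data are synthetic, so the resulting order carries no biological meaning and only the mechanics are illustrated.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Feature names taken from the text: 14 pathologist scores plus cell density.
feature_names = [
    "lymphocytic inflammation", "mucoid change", "fibrosis", "fibrin",
    "germinal centers", "lining hyperplasia", "neutrophils", "detritus",
    "plasma cells", "binucleate plasma cells", "Russell bodies",
    "sub-lining giant cells", "synovial lining giant cells", "mast cells",
    "cell density",
]

# Synthetic stand-in data with one column per feature name.
X, y = make_classification(n_samples=207, n_features=len(feature_names),
                           n_informative=5, random_state=0)

model = RandomForestClassifier(n_estimators=100, max_depth=10, random_state=0)
model.fit(X, y)

# Impurity-based importances; scikit-learn normalizes them to sum to 1.
importances = model.feature_importances_
ranking = sorted(zip(feature_names, importances),
                 key=lambda pair: pair[1], reverse=True)
```

Impurity-based importances can favor continuous features over binary or ordinal ones, which is worth keeping in mind when cell density is ranked against graded pathologist scores.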

Table 3 Feature importance, macro area under receiver operating characteristic curves (macro-AUC), and optimal thresholds of the synovial features in distinguishing OA and RA patients

Thresholds to distinguish OA vs RA

The top four features with the highest individual discriminative power were the computer vision score of cell density (macro-AUC = 0.88), fibrosis (macro-AUC = 0.84), mast cells (macro-AUC = 0.80), and lining hyperplasia (macro-AUC = 0.78) (Table 3). With Youden’s J statistic, we discovered that the threshold of cell density lower than 3400 cells per mm2 distinguished OA from RA synovium with a sensitivity of 0.82 and specificity of 0.82. The thresholds for the pathologist-scored features for distinguishing OA from RA synovium were the following: focal and widespread fibrosis (vs absence), presence of mast cells (vs absence), and normal or up to 2–3 cells of lining hyperplasia (vs 3–4 or >4 cells) (Fig. 3). Optimal thresholds for the full list of features are provided in Table 3.


Using two well-characterized cohorts of OA and RA patients, we found that H&E-stained images from OA and RA synovial biopsies were distinguishable using 14 pathologist-scored features, computer vision-quantified cell density, or their combination, with AUCs of 0.85, 0.88, and 0.91, respectively. Mast cells and the presence of fibrosis were much more common in OA than in RA synovial biopsies. On the other hand, synovium from patients with RA had increased lining hyperplasia, lymphocytic inflammation, neutrophils, detritus, plasma cells, Russell bodies, binucleate plasma cells, sub-lining giant cells, synovial lining giant cells, and fibrin. The top four features that distinguished OA and RA patients were mast cells, mean cell density, fibrosis, and lining hyperplasia. Finally, we discovered that a cell density threshold of 3400 cells per mm² (lower in OA, higher in RA) distinguishes OA from RA synovium with a sensitivity of 0.82 and specificity of 0.82. Thus, automated whole slide cell density can potentially be used as a screening tool in research and clinical settings.

The careful annotation of specific cellular and extracellular features in OA and RA yielded some interesting insights into the two diseases. Lymphocytic inflammation was not uncommon in samples from patients with OA. A total of 27.2% of OA patients had moderate or greater than moderate synovial lymphocytic inflammation, defined as >1 perivascular aggregate per high-power field [7]. In studies of RA, aggregates of lymphocytes have consistently been shown to be associated with increased levels of cytokines, chemokines [21,22,23], and RA-specific autoantibodies [24, 25] and to be predictive of response to TNF inhibitor [24] and rituximab [26]. However, lymphocyte aggregates are not specific to RA [21]. In this cohort, more than one-quarter of patients with OA harbored at least moderate lymphocytic aggregates, underscoring the lack of specificity in our definitions of aggregates. It is possible that lymphocytic infiltrates in our patients with OA were caused by undiagnosed concomitant crystalline diseases such as calcium pyrophosphate deposition disease or gout. However, this finding is also in agreement with others who have found that there may be a distinct inflammatory OA subtype [27, 28] that may benefit from different treatment approaches.

Fibrosis and mast cells have previously been reported in OA synovium by other investigators [29,30,31,32,33,34]. Our study adds to the literature by demonstrating that fibrosis and mast cells are almost always observed in OA (95% and 99%, respectively) and that they are key features that help distinguish RA versus OA. There are two important limitations of this observation. Firstly, these two features were scored as binary, not continuous, and, as can be seen by the angular ROC, this may bias the search for their role and the thresholds in the classification task. Secondly, RA synovial tissue samples were fresh-frozen in optimal cutting temperature compound, and the OA tissues were formalin-fixed and paraffin-embedded. Since paraffin-embedding better preserves morphological details, it is possible and even likely that mast cells were more readily detectable in OA samples, and this difference in sample processing could have contributed to the importance ascribed to mast cells in our analysis. It is less likely, but not impossible, that this difference in sample processing would affect the assessment of fibrosis. Mast cells and fibrosis were inversely associated with other inflammatory features, such as lymphocytes and plasma cells, consistent with prior studies [29, 30].

The finding of increased detritus—small fragments of cartilage or bone—in RA compared to OA was not anticipated, since cartilage damage is a hallmark of OA. One possible explanation is that detritus is increased in RA because intense inflammation is more destructive and may yield larger and therefore more visually obvious debris particles, whereas the cartilage debris generated in response to OA-related damage is smaller and invisible by 10–40× imaging. However, this may also reflect the more advanced damage in the RA joints in patients at the time of arthroplasty.

Several inflammatory features that are typically associated with inflammatory RA such as binucleate plasma cells, Russell bodies, and plasma cells were observed in 11%, 9%, and 15% of OA patients, respectively. This was not anticipated, as plasma cell infiltration of RA synovium has been thought to be related to the fact that patients with RA tend to harbor autoantibodies, such as RF and CCP. Since none of the OA patients in this cohort harbored CCP and only one (0.7%) harbored RF, this finding suggests the non-autoantibody functions of plasma cells in synovial tissue inflammation warrant further exploration.

Neutrophils, which were observed in 22% of RA cases, were very rare (<1%) in OA. We previously observed an association of synovial neutrophils and fibrin, the final product of the clotting cascade, with prolonged morning stiffness in patients with RA [35]. Morning stiffness that lasts for more than 1 h is rare in OA. Thus, our observation that neutrophils are exceedingly rare in OA underscores the possibility that neutrophils enmeshed in fibrin clots may indeed play a role in the prolonged duration of RA-related morning stiffness. Furthermore, OA stiffness, which is classically either unchanged or worse with activity, likely has a different etiology. Given the well-established contribution of fibrosis to stiffness in other organs [36], it is possible that synovial fibrosis contributes to stiffness in patients with knee OA, as previously proposed [37].

In addition to the above-mentioned sample processing limitation, our study has other noteworthy limitations. For one, the study population is a convenience sample of OA and RA patients seeking knee arthroplasty at a high-volume, tertiary care hospital in the USA, and thus, the findings may not be applicable to early-stage patients or to joints other than knees. Future studies will compare these histology assessments in other joints and stages of disease. Our sample size is also relatively small, and we did not conduct external validation because of limited data availability. Evaluation of our model on independent data sets is needed to establish its generalizability. We also limited our study to patients who met the classification criteria for OA and RA. While the classification criteria for OA include criteria to help exclude RA, such as less than 30 min of morning stiffness, negative rheumatoid factor, and erythrocyte sedimentation rate less than 40 mm/h, the classification criteria for RA do not include features to help exclude OA, and it is likely that many patients with RA also have OA. Though many of the RA patients in our study may have had coincident OA, their synovium was distinguishable from that of patients with OA. We also do not know whether these features distinguish other causes or types of synovitis, such as psoriatic arthritis and lupus, or whether they are better considered methods for distinguishing inflammatory from non-inflammatory pathology. In addition, we only used cell density as an automated computer vision-based feature in our analysis. Identification of additional informative computer vision features for distinguishing OA and RA warrants further exploration. Finally, 55.0% of the RA patients in this study reported taking a biologic, which would be expected to hinder the ability of the pathologist or our models to discriminate OA from RA, since biologics attenuate inflammation. However, despite the high prevalence of biologic use, we found that the vast majority of samples from patients with RA could be discriminated from those with OA.

Strengths of this study include well-characterized cohorts of OA and RA and an expert musculoskeletal pathologist who scored and graded the slides for both cohorts. We also demonstrate the utility of cell density, an automated computer vision measure that can be used without an expert pathologist and offers scalability, a quick turn-around, and minimal cost. Previous application of machine learning in rheumatic diseases has involved identifying patients with RA from clinical data, billing codes, and natural language processing-derived concepts in electronic health records [38,39,40]. Our group has used machine learning to develop algorithms to use synovial histology features to predict gene expression subsets in RA [7] and computer vision algorithms to quantify RA synovial inflammation as measured by cell density [8]. The results presented here extend these studies and indicate that computer vision analysis of standard-of-care pathology slides scanned within electronic health records might also be useful to discriminate patients with RA from those with OA. This has the potential to help clinicians identify patients with previously unrecognized or undiagnosed RA who undergo TKR in the future. Presently, we hope this algorithm can help other translational researchers generate more accurate and precise quantification of synovial inflammation for their study comparisons.

In summary, mast cells, fibrosis, and lining hyperplasia were the most important pathologist-scored features for discriminating OA and RA synovium. A synovial cell density threshold of 3400 cells per mm² yields a sensitivity of 0.82 and a specificity of 0.82 for distinguishing OA from RA. Future efforts will attempt to identify additional informative computer vision features and to compare their performance in other clinical cohorts.

Availability of data and materials

The datasets generated and analyzed during this study are available from the corresponding author on reasonable request.





RA: Rheumatoid arthritis
H&E: Hematoxylin and eosin
TKR: Total knee replacement
ACR: American College of Rheumatology
BMI: Body mass index
ESR: Erythrocyte sedimentation rate
CRP: C-reactive protein
RF: Rheumatoid factor
CCP: Cyclic citrullinated peptide
STROBE: Strengthening the Reporting of Observational Studies in Epidemiology
SD: Standard deviation
IQR: Interquartile range
AUC: Area under the receiver operating characteristic curve
ROC: Receiver operating characteristic


  1. NIH consensus conference: total hip replacement. NIH Consensus Development Panel on Total Hip Replacement. JAMA. 1995;273(24):1950–6.

  2. Slansky E, Li J, Häupl T, Morawietz L, Krenn V, Pessler F. Quantitative determination of the diagnostic accuracy of the synovitis score and its components. Histopathology. 2010;57(3):436–43.

  3. Krenn V, Morawietz L, Burmester GR, et al. Synovitis score: discrimination between chronic low-grade and high-grade synovitis. Histopathology. 2006;49(4):358–64.

  4. Krenn V, Morawietz L, Häupl T, Neidel J, Petersen I, König A. Grading of chronic synovitis--a histopathological grading system for molecular and diagnostic pathology. Pathol Res Pract. 2002;198(5):317–25.

  5. Zhang F, Wei K, Slowikowski K, et al. Defining inflammatory cell states in rheumatoid arthritis joint synovial tissues by integrating single-cell transcriptomics and mass cytometry. Nat Immunol. 2019;20(7):928–42.

  6. Najm A, le Goff B, Venet G, et al. IMSYC immunologic synovitis score: a new score for synovial membrane characterization in inflammatory and non-inflammatory arthritis. Joint Bone Spine. 2019;86(1):77–81.

    Article  CAS  PubMed  Google Scholar 

  7. Orange DE, Agius P, DiCarlo EF, et al. Identification of three rheumatoid arthritis disease subtypes by machine learning integration of synovial histologic features and RNA sequencing data. Arthritis Rheum. 2018;70(5):690–701.

    Article  CAS  Google Scholar 

  8. Guan S, Mehta B, Slater D, et al. Rheumatoid arthritis synovial inflammation quantification using computer vision. ACR Open Rheumatol. Published online January 10, 2022.

  9. Altman R, Asch E, Bloch D, et al. Development of criteria for the classification and reporting of osteoarthritis: classification of osteoarthritis of the knee. Arthritis Rheum. 1986;29(8):1039–49.

    Article  CAS  PubMed  Google Scholar 

  10. Kellgren JH, Lawrence JS. Radiological assessment of osteo-arthrosis. Ann Rheum Dis. 1957;16(4):494–502.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  11. Goodman SM, Mirza SZ, DiCarlo EF, et al. Rheumatoid arthritis flares after total hip and total knee arthroplasty: outcomes at one year. Arthritis Care Res. 2020;72(7):925–32.

    Article  Google Scholar 

  12. Goodman SM, Bykerk VP, DiCarlo E, et al. Flares in patients with rheumatoid arthritis after total hip and total knee arthroplasty: rates, characteristics, and risk factors. J Rheumatol. 2018;45(5):604–11.

    Article  PubMed  Google Scholar 

  13. Aletaha D, Neogi T, Silman AJ, et al. 2010 Rheumatoid arthritis classification criteria: an American College of Rheumatology/European League Against Rheumatism collaborative initiative. Arthritis Rheum. 2010;62(9):2569–81.

    Article  PubMed  Google Scholar 

  14. Arnett FC, Edworthy SM, Bloch DA, et al. The American Rheumatism Association 1987 revised criteria for the classification of rheumatoid arthritis. Arthritis Rheum. 1988;31(3):315–24.

    Article  CAS  PubMed  Google Scholar 

  15. von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. J Clin Epidemiol. 2008;61(4).

  16. Otsu N. A threshold selection method from gray-level histograms. IEEE Trans Syst Man Cybern. 1979;9(1):62–6.

    Article  Google Scholar 

  17. Cawley GC, NLCT. On over-fitting in model selection and subsequent selection bias in performance evaluation. J Mach Learn Res. 2010;11(70):2079–107

    Google Scholar 

  18. Pereira RB, Plastino A, Zadrozny B, Merschmann LHC. Correlation analysis of performance measures for multi-label classification. Inf Process Manag. 2018;54(3):359–69.

    Article  Google Scholar 

  19. Youden WJ. Index for rating diagnostic tests. Cancer. 1950;3(1):32–5.<32::aid-cncr2820030106>;2-3.

    Article  CAS  PubMed  Google Scholar 

  20. van der Walt S, Schönberger JL, Nunez-Iglesias J, et al. scikit-image: image processing in Python. PeerJ. 2014;2:e453.

    Article  PubMed  PubMed Central  Google Scholar 

  21. van de Sande MGH, Thurlings RM, et al. Presence of lymphocyte aggregates in the synovium of patients with early arthritis in relationship to diagnosis and outcome: is it a constant feature over time? Ann Rheum Dis. 2011;70(4):700–3.

    Article  PubMed  Google Scholar 

  22. Yanni G, Whelan A, Feighery C, et al. Contrasting levels ofin vitrocytokine production by rheumatoid synovial tissues demonstrating different patterns of mononuclear cell infiltration. Clin Exp Immunol. 1993;93(3):387–95.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  23. Manzo A, Paoletti S, Carulli M, et al. Systematic microanatomical analysis of CXCL13 and CCL21in situ production and progressive lymphoid organization in rheumatoid synovitis. Eur J Immunol. 2005;35(5):1347–59.

    Article  CAS  PubMed  Google Scholar 

  24. Klaasen R, Thurlings RM, Wijbrandts CA, et al. The relationship between synovial lymphocyte aggregates and the clinical response to infliximab in rheumatoid arthritis: a prospective study. Arthritis Rheum. 2009;60(11):3217–24.

    Article  CAS  PubMed  Google Scholar 

  25. Cantaert T, Timmer T, Vandooren B, et al. Synovial T/B cell lymphoid aggregates regulate the production of rheumatoid arthritis-specific autoantibodies. Clin Immunol. 2007;123:S93.

    Article  Google Scholar 

  26. Thurlings RM, Vos K, Wijbrandts CA, Zwinderman AH, Gerlag DM, Tak PP. Synovial tissue response to rituximab: mechanism of action and identification of biomarkers of response. Ann Rheum Dis. 2008;67(7):917–25.

    Article  CAS  PubMed  Google Scholar 

  27. Dell’Isola A, Steultjens M. Classification of patients with knee osteoarthritis in clinical phenotypes: data from the osteoarthritis initiative. PLoS One. 2018;13(1):e0191045.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  28. Lv Z, Yang YX, Li J, et al. Molecular classification of knee osteoarthritis. Front Cell Dev Biol 2021;0.

  29. Minten MJM, Blom A, Snijders GF, et al. Exploring longitudinal associations of histologically assessed inflammation with symptoms and radiographic damage in knee osteoarthritis: combined results of three prospective cohort studies. Osteoarthr Cartil. 2019;27(1):71–9.

    Article  CAS  Google Scholar 

  30. Abdul N, Dixon D, Walker A, et al. Fibrosis is a common outcome following total knee arthroplasty. Sci Rep. 2015;5(1):1–13.

    Article  CAS  Google Scholar 

  31. de Lange-Brokaar BJE, Kloppenburg M, Andersen SN, et al. Characterization of synovial mast cells in knee osteoarthritis: association with clinical parameters. Osteoarthr Cartil. 2016;24(4):664–71.

    Article  Google Scholar 

  32. Klein-Wieringa IR, de Lange-Brokaar BJE, Yusuf E, et al. Inflammatory cells in patients with endstage knee osteoarthritis: a comparison between the synovium and the infrapatellar fat pad. J Rheumatol. 2016;43(4):771–8.

    Article  PubMed  Google Scholar 

  33. Gruber B, Poznansky M, Boss E, Partin J, Gorevic P, Kaplan AP. Characterization and functional studies of rheumatoid synovial mast cells. Activation by secretagogues, anti-IgE, and a histamine-releasing lymphokine. Arthritis Rheum. 1986;29(8):944–55.

    Article  CAS  PubMed  Google Scholar 

  34. Pu J, Nishida K, Inoue H, Asahara H, Ohtsuka A, Murakami T. Mast cells in osteoarthritic and rheumatoid arthritic synovial tissues of the human knee. Acta Med Okayama. 1998;52(1):35–9.

    Article  CAS  PubMed  Google Scholar 

  35. Orange DE, Blachere NE, DiCarlo EF, et al. Rheumatoid arthritis morning stiffness is associated with synovial fibrin and neutrophils. Arthritis Rheum. 2020;72(4):557–64.

    Article  CAS  Google Scholar 

  36. Rockey DC, Bell PD, Hill JA. Fibrosis--a common pathway to organ injury and failure. N Engl J Med. 2015;373(1):96.

    Article  PubMed  Google Scholar 

  37. Kuo SJ, Yang WH, Liu SC, Tsai CH, Hsu HC, Tang CH. Transforming growth factor β1 enhances heme oxygenase 1 expression in human synovial fibroblasts by inhibiting microRNA 519b synthesis. PLoS One. 2017;12(4):e0176052.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  38. Jiang M, Li Y, Jiang C, Zhao L, Zhang X, Lipsky PE. Machine learning in rheumatic diseases. Clin Rev Allergy Immunol. 2021;60(1):96–110.

    Article  PubMed  Google Scholar 

  39. Zhou SM, Fernandez-Gutierrez F, Kennedy J, et al. Defining disease phenotypes in primary care electronic health records by a machine learning approach: a case study in identifying rheumatoid arthritis. PLoS One. 2016;11(5):e0154515.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  40. Carroll RJ, Eyler AE, Denny JC. Naïve Electronic Health Record phenotype identification for Rheumatoid arthritis. AMIA Annu Symp Proc. 2011;2011:189–96

    PubMed  PubMed Central  Google Scholar 

Download references


Acknowledgements

The authors would like to acknowledge the following individuals for their contributions to the study: Edoardo Spolaore; Serene Mirza; Collin Brantner; Carine Moezinia; Haley Tornberg; Diyu Pearce-Fisher; Samantha Lessard; Purva Singh; Yana Bronfman; Thomas Sculco, MD; Michael Parks, MD; David Mayman, MD; and Michael Cross, MD.


Funding

This work was supported by the C. Ronald MacKenzie Young Scientist Endowment Award, the Leon Lowenstein Foundation, the Anna-Maria and Stephen Kellen Foundation Total Knee Improvement Program, the Arthritis Foundation, the Cedar Hill Foundation, and the following NIH grants: UL1 TR001866, R01AR078268, 1UC2AR081025, UM1AI109565, and UC2 AR082186. The funding sources were not involved in the study design; in the collection, analysis, and interpretation of data; in the writing of the manuscript; or in the decision to submit the manuscript for publication.

Author information

Contributions

BM contributed to the conception and design, analysis and interpretation of the data, collection and assembly of data, drafting of the article, critical revision of the article for important intellectual content, and obtaining of funding. SG contributed to the conception and design, analysis and interpretation of the data, and drafting of the article. ED contributed to the conception and design, collection and assembly of data, provision of study materials or patients, and critical revision of the article for important intellectual content. DJ contributed to the analysis and interpretation of the data, statistical expertise, and drafting of the article. JAG contributed to the analysis and interpretation of the data, collection and assembly of data, drafting of the article, and administrative, technical, or logistic support. MO, LD, TP, and WR contributed to the collection and assembly of data and critical revision of the article for important intellectual content. PS, MF, and JR contributed to the provision of study materials or patients and critical revision of the article for important intellectual content. JK contributed to the collection and assembly of data, critical revision of the article for important intellectual content, and administrative, technical, or logistic support. JT contributed to the analysis and interpretation of the data, statistical expertise, and drafting of the article. DS, DF, and ZX contributed to the analysis and interpretation of the data, statistical expertise, and critical revision of the article for important intellectual content. FW contributed to the conception and design, analysis and interpretation of the data, statistical expertise, and drafting of the article. DO contributed to the conception and design, analysis and interpretation of the data, drafting of the article, and critical revision of the article for important intellectual content. All authors approved the final version of this article for publication.

Corresponding author

Correspondence to Bella Mehta.

Ethics declarations

Ethics approval and consent to participate

Ethical approval for this study was provided by the Institutional Review Board at the Hospital for Special Surgery (IRB #2018-0895 and #2014-233), and the research was performed in accordance with the relevant guidelines and regulations.

Competing interests

Bella Mehta, MBBS, MS: Novartis, non-labeled educational content development; Janssen, advisory board. Susan M. Goodman, MD: American College of Rheumatology, board or committee member; Novartis, research support; UCB, paid consultant. Deanna P. Jannat-Khah, DrPH, MSPH: AstraZeneca, Cytodyn, and Walgreens stocks. Miguel Otero, PhD: Regeneron Pharmaceuticals, paid consultant; Tissue Genesis, research support. Laura Donlin, PhD: Karius, Inc, research support; Stryker, paid consultant and paid speaker. Mark P. Figgie, MD: Lima and Wishbone, paid consultant; HS2, Mekanika, and Wishbone stocks. Jose A. Rodriguez, MD: board or committee member for the American Association of Hip and Knee Surgeons and the Eastern Orthopaedic Association Nomination Committee; paid consultant for and recipient of IP royalties from ConforMIS, Medacta, Exactech, Inc, and Smith & Nephew; research support from DePuy, Exactech, Inc, and Smith & Nephew; editorial or governing board of Clinical Orthopaedics and Related Research, HSS Journal, and Journal of Arthroplasty.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Supplemental Table 1.

Performance metrics for three models in distinguishing OA vs. RA. AUC = area under the receiver operating curve.



Human pathologist-scoring system of fourteen features applied to all samples. A portion of these features are illustrated at

1. Lymphocytic inflammation: inflammation consisting of any combination of lymphocytes and plasma cells around blood vessels (perivascular aggregates) or in the interstitium

0: None

1: Mild (0–1 perivascular aggregates per low power field)

2: Moderate (>1 perivascular aggregate + focal interstitial infiltration)

3: Marked (both perivascular and widespread interstitial aggregates)

4: Band-like

2. Lining hyperplasia: The number of cells that comprise the thickness of the synovial lining layer. These cells help modify the content of the synovial fluid (for example, by producing lubricin).

0: Normal lining

1: 2–3 cells thick

2: 3–4 cells thick

3: > 4 cells thick

3. Neutrophils: Polymorphonuclear leukocytes. Of note, marginating neutrophils are more likely due to acute stress of surgery than part of a disease phenotype. Therefore, only neutrophils that are present in the interstitium or synovial lining are scored positively.

0: None

1: Present (granulation tissue, interstitial, or marked)

4. Plasma cells (×25 mag): Percentage of lymphocytic infiltration, regardless of distribution, that consists of morphologically recognizable plasma cells. Average the percent of all plasma cells over all the inflammation in all fields.

0: < 10% plasma cells within lymphocytic aggregates

1: 10–50% plasma cells

2: >50% plasma cells

5. Binucleated plasma cells: plasma cells with two or more nuclei

0: None

1: Present

6. Russell bodies: plasma cells engorged with bright red substance (antibodies/immunoglobulin)

0: None

1: Present

7. Sub-lining giant cells: multinucleated giant cells below the synovial lining layer

0: None

1: Present

8. Synovial lining giant cells: multinucleated giant cells in the synovial lining layer

0: None

1: Present

9. Fibrin: deposits of fibrinous material or reddish-pink disorganized material on the surface of the synovial membrane

0: None

1: Present

10. Fibrosis: Collagen exudate that looks like extremely pink extracellular material

0: None

1: Focal

2: Widespread or band-like

11. Mast cells: cells with round or slightly oval cytoplasm with blue granules, and a nucleus that is round and dark, but occasionally oval, centrally placed

0: None

1: Present

12. Mucoid change: bluish extracellular material that may appear as something missing from the slide or as negative space. Mucin lends the synovial matrix a gelatinous character. It represents sulfated proteoglycans and/or non-collagenous proteins

0: None

1: Slight (perivascular or focal interstitial)

2: Moderate (perivascular or focal interstitial)

3: Marked (perivascular or focal interstitial)

4: Myxomatous

13. Germinal centers: well-defined collections of enlarged lymphoid cells. These are immunoblasts (small cleaved, small noncleaved, large cleaved, etc.) with prominent nucleoli, together with tingible body macrophages that contain phagocytosed nuclear debris (nuclear dust, karyorrhexis)

0: None

1: Present

14. Detritus: bits of bone or cartilage embedded in the synovium

0: Absent

1: Present (small or large particles)
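The rubric above can be encoded as a small lookup table so that a pathologist's scores are range-checked before being used as model inputs. The sketch below is one possible encoding, not the authors' code: the maximum scores are transcribed from the fourteen items listed above, while the snake_case keys and the function name `to_feature_vector` are hypothetical.

```python
from typing import Dict, List

# Maximum allowed score per feature (minimum is always 0), transcribed from
# the rubric above. The key names are hypothetical and chosen for this sketch.
MAX_SCORE: Dict[str, int] = {
    "lymphocytic_inflammation": 4,
    "lining_hyperplasia": 3,
    "neutrophils": 1,
    "plasma_cells": 2,
    "binucleated_plasma_cells": 1,
    "russell_bodies": 1,
    "sublining_giant_cells": 1,
    "synovial_lining_giant_cells": 1,
    "fibrin": 1,
    "fibrosis": 2,
    "mast_cells": 1,
    "mucoid_change": 4,
    "germinal_centers": 1,
    "detritus": 1,
}


def to_feature_vector(scores: Dict[str, int]) -> List[int]:
    """Validate one sample's scores and return them in rubric order."""
    missing = [name for name in MAX_SCORE if name not in scores]
    unknown = [name for name in scores if name not in MAX_SCORE]
    if missing or unknown:
        raise ValueError(f"missing features: {missing}; unknown features: {unknown}")

    vector: List[int] = []
    for name, max_score in MAX_SCORE.items():
        value = scores[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{name}: score must be an integer, got {value!r}")
        if not 0 <= value <= max_score:
            raise ValueError(f"{name}: score {value} outside 0..{max_score}")
        vector.append(value)
    return vector
```

Keeping the features in a fixed order means every sample yields a fourteen-element vector with the same column meaning, which is what a tabular classifier such as a random forest expects.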

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. A copy of this licence is available from the Creative Commons website. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Mehta, B., Goodman, S., DiCarlo, E. et al. Machine learning identification of thresholds to discriminate osteoarthritis and rheumatoid arthritis synovial inflammation. Arthritis Res Ther 25, 31 (2023).
