- Research article
- Open Access
Measurement of global functional performance in patients with rheumatoid arthritis using rheumatology function tests
Arthritis Res Ther volume 6, Article number: R315 (2004)
Outcome assessment in patients with rheumatoid arthritis (RA) includes measurement of physical function. We derived a scale to quantify global physical function in RA, using three performance-based rheumatology function tests (RFTs). We measured grip strength, walking velocity, and shirt button speed in consecutive RA patients attending scheduled appointments at six rheumatology clinics, repeating these measurements after a median interval of 1 year. We extracted the underlying latent variable using principal component factor analysis. We used the Bayesian information criterion to assess the global physical function scale's cross-sectional fit to criterion standards. The criteria were joint tenderness, swelling, and deformity, pain, physical disability, current work status, and vital status at 6 years after study enrolment. We computed Guyatt's responsiveness statistic for improvement according to the American College of Rheumatology (ACR) definition. Baseline functional performance data were available for 777 patients, and follow-up data were available for 681. Mean ± standard deviation for each RFT at baseline were: grip strength, 14 ± 10 kg; walking velocity, 194 ± 82 ft/min; and shirt button speed, 7.1 ± 3.8 buttons/min. Grip strength and walking velocity departed significantly from normality. The three RFTs loaded strongly on a single factor that explained ≥70% of their combined variance. We rescaled the factor to vary from 0 to 100. Its mean ± standard deviation was 41 ± 20, with a normal distribution. The new global scale had a stronger fit than the primary RFT to most of the criterion standards. It correlated more strongly with physical disability at follow-up and was more responsive to improvement defined according to the ACR20 and ACR50 definitions. We conclude that a performance-based physical function scale extracted from three RFTs has acceptable distributional and measurement properties and is responsive to clinically meaningful change. 
It provides a parsimonious scale to measure global physical function in RA.
Measurement of physical functional limitations in patients with rheumatoid arthritis (RA) is a time-honored strategy to assess the disease's outcome . Performance-based tests of physical function such as grip strength and walking velocity were included in some of the earliest trials of antirheumatic therapy . These tests provide reproducible, quantitative information about a patient's current status and about the prognosis [3, 4]. In a paper describing the behavior of functional tests over time in RA, Pincus and Callahan made the analogy between them and commonly used laboratory tests of other organs, referring to performance- and questionnaire-based measures as 'rheumatology function tests' (RFTs) .
It is useful to consider RFTs within an overarching conceptual framework of the disease's outcome. We have proposed a disablement framework for studying the development of disability, and possibly other outcomes, in RA . The framework consists of a main disease–disability pathway, which describes the sequential development of pathology, impairment, functional limitation, and, finally, disability [5–9]. Within this framework, performance-based functional tests are well suited to quantify functional limitations, because they entail measurement of physical actions performed by the intact person . A number of different tests are available, and researchers often include more than one in studies. However, the clinical literature is sparse in guiding how to analyze or report research findings when multiple tests are used. The need for data parsimony may sway investigators to report findings on less than the full set of tests available. We are concerned that if researchers choose this route, important information may be lost.
In an earlier analysis, we used principal component factor analysis to extract the underlying latent variable from three primary disability scales . The distributional and measurement characteristics of the latent disability scale were better than those of the primary scales . In the present analysis, we used a similar approach to extract a global physical performance scale from three primary performance-based RFTs: grip strength; walking velocity over 50 feet; and the timed shirt button test. The resulting latent functional performance scale reflects overall physical function in RA. This data reduction approach may assist investigators who wish to quantify functional limitations in RA.
Materials and methods
From 1996 to 2000, we enrolled patients meeting the 1987 RA criteria  in a study of the disablement process in RA . We have described our sample in previous publications [12, 13]. The study's acronym, ÓRALE (Outcome of Rheumatoid Arthritis Longitudinal Evaluation), matches a Mexican-American idiom for "Let's go!" Here, we will show cross-sectional results obtained during the recruitment evaluation of each participant.
Our study was approved by the institutional review board of each of the clinical facilities where we recruited patients, and all patients gave their written, informed consent. A physician or a research nurse, assisted by a trained research associate, conducted evaluations at the clinic where the patient was recruited. The evaluation lasted approximately 90 minutes and consisted of a comprehensive interview, a physical examination, a review of medical records, and laboratory and x-ray tests. Interviews were conducted in either English or Spanish, as preferred by the patient.
We ascertained age, sex, and race/ethnicity by self-report [12, 13]. For race/ethnicity, we used the following question: "In which of the following race or ethnic groups do you feel you belong?" Patients could choose from 'White', 'Black', 'Asian', 'Hispanic', and 'Other'.
A physician or research nurse, trained in joint examination techniques, assessed 48 joints in each patient for the presence or absence of tenderness or pain on motion, swelling, or deformity, as described elsewhere .
We asked patients to rate the amount of pain they experienced due to their arthritis during the past week, on a graded, horizontal 10-point scale that we have validated in our patient population .
Global response measures
We used two scales to measure patients' overall condition. The first, a global assessment of disease activity scale, was completed by the examining physician or nurse. Raters assessed the degree of inflammatory disease activity on a 10-point scale, ranging from 'mildest disease' to 'most severe disease'. Raters were instructed to consider symptoms such as joint pain, stiffness, tenderness, and swelling, as well as the presence of subcutaneous nodules, to rate this variable. The second scale we used was the SF-36 general health subscale , which was administered by an interviewer. Patients were asked to respond to the following five statements: (a) "In general, would you say your health is:", with the response options 'excellent', 'very good', 'good', 'fair', and 'poor'; (b) "I seem to get sick a little easier than other people"; (c) "I am as healthy as anybody I know"; (d) "I expect my health to get worse"; and (e) "My health is excellent". Response choices for items (b) to (e) were five-level Likert scales ranging from 'definitely true' to 'definitely false'. Responses to the five questions were recoded, summed, and scaled to range from 0 to 100 .
Performance-based rheumatology function tests
We used the following tests:
1. Grip strength. This was measured with a hand-held JAMAR® Dynamometer (Sammons Preston; Bolingbrook, IL, USA). In a sitting position, with the elbow held at 90 degrees, and the forearm supported on a flat horizontal surface, patients were asked to squeeze the handle with as much strength as possible. Three repetitions from each hand were recorded, in kilograms. The mean value of all repetitions for both hands is shown.
2. Walking velocity. Starting from a standing position, patients were asked to walk at their usual pace for a distance of 50 feet, or 25 feet if they had difficulty covering the full distance. No effort was made to conceal the stopwatch used to time the patients. Results are expressed in feet per minute. Patients unable to walk were assigned a velocity of 0 feet per minute.
3. Timed button test. Patients were asked to don a standard eight-button, men's or women's extra-large denim shirt and fasten the front buttons (Wal-Mart; San Antonio, TX, USA). A stopwatch was activated when the patient took the shirt offered by the examiner, and stopped when the last button was fastened. This test quantifies the performance of large and small upper extremity joints. Results are expressed as buttons per minute. Patients unable to don the shirt were assigned a score of 0 buttons per minute.
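The scoring rules for the three tests above can be sketched as follows. This is a minimal illustration, not the study's own code: the function names are hypothetical, and the use of `None` to mark a patient who could not complete a test is an assumption made for the example.

```python
def grip_strength_kg(left_reps, right_reps):
    """Mean of all recorded repetitions from both hands, in kilograms."""
    readings = list(left_reps) + list(right_reps)
    return sum(readings) / len(readings)


def walking_velocity_ft_per_min(distance_ft, seconds):
    """Walking velocity in feet per minute.

    distance_ft is 50, or 25 for patients who could not cover 50 feet.
    seconds=None (an assumed convention) marks a patient unable to walk,
    who is assigned 0 feet per minute as described in the text.
    """
    if seconds is None:
        return 0.0
    return distance_ft / (seconds / 60.0)


def button_speed_per_min(buttons_fastened, seconds):
    """Shirt button speed in buttons per minute.

    seconds=None (an assumed convention) marks a patient unable to don
    the shirt, who is assigned 0 buttons per minute.
    """
    if seconds is None:
        return 0.0
    return buttons_fastened / (seconds / 60.0)
```

For example, a patient who covers 50 feet in 15 seconds walks at 200 feet per minute, and one who dons the shirt and fastens all eight buttons in 60 seconds scores 8 buttons per minute.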
Physical disability measures
We used four measures:
1. The disability index of the Modified Health Assessment Questionnaire (MHAQ) , a self-administered, 'arthritis-specific' instrument which asks respondents to rate the amount of difficulty they experience performing eight activities (dressing, getting out of bed, lifting a cup, walking, bathing, bending, turning faucets, and getting in and out of a car), on a scale ranging from 1 to 4 (without difficulty, with some difficulty, with much difficulty, and unable).
2. The physical function scale of the SF-36 questionnaire (SF36PF), an interviewer-administered, 'generic' instrument . The SF36PF asks respondents to rate the amount of limitation caused by their health on 10 physical activities (vigorous activities; moderate activities; carrying groceries; climbing several flights of stairs; climbing one flight of stairs; bending, kneeling or stooping; walking more than a mile; walking several blocks; walking one block; and bathing or dressing). Respondents rated each activity on a three-level scale (a lot, a little, not at all). Item responses were then summed and rescaled, with results expressed on a scale ranging from 0 to 100, higher values representing better function.
3. The Steinbrocker functional classification was used by the physician or the research nurse, who were trained in physical function assessment, to rate the extent of physical disability on a four-level scale, ranging from Class I, "complete functional capacity to carry out all usual duties without handicaps", to Class IV, "largely or wholly incapacitated with [the person] bedridden or confined to wheelchair . . ." .
4. A latent physical disability variable was computed by extracting the first principal component from the MHAQ, SF36PF, and Steinbrocker scales, using factor analysis . We extracted this latent physical disability variable scale using a procedure analogous to the one described here for the global functional performance scale and described in detail elsewhere .
We asked patients to describe their current work status from among the following answers: working full-time; working part-time; retired; student; housewife; unemployed/laid off; and disabled/unable to work. We used these responses for two sets of analyses. For the first, patients were classified as working (full- or part-time) vs not working (all others); for the second, they were classified as disabled/unable vs all others.
We have recontacted the patients at yearly intervals since their initial evaluation. For patients with whom we were not able to establish contact, even through family members, we searched publicly available death registries. We obtained a death certificate for all patients who died.
We performed a principal component factor analysis of grip strength, walking velocity, and button speed, and then extracted the first principal component from the unrotated factor loadings, using the least squares regression method. We rescaled the extracted factor to vary from 0 to 100 with a positive valence, higher values representing less disability. We used the skewness and kurtosis test to check each variable for departure from normality. To evaluate the degree of association between the new scale and other study variables with interval or ratio distributions, we used Pearson product moment correlation coefficients. Differences between the coefficients were tested after Fisher's z-transformation, using the procedure provided by Goldstein. Because this required us to perform a total of 21 correlation coefficient comparisons, we considered coefficients to be significantly different only if P was ≤0.002 for the comparison, adjusted according to the Bonferroni technique (the conventional α of 0.05 divided by the number of comparisons, 21). To evaluate the association of the new global functional performance scale with categorical criterion variables, we divided the new scale into ordinal categories and used a chi-square test to assess the strength of association. We then evaluated the fit of multivariable models that included the new global functional performance scale, compared with models that included one of the primary RFTs. We asked the question: "Does a multivariable model that includes the new global functional performance scale fit criterion standards better than models that include a primary RFT?" Age and sex were included as covariates in all these multivariable models, because they can have a strong influence on any of the criterion measures we used. A simplified (without coefficients), general form of the models we compared was
y = a + b + fp
where y could be any of the criterion standards (working status, vital status, grip strength, etc.), a was age, b was sex, and fp was one of the four functional performance scales (grip strength, walking velocity, button speed, or the new global functional performance scale). When y was a categorical variable, the model was a logistic regression, and when y was an interval or ratio variable, the model was an ordinary least squares regression. We expected that the fit of a multivariable model including the new global scale on any of the criterion standards would be equivalent or superior to the fit of models that include the three primary scales. We used the Bayesian information criterion (BIC) to confirm this expectation . The BIC varies inversely with a model's fit: given two models, the one with the smaller or more negative BIC has a better fit . We used Raftery's guidelines to interpret BIC differences between two models: a BIC difference >10 is considered 'very strong' evidence in favor of the model with the smaller BIC; a difference of >6 to 10 as 'strong'; >2 to 6 as 'positive'; and 0 to 2 as 'weak' evidence .
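The two decision thresholds described in this section, the Bonferroni-adjusted significance level and Raftery's grades of evidence for BIC differences, can be sketched as below. The function names are hypothetical and the code only restates the cut-offs given in the text; it does not fit any model.

```python
def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni adjustment."""
    return alpha / n_comparisons


def raftery_grade(bic_a, bic_b):
    """Grade the evidence for the model with the smaller BIC.

    Cut-offs follow the guidelines quoted in the text:
    >10 'very strong', >6 to 10 'strong', >2 to 6 'positive', 0 to 2 'weak'.
    """
    diff = abs(bic_a - bic_b)
    if diff > 10:
        return "very strong"
    if diff > 6:
        return "strong"
    if diff > 2:
        return "positive"
    return "weak"
```

With 21 comparisons, `bonferroni_alpha(0.05, 21)` is about 0.0024, which corresponds to the threshold of 0.002 used here. The model favored by `raftery_grade` is always the one with the smaller (or more negative) BIC.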
To assess the responsiveness of the primary RFTs and the global functional performance scale, we classified patients as improved or unimproved. Available data allowed us to compute the American College of Rheumatology preliminary definition of improvement in RA, with one modification. The definition requires a 20% or 50% improvement in both tender and swollen joint counts, plus a 20% or 50% improvement in at least three of five additional measures. Four of these additional measures were available to us: global assessment of disease activity completed by the examining physician or nurse, 10-point pain scale, MHAQ, and erythrocyte sedimentation rate (ESR). In place of the patient global assessment required by the definition, we substituted the SF-36 general health subscale. We calculated change in the three primary RFTs and the global functional performance scale as the difference between the baseline and follow-up measurements. We used the change scores among improved and unimproved patients to calculate Guyatt's responsiveness ratio for each functional scale. Guyatt's ratio = (mean change among improved patients) ÷ (standard deviation of change among unimproved patients)
We performed all analyses on a desktop personal computer, using the Stata 8 software package (College Station, Texas, USA).
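As an illustration of the responsiveness calculation, the sketch below uses the common form of Guyatt's responsiveness ratio: the mean change among improved patients divided by the standard deviation of change among unimproved patients. It is written in Python rather than Stata, the function name is hypothetical, and it assumes change is coded as follow-up minus baseline so that positive values indicate better performance.

```python
from statistics import mean, stdev


def guyatt_ratio(change_improved, change_unimproved):
    """Mean change in improved patients over the SD of change in unimproved patients.

    Both arguments are sequences of change scores (assumed here to be
    follow-up minus baseline). stdev() is the sample standard deviation.
    """
    return mean(change_improved) / stdev(change_unimproved)
```

A larger ratio means that the average change in improved patients stands out more clearly against the background variability seen in patients who did not improve.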
We recruited 779 patients from 1996 to 2000. The clinical characteristics of the study sample have been described in earlier publications [10, 13]. The median age of the patients was 57 years (range 19 to 90 years); 70% were women and 56% were Hispanic. The median number of years of formal education was 12 (range 0 to >16); 21% were working full-time or part-time, and 27% were disabled from work. The median disease duration was 8 years (range 0 to 52). Mean joint counts were 15 tender, 7 swollen, and 10 deformed. Subcutaneous nodules were present in 30% of patients, and rheumatoid factor in 89%.
Of the 779 patients enrolled, 43 (5.5%) died before we could conduct a follow-up functional performance assessment. Of the remaining 736, we measured follow-up functional performance in 676 (92% of survivors), a mean period of 1.3 years after the baseline assessment (median 1 year, range 6 months to 5 years). An additional 48 patients died after the follow-up measurement, for a total of 101 deaths by July 2003. Significant differences at baseline between the surviving patients who did not participate in the follow-up and those who did participate included slower walking velocity (179 vs 203 feet/minute; P = 0.02) and slower shirt button speed (6.2 vs 7.7 buttons per minute; P = 0.002) among patients without follow-up assessment. There were no significant differences between the two groups in age or sex, or in the number of tender, swollen, or deformed joints.
Figure 1 is a diagram of the factor analysis we used to derive the global functional performance scale. The three RFTs – grip strength, walking velocity and button speed – loaded strongly on a single factor, with loadings ≥0.8. This factor explained ≥70% of the primary variables' combined variance. Uniqueness values were below 0.36 for each of the primary variables, indicating that they share about two-thirds of variance. We extracted the single factor without rotation, using linear regression scoring, to derive the global functional performance scale. The factor scoring coefficients used to derive the scale are shown in the following formula, in which GFP = global functional performance, GS = grip strength, WV = walking velocity, and BS = button speed:
GFP = GS × 0.38033 + WV × 0.40709 + BS × 0.40508
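The scoring coefficients can be checked against the factor analysis results reported above. The sketch below assumes, although the text does not state it, that the coefficients apply to standardized (z-scored) test results and that, for a single unrotated principal-component factor with regression scoring, each coefficient equals the loading divided by the eigenvalue. Under that assumption the eigenvalue is the reciprocal of the sum of squared coefficients. The variable and function names are hypothetical.

```python
# Scoring coefficients as published in the formula above.
COEF = {"GS": 0.38033, "WV": 0.40709, "BS": 0.40508}


def gfp_score(z_gs, z_wv, z_bs):
    """Unscaled factor score from z-scored test results (assumed input scale)."""
    return COEF["GS"] * z_gs + COEF["WV"] * z_wv + COEF["BS"] * z_bs


# With coefficient = loading / eigenvalue and sum(loading^2) = eigenvalue,
# the eigenvalue follows from the coefficients alone.
eigenvalue = 1.0 / sum(c * c for c in COEF.values())
loadings = {name: c * eigenvalue for name, c in COEF.items()}
uniqueness = {name: 1.0 - l * l for name, l in loadings.items()}
proportion_explained = eigenvalue / len(COEF)

if __name__ == "__main__":
    print(f"eigenvalue: {eigenvalue:.3f}")
    print(f"proportion of variance explained: {proportion_explained:.3f}")
    for name in COEF:
        print(f"{name}: loading {loadings[name]:.3f}, uniqueness {uniqueness[name]:.3f}")
```

Under these assumptions the implied loadings are all at least 0.8, the uniqueness values are all below 0.36, and the factor explains just over 70% of the combined variance, which agrees with the figures reported in the text. Rescaling the score to the 0 to 100 range is a separate step that is not shown here.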
Figure 2 shows frequency distributions for the three primary scales and the derived global scale. The global functional scale's distribution did not depart significantly from the normal distribution on the skewness-kurtosis test (chi-square 4.01 with 2 degrees of freedom, P = 0.13). In contrast, grip strength and walking velocity departed significantly from normality (chi-square 155 and 10.4, P ≤0.001 and P = 0.007, respectively), with shirt button speed as the one primary test that had a normal distribution (chi-square 3.3, P = 0.19). Figure 3 depicts a matrix of bivariate distributions between the three primary RFTs and the derived global physical function scale. The correlation between the latter and the three primary RFTs was ≥0.8 in all three cases.
Table 1 shows coefficients of correlation between each of the RFTs, including the new global physical function scale, and the criterion variables of joint tenderness, swelling, and deformity; overall pain; the MHAQ and SF36PF scales, and the Steinbrocker class; and the latent disability scale. For 19 of 24 comparisons, the correlation between the global physical function scale and the criterion variable was stronger than that between the primary RFT and the same criterion variable.
Table 2 shows the BIC of models that contained age and sex plus either the grip strength, walking velocity, shirt button speed, or global functional performance scale as independent variables, with each of the criterion standards as dependent variables. The fit of the models that included the derived global scale was better than the fit of most of the models that included the primary RFTs. This was evidenced by smaller or more negative BICs on the better-fitting models, as shown in the table.
After a median period of one year, 119 patients (18%) improved sufficiently to meet the ACR50 definition. An additional 117 patients (17%) met the ACR20 definition of improvement. Change in RFT and in the global functional performance scale is shown in Table 3, according to the level of ACR improvement. The responsiveness of all functional tests was at least moderate. The largest Guyatt's ratio was seen for the global functional performance scale, suggesting that this scale is the most responsive to improvement as defined here (Table 3).
We measured the correlation between assessments performed at the baseline evaluation and the extent of physical disability measured at follow-up (Table 4). Global functional performance correlated significantly more strongly with physical disability at follow-up than did any of the primary RFTs. Global functional performance at baseline also had a stronger correlation with follow-up physical disability than did the baseline number of tender, swollen, or deformed joints, or the baseline primary disability scales (MHAQ, SF36PF, and Steinbrocker class). Only the baseline latent physical disability exceeded the global functional performance in its correlation with follow-up physical disability (Table 4).
Figure 4 shows the relation between the global physical function scale and the deformed-joint count, current working status, current disabled status, and death occurring during the 6 years of observation covered by the present report. For all comparisons, the global physical function scale was strongly associated with the outcome.
Our objective was to measure the degree of functional limitation in a sample of RA patients. We selected three established, performance-based RFTs: grip strength, walking velocity, and the timed shirt button test. We found evidence that a new variable derived through a data reduction process from the three tests performed better than the primary tests, while meeting the need for data parsimony.
To demonstrate the characteristics of the global functional scale, we used a number of comparison variables, based on the disablement process model [5, 10]. Thus, our comparison criteria included key RA impairments such as the amount of pain and the number of tender, swollen, and deformed joints; and measures of physical disability, including the MHAQ, SF36PF, and Steinbrocker functional class, as well as current occupational status. To be consistent with earlier studies of RFTs, we also included death within 6 years as an outcome. We demonstrate significant associations between the new global functional performance score and each of the comparison standards. We chose the BIC as a comparative fit measure because it is a tool used often for model selection [24, 27]. We expected that the models that included the global functional performance scale would have smaller BICs, indicating better fit. Indeed, this was usually the case: for nearly all of the criterion variables, the fit of the global scale was superior to that of the primary measures of grip strength, walking velocity, or shirt button speed.
We also evaluated the ability of these performance-based measures to respond to clinical change. With the data available to us, we could compute the ACR20 and ACR50 improvement definitions, with one exception: we lacked a patient global assessment scale . In its place, we used the general health subscale of the SF-36. We estimate that the global functional performance scale is more responsive to clinically significant improvement than are the primary RFTs. However, it should be noted that improvement among our patients was not in response to a specific intervention. Because of this, further research is necessary to test the responsiveness of the global functional performance scale to specific intervention, and to distinguish between active drug and placebo in a clinical trial.
Pooled indices are often more reliable than the individual components of an index . This is most likely due to improved capture of an underlying construct when multiple scales are used, in contrast with a single instrument. There are precedents in rheumatology for developing pooled indices, usually as part of efforts aimed at measuring the efficacy of antirheumatic drugs [29–32]. We have previously applied this data reduction strategy to develop a physical disability scale, using a generic scale, an arthritis-specific one, and an observer-assessed functional status grade . Similar processes could be applied to develop summary scales for other RA dimensions, such as disease damage or joint impairment.
The polyarticular nature of RA usually causes a global limitation in joint function. This characteristic of RA makes a global functional scale valuable for investigators who wish to capture the full impact of RA on a patient's performance. However, each of the RFTs we chose is influenced by different upper and lower extremity properties: hand prehensile strength for the grip measure; large and small upper extremity joint range and dexterity for the shirt button test; and lower limb strength, joint stability, and overall balance for walking velocity. The many-sided quality of the three tests works against the aim of measuring global performance as a single construct. Our approach was to use principal component factor analysis to extract the shared component from the three scales. Indeed, the three primary tests loaded strongly on a single factor that explained 70% of the variance of the three scales.
We believe this approach is suited for research focusing on RA patients' total level of functional limitation, as is the case in our and other studies aiming to map the outcome of RA in patients over time. It may also be a reasonable approach to measure the effectiveness of therapies that reach all joints, such as antirheumatic drugs. Although performance-based RFTs such as grip strength or walking velocity were often included in antirheumatic drug trials in the past, investigators did not attempt to condense them as we have done. These tests have usually not been included in recent trials of antirheumatic drugs. It may be of interest to re-evaluate the role of performance-based RFTs in antirheumatic drug trials, using the approach we used here to tap into the underlying construct. Our initial estimate of the responsiveness of the global scale suggests that its use could lead to more efficient clinical trials.
It should be mentioned that investigators who aim to measure regional joint performance more specifically can still do so using the primary RFTs. For example, a study aiming to assess the impact of lower-limb joint replacement on functional performance may be better off using the walking velocity. Likewise, interventions aimed at increasing upper-limb performance may wish to use the grip strength or button speed instead of the global scale.
As we have pointed out previously , our approach is data-driven. The global functional performance scale is derived after all data collection has been completed. Researchers planning to use the approach we have outlined can define the primary outcome scales in advance of a study (i.e. grip strength, walking velocity, and button speed in the present analysis). Expected effect sizes on the extracted variable can be used to compute statistical power and the needed sample size. As we have found, it is likely that with this approach, the extracted latent variable will exceed the primary scales in performance.
In conclusion, we have used principal component factor analysis to derive a global functional performance scale to measure the functional limitation stage in the process of disablement in RA. The new variable outperforms the primary scales in a number of tests of association and fit with criterion standards, and in response to clinically significant change. This approach may be used to develop latent variables measuring other RA disease components, such as disease activity and damage.
ACR20 (ACR50) = American College of Rheumatology 20% (50%) response criteria
BIC = Bayesian information criterion
MHAQ = modified health assessment questionnaire
RFT = rheumatology function test
SF36PF = short-form 36 physical function scale
Decker JL, McShane DJ, Esdaile JM, Hathaway DE, Levinson JE, Liang MH, Medsger TA, Meenan RF, Mills JA, Roth SH, Wolfe F: Definition of elements pertaining to functional measurement. In Dictionary of the Rheumatic Diseases, Volume 1: Signs and Symptoms. 1982, American College of Rheumatology Glossary Committee. New York: Contact Associates International Ltd, 63-68.
The Research Sub-committee of the Empire Rheumatism Council: Gold therapy in rheumatoid arthritis. Final report of a multicenter controlled trial. Ann Rheum Dis. 1961, 20: 315-334.
Pincus T, Brooks RH, Callahan LF: Reliability of grip strength, walking time and button test performed according to a standard protocol. J Rheumatol. 1991, 18: 997-1000.
Pincus T, Callahan LF: Rheumatology function tests: grip strength, walking time, button test and questionnaires document and predict long term morbidity and mortality in rheumatoid arthritis. J Rheumatol. 1992, 19: 1051-1057.
Escalante A, del Rincón I: The disablement process in rheumatoid arthritis. Arthritis Rheum. 2002, 47: 333-342. 10.1002/art.10418.
World Health Organization: International Classification of Impairments, Disabilities and Handicaps. 1980, Geneva: WHO.
Nagi SZ: Disability concepts revisited: implications for prevention. In Disability in America: Toward a National Agenda for Prevention. Edited by: Pope AM, Tarlov AR. 1991, Washington, DC: Division of Health Promotion and Disease Prevention, Institute of Medicine, National Academy Press, 309-327.
Verbrugge LM, Jette AM: The disablement process. Soc Sci Med. 1994, 38: 1-14. 10.1016/0277-9536(94)90294-1.
Brandt EN, Pope AM, Eds: Enabling America. Assessing the Role of Rehabilitation Science and Engineering. 1997, Washington, DC: National Academy Press.
Escalante A, del Rincón I, Cornell JE: A latent variable approach to measuring physical disability in rheumatoid arthritis. Arthritis Rheum. 2004, 51: 399-407. 10.1002/art.20404.
Arnett FC, Edworthy SM, Bloch DA, McShane DJ, Fries JF, Cooper NS, Healey NA, Kaplan SR, Liang MH, Luthra HS, Medsger TA, Mitchell DM, Neustadt DH, Pinals RS, Schaller JG, Sharp JT, Wilder RL, Hunder GG: The American Rheumatism Association 1987 revised criteria for the classification of rheumatoid arthritis. Arthritis Rheum. 1988, 31: 315-324.
del Rincón I, Battafarano DF, Arroyo RA, Murphy FT, Fischbach M, Escalante A: Ethnic variation in the clinical manifestations of rheumatoid arthritis. Role of HLA-DRB1 alleles. Arthritis Rheum. 2003, 49: 200-208. 10.1002/art.11000.
del Rincón I, Battafarano DF, Arroyo RA, Murphy FT, Escalante A: Heterogeneity between men and women in the influence of the HLA-DRB1 shared epitope on the clinical expression of rheumatoid arthritis. Arthritis Rheum. 2002, 46: 1480-1488. 10.1002/art.10295.
Orces CH, del Rincón I, Abel MP, Escalante A: The number of deformed joints as a surrogate measure of damage in rheumatoid arthritis. Arthritis Rheum. 2002, 47: 67-72. 10.1002/art1.10160.
Escalante A, Galarza-Delgado D, Beardmore TD, Baethge BA, Esquivel-Valerio J, Marines AL, Mingrone M: Cross-cultural adaptation of a brief outcome questionnaire for Spanish-speaking arthritis patients. Arthritis Rheum. 1996, 39: 93-100.
Ware JE: SF-36 Health Survey. Manual and Interpretation Guide. 1993, Boston: Nimrod Press, 321-322.
Pincus T, Summey JA, Soraci SA, Wallston KA, Hummon NP: Assessment of patient satisfaction in activities of daily living using a modified Stanford Health Assessment Questionnaire. Arthritis Rheum. 1983, 26: 1346-1353.
Steinbrocker O, Traeger CH, Batterman RC: Therapeutic criteria for rheumatoid arthritis. JAMA. 1949, 140: 659-666.
Norman GR, Streiner DL: Principal components and factor analysis. In Biostatistics. The Bare Essentials. 2nd edition. Edited by: Norman GR, Streiner DL. 2000, Hamilton, Ontario: BC Decker, Inc, 163-177.
D'Agostino RB, Belanger A, D'Agostino RB: A suggestion for using a powerful and informative test of normality. Am Stat. 1990, 44: 316-321.
Daly LE, Bourke GJ, McGilvray J: Interpretation and uses of medical statistics. 1991, Oxford, UK: Blackwell Scientific Publications.
Meng X-L, Rosenthal R, Rubin DB: Comparing correlated correlation coefficients. Psychol Bull. 1992, 111: 172-175. 10.1037//0033-2909.111.1.172.
Goldstein R: Testing dependent correlation coefficients. Stata Tech Bull Reprints (STB32). 1997, 6: 128-129.
Raftery AE: Bayesian model selection in social research. In Sociological Methodology. Edited by: Marsden PV. 1995, Cambridge, MA: Blackwell, 111-195.
Felson DT, Anderson JJ, Boers M, Bombardier C, Furst D, Goldsmith C, Katz LM, Lightfoot R, Paulus H, Strand V, Tugwell P, Weinblatt M, Williams HJ, Wolfe F, Kieszak S: American College of Rheumatology. Preliminary definition of improvement in rheumatoid arthritis. Arthritis Rheum. 1995, 38: 727-735.
Guyatt G, Walter S, Norman G: Measuring change over time: assessing the usefulness of evaluative instruments. J Chronic Dis. 1987, 40: 171-178. 10.1016/0021-9681(87)90069-5.
Zucchini W: An introduction to model selection. J Math Psychol. 2000, 44: 41-61. 10.1006/jmps.1999.1276.
Crocker L, Algina J: Introduction To Classical and Modern Test Theory. 1986, New York: Holt, Rinehart & Winston.
Smythe HA, Helewa A, Goldsmith CH: "Independent assessor" and "pooled index" as techniques for measuring treatment effects in rheumatoid arthritis. J Rheumatol. 1977, 4: 144-152.
Prevoo MLL, van't Hof MA, Kuper HH, van Leeuwen MA, van de Putte LBA, van Riel PLCM: Modified disease activity scores that include twenty-eight joint counts. Development and validation in a prospective longitudinal study of patients with rheumatoid arthritis. Arthritis Rheum. 1995, 38: 44-48.
Smolen JS, Breedveld FC, Schiff MH, Kalden JR, Emery P, Eberl G, van Riel PL, Tugwell P: A simplified disease activity index for rheumatoid arthritis for use in clinical practice. Rheumatology. 2003, 42: 244-257. 10.1093/rheumatology/keg072.
Pincus T, Strand V, Koch G, Amara I, Crawford B, Wolfe F, Cohen S, Felson D: An index of the three core data set patient questionnaire measures distinguishes efficacy of active treatment from that of placebo as effectively as the American College of Rheumatology 20% response criteria (ACR20) or the Disease Activity Score (DAS) in a rheumatoid arthritis clinical trial. Arthritis Rheum. 2003, 48: 625-630. 10.1002/art.10824.
This research was supported by an Arthritis Investigator Award and a Clinical Science Grant from the Arthritis Foundation; and NIH grants RO1-HD37151, K24-AR47530 and K23-HL004481, and grant M01-RR01346 for the Frederic C Bartter General Clinical Research Center. The authors thank Drs Ramon Arroyo, Daniel Battafarano, Rita Cuevas, Alex de Jesus, Michael Fischbach, John Huff, Rodolfo Molina, Matthew Mosbacker, Frederick Murphy, Carlos Orces, Christopher Parker, Thomas Rennie, Jon Russell, Joel Rutstein, and James Wild, for giving us permission to study their patients.