Contribution for new genetic markers of rheumatoid arthritis activity and severity: sequencing of the tumor necrosis factor-alpha gene promoter

The objective of this study was to assess whether clinical measures of rheumatoid arthritis activity and severity were influenced by tumor necrosis factor-alpha (TNF-α) promoter genotype/haplotype markers. Each patient's disease activity was assessed by the disease activity score using 28 joint counts (DAS28) and functional capacity by the Health Assessment Questionnaire (HAQ) score. Systemic manifestations, radiological damage evaluated by the Sharp/van der Heijde (SvdH) score, disease-modifying anti-rheumatic drug use, joint surgeries, and work disability were also assessed. The promoter region of the TNF-α gene, between nucleotides -1,318 and +49, was sequenced using an automated platform. Five hundred fifty-four patients were evaluated and genotyped for 10 single-nucleotide polymorphism (SNP) markers, but 5 of these markers were excluded due to failure to fall within Hardy-Weinberg equilibrium or to monomorphism. Patients with more than 10 years of disease duration (DD) presented significant associations between the -857 SNP and systemic manifestations, as well as joint surgeries. Associations were also found between the -308 SNP and work disability in patients with more than 2 years of DD and radiological damage in patients with less than 10 years of DD. A borderline effect was found between the -238 SNP and HAQ score and radiological damage in patients with 2 to 10 years of DD. An association was also found between haplotypes and the SvdH score for those with more than 10 years of DD. An association was found between some TNF-α promoter SNPs and systemic manifestations, radiological progression, HAQ score, work disability, and joint surgeries, particularly in some classes of DD and between haplotypes and radiological progression for those with more than 10 years of DD.


Introduction
Tumor necrosis factor-alpha (TNF-α) has been shown to be relevant for the physiopathology of rheumatoid arthritis (RA), and its inhibition by anti-TNF-α antibodies or recombinant soluble receptors results in a major improvement of this disease [1]. On the other hand, TNF-α production shows a wide variation, with high-and low-producer phenotypes present in humans [2] but with a strong concordance in monozygotic twins [3], pointing to the influence of genetic variation on the regulation of TNF-α circulating levels. These arguments have favored the view that genetic factors controlling TNF-α could have a major impact on RA outcome. An extensive network of gene products is involved in the production, modulation, and decay of TNF-α, affecting the stabilization of the transcripts, full activation of membrane-bound TNF-α by proteases, and the interaction with its membrane receptors and with membrane-shedded receptors [4]. In addition, the gene itself and/ or its promoter area could be the source of genetic variation. However, present knowledge suggests that the highest genetic variability is concentrated in the promoter area of the TNF-α gene, where at least eight different single-nucleotide polymorphisms (SNPs) are concentrated, with the potential to affect the binding of transcriptor factors and thus to control the activity of the promoter and resulting mRNA and protein levels [4].
Several studies have addressed the issue of TNF-α gene promoter SNPs and RA outcome. Although some contradictory results have emerged, the data published so far indicate the possible existence of TNF-α gene promoter variants that act as markers for disease severity and response to treatment in RA [5][6][7]. Nevertheless, further investigation is necessary to determine whether the previously identified TNF-α gene promoter variants contribute directly to RA outcome or act as genetic markers of other polymorphisms in the TNF-α gene promoter area.
In this study, we have analyzed the promoter region of the TNF-α gene, between nucleotides -1,318 and +49, of 554 patients with RA from 11 Portuguese rheumatology centers serving the mainland and Azores and Madeira Islands. An association was found between some SNPs, localized in the -238, -308, -857, and -863 positions, and systemic manifestations, functional status, radiological damage, work disability, and joint surgeries and between haplotypes and radiological damage for those with more than 10 years of disease duration (DD).

Patients
Patients included in this study (n = 554) fulfilled the American College of Rheumatology (ACR) 1987 revised criteria for RA [8]. Research was carried out in compliance with the Declaration of Helsinki. Written informed consent was obtained from all patients, and all of the ethics committees of the participating hospitals approved the study. Patients were randomly selected and evaluated at Santa Maria Hospital (Lisbon), Egas Moniz Hospital (Lisbon), Coimbra University Hospital (Coimbra), Faro Hospital (Faro), Nossa Senhora da Assunção Hospital (Seia), Caldas da Rainha Hospital (Caldas da Rainha), the Portuguese Institute of Rheumatology (Lisbon), Infante D. Pedro Hospital (Aveiro), Garcia de Orta Hospital (Almada), Divino Espírito Santo Hospital (Ponta Delgada), and Funchal Central Hospital (Funchal). For every patient included in this study, detailed data were collected in a separate clinical record [9]: DD, age of onset, rheumatoid factor (RF), erythrocyte sedimentation rate (ESR) and C-reactive protein at the time of evaluation, the number of previous disease-modifying anti-rheumatic drugs (DMARDs), the use of anti-TNF-α treatments, the dose of prednisolone, previous joint surgeries due to inflammatory destructive arthropathy directly related to RA (total joint replacement and arthrodesis), the number of years of education, and work disability (defined as the legal incapacity to work as judged by an official medical committee and directly attributed to the consequences of RA). Patients were considered to have systemic manifestations if at least one of the following clinical features could be detected: subcutaneous nodules, pulmonary fibrosis confirmed by chest roentgenograms and lung function tests, echocardiographic evidence of pericardial effusion, pleural effusion shown by chest roentgenograms, Felty syndrome (less than 2 × 10 9 /l granulocytes and splenomegaly), cutaneous vasculitis (leukocytoclastic vasculitis histologically proved), non-compressive neuropathy confirmed by electromyography, or the diagnosis of Sjögren syndrome based on the clinical symptoms of dry eyes and dry mouth (confirmed by a positive Schirmer's test and/or keratoconjunctivitis sicca with involvement of salivary glands documented by positive lip biopsy and/or salivary scintigraphy). Simple x-rays of the hands and feet were analyzed with the Sharp/van der Heijde (SvdH) method [10]. Disease activity was evaluated according to the core disease activity parameters proposed by the ACR and the European League Against Rheumatism: number of swollen and tender joints, pain as evaluated by the patient in a 10-cm analogue scale, disease activity as evaluated by the patient and the physician in a 10-cm analogue scale, ESR, and Health Assessment Questionnaire (HAQ) score [11]. The disease activity score using 28 joint counts (DAS28) [12] was calculated. The presence of other RA cases in the family was recorded if confirmed by a rheumatologist.
DNA direct sequence analysis was carried out with a BigDye ® Terminator version 3.1 Cycle Sequencing Kit (Applied Biosystems, Foster City, CA, USA) according to the manufacturer's instructions. A 300-ng aliquot of DNA template and the internal overlap primers F2: 5'-TGTGACCACAGCAATGGG-TAGG-3', F3: 5'-CCAAACACAGGCCTCAGGACTC-3', and R5: 5'-GAAAGCTGAGTCCTTGAGGGAG-3' were used. Sequencing reactions were carried out with an initial step of 1 minute at 96°C, followed by one cycle of 96°C for 10 seconds and 60°C for 4 minutes (25 times). Purification of sequencing reactions was performed by ethanol precipitation. The purified reaction mixture was analyzed on a four-capillary automated sequencer ABI PRISMR 3100-Avant Genetic Analyzer (Applied Biosystems) with Sequencing Analysis Software version 5.2 (Applied Biosystems).

Statistical analysis
Multiple linear regression models were used to assess the association between the five polymorphisms and DAS28 and HAQ score. For systemic manifestations, work disability, joint surgeries, and RF, logistic regression models were used instead. All the analyses were adjusted for the effects of sociodemographic and clinical covariates. Except for RF, the models were stratified by DD (less than 2 years, 2 to 10 years, and more than 10 years). In regression models, SNPs were separated into two groups: more prevalent homozygous versus heterozygous (aggregated with the less prevalent homozygous).
SvdH score is known to be strongly associated with DD. To estimate the distribution of the SvdH score in the three different age groups, a hierarchical Bayesian model was built [13]. For each group, the SvdH score was considered to follow a gamma distribution with a shape parameter equal to 3 and scale parameter β i , i = 1, 2, 3, with i representing the groups. These β i values were supposed to be independent a priori with a non-informative distribution. The group distribution was considered to be categorical with probabilities π 1 , π 2 , and π 3 following a prior Dirichlet distribution. These estimated probability curves (predictive conditional densities of the score) for the scores are essentially descriptive and should not be used for classification purposes. These curves are expected, one at a time as the SvdH score value increases, to exhibit higher values than the other two curves. The points at which the curves intersect can be seen as cutoff points. As can be seen in Figure 1, two cutoff points are expected to be found. The cutoff points found are 85 and 115, meaning that SvdH score values below 85 are more likely to occur in the first group, values between 85 and 115 are more likely to occur in the second group, and values above 115 are more likely to occur in the third group. Conversely, for classification purposes, the SvdH score of a particular patient can be used to calculate the conditional probability of his or her belonging to a given group and, once this has been done for each group, to classify the patient in the group for which the conditional probability is highest. These calculations take into account the way the patients distribute along the groups, namely the weight of each group in the sample. The predictive probability of the group as a function of the SvdH score ( Figure 2a) showed that patients with less than 2 years and those with 2 to 10 years of DD could not be separated into two groups by any SvdH score cutoff point. The method classifies them all as belonging to the second group. However, an SvdH score close to 90 points could distinguish two groups: patients with less than 10 years and those with more than 10 years of DD. The patients were then separated into these two groups and distributions were recalculated. The value of 110 points was found to be a sensible choice for the cutoff point for the SvdH score ( Figure  2b). This cutoff point was confirmed using the cross-validation procedure by dividing the available dataset into an experimental group (2/3 of the data) and a test group (1/3). This approach allowed us to classify the patient as having a 'good prognosis,' as having an 'expected evolution of the disease,' or as having a 'bad prognosis.' Table 1 shows that this cutoff point classified 67.4% of the patients as having had a disease course as theoretically expected (long DD and high SvdH score or short DD and low SvdH score), 20.6% as having had a disease course milder than theoretically expected, possibly revealing a subset of patients with good prognosis (long DD and low SvdH score), and 12.0% as having had a disease course worse than theoretically expected, possibly depicting a subset of patients with bad prognosis (short DD and high SvdH score). Generalized linear models with gamma distribution and logarithmic link function were used to study the association between genotypes and SvdH score, also adjusted for the effects of socio-demographic and clinical covariates.
All SNP analyses, including conformance with Hardy-Weinberg equilibrium, linkage disequilibrium, and haplotype frequency estimation, were performed using Haploview version 3.32 [14]. Conformance with Hardy-Weinberg equilibrium was computed using an exact test [14], pairwise standardized disequilibrium coefficient measures (D') were calculated between each marker, and haplotype frequencies were estimated using an accelerated expectation-maximization (EM) algorithm. Association of haplotypes with covariates was assessed by χ2 test. The public domain software R (2006, R Development Core Team) [15] was used to process the regression analysis.

Patient characteristics
The 491 patients who were effectively included in the analysis had a mean age of 57 ± 13.3 years, 85% were women, mean DD was 12.7 ± 10.5 years, RF was detected in the serum of 319 patients (65%), systemic manifestations were present in 124 patients (25%), the mean DAS28 was 4.1 ± 1.5, the mean HAQ score was 1.2 ± 0.8, the mean modified SvdH score was 117.2 ± 77.8, 113 patients (23%) had been P value of χ 2 test is less than 0.001.

Figure 1
Estimated distribution of the Sharp/van der Heijde (SvdH) score according to the duration of the disease (DD) Estimated distribution of the Sharp/van der Heijde (SvdH) score according to the duration of the disease (DD). submitted to joint surgeries, and 282 (57%) were not working due to RA. The patients had been previously treated with 2.3 ± 1.6 DMARDs. The characteristics of the RA population studied were similar to the disease pattern previously described in Portugal [16] ( Table 2).

Effect of genetic variants on disease activity and outcome measures
A total of 554 patients were evaluated and genotyped for 10 SNP markers, and five of these markers were excluded due to failure to fall within Hardy-Weinberg equilibrium or to monomorphism. For the remaining five markers (positions -1,036, -863, -857, -308, and -238), haplotypes were constructed using an accelerated EM algorithm implemented in the Haploview software. The subsequent analysis was carried out in a subgroup of 491 Caucasian patients with at least one of the five SNPs identified. In the analysis of the five SNPs (Table 3), patients with more than 10 years of DD presented significant associations between the -857 SNP and systemic manifestations (p < 0.05), as well as joint surgeries (p < 0.01). Associations were also found between the -308 SNP and work disability in patients with 2 to 10 years (p < 0.05) and those with more than 10 years (p < 0.01) of DD. DAS28 was weakly associated (p < 0.10) with the -863 SNP for those with a DD of less than 2 years and with the -308 SNP for patients with from 2 to 10 years of DD.
A borderline effect was found between the -238 SNP and HAQ score in patients with 2 to 10 years of DD (p = 0.052), and a weak association (p < 0.10) was detected between the -863, -857, and -238 SNPs and RF. Due to a lack of data in some of the covariates, the models for systemic manifestations, work disability, and joint surgery applied to the group of patients with less than 2 years of DD were not able to make an accurate estimation of the association (Table 3).
Using the previously described model for x-ray analysis, we found an association between the -308 SNP and SvdH score in patients with less than 10 years of DD (p < 0.05). For those with more than 10 years of DD, a weak association between the -238 SNP and SvdH score was detected (p < 0.10). Given that combinations of allelic variants at each of the five markers may provide valuable information regarding functional variation in terms of TNF-α transcription, we estimated haplotype frequencies for each particular subgroup of patients and evaluated its association with clinical covariates in addition to analyzing the association of individual SNPs with clinical covariates (Table 4). This analysis revealed an association between haplotypes and the SvdH score for those with more than 10 years of DD (Table 4).

Discussion
An association was found between some SNPs (-857, -308, and -238) and systemic manifestations, radiological progression, work disability, and joint surgeries. A weak association was also observed between other SNPs (-863 and -238) and DAS28, HAQ score, and RF, particularly in some categories of DD. In addition, an association between haplotypes and radiological progression for those with more than 10 years of DD was detected.
Previous reports regarding the -308 position are contradictory, and different works suggest that both the -308GG [17] and -308GA [18] genotypes are implicated in increased RA severity. In line with this controversy, several independent groups using reporter gene constructs have shown that the -308A allelic variation presents a higher transcriptional activity than the -308G form [19][20][21], although other studies were unable to show any difference between the transcriptional activity of -308G and -308A alleles [22,23]. Nevertheless, more recent data suggest a direct functional effect of the -308 SNP by modifying the binding of transcription factors [24]. On the other hand, the -857CC genotype appears to be associated with a worse response to etanercept in patients with RA [25], and studies in healthy controls have shown higher production of TNF-α in whole-blood cultures stimulated with lipopolysaccharide in individuals homozygous for the -857C allele [26]. In addition, the -857T (but not the -857C) allele strongly binds the transcription factor OCT1, which blocks the interaction of nuclear factor-kappa-B (NF-κB) to the nearby region -873 to -863, thereby inhibiting TNF-α transcription   [27]. To conclude, the data gathered so far favor a possible influence of the -308 and the -857 TNF-α promoter positions on the production of TNF-α, consistent with the association that we found with RA long-term outcome measures.
Another nearby SNP, at position -863, is involved in NF-κB binding, and a report has suggested that the rare -863A allele is associated with a lower transcriptional activity [27], which can be inferred as an indication of possibly higher disease activity and worse prognosis in patients with the -863CC genotype. This interpretation is consistent with the association trend that we have depicted between this genotype and RF, a well-known RA prognostic factor.
One of the most studied TNF-α gene polymorphisms is the one in position -238 (G→A). Different authors have associated both the -238A and -238G allelic forms to high TNF-α production but with clearly contrasting results [28,29]. A significant number of variables may contribute to this apparent contradiction, including differences in cell line types, the length of the promoter sequence, and the presence/absence of the 3' untranslated region [30]. Regardless of whether there is a direct functional effect of this SNP, some studies have shown an association of the -238GG genotype with worse RA prognosis [31,32], so the trend that our study has shown for an association of this genotype with HAQ score and RF is not surprising.

Conclusion
Although genetic influences on RA outcome remain incompletely understood, our results suggest that TNF-α gene promoter polymorphisms influence the outcome of this chronic disease. Despite this evidence, the value of genotyping RA patients in order to define their clinical course will remain unproven until a proper prospective evaluation of this cohort of patients validates this hypothesis.