Genetic polymorphisms in PTPN22, PADI-4, and CTLA-4 and risk for rheumatoid arthritis in two longitudinal cohort studies: evidence of gene-environment interactions with heavy cigarette smoking

Introduction PTPN22, PADI-4, and CTLA-4 have been associated with risk for rheumatoid arthritis (RA). We investigated whether polymorphisms in these genes were associated with RA in Caucasian women included in two large prospective cohorts, adjusting for confounding factors and testing for interactions with smoking.
Methods We studied RA risk associated with PTPN22 (rs2476601), PADI-4 (rs2240340), and CTLA-4 (rs3087243) in the Nurses' Health Study (NHS) and NHSII. Participants in NHS were aged 30 to 55 years at entry in 1976; those in NHSII were aged 25 to 42 years at entry in 1989. We confirmed incident RA cases through to 2002 in NHS and to 2003 in NHSII by questionnaire and medical record review. We excluded reports not confirmed as RA. In a nested case-control design involving participants for whom there were samples for genetic analyses (45% of NHS and 25% of NHSII), each incident RA case was matched to a participant without RA by year of birth, menopausal status, and postmenopausal hormone use. Genotyping was performed using TaqMan single nucleotide polymorphism allelic discrimination on the ABI 7900 HT (Applied Biosystems, 850 Lincoln Centre Drive, Foster City, CA 94404 USA) with published primers. Human leukocyte antigen shared epitope (HLA-SE) genotyping was performed at high resolution. We employed conditional logistic regression analyses, adjusting for smoking and reproductive factors. We tested for additive and multiplicative interactions between each genotype and smoking.
Results A total of 437 incident RA cases were matched to healthy female control individuals. Mean (± standard deviation) age at RA diagnosis was 55 (± 10), 57% of RA cases were rheumatoid factor (RF) positive, and 31% had radiographic erosions at diagnosis. PTPN22 was associated with increased RA risk (pooled odds ratio in multivariable dominant model = 1.46, 95% confidence interval [CI] = 1.02 to 2.08). The risk was stronger for RF-positive than for RF-negative RA. A significant multiplicative interaction between PTPN22 and smoking for more than 10 pack-years was observed (P = 0.04). CTLA-4 and PADI-4 genotypes were not associated with RA risk in the pooled results (pooled odds ratios in multivariable dominant models: 1.27 [95% CI = 0.88 to 1.84] for CTLA-4 and 1.04 [95% CI = 0.77 to 1.40] for PADI-4). No gene-gene interaction was observed between PTPN22 and HLA-SE.
Conclusion After adjusting for smoking and reproductive factors, PTPN22 was associated with RA risk among Caucasian women in these cohorts. We found both additive and multiplicative interactions between PTPN22 and heavy cigarette smoking.

Twin studies conducted in the UK and Finland have estimated that 50% to 60% of the variation in RA susceptibility is accounted for by genetic factors [29], leaving 40% to 50% probably due to environmental exposures. Cigarette smoking is the best established environmental risk factor for RA, with risk increasing in proportion to duration and intensity of exposure [30][31][32][33][34][35]. Case-control studies conducted in Sweden, Holland, and North America have identified an interaction between presence of the HLA-SE alleles and cigarette smoking in determining RA risk, in particular that of anti-CCP-positive RA [36][37][38]. Female reproductive factors such as early age at menarche, irregular menses, and use of postmenopausal hormones have also been related to increased RA risk, and prolonged duration of breast-feeding was found to be protective against development of RA in the Nurses' Health Study (NHS) [39][40][41].
We aimed to validate previous findings of increased risk for RA associated with polymorphisms in the PTPN22, PADI-4 and CTLA-4 genes, and to assess whether behavioral and reproductive factors that are known to be associated with RA risk influence these findings. We also investigated potential additive and multiplicative interactions between each of these polymorphisms and the presence of the HLA-SE. To do this, we conducted a case-control study nested within the NHS and NHSII, two large cohorts of women who were followed closely over many years for behavioral and reproductive factors before the onset of disease.

Study population
The NHS includes a prospective cohort of 121,700 female nurses, aged 30 to 55 years in 1976 when the study began. The NHSII was established in 1989, when 116,608 female nurses aged 25 to 42 years completed a baseline questionnaire about their medical histories and lifestyles. Ninety-four per cent of the NHS participants from 1976 to 2002, and 95% of NHSII participants from 1989 to 2003 have remained in active follow up (5% to 6% no longer respond to questionnaires and have not been confirmed as dead). All aspects of this study were approved by the Partners' HealthCare Institutional Review Board.

Identification of rheumatoid arthritis
As previously described [35], we employed a two-stage procedure to identify and validate incident cases of RA: all nurses who self-reported any connective tissue disease received a screening questionnaire for connective tissue disease symptoms [42] and, if positive, a detailed medical record review for American College of Rheumatology (ACR) classification criteria for RA [43]. The presence or absence of rheumatoid factor (RF) and other features of RA was based on medical record review. Those in whom four of the seven ACR criteria were documented in the medical record were considered to have definite RA. For this nested case-control study, we also included a small number of women (n = 14) with three documented ACR criteria for RA, a diagnosis of RA by their physician, and agreement by two rheumatologists on the diagnosis of RA.

Population for analysis
We excluded prevalent RA cases diagnosed before the cohort was assembled, nonresponders, and women who reported any connective tissue disease that was not subsequently confirmed to be RA by medical record review. Women were censored when they failed to respond to any subsequent biennial questionnaires. Among the women in each cohort who had provided a sample for genetic analyses, each participant with confirmed incident RA was matched by year of birth, menopausal status, and postmenopausal hormone use to a healthy woman in the same cohort without RA. To minimize population stratification, and given that most cohort participants are Caucasian, we limited the analyses to Caucasian matched pairs of women. In 1992 (NHS) and 1989 (NHSII), all participants were asked to provide data concerning their own racial backgrounds in more detailed categories. Of the Caucasian women in NHS and NHSII included in this analysis, 2% reported pure Scandinavian heritage, 15% reported pure Southern European, and 83% reported other or mixed Caucasian backgrounds. There were no significant differences in the distributions of these ethnicities between cases and controls (χ² with two degrees of freedom, P = 0.30).

Blood sampling
From 1989 to 1990, 32,826 (27%) NHS participants aged 43 to 70 years agreed to provide blood samples for future NHS studies. Between 1996 and 1999, 29,613 (25%) of the women included in the NHSII cohort (aged 32 to 52 years at that time) also agreed to have their blood drawn for future investigations. All samples were collected in heparinized tubes and sent to us by overnight courier in chilled containers. On receipt, the blood samples were centrifuged, aliquoted, and stored in liquid nitrogen freezers at -70°F (-57°C). The demographic and exposure characteristics of the NHS and NHSII participants who provided blood samples were found to be very similar to those of the overall cohorts [44,45].

DNA extraction from blood
DNA was extracted from buffy coats from 96 samples in 3 to 4 hours. A volume of 50 μl of buffy coat was diluted with 150 μl phosphate-buffered saline and processed using the QIAamp™ (QIAGEN Inc., Chatsworth, CA, USA) 96-spin blood kit protocol. The protocol entails adding protease, the sample, and lysis buffer to 96-well plates. The plates are then mixed and incubated at 158°F (70°C), before adding ethanol and transferring the samples to columned plates. The columned plates are then centrifuged and washed with buffer. The DNA is then eluted by adding elution buffer and centrifuging. The average yield from 50 μl of buffy coat (based on 1,000 samples) is 5.5 μg with a standard deviation of 2.2 (range 2.0 to 16.4). These methods are semiautomated using a Qiagen 8000 robot to increase throughput and decrease manual pipetting errors.

Buccal cell collection method and DNA extraction in NHS
Forty thousand women in NHS who did not give blood in 1989 to 1990 were asked to give a buccal cell sample in 2002. To date we have collected an additional 21,733 buccal cell samples (18% of the NHS cohort). A collection kit was sent to participants, consisting of instructions for the buccal cell collection and the necessary supplies (a small bottle of mouthwash, a plastic cup with a screwtop cap, a ziplock plastic bag and absorbent sheet, and a stamped, self-addressed bubble envelope), as well as an informed consent form. Participants were instructed to fill the cup with mouthwash, swish the mouthwash in their mouth vigorously, and then spit back into the cup. Returned samples were processed using the Puregene DNA Isolation Kit (Gentra Systems, Minneapolis, MN, USA) to extract genomic DNA from human cheek cells. The extracted DNA was archived in liquid nitrogen freezers using specific tracking software. The average DNA recovery from these specimens measured using PicoGreen was 59 ng/μl.

Whole-genome amplification
For all genomic DNA samples, an aliquot was put through a whole-genome amplification protocol using the GenomiPhi DNA amplification kit (GE Healthcare, Piscataway, NJ) to yield high-quality DNA sufficient for single nucleotide polymorphism (SNP) genotyping.

Single nucleotide polymorphism genotyping
DNA was genotyped using TaqMan SNP allelic discrimination on the ABI 7900 HT (Applied Biosystems, 850 Lincoln Centre Drive, Foster City, CA 94404 USA) using published primers [8,9,18,46]. We studied only the CTLA-4 CT60 (rs3087243) allele. We chose the PADI4_94 allele (rs2240340) of the haplotype first described by Suzuki and coworkers [8], because it had the strongest association in a Japanese population and was replicated in a large meta-analysis [25]. Using the same methods, we also genotyped the lactase gene (rs4988235), which is known to exhibit substantial variation in allele frequency from Northern to Southern Europe, in order to test for population stratification in this nested case-control study [47,48].

HLA-DRB1 shared epitope determination
Low-resolution HLA-DRB1 genotyping was performed by polymerase chain reaction with sequence-specific primers using OLERUP SSP kits (QIAGEN, West Chester, PA, USA). We used primers to amplify DNA samples that contained sequences for HLA-DRB1*04, *01, *10 and *14, along with consensus primers and appropriate positive and negative control samples. For samples with positive two-digit HLA signals, sequence-specific primers were used for high-resolution four-digit shared epitope allele detection of DRB1*0401, *0404, *0405, *0101, *0102, *1402, and *1001. OLERUP SSP computer software (QIAGEN) was used to determine four-digit HLA types. Quality control split samples were included, randomly interspersed with study samples.

Covariate information
Information was collected from the women in both cohorts via biennial questionnaires regarding diseases, lifestyle, and health practices. Age was updated in each cycle. Reproductive covariates were chosen based on our past findings of associations between reproductive factors and risk for developing RA in this cohort [41]. Data on parity, total duration of breast-feeding, menopausal status, and postmenopausal hormone use were selected from the questionnaire cycle before the date of RA diagnosis (or index date in controls). Self-reported menopausal status and age at menopause are highly reproducible in our cohorts; in a validation study of a subsample of NHS participants, 82% of naturally postmenopausal women reported the same age at menopause (within 1 year) on two questionnaires mailed 2 years apart [49].
Participants in both cohorts were asked at baseline whether they were a current smoker or had ever smoked in the past and the age at which they began to smoke. Current smokers were asked for the number of cigarettes typically smoked per day and former smokers reported the age at which they stopped smoking and the number of cigarettes smoked per day before quitting. On each subsequent questionnaire, participants reported whether they currently smoked and the number of cigarettes smoked per day. From these reports, we calculated pack-years of smoking (product of years of smoking and packs of cigarettes per day).
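The pack-year calculation described above can be sketched as follows. This is a minimal illustration, not the cohorts' actual processing code: the function name, the representation of smoking history as a list of periods, and the conversion of 20 cigarettes to one pack are all assumptions introduced here.

```python
CIGARETTES_PER_PACK = 20  # assumption: conventional pack size, not stated in the text


def pack_years(periods):
    """Sum pack-years over smoking periods.

    `periods` is a list of (years_smoked, cigarettes_per_day) tuples,
    a hypothetical representation of the questionnaire reports.
    """
    return sum(years * (cigs / CIGARETTES_PER_PACK) for years, cigs in periods)


# Hypothetical participant: 10 years at 20 cigarettes/day, then 5 years at 10/day.
total = pack_years([(10, 20), (5, 10)])

# Dichotomize at 10 pack-years, the threshold used in the interaction analyses.
heavy_smoker = total >= 10
```

Summing over periods reflects the fact that the number of cigarettes per day was updated on each questionnaire, so intensity need not be constant over a participant's smoking history.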
Other potential confounders examined included body mass index, which was computed for each 2-year time interval as the most recent weight (in kilograms) divided by the square of height (in meters) as reported at baseline. Alcohol intake was reported at least every 4 years and categorized in grams per day. Husband's educational level was assessed in 1992 in NHS and 1999 in NHSII, and was included as a proxy for socioeconomic level.

Statistical analyses
We verified Hardy-Weinberg equilibrium for each of the genotypes among controls in each of the datasets (NHS blood, NHS cheek cells, and NHSII blood). We employed conditional logistic regression analyses, conditioned on matching factors, and adjusted for potential confounders, including cigarette smoking and reproductive factors assessed before diagnosis of RA. All analyses were first conducted separately in each cohort and then on data pooled from the two cohorts. Because the P value for heterogeneity was significant for the CTLA-4 genotype, we also meta-analytically pooled results from the two cohorts using a DerSimonian and Laird random effects model [50]. In analyses stratified by the presence of RF among the RA cases, we employed unconditional logistic regression analyses, adjusting for each of the matching factors, in addition to the covariates above. For analyses of PTPN22, we employed a dominant model because the minor allele frequencies were low (9% in controls and 14% in cases). In analyses involving CTLA-4 and PADI-4, we assessed the risk for RA in dominant, additive, and recessive models.
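To make the random-effects pooling step concrete, the following is a minimal sketch of the DerSimonian and Laird estimator applied to cohort-specific log odds ratios. It is not the SAS code used in the study; the function names are hypothetical, the standard errors are recovered from 95% CIs under a normal approximation, and the two cohort estimates in the usage lines are invented purely for illustration.

```python
import math


def se_from_ci(lower, upper):
    """Approximate the standard error of a log OR from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * 1.96)


def dersimonian_laird(log_ors, std_errs):
    """Pool log odds ratios with the DerSimonian-Laird random-effects method.

    Returns (pooled log OR, its standard error, tau^2, Cochran's Q).
    """
    w = [1.0 / se ** 2 for se in std_errs]                     # fixed-effect weights
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)  # fixed-effect estimate
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    k = len(log_ors)
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)                         # between-cohort variance
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errs]       # random-effects weights
    pooled = sum(wi * y for wi, y in zip(w_star, log_ors)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return pooled, se_pooled, tau2, q


# Hypothetical cohort-specific estimates (OR, lower CI, upper CI); not study results.
cohorts = [(1.5, 1.0, 2.25), (0.9, 0.5, 1.62)]
log_ors = [math.log(o) for o, _, _ in cohorts]
ses = [se_from_ci(lo, hi) for _, lo, hi in cohorts]
pooled, se_pooled, tau2, q = dersimonian_laird(log_ors, ses)
pooled_or = math.exp(pooled)
ci = (math.exp(pooled - 1.96 * se_pooled), math.exp(pooled + 1.96 * se_pooled))
```

When the cohort estimates agree closely, the between-cohort variance is estimated as zero and the result reduces to the fixed-effect (inverse-variance) pooled estimate.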

Gene-environment and gene-gene interactions
We conducted assessments for gene-environment interactions by testing for both additive and multiplicative interactions. For additive interactions, we calculated the attributable proportion due to interaction using a 2 × 2 factorial design to analyze the data [51][52][53]. (There is evidence of interaction when the attributable proportion is not equal to 0.) Ninety-five per cent confidence intervals (CIs) were calculated using the delta method as described by Hosmer and Lemeshow [54].
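As an illustration of the additive interaction measure, the attributable proportion due to interaction (AP) in a 2 × 2 design is commonly computed from the three odds ratios that compare each exposure combination with the doubly unexposed group. The sketch below uses that standard formula, with odds ratios standing in for relative risks; the function name and the example values are hypothetical, and the delta-method confidence interval is not reproduced here.

```python
def attributable_proportion(or_both, or_gene_only, or_smoking_only):
    """Attributable proportion due to interaction (AP).

    All three ORs use the group with neither the risk allele nor the
    smoking exposure as the reference category.
    """
    # Relative excess risk due to interaction (RERI): the excess over additivity.
    reri = or_both - or_gene_only - or_smoking_only + 1.0
    return reri / or_both


# Hypothetical ORs, chosen only to show the arithmetic (not study results).
ap = attributable_proportion(or_both=3.0, or_gene_only=1.2, or_smoking_only=1.3)

# Under exact additivity (OR_both = OR_gene + OR_smoking - 1), AP is 0.
ap_null = attributable_proportion(or_both=1.5, or_gene_only=1.2, or_smoking_only=1.3)
```

An AP above 0 indicates that the joint effect exceeds the sum of the separate effects, which is the sense in which a nonzero AP is read as evidence of additive interaction.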
We tested for multiplicative interaction using an interaction variable (for example, gene × smoking) in the conditional logistic regression models. The significance of the interaction was determined using the Wald χ² test of the interaction variable. In the combined NHS-NHSII nested case-control study dataset, we assessed for interactions between the presence of each polymorphism and cigarette smoking categorized both as ever/never, and then dichotomized as ≤10 or ≥10 pack-years of smoking, because this is the threshold we previously identified to be associated with increased risk for RA [35].
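The role of the interaction variable can be illustrated with a short sketch. In a logistic model with terms for gene, smoking, and gene × smoking, the exponentiated interaction coefficient is the ratio of the smoking OR in carriers to the smoking OR in noncarriers. The function name below is hypothetical, and the coefficients are back-calculated from the stratum-specific ORs for heavy smoking reported in the Results (1.22 in PTPN22 CC women, 2.50 in T allele carriers) purely as an illustration; they are not the fitted model coefficients.

```python
import math


def stratum_odds_ratios(beta_smoking, beta_interaction):
    """ORs for smoking in noncarriers and in carriers of the risk allele.

    With a gene x smoking product term, the log OR for smoking is
    beta_smoking in noncarriers and beta_smoking + beta_interaction in carriers.
    """
    or_noncarriers = math.exp(beta_smoking)
    or_carriers = math.exp(beta_smoking + beta_interaction)
    return or_noncarriers, or_carriers


# Illustrative coefficients implied by the reported stratum-specific ORs.
beta_smoking = math.log(1.22)
beta_interaction = math.log(2.50 / 1.22)

or_noncarriers, or_carriers = stratum_odds_ratios(beta_smoking, beta_interaction)

# Ratio of ORs: values above 1 mean smoking has a stronger effect in carriers.
interaction_ratio = math.exp(beta_interaction)
```

On these numbers the ratio of odds ratios is about 2.05. The Wald test asks whether the interaction coefficient differs from 0, that is, whether this ratio differs from 1.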
Using similar methods, we tested for gene-gene interaction between PTPN22 and HLA-SE in influencing RA susceptibility in analyses limited to NHS and NHSII blood samples. SAS version 9.1 (SAS Institute, Cary, NC, USA) was used for all analyses.

Results
A total of 437 pairs of Caucasian women, each containing one woman with incident RA and her matched control, were included in these analyses, after removing 18 women because of missing data for all genotypes examined. The characteristics of the RA cases at diagnosis in each of the two cohorts are shown in Table 1. The cases in the NHS had a mean (± standard deviation) age of 57 years (± 9), as compared with 43 (± 5) in the younger NHSII cohort, because of the different ages targeted for enrollment in the two cohorts. Otherwise, the cases were similar in terms of the prevalence of RF, erosions, nodules, and proportion diagnosed by a member of the ACR. All cases and controls in these analyses were Caucasian, and the mean (± standard deviation) number of ACR criteria for the classification of RA was 5 (± 1) [43]. Table 2 shows the characteristics of the RA cases and matched controls at the time of RA diagnosis (or index date for the controls). A higher proportion of RA cases and controls were postmenopausal at RA diagnosis in the NHS than in the NHSII cohort, but the proportions of premenopausal and postmenopausal women among cases and controls were similar in each of the cohorts, as were the proportions currently receiving postmenopausal hormones. In NHSII a slightly higher percentage of women with RA were parous as compared with their matched controls (94% and 86%), but this was not true in the NHS cohort (91% of RA cases and 95% of controls). Among women in the NHSII with RA, a higher proportion had husbands who were college educated as compared with their matched controls (39% compared to 18%), but this was not true in the NHS cohort (20% in each group). No significant differences in allele frequencies of the lactase gene (rs4988235) in cases compared with controls were found. This argues strongly against any significant population stratification in our samples.
The genotype and allele frequencies of the RA cases and controls for the three candidate genotypes are shown in Table 3. None of the PTPN22, CTLA-4, or PADI-4 genotype distributions deviated from Hardy-Weinberg equilibrium, either in each cohort or in the combined dataset. Overall, genotyping call rates were 97.5% for PTPN22, 96.4% for CTLA-4, 97.9% for PADI-4, and 98.7% for HLA-SE. The frequency of the T allele of the PTPN22 polymorphism was significantly higher among RA cases than among controls (χ² with one degree of freedom, P = 0.001 for pooled NHS and NHSII cohorts). The mutant alleles were not statistically associated with RA case status for the other two genotypes, namely PADI-4 and CTLA-4. As expected, HLA-SE alleles were highly significantly associated with risk for RA. (A slightly higher frequency of NHS cheek cell DNA samples could not be HLA genotyped: 3% of cases and controls, as compared with 0% to 2% of NHS and NHSII case and control DNA samples from blood.)
Table 4 includes the results of conditional logistic regression analyses of risk for RA associated with each of the genotypes, performed separately in each cohort, and then on pooled data. The final multivariable model includes pack-years of cigarette smoking, age at menarche, regularity of menses, parity, and total duration of breast-feeding. Further adjustment for body mass index, alcohol intake, husband's educational level, and
To pursue potential associations of these polymorphisms with different RA phenotypes, we conducted analyses stratified by RF positivity, because many risk factors, including HLA-SE and cigarette smoking, have been shown to be more strongly associated with RF-seropositive RA [35,36]. Results of these analyses are shown in Table 5. The effect of the PTPN22 polymorphism was seen primarily for the development of RF-seropositive RA (OR = 1.75 [95% CI = 1.18 to 2.59]).
Cigarette smoking is a strong environmental risk factor for the development of RA, in particular RF-positive RA, and amount and duration are associated with increased risk [35]. We thus investigated potential interactions between the three polymorphisms of interest and the amount and duration of cigarette smoking at the time of RA diagnosis. Table 6 presents the results of analyses in which we tested for both multiplicative and additive interactions between smoking, categorized as ever/never smoking and then dichotomized as ≤10 or ≥10 pack-years of smoking, and each of the genotypes. Among those with the CC genotype of PTPN22, a modest effect of heavy smoking was observed (OR = 1.22 [95% CI = 0.81 to 1.83]). However, among those with the PTPN22 T risk allele, the effect of heavy smoking was much more pronounced (OR = 2.50 [95% CI = 1.25 to 5.00]). We observed significant additive and multiplicative gene-environment interactions between heavy cigarette smoking and the presence of the PTPN22 T allele (additive interaction: P = 0.0006; multiplicative interaction: P = 0.04). When smoking was dichotomized as never/ever, there was marginal evidence for additive but not multiplicative interaction. We also tested for genotype-smoking interactions in RF-positive and RF-negative RA cases separately. In stratified analyses, we found significant additive but not multiplicative interactions between the PTPN22 risk allele and heavy smoking for both seropositive and seronegative RA. We did not observe similar gene-smoking interactions for CTLA-4 or PADI-4, for the overall risk for RA, or for RF-positive or RF-negative RA separately. No additive or multiplicative interactions were observed between PTPN22 and HLA-SE (Table 7).
(Given potential difficulties with HLA-SE genotyping of NHS cheek cell DNA samples, we performed sensitivity analyses with these samples excluded, and the interaction analyses yielded similar and nonsignificant findings.)
Table footnote: The analysis was unconditional logistic regression adjusting for year of birth, pack-year smoking, parity, breast-feeding, menstrual irregularity, age at menarche, menopausal status and postmenopausal hormone use. Values are expressed as odds ratio (95% confidence interval). CI, confidence interval; NHS, Nurses' Health Study; OR, odds ratio; RA, rheumatoid arthritis; RF, rheumatoid factor.
Table footnote: Values are expressed as odds ratio (95% confidence interval). (a) Conditional logistic regression, adjusting for parity, breast-feeding, menstrual irregularity, age at menarche, menopausal status and postmenopausal hormone use. (b) Unconditional logistic regression adjusting for year of birth, parity, breast-feeding, menstrual irregularity, age at menarche, menopausal status and postmenopausal hormone use. (c) RF-positive RA cases and all controls. (d) RF-negative RA cases and all controls. (e) P add is the P value for attributable proportion (AP), one of the indices of additive interaction between the binary smoking variable and binary PTPN22 genotype. (f) P multi is the P value for the multiplicative interaction term between the binary smoking variable and binary PTPN22 genotype, with one degree of freedom. NHS, Nurses' Health Study; RA, rheumatoid arthritis; RF, rheumatoid factor.

Discussion
In these two cohorts of women followed prospectively for the development of RA and for multiple potential environmental exposures, we have confirmed that the R620W polymorphism in the PTPN22 gene is associated with increased risk for RA. We did not confirm that the PADI-4 (rs2240340) or the CTLA-4 (rs3087243) polymorphism was associated with increased risk for RA or for RF-positive RA in this population. We did not find that cigarette smoking, parity, total duration of breast-feeding, age at menarche, regularity of menses, menopausal status, or postmenopausal hormone use (all associated with risk for RA in past studies) were important confounders of the relationships between these genotypes and RA. However, we did uncover a significant multiplicative gene-environment interaction between heavy smoking and PTPN22 in determining RA risk.
The C→T polymorphism at position 1858 of the PTPN22 gene interferes with the function of the PTPN22/Csk complex, which is an important inhibitor of T-cell signaling, hindering its ability to suppress T-cell activation [9,26,55]. Similar to past reports, we have found the elevated risk to be primarily for RF-positive disease [9,25][56][57][58]. Several reports and a meta-analysis have suggested that those with the PTPN22 risk allele have more severe disease [25,57]. We have confirmed that a significant association exists after adjustment for potential confounders, including smoking and reproductive factors. We also found a significant multiplicative interaction between heavy cigarette smoking (≥10 pack-years) and the presence of the PTPN22 risk allele, with a threefold elevated odds of developing RA in the presence of both factors.
Kallberg and colleagues [38] recently explored potential gene-environment and gene-gene interactions in RA susceptibility, combining data from three large RA cohort studies. The results of their study are slightly different from ours, in that they did not find a significant interaction between the presence of the PTPN22 polymorphism and smoking in determining RA risk. Their gene-smoking interaction analyses used data from the Swedish Environmental Investigations in RA incident RA cohort, in which participants were asked to recall past smoking and were classified as ever or never smokers. Using the detailed prospective data regarding smoking amount and duration available for NHS and NHSII participants, we demonstrated a multiplicative interaction between the presence of the PTPN22 risk allele and heavy cigarette smoking of ≥10 pack-years in this female cohort. In past studies, we have found that the risk for RA was significantly elevated with ≥10 pack-years [35]. Our results now suggest that it may be necessary to exceed a threshold of heavy smoking to trigger a biologic pathway in RA pathogenesis involving the PTPN22 gene. Both HLA-SE and PTPN22 primarily affect the risk for RF-positive and anti-CCP-positive RA [25,36][59][60][61].
In the case of HLA-SE, it is hypothesized that cigarette smoking leads to inflammation and citrullination of certain peptides, which, when presented within the context of HLA-DR4 molecules, are specifically recognized, contributing to anti-citrulline autoimmunity [37]. The newly described interaction between PTPN22 and heavy cigarette smoking suggests that the smoking/citrullination/T-cell recognition and activation pathway in RA pathogenesis may be influenced by both PTPN22 and HLA-SE.
The CTLA-4 gene is an attractive candidate gene for RA susceptibility, given the role played by CTLA-4 (cytotoxic T-lymphocyte associated 4) in T-cell activation and that a CTLA-4-IgG1 fusion protein is very effective in treating RA [62]. The CT60 polymorphism was associated with a modest increase in RA risk in the NHS cohort alone, and not in the NHSII cohort or pooled results, possibly because of a lack of sufficient power to detect a small elevation in risk (with OR in the order of 1.2) reported in other studies [25]. In a post hoc power calculation, for this CTLA-4 genotype with a risk allele frequency of 0.56 among controls and a two-sided type I error rate of 0.05, we had 71% power to detect an effect of 40% or greater (OR = 1.4).
The enzyme peptidylarginine deiminase-4, responsible for the citrullination of peptides to which anti-CCP antibodies are formed, is encoded by the PADI-4 gene. We were unable to detect an effect of this polymorphism on the risk for RA in these cohorts of women, and this could reflect inadequate power to detect a risk estimate of that magnitude. Given that the allele frequencies in the controls were similar in each of the cohorts to that reported in the literature, the significant P value for heterogeneity across the cohorts we observed was probably due to small sample size.
Limitations of this study that should be noted include the fact that, in the NHS and NHSII cohorts, the presence or absence of RF in the blood among RA cases was confirmed by medical record review at diagnosis, and thus not assayed at the same laboratory, and was not assayed in controls. Rheumatoid nodules and radiographic erosions are likewise documented at the time of diagnosis from thorough medical record review, but cohort participants have not been followed longitudinally for RA disease activity or complications. Similarly, we have limited data in the medical record on antibodies to CCP among the cases, which is important in the subphenotyping of RA [64], because the dates of diagnosis for most of the RA cases in this cohort preceded the clinical use of anti-CCP. Further analysis by anti-CCP status could be potentially informative.
Although all participants included in this analysis were of self-reported Caucasian ancestry, potential population stratification, or confounding by ethnicity, still exists, in particular if the inclusion of individuals of Northern compared with Southern European origin varied between cases and controls [48,65,66]. We assessed the potential for this bias in two ways. First, we examined and did not find significant differences in the more precise racial backgrounds reported by the Caucasian women included as cases or controls in these analyses. Second, we genotyped the lactase gene, which is known to exhibit substantial variation in allele frequency from Northern to Southern Europe [47,48], and found no significant differences in allele frequencies between cases and controls. A recent whole-genome association study investigating breast cancer risk alleles [67] found no evidence of population stratification among self-reported Caucasian women in the NHS cohort.
This study is unique in that the participants were followed for many years, in great detail, before the onset of their RA, and environmental and reproductive risk factors for RA have been well studied in this cohort [35,41]. This has allowed the investigation of possible gene-environment interactions between each of these recently described polymorphisms and known and suspected RA risk factors assessed prospectively, such as cigarette smoking and menopausal status.

Conclusion
Our data confirm that the PTPN22 R620W polymorphism is a strong risk factor for RF-positive RA, and that presence of this polymorphism interacts with heavy cigarette smoking in a multiplicative manner. These findings contribute to the growing understanding of how genetic and environmental factors interact in RA pathogenesis, and suggest that heavy cigarette smoking and PTPN22 may be acting in a similar mechanistic pathway.