A genome-wide association study follow-up suggests a possible role for PPARG in systemic sclerosis susceptibility

Introduction A recent genome-wide association study (GWAS) comprising a French cohort of systemic sclerosis (SSc) reported several non-HLA single-nucleotide polymorphisms (SNPs) showing a nominal association in the discovery phase. We aimed to identify previously overlooked susceptibility variants by using a follow-up strategy. Methods Sixty-six non-HLA SNPs showing a P value <10-4 in the discovery phase of the French SSc GWAS were analyzed in the first step of this study, performing a meta-analysis that combined data from the two published SSc GWASs. A total of 2,921 SSc patients and 6,963 healthy controls were included in this first phase. Two SNPs, PPARG rs310746 and CHRNA9 rs6832151, were selected for genotyping in the replication cohort (1,068 SSc patients and 6,762 healthy controls) based on the results of the first step. Genotyping was performed by using TaqMan SNP genotyping assays. Results We observed nominal associations for both PPARG rs310746 (PMH = 1.90 × 10-6, OR, 1.28) and CHRNA9 rs6832151 (PMH = 4.30 × 10-6, OR, 1.17) genetic variants with SSc in the first step of our study. In the replication phase, we observed a trend of association for PPARG rs310746 (P value = 0.066; OR, 1.17). The combined overall Mantel-Haenszel meta-analysis of all the cohorts included in the present study revealed that PPARG rs310746 remained associated with SSc with a nominal non-genome-wide significant P value (PMH = 5.00 × 10-7; OR, 1.25). No evidence of association was observed for CHRNA9 rs6832151 either in the replication phase or in the overall pooled analysis. Conclusion Our results suggest a role of PPARG gene in the development of SSc.


Introduction
Systemic sclerosis (SSc) is a complex autoimmune disease with heterogeneous clinical manifestations characterized by extensive fibrosis in the skin and multiple internal organs, vascular damage, and immune imbalance with autoantibody production [1]. SSc patients are commonly classified in two major subtypes: limited cutaneous SSc (lcSSc) and diffuse cutaneous SSc (dcSSc), the latter with more progressive fibrosis of the skin, lungs, and other internal organs and, ultimately, with worse prognosis [2].
The etiology of this disorder is still unclear. However, epidemiologic and genetic studies clearly reflect the existence of a complex genetic component together with the influence of environmental factors [1]. During recent years, great advances have been made in our knowledge of the genetic basis of SSc [3,4], in part, thanks to the two independent genome-wide associations studies (GWASs) conducted in Caucasian populations that have been recently published [5,6], and several consequent follow-up studies [7][8][9][10].
However, despite these advances, the number of currently known loci explaining the genetic component of SSc is limited. To date, 13 loci have been identified as genetic risk factors for SSc at the genome-wide significance level. In other autoimmune diseases with multifactorial inheritance, such as Crohn disease, ulcerative colitis, or systemic lupus erythematosus, individual GWAS scans and follow-up meta-analyses have identified more than 71, 47, and 35 susceptibility loci, respectively [11][12][13]. Therefore, it is expected that additional risk factors for SSc remain to be discovered, and further meta-analyses and large replication studies are needed to identify part of the missing heritability of this disease.
Follow-up studies focused on the so-called grey zone of the GWASs, where SNPs with tier 2 associations (P values between 5 × 10 -8 and 5 × 10 -3 ) are located, constitute one of the most useful GWAS data-mining methods, because possible real association signals could be masked in that area because of a lack of statistical power. On this basis, we aimed to perform a follow-up study of the SNPs located in the grey zone of the GWAS by Allanore et al. [6], taking advantage of our GWAS data sets. We hypothesize that using a larger cohort would increase the statistical power and might lead to the identification of new suitable SSc genetic risk factors.

Study design
In the first step of this study, we focused on the 90 GWAS-genotyped SNPs that reached a P value < 10 -4 in the discovery phase of the GWAS carried out by Allanore et al. [6]. Then, we analyzed the SNPs overlapping with those included in Radstake et al. [5]. After excluding those SNPs located within MHC genes or in previously associated loci, data for 66 SNPs were selected. A metaanalysis including these 66 SNPs was performed on the combined data set from the two SSc GWASs, showing only two SNPs (rs310746 PPARG and rs6832151 CHRNA9 genetic variants) with a P value < 10 -5 (see later). These two genetic variants were genotyped in independent replication cohorts. Finally, we performed a metaanalysis for these two selected SNPs combining genotype data from both first and replication steps.

Study population
The first step of the study comprised a total of 2,921 SSc patients and 6,963 healthy controls of Caucasian ancestry from two previously published GWASs (European, USA, and French) [5,6]. The replication cohort was composed of 1,068 SSc patients and 1,490 healthy controls from two case-control sets of European ancestry (Italy and United Kingdom). We also included 5,272 extra English controls from The Wellcome Trust Case Control Consortium for the replication step comprising a total of 6,762 controls for this stage.
All SSc patients fulfilled the classification criteria by LeRoy et al. [2]. Approval from the local ethical committees (

Genotyping
In the first stage, genotype data for the 66 selected SNPs were obtained from both published SSc GWASs [5,6]. QC filters and principal component analysis were applied to the GWASs data, as described in Radstake et al. [5] and Allanore et al. [6].
In the replication phase, DNA from patients and controls was obtained by using standard methods. Genotyping was performed by using TaqMan 5′ allele discrimination

Statistical analysis
Association analyses of the genotype data was carried out with StatsDirect V.2.6.6 (StatsDirect, Altrincham, UK) and PLINK V.1.07 [14] software. Statistical significance was calculated by 2 × 2 contingency tables and χ 2 or Fisher Exact test, when necessary, to obtain P values, odds ratios (ORs), and 95% confidence intervals (CIs) in the population-specific analyses. Mantel-Haenszel tests under fixed effects or random effects, when appropriate, were performed to meta-analyze the combined data. Breslow-Day method (BD) was used to assess the homogeneity of the associations among the different populations (Breslow-Day P values <0.05 were considered statistically significant). Hardy-Weinberg equilibrium (HWE) was tested for all cohorts (HWE P values lower than 0.01 were considered to show significant deviation from the equilibrium). None of the included cohorts showed significant deviation from HWE for the two genotyped SNPs. Since the analyses were performed by using GWAS data, the statistical threshold for considering a P value as a significant P value in the allelic association analyses was set at 5 × 10 -8 . The statistical power of the combined analysis was 70% for the PPARG rs310746 and 100% for the CHRNA9 rs6832151 to detect associations with OR = 1.3 and a statistical significance of 5 × 10 -8 , according to Power Calculator for Genetic Studies 2006 software [15]. Table 1 shows the results of the 66 GWAS-genotyped SNPs selected for the combined meta-analysis of the two GWAS data sets performed in the first step of this study (see Additional file 1: Table S1 provides the results from both GWASs and the combined meta-analysis for the 66 selected SNPs). Two SNPs showed a P value lower than 10 -5 (PPARG rs310746: P MH = 1.90 × 10 -6 ; OR, 1.28; CI,  ) showing homogeneity in the ORs among populations. Therefore, these two SNPs were selected to genotype in independent cohorts. Patients and healthy controls were found to be in HWE at 1% significance level for both selected SNPs.

Results
In the replication phase, we observed a trend of association for the PPARG rs310746 genetic variant (P value = 0.066; OR = 1.17; CI 95%, 0.99 to 1.38) in the combined analysis of the two replication cohorts ( Table 2, upper rows). However, no evidence of association was observed for CHRNA9 rs6832151 either in the pooled analysis ( Table 2, upper rows) or in the analysis of each individual population (see Additional file 2: Table S2).

Discussion
In this study we conducted a meta-analysis combining previously published SSc GWASs data for 66 SNPs and analyzed the possible role of two selected SNPs, PPARG rs310746 and CHRNA9 rs6832151, in SSc risk by using independent replication cohorts.
Meta-analyses are a useful tool to increase the statistical power of genetic studies, thus improving the accuracy of the estimations of statistical significance. Of  note, associations identified from a single GWAS often tend to have inflated effect sizes [16]. On this basis, our data suggest that most signals from the grey zone observed in the discovery phase of the GWAS by Allanore et al. [6] presented inflated effect sizes, also called the winner's curse. In fact, this effect was already observed in the replication study conducted by our group for the novel SSc genetic risk factors identified by Allanore et al. [6], in which we could not replicate the association described for RHOB [17]. Our overall combined meta-analysis showed that the association of the PPARG rs310746 genetic variant with SSc remained with a nominal but non-genome-wide significant P value. This SNP is located upstream of PPARG, which encodes the peroxisome proliferator-activated receptor gamma (PPARG). PPARG was initially identified in adipose tissue, where this nuclear receptor plays important roles in adipogenesis, insulin sensitivity, and homeostasis [18]. Interestingly, during recent years, several studies have identified a novel role of PPARG as an antifibrotic effector. Thus, it has been reported that fibroblasts exposure to pharmacologic PPARG ligands give rise to suppression of collagen synthesis, myofibroblast differentiation, and other TGF-β-induced fibrotic responses in vitro [19][20][21]. Moreover, functional studies showed that PPARG agonist attenuated dermal fibrosis in mice with bleomycin-induced scleroderma [22,23].
These findings are remarkable in SSc, in which fibrosis is one of the main hallmarks of the disease. In this regard, Wei et al. [24] demonstrated that PPARG expression and function are impaired in SSc patients. Therefore, defects in PPARG expression may influence the uncontrolled progression of fibrosis in SSc. In addition, PPARG has been associated with other autoimmune diseases, such as inflammatory bowel disease [25,26] and psoriatic arthritis [27], and it is also a confirmed susceptibility locus in type 2 diabetes mellitus [28].
Although PPARG was the most likely biologic candidate gene for the reported suggestive association signal, we could not rule out TIMP4 as another possible gene for this signal. Further analyses are required to elucidate the functional implication of the reported signal.
Regarding the CHRNA9 genetic variant, despite the suggestive association found in the first step of the present study, the overall combined meta-analysis did not show evidence of association with SSc. Moreover, the effect size of the analyzed genetic variant was heterogeneous between the different populations. Although our data showed heterogeneity and lack of association in this locus, a slight or modest effect of CHRNA9 cannot be ruled out, and further studies will be required to determine whether this region is associated with SSc.
It is worth mentioning that the analyzed CHRNA9 SNP has been previously associated with Graves disease (first, through a GWAS performed in the Chinese Han population [29], and subsequently, in a replication study performed in a Polish Caucasian population [30]), but this is the only reported association between this gene and an autoimmune disease.