- Research article
- Open Access
Discovery of a novel genetic susceptibility locus on X chromosome for systemic lupus erythematosus
Arthritis Research & Therapy volume 17, Article number: 349 (2015)
Systemic lupus erythematosus (SLE) is an autoimmune connective tissue disease that predominantly affects females. To discover additional genetic risk variants for SLE on the X chromosome, we performed a follow-up study of our previously published genome-wide association study (GWAS) dataset.
Twelve single nucleotide polymorphisms (SNPs) within novel or unpublished loci with P < 1.00 × 10⁻² were selected for genotyping in a total of 2,442 cases and 2,798 controls (1,156 cases and 2,330 controls from central China, 1,012 cases and 335 controls from southern China, and 274 cases and 133 controls from northern China) using the Sequenom MassArray system. Association analyses were performed using logistic regression with sample region as a covariate in PLINK 1.07.
Combined analysis of the discovery and central validation datasets identified a novel locus, rs5914778 within LINC01420, associated with SLE at genome-wide significance (P = 1.00 × 10⁻⁸; odds ratio (OR) = 1.32). We also confirmed rs5914778 in the southern Chinese sample cohort (P = 5.31 × 10⁻⁵; OR = 1.51), and meta-analysis of the samples from the discovery, central, and southern validation cohorts provided robust evidence for the association of rs5914778 (P = 5.26 × 10⁻¹²; OR = 1.35). However, this SNP did not show association with SLE in the northern sample (P = 0.33). Further analysis indicated that the association in the northern cohort was significantly heterogeneous compared with that in the central and southern cohorts.
Our study increases the number of established susceptibility loci for SLE in the Chinese Han population and further demonstrates the important role of X-linked genetic risk variants in the pathogenesis of SLE in this population.
Systemic lupus erythematosus (SLE) is a systemic autoimmune disease that predominantly affects women of child-bearing age (15–40 years). SLE has been estimated to affect 31–70 per 100,000 people in China, with a female-to-male patient ratio of 9:1. It is well known that both genetic and environmental factors contribute to disease susceptibility [2–6]. Numerous variants at autosomal loci have been found to be associated with SLE in multiple ethnic groups through candidate gene and genome-wide association studies (GWASs). Despite the advances in genetic studies over recent years, the pathogenesis of SLE remains poorly understood.
Because of the marked sex difference in disease prevalence, involvement of genetic variants on the X chromosome has long been suspected. In recent years, genetic variants of several genes on the X chromosome, such as MECP2, IRAK1, TLR7, and PRPS2, have been confirmed to be associated with SLE [8–11]. In particular, single nucleotide polymorphism (SNP) rs3853839 in the 3′ untranslated region (UTR) of TLR7 was shown to be associated with SLE, more strongly in Chinese and Japanese male subjects than in females, and a fine mapping study by Kaufman et al. in four different ancestral groups suggested that the nonsynonymous SNP rs1059702 (S196F) within IRAK1 might be a causal risk variant for SLE. More recently, we performed a meta-analysis of GWASs in Chinese Han populations and followed up the top findings in four additional Asian cohorts. Besides confirming the previously reported associations within the IRAK1-MECP2 (rs1059702) and L1CAM-MECP2 loci, we also identified a genetic variant (rs7062536) in PRPS2 on Xp22.3 as a novel susceptibility locus and novel independent associations within the NAA10 (rs2070028) and TMEM187 (rs17422) loci.
In this study, aiming to discover additional X-linked genetic risk variants for SLE, we performed a follow-up study of our previously published GWAS dataset by improving the coverage of genetic variation through imputation and validating the top findings in three additional independent Chinese Han sample collections. We discovered a novel susceptibility locus, LINC01420 on Xp11.21, associated with SLE.
SLE cases and controls were all female and were recruited from multiple hospitals in three geographic regions of China (central, southern, and northern China). All subjects were of self-reported Chinese Han origin. Samples in the GWAS discovery stage (1017 SLE cases and 539 controls) were recruited from central China. Samples in the replication studies were recruited from multiple regions in China, mainly from central (replication: 1156 cases and 2330 controls), southern (replication: 1012 cases and 335 controls), and northern (replication: 274 cases and 133 controls) China. All cases were diagnosed by at least two experienced physicians using the American College of Rheumatology (ACR) criteria revised in 1997. Controls were geographically and ethnically matched and clinically evaluated to be without SLE, autoimmune disorders, or family history of autoimmune diseases. Clinical information for all patients and controls was collected through a structured questionnaire. Written informed consent was acquired from all participants. This study was approved by the Institutional Ethical Committee of The First Affiliated Hospital of Anhui Medical University, China–Japan Friendship Hospital, Jiangmen Central Hospital, and The Third Affiliated Hospital of Sun Yat-Sen University, according to Declaration of Helsinki principles. The information for all subjects is summarized in Table 1.
Genotyping in the discovery stage for the central China cohort was performed using the Illumina Human610-Quad BeadChip array (Illumina, Inc., San Diego, CA, USA). Genomic DNA was isolated from peripheral blood mononuclear cells (PBMCs) with standard procedures using FlexiGene DNA kits (QIAGEN GmbH, Hilden, Germany) and was diluted to working concentrations of 50 ng/μl for genome-wide genotyping and 15–20 ng/μl for the validation study. The X chromosome SNPs in the validation stage were genotyped using the Sequenom MassArray iPlex Gold platform (Sequenom, Inc., San Diego, CA, USA).
Quality control criteria were applied to genotyped SNPs, and those with minor allele frequency (MAF) < 5 % in cases and controls were excluded. SNPs with a genotype missing rate > 10 % or Hardy–Weinberg equilibrium (HWE) P < 3.14 × 10⁻⁶ in controls were also excluded. Association analysis was performed in PLINK v1.07 using the logistic regression test. We selected 12 SNPs within novel or unpublished loci with P < 1.00 × 10⁻² for further validation in 2442 cases and 2798 controls (SNP missing rate < 10 % and HWE for female controls with P > 1.00 × 10⁻²).
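The genotyped-SNP filters described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `hwe_chi2_pvalue` and `passes_qc` are hypothetical, the HWE test is a 1-df chi-square approximation (PLINK's own HWE test may differ), and treating a SNP as failing when its MAF is below 5 % in either cases or controls is an assumption, since the text does not say whether both groups must fall below the threshold.

```python
import math

def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) HWE test from genotype counts (an approximation)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square with 1 df equals erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))

def passes_qc(maf_cases, maf_controls, missing_rate, control_counts,
              maf_min=0.05, missing_max=0.10, hwe_p_min=3.14e-6):
    """Apply the MAF, missingness, and HWE thresholds quoted in the text.

    control_counts is (n_AA, n_AB, n_BB) among female controls.
    Assumption: a SNP fails if MAF is below maf_min in either group.
    """
    if min(maf_cases, maf_controls) < maf_min:
        return False
    if missing_rate > missing_max:
        return False
    return hwe_chi2_pvalue(*control_counts) >= hwe_p_min

# Hypothetical SNP: common, well genotyped, controls in HWE.
print(passes_qc(0.21, 0.18, 0.02, (25, 50, 25)))
```

Because all subjects in this study are female, every control carries two X chromosome alleles, so the usual diploid HWE test applies to these SNPs.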
To control the impact of population stratification in the validation and combined analysis, we matched cases and controls by ethnic and geographic origin within each independent validation sample. Fixed-effects meta-analysis of the four independent studies (the discovery GWAS and the three validation cohorts: central, southern, and northern) was performed using the inverse-variance-weighted effect size method in Metasoft version 2.0.0.
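For readers unfamiliar with the inverse-variance-weighted method, the calculation can be sketched as below. This is a generic illustration, not Metasoft's code: the function name `fixed_effects_meta` and the input values are hypothetical, and each study is assumed to contribute a log odds ratio (beta) with its standard error.

```python
import math

def fixed_effects_meta(betas, ses):
    """Inverse-variance-weighted fixed-effects meta-analysis.

    betas: per-study log odds ratios; ses: their standard errors.
    Returns pooled beta, pooled SE, z score, and two-sided P value.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_w = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_w
    se = math.sqrt(1.0 / total_w)
    z = beta / se
    # Two-sided normal P value; erfc keeps precision for very small P.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p

# Hypothetical per-cohort estimates (not taken from the paper's tables).
betas = [math.log(1.30), math.log(1.35), math.log(1.50), math.log(0.85)]
ses = [0.09, 0.06, 0.10, 0.17]
beta, se, z, p = fixed_effects_meta(betas, ses)
print(f"pooled OR = {math.exp(beta):.2f}, P = {p:.2e}")
```

Studies with smaller standard errors (typically the larger cohorts) receive more weight, which is why a small cohort with an opposite effect changes the pooled estimate only modestly.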
We performed the combined analysis of the central region (both discovery and central validation) cohort, southern validation cohort, and northern validation cohort using fixed-effects meta-analysis. The I² heterogeneity statistic shows the heterogeneity across studies, with I² < 50 and P_het > 0.05 considered insignificant (Table 2).
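The heterogeneity statistics mentioned above are conventionally derived from Cochran's Q. The sketch below assumes that convention; the source does not spell out the formula, and the helper names `chi2_sf` and `heterogeneity` are hypothetical. The chi-square tail probability is computed with a closed-form series for integer degrees of freedom so that no third-party library is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series.
    total = 0.0
    term = math.sqrt(2.0 * x / math.pi)
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total

def heterogeneity(betas, ses):
    """Return Cochran's Q, I^2 (in percent), and the heterogeneity P value."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2, chi2_sf(q, df)

# Hypothetical inputs: three concordant cohorts and one discordant cohort.
q, i2, p_het = heterogeneity([0.26, 0.30, 0.41, -0.16], [0.09, 0.06, 0.10, 0.17])
print(f"Q = {q:.2f}, I2 = {i2:.1f}, P_het = {p_het:.3f}")
```

As a rough consistency check, assuming four cohorts (3 degrees of freedom), an I² of 65.3 corresponds to Q of about 8.65, for which `chi2_sf` gives about 0.034, in line with the P_het reported for the joint analysis.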
Imputation of the X chromosome SNPs was performed on the discovery dataset for female individuals using X chromosome nonpseudoautosomal region data from the 1000 Genomes project (phase 1 integrated version 3) as reference. As part of the quality control, SNPs with accuracy score < 0.8, missing rate > 10 %, MAF < 5 % in cases and controls, or HWE P < 2.89 × 10⁻⁷ in controls were excluded. Association was tested by logistic regression. The imputed SNPs showed no substantial improvement in association signal over the genotyped SNPs (Fig. 1), and no imputed SNP showed a better P value that would warrant further validation on top of the genotyped SNPs. Therefore, we proceeded with the validation of the selected genotyped SNPs that resided in novel regions.
X chromosome discovery and first-stage study
We conducted X chromosome association tests of SLE in the GWAS dataset which consists of 1017 cases and 539 controls, after stringent quality control filtering (see Statistical analyses). The discovery analysis revealed strong evidence of association for all previously identified susceptibility loci on the X chromosome and suggested additional novel risk loci (Additional file 1: Table S1).
To further investigate the observed associations, we imputed the genotypes of additional SNPs that were not genotyped using IMPUTE (v2.0) (Oxf, Oxford, Oxon, UK). After stringent quality control filtering (imputation), no imputed SNPs showed better P values that would warrant further validation on top of the genotyped SNPs. Therefore, to validate the findings from the discovery analysis, we selected the top SNPs from 14 independent new loci with suggestive association with SLE (P < 10⁻²) for a follow-up analysis in an additional 1156 cases and 2330 controls of Chinese Han descent from central China. Of the 12 successfully genotyped SNPs, two showed association at P < 0.05 in the validation samples and six showed consistent effects between the discovery and validation samples. The meta-analysis results for the 12 SNPs in the combined discovery and central validation dataset totaling 2173 cases and 2869 controls, using fixed-effects and random-effects models, are presented in Table 3. The combined analysis identified a novel locus, rs5914778 within LINC01420, associated with SLE at genome-wide significance (P = 1.00 × 10⁻⁸; odds ratio (OR) = 1.32).
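When only an odds ratio and a P value are reported, as for the combined result above, an approximate log odds ratio and standard error can be recovered for use in further meta-analysis. The sketch below assumes the P value is two-sided and comes from a normal (Wald-type) test statistic, which the text does not state explicitly; the function name `beta_se_from_or_p` is hypothetical, and the output is only as precise as the rounded published values.

```python
import math
from statistics import NormalDist

def beta_se_from_or_p(odds_ratio, p_value):
    """Approximate log-OR, its standard error, and |z| from an OR and two-sided P."""
    beta = math.log(odds_ratio)
    # |z| such that the two-sided normal tail probability equals p_value.
    z = -NormalDist().inv_cdf(p_value / 2.0)
    se = abs(beta) / z
    return beta, se, z

# Values reported for rs5914778 in the combined discovery and central validation analysis.
beta, se, z = beta_se_from_or_p(1.32, 1.00e-8)
print(f"beta = {beta:.4f}, SE = {se:.4f}, |z| = {z:.2f}")
```

The recovered standard error can also be turned into an approximate 95 % confidence interval for the odds ratio via exp(beta ± 1.96 × SE).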
Further replication of selected SNPs and the heterogeneity test
We performed further replication analysis of rs5914778 in two additional independent samples of Chinese Han descent from the southern and northern regions of China. The replication in the southern Chinese sample cohort, consisting of a total of 1012 cases and 335 controls, provided strong supporting evidence for the association of rs5914778 with SLE (P = 5.31 × 10⁻⁵; OR = 1.51). The meta-analysis of the samples from the central and southern regions, totaling 3185 cases and 3204 controls, provided robust evidence for the association of rs5914778 (P = 5.26 × 10⁻¹²; OR = 1.35). In addition, the strength of the association was consistent, without any evidence of heterogeneity (P_het = 0.46, I² = 0) (Table 2).
Intriguingly, this SNP did not show association with SLE in the northern sample with a total of 274 cases and 133 controls (P = 0.33, OR = 0.85), and the SNP actually showed an opposite effect in the northern sample as compared with the central and southern samples (Table 2). This could be because of the very small sample size of the northern replication cohort. Further studies are needed to confirm the heterogeneity of this association between the northern and central/southern Chinese populations.
Lastly, we performed a joint analysis of all of the discovery, central, southern, and northern validation samples, totaling 3459 cases and 3337 controls, using a fixed-effects meta-analysis. The association at rs5914778 (LINC01420) on Xp11.21 surpassed genome-wide significance (P = 1.22 × 10⁻¹⁰; OR = 1.31), but moderate heterogeneity of association was observed across the samples (P_het = 0.034, I² = 65.3) (Table 2 and Fig. 2).
Through the discovery and validation analyses in two independent female samples from the central region of China, we have discovered a novel SLE susceptibility locus at rs5914778 (LINC01420) on Xp11.21 at genome-wide significance. Further replication analysis in the independent sample of southern Chinese confirmed the association with strong evidence. The analysis of the independent sample of northern Chinese failed to replicate the association, but the sample size of the northern cohort is very small.
rs5914778 is located within a long intronic region between the first and second exons of LINC01420. LINC01420 is a long noncoding RNA with enhancers marked by histone modifications in human umbilical vein endothelial cells (HUVEC) and HSMM based on HaploReg annotation (Additional file 2: Table S2). LINC01420 was found to have sex-specific DNase I hypersensitivity patterns, with H3K4me3 histone enrichment and strong expression in females only. According to the regulatory annotation from the ENCODE project, this SNP lies within a DNase I hypersensitive site detected in a lymphoblastoid cell line. LINC01420 may help maintain X inactivation, which prevents X-linked gene overexpression through dosage compensation in females. In recent years, long noncoding RNAs have been shown to be associated with many complex diseases, such as psoriasis, breast cancer, gastric cancer, colorectal cancer, osteosarcoma, adrenocortical cancer, and cardiovascular diseases [22–28]. Some noncoding RNAs also play a role in the pathogenesis and progression of hepatocellular carcinoma and may act as therapeutic targets for it. To determine whether LINC01420 expression differs between females and males, we performed gene expression analyses using data from CD4+ T cells and monocytes from 461 healthy individuals and from PBMCs of 82 controls in GEO datasets. However, we did not obtain expression results for LINC01420, indicating that its expression might be too low to be detected in blood cells from healthy individuals. Hence, more work will be needed to elucidate the biological mechanism through which LINC01420 influences SLE pathogenesis.
We also observed another SNP in this locus, rs5913992, in perfect linkage disequilibrium with our top SNP rs5914778 (r² = 1), that was predicted to be functional by RegulomeDB (LSJU, Stanford, CA, USA) with a score of 2b (likely to affect binding of motifs, transcription factors, and enhancer histone marks). rs5913992 also lies within overlapping binding sites of six transcription factors (RELA, CTCF, CEBPB, RAD21, ZNF143, and SMC3) that were detected by ChIP-Seq analysis in lymphoblastoid, epithelial, endothelial, breast cancer, and myeloid leukemia cell lines (Fig. 3). The prediction by RegulomeDB indicates that this SNP overlaps a potential consensus EWSR1-FLI1 binding motif within the binding sites of these six transcription factors (Additional file 3: Table S3).
We observed the same risk effect at rs5914778 in the central and southern validation results, while the opposite effect was observed in the northern validation results. The association in the northern cohort was significantly heterogeneous compared with that in the central and southern cohorts (P_het = 0.034, I² = 65.3). Several previous studies have demonstrated differences in disease risk between northern and southern Chinese, and further studies in more northern Chinese samples will be needed to confirm the genetic heterogeneity of this susceptibility locus among the central, southern, and northern Chinese populations [33–35].
We performed a three-stage X chromosome association analysis of SLE in the Chinese Han population and discovered a novel susceptibility locus on Xp11.21. Although further studies will be required to understand how the locus influences the etiology of SLE, the discovery of this novel locus has further expanded the role of the X chromosome in the development of SLE in the Chinese Han population.
ACR: American College of Rheumatology
GWAS: Genome-wide association study
HUVEC: Human umbilical vein endothelial cells
MAF: Minor allele frequency
PBMC: Peripheral blood mononuclear cell
SLE: Systemic lupus erythematosus
SNP: Single nucleotide polymorphism
Zeng QY, Chen R, Darmawan J, Xiao ZY, Chen SB, Wigley R, et al. Rheumatic diseases in China. Arthritis Res Ther. 2008;10:R17.
Harley IT, Kaufman KM, Langefeld CD, Harley JB, Kelly JA. Genetic susceptibility to SLE: new insights from fine mapping and genome-wide association studies. Nat Rev Genet. 2009;10:285–90.
Patel DR, Richardson BC. Epigenetic mechanisms in lupus. Curr Opin Rheumatol. 2010;22:478–82.
Kaiser R, Criswell LA. Genetics research in systemic lupus erythematosus for clinicians: methodology, progress, and controversies. Curr Opin Rheumatol. 2010;22:119–25.
Moser KL, Kelly JA, Lessard CJ, Harley JB. Recent insights into the genetic basis of systemic lupus erythematosus. Genes Immun. 2009;10:373–9.
Deng Y, Tsao BP. Genetic susceptibility to systemic lupus erythematosus in the genomic era. Nat Rev Rheumatol. 2010;6:683–92.
Cui Y, Sheng Y, Zhang X. Genetic susceptibility to SLE: recent progress from GWAS. J Autoimmun. 2013;41:25–33.
Jacob CO, Zhu J, Armstrong DL, Yan M, Han J, Zhou XJ, et al. Identification of IRAK1 as a risk gene with critical role in the pathogenesis of systemic lupus erythematosus. Proc Natl Acad Sci U S A. 2009;106:6256–61.
Sawalha AH, Webb R, Han S, Kelly JA, Kaufman KM, Kimberly RP, et al. Common variants within MECP2 confer risk of systemic lupus erythematosus. PLoS One. 2008;3:e1727.
Shen N, Fu Q, Deng Y, Qian X, Zhao J, Kaufman KM, et al. Sex-specific association of X-linked Toll-like receptor 7 (TLR7) with male systemic lupus erythematosus. Proc Natl Acad Sci U S A. 2010;107:15838–43.
Kaufman KM, Zhao J, Kelly JA, Hughes T, Adler A, Sanchez E, et al. Fine mapping of Xq28: both MECP2 and IRAK1 contribute to risk for systemic lupus erythematosus in multiple ancestral groups. Ann Rheum Dis. 2013;72:437–44.
Zhang Y, Zhang J, Yang J, Wang Y, Zhang L, Zuo X, et al. Meta-analysis of GWAS on two Chinese populations followed by replication identifies novel genetic variants on the X chromosome associated with systemic lupus erythematosus. Hum Mol Genet. 2014;24:274–80.
Han JW, Zheng HF, Cui Y, Sun LD, Ye DQ, Hu Z, et al. Genome-wide association study in a Chinese Han population identifies nine new susceptibility loci for systemic lupus erythematosus. Nat Genet. 2009;41:1234–7.
Hochberg MC. Updating the American College of Rheumatology revised criteria for the classification of systemic lupus erythematosus. Arthritis Rheum. 1997;40:1725.
Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81:559–75.
Han B, Eskin E. Interpreting meta-analyses of genome-wide association studies. PLoS Genet. 2012;8:e1002555.
Abecasis GR, Auton A, Brooks LD, DePristo MA, Durbin RM, Handsaker RE, et al. An integrated map of genetic variation from 1,092 human genomes. Nature. 2012;491:56–65.
Ward LD, Kellis M. HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants. Nucleic Acids Res. 2012;40(Database issue):D930–4.
Joo JE, Novakovic B, Cruickshank M, Doyle LW, Craig JM, Saffery R. Human active X-specific DNA methylation events showing stability across time and tissues. Eur J Hum Genet. 2014;22:1376–81.
ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74.
Brooks WH, Renaudineau Y. Epigenetics and autoimmune diseases: The X chromosome-nucleolus nexus. Front Genet. 2015;6:22.
Tsoi LC, Iyer MK, Stuart PE, Swindell WR, Gudjonsson JE, Tejasvi T, et al. Analysis of long non-coding RNAs highlights tissue-specific expression patterns and epigenetic profiles in normal and psoriatic skin. Genome Biol. 2015;16:24.
Shi Y, Li J, Liu Y, Ding J, Fan Y, Tian Y, et al. The long noncoding RNA SPRY4-IT1 increases the proliferation of human breast cancer cells by upregulating ZNF703 expression. Mol Cancer. 2015;14:51.
Pan W, Liu L, Wei J, Ge Y, Zhang J, Chen H, et al. A functional lncRNA HOTAIR genetic variant contributes to gastric cancer susceptibility. Mol Carcinog. 2015. doi:10.1002/mc.22261.
Chu H, Xia L, Qiu X, Gu D, Zhu L, Jin J, et al. Genetic variants in noncoding PIWI-interacting RNA and colorectal cancer risk. Cancer. 2015;121:2044–52.
Wang B, Su Y, Yang Q, Lv D, Zhang W, Tang K, et al. Overexpression of long non-coding RNA HOTAIR promotes tumor growth and metastasis in human osteosarcoma. Mol Cells. 2015;38:432–40.
Glover AR, Zhao JT, Ip JC, Lee JC, Robinson BG, Gill AJ, et al. Long noncoding RNA profiles of adrenocortical cancer can be used to predict recurrence. Endocr Relat Cancer. 2015;22:99–109.
Uchida S, Dimmeler S. Long noncoding RNAs in cardiovascular diseases. Circ Res. 2015;116:737–50.
George J, Patel T. Noncoding RNA as therapeutic targets for hepatocellular carcinoma. Semin Liver Dis. 2015;35:63–74.
Raj T, Rothamel K, Mostafavi S, Ye C, Lee MN, Replogle JM, et al. Polarization of the effects of autoimmune and neurodegenerative risk alleles in leukocytes. Science. 2014;344:519–23.
Kong SW, Collins CD, Shimizu-Motohashi Y, Holm IA, Campbell MG, Lee IH, et al. Characteristics and predictive value of blood transcriptome signature in males with autism spectrum disorders. PLoS One. 2012;7:e49475.
Boyle AP, Hong EL, Hariharan M, Cheng Y, Schaub MA, Kasowski M, et al. Annotation of functional variation in personal genomes using RegulomeDB. Genome Res. 2012;22:1790–7.
Yao YG, Kong QP, Bandelt HJ, Kivisild T, Zhang YP. Phylogeographic differentiation of mitochondrial DNA in Han Chinese. Am J Hum Genet. 2002;70:635–51.
Yao YG, Kong QP, Man XY, Bandelt HJ, Zhang YP. Reconstructing the evolutionary history of China: a caveat about inferences drawn from ancient DNA. Mol Biol Evol. 2003;20:214–9.
Yao YG, Kong QP, Wang CY, Zhu CL, Zhang YP. Different matrilineal contributions to genetic structure of ethnic groups in the Silk Road region in China. Mol Biol Evol. 2004;21:2265–80.
This study was approved by the respective institutional Ethical Committee at each institution and according to Declaration of Helsinki principles. Patient consent was also obtained.
The authors thank all study participants and all of the volunteers who have willingly participated in this study. This study was supported by grants from the National Key Basic Research Program of China (2014CB541901, 2012CB722404, 2011CB512103), the International (Regional) Cooperation and Exchanges major project (81320108016), the National Natural Science Foundation of China (81573033, 81171505, 81402590, 81371722), the Research Project of the Chinese Ministry of Education (213018A), the Program for New Century Excellent Talents in University (NCET-12-0600), and the Natural Science Fund of Anhui province (1408085MKL27).
The authors declare that they have no competing interests.
JJL shared senior author. XJZ, YCu, WLY, and JJL conceived of this study, obtained financial support, participated in the study design, and revised the manuscript. ZWZ, ZYL, HL, CY, and LLW were responsible for sample selection, genotyping, and project management, and drafted the manuscript. ZML, YJShe, YL, LY, YYC, YCh, LL, LLY, YJShi, CBS, HYT, LDS, and SY conducted sample selection, undertook recruitment, collected clinic data, managed recruitment, obtained biological samples, and helped to revise the manuscript. JZ, BL, YTD, and YZ were responsible for sample management and DNA extraction, and helped to draft the manuscript. HL, YJShe, XDZ, XYY, and XBZ conducted data management, undertook related data handling and calculation, and revised the manuscript. FSZ and JZ performed genotyping analysis and helped to draft the manuscript. HL, YJShe, XDZ, XYY, XBZ, HYT, and J-XB performed data processing and statistical analysis, and revised the manuscript. All authors read and approved the final manuscript.
Zhengwei Zhu, Zhuoyuan Liang, Herty Liany, Chao Yang and Leilei Wen contributed equally to this work.
Additional file 1: Table S1.
Presenting association results for the X chromosome SLE GWAS cohort. (DOC 241 kb)
Additional file 2: Table S2.
Presenting HaploReg annotation for rs5914778 (Query SNP: rs5914778 and variants with r² ≥ 0.8). (DOC 46 kb)
Additional file 3: Table S3.
Presenting the motifs predicted to be affected by rs5913992 SNP (Regulome DB). (DOC 171 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Cite this article
Zhu, Z., Liang, Z., Liany, H. et al. Discovery of a novel genetic susceptibility locus on X chromosome for systemic lupus erythematosus. Arthritis Res Ther 17, 349 (2015). https://doi.org/10.1186/s13075-015-0857-1
- Systemic lupus erythematosus
- X chromosome
- Single nucleotide polymorphisms