Integrative blood-derived epigenetic and transcriptomic analysis reveals the potential regulatory role of DNA methylation in ankylosing spondylitis

Background The currently known risk loci could explain a small proportion of the heritability of ankylosing spondylitis (AS). Epigenetics might account for the missing heritability. We aimed to seek more novel AS-associated DNA methylation alterations and delineate the regulatory effect of DNA methylation and gene expression with integrated analysis of methylome and transcriptome. Methods Epigenome-wide DNA methylation and mRNA expression were profiled in peripheral blood mononuclear cells (PBMCs) from 45 individuals (AS: health controls (HCs) = 30:15) with high-throughput array. The methylome was validated in an independent cohort (AS: HCs = 12:12). Pearson correlation analysis and causal inference tests (CIT) were conducted to determine potentially causative regulatory effects of methylation on mRNA expression. Results A total of 4794 differentially methylated positions (DMPs) were identified associated with AS, 2526 DMPs of which were validated in an independent cohort. Both cohorts highlighted T cell receptor (TCR) signaling and Th17 differentiation pathways. Besides, AS patients manifested increased DNA methylation variability. The methylation levels of 158 DMPs were correlated with the mRNA expression levels of 112 genes, which formed interconnected network concentrated on Th17 cell differentiation and TCR signaling pathway (LCK, FYN, CD3G, TCF7, ZAP70, CXCL12, and PLCG1). We also identified several cis-acting DNA methylation and gene expression changes associated with AS risk, which might regulate the cellular mechanisms underlying AS. Conclusions Our studies outlined the landscapes of epi-signatures of AS and several methylation-gene expression-AS regulatory axis and highlighted the Th17 cell differentiation and TCR signaling pathway, which might provide innovative molecular targets for therapeutic interventions for AS. Supplementary Information The online version contains supplementary material available at 10.1186/s13075-021-02697-3.

(IL)-23/IL-17 signaling pathway [1]. However, most of the AS susceptibility loci are undefined and located in non-coding regions [2]. Despite the numerous diseaseassociated variants identified, they cumulatively explain only a small proportion (< 28%) of the heritability of the disease. The "missing heritability" has been proposed to come from several mechanisms, including unknown gene-gene and gene-environmental interactions, rarer genetic variants, copy number variation, or epigenetics [3].
Epigenetics refers to functional modifications to DNA without sequence alteration, which is heritable from one cell cycle to the next, potentially reversible, and is largely responsible for the cell-specific expression of genes [4]. The modifications include histone modifications, DNA methylation, and non-coding RNAs. DNA methylation, the best studied and understood epigenetic modification, plays a critical role in the regulation of gene transcription and nuclear organization and ultimately influences cellular function. Disruption of the methylome can result in aberrant gene expression, which in turn might favor the development of the disease. Many environmental risk factors, such as diet, cigarette, and infection [5,6], have been proved to be related to the development of AS disease. Since environmental factors influence DNA methylation profiles, it may provide an essential link between genes and the environment.
Thus far, emerging data suggest that DNA methylation plays an important role in the pathogenesis of AS. Several DNA methylation studies in AS have been carried out, and some inflammatory-related genes have been identified, including SOCS1, DNMT1, BCL11B, IRF8, and IL12B [7][8][9][10][11]. However, these studies concentrated on known AS-genetically associated and/or inflammatory genes and methylation-specific PCR (MSP) method was frequently used, which were underpowered due to single-gene coverage. As more high-throughput methods for analyzing DNA methylation are available, an epigenome-wide association study was conducted using high-throughput microarray technology, Illumina 450K with 5 AS patients, and 5 HCs. A total of 1915 differentially methylated positions (DMPs) were identified, and the HLA-DQB1 gene achieved the most significant signal [12]. Apart from low-throughput methods used in most studies, some limitations of the aforementioned studies hampered interpretation of the results and limited applicability of the findings, including the small size of the cohorts and absence of further validation in independent cohort. Furthermore, integration of epigenetics with other "omics" data, which could improve our understanding of AS pathogenesis, is still lack.
With the advent of large-scale, base-resolution methylation technologies, we analyzed epigenome-wide DNA methylation changes using a more high-throughput microarray technology, Infinium MethylationEPIC array, to seek more novel AS-associated DNA methylation alterations. To delineate the regulatory effect of DNA methylation and gene expression, we performed an integrated analysis of epigenome-wide methylation array and RNA expression. The findings may provide better insight into pathological mechanism for AS and facilitate the development of new targets for intervention.

Study participants and sample preparation
Cohorts participating at the discovery stage included 45 individuals (AS: health controls (HCs) = 30:15). A total of 24 subjects (AS: HCs = 12:12) were recruited in the validation stage. All patients fulfilling the 1984 modified New York criteria for AS [13] were retrospectively enrolled from the Third Affiliated Hospital of Sun Yatsen University between 2017 and 2020. The study was approved by the ethics committee of the Third Affiliated Hospital of Sun Yat-sen University ([2013]2-93-01) and was carried out in compliance with the Helsinki Declaration. All participants signed the written informed consent about the probable use of the blood samples prior to sample collection.
Whole peripheral blood samples were collected from all patients and controls, and then diluted with phosphate-buffered saline, laid with Lymphocyte Separation Medium (TBD Sciences), and centrifuged under 2500 rpm/min for 15 min to separate peripheral blood mononuclear cells (PBMCs). Genomic DNA was extracted from blood samples with MAGEN DNA Kit (MAGEN Inc., China). Total RNA was isolated from PBMCs by standard phenol-chloroform extraction using Trizol reagent (Invitrogen Life Technologies, Inc., USA). The quality and concentration of extracted DNA and RNA were determined by NanoDrop ND-2000 Spectrophotometer (Thermo Fisher Scientific Inc., USA), and the integrity was analyzed by agarose gel electrophoresis. The extracted DNA was stored at −80°C until DNA methylation analysis. All procedures were performed according to the manufacturer's protocols.

Genome-wide DNA methylation and expression profiling
Infinium MethylationEPIC BeadChip (Illumina, Inc., San Diego, CA, USA) array was used to analyze DNA methylation. This platform covers more than 850,000 methylation sites per sample to be interrogated at single-nucleotide resolution, covering 99% of reference sequence (RefSeq) genes. The samples were bisulfiteconverted using EZ DNA Methylation-Gold ™ Kit (Zymo Research, Irvine, CA, USA) and were hybridized in the array following the manufacturer's instructions.
Statistical analysis of the microarray data was performed with R package ChAMP [14], Minfi [15]. Before analysis, the probes with a detection p value above 0.01, with a bead count <3 in at least 5% of samples per probe, and all probes located in chromosome X and Y were discarded. Besides, we also removed all single nucleotide polymorphism (SNP)-related probes and cross-reactive probes that are known to measure more than one site in the genome. The details of probes filtered out in each process were presented in Table S1. From the initial 865,918 CpGs, 127,755 were excluded during quality control, leaving 740,697 CpGs for further analysis. The methylation level for each CpG was represented as a β values. The β value was the ratio of the methylated probe intensity to the overall intensity (β = M/ (M + U+100), where M and U denoted the methylated and the unmethylated signal intensity, respectively), which ranged from 0 (nonmethylated) to 1 (completely methylated).
The total RNA was amplified and reverse-transcribed into cDNA with Quick Amp Labeling kit (Agilent Technologies, Palo Alto, CA, USA), and the mRNA expression was profiled with Agilent Customed LncRNA & mRNA Human Gene Expression Microarray (Agilent Technologies). Raw data was extracted by Agilent Feature Extraction (V10.7), and raw signal intensities were normalized using GeneSpring GX program (V11.5.1). The low-intensity mRNAs were filtered out. The log2 transformation was used for statistical analysis, and differential expression comparison intergroups were analyzed with R package limma [16]. A false discovery rate (FDR) <0.05 using the Benjamini and Hochberg method [17] was considered as a differential expression gene (DEG).

Identification of differentially methylated positions (DMPs) and differentially variable positions (DVPs)
The probe intensity was normalized with beta-mixture quantile (BMIQ) normalization procedures [18]. The singular value decomposition (SVD) method was implemented for assessing the technical and biological variations in the methylation dataset [19]. Combat method [20] was used to remove the technical variations (such as batch effects). The differential methylation comparisons of AS patients versus the health donors were conducted with the limma package. A β difference (Δβ) >0.05 and an FDR <0.05 in t test was determined as the threshold for DMPs in our study. Among DMPs, the sites elevating in AS patients were designated as hypermethylated positions and designated as hypomethylated positions otherwise. The promoter regions were defined as TSS1500, TSS200, and 5′UTR according to the Illumina annotation file.
Owing to the evidence that the methylation alterations of disease exhibit a more heterogeneous pattern, we also detected the differentially variable and methylated CpGs (DVPs) in the current study using the recently developed iEVORA algorithm [21]. This algorithm employs Bartlett's test (FDR < 0.001) in combination with a t test (p<0.05) to identify the DVPs. The modified test was superior to Bartlett's test because Bartlett's test is overly sensitive to single outliers.
Gene ontology (GO) analysis, pathway, and transcription factor (TF) motif enrichment analysis GO enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) biological pathways were performed using gometh function in missMethyl package [22]. To identify the enriched TF motifs for the DMPs, the HOMER method in ELMER package was used to find motif occurrences in a ±250bp region around DMPs. The DMPs which locate 2Kb away from TSS (called distal probes) were target probe set and all distal probes in the BeadChip were considered as background.

Statistical analysis
To determine whether the methylation levels were associated with gene expression of genes which the DMP physically located within TSS1500, TSS200, 5′UTR, gene body, or 3′UTR (cis-regulation), the Pearson correlation test was conducted. Furthermore, to infer the potentially causative regulatory effects of methylation, a causal inference test (CIT) was employed [23]. This analysis calculates the likelihood of whether CpGs (causal factor) causes AS disease (outcome) via gene expression (mediator) by four statistical tests: step (1) DNA methylation is associated with AS after adjusting for age, gender; step (2) mRNA expression is associated with AS after adjusting for age, gender; step (3) DNA methylation is associated with expression after adjusting for age, gender, and AS; and step (4) DNA methylation is independent of AS after adjusting for age, gender, and expression. The multivariate linear regression was applied in each step and only simultaneously satisfied the FDR adjusted p values less than 0.05 in steps 1-3 and greater than 0.05 in step 4 can we conclude the potential causality. Data were analyzed with R software (Version 3.6.1, https:// www.r-proje ct. org). Statistical significance was set at a p value less than 0.05, except as indicated.

PBMCs from AS patients displayed aberrant DNA methylome
The flow diagram of the study design and analytical pipeline was depicted in Fig. 1A. Here, we described a genome-wide DNA methylation analysis and RNA microarray analysis comprising a cohort of 30 AS patients and 15 controls, in which we aim to identify AS-associated DNA methylation changes in PBMCs and further to study the effect of DMPs on the gene expression. Moreover, we also validated the DNA methylation alteration in an independent cohort. The basic demographic and clinical characteristics of cohort 1 at the time of blood sampling were presented in Additional file 1, Table S2. Overall, most AS patients and healthy volunteers were male (90% vs. 80%, p=0.384) and the ages were matched (34.53 ± 9.92 vs. 32.13 ± 9.42, p=0.441). More patients were HLA-B27 positive comparable to controls (76.7% vs. 40.0%, p=0.015).
Following filtering procedures, 740,697 probes were retained for further analysis. Latent confounding factors including technical variations (such as the samples position on the BeadChip), gender, age, and HLA-B27 status were included in the SVD to assess the significant components of variation (Additional file 2, Figure  S1A and S1B). To reduce batch effect, we designed the similar proportion of two groups samples in two plates in the study design. However, the results suggested that overall methylome contributions to AS disease were not associated with the gender, age, or HLA-B27 status but with sample groups and position on the array (such as array plate and slide), which emphasizes the significance of sample group in DNA methylation difference in the raw data. After correcting the covariates including array plate and slide, which did not confound with the same group, only the sample group variance was captured by the top principal component (PC1, 41.5%, p<0.05). The PCA analysis based on top 1000 most variable CpG sites showed no apparent distinction between two groups (Additional file 2, Figure S1C), but there was a separation trend. The epigenome-wide association analysis identified 4794 DMPs, including 3294 (68.7%) hypermethylated and 1500 (31.3%) hypomethylated positions in AS patients (Fig. 1B). The complete list of significant DMPs is provided in online supplementary data S1. The identified DMPs allowed a clear distinction of most AS cases from controls in the principal component analysis (PCA) (Fig. 1C) and unsupervised hierarchical clustering (Fig. 1D). According to the genomic annotation, the AS DMPs were mapped to 2256 differentially methylated genes (DMGs). DMPs were predominantly located in open sea and gene body, only 6.5% and 22.6% DMPs were in CpG island and promoter regions (Fig. 1E, F).

AS patients manifested increased DNA methylation variability
As previously reported, a higher heterogeneity of DNA methylation profiles was implicated in tumors and rheumatoid arthritis [21,30]. Hence, we applied the recently developed iEVORA algorithm to test the methylation variability between AS patients and healthy subjects. A total of 1048 DVPs were identified, the majority of which (974, 92.9%) were hypervariable in AS, while only 74 DVPs were hypovariable ( Fig. 2A and online supplementary data S2). The increased DNA methylation variability in disease was in line with the previous observation in other diseases, indicating the intrinsic heterogeneity in AS patients, which might be influenced by diverse factors, such as disease activity and treatment. Figure 2B gave the examples of hypervariable and hypovariable methylation positions in AS patients.

Validation of the DNA methylome profiles in an independent cohort
To validate the DNA methylome changes in AS, we also performed the genome-wide methylation profiles in an independent cohort consisting of 12 AS patients and 12 HCs. The demographic characteristics of validation cohort were presented in Additional file 1,  Figure  S3A, S3B). The location of the DMPs in CpG island and gene position was similar to that of the discovery cohort (Additional file 2, Figure S3C, S3D). Both cohorts share 2526 DMPs and 1753 DMGs in common (p= 5.6×10 -09 and 6.1×10 -14 , Additional file 2, Figure S4A, S4B). Six out of the top 20 enriched GO terms based on DMGs were overlapped, such as leukocyte activation, immune response, lymphocyte activation, immune system process, immune effector process (p= 2.2×10 -16 , Additional file 2, Figure S4C). Besides, 10 out of the top 20 enriched KEGG pathways were shared in two cohorts, including TCR signaling pathway, Th17 cell differentiation, chemokine signaling pathway, osteoclast differentiation (p=2.2×10 -16 , Additional file 2, Figure S4D). The aforementioned AS-associated DMPs were also repeated with the same direction in the validation cohort (Additional file 2, Figure S5). The independent validation confirmed the robust alteration of DNA methylation in AS patients.

DNA methylation has regulatory effect on mRNA expression in AS patients
To better understand the regulatory effect of methylation in AS, we integrated the individual-level DNA methylation and mRNA expression data. The expression data of 31 AS patients and 20 age and gender-matched HCs were detected with customed Gene Expression Microarray. With the threshold of FDR <0.05, a total of 4144 DEGs were identified, of which 1752 genes were highly expressed and 2392 genes were lowly expressed in AS patients (online supplementary data S3). First, we examined the intersection of differentially methylated and differentially expressed genes. As shown in the Venn plot (p=1.1×10 -07 , Fig. 3A), a total of 411 genes covering 623 CpGs manifested differential DNA methylation and RNA expression simultaneously. The integrating analysis demonstrated that the inverse association was more common between methylation and expression, and most genes were hyper-methylated and hypo-expressed (Fig. 3B), which supported the recognized "switch" role of DNA methylation for gene silencing. To define the precise regulatory association of methylation on mRNA expression, the pairwise Pearson correlation analysis were performed between the overlap 628 DMPs and proximity 411 DEGs, namely the DMPs located in the positional window including TSS1500, TSS200, 5′UTR, gene body, and 3′UTR of the DEGs, among the sample set who had detected DNA methylation as well as mRNA expression (AS=26, HCs=9). As a result, the methylation levels of 158 DMPs were correlated with the corresponding mRNA expression levels of 112 genes (p<0.05) (cis-regulation) (online supplementary data S4). Negative correlation dominated the 158 cis-regulated methylation-mRNA pairs (negative: posi-tive= 127:31), which was also consistent with the gene silencing role of DNA methylation.
Pathways of correlated DMPs were enriched in Th17 cell differentiation, Th1 and Th2 cell differentiation, phospholipase D signaling pathway, NF-kappa B signaling pathway, and TCR signaling pathway (Fig. 3C). Of note, several genes have widely involved these disturbed pathways, such as LCK, FYN, ZAP70, CD3G, JAK3, LTBR, and PLCG1. Besides, we constructed the proteinprotein interactions (PPI) network among the 112 correlated genes with STRING software (Fig. 3D). The most Fig. 3 The integrating analysis of individual-level DNA methylation and mRNA expression data. A Venn diagram depicting differentially methylated positions related genes (DMGs) and differentially expressed genes (DEGs) between AS cases and health controls (HCs). B Starburst plot of gene expression changes (log2 FC) vs. DNA methylation changes (Δβ). Red dots represented the probes with significantly differential methylation and differential expression. FC, fold change. C Top 20 KEGG biological pathways of 112 DEGs whose expression levels are correlated with the corresponding methylation levels. The orange squares denote that gene present in the pathway. D The protein-protein interactions network among the correlated genes with STRING software. The sizes of the nodes are correlated with the degree (the connected number). The blue, orange, and the red denote the degree from 1 to 4, from 5 to 9, above 10, respectively interacted hub genes were LCK, FYN, CD3G, TCF7, ZAP70, CXCL12, and PLCG1.
The DMPs related to the key genes in the discovery cohort were shown in Fig. 4A, which were verified in the validation cohort (online supplementary data S5). The expression levels of two Src family tyrosine kinases, FYN and LCK, were decreased in AS patients, and inversely, the methylation levels of 6 CpGs in FYN and 3 CpGs in LCK, most of which located in promotor regions, were elevated in AS patients (Fig. 4A, B). CD3G and ZAP70 were with decreased expression and increased methylation level of 5 and 1 promotor-located CpGs. JAK3 and LTBR had 2 and 1 decreased DMPs located in TSS1500, respectively, and the expression levels were significantly upregulated in AS patients (Fig. 4A, B). Figure 4C demonstrated a significantly negative correlation between the methylation and expression of these genes. The overview of genomic organization and methylation sites of FYN and CD3G was illustrated in Fig. 4D, E, the differential sites clustered surrounding the promotor region. Collectively, these observations indicated that the DNA methylation had a regulatory effect on the mRNA expression of predisposing genes in AS.

Causal inference tests and transcription factor motif enrichment analysis
The correlation analysis could not determine the direction of causality of the strong association between methylome and transcriptome; therefore, we conducted the CIT analysis to infer the cis-acting DNA methylation and gene expression changes associated with AS risk. Among the 158 correlated DMP-DEG pairs, we identified 17 negatively correlated pairs corresponding 12 genes satisfying the four steps (Table 1). Notably, FYN, which widely interacted with other genes in the PPI network and involved in several significant pathways, was also highlighted with the differential expression potentially regulated by five DMPs (cg02789394, cg06009645, cg11883561, cg12389112, cg25529557). Besides, the aberrant expression of LCK in PBMC was mediated by promoter hypomethylation (cg02184418). Moreover, the CIT analysis also suggested the potentially causal role of other 11 CpGs.
DNA methylation can be used to identify functional changes at transcriptional enhancers. Therefore, we performed the TF motif enrichment analysis with the DMPs falling into distal feature regions (distal-probes). As a result, a total of 1951 hypermethylated and 816 hypomethylated distal-probes were identified. The enrichment analysis revealed that 57 TF motifs and 40 TF motifs were significantly enriched in the hypermethylation and hypomethylation set, respectively. The top 20 motifs were presented in Fig. 5. For the hypermethylation set, Runtrelated factors, TCF-7-related factors, and interferonregulatory factors were significantly enriched. For the hypomethylation set, C/EBP family together with ETSrelated factors family were enriched. We also observed the expression difference of TF in the C/EBP family, including CEBPB, CEBPD, and NFIL3. The expression of C/EBP TF family members could be modulated by several cytokines through several downstream pathways, such TNF, IFNs, and IL 17, which had been recognized to be involved in AS [31].

Discussion
The susceptibility to AS has a strong genetic component dominated by MHC allele HLA-B*27. The polygenic nature of AS has been slowly unraveled over time with more than 100 loci identified. However, a small proportion of the heritability could be explained by currently known risk loci. Several mechanisms, such as epigenetics, have been proposed to account for the "missing heritability. " Thus, our study carried out a genome-wide DNA methylation analysis, along with integration with transcriptomic analysis. In this study, we demonstrated the alteration of DNA methylome and gene transcriptome in PBMCs underlying predisposition to AS. We also identified several cis-acting DNA methylation and gene expression changes associated with AS risk, which might regulate the cellular mechanisms underlying AS. Moreover, we described the perturbated methylation in the highly interconnected network concentrated on Th17 cell differentiation and TCR signaling pathway.
Methylation of promoter CpGs favors a closed chromatin conformation and blocks transcription factors, resulting in negatively regulatory effect on mRNA expression. Correlation analysis in our study demonstrated that negative correlation dominated the methylation-mRNA pairs, which favored the "silencing" epigenetic mark of DNA methylation. The integrating analysis identified several genes with aberrant and correlated methylation and expression level, including LCK, FYN, CD3G, ZAP70, PLCG1, LTBR, JAK3, TCF7, and CXCL12, which particularly concentrated on Th17 cell differentiation, Th1 and Th2 cell differentiation, and TCR signaling pathway. Apart from integrating analysis, TCR signaling pathway was also highlighted in pathway enrichment analysis of all DMPs in two cohorts. LCK and FYN are the predominant Src family kinases found in T lymphocytes, the activation of which is central to the initiation of TCR signaling pathways. Active LCK or FYN can phosphorylate the CD3, resulting in the recruitment of the ζ-chain-associated protein of 70 kDa (ZAP70) kinase, further recruits and facilitates the activation of further downstream kinases. TCR proximal signaling via the Src-family kinases, LCK and FYN, can influence T cell activation, differentiation, and tolerance [32]. The CIT analysis showed that several DMPs of FYN had regulatory effects on mRNA expression and consequently increased disease susceptibility. FYN has been reported to promote osteoarthritis by activating the β-catenin pathway [33]; nevertheless, its role in the development of AS has yet to be determined. To our knowledge, imbalance of immune cells, including Th1, Th2, Th17, and Treg cells, was relevant to AS [34].
Curdlan-treated SKG mice which possess ZAP70 mutation is a good vivo animal for AS. Abnormal ZAP70 could impaired TCR signal transduction in SKG mice and transgenic expression of the normal human ZAP70 gene in SKG mice could rescue the SKG arthritis, which emphasizes the critical function of ZAP70 [35]. Here, our study also revealed that the downregulated expression of ZAP70 and downstream molecules (such as PLCG1) are associated with AS. Therefore, we speculated these genes might contribute to the immunity imbalance in AS through insufficiently activated downstream TCR signaling pathways, and DNA methylation might involve in this regulatory mechanism besides gene mutation. Over the past 10 years, the predisposing role of IL-17 in AS is increasingly recognized. Most strikingly, we also identified many DMGs associated with IL-23/17 axis. When IL-23 binds to its receptor complex (IL-12Rβ1/IL-23R), it subsequently recruits JAK2 and Tyk (Janus family of tyrosine kinases) and mediates STAT3 phosphorylation and facilitates differentiation into IL-17-expressing cells [36]. Th17 cells are the most common class of IL-17-producing adaptive lymphocytes [24]. Once secreted, IL-17 stimulate stromal cells and immune cells to promote inflammation through the activity of TFs such as NF-ĸβ and AP-1 [37]. Besides, IL-17 is known to alter the activity of osteoclasts and osteoblasts, probably contributing to aberrant bone formation [24]. Several genes in IL-23/IL-17 axis were reported association with AS in GWAS studies, including IL23A, IL23R, IL12RB2 [26], STAT3 [25], and NFKB1 [38], the CpG sites of which were found dysregulated in our study. Furthermore, several gene, with correlated expression and methylation level, involved in Th17 differentiation (Fig. 3C), including JAK3, ZAP70, LCK, RORA, CD3G, and PLCG1, most of which were also hub genes in PPI network (Fig. 3D). Given widely perturbations of DNA methylation in IL-23/IL-17 axis in AS and motivated by success of IL-17 inhibitor in AS, other therapeutic targets in this axis deserve further test. LTBR (lymphotoxin beta receptor), a member of the TNF receptor superfamily, the elevated expression and the downregulated promotor methylation were observed in AS. The previous study found that a variant in LTBR was convincingly associated with AS regardless of the evidence of involvement of TNF pathways in AS pathogenesis [29]. It was first reported the gene displayed an altered methylation in AS. ERAP2 (endoplasmic reticulum aminopeptidase 2), with two upregulated DMPs found in our study, has reported to be associated with AS [39]. ERAP2 has crucial role in trimming peptides to an optimal length for binding to class I MHC molecules [40]. The altered ERAP1 or ERAP2 activities could lead to abnormal peptide complexes, ultimately trigged the unfolded protein response (UPR) and autophagy, which is the putative mechanism in AS [24]. RUNX3 (runt-related transcription factor 3), which was strongly associated with AS by GWAS [29], were most enriched transcription factor in the hypermethylation set (Fig. 5A). Five sites of RUNX3 were hypermethylated in AS (Additional file 2, Figure S2 and S5) and 3 of which located in promoter region. The expression of RUNX3 was decreased in AS without significance (FDR=0.08) in our study; however, the previous study reported a lower expression in AS than healthy controls (P<0.002) [41]. We speculated that the relatively small sample size might conceal the expression difference. RUNX3 plays key roles in the development and differentiation of immune cells and two distinct SNPs appear to exert their influence in different cell types-rs4648889 in CD8+ T cells and rs4265380 in monocytes [41]. Our study indicated that the aberrant methylation might play roles in AS predisposition. Although high-throughput techniques considerably facilitate our understanding of AS pathogenesis, more progress towards the precise mechanistic explanation in cellular or animal experiments are required in future studies.
Previous studies detected several differentially methylated genes associated with AS using single-gene methylation-specific PCR, some of which have been repeated in our study. We have found 6 methylation sites in BCL11B were upregulated; moreover, the expression of BCL11B was decreased in AS patients, which was in line with the report [8]. Besides, we also detected the hypermethylation in IRF8 and SOCS1 coincident with the previous findings [9,11]. Compared to the previous genome-wide DNA methylation profiles in AS with Illumina Infinium HumanMethylation450 BeadChip [12], we increased the sample size and adopted larger coverage arrays. Besides, we also performed validation in an independent cohort to identified the robust DNA methylome changes associated with AS. Integrating with mRNA expression data could provide better understand to the regulatory effect of methylation in AS.
Our study has several limitations. Methylation modification can be affected by demographic and lifestyle factors. In our study, the overall AS related methylome were not associated with the age, gender, and HLA-B27 status in the SVD analysis. However, the contribution of these factors could not conclude easily due to the limited sample sizes of our study and need more validation. Furthermore, the lifestyle factors (such as smoking, diet, and medication) may confound the DMP detection without correction the variates. Another limitation for our study is that we do not analysis the cell type composition of the PBMC samples as well as the methylation profiles of specific cell type, so we could not conclude whether the methylation and transcription alteration is a cause or a consequence of the cellular composition. To note, it is impossible to conclusively prove causal relationships based on observational data alone, so the aims of our study are to identify the potential regulatory axis, which proves the foundation for our further mechanistic research. Although plenty of DMGs have been confirmed to involve in the development of AS, we did not investigate whether the methylation mediated the risk of ASassociated SNPs at these loci. Integration with GWAS data should be conducted to define the methylationmediated the regulatory mechanisms of noncoding risk variants in the future.