MHC associations of ankylosing spondylitis in East Asians are complex and involve non-HLA-B27 HLA contributions

Background The association of HLA-B*27 with ankylosing spondylitis (AS) is amongst the strongest known associations of a common variant with any human disease. Nonetheless, there is strong evidence that other HLA-B alleles are involved in the disease. Studies in European descent populations have demonstrated risk associations with HLA-B*40 and multiple other HLA-B, HLA-A, and HLA class II alleles, and have shown that, in that ethnic group, the amino acid at position 97 in HLA-B is the key determinant of HLA associations with AS. A recent study in Korean AS cases and controls additionally identified association at HLA-C*15:02. In the current study, we examined the MHC associations of AS in an expanded East Asian cohort.
Methods A total of 1637 Chinese, Taiwanese, and Korean AS cases meeting the modified New York Criteria for AS, and 1589 ethnically matched controls, were genotyped with the Illumina Immunochip, which includes dense coverage of the MHC region. HLA genotypes and amino acid composition were imputed with the SNP2HLA programme using the Han-MHC reference panel, based on data from Han Chinese subjects (n = 9689), and association was tested using logistic regression controlling for population stratification effects.
Results A strong association was seen with HLA-B*27 (odds ratio (OR) = 205.3, P = 5.76 × 10^-244). Controlling for this association, the strongest risk association was seen with HLA-C*15 at a genome-wide significant level (OR = 7.62, P = 9.30 × 10^-19), and the previously reported association with HLA-B*40 was confirmed at a suggestive level (OR = 1.65, P = 2.54 × 10^-4). At the amino acid level, the strongest association seen in uncontrolled analysis was with histidine at position 114 in HLA-B (P = 7.24 × 10^-241), but conditional analyses suggest that the primary amino acid associations are with lysine at position 70 and asparagine at position 97. Restriction of the ERAP1 association to HLA-B*27-positive AS, previously reported in European subjects, was confirmed in East Asians.
Conclusions This study confirms in East Asians that the HLA associations of AS are multiple, including previously reported associations at HLA-B*27, HLA-B*40, and HLA-C*15, as well as novel association with HLA-DQB1*04. The HLA-B associations are driven by the amino acids at positions 70 and 97, in the B pocket of HLA-B.


Background
Ankylosing spondylitis (AS) is a highly heritable rheumatic disease characteristically causing chronic inflammation of the spine and sacroiliac joints, as well as in some patients affecting the peripheral joints, the anterior uvea, and less commonly other organs. The worldwide distribution of AS is closely related to the prevalence of HLA-B*27, although the underlying mechanism remains unclear. Whilst the HLA-B*27 allele is found in approximately 85% of patients, there is strong evidence indicating that other HLA-B alleles and MHC genes are involved in the disease, as well as non-MHC loci.
Direct genotyping studies in European case-control cohorts have demonstrated risk associations consistently with HLA-B*40 and variably reported associations with multiple other HLA-B, HLA-A, and HLA class II alleles. The development of accurate HLA imputation methods from single nucleotide polymorphism (SNP) microarray data has enabled far larger case-control studies to be performed, with, for the first time, proper control for population stratification effects. Using this approach and studying 22,647 AS cases and controls of European descent, Cortes et al. demonstrated that the amino acid sequence of HLA-B at position 97, in the epitope-binding groove, is the key determinant of HLA associations with AS. After controlling for the associated alleles in HLA-B, independent associations with variants in the HLA-A, HLA-DPB1, and HLA-DRB1 loci were observed [1].
Differences in HLA-B*27 subtype distributions between Asian and European descent populations have been well reported, and further non-HLA-B*27 HLA class I associations in East Asian AS have been reported. Also using HLA imputation methods, a study in 654 Korean cases of AS and 3166 controls additionally identified association at HLA-C*15:02 [2]. Additionally, using direct genotyping in 360 Han Chinese AS cases and 350 controls with no genomic control for population stratification, risk association of HLA-B*40 and protective association of HLA-B*07 have been demonstrated [3].
In this study, using HLA imputation methods, we analyse the associations of AS with major histocompatibility complex (MHC) polymorphisms to identify functional and potentially causal variants using a large cohort of East Asian ancestry AS cases and controls [4]. In addition to our primary analysis of this cohort, we perform fine mapping of the MHC region with imputation of SNPs, HLA class I and II classical alleles, and amino acid residues within the classical HLA proteins. In addition to HLA-B*27, we identify further HLA-B and other HLA class I and II alleles associated with AS.

Subjects and SNP data
A total of 1637 Chinese, Taiwanese, and Korean AS cases meeting the modified New York Criteria for AS [5], as confirmed by qualified rheumatologists, and 1589 ethnically matched controls (Table 1) were genotyped with a customised SNP array (Illumina Immunochip [6]) that includes dense coverage of the MHC region. Cohort descriptions and genotyping protocols are as previously reported [4]. Following standard quality control procedures, SNPs with a minor allele frequency greater than 1% (MAF > 0.01), call rates ≥ 0.98, and Hardy-Weinberg equilibrium test P values > 10^-7 were analysed in this study. To confirm ethnicity, we performed a continental principal components analysis (PCA), merging the study genotype data with data from 51 populations genotyped with the Illumina 650Y array in the Human Genome Diversity Panel (HGDP-CEPH) [7]. Cases or controls lying more than 6 standard deviations from the population mean on principal components (PCs) 1-10 were then excluded.
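The quality control thresholds described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the pipeline used in the study: the function names are hypothetical, the Hardy-Weinberg test shown is a simple 1-df chi-square test (the study does not state which test was used), and the conventional reading is assumed in which SNPs with Hardy-Weinberg P values below 10^-7 are removed. The outlier rule is assumed to flag a sample that exceeds the threshold on any one PC.

```python
import math
import statistics


def hwe_pvalue(n_aa, n_ab, n_bb):
    """P value from a 1-df chi-square test of Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def snp_passes_qc(n_aa, n_ab, n_bb, n_missing,
                  maf_min=0.01, call_min=0.98, hwe_min=1e-7):
    """Apply the MAF, call-rate and HWE thresholds to one SNP (assumed rule)."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    call_rate = called / (called + n_missing)
    p = (2 * n_aa + n_ab) / (2 * called)
    maf = min(p, 1.0 - p)
    return (maf > maf_min and call_rate >= call_min
            and hwe_pvalue(n_aa, n_ab, n_bb) > hwe_min)


def pc_outliers(pcs, n_sd=6.0):
    """Indices of samples more than n_sd SDs from the mean on any PC.

    pcs is a list of per-sample lists of PC scores (e.g. PCs 1-10).
    """
    flagged = set()
    for k in range(len(pcs[0])):
        column = [row[k] for row in pcs]
        mean = statistics.mean(column)
        sd = statistics.pstdev(column)
        for i, value in enumerate(column):
            if sd > 0 and abs(value - mean) > n_sd * sd:
                flagged.add(i)
    return sorted(flagged)
```

In practice these filters would normally be applied with PLINK or a similar tool; the sketch is meant only to make the thresholds concrete.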

HLA imputation and association analysis
We conducted a two-step imputation. We first densely imputed SNPs across the MHC using the Michigan Imputation Server [8] and the 1000 Genomes Phase 3 reference dataset (26 populations worldwide), then further using the Han-MHC reference panel [9], to maximise SNP coverage and so enable accurate imputation of HLA-B alleles, in particular HLA-B*27. Using these SNP data and the Han Chinese reference panel (n = 9689), the programme SNP2HLA was used to impute the classical HLA alleles and amino acid residues. In the SNP2HLA output file, imputed classical HLA alleles and HLA protein amino acid positions were defined as binary markers coding the presence or absence of the allele or residue being tested, and each different allele or residue was tested as a biallelic position. Association with AS was then tested for every allele, residue, and SNP using the logistic regression function in PLINK [10], with 10 principal components included as covariates to control for population stratification effects. We then performed conditional analysis iteratively, adding the dosage of the HLA-B*27 allele and of other significant alleles, residues, or SNPs as covariates until no significant allele, residue, or SNP remained. Only HLA alleles or amino acids with imputation information scores > 0.5 were considered. All results are presented unadjusted for multiple testing.
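The iterative conditional analysis described above can be illustrated with a minimal pure-Python sketch. It is not PLINK's implementation: `solve`, `logistic_wald`, and `conditional_analysis` are hypothetical names, the fit is a plain Newton-Raphson logistic regression with a Wald test, and the significance threshold `alpha` is left to the caller because the study does not state a single stopping threshold. Each marker is a per-subject dosage list (0/1/2 or a binary presence code), and `covariates` holds the per-subject principal components.

```python
import math


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def logistic_wald(rows, y, n_iter=50):
    """Fit a logistic regression (intercept added here) by Newton-Raphson.

    Returns (coefficients, standard errors); index 0 is the intercept.
    """
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for x, yi in zip(X, y):
            mu = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, x))))
            for i in range(p):
                grad[i] += (yi - mu) * x[i]
                for j in range(p):
                    hess[i][j] += mu * (1.0 - mu) * x[i] * x[j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-9:
            break
    # Diagonal of the inverse information matrix gives the variances.
    ses = [math.sqrt(solve(hess, [float(i == j) for i in range(p)])[j])
           for j in range(p)]
    return beta, ses


def conditional_analysis(markers, covariates, y, alpha):
    """Forward conditional analysis: add the top marker as a covariate each round."""
    selected = []
    remaining = dict(markers)
    while remaining:
        best_name, best_p = None, 1.0
        for name, dosage in remaining.items():
            rows = [[d] + list(c) + [markers[s][i] for s in selected]
                    for i, (d, c) in enumerate(zip(dosage, covariates))]
            beta, ses = logistic_wald(rows, y)
            p_value = math.erfc(abs(beta[1] / ses[1]) / math.sqrt(2.0))
            if p_value < best_p:
                best_name, best_p = name, p_value
        if best_name is None or best_p >= alpha:
            break
        selected.append(best_name)
        del remaining[best_name]
    return selected
```

The sketch assumes no marker is perfectly collinear with an already selected one and that cases and controls are not perfectly separated by any marker; either condition would make the Newton step ill-defined and needs handling in real data.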

Results
PCA indicated that all study subjects were ethnically East Asian (Supplementary Figure 1). The genomic inflation factor (λ), calculated using a set of 1767 negative control SNPs in regions included on the Immunochip for studies of reading and writing disabilities, psychosis, and schizophrenia, was 1.03 (λ1000 = 1.02). No evidence of statistical inflation was seen in the Q-Q plot (Supplementary Figure 2). After quality control and imputation, 15,748 SNPs across the MHC (from 25 to 35 Mb, hg18) were available for analysis in 1482 cases and 1512 controls. Imputed HLA-B allele frequencies amongst controls in the current study were not significantly different from those in previously reported directly genotyped studies (P > 0.05), supporting the high accuracy of HLA imputation, particularly at two-digit resolution [3].
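For readers unfamiliar with the inflation statistics quoted above, the following sketch shows how they are commonly computed. The study does not spell out its formulas, so this assumes the usual definitions: the genomic inflation factor is the median 1-df chi-square statistic of the control SNPs divided by the expected median, and the value rescaled to 1000 cases and 1000 controls uses the standard sample-size adjustment. Function names are hypothetical.

```python
import statistics
from statistics import NormalDist

# Median of a chi-square distribution with 1 degree of freedom (about 0.455).
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def genomic_lambda(p_values):
    """Genomic inflation factor from two-sided P values in (0, 1]."""
    # A two-sided P value p corresponds to chi-square = (z at p/2) squared.
    chi2 = [NormalDist().inv_cdf(p / 2.0) ** 2 for p in p_values]
    return statistics.median(chi2) / CHI2_1DF_MEDIAN


def lambda_1000(lam, n_cases, n_controls):
    """Rescale lambda to an equivalent study of 1000 cases and 1000 controls."""
    scale = (1.0 / n_cases + 1.0 / n_controls) / (1.0 / 1000 + 1.0 / 1000)
    return 1.0 + (lam - 1.0) * scale


# Rescaling the reported lambda with the post-QC sample sizes
# (assumed here to be the sizes used for the rescaling).
rescaled = lambda_1000(1.03, 1482, 1512)
```

Under these assumptions, the rescaled value rounds to the 1.02 reported in the text.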
At the amino acid level, the strongest association seen in the uncontrolled analysis was with histidine at position 114 in HLA-B (P = 7.24 × 10^-241), followed by multiple HLA-B amino acids including lysine at position 70 (P = 1.49 × 10^-237) and asparagine at position 97 (P = 2.51 × 10^-237) (Table 3). Asparagine 97 and histidine 114 were previously reported to be the main amino acids determining HLA-B associations with AS in European descent and Korean populations, respectively [1,2].
Controlling for HLA-B*27 alone or in combination with HLA-B*40 did not fully control for the association of asparagine 97, lysine 70, or histidine 114 (P < 5 × 10^-8 for both analyses, Table 3).

ERAP1 variants in association with AS
The key ERAP1 variant associated with AS is rs30187 (ccc-5-96150086-T-C, chr5:96150086[hg18], encoding K528R) [4,13] (Table 5). It has previously been observed in European populations that the association with rs30187 is restricted to HLA-B*27-positive subjects, or to HLA-B*40-positive, HLA-B*27-negative subjects, consistent with epistatic interaction. Here, we investigated possible interaction between the HLA-B*27 and HLA-B*40 alleles and rs30187, the previously reported tag SNP of the ERAP1 locus [4]. When testing for interaction with HLA-B*27, we found that the rs30187-A risk allele increased the risk of disease in the stratum where HLA-B*27 was present (OR = 1.29; P = 2.71 × 10^-6) (Table 5), but no association was seen in HLA-B*27-negative cases (OR = 1.06, P = 0.61). No evidence of interaction was observed between rs30187 and the HLA-B*40 allele, although the power to detect this was low because the number of HLA-B*27-negative cases was low.
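A stratified comparison of this kind can be sketched as follows. This is an assumption-laden illustration rather than the analysis performed in the study: it computes an unadjusted allelic odds ratio with a Wald test from a 2x2 table of allele counts within each HLA-B*27 stratum, then compares the two log odds ratios with a simple z-test of heterogeneity. The function names and all counts are hypothetical, and the study's own estimates came from logistic regression with covariates.

```python
import math


def allelic_odds_ratio(case_risk, case_other, ctrl_risk, ctrl_other):
    """Return (OR, SE of log OR, two-sided Wald P) for a 2x2 allele-count table."""
    log_or = math.log((case_risk * ctrl_other) / (case_other * ctrl_risk))
    se = math.sqrt(1.0 / case_risk + 1.0 / case_other
                   + 1.0 / ctrl_risk + 1.0 / ctrl_other)
    p_value = math.erfc(abs(log_or) / se / math.sqrt(2.0))
    return math.exp(log_or), se, p_value


def heterogeneity_p(stratum_a, stratum_b):
    """Two-sided P for a difference in log odds ratio between two strata."""
    or_a, se_a, _ = allelic_odds_ratio(*stratum_a)
    or_b, se_b, _ = allelic_odds_ratio(*stratum_b)
    z = (math.log(or_a) - math.log(or_b)) / math.sqrt(se_a ** 2 + se_b ** 2)
    return math.erfc(abs(z) / math.sqrt(2.0))


# Made-up allele counts, for illustration only (not data from this study):
# (case risk alleles, case other alleles, control risk alleles, control other alleles)
b27_positive = (1500, 1300, 60, 70)
b27_negative = (110, 100, 1400, 1350)
or_pos, se_pos, p_pos = allelic_odds_ratio(*b27_positive)
or_neg, se_neg, p_neg = allelic_odds_ratio(*b27_negative)
p_het = heterogeneity_p(b27_positive, b27_negative)
```

A small stratum, such as HLA-B*27-negative cases, inflates the standard error of its log odds ratio, which is why the power to detect heterogeneity is limited when that stratum is sparse.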

Discussion
This study confirms that in East Asians, the primary MHC associations with AS are with HLA-B*27 and HLA-B*40, and confirms the risk association of HLA-C*15:02 with the disease. The association of HLA-B*40 with AS has now been convincingly demonstrated in both European descent [1,14-16] and East Asian studies [3,17], using both direct genotyping- and imputation-based methods. HLA-B*40:01 has also been shown to be associated with IgA nephropathy (OR = 1.34, P = 5.64 × 10^-7) [18], a known though uncommon association of AS. The functional mechanism underlying the association of this allele has been little studied. It does not share the lysine 70, asparagine 97, or histidine 114 residues found in most HLA-B*27 alleles. As with HLA-B*27, it is known to interact with AS-associated ERAP1 variants to cause AS, suggesting that it is likely to operate by the same mechanism. Further studies comparing its properties with those of HLA-B*27, such as its peptide-binding characteristics, folding rate, and whether it forms homodimers, are indicated to investigate its association further.
No protective association was seen with HLA-B*07, in contrast to previous reports in East Asian [3] and European descent cohorts [1,15], although the allele frequency was very low (0.024) and the study may not have had adequate power to detect an association with this allele.
The study indicates that in East Asians, the key amino acid drivers of the HLA-B associations in AS are at positions 70 and 97. These remain AS-associated after controlling for any other HLA-B amino acid. HLA-B position 97 was previously shown to be the key amino acid association in European descent cohorts, whereas in a Korean study, the association of histidine 114 could not be distinguished from associations with lysine 70 and asparagine 97 [2]. The difference in these findings may be explained by three key factors: sample size, ethnicity, and the reference haplotype dataset. Cortes et al.'s study of European descent subjects involved 9069 AS cases and 13,578 controls, over seven times as many subjects as in the current study (1637 cases, 1589 controls) and nearly six times the number in the previous Korean study (654 cases, 3166 controls). The European descent study therefore had greater power, potentially explaining the absence of signal in the East Asian cohorts for some of the HLA-B allele and HLA class II associations seen in the European dataset. The European descent study also had greater power in conditional analyses, potentially explaining the differences in results regarding the role of lysine 70, which remains positively associated with AS after conditioning on asparagine 97 in the current study, but not in the European descent dataset. The studies have also used different reference haplotype datasets, potentially affecting the accuracy of the imputed data. Ethnic differences could also play a role through differences in HLA-B*27 subtypes or other HLA-B allele frequencies, particularly between the European descent and East Asian cohorts.
Both HLA-B amino acid residues 70 and 97 are found within the B pocket of the HLA-B peptide-binding groove. However, it has been noted that position 70 is tightly coupled with positions 67 and 97 and that position 70 hardly changes the peptide-binding repertoire, suggesting that position 70 is "hitch-hiking" along with positions 67 and 97 in their ability to change the peptide-binding repertoire [19]. Our study and the previous HLA amino acid imputation studies suggest that other amino acid positions in addition to 70 (such as positions 97 and 114) are also involved in HLA-B risk attribution. The association of these amino acids independent of other amino acids found in the HLA-B*27 B pocket, and after controlling for HLA-B*27, indicates that their effect on disease risk is partially independent of HLA-B*27.
Although the HLA allele frequencies imputed in controls in this study closely match those reported by direct genotyping studies in Han Chinese [3], the accuracy of imputation in such studies depends heavily on the ethnic matching of the imputed and reference datasets. Whilst the Han-MHC reference dataset used here is large (n = 9689), the number of East Asians in 1000 Genomes Phase 3 (n = 524), which we used in the Michigan Imputation Server, is far smaller than the European reference dataset [1]. The smaller reference dataset size precluded imputation to four-digit levels and may have affected the accuracy of the imputation of low-frequency alleles in particular. As SNP-based HLA imputation is a highly efficient method enabling large-scale HLA association studies, there is a clear need for much larger publicly available HLA imputation reference datasets for Asian populations.
In this study, we have also confirmed the interaction between ERAP1 and HLA-B*27, with association of the key ERAP1 variant, rs30187, seen only in HLA-B*27-positive individuals. This confirms the original finding in Europeans [1] and the previous finding, in a case-only analysis of Taiwanese AS patients, of different ERAP1 genotypes in HLA-B*27-positive and HLA-B*27-negative cases [20]. We did not see an association of ERAP1 variants in HLA-B*27-negative, HLA-B*40-positive individuals as previously reported [1], although the sample size was not large. The confirmation of the gene-gene interaction in an East Asian population strengthens the evidence that this is a true-positive interaction and that it is critical to AS pathogenesis.

Conclusions
This study confirms that the HLA associations of AS are complex and that multiple non-HLA-B*27 alleles, including both HLA class I and likely HLA class II variants, also contribute to risk and protection from the disease. Further investigation of the mechanisms involved in these associations is likely to assist in determining the pathogenesis of this disease.
Additional file 1.
Table 4. Conditional analysis P values of HLA-B amino acid residues. Significance for association of lysine 70 (70K), asparagine 97 (97N), and histidine 114 (114H) is given in columns, either in unconditional analysis or conditioning on specific amino acid positions (where no letter is given after the HLA-B amino acid position number) or on specific amino acids (where a letter is given after the HLA-B amino acid position number), either individually or in combination. NS = P > 0.05.