Current status of lupus genetics

Over the past 40 years more than 100 genetic risk factors have been defined in systemic lupus erythematosus through a combination of case studies, linkage analyses of multiplex families, and case-control analyses of single genes. Multiple investigators have examined patient cohorts gathered from around the world, and although we doubt that all of the reported associations will be replicated, we have probably already discovered many of the genes that are important in lupus pathogenesis, including those encoding human leukocyte antigen-DR, Fcγ receptor 3A, protein tyrosine phosphatase nonreceptor 22, cytotoxic T lymphocyte associated antigen 4, and mannose-binding lectin. In this review we will present what is known, what is disputed, and what remains to be discovered in the world of lupus genetics.


Introduction
Systemic lupus erythematosus (SLE) has long been appreciated to arise from both genetic and environmental factors. Although environmental factors, such as the Epstein-Barr virus, are clearly important [1], this review focuses on genetic factors that are involved in SLE. Evidence for the genetic origins of the disease comes from the observation of familial aggregation [2] (up to 10% of patients with SLE have another family member with the disease) and increased concordance in monozygotic twins [3]. The patterns of inheritance are complex, however, and it is generally thought that variations in a number of genes are involved, each contributing a small amount to the overall genetic risk [4]. Two major strategies have been used to search for the 'lupus genes': genome-wide screening, using multiplex families and linkage analysis; and candidate gene studies, usually performed on trios or case-control collections. With either strategy, a high threshold is necessary to establish genetic risk, and follow-up testing of an independent cohort is required to confirm the results.

Genome-wide linkage studies for systemic lupus erythematosus
The genetic basis of SLE is well established, but the genetic transmission of SLE has proven to be highly complex. Consequently, gene identification has been accomplished for only a handful of genes. Genome-wide linkage scanning is a comprehensive and unbiased approach to identifying chromosomal loci that may be linked to complex diseases [5]. Testing for genome-wide linkage is fundamentally a statistical process that evaluates co-inheritance of genetic markers (such as DNA polymorphisms) with the disease phenotype in families with multiple affected members. Consistent co-inheritance of the marker with the disease in families means that they are 'linked' and indicates that the actual disease gene is in close proximity. As with other complex diseases, genome scans for SLE susceptibility genes suffer from low power to detect true-positive linkages. Causes of this include relatively small study populations in some studies and common causative alleles with low penetrance.
Several different study designs have been used for genome-wide scanning to identify novel susceptibility loci for SLE. Some of the study designs involve sibling pairs, for whom parents may or may not be available. Others use extended pedigrees with several generations available for study. Several genome scans have been carried out by four major scientific groups (located in California, Oklahoma, Minnesota, and Sweden), and these have identified many loci spread across the genome. To date, nine independently identified linkages have been established and replicated in an independent sample (Table 1). Because each of these linkages has passed the recommended threshold for establishing significant evidence of linkage, a susceptibility gene or genes is likely to be found eventually within these linkage regions, although most remain to be identified.
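To make the statistical nature of linkage testing concrete, the following minimal Python sketch computes a two-point log of odds (LOD) score for the simplest textbook case, in which every informative meiosis can be scored unambiguously as recombinant or nonrecombinant. This is an illustrative assumption rather than the method used in the genome scans cited here, which rely on more elaborate multipoint or allele-sharing statistics; the function names and the example counts are hypothetical.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point LOD score for phase-known meioses (illustrative only).

    Compares the likelihood of the data at recombination fraction `theta`
    with the likelihood under no linkage (theta = 0.5).
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in the interval (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    nonrecombinants = meioses - recombinants
    log_l_theta = (recombinants * math.log10(theta)
                   + nonrecombinants * math.log10(1 - theta))
    log_l_null = meioses * math.log10(0.5)
    return log_l_theta - log_l_null


def max_lod(recombinants, meioses):
    """Scan theta from 0.01 to 0.50 and return (best_theta, best_lod)."""
    grid = [i / 100 for i in range(1, 51)]
    return max(((t, lod_score(recombinants, meioses, t)) for t in grid),
               key=lambda pair: pair[1])


# Hypothetical example: 2 recombinants observed in 20 informative meioses.
best_theta, best_lod = max_lod(2, 20)
print(f"theta = {best_theta:.2f}, LOD = {best_lod:.2f}")
```

In this toy example the LOD score peaks at a recombination fraction of 0.10, which is simply the observed proportion of recombinants. Real pedigrees rarely allow such clean scoring, which is one reason why power is limited in complex diseases.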

Genes found through linkage studies
The search for genes in the 1q23 linkage interval [6,7] has led to intensive study of the immunoglobulin receptors encoded there. There are three distinct but closely related classes of Fcγ receptors (FCGRs) in humans: FCGR1 (CD64), FCGR2 (CD32), and FCGR3 (CD16). They have different affinities for IgG and its subclasses, and those encoded on 1q23 include FCGR2A, FCGR2B, FCGR3A, and FCGR3B. The arginine variant at amino acid position 131 of FCGR2A (or R131) is associated with SLE, particularly in African-Americans [8], whereas FCGR3A-F176 is associated with SLE in European-derived peoples and other ethnic groups [7]. A gene dose effect of FCGR2A-R131 on the risk for SLE was also identified in a meta-analysis [9], with the risk for SLE increasing with the number of R alleles (RR > RH > HH). The data indicating that FCGR3A-F176 is a risk factor for lupus nephritis are also convincing [10]. Because both of these variants produce receptors with lowered affinity for IgG [11], it is thought that they may predispose to autoimmunity through delayed clearance of immune complexes, but this remains an unproven hypothesis. Although one of these two variants is probably responsible for the linkage in this region, there are conflicting data on which is the more important, and this remains an area of active interest [7,9,10,12].
PDCD1 (programmed cell death 1) is generally accepted as the gene responsible for the linkage at 2q34 [13], and it is also associated with lupus nephritis [14]. To date, this is the only gene to have been identified through fine mapping of a linkage interval, although this association does not go unchallenged in other populations tested [15]. The presumed mechanism of action is through an intronic single nucleotide polymorphism (SNP) that alters a binding site for the RUNX1 transcription factor, leading to decreased expression of the PDCD1-encoded protein and delayed apoptosis [13]. Autoreactive T cells that fail to undergo apoptosis properly may persist to support autoimmune responses.
The genes responsible for linkage at the other loci are not so straightforward. Although it is generally accepted that human leukocyte antigen (HLA)-DR is associated with SLE [16], there are a number of other genes in the HLA region that may also contribute to this linkage, as discussed below. PARP (poly-[ADP-ribose] polymerase) was initially identified as the gene responsible for linkage at 1q41 [17], but two subsequent studies in European-American [18] and French Caucasian [19] cohorts failed to confirm this association. Two additional studies conducted in Asian populations also failed to find an association with disease, although both found correlation of PARP alleles with clinical manifestations (discoid rash and anticardiolipin IgM [20], and nephritis and arthritis [21]).

Linkage analysis through pedigree stratification
Clinical manifestations of SLE are extremely diverse and variable, both in individual patients and over time. We hypothesize that genetic factors contribute to this clinical diversity and that there will be subsets of genes that are over-represented in families with particular clinical manifestations. We therefore used stratification of multiplex pedigrees by phenotype to improve the genetic homogeneity of our cohorts and discover new loci linked to SLE. For example, by analyzing only the families in which one or more members have vitiligo, we identified linkage at 17p12 [22], and six additional linked loci were discovered through stratification by other clinical and laboratory criteria. All of these have been established and confirmed in an independent cohort (Table 2).
† The initial linkage at 10q22 was described using allele sharing statistics; therefore, a P value is generated instead of a log of odds (LOD) score.

Genome scan meta-analysis
Although several susceptibility loci for SLE have been identified by individual genome-wide scans, many of these loci have yielded inconsistent results across studies. Additionally, many individual studies are at the lower limit of acceptable power recommended for declaring significant linkage. The genome search meta-analysis has been proposed as a valid and robust method for combining several genome scan results [23]. Recently, the results of two genome search meta-analyses were reported [24,25]. These studies identified many linked regions that may harbor the SLE susceptibility genes. The most interesting results emerging from these studies are significant linkages in the intervals of 6p21-6p22 and 16p12-16q13.
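As a rough illustration of how a rank-based genome search meta-analysis can combine scans, the sketch below assumes the commonly described formulation in which the genome is divided into bins, each study ranks its bins by the strongest linkage evidence observed within them, and the ranks are summed across studies. The bin scores, function names, and permutation scheme are hypothetical simplifications and are not a reproduction of the analyses reported in [24,25].

```python
import random


def rank_bins(scores):
    """Rank bins within one study; the strongest bin gets the highest rank.

    Tied scores share the average of the ranks they span.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def summed_ranks(studies):
    """Sum the within-study ranks of each bin across all studies."""
    per_study = [rank_bins(scores) for scores in studies]
    return [sum(column) for column in zip(*per_study)]


def permutation_p_values(studies, n_perm=2000, seed=0):
    """Empirical P value per bin: how often shuffled ranks match or beat it."""
    rng = random.Random(seed)
    per_study = [rank_bins(scores) for scores in studies]
    observed = [sum(column) for column in zip(*per_study)]
    exceed = [0] * len(observed)
    for _ in range(n_perm):
        shuffled = [rng.sample(ranks, len(ranks)) for ranks in per_study]
        simulated = [sum(column) for column in zip(*shuffled)]
        for b, obs in enumerate(observed):
            if simulated[b] >= obs:
                exceed[b] += 1
    return [(count + 1) / (n_perm + 1) for count in exceed]


# Hypothetical maximum linkage statistics for 4 bins in 3 genome scans.
scans = [
    [0.1, 2.5, 0.3, 1.0],
    [0.2, 3.0, 0.1, 0.5],
    [0.0, 1.8, 0.4, 0.9],
]
print(summed_ranks(scans))
print(permutation_p_values(scans))
```

Working on ranks rather than raw statistics allows scans that used different markers and different linkage statistics to be combined, at the cost of discarding information about the magnitude of each signal.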

Overview of candidate gene studies
The candidate gene approach is the technique most frequently used to explore SLE genetics. It is simple and straightforward: recruit lupus patients and matched controls, assay them for variations in a gene of interest, and determine whether allele frequencies differ between the two groups. Because of the relative ease of the approach, there are literally hundreds of association studies in SLE (for review, see [26]). If they were all the same, then comparing them and correlating the results would be a simple matter, but science is, of course, performed by individuals, each with their own ideas. This has led to variations in nearly every aspect of methodology, from the way in which patients were recruited and matched to the number of SNPs assayed in each gene. The ethnic groups studied are as varied as the international sites at which this work was accomplished, and of course everyone has different ideas about what genes are 'of interest'.
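The comparison of allele frequencies in such a study can be sketched in a few lines of Python. The example below is a minimal illustration, assuming a single biallelic marker and simple allele counts; it computes the odds ratio and a chi-square test with 1 degree of freedom from a 2 x 2 table. The function name and the counts are hypothetical and are not taken from any study reviewed here.

```python
import math


def allele_association(case_risk, case_other, control_risk, control_other):
    """Odds ratio and 1-df chi-square test for a 2 x 2 allele count table.

    Returns (odds_ratio, chi_square, p_value). No continuity correction
    is applied, and all four counts must be greater than zero.
    """
    a, b, c, d = case_risk, case_other, control_risk, control_other
    if min(a, b, c, d) <= 0:
        raise ValueError("all four allele counts must be positive")
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi_square = (n * (a * d - b * c) ** 2
                  / ((a + b) * (c + d) * (a + c) * (b + d)))
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return odds_ratio, chi_square, p_value


# Hypothetical counts: 100 cases and 100 controls, 200 alleles per group.
odds_ratio, chi_square, p_value = allele_association(60, 140, 40, 160)
print(f"OR = {odds_ratio:.2f}, chi-square = {chi_square:.2f}, P = {p_value:.3f}")
```

A nominally significant result from a single table of this kind is exactly what the cautions in this section apply to: sample size, ethnicity, and the number of markers tested all affect how much weight it deserves.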
Currently, 115 different genetic loci have been reported to be in association with SLE, but there are conflicting reports that claim no association for 56 of these. Of course, many of these 'conflicting' reports were conducted in patients from different ethnic groups, and so both reports may be correct and merely indicate ethnic specificity for a gene. There are also 71 genes for which only a single study has been published to date. Within these 39 positive and 32 negative analyses, there exist both strong associations in large cohorts (which are generally more reliable) and weak associations in small, isolated populations. It therefore remains to be seen which of these unconfirmed associations will prove to be consistent in future studies. Sample size, ethnicity, and number of SNPs studied should be considered when reading a single report on the role of any given gene in SLE, and one must keep in mind that the literature in this area is vast and multiple studies often exist. It is currently believed that on the order of 20 to 40 genes have variants that play a role in SLE risk. Therefore, although the majority of the genetic risk factors for SLE may be on this list of 115, we do not yet know which ones are really important. Nevertheless, out of this body of work in progress, several strong associations rise to the top. These include components of the C3b activation pathway, the FCGRs, HLA region genes, and a number of genes that have been implicated in immune regulation, as listed in Table 3.

Complement deficiencies
There are a few instances in which mutation of a single gene causes lupus or a lupus-like syndrome. The most common of [...], which may prove to be even more important than delayed immune complex clearance in the pathogenesis of SLE in these patients.
Available online http://arthritis-research.com/content/9/3/210

Human leukocyte antigen region
After considering these single genes, perhaps the next clearest genetic effect in SLE is in the HLA region. Although multiple studies conducted during the past 40 years have shown clear HLA associations [4,26,37], it is currently uncertain which gene or genes may be responsible for increasing genetic risk. This region contains not only the HLA class I, II and III genes, but also the genes that encode complement components C2 and C4, tumor necrosis factor (TNF)-α and TNF-β (also known as lymphotoxin-α [LTA]), transporter associated with antigen processing (TAP)1 and TAP2, butyrophilin-like protein 2, and numerous heat-shock protein genes and others with possible immune significance. Furthermore, these genes are often inherited as a block, a phenomenon known as linkage disequilibrium, so that, for example, the TNF-α -308A variant associated with overexpression is often found in a haplotype block that also contains HLA-B8, C4A null, and HLA-DR3 [38]. It is unfortunate that most of the studies in this region focus on a single marker, most often either HLA-DR or TNF-α, either of which could be responsible for this extended haplotype. To add further confusion to the issue of pathogenicity, more than one HLA allele has been found to associate with disease, for example DR3 and DR2, and these associations are not necessarily confined to a single ethnic group [37].
It is reasonable to think that any of these variants could contribute to the lupus phenotype. As discussed above, complete deficiency of C2 or C4 appears to cause SLE, and more subtle alterations in the classical pathway may also cause some tendency toward autoimmunity. The HLA proteins are directly involved in antigen presentation, and in some cases, such as in HLA-B27 arthritis, this has been shown to lead to alteration in the immune repertoire [39]. TNF-α is central to regulation of many inflammatory pathways, and treatment with TNF-α inhibitors can cause lupus flares or lupus-like symptoms, possibly through upregulation of IFN-α [40], indicating a complex role for this cytokine in SLE. TNF-β is key to the formation of normal lymph nodes [41], and the gene encoding TNF-β is one of the most consistently associated across populations, with seven positive reports and no negative reports to date [26]. TAP1 and TAP2 are involved in peptide processing for antigen presentation, and transgenic mice that lack TAP are resistant to experimentally induced SLE [42]. It is possible that any one of these defects alone could predispose to SLE, but it is also possible that it takes some combination of 'hits' to produce an extended haplotype that correlates more directly with disease risk.

Meta-analysis of candidate genes in systemic lupus erythematosus
When there are several studies on the same allele in a gene, meta-analysis can be a useful tool for sorting out any conflicting reports, but usually there are not enough studies performed to make this practical. For some of the more extensively studied genes, however, it represents a powerful tool. For example, the majority of the reports on the gene encoding interleukin-10 support association, as does a recent meta-analysis [43], although there is a body of literature that supports only associations with specific phenotypes as well as half a dozen negative reports. Meta-analysis also favors association with the mannose-binding lectin (MBL) gene, although the individual reports are evenly divided in favor and in opposition [44]. The situation is similar with cytotoxic T lymphocyte associated antigen (CTLA)4, in which association is also favored even though the literature is mixed, with the strongest effects seen in Asian populations [45]. TNF-α also has mixed reports but a positive meta-analysis, particularly in European-Americans [46]; interpretation of this finding is problematic, however, because TNF-α is in linkage disequilibrium with HLA-DR [38]. The cumulative data support an association of the protein tyrosine phosphatase nonreceptor (PTPN)22 gene with SLE as well [47]. Meta-analyses are not always positive, however; meta-analysis of the data on the widely studied insertion/deletion polymorphism in the angiotensin-converting enzyme gene does not favor association with SLE or lupus nephritis [48].
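For readers unfamiliar with how such pooled estimates are obtained, the following sketch shows one standard approach, a fixed-effect inverse-variance meta-analysis of log odds ratios. The cited meta-analyses may have used other models (for example, random-effects models), and the study values below are invented for illustration only; the function name is likewise hypothetical.

```python
import math


def pooled_odds_ratio(studies):
    """Fixed-effect inverse-variance pooling of odds ratios.

    `studies` is a list of (odds_ratio, standard_error_of_log_odds_ratio).
    Returns (pooled_or, ci_low, ci_high) with an approximate 95% interval.
    """
    if not studies:
        raise ValueError("at least one study is required")
    weights = [1 / se ** 2 for _, se in studies]
    log_ors = [math.log(odds_ratio) for odds_ratio, _ in studies]
    total_weight = sum(weights)
    pooled_log = sum(w * y for w, y in zip(weights, log_ors)) / total_weight
    pooled_se = math.sqrt(1 / total_weight)
    ci_low = math.exp(pooled_log - 1.96 * pooled_se)
    ci_high = math.exp(pooled_log + 1.96 * pooled_se)
    return math.exp(pooled_log), ci_low, ci_high


# Invented example: one positive and one null study of equal precision.
pooled, low, high = pooled_odds_ratio([(2.0, 0.2), (1.0, 0.2)])
print(f"pooled OR = {pooled:.2f} (95% CI {low:.2f} to {high:.2f})")
```

The example shows how two individually discordant reports can still yield a pooled interval that excludes 1, which is the pattern described above for genes whose individual reports are divided.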

Single gene defects reflected in mouse models of lupus
A number of murine models of lupus have been characterized, and they include both transgenic constructs and strains with naturally occurring disease (for review, see [49][50][51][52][53][54][55][56]). Some of these animal models have led us to discover single gene deficiencies that are found rarely in human disease, as well as a number of candidate genes for association. MRL/lpr mice, a murine model of lupus, are deficient in Fas [57], and deficiencies of Fas in humans cause autoimmune lymphoproliferative syndrome [58]. DNAse1-deficient mice also serve as a model of lupus [59], and there are reports of DNAse1 deficiency leading to SLE in human families [60,61]. Although these examples suggest a general effect, the literature contains roughly equal numbers of reports for and against the association of Fas, Fas ligand, and DNAse1 with SLE in larger case-control studies [26]. It is therefore difficult to draw any firm conclusions about the role of these genes in SLE pathogenesis in the general population. Other interesting mouse knockout models that develop autoimmune phenotypes include those for C1Q [62], Fcγ [63], and Toll-like receptor-7 [64]. Mapping of the genes responsible for disease in spontaneous models of lupus in mice is another area of active interest, and the overlap between the search for autoimmune genes in human and mouse is due to expand and integrate rapidly as new technologies are brought to bear on this area [51].

Interferon-related candidate genes
Genes in the IFN family have also been implicated in SLE. The well known IFN signature [65] has provided the inspiration for a number of candidate gene studies. Initial associations with IFN-γ [66] and the IFN receptors [67] have not been confirmed in additional cohorts [68][69][70]. Most recently, however, a study conducted in a large Nordic cohort [71] demonstrated an association with the IFN-regulatory factor (IRF)5 gene, and this has sparked a flurry of strong confirmation and characterization reports [72][73][74]. These four studies all confirm the association of IRF5 with SLE, which appears to be quite robust, although the genetics in this region are complex and several variations appear to combine to form the risk haplotypes [74]. Additional work characterizing the IRF5 alleles associated with SLE is in progress.

Epigenetic work
Epigenetics refers to the inherited chromatin changes that alter gene expression without affecting DNA sequence. Although there is clear evidence that genetic factors contribute to the pathogenesis of lupus, as detailed above, epigenetic abnormalities have also been implicated in this disease. Over the past 20 years, a series of reports documented a role for abnormal DNA methylation in the pathogenesis of both drug-induced and idiopathic lupus [75]. DNA methylation is an epigenetic mechanism in which a methyl group, donated by S-adenosylmethionine, is added to the fifth carbon of cytosine residues within CpG dinucleotide pairs. CpG pairs located within CpG islands are present in promoter sequences of about 40% to 50% of mammalian genes [75]. In general, methylated CpG pairs suppress gene expression whereas hypomethylated CpG pairs are associated with transcriptional activity [76].
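Because CpG dinucleotides are defined purely by sequence, locating candidate methylation sites is straightforward to illustrate. The sketch below is a minimal example using an invented sequence; it lists the positions of CpG dinucleotides on one strand and computes the observed-to-expected CpG ratio, a heuristic often used when screening sequence for CpG islands. Neither the heuristic nor the function names are taken from the work reviewed here, and no island threshold is implied.

```python
def cpg_positions(sequence):
    """Return 0-based positions of the C in each CpG dinucleotide."""
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def cpg_observed_expected(sequence):
    """Observed/expected CpG ratio: (CpG count x length) / (C count x G count).

    Returns 0.0 when the sequence contains no C or no G.
    """
    seq = sequence.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return len(cpg_positions(seq)) * len(seq) / (c_count * g_count)


# Invented promoter-like fragment, for illustration only.
fragment = "ACGTCGCG"
print(cpg_positions(fragment))
print(round(cpg_observed_expected(fragment), 2))
```

Each position returned marks a cytosine that could, in principle, carry a methyl group; whether it does in a given cell type is an experimental question that sequence alone cannot answer.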
DNA methylation serves several functions, such as suppressing unnecessary genes during tissue differentiation, inhibiting the expression of parasitic DNA, genomic imprinting, and female X chromosome inactivation. De novo DNA methylation takes place early on in fetal life and during differentiation, and is mediated by DNA methyltransferase (DNMT)3a and DNMT3b enzymes, which are capable of methylating previously unmethylated DNA. The pattern of DNA methylation is then maintained during cell division by the enzyme DNMT1 [77].
Global hypomethylation in T cell DNA has been described in lupus [78]. Indeed, this was subsequently found to result from reduced expression of DNMT1 in lupus T cells [79].
Lupus-inducing drugs such as procainamide and hydralazine result in T cell hypomethylation in vitro [80], similar to that seen in T cells from patients with active lupus. Although procainamide is a competitive inhibitor of DNMT1, hydralazine reduces DNMT1 expression by inhibiting signaling through the extracellular signal-regulated kinase (ERK) pathway, which, at least in part, regulates DNMT1 expression in T cells [81]. T cells treated with DNA methylation inhibitors or ERK pathway signaling inhibitors become autoreactive in vitro and cause autoimmunity, manifested as lupus-like disease, when injected into syngeneic mice [75]. For example, D10 mouse T cells treated with 5-azacytidine and adoptively transferred into syngeneic female AKR mice resulted in anti-dsDNA antibodies, anti-histone antibodies, immune complex glomerulonephritis, alveolitis, and meningitis [82]. Hypomethylation in lupus T cells is thought to contribute to the increased expression of several methylation-sensitive genes, including ITGAL (CD11a), PRF1 (perforin), and TNFSF7 (CD70) [83]. The expression of these genes is increased in lupus T cells, in T cells treated with the DNA methylation inhibitor 5-azacytidine, and in T cells treated with the lupus-inducing drugs procainamide and hydralazine [75]. Promoter sequence hypomethylation of these genes has been demonstrated in T cells from lupus patients, and the pattern of hypomethylation is similar to that observed in T cells from normal donors that are treated with DNA methylation inhibitors in vitro [75].

Conclusion
Many of the important genetic risk factors for SLE have been discovered through linkage and association studies, and the body of work in this area is impressive. Nine linkage regions have been established and confirmed for SLE, and an additional seven linkage regions have been established and confirmed using stratification by clinical and laboratory criteria. Two high-throughput platforms for SNP typing have been developed in recent years: Affymetrix GeneChip® Mapping Arrays (Affymetrix, Inc., Santa Clara, CA, USA), which type up to 500,000 SNPs at a time; and the Illumina HumanHap300-Duo bead chip system (Illumina, Inc., San Diego, CA, USA), which covers 318,000 markers largely derived from the Phase I HapMap set. Both companies plan the release of improved technology within the year, with Affymetrix releasing a new gene chip system covering over a million SNPs and Illumina releasing a new bead chip with expanded coverage of Phase II HapMap SNPs and improved MHC coverage. The first major whole-genome scan using high-throughput SNP technology is now in progress, and we expect it to confirm many of the known effects as well as allow discovery of new gene associations. The major effects confirmed through more traditional single gene studies include complement components C2, C4, and C1q, the HLA region, FCGR2A and FCGR3A, PDCD1, CTLA4, interleukin-10, MBL, and PTPN22. There are nearly 100 other genes that have been reported to be associated with SLE, the majority of which are either disputed or unconfirmed at this time. Much additional work remains to be done in this area. The ways in which these genes might interact also remain to be explored, and combinations of susceptibility factors may prove to be powerfully predictive. Epigenetic factors such as DNA hypomethylation are also likely to play a role in lupus pathogenesis.
The future of lupus genetics is exciting and complicated. As the major research projects currently underway come to fruition, we will see the largest cohort to date undergo a high-density association genome scan. These data will be correlated with both clinical information and gene expression data. Although data analysis will be complicated and rife with false-positive effects, the end result should be the clearest picture of the cascade from risk allele to immune pathology that we have been able to generate to date. These models will not be nearly as biased by prior information, because both the allele association and gene expression data will be gathered globally, without focusing on what 'should be' downstream of each effect. With such a large dataset to explore, gene interaction effects should become clearer and unanticipated relationships will probably emerge. Candidate gene discovery from murine lupus models is also reaching a threshold, and the cross-talk between human and murine studies will continue to fuel productive research. As new data are gathered and analyzed, we should be able to sort through the false-positive effects more easily and understand the interactions of the true-positive effects. This will enable us to build a more cohesive picture of the genetic risk factors that are involved in the development of SLE and give direction for new and innovative therapeutic options.