Gene expression signatures for autoimmune disease in peripheral blood mononuclear cells
© BioMed Central Ltd 2004
Received: 10 November 2003
Accepted: 21 April 2004
Published: 29 April 2004
The relatively new technology of DNA microarrays offers the possibility to probe the human genome for clues to the pathogenesis and treatment of human disease. While early studies using this approach were largely in oncology, many new reports are emerging in other fields including infectious diseases and pharmacology, and applications in autoimmunity have been recently reported by our group and others. Some of these investigations have examined animal models of autoimmune disease, but a number of human studies have also been carried out. Of special interest are those that have used peripheral blood samples because, unlike tissue biopsies, these are readily available from all subjects. Using this approach, patterns of gene expression can be detected that distinguish patients with autoimmune conditions from normal subjects. Furthermore, the genes that are identified provide clues to possible pathogenetic mechanisms and are likely to be useful in developing tests to establish diagnostic categories and predict therapeutic responses.
The relatively new technology of DNA microarrays has made it feasible to measure the expression levels of thousands of genes in small biological samples. It has been suggested that this methodology might be especially useful in analyzing the complex and parallel changes that occur within cells and tissues of the immune system in normal and pathologic states. Much of the early work using DNA microarrays was in the field of oncology; other studies have examined host responses to infectious agents or drugs. The gene array approach is especially well suited to the type of multifactorial analysis that is needed to unravel the causes of human autoimmune disorders that involve both complex genetics and environmental factors [4, 5]. Studies in autoimmune disease have included the use of biopsy samples from affected patients, targeting tissues such as synovium, brain or skin [6–9]. While this approach can offer insights for some disease subsets, it does not permit study of all afflicted patients and cannot be applied to early phases of disease when therapeutic interventions are most likely to be useful. As an alternative, we and others have hypothesized that, because of the systemic nature of autoimmune disease, clinically relevant changes in gene expression should be observed in peripheral blood mononuclear cells (PBMCs). Using peripheral blood as the source of gene expression material offers the possibility of sampling any individual at any time and also has the potential to detect early pathogenetic and prognostic factors. This review examines studies in autoimmune disease, focusing on the utility of peripheral blood samples to identify genes of interest. The potential for this approach to provide insights into disease pathogenesis and to aid with diagnosis and management is also discussed.
Gene expression studies of peripheral blood mononuclear cells from patients with autoimmune diseases
Subjects | Gene arrays used | Major findings | Reference
MS (n = 27) and normal controls (n = 19) | 14,000 cDNA clones | MS patients distinguished from normal controls using 53 discriminatory genes | Bomprezzi et al., 2003
MS (n = 10) | Mini-lymphochip with >12,000 elements | Identification of a set of genes regulated by IFN-β | Sturzebecher et al., 2003
SLE (n = 21) and normal controls (n = 12) | Cytokine gene array with 375 genes in duplicate | Clustering distinguished most patients from controls | Rus et al., 2002
SLE (n = 48) and normal controls (n = 42) | Affymetrix U95A array with >10,000 genes | Dysregulation of genes in the IFN pathway present in SLE patients with active disease | Baechler et al., 2003
Pediatric SLE (n = 30); JCA (n = 12); normal children (n = 9) | Affymetrix U95AV2 array with >12,000 genes | SLE patients had overexpression of granulopoiesis-related and IFN-induced genes related to the presence of active disease | Bennett et al., 2003
SLE, RA, MS, IDDM (n = 53) and normal controls (n = 9) | Research Genetics/Invitrogen with >4300 genes | Autoimmune patients clustered separately from normal and immunized controls | Maas et al., 2002
While peripheral blood offers many advantages as a source of analysis material, one potential drawback is the small quantities of RNA that can be reasonably obtained. Surprisingly, information about the amount of blood needed to produce an analyzable sample has not been uniformly reported; one group used lymphocytopheresis, suggesting a need for large numbers of cells . Early chip protocols often required more than 25 µg of total RNA, which could only be obtained by using large blood volumes. This could be problematic, especially in studies of children or seriously ill subjects. In our initial studies, the gene filter from Research Genetics (now Invitrogen, Carlsbad CA), which contained clones for approximately 4300 identified human genes, was chosen because only 5 µg of total RNA was required and we were interested in testing the feasibility of analyzing small blood samples. These gene filters are, however, no longer available. Current recommendations for other platforms, such as the Affymetrix Gene Chip Arrays®, require no more than 5 µg total RNA, probably due to improved efficiency of the labeling techniques, and this can be readily attained from blood samples without amplification. Sample size, therefore, is probably no longer a limiting factor in experimental design.
Methods for verifying data from microarrays have become familiar to most users. Reproducibility has been achieved by performing replicate hybridizations of the same sample on different arrays [14, 17]. However, in general, replicate analyses are not required. In some studies, confirmation of the microarray findings has been accomplished using independent methods such as real-time PCR [14, 19] or detection of the encoded proteins. Of interest in human studies are clinical correlations made with gene expression levels that fit with predicted changes. For example, in a study of childhood SLE, the only patient in complete remission was clustered with the healthy controls, suggesting that the signature expressed in the ill patients was disease-related, and in an MS trial of interferon-β (IFN-β), clinically defined responders and non-responders showed differences in gene expression profiles.
The large amount of data generated in microarray experiments necessitates filtering to focus on the genes of interest. Approaches to this issue have included requiring that each gene have a minimal intensity across all conditions [12, 15], and eliminating genes without significant changes from further analysis. For studies in PBMC populations, analyses are generally limited to the approximately 5000 genes that are expressed in these cells. Other investigators have applied additional requirements, such as eliminating genes that show changes in expression levels with collection or shipping of the samples, although the advent of RNA stabilization tubes for blood collection may make this less of a concern in future studies.
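The two filters described above can be illustrated with a minimal sketch. It is not taken from any of the cited studies: the gene names, the intensity floor of 100, and the use of a two-fold difference in group means as a stand-in for "significant change" are all assumptions chosen for illustration.

```python
from statistics import mean

def filter_genes(expression, patient_idx, control_idx,
                 min_intensity=100.0, min_fold=2.0):
    """Return the names of genes that pass both filters.

    expression: dict mapping gene name -> list of intensities, one per sample.
    min_intensity and min_fold are illustrative thresholds (assumptions);
    min_intensity must be positive so the fold ratio is well defined.
    """
    kept = []
    for gene, values in expression.items():
        # Filter 1: every sample must reach the minimal intensity.
        if min(values) < min_intensity:
            continue
        patient_mean = mean(values[i] for i in patient_idx)
        control_mean = mean(values[i] for i in control_idx)
        # Filter 2: drop genes whose group means differ by less than min_fold.
        ratio = max(patient_mean, control_mean) / min(patient_mean, control_mean)
        if ratio >= min_fold:
            kept.append(gene)
    return kept

# Toy data: three patient samples followed by three control samples.
toy = {
    "GENE_A": [400, 420, 380, 150, 160, 170],  # bright and changed
    "GENE_B": [300, 310, 290, 305, 295, 300],  # bright but unchanged
    "GENE_C": [40, 900, 35, 30, 20, 25],       # fails the intensity floor
}
selected = filter_genes(toy, patient_idx=[0, 1, 2], control_idx=[3, 4, 5])
print(selected)
```

In practice a statistical test with correction for multiple comparisons would usually replace the simple fold-change cut-off used here.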
Most studies of autoimmune disease have used normal controls that reflect the demographics of the patient population of interest. Disease controls have also been used, as in the case of juvenile polyarthritis patients who were compared to juvenile SLE patients. We considered that prior to embarking on an analysis of immune responses in disease states, it would be of interest to establish parameters of the normal host response to an exogenous antigen challenge. This approach permitted verification of the feasibility of the design as well as establishment of a comparator for autoimmune diseases.
Multiple sclerosis is an organ-specific autoimmune disorder targeting myelinated fibers in the central nervous system. The disease may have many modes of presentation and has clinically distinct subsets. One report has described microarray findings in MS brain lesions using autopsy samples from human subjects. This study highlighted several genes, including some involved in T-cell activation and neurotransmitters, which have potential relevance in designing targeted therapies. More relevant to the current discussion are studies that have been done using PBMCs from MS patients. Since brain lesions are not readily available for biopsy, the possibility of obtaining useful information from the peripheral blood in this disorder has great potential for clinical applications. In one study comparing PBMCs from MS patients to normal control subjects, more than a thousand differentially expressed genes were found; 53 of these were used to discriminate normal subjects from MS patients. Genes in the upregulated category included several encoding components of the tumor necrosis factor signaling pathway. Downregulated genes included heat shock protein-70, and others encoding proteins involved in cell cycling. A second report generated using MS patients is the only one that has examined longitudinal specimens from patients enrolled in a clinical trial. Patients treated with IFN-β in this trial could be separated on the basis of MRI scan results into responders and nonresponders. Most responders showed changes in IFN-β-regulated genes, while few of the nonresponders showed these changes. Genes of interest encoded cytokines and chemokines (IL-8, granulocyte-macrophage colony-stimulating factor, IL-3 receptor) and signaling molecules (JNK1, Jun B, PKC-β). An implication of this study is that patients who have a greater chance of responding to IFN-β might be identified by gene expression patterns in PBMCs. This hypothesis remains to be verified in a larger group of patients.
Three studies of PBMC gene expression in lupus patients have been published [14–16]. The earliest of these, published in 2002, used a cytokine gene array. Most of the changes observed were in genes that had not previously been identified as contributing to the pathogenesis of SLE, consistent with the view that microarray experiments are a method of data discovery . A major finding of this study in 21 SLE patients was that clustering analyses permitted clear separation of the patients from the controls, even with the relatively small number of genes (375) available on the array. No correlation with clinical disease status as measured by the SLE disease activity index (SLEDAI) score was seen, suggesting that the differences were related to the disease state itself and not to activity variables or medications.
A second, relatively large study compared 48 adult SLE patients to 42 healthy control subjects . Clustering analysis grouped 37 of the 48 SLE patients together while the remaining patients were clustered with the control subjects. Most of the discriminatory genes were those that had higher expression levels in the SLE patients than in the controls. Especially notable was the finding that genes in the IFN-regulated pathway were upregulated in about half of the patients while control subjects expressed low levels. Furthermore, high levels of IFN-regulated gene expression could be used to identify patients who had more severe disease manifestations. A similar IFN signature has been described in pediatric SLE patients . Children with SLE were generally clustered separately from controls, with the only exception being a patient who did not have active disease. These two studies suggest that the IFN signature is related to disease activity and that blocking IFN pathways might have therapeutic efficacy in SLE.
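The clustering analyses mentioned in these lupus studies group samples by the similarity of their expression profiles. The sketch below shows one common form, agglomerative clustering with average linkage and Euclidean distance, on invented data. The specific method, the sample names and the expression values are assumptions for illustration and are not the procedures or data of the cited reports.

```python
from math import sqrt

def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(profiles[i], profiles[j]) for i in c1 for j in c2]
    return sum(dists) / len(dists)

def cluster_samples(profiles, n_clusters=2):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], profiles)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical log-scale expression of three genes per sample.
toy_profiles = {
    "SLE_1": [8.1, 7.9, 2.0],
    "SLE_2": [7.8, 8.3, 2.2],
    "SLE_3": [2.1, 2.4, 2.1],  # invented patient whose profile resembles controls
    "CTL_1": [2.0, 2.2, 2.1],
    "CTL_2": [2.3, 1.9, 1.8],
}
groups = cluster_samples(toy_profiles)
print(groups)
```

In this toy example the third patient falls into the control cluster, which mirrors the kind of result reported for patients without active disease.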
Our group has studied gene expression in adult subjects with four autoimmune disorders: RA, MS, IDDM, and SLE. In each instance, the patients were not restricted in any way other than satisfying appropriate diagnostic criteria (RA, SLE; [21, 22]) or being identified by a specialist physician (MS, IDDM). Individuals were not excluded on the basis of any clinical variables such as what medications they were taking or how long they had had disease. PBMC samples from the autoimmune patients were compared with those from control subjects by flow cytometry and no significant differences were seen. In addition, expression levels of genes encoding activation markers (CD54, CD38, CD71) did not differ significantly between the autoimmune patients and control subjects. These findings suggested that it was valid to compare gene expression levels in PBMC preparations between the two groups of subjects.
Differentially expressed autoimmune genes
Underexpressed in autoimmune
    TRADD, TRAP1, TRIP, TRAF2, CASP6, CASP8, TP53, SIVA
    UBE2M, UBE2G2, POH1
    Cell cycle inhibitors: CDKN1B, CDKN2A, BRCA1
Overexpressed in autoimmune
    CSF3R, HLA-DMB, HLALS, TGFBR2, BMPR2
    MSTP9, BDNF, CES1, CYR61
    FASTK, DGKA, DGKD
Downregulation of p53 explains a significant portion of the differentially expressed genes in the autoimmune signature, suggesting that this single gene may be central to the autoimmune state. In other ongoing studies we have confirmed, by independent methods, that cellular damage response pathways that are dependent on p53 are defective in patients with RA and MS (S Sriram and T Aune, unpublished data). A cluster of 95 overexpressed genes was more heterogeneous, representing several distinct functional categories, including receptors, inflammatory mediators, signaling molecules and autoantigens (Table 2).
Our findings extended those of other groups (working in MS and SLE) by including patients with RA and IDDM, and by offering direct comparisons between them. The similar gene expression findings in these clinically diverse conditions are consistent with the hypothesis that autoimmune disorders share an underlying pathogenesis. Furthermore, the differences between autoimmune and vaccinated subjects suggest that autoantigens elicit responses that are distinct from the normal host defenses to exogenous antigens.
Prediction of disease class is a major goal of microarray studies. Since gene expression patterns permit clustering of patients with autoimmune disease from normal control subjects, there has been significant interest in using the gene expression data to classify disease subsets and predict responses to treatments. In one study of MS patients, two genes (HIF2 and CKS2) were used to discriminate between MS patients and controls. Although the correct prediction rate was 80%, the separation was not complete and some samples were misclassified. In SLE patients, Baechler et al. showed a significant correlation between gene expression data and number of SLE criteria (r = 0.51; P = 0.002), and they were also able to use the IFN score to distinguish SLE patients from controls with a high degree of accuracy (P = 2.7 × 10⁻⁷). More than half of the SLE patients, however, had IFN scores that were not distinguishable from controls. The IFN score could, therefore, only be used to distinguish patients with more active SLE. Studies in children with SLE also showed a correlation between the IFN signature and disease activity as measured by the SLEDAI, reinforcing the hypothesis that the IFN signature is a measure of disease activity or severity.
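To make the idea of a signature score and its correlation with a clinical variable concrete, the sketch below computes a per-patient score as the average z-score of a gene set relative to controls, then a Pearson correlation against a count of clinical criteria. This is one plausible construction and is an assumption: the source does not describe how the IFN score was calculated, and the gene names, expression values and criteria counts here are invented.

```python
from math import sqrt
from statistics import mean, stdev

def signature_score(sample, control_samples, genes):
    """Average z-score of the chosen genes relative to the control group."""
    z_values = []
    for g in genes:
        ctl = [c[g] for c in control_samples]
        z_values.append((sample[g] - mean(ctl)) / stdev(ctl))
    return mean(z_values)

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

genes = ["IFN_GENE_1", "IFN_GENE_2"]  # placeholder names, not real gene symbols
controls = [
    {"IFN_GENE_1": 1.0, "IFN_GENE_2": 2.0},
    {"IFN_GENE_1": 1.2, "IFN_GENE_2": 2.2},
    {"IFN_GENE_1": 0.8, "IFN_GENE_2": 1.8},
]
patients = [
    {"IFN_GENE_1": 1.1, "IFN_GENE_2": 2.1},  # score close to controls
    {"IFN_GENE_1": 2.0, "IFN_GENE_2": 3.5},
    {"IFN_GENE_1": 3.0, "IFN_GENE_2": 4.0},
]
criteria_counts = [4, 5, 7]  # invented number of clinical criteria per patient

scores = [signature_score(p, controls, genes) for p in patients]
r = pearson_r(scores, criteria_counts)
print(scores, r)
```

The first toy patient has a score near zero, illustrating how a patient can carry the diagnosis yet be indistinguishable from controls on such a score.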
To further test the predictive value of this equation, new sets of SLE and RA patients that were not included in the initial data analysis were subjected to scoring; none of these individuals had a score greater than 6, confirming that they belonged to the autoimmune set. Although normal subjects had generally high scores, this was not the case for four individuals tested who were first-degree relatives of autoimmune patients. All four of these normal individuals had a score of 0 in the 35-gene equation, indicating that they also carried the autoimmune signature (Fig. 3).
The autoimmune signature defined by the 35-gene equation most likely represents an inherited liability for development of disease rather than a consequence of the disease or its treatment. There are two reasons for this, the first of which is derived from the results in first-degree relatives, as described above. These persons did not have a clinical disease and were not being treated with immunosuppressive medications yet carried the entire set of downregulated genes. The second is the observation that MS and diabetes patients were not receiving the same drugs as those in the SLE and RA groups. Many of the MS patients were being treated with IFN without glucocorticoids; none of the IDDM patients were on immunosuppressives. In addition, since patients represented a broad range of disease activity and severity, it appears that these clinical variables did not affect expression of this signature. Thus, the autoimmune signature may confer a liability for development of disease, while the specific disease syndrome that develops is likely dependent on additional factors that might include other genes or environmental stimuli like microbes or hormones. The concept that different autoimmune diseases share basic features is, in fact, not new. Furthermore, in clinical practice it is not uncommon for patients with features of more than one autoimmune syndrome to present diagnostic dilemmas. Occurrence of multiple autoimmune diseases within a family is also relatively common, suggesting that similar genetic components can underlie very different clinical syndromes. Studies in progress in autoimmune families suggest that many of the genes in the autoimmune signature display high levels of heritability (K Maas and T Aune, unpublished data).
Relationships between gene expression variables are very complex and patterns that have clinical significance may take many forms. It is likely that combinations of variables that utilize operators other than addition and subtraction will be revealing of significant relationships. Symbolic discriminant analysis (SDA) is an alternative approach that has been developed to identify complex gene relationships which may be nonlinear . We used SDA to compare the gene expression data derived from normal individuals to patients with either RA or SLE . Personal computers are not sufficiently powerful to cope with SDA requirements so analyses were carried out on four processors of the Vanderbilt Multi-Processor Integrated Research Engine (VAMPIRE), a 110-processor computer system running the Linux operating system. Cross validation was used to verify that associations detected were reproducible and not due to chance alone.
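The role of a non-additive operator and of cross validation can be shown with a small sketch. It is not SDA itself, which searches over many candidate symbolic expressions; here a single hypothetical discriminant (the ratio of two genes) is fixed in advance and evaluated by leave-one-out cross validation. The gene names, expression values and the midpoint-threshold rule are assumptions for illustration.

```python
def discriminant(sample):
    """Hypothetical nonlinear combination: ratio of two expression values."""
    return sample["gene_x"] / sample["gene_y"]

def fit_threshold(train):
    """Midpoint between the class means of the discriminant score."""
    pos = [discriminant(s) for s, label in train if label == 1]
    neg = [discriminant(s) for s, label in train if label == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def leave_one_out_accuracy(data):
    """Hold out each sample in turn, fit on the rest, and test on the held-out one."""
    correct = 0
    for i, (sample, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        threshold = fit_threshold(train)
        predicted = 1 if discriminant(sample) > threshold else 0
        correct += predicted == label
    return correct / len(data)

# Invented data: label 1 = patient, label 0 = control.
# Neither gene alone separates the groups, but their ratio does.
toy_data = [
    ({"gene_x": 6.0, "gene_y": 2.0}, 1),
    ({"gene_x": 9.0, "gene_y": 2.5}, 1),
    ({"gene_x": 3.0, "gene_y": 1.0}, 1),
    ({"gene_x": 2.0, "gene_y": 2.0}, 0),
    ({"gene_x": 4.5, "gene_y": 3.0}, 0),
    ({"gene_x": 1.0, "gene_y": 1.2}, 0),
]
accuracy = leave_one_out_accuracy(toy_data)
print(accuracy)
```

Because the threshold is refitted without the held-out sample each time, the reported accuracy is less likely to reflect a chance fit to the full data set.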
Differentially expressed autoimmune genes detected by symbolic discriminant analysis
RA versus control
    Protein phosphatase 2, regulatory subunit B,
    Ubiquitin fusion degradation 1-like
    PDZ domain protein
    Chondroitin sulfate proteoglycan 2
    EST similar to 26S protease regulatory subunit 6
SLE versus control
    Bone morphogenic protein receptor type II
    Calgizzarin, calgranulin family member
    Nucleolar protein I
    EST, similar to platelet factor 4
    Calcineurin A subunit
The application of gene expression analysis to the study of human disease is a new and rapidly evolving area of investigation. These are powerful techniques that permit capture of many simultaneous processes and convert the findings into quantitative, reproducible data. The large numbers of data points that can be generated from a single individual make it likely that significant findings can emerge from smaller groups of subjects compared with previous approaches. Furthermore, the patterns and associations that are developed between individual genes afford a close look at molecular processes. One might consider that in diseases with complex etiologies, the study of one gene in 5000 people may be less informative about the disease process than the study of 2500 genes in a few individuals. The former approach is critically dependent on choosing the correct gene for study. The latter approach may reveal new genes of interest, and in that way has the potential to generate novel hypotheses. The fact that these powerful studies can be carried out in human subjects and not just in animal models is likely to advance discovery of new ways to diagnose, classify and treat human autoimmune disease.
IDDM = insulin-dependent diabetes mellitus
PCR = polymerase chain reaction
PBMC = peripheral blood mononuclear cell
SDA = symbolic discriminant analysis
SLE = systemic lupus erythematosus
SLEDAI = SLE disease activity index
Special thanks are extended to the Vanderbilt physicians who allowed us to study their patients. Support was from NIH (AI44924, AR41943, DK58765, AI053984 and CA90949), a Vanderbilt University Medical Center Discovery Grant and the Morgan Family Foundation. JHM is supported in part by the Vanderbilt-Ingram Cancer Center.