Identification of candidate genes and chemicals associated with osteoarthritis by transcriptome-wide association study and chemical-gene interaction analysis

Background Osteoarthritis (OA) is a common degenerative joint disease and causes chronic pain and disability to the elderly. Several risk factors are involved, such as aging, obesity, genetic susceptibility, and environmental factors. We conducted a transcriptome-wide association study (TWAS) and chemical-related gene set enrichment analysis (CGSEA) to investigate the susceptibility genes and environmental factors. Methods TWAS analysis was conducted to identify the susceptibility genes by integrating the summary-level genome-wide association study data of knee OA (KOA) and hip OA (HOA) with the precomputed expression weights from the Genotype-Tissue Expression Project (Version 8). The FUSION software was used for both single-tissue and cross-tissue TWAS, which were combined using an aggregate Cauchy association test. The biological function and pathways of the TWAS genes were explored using the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) databases, and the human cartilage mRNA expression profiles were utilized to validate the TWAS genes. CGSEA analysis was performed to scan the OA-associated chemicals by integrating the TWAS results with the chemical-related gene sets. Results There were 44 and 93 unique TWAS genes identified in 7 and 11 chromosomes for KOA and HOA, respectively, fourteen and four of which showed significantly differential expression in the mRNA profiles, such as CRHR1, LTBP1, WWP2, LMX1B, and PTHLH. OA-related pathways were found in the KEGG and GO analysis, such as TGF-beta signaling pathway, MAPK signaling pathway, hyaluronan metabolic process, and chondrocyte differentiation. Forty-five OA-associated chemicals were identified, including quercetin, bisphenol A, and cadmium chloride. Conclusions Several candidate OA-associated genes and chemicals were identified through TWAS and CGSEA analysis, which expanded our understanding of the relationship between genes, chemicals, and their impact on OA. Supplementary Information The online version contains supplementary material available at 10.1186/s13075-023-03164-x.


Introduction
Osteoarthritis (OA), a common degenerative joint disease, causes chronic pain and disability in the elderly.According to the data from the Global Burden of Disease project, the age-standardized point prevalence and annual incidence rate of OA were 3754.2 and 181.2 per 100,000 in 2017, with an increase of 9.3% and 8.2% from 1990, respectively [1].
While the pathogenesis of OA is not entirely explained, the risk factors for OA development have been demonstrated in epidemiologic studies, such as age, obesity, ethnicity, family history and genetic factors, and environmental factors [2,3].Moreover, genetic factors have been reported to have a significant contribution to knee OA (KOA) and hip OA (HOA), and the heritability has been estimated to be 60% for the HOA and 50.4% for the KOA [4,5].Recently, genetic studies have yielded novel insights into the genetic propensity of OA.Several genome-wide association studies (GWAS) have identified more than 100 loci associated with OA [6][7][8][9][10], further elucidating the genetic architecture of OA.However, the specific mechanism between those genetic variants and OA has not been fully investigated.The genetic variants have been demonstrated to have impact on the gene expression to further influence the phenotypes [11,12], and transcriptome-wide association studies (TWAS) have offered the chance to integrate the summarylevel GWAS data with expression quantitative trait loci (eQTL) references to identify the trait-related genes.
Previous studies have identified the OA-associated genes by integrating the TWAS results and mRNA expression profiles for HOA and KOA [13,14].However, both studies adopted the skeletal muscle and blood as the eQTL references, which are not the causally relevant tissue of OA.In addition, performing TWAS using the eQTL panels of non-trait-related tissues leads to noncausal hits and dropped out of the real causal ones, and cross-tissue TWAS has been recommended if there is no closely trait-related tissue available [15].
In the present study, we conducted both single-tissue and cross-tissue TWAS using summary-level GWAS data of HOA and KOA to investigate gene-trait associations.Subsequently, biological pathways of the TWAS genes were explored.Finally, chemical-related gene set enrichment analysis (CGSEA) was performed to identify chemicals associated with OA using the Comparative Toxicogenomics Database (CTD).
As there was no causally OA-related tissue reference panel, such as cartilage and chondrocyte, we utilized all GTEx reference panels in the TWAS as recommended [15].Additionally, to enhance the power of the TWAS, we performed a sparse canonical correlation analysis (sCCA) TWAS with cross-tissue reference panels publicly available on the FUSION website.The single-tissue TWAS and sCCA-TWAS were subsequently merged using an aggregate Cauchy association test (ACAT).The comprehensive analytical methodology of the sCCA-ACAT approach was demonstrated in the previous study (https:// github.com/ fengh elian/ sCCA-ACAT_ TWAS) [20].
The TWAS was performed on the autosomal chromosomes with the default settings.A strict Bonferronicorrected P-value (P.Bonferroni ) < 0.05 was considered as a threshold for the significant TWAS associations.

Functional exploration of significant TWAS genes
To investigate the potential biological function of the identified significant TWAS genes, enrichment analysis was conducted using the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) databases [21,22].The KEGG and GO enrichment analysis was performed using the "clusterProfiler" R package (R Foundation for Statistical Computing, Vienna, Austria; https:// www.R-proje ct.org/).

Gene expression dataset of cartilage
For KOA, the present study leveraged the gene expression dataset of cartilage obtained from a previous study [23].This dataset comprised mRNA-sequencing data extracted from knee cartilage tissue of 18 healthy donors and 20 OA patients, characterizing all genes in terms of log2 fold change (FC) and an adjusted P-value.To ascertain differentially expressed genes (DEGs), a significance threshold of adjusted P-value < 0.05 was applied.
To validate the HOA TWAS genes, a total of 10 pairs of preserved and OA cartilage samples (age ≥ 73) from the Research Arthritis and Articular Cartilage study were included to analyze the gene expression profile of hip joint [24].We used the GEO2R program in Gene Expression Omnibus database (https:// www.ncbi.nlm.nih.gov/ geo/) to find out the DEGs, with a significance threshold of P-value < 0.05.

Chemical-gene expression interaction
The CTD (http:// ctdba se.org/) is a publicly available online database that provides access to data on chemical-gene interactions, chemical-disease associations, chemical-pathway associations, and chemical-phenotype associations.A flexible tool (CGSEA, https:// github.Fig. 1 The significant TWAS genes for knee OA with Z-scores across multiple reference panels.White spaces indicated the genes did not pass the significance threshold Fig. 2 The significant TWAS genes for hip OA with Z-scores across multiple reference panels.White spaces indicated the genes did not pass the significance threshold Fig. 3 Manhattan plots of the TWAS results.A and B showed the significant TWAS genes of knee OA and hip OA, respectively, across all autosomes.The blue lines indicated the significance threshold of the TWAS analysis.The labeled feature was the most significant one when there were multiple features with the same ID achieving significance com/ Cheng SQXJTU/ CGSEA) was introduced to screen potential chemicals implicated in complex diseases or traits, and 11,190 chemical-associated gene sets were generated using 1,788,149 annotation terms of chemicalgene pairs acquired from human and mice.The comprehensively analytical methodology was elucidated in the previous study [25].
In the present study, we integrated the TWAS results with the chemical-related gene sets to scan the candidate chemicals associated with OA using the CGSEA program with the default settings.The chemicals with both absolute value of normalized enrichment score (|NES|) > 1 and P-value < 0.05 were considered the significantly OAassociated chemicals.

Transcriptome-wide association studies
We identified 170 and 731 significantly associated genes (P.Bonferroni < 0.05) across multiple eQTL reference panels in the single-tissue TWAS of KOA and HOA, respectively (Figs. 1 and 2, Table S1-2).Amongst those features, 40 and 74 unique genes were identified in 7 and 11 chromosomes for KOA and HOA, respectively (Fig. 3).While most features showed the consistent expression patterns across expression panels, there were 4 and 10 genes with inconsistent direction of effect (Figs. 1 and 2).Additionally, we conducted cross-tissue TWAS through the sCCA + ACAT approach to improve the power of singletissue TWAS and found another 9 and 19 significant features (P.Bonferroni < 0.05) for KOA and HOA, respectively (Table 1, Table S3-4).

Functional annotation of TWAS genes
KEGG and GO analyses were applied to explore the potential biological function of the TWAS genes, which were identified by either single-tissue or cross-tissue TWAS analysis.There were 4 and 12 KEGG categories with P-value < 0.05 for KOA and HOA, respectively (Fig. 4 A, B), including TGF-beta signaling pathway and MAPK signaling pathway, which played an important role in OA [26].The TWAS genes were subjected to GO analysis as well (Table S5-6), and Fig. 4 C, D shows the top five GO terms in the biological process, cellular component, and molecular function category.Particularly, we found some OA-related biological processes in the biological process category, such as hyaluronan metabolic process, chondrocyte differentiation, chondrocyte proliferation, and mucopolysaccharide metabolic process.

Shared genes in mRNA expression profiling
To validate the significant TWAS features, we utilized the cartilage mRNA expression profiling from previous studies [23,24].In the case of KOA and HOA, a total of 14 and 4 TWAS genes, respectively, were observed among the DEGs in the mRNA expression analysis (adjusted P-value < 0.05) (Fig. 5).Notably, among these shared genes, six genes for KOA exhibited a |log2FC|> 1, as shown in Table 2.

CGSEA of the TWAS genes
We conducted CGSEA analysis to identify the OA-associated environmental factors using the significant TWAS genes with the Z-scores in the single-tissue TWAS (Table 3).For KOA, we identified the 5 enriched chemicals.Meanwhile, there were 45 significantly enriched chemicals for HOA (Table S7), such as quercetin, bisphenol A, cadmium chloride, and mercuric chloride.

Discussion
The present study identified 49 and 93 significantly associated genes of KOA and HOA, respectively, through single-tissue and cross-tissue TWAS, and the biological  4 The significant KEGG categories and top five GO terms of TWAS genes.A and B showed the significant KEGG categories for knee OA and hip OA, respectively.C and D showed the top five GO terms for knee OA and hip OA, respectively function and pathways of those signals were further investigated using KEGG and GO databases.We used the gene expression profiling of cartilage to validate the TWAS results.
More than 100 independent risk variants of OA have been reported in previous GWAS studies.While several the significant TWAS genes overlapped with the effector genes or reasonable candidate genes in the Fig. 5 The shared significant genes identified in both TWAS analysis and cartilage mRNA expression profiles.A for knee OA and B for hip OA GWAS analysis, such as GDF5, USP8, TNC, FGFR3, LTBP1, and UQCC1, we unraveled several novel potential risk genes by TWAS analysis, which also showed significantly differential expression in the cartilage mRNA expression profiling, such as EVI2A, FMNL1, and AARS.Actually, studies demonstrated that some of the TWAS genes were involved in the OA-related biological function, such as GDF5 [27][28][29], WWP2 [30,31], and MALAT1 [32][33][34], while several TWAS genes have not been fully investigated in OA.For example, the MLXIP gene, also known as MAX-interacting protein 1 or MIP1, encodes a protein that interacts with the MAX transcription factor, which plays a critical role in the regulation of gene expression.Studies have shown that the MLXIP gene is involved in various biological processes, including glucose metabolism [35,36], lipid metabolism [37][38][39], and cellular senescence [40], which have also been implicated in the pathogenesis of osteoarthritis.
Additionally, by integrating the TWAS results with the chemical-related gene sets, we found a total of 45 unique OA-associated chemical substances.Among the identified chemicals, quercetin was extensively studied in the field of OA and alleviated OA through its multiple biological functions, including suppression of the inflammation and cartilage degeneration, pain relief, and attenuation of oxidative stress, ER stress, and associated apoptosis [41][42][43][44][45]. Clinical studies also support the protective effect of quercetin supplement, which could significantly improve the joint function and collagen II synthesis/degradation balance [46].There were some hazardous chemicals among the identified OA-associated ones.For instance, bisphenol A was detected not only in the serum of the OA patients, but also in the synovial fluid of knee replacement patients, and exhibited a concentration-dependent antagonistic effect on the protective actions of E2 on chondrocyte, which decreased the NF-kappaB activation and MMP1 expression [47].In addition, the exposure of cadmium chloride could reduce the chondrocyte cell viability, increase the expression of the catabolic markers (MMP13, MMP9, MMP3, MMP1) and inflammatory markers (IL-1β and IL-6), and activate the expression of the cartilage extracellular matrix genes (aggrecan and collagen II), and the cadmium contributed the cartilage loss by activating the interleukins through the reactive oxygen species [48].Clinical findings supported the harmful effect of cadmium, and cadmium exposure through smoking was positively correlated with the severity of OA [49].
There were several limitations of the present study.First, the lack of causally OA-related tissues could reduce the power of the TWAS analysis, while we utilized both single-tissue and cross-tissue TWAS approaches to address this issue.Second, we employed the bioinformatic analysis to explore the OA-related candidate genes and chemicals, and the biological function of TWAS genes and related chemicals needed to be further investigated through biological experiments and clinical observation.Third, the summary-level GWAS data were exclusively derived from the European population, and caution should be exercised when extrapolating the results to other populations.

Conclusions
In summary, we identified multiple OA-associated genes and chemicals by performing the TWAS and CGSEA analysis and yielded novel insights into the relationship between genes, chemicals, and their impact on OA.

Table 1
The additional significantly associated genes identified through the ACAT approach OA Osteoarthritis, ACAT Aggregate Cauchy association test, P.ACAT P-value from ACAT analysis, P.Bonferroni Bonferroni-corrected P-value from ACAT analysis

Table 2
The shared genes between significant TWAS genes and DEGs with |log2FC|> 1 TWAS Transcriptome-wide association study, DEGs Differentially expressed genes, OA Osteoarthritis, FC Fold change

Table 3
The top five significantly OA-associated chemicals identified through CGSEA analysis OA Osteoarthritis, CGSEA Chemical-related gene set enrichment analysis, NES Normalized enrichment score