A machine learning-assisted model for renal urate underexcretion with genetic and clinical variables among Chinese men with gout

Sun, Mingshu; Sun, Wenyan; Zhao, Xuetong; Li, Zhiqiang; Dalbeth, Nicola; Ji, Aichang; He, Yuwei; Qu, Hongzhu; Zheng, Guangmin; Ma, Lidan; Wang, Jiayi; Shi, Yongyong; Fang, Xiangdong; Chen, Haibing; Merriman, Tony R.; Li, Changgui

doi:10.1186/s13075-022-02755-4

Research article
Open access
Published: 09 March 2022

A machine learning-assisted model for renal urate underexcretion with genetic and clinical variables among Chinese men with gout

Mingshu Sun¹^na1,
Wenyan Sun¹^na1,
Xuetong Zhao²^na1,
Zhiqiang Li¹^na1,
Nicola Dalbeth³,
Aichang Ji¹,
Yuwei He¹,
Hongzhu Qu²,
Guangmin Zheng²,
Lidan Ma¹,
Jiayi Wang⁴,
Yongyong Shi¹,
Xiangdong Fang²,
Haibing Chen⁵,
Tony R. Merriman^6,7 &
…
Changgui Li ORCID: orcid.org/0000-0002-4622-3731¹

Arthritis Research & Therapy volume 24, Article number: 67 (2022) Cite this article

2682 Accesses
4 Citations
2 Altmetric
Metrics details

Abstract

Objectives

The objective of this study was to develop and validate a prediction model for renal urate underexcretion (RUE) in male gout patients.

Methods

Men with gout enrolled from multicenter cohorts in China were analyzed as the development and validation data sets. The RUE phenotype was defined as fractional excretion of uric acid (FE_UA) <5.5%. Candidate genetic and clinical features were screened by the least absolute shrinkage and selection operator (LASSO) with 10-fold cross-validation. Machine learning algorithms (stochastic gradient descent (SGD), logistic regression, support vector machine) were performed to construct a predictive classifier of RUE. Models were assessed by the area under the receiver operating characteristic curve (AUC) and the precision-recall curve (PRC).

Results

One thousand two hundred thirty-eight and two thousand twenty-three patients were enrolled as the development and validation cohorts, with 1220 and 754 randomly chosen patients genotyped, respectively. Rs3775948.GG of SLC2A9/GLUT9, rs504915.AA of NRXN2/URAT1, and 7 clinical features (age, hypertension, nephrolithiasis, blood glucose, serum urate, urea nitrogen, and creatinine) were generated by LASSO. Two additional SNP variants (rs2231142.GG of ABCG2 and rs11231463.GG of SLC22A9/OAT7) were selected based on their contributions to gout in the development cohort and their reported effects on renal urate handling. The optimized classifiers yielded AUCs of ~0.914 and PRCs of ~0.980 using these 11 variables. The SGD model was conducted in the validation cohort with an AUC of 0.899 and the PRC of 0.957.

Conclusions

A prediction model for RUE composed of four SNPs and readily accessible clinical features was established with acceptable accuracy for men with gout.

Introduction

Gout is the most common inflammatory arthritis with multi-organ involvement, affecting <1% to 6.8% of the general population around the world, and is becoming more prevalent with younger age-at-onset [1,2,3,4]. Hyperuricemia is the biochemical basis of gout, with long-term urate-lowering therapy (ULT) a key element of gout management. The pathogenic causes of primary hyperuricemia include urate overproduction in the liver and renal or extra-renal urate underexcretion, depending on the enzymes or urate transporters involved [5, 6].

Fractional excretion of uric acid (FE_UA) is currently acknowledged as a precise measurement of renal urate clearance, with the normal FE_UA range of 5.5–11.1% [7, 8]. Those with FE_UA less than 5.5% are classified as with RUE [9]. Accurate assessment of RUE requires assessment of FE_UA, with 24-h urine sample under a 2-week washout period and 5-day purine-restricted diet considered the standard measurement [10], although there are studies exploring spot or a few hour urine samples as the substitutions [7,8,9, 11]. Given the fact that many drugs may interfere with renal urate excretion, all patients were required to withdraw all drugs during the washout time. However, withdrawal of medications during the washout period can be problematic. In addition, the inconvenience of 24-h urine collection and strict life control limits its application in daily practice. A simple and reliable method to identify patients with RUE is needed, both for research purposes and also in clinical practice, particularly when assessing younger patients with gout and those with a strong family history of gout.

Genetic and clinical research based on big data has set the stage for the development of prediction models with genetic and/or clinical variables in recent years. Genome-wide association studies (GWAS) have revealed urate-associated genetic variants, some of which are within genes of urate transporters or their regulators [12, 13]. Other studies have also reported genotypes associated with urate export parameters or even renal urate handling profiles [14, 15]. Many of the candidate loci and variants are causally associated with serum urate concentration, for example, those in SLC2A9/GLUT9, ABCG2, and SLC22A12/URAT1, and some are rather marker SNPs which are in linkage disequilibrium with a candidate causal SNP, like rs1797052T of PDZK1 [12, 16]. In a large participating general population-based pedigree study, 183 index SNPs identified in a trans-ancestry GWAS for serum urate levels explained 17% of heritability [5]. Additionally, a number of clinical variables including body mass index (BMI), age, and renal function are associated with renal urate handling [15]. These data provide the possibility to investigate prediction models for gout pathogenic phenotypes using genetic or readily accessible clinical data.

So far, no such models are available. This study was designed to investigate a prediction model for RUE in men with gout. First, RUE, defined as FE_UA <5.5%, was measured in a gout cohort from a single center in China, and gout and/or hyperuricemia-associated SNPs identified to be East Asian-specific as previously reported were genotyped [12, 13, 16]. Then we developed machine learning (ML) prediction models for clustering the RUE phenotype using genetic and clinical variables. The models were validated in a multicenter cohort from three Chinese hospitals.

Methods

Study population

Male patients with gout that met the 2015 American College of Rheumatology/European League against Rheumatism classification criteria were enrolled [17]. Exclusion criteria included blood pressure ≥180/110 mmHg, blood glucose ≥11.1 mmol/L, eGFR <45 ml/min/1.73m², taking regular anticoagulant and with severe heart, kidney, or brain disease; cancer; or mental or metabolism disorders. For each subject enrolled, 24-h urine sample was collected after a 14-day washout period of any drug and low purine diet (purine intake <200 mg/day) for 5 days [10].

Men who attended the Shandong Provincial Gout Clinical Medical Center, the Affiliated Hospital of Qingdao University (Qingdao, China) between July 2016 and March 2019 comprised the development data set. Men from gout clinics of Shanghai Jiaotong University Affiliated 6th People’s Hospital (Shanghai, China), Tongji University Affiliated 10th People’s Hospital (Shanghai, China), and the Affiliated Hospital of Qingdao University (Qingdao, China) between May 2015 and June 2020 served as the validation data set. The overall study design is shown in Fig. 1.

This study was approved by the Ethics Committee of the Affiliated Hospital of Qingdao University (Qingdao, China). All participants gave their written informed consents.

Clinical variables, detection of RUE subtype, and statistical analysis

Clinical data were obtained from each hospital’s electronic health record system, including demographic and medical information, height, weight, waistline, systolic blood pressure (SBP), diastolic blood pressure (DBP), tophi, and biochemical parameters. Serum urate (SU), blood glucose (Glu), triglyceride (TG), total cholesterol (TC), low-density lipoprotein (LDL), high-density lipoprotein (HDL), blood urea nitrogen (BUN), serum creatinine (sCr), urinary uric acid (uUA), and urinary creatinine (uCr) were detected using an automatic biochemical analyzer (TBA-40FR; TOSHIBA, Japan).

Parameters for renal urate handling were measured by FE_UA, which was the percentage of renal urate clearance over creatinine clearance (FE_UA = uUA/uCr × sCr/SU). Participants with FE_UA<5.5% were defined as with RUE [9].

The characteristics of the overall study patients are described in Table 1. For continuous covariates, summary statistics are reported as mean (standard deviation) or median (interquartile range), where appropriate. Proportions were compared using the Chi-square test and continuous variables were compared using ANOVA or Kruskal-Wallis tests, as appropriate. Univariate and multiple linear regression analyses were performed to investigate the effect of clinical features on FE_UA in the pooled gout patients [18, 19]. SPSS 25.0 software was used for all analyses. A two-sided p < 0.05 was designated as statistically significant for all analyses.

Table 1 Comparison of clinical features among development and validation data sets

Full size table

Genotyping and statistical analysis

The target genetic variations were 20 single nucleotide polymorphisms (SNPs) identified as gout-risk loci or associated with SU concentrations and FE_UA in the East Asian population as previously reported [12, 13]. Genomic DNA was extracted from peripheral blood mononuclear cells. Genotyping of the selected SNPs were performed with a SpectroCHIP®II-G384 array (Agena Bioscience, San Diego, USA). In the development data set, all 20 target SNPs were tested. 2638 controls were from the Chinese healthy male sample set of our previous gout GWAS to identify candidate SNPs associated with gout for the purposes of modeling [20]. Only SNPs included in the prediction model were tested in the validation data set. Association analyses of SNPs with the FE_UA were done using PLINK (http://www.bwh.harvard.edu/plink/) and association analyses of these loci with clinical variables were done using an additive genetic model implemented in SNPTEST (http://mathgen.stats.ox.ac.uk/genetics_software/snptest/snptest.html) (A two-sided p<0.0025 was assumed to be significant for the SNP association analyses). Association of the z-score of the residuals with SNP allele dose was tested by linear regression.

Prediction model analysis

Men with gout in the development data set with complete clinical and genetic data of interest were included for variable selection and classifier construction. Patients were classified with or without the RUE phenotype according to FE_UA <5.5% versus FE_UA ≥5.5%, respectively. Samples of the development data set were randomly divided into training and test sets (5:1). As described herein, 31 clinical and biochemical variables, as well as candidate SNP information were screened by Least Absolute Shrinkage and Selection Operator (LASSO) regression [21], augmented with 10-fold cross-validation in the training set for internal validation. Imputation for missing variables was performed if missing values were no more than 20%. The most predictive covariates to RUE phenotype were selected by the minimum criteria (lambda.min). The R package “glmnet” statistical software (R Foundation) was used to perform the LASSO regression. Subsequently, variables identified by LASSO regression analysis were entered into ML models to construct a classifier to identify RUE phenotype. We used three ML algorithms to perform modeling by a Python script, which were stochastic gradient descent (SGD) [22], logistic regression (LG), and linear support vector classifier (SVC). An external multicenter validation set of gout cases with complete data was employed to validate the classifier performance. The area under the receiver operating curve (AUC) and the precision-recall curve (PRC) were used to evaluate the prediction efficacy of the models. R software (http://www.R-project.org) and Python were used for all modeling analyses.

Another filter, the classical extreme gradient boosting (XGBoost) method, and other classifiers, random forests and neural networks, were also performed for comparisons to LASSO and the modeling algorisms described above.

Results

Clinical characteristics of the two data sets

A total of 1238 and 2023 male patients with gout from three hospitals were included and analyzed as the development and validation cohorts, respectively (Table 1). We also explored the effect of clinical variables on FE_UA in the pooled group of the two cohorts by linear regression analyses (Table 2). Overall, FE_UA was comparable between the two sets (4.21% vs 4.26%, p>0.05), after adjusting for age, BMI, other biochemical parameters, and the presence of nephrolithiasis, cardiovascular disease, hypertension, and tophi that were associated with FE_UA in the univariate regression analysis. The proportion of patients with RUE was also comparable between the two data sets (83.4% vs 83.0%, p>0.05). Multiple linear regression models for FE_UA showed that age, Glu, BUN, sCr, and the presence of hypertension or nephrolithiasis were independent positive predictors (p<0.05), while SU was a negative predictor for FE_UA (p<0.05).

Table 2 Linear regression analyses of clinical variables with FE_UA (%) in the pooled group of gout patients

Full size table

Association of SNPs with gout, SU levels, and FE_UA in the development data set

1220 patients in the development set were genotyped for the 20 SNPs. Compared to 2638 urate-normal non-gout controls based on association analysis, 42 variants of 14 SNPs were identified as gout-associated and served as candidate genetic variables for modeling, which were ABCG2, MUC1, PDZK1, GCKR, SLC2A9/GLUT9 (2 loci), SLC22A9/OAT7, PLA2G16, FLRT1, NRXN2/URAT1 (2 loci), AIP, ALDH2, and COMMD4. The other 6 SNPs were rs2762353 of SLC17A1, rs17145750 of MLXIPL, rs79105258 of CUX2, rs4966024 of IGF1R, rs73575095 of MAF, and rs9895661 of BCAS3, which are mostly associated with metabolic pathways or inflammation despite SLC17A1 and are not biologically immediately linked to renal urate handling. Odds ratio and 95% confidence intervals of each SNP allele are shown in Supplementary Table S1.

We also evaluated the association of the 14 candidate SNPs with SU levels and FE_UA (Table 3) in the development data set. Only rs2231142 of ABCG2 showed nominal association (β=0.155, p<2.5×10⁻³) with SU level. SNPs at three loci showed nominal association with FE_UA, which were one at ABCG2 and two at SLC2A9.

Table 3 Association analyses between 14 candidate SNPs and serum urate and FE_UA in the development cohort

Full size table

Prediction model

By applying the LASSO algorithm to the 42 genetic variants and 31 clinical variables in the training sample, the important variables for identifying RUE were determined, with the log (λ) values being summarized in Fig. 2A and B. Four SNP variations (rs7679724.TT and rs3775948.GG of SLC2A9/GLUT9, rs504915.AA of NRXN2/URAT1, and rs11227805.TT of AIP) and 7 clinical features (age, hypertension, nephrolithiasis, Glu, SU, BUN, and sCr) were selected by LASSO as the most important for predicting RUE phenotype. The ML models (Linear SVC, SGD, and LG) predicted the RUE with AUCs of ~0.622 using the 4 SNP variables, and ~0.899 using the combo of the 11 genetic and clinical variables in the internal test set (Supplementary Table S2).

To enhance the prediction efficacy of genetic predictors, we selected two additional variations, rs2231142.GG of ABCG2 and rs11231463.GG of SLC22A9/OAT7, based on their contributions to gout in the development cohort (Supplementary Table S1) and their effects on renal urate handling as reported [7,8,9, 11]. We optimized the models by performing with different combinations of these 13 variables following the principle of prediction efficacy and economy. Combining the two additional variations with rs3775948.GG and rs504915.AA, the models yielded higher AUCs of ~0.667, and AUCs of ~0.914 using the combination of the 4 artificially selected genetic variations and the 7 clinical variables (Supplementary Table S2). The SGD model for classifying RUE showed an AUC of 0.912 (95% CI 0.894 to 0.920) and a PRC of 0.956 in the internal test sets (Fig. 2C, D). The prediction performance of the SGD model in the external validation cohort (n=754) yielded an AUC of 0.899 (95% CI 0.887 to 0.904) and a PRC of 0.956 (Fig. 2E, F).

The RUE phenotype risk score was constructed based on the coefficients from the SGD model. The probability was calculated as following: f(x) = 1/[1 + e ( − x)], which was the mean after tenfold cross-validations. A calculator of the ML SGD model was developed to allow local clinicians to enter the values of the 4 SNP variations and 7 clinical variables required for the risk score with automatic calculation of the likelihood that a gout patient is a renal underexcretor.

The XGBoost obtained 18 features, which were SU, GLU, eGFR, Ccr, BUN, BMI, rs3775948.GG, LDL, DBP, sCr, age, nephrolithiasis, rs7679724.TT, history of smoking, SBP, hypertension, rs57633992.AC, and rs2762353.GG. The models developed using XGBoost selected variables, random forests or neural networks achieved the AUCs of 0.864~0.904. The results were presented in Supplementary Tables S2 and S3.

Discussion

Prediction models for individual disease diagnosis, incidence, or outcome are growing rapidly in recent years, with the development of new learning algorithms and the ongoing explosion of data. Good prediction models provide useful tools in disease management and greatly ease the clinical practice [23]. Here we developed for the first time a ML prediction model for the risk of RUE phenotype in gout patients. Eleven variables were selected by the LASSO algorithm or artificial selection based on their importance in modeling or impacts on SU and FE_UA, respectively. We established 3 ML prediction models for RUE using reliable genetic variants and easily accessible clinical features with stable and acceptable efficacy (AUC=0.91) and validated the model in a multicenter gout cohort. By comparing with XGBoost filter and other classifiers, the neural network Multi-layer Perceptron Classifier, and Random Forest, we confirmed that the models displayed here are the optimal ones with the highest predicting accuracy. A calculator based on the ML SGD model using these predictors was generated and readily available in the clinic, enabling clinicians to estimate the probability of a patient with RUE.

Four SNPs were selected by LASSO as the most important contributors for grouping RUE. However, the prediction accuracy was only about 0.62 in the ML models using these 4 genetic variables. We tried an artificial selection of the SNP variants to improve the prediction efficacy of the model. The rs2231142.T allele causes dysfunction of ABCG2, a urate transporter mainly located in the intestinal tract, and was previously demonstrated to be associated with extra-renal underexcretion [9]. It was the only locus with nominal association with SU level in our development data set and was significantly associated with FE_UA. The rs11231463.G allele of SLC22A9 increased the risk of gout by 2.2 times compared with urate-normal controls in this study. SLC22A9 encodes OAT7, which is expressed in the liver and exhibits modest uricosuric-sensitive urate uptake activity [12, 13]. These two variants were also selected according to their effectiveness on SU and FE_UA. Two SNPs selected by LASSO, rs3775838.GG of SLC2A9 and rs504915.AA of NRXN2/SLC22A12, were adopted in the model. SLC2A9 encodes GLUT9, and NRXN2/SLC22A12 encodes URAT1, both of which were the major urate exporters located in renal tubules. These two variants were both associated with gout in the development cohort. By combining these two LASSO selected genetic variants with the two additionally selected variants, and also with the clinical variables, the prediction capacity of the ML models was optimized (Supplementary Table S2).

Additional clinical variables were ranked and the top 7 were selected by LASSO, including 3 medical history features (age, presence of hypertension, and nephrolithiasis) which are easy to obtain, and 4 biochemical parameters (Glu, SU, BUN, and sCr). These predictors included by ML algorithm are statistically significant contributors to FE_UA in the pooled gout group and are also biologically meaningful. Aging and elevated sCr indicate impaired kidney function. In the early stage of kidney dysfunction (in this study, eGFR ≥45 ml/min/1.73m²), the tubular reabsorption function is the dominant problem contributing to renal urate handling, which may manifest as increased FE_UA. It is consistent with earlier research, in which urate reabsorption in the tubular was reduced, accompanied by increased urate excretion during the early stage of gouty nephropathy [24]. BUN shares a similar renal tubular excretion pattern with urate, but may compete with urinary urate during the process of tubule reabsorption and secretion. Since there is an osmotic diuretic effect of urinary glucose, it is reasonable to predict that FE_UA would increase with blood glucose. Hypertension may increase the glomerular filtration and induce a hyperfiltration-associated urinary urate excretion [25]. A high level of SU is a burden to renal tubule secretion rather than glomerular filtration [26]. Healthy subjects have increased FE_UA corresponding to SU elevation while most gout patients do not, implying that intrinsic defects may exist in renal tubule urate handling of these patients [7, 10]. At last, the presence of nephrolithiasis may be an indicator for renal urate excretion. Combining these 7 clinical features with the SNP variable combination, three ML models were performed and produced similar AUCs of approximately 0.9 in the development and multicenter validation cohorts.

To ensure the reliability and effectiveness of the model, multicenter cohorts of Chinese male adults with gout were adopted to serve as the development and the external validation data set in this study. The distributions of patients with RUE (83.4% versus 83%) are strictly consistent between analyses conducted in the two data sets, which also fits the same profile as existing reports [27]. Besides, vital clinical characteristics, the SU and FE_UA levels, are also identical between them. These features of the two data sets promised an appropriate data environment for the prediction model development and the reliability of any model generated.

Furthermore, the RUE phenotype was profiled using a standard method. RUE used to be defined solely based on the absolute 24-h renal urate excretion with the premise that a fixed fraction of daily urate production was excreted by the kidneys [28]. However, the urinary urate amount is a synergistic result of the renal urate load, the glomerular filtration function and the excretion capacity of renal tubules [7, 29,30,31]. FE_UA is a more precise measurement for the identification of low renal uric acid clearance phenotype, which has been adjusted for the SU level and normalized to the individual’s glomerular filtration rate [8].

This study has certain limitations. First is that the efficacy of the SNP predictors in this model is not powerful enough, mostly because that all index SNPs at 183 loci identified by trans-ancestry GWAS can explain only 7.7% of the SU genetic heritability in the East Asian population [5]. The prediction model will be improved by the inclusion of more genetic variants from future genetic research. The second limitation is that this model was established among Chinese gout patients. It should be tested carefully before application in diverse ethnics. The third one is that this model is tested in men alone, and should be validated in female gout cohorts. Finally, despite our efforts to make the prediction model as stable and robust as possible, some variable biochemical parameters like glucose, SU, and BUN were still included, which may have increased the instability of the model. However, these parameters are generally stable when obtained under controlled conditions as described in this study.

Conclusion

In conclusion, this research developed and validated a reliable and practical model to predict the RUE phenotype in Chinese men with gout, which were helpful for individualized therapy. Additional testing and independent validation would be of benefit, and we provide a calculator to assist with determination in other cohorts.

Availability of data and materials

The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.

Abbreviations

RUE:: Renal urate underexcretion
FE_UA :: Fractional excretion of uric acid
LASSO:: Least absolute shrinkage and selection operator
SGD:: Stochastic gradient descent
AUC:: The area under the receiver operating characteristic curve
PRC:: Precision-recall curve
ULT:: Urate-lowering therapy
GWAS:: Genome-wide association studies
ML:: Machine learning
SBP:: Systolic blood pressure
DBP:: Diastolic blood pressure
SU:: Serum urate
Glu:: Blood glucose
TG:: Triglyceride
TC:: Total cholesterol
LDL:: Low-density lipoprotein
HDL:: High-density lipoprotein
BUN:: Blood urea nitrogen
sCr:: Serum creatinine
uUA:: Urinary uric acid
uCr:: Urinary creatinine
SNPs:: Single nucleotide polymorphisms
LG:: Logistic regression
SVC:: Linear support vector classifier

References

Dehlin M, Jacobsson L, Roddy E. Global epidemiology of gout: prevalence, incidence, treatment patterns and risk factors. Nat Rev Rheumatol. 2020;16(7):380–90.
Article Google Scholar
Liu R, Han C, Wu D, Xia X, Gu J, Guan H, et al. Prevalence of Hyperuricemia and Gout in Mainland China from 2000 to 2014: a systematic review and meta-analysis. Biomed Res Int. 2015;2015:762820.
PubMed PubMed Central Google Scholar
Kim JW, Kwak SG, Lee H, Kim SK, Choe JY, Park SH. Prevalence and incidence of gout in Korea: data from the national health claims database 2007-2015. Rheumatol Int. 2017;37(9):1499–506.
Article Google Scholar
Gao Q, Cheng X, Merriman TR, Wang C, Cui L, Zhang H, et al. Trends in the manifestations of 9754 gout patients in a Chinese clinical center: a 10-year observational study. Joint Bone Spine. 2021;88(6):105078. https://doi.org/10.1016/j.jbspin.2020.09.010.
Tin A, Marten J, Halperin Kuhns VL, Li Y, Wuttke M, Kirsten H, et al. Target genes, variants, tissues and transcriptional pathways influencing human serum urate levels. Nat Genet. 2019;51(10):1459–74.
Article CAS Google Scholar
Nakayama A, Nakaoka H, Yamamoto K, Sakiyama M, Shaukat A, Toyoda Y, et al. GWAS of clinically defined gout and subtypes identifies multiple susceptibility loci that include urate transporter genes. Ann Rheum Dis. 2017;76(5):869–77.
Article CAS Google Scholar
Perez-Ruiz F, Calabozo M, Erauskin GG, Ruibal A, Herrero-Beites AM. Renal underexcretion of uric acid is present in patients with apparent high urinary uric acid output. Arthritis Rheum. 2002;47(6):610–3.
Article CAS Google Scholar
Indraratna PL, Stocker SL, Williams KM, Graham GG, Jones G, Day RO. A proposal for identifying the low renal uric acid clearance phenotype. Arthritis Res Ther. 2010;12(6):149.
Article Google Scholar
Ichida K, Matsuo H, Takada T, Nakayama A, Murakami K, Shimizu T, et al. Decreased extra-renal urate excretion is a common cause of hyperuricemia. Nat Commun. 2012;3:764.
Article Google Scholar
Puig JG, Torres RJ, de Miguel E, Sanchez A, Bailen R, Banegas JR. Uric acid excretion in healthy subjects: a nomogram to assess the mechanisms underlying purine metabolic disorders. Metabolism. 2012;61(4):512–8.
Article CAS Google Scholar
Simkin PA, Hoover PL, Paxson CS, Wilson WF. Uric acid excretion: quantitative assessment from spot, midmorning serum and urine samples. Ann Intern Med. 1979;91(1):44–7.
Article CAS Google Scholar
Nakatochi M, Kanai M, Nakayama A, Hishida A, Kawamura Y, Ichihara S, et al. Genome-wide meta-analysis identifies multiple novel loci associated with serum uric acid levels in Japanese individuals. Commun Biol. 2019;2:115.
Article Google Scholar
Boocock J, Leask M, Okada Y, Asian Genetic Epidemiology Network C, Matsuo H, Kawamura Y, et al. Genomic dissection of 43 serum urate-associated loci provides multiple insights into molecular mechanisms of urate control. Hum Mol Genet. 2020;29(6):923–43.
Article CAS Google Scholar
Major TJ, Dalbeth N, Stahl EA, Merriman TR. An update on the genetics of hyperuricaemia and gout. Nat Rev Rheumatol. 2018;14(6):341–53.
Article CAS Google Scholar
Narang RK, Vincent Z, Phipps-Green A, Stamp LK, Merriman TR, Dalbeth N. Population-specific factors associated with fractional excretion of uric acid. Arthritis Res Ther. 2019;21(1):234.
Article Google Scholar
Leask MP, Merriman TR. The genetic basis of urate control and gout: Insights into molecular pathogenesis from follow-up study of genome-wide association study loci. Best Pract Res Clin Rheumatol. 2021;35(4):101721.
Article Google Scholar
Neogi T, Jansen TL, Dalbeth N, Fransen J, Schumacher HR, Berendsen D, et al. 2015 Gout Classification Criteria: an American College of Rheumatology/European League Against Rheumatism collaborative initiative. Arthritis Rheumatol. 2015;67(10):2557–68.
Article Google Scholar
Jezequel P, Loussouarn D, Guerin-Charbonnel C, Campion L, Vanier A, Gouraud W, et al. Gene-expression molecular subtyping of triple-negative breast cancer tumours: importance of immune response. Breast Cancer Res. 2015;17:43.
Article Google Scholar
Akhtar M, Elliott PM. Risk stratification for sudden cardiac death in non-ischaemic dilated cardiomyopathy. Curr Cardiol Rep. 2019;21(12):155.
Article CAS Google Scholar
Li C, Li Z, Liu S, Wang C, Han L, Cui L, et al. Genome-wide association analysis identifies three new risk loci for gout arthritis in Han Chinese. Nat Commun. 2015;6:7041.
Article CAS Google Scholar
Steyerberg EW, Eijkemans MJ, Harrell FE Jr, Habbema JD. Prognostic modelling with logistic regression analysis: a comparison of selection and estimation methods in small data sets. Stat Med. 2000;19(8):1059–79.
Article CAS Google Scholar
Robbins H, Monro S. A stochastic approximation method. Ann Math Stat. 1951;22:400–7.
Article Google Scholar
Jordan MI, Mitchell TM. Machine learning: trends, perspectives, and prospects. Science. 2015;349(6245):255–60.
Article CAS Google Scholar
Li F, Guo H, Zou J, Chen W, Lu Y, Zhang X, et al. Urinary excretion of uric acid is negatively associated with albuminuria in patients with chronic kidney disease: a cross-sectional study. BMC Nephrol. 2018;19(1):95.
Article Google Scholar
Scholz GH, Hanefeld M. Metabolic vascular syndrome: new insights into a multidimensional network of risk factors and diseases. Visc Med. 2016;32(5):319–26.
Article Google Scholar
Levinson DJ, Sorensen LB. Renal handling of uric acid in normal and gouty subject: evidence for a 4-component system. Ann Rheum Dis. 1980;39(2):173–9.
Article CAS Google Scholar
Choi HK, Mount DB, Reginato AM, American College of P, American Physiological S. Pathogenesis of gout. Ann Intern Med. 2005;143(7):499–516.
Article CAS Google Scholar
Boss GR, Seegmiller JE. Hyperuricemia and gout. Classification, complications and management. N Engl J Med. 1979;300(26):1459–68.
Article CAS Google Scholar
Emmerson BT. Identification of the causes of persistent hyperuricaemia. Lancet. 1991;337(8755):1461–3.
Article CAS Google Scholar
Matsuo H, Takada T, Nakayama A, Shimizu T, Sakiyama M, Shimizu S, et al. ABCG2 dysfunction increases the risk of renal overload hyperuricemia. Nucleosides Nucleotides Nucleic Acids. 2014;33(4-6):266–74.
Article CAS Google Scholar
Dalbeth N, Merriman TR, Stamp LK. Gout. Lancet. 2016;388(10055):2039–52.
Article CAS Google Scholar

Download references

Acknowledgements

Not applicable.

Funding

This study was funded by the National Key Research and Development Program of China (2016YFC0903400), the National Natural Science Foundation of China (81520108007, 81871288, and 81770869), and the Shandong Province Key Research and Development Program (2018CXGC1207).

Author information

Mingshu Sun, Wenyan Sun, Xuetong Zhao and Zhiqiang Li contributed equally to this work.

Authors and Affiliations

Shandong Provincial Clinical Research Center for Immune Diseases and Gout & Shandong Provincial Key Laboratory of Metabolic Diseases, The Affiliated Hospital of Qingdao University, Qingdao, China
Mingshu Sun, Wenyan Sun, Zhiqiang Li, Aichang Ji, Yuwei He, Lidan Ma, Yongyong Shi & Changgui Li
CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
Xuetong Zhao, Hongzhu Qu, Guangmin Zheng & Xiangdong Fang
Department of Medicine, University of Auckland, Auckland, New Zealand
Nicola Dalbeth
Department of Pharmacy, Peking University First Hospital, Beijing, China
Jiayi Wang
Department of Endocrinology and Metabolism, Shanghai 10th People’s Hospital, Tongji University, Shanghai, China
Haibing Chen
Department of Biochemistry, University of Otago, Dunedin, New Zealand
Tony R. Merriman
Department of Medicine, University of Alabama Birmingham, Birmingham, AL, USA
Tony R. Merriman

Authors

Mingshu Sun
View author publications
You can also search for this author in PubMed Google Scholar
Wenyan Sun
View author publications
You can also search for this author in PubMed Google Scholar
Xuetong Zhao
View author publications
You can also search for this author in PubMed Google Scholar
Zhiqiang Li
View author publications
You can also search for this author in PubMed Google Scholar
Nicola Dalbeth
View author publications
You can also search for this author in PubMed Google Scholar
Aichang Ji
View author publications
You can also search for this author in PubMed Google Scholar
Yuwei He
View author publications
You can also search for this author in PubMed Google Scholar
Hongzhu Qu
View author publications
You can also search for this author in PubMed Google Scholar
Guangmin Zheng
View author publications
You can also search for this author in PubMed Google Scholar
Lidan Ma
View author publications
You can also search for this author in PubMed Google Scholar
Jiayi Wang
View author publications
You can also search for this author in PubMed Google Scholar
Yongyong Shi
View author publications
You can also search for this author in PubMed Google Scholar
Xiangdong Fang
View author publications
You can also search for this author in PubMed Google Scholar
Haibing Chen
View author publications
You can also search for this author in PubMed Google Scholar
Tony R. Merriman
View author publications
You can also search for this author in PubMed Google Scholar
Changgui Li
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

C.L. takes full responsibility for the work, had access to the data, and controlled the decision to publish. C.L., T.R.M., H.C., and N.D. conceived of the study, developed the protocol, and completed data interpretation and the manuscript revision. M.S., W.S., X.Z., and Z.Li did the database management and statistical analyses. Y.H., H.Q., L.M., A.J., and J.W. obtained the clinical and genetic data. Y.S. and X.F. offered genetic consult. M.S., Z.Li., and X.Z. drafted and N.D., T.R.M., and C.L. revised the manuscript. The authors read and approved the final manuscript.

Corresponding authors

Correspondence to Haibing Chen, Tony R. Merriman or Changgui Li.

Ethics declarations

Ethics approval and consent to participate

This study was approved by the Ethics Committee of the Affiliated Hospital of Qingdao University (Qingdao, China). All participants gave their written informed consents.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Supplementary Table S1.

Association analyses between 20 SNPs and gout in the development cohort. Supplementary Table S2. Performances of different models for RUE in the internal test sets and validation cohort. Supplementary Table S3. Performances of models under XGBoost for RUE in the internal test sets.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Cite this article

Sun, M., Sun, W., Zhao, X. et al. A machine learning-assisted model for renal urate underexcretion with genetic and clinical variables among Chinese men with gout. Arthritis Res Ther 24, 67 (2022). https://doi.org/10.1186/s13075-022-02755-4

Download citation

Received: 29 July 2021
Accepted: 28 February 2022
Published: 09 March 2022
DOI: https://doi.org/10.1186/s13075-022-02755-4

A machine learning-assisted model for renal urate underexcretion with genetic and clinical variables among Chinese men with gout

Abstract

Objectives

Methods

Results

Conclusions

Introduction

Methods

Study population

Clinical variables, detection of RUE subtype, and statistical analysis

Genotyping and statistical analysis

Prediction model analysis

Results

Clinical characteristics of the two data sets

Association of SNPs with gout, SU levels, and FEUA in the development data set

Prediction model

Discussion

Conclusion

Availability of data and materials

Abbreviations

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding authors

Ethics declarations

Ethics approval and consent to participate

Consent for publication

Competing interests

Additional information

Publisher’s Note

Supplementary Information

Additional file 1: Supplementary Table S1.

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Arthritis Research & Therapy

Contact us

Association of SNPs with gout, SU levels, and FE_UA in the development data set