Data quality predicts care quality: findings from a national clinical audit

Background Missing clinical outcome data are a common occurrence in longitudinal studies. Data quality in clinical audit is a particular cause for concern. The relationship between departmental levels of missing clinical outcome data and care quality is not known. We hypothesise that completeness of key outcome data in a national audit predicts departmental performance. Methods The National Clinical Audit for Rheumatoid and Early Inflammatory Arthritis (NCAREIA) collected data on care of patients with suspected rheumatoid arthritis (RA) from early 2014 to late 2015. This observational cohort study collected data on patient demographics, departmental variables, service quality measures including time to treatment, and the key RA clinical outcome measure, disease activity at baseline, and 3 months follow-up. A mixed effects model was conducted to identify departments with high/low proportions of missing baseline disease activity data with the results plotted on a caterpillar graph. A mixed effects model was conducted to assess if missing baseline disease activity predicted prompt treatment. Results Six thousand two hundred five patients with complete treatment time data and a diagnosis of RA were recruited from 136 departments. 34.3% had missing disease activity at baseline. Mixed effects modelling identified 13 departments with high levels of missing disease activity, with a cluster observed in the Northwest of England. Missing baseline disease activity was associated with not commencing treatment promptly in an adjusted mix effects model, odds ratio 0.50 (95% CI 0.41 to 0.61, p < 0.0001). Conclusions We have shown that poor engagement in a national audit program correlates with the quality of care provided. Our findings support the use of data completeness as an additional service quality indicator.


Background
National clinical audits (NCAs) are a key lever to improve quality of care and limit unwarranted variation. The National Lung Cancer Audit and the Sentinel Stroke National Audit Programme (SSNAP) are examples of how care can be improved through national audit [1,2]. In order to assess care quality, NCAs rely on metrics that can be split into two distinct groups: 1. Process measures. Typically a guideline recommendation that applies a threshold of quality to a particular aspect of care delivery [3], for example, time to surgery in the hip fracture database [4] or time to thrombolysis in SSNAP [1]; 2. Clinical outcome measures. Quantifiable changes in health status as a result of an intervention [5], for example, blood glucose levels in the National Diabetes Audit [6] or disease activity scores in the National Early inflammatory arthritis audit [7].
Missing or incomplete data are a common finding in any longitudinal dataset, particularly in NCAs that, by and large, rely on clinician time and goodwill for data collection. Process measures tend to have higher data completeness as they are simple to collect from electronic health records. Clinical outcomes measures are more challenging to collect, often requiring blood tests, specific examination findings, or calculation of scores derived from multiple metrics, and so are more likely to have lower data completeness.
Rheumatoid arthritis (RA) is primarily managed in the outpatient setting with multiple follow-up appointments. This increases the likelihood of missing outcome data. Despite this, patterns of missing data are frequently not reported in RA observational studies [8].
It is not known if departments with higher levels of incomplete clinical outcome data are delivering lower quality care. We hypothesised that the completeness of key outcome data in a national clinical audit could predict departmental performance.
The objectives of this study are to utilise an NCA dataset to:

Sample
In 2014, the Healthcare Quality Improvement Partnership (HQIP)-commissioned National Clinical Audit for Rheumatoid and Early Inflammatory Arthritis (NCAR-EIA) was launched. This NCA sought to assess the quality of early inflammatory arthritis (EIA) care delivered and clinical outcomes. All rheumatology departments across England and Wales were mandated to participate. Data were collected at departmental level from February 2014 to October 2015 via an online portal. Data were only reported from departments that contributed > 5 participants to each analysis, in line with the audit's low number reporting policy to maintain anonymity. Patients aged > 16 years referred to a rheumatology department with symptoms consistent with EIA were eligible for inclusion, but the analyses in this study only required data for patients with symptoms consistent with RA. This was defined as any patient with a new onset of inflammatory arthritis, longer than 6 weeks duration, and either a positive rheumatoid factor (RF), anticyclic citrullinated protein (anti-CCP) antibody, or a swollen joint count of five or more. Data were collected at baseline appointment and at a 3-month follow-up.

Data
Patient demographic variables collected at baseline included age, gender, smoking status, ethnicity, work status, and the first three characters of postcode.
A proxy rank of the index of multiple deprivation (IMD) was calculated from partial patient postcodes by identifying all lower super output areas (LSOAs) covered by each postcode. The corresponding IMD scores were then averaged for each patient, allowing calculation of an estimation of deprivation across the cohort.
Departmental variables collected included staffing levels, catchment population, and dedicated EIA clinic presence. The process measures collected included (1) number of days from primary care review to referral to rheumatology care (referral time), (2) number of days from referral to rheumatology review (review time), and (3) number of days from referral to treatment commencement (treatment time).
Referral and review times were calculated at baseline appointment for all patients. Data from the 3 months follow-up was used to calculate treatment time for those who were not commenced on treatment at their baseline appointment. The process measures were collected in accordance with national guidelines for suspected inflammatory arthritis care [9].
Clinical outcome measures collected at baseline and 3 months follow-up included tender and swollen joint counts, patient-reported visual analogue scale of symptom severity, blood erythrocyte sedimentation rate (ESR), and C-reactive protein (CRP) levels.
The clinical outcome measures were utilised to calculate disease activity scores (DAS28), a multi-component score of disease activity in RA [10]. DAS28 is a key clinical outcome measure, utilised widely across rheumatology services to assess response to therapy. The score is calculated from tender and swollen joint counts, a patient assessment of overall symptom severity on 100-mm visual analogue scales, and blood inflammatory marker levels (ESR or CRP), with higher scores indicating greater disease severity. If patients had both an ESR and CRP level, the ESR level was preferentially used to calculate their DAS28.

Statistical analysis
Characterising departmental variation in data completeness A mixed effects model was conducted to identify departments with high or low proportions of missing baseline DAS28 data, adjusting for case mix. Departments who had recruited more than five patients were included in the model.
Missing baseline DAS28 data was the outcome variable with patient-level variables included as fixed effects (age, gender, work status, area level social deprivation, smoking status; ethnicity, symptom duration, antibody status) and department included as a random effect. The random effect captures the difference between the departmental proportion of incomplete data and the overall proportion of incomplete data in the sample. The estimated proportions from the model are empirical Bayes estimates, which, following the logic of Bayesian inference, is a 'shrinkage' estimator exploiting additional information that typically moves the estimate towards the overall proportion for the sample. Shrinkage estimates have several desirable properties, including greater precision and accounting for larger random variation observed at smaller centres [11].
The observed and model-estimated departmental proportions of missing baseline DAS28 data were plotted on a caterpillar graph, with 95% confidence intervals (CI) for the predicted values, to allow a ranking of departments by proportions of missing data.
A department was considered to have a high proportion of missing data if their predicted 95% CI lower bound was greater than the overall mean. Similarly, a department was considered to have a low proportion of missing data if their predicted 95% CI upper bound was less than the sample mean. The locations of the departments with high and low levels of missing data were mapped using the grmap software package [12] in Stata 15 to give a national representation.

Missing clinical outcome data as a predictor of service quality
To test the hypothesis that completeness of key outcome data is a usable surrogate for departmental performance, mixed effects modelling was performed with the department providing care included as a random effect. Incomplete baseline DAS28 was the independent variable, with treatment time as the dependent variable.
Treatment time was used in preference to referral time, as the latter is an assessment of primary care rather than rheumatology department performance. Shorter treatment times are considered a key marker of service quality in early RA care [13]. Analyses from the latest EIA NCA annual report show that shorter treatment times correspond to better clinical outcomes [14].
Treatment time was converted to a binary variable indicating whether treatment commenced within 90 days of referral to rheumatology specialist care. The model was first performed unadjusted for covariates with reporting of odds ratios, p values, and 95% CI for the primary hypothesis test.
The model was repeated adjusting for patient level covariates selected a priori: age, gender, work status, area level social deprivation, smoking status, ethnicity, symptom duration, rheumatoid, and anti-CCP antibody status. The adjusted model was then repeated with further adjustment for departmental characteristics selected a priori: specialist nurse and consultant staffing levels per head of catchment population; presence or absence of a dedicated EIA clinic, and departmental proportions of missing DAS28 data. Model fit was assessed by comparing area under the curve values (AUC) for observed and predicted performance data.
As analyses were exploratory in nature, no correction for multiple hypothesis testing was performed. All analyses were conducted using Stata 15 statistical software package.

Results
A total of 6205 patients with a diagnosis of RA and complete treatment time data were recruited from 136 departments across England and Wales. Table 1 details baseline and demographic characteristics, process measures, clinical outcome measures, and proportions of missing data. There were substantial proportions of incomplete DAS28 data at baseline with 2130/6205 (34.3%) missing data. Table 2 provides further detail on the missing components of baseline DAS28.
There were demographic differences between patients with complete and incomplete baseline DAS28 data (see Table 3). Those with incomplete baseline DAS28 data were younger, less likely to smoke, more likely to be in paid work, less likely to be RF or anti-CCP antibody positive, and had longer symptom duration prior to presentation. Amongst those with incomplete baseline DAS28, 50.3% were commenced on treatment within 90 days, compared to 65.3% in those with complete DAS28 data.

Characterising departmental variations in data completeness
Mixed effects modelling to identify departments with outlier levels of missing DAS28 data was conducted. The model identified 13 departments with high levels of missing data and 7 with low levels. The case-mix adjusted departmental effects are plotted in Fig. 1a. The analysis demonstrates wide variation across departments.
The model has accounted for sample size and case mix, shrinking estimates towards the grand mean as a result of them being empirical Bayes estimates, with the level of shrinkage dependent on the number of observations for each department.
The locations of outlier departments were mapped, displayed in Fig. 1b, with red markers indicating departments with high outlier levels of missing baseline DAS28 data. Geographic clustering of high outlier departments is visible in the North West region.
Missing clinical outcome data as a predictor of service quality Logistic regression models were constructed to assess if missing baseline DAS28 data predicted timely treatment commencement (within 90 days of referral). In mixed effects models, missing baseline DAS28 was associated with not commencing treatment promptly, with odds ratios of 0.48 (95% CI 0.42 to 0.54, p < 0.0001) in the unadjusted model and 0.50 (95% CI 0.41 to 0.61, p < 0.0001) in the model adjusted for patient level factors. The association was maintained after adjustment for department level factors, odds ratio 0.50 (95% CI 0.41 to 0.61, p < 0.0001). A comparison of the models is presented in Fig. 2.
Staffing levels had a substantial impact on the estimates with higher nursing levels and, paradoxically, lower consultant levels associating with a greater chance for prompt treatment initiation.
Sensitivity analyses were conducted to investigate the impact of staffing levels on data completeness. Mixed effects modelling with nurse and consultant densities as a predictor of missing baseline DAS28 gave odds ratios of 1.11 (95% CI 0.91 to 1.34) and 0.97 (95% CI 0.76 to 1.25) respectively, suggesting staffing levels were not independent predictors of missing DAS28.

Discussion
These data from an outpatient-based NCA identify a strong relationship between clinical outcome data completeness and care quality, maintained after extensive adjustment for patient and department level factors. The relationship suggests missing baseline disease activity data are an indirect indicator for service quality in RA care.
By using a mixed effects model in our analyses, we have been able to account for case mix and volume of activity in our estimates of data quality. As a result, we have robustly identified departments that substantially deviate from expected data return levels, accounting for confounding due to case-mix and potential small sample effects [11].
There was broad variation in departmental completeness of baseline disease activity data. High levels of incomplete clinical outcome measures detrimentally affect the face validity of NCA findings. Process measures, such as treatment time in NCAREIA, tend to have lower   [15]. This is perhaps explained by clinician familiarity with the clinical outcome measures from trials within the field, in contrast to the paucity of studies that evaluate process measures in healthcare.  There appeared to be a cluster of departments with high degrees of missing data in the North West region. The purpose of mapping performance was to seek evidence for regional clustering. The mapping presented suggests that clustering may be present, and this warrants further investigation. Clinical departments often share networks within their locality, with common training programs and regional education events. If regional trends in performance are apparent, and genuine, these provide an opportunity for a collective service improvement program.
There were striking differences between patients with complete and incomplete DAS28 data, suggesting the data are missing not at random. The drivers behind missing DAS28 data warrant further discussion. Participation in NCAREIA was mandatory, but still relied on clinician goodwill for data entry. It is likely that clinicians put greater emphasis on collecting data for patients with 'typical' or severe symptoms of RA. This is supported by the higher proportion of seropositive patients in those with complete DAS28 data.
Missing data are a challenge in any observational dataset, particularly in NCAs that rely on clinician goodwill for case ascertainment and data collection. HQIP, responsible for the commissioning of NCA activity within the NHS in England, recommend that missing data can be reduced, by rationalising of data collection, so only the most important information is sought; mitigated, via linkage to alternative data sources to 'backfill' data gaps; and imputed, utilising statistical modelling [16].
Rationalisation of data collection is a key component in the design of an NCA. A balance must be struck between the desire to collect as much information as possible and the challenge for clinicians tasked with recruiting patients while running busy services. Data linkage to fill in data gaps, for example using the clinical practice research datalink of primary care data to attain patient demographic details, is in theory possible but in practice not feasible as this will require additional approvals to access and link alternative data sources. Therefore, data imputation is often required to allow analysis without omitting individuals with incomplete records, which potentially introduces selection bias. A widely used principled approach to missing data is multiple imputation, which estimates a range plausible missing values [17]. It is now widely used in the published literature and has been utilised in NCAREIA analyses [18].
The impact of missing clinical outcomes in NCAs is often only partially addressed [19,20], increasing the likelihood of biased findings. All NCAs should have a robust missing data plan in keeping with HQIP Fig. 2 Mixed effects modelling of missing baseline DAS-28 data as a predictor of prompt treatment commencement. Model 1 (N = 6205) is unadjusted. Model 2 (N = 2979) is adjusted for patient covariates: age, gender, work status, socioeconomic position, smoking status, ethnicity, symptom duration, rheumatoid, and anti-CCP antibody status. Model 3 (N = 2943) is adjusted for patient covariates and the following departmental characteristics: specialist nurse and consultant staffing levels per head of catchment population, presence or absence of a dedicated EIA clinic, and departmental proportions of missing DAS28 data. Department was included as a random intercept in all three models recommendations. The findings presented here suggest that in circumstances where missing clinical outcome data cannot be avoided, it may have utility as a surrogate for service performance.
Work on other NCAs has highlighted nonparticipation as a predictor for departments that deliver lower quality care [21], but this study is the first to highlight the utility of partial participation, submitting incomplete data, as a predictor capable of identifying departments with outlier performance.
The strengths of this study are that it was conducted using a large dataset with missing data characteristics similar to other NCAs. Analyses were extensively adjusted for patient and departmental covariates, with department included as a random effect in the model to account for the hierarchy of the dataset. The findings identified that (1) the degree of missing clinical outcome data varies broadly across departments, allowing the detection of outliers, and (2) missing clinical outcome data associates with service quality. The combination of these findings supports the utility of missing outcome data as a usable proxy for service quality.
These findings do not in themselves support the utilisation of missing outcome data across NCAs. Further work by the respective methodology teams of NCAs is required to assess if these findings are generalisable. The heterogeneity of outcomes collected and differences in the degree of missing data will likely lead to this proxy having varying utility across different datasets.

Conclusion
For the first time, we have demonstrated that poor engagement in a national audit program directly correlates with the quality of care provided. Acknowledging methodological limitations of this work, the key message is unsurprising. Centre-level variation in care is frequently overlooked in observational studies [22], yet any epidemiologist working in the field will recognise that centre effects are large and significant. As research into health care quality evolves, our data suggest that failing to return outcome data is a viable service quality metric.