Efficacy assessed in follow-ups of clinical trials: methodological conundrum
© BioMed Central Ltd 2010
Published: 30 July 2010
Increasingly, we see papers describing the long-term follow-up results of randomised clinical trials. Sometimes, like the article by Rantalaiho and colleagues in the previous issue of Arthritis Research & Therapy, the follow-up extends to more than 10 years. It is not uncommon that authors of such articles describe their results as a comparison of the original treatment groups in the original randomised clinical trial. Methodologically, such a comparison is fallible for several reasons. In this editorial, two important sources of bias that may jeopardise the results of such follow-up studies are discussed: confounding by indication and confounding by trial completion.
Long-term follow-ups of randomised clinical trials are a contradictio in terminis.
With this rather bold statement I do not mean that such studies are impossible to conduct. Rantalaiho and colleagues have proven with the publication of the 11-year follow-up of their world-famous Fin-RACo trial that dedicated investigators and patients who believe in the goals of the study can create a dataset that is unsurpassed in its wealth, from which we can learn a lot about the long-term fate of patients with rheumatoid arthritis (RA). The authors have carefully analysed the available radiographic data, they have investigated important long-term outcomes such as mortality and joint-replacement surgery, and they have appropriately modelled longitudinal data. Their conclusion that early aggressive therapy with combinations of conventional disease-modifying antirheumatic drugs including corticosteroids pays off in terms of long-term radiographic and clinical benefits is credible. And their argument that 'treat to target' is the best way to exploit those benefits is convincing.
What concerns me most in Rantalaiho and colleagues' interpretation - and admittedly in similar exercises in which I took part myself [2, 3] - is the implicit assumption that two groups of patients formed a decade ago by a stochastic process that we call randomisation can be compared 11 years later under the same premise of prognostic similarity.
Groups in randomised clinical trials (RCTs) may violate prognostic similarity even at baseline. Probability theory tells us that if we were to repeat the randomisation procedure 1,000 times, a number of those attempts would produce imbalances, sometimes even in prognostically relevant variables. We usually ignore such baseline differences, assuming that imbalances may occur in either direction and that their combined net effect on the outcome of interest is probably negligible. The important consideration is that these baseline differences arise purely by chance (random), which means 'not driven by any tangible or impressionable process'.
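The thought experiment of randomising 1,000 times can be sketched as a small simulation. This is only an illustration under assumed numbers: the trial size of 200 patients, the standard-normal "prognostic score" and the 0.25 threshold for a "notable" imbalance are all hypothetical and are not taken from the Fin-RACo trial.

```python
import random
import statistics


def baseline_imbalance(n_patients=200, n_trials=1000, seed=0):
    """Repeat simple randomisation and record, for each attempt, the
    between-group difference in mean baseline prognostic score."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_trials):
        # Hypothetical prognostic score for each patient (standard normal).
        scores = [rng.gauss(0.0, 1.0) for _ in range(n_patients)]
        # Randomisation: shuffle the patients and split them into two arms.
        rng.shuffle(scores)
        half = n_patients // 2
        group_a, group_b = scores[:half], scores[half:]
        diffs.append(statistics.mean(group_a) - statistics.mean(group_b))
    return diffs


diffs = baseline_imbalance()
# Averaged over all attempts, the imbalance has no preferred direction.
mean_diff = statistics.mean(diffs)
# Share of attempts with an imbalance larger than 0.25 score units
# (an arbitrary threshold chosen for illustration).
share_large = sum(abs(d) > 0.25 for d in diffs) / len(diffs)
print(f"mean imbalance over all attempts: {mean_diff:.3f}")
print(f"share of attempts with a notable imbalance: {share_large:.1%}")
```

In such a sketch, individual attempts do show imbalances, but they fall in either direction and average out close to zero, because nothing other than chance drives them.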
I need this piece of theory to convince you that Rantalaiho and colleagues' 11-year RCT follow-up has suffered from many influences that may have jeopardised prognostic similarity. Let us look through the spectacles of the trial methodologist and play devil's advocate by working out two important biases: confounding by indication and confounding by trial completion.
The Fin-RACo trial had a protocol for only 2 years, implying that any treatment choice thereafter was up to the discretion of the doctor and the patient. Undoubtedly, the physician wanted the best for the patient, thus prioritising the patient's wellbeing over the fate of the study. A consequence of good clinical practice, however, is that - as confirmed by Rantalaiho and colleagues - the worst patients may have received the most intensive (effective, costly) treatment, which may in turn have unquantifiable influences on the outcome of interest. If such events occur in an unbalanced fashion, we speak about confounding by indication. I think in RA, with its many effective treatments to choose from and its inextricable relationship between disease activity (determinant) and radiographic progression (outcome measure), confounding by indication should be a number-one reason to refrain from statistical between-group comparisons in long-term follow-ups of RCTs.
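How confounding by indication can distort a long-term between-group comparison may be illustrated with a toy simulation. Every element here is an assumption made for illustration: the arm names, the effect sizes, the 2-year and 9-year weights, and the rule that patients with above-threshold disease activity receive intensified treatment after the protocol ends. None of it is estimated from the Fin-RACo data.

```python
import random
import statistics


def long_term_difference(rescue, n=20000, seed=1):
    """Mean difference in cumulative damage (single minus combination).

    rescue=True mimics good clinical practice after the protocol phase:
    patients with high disease activity get intensified treatment.
    """
    rng = random.Random(seed)
    damage = {"combination": [], "single": []}
    for _ in range(n):
        arm = rng.choice(["combination", "single"])
        severity = rng.gauss(0.0, 1.0)           # latent disease severity
        activity = severity + rng.gauss(0.0, 0.5)
        if arm == "combination":
            activity -= 1.0                      # assumed protocol-phase benefit
        total = 2 * activity                     # damage during the 2 protocol years
        # Post-protocol phase: treatment is chosen by indication, not by arm.
        if rescue and activity > 0.0:
            activity -= 1.5                      # assumed effect of intensified treatment
        total += 9 * activity                    # damage during 9 further years
        damage[arm].append(total)
    return statistics.mean(damage["single"]) - statistics.mean(damage["combination"])


without_rescue = long_term_difference(rescue=False)
with_rescue = long_term_difference(rescue=True)
print(f"difference if nobody were re-treated:     {without_rescue:.2f}")
print(f"difference with treatment by indication:  {with_rescue:.2f}")
```

Because more patients in the less intensively treated arm meet the threshold for intensified treatment, the observed long-term difference in this sketch no longer reflects the effect of the randomised treatment alone. In a real cohort the size and even the direction of such a distortion are unknown.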
The second issue is related to the first, but is slightly different in nature: confounding by trial completion. Obviously, the investigators have done their best in obtaining the outcome of interest in as many patients as possible. Expectedly, they have not been able to assess outcome in every patient. What is important from a methodological point of view is whether this loss to follow-up was completely random. Usually it is impossible to determine the exact reasons for patients not showing up at a control visit or an end-of-study assessment. Usually, therefore, it is impossible to conclude that a no-show (or missing value) had nothing to do with the severity and activity of the RA. What follows is that you cannot be sure that such events are distributed evenly across trial groups, and therefore every between-group comparison under the assumption of prognostic similarity is meaningless. Rantalaiho and colleagues have done their best to collect as many radiographs from as many patients as possible, but - not unexpectedly - more than 30% of the patients lack their 11-year radiographic assessment. The investigators may, as many authors do, provide inferential arguments that drop-out is not relevant in their study, but unfortunately one cannot judge whether that is so.
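The effect of non-random loss to follow-up can be sketched in the same way. In this hypothetical example the two arms have identical true outcomes and both lose 30% of their patients, but the drop-out mechanism differs: in arm A it is purely random, whereas in arm B patients with worse progression are more likely to miss the final assessment. The arm labels, the outcome distribution and the drop-out probabilities are assumptions chosen for illustration only.

```python
import random
import statistics


def completer_analysis(n=20000, seed=2):
    """Compare arms using all patients versus completers only."""
    rng = random.Random(seed)
    everyone = {"A": [], "B": []}
    completers = {"A": [], "B": []}
    for _ in range(n):
        arm = rng.choice("AB")
        # Same outcome distribution in both arms: no true treatment effect.
        progression = rng.gauss(10.0, 4.0)
        if arm == "A":
            p_missing = 0.3                      # drop-out unrelated to outcome
        else:
            # Drop-out related to outcome; still 30% missing on average.
            p_missing = 0.5 if progression > 10.0 else 0.1
        everyone[arm].append(progression)
        if rng.random() >= p_missing:
            completers[arm].append(progression)
    full = statistics.mean(everyone["A"]) - statistics.mean(everyone["B"])
    observed = statistics.mean(completers["A"]) - statistics.mean(completers["B"])
    missing = 1 - (len(completers["A"]) + len(completers["B"])) / n
    return full, observed, missing


full_diff, completer_diff, share_missing = completer_analysis()
print(f"share of patients without final assessment: {share_missing:.1%}")
print(f"difference A minus B, all patients:         {full_diff:.2f}")
print(f"difference A minus B, completers only:      {completer_diff:.2f}")
```

In this sketch the completers-only comparison suggests a difference between arms although none exists, and an equal drop-out percentage in both arms gives no reassurance, because the reasons for dropping out are what matter and these are usually unobserved.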
These two biases mean I am rather reluctant to accept firm conclusions from follow-ups of RCTs that have been analysed a decade after the randomisation procedure, however credible they may seem. Many events may have occurred in every individual patient in the trial that may have broken prognostic similarity. I therefore do not truly believe in the explanation of differences after 10 years of intangibly trying to influence patients' fates.
Does this make Rantalaiho and colleagues' results useless? Absolutely not. We welcome cohorts of patients that have been followed for years in order to find out what eventually determines the disease course. Ideally such cohorts include patients with severe and less severe disease, with more and less active RA, with more and less aggressive initial treatment. We should know a lot more about these patients' fates; their baseline values and their baseline biomaterials are extremely important in defining new prognostic biomarkers. Such carefully conducted studies may give insight into what is really important in determining an individual patient's prognosis in a world full of treatment choices that differ in efficacy, effectiveness and cost.
Explained in terms of contradictio in terminis, the contradiction is in the recognition that the randomised part of an RCT is not necessarily a licence for harmlessly comparing treatment effects after a decade of follow-up of that trial.
Fin-RACo: Finnish Rheumatoid Arthritis Combination Therapy
RCT: randomised clinical trial