Study design
This phase 3, randomized, double-blind, double-dummy, active-controlled (NSAID), parallel-group study (ClinicalTrials.gov: NCT02528188) was conducted at 446 clinical research, specialist/general practice, or hospital sites in the USA, Europe, Latin America, and Asia-Pacific region from July 2015 to February 2019.
The study consisted of a screening period of up to 37 days (including a 2- to 30-day washout phase for prohibited medications and an initial pain assessment period [IPAP] of 7 days prior to baseline), a 56-week double-blind treatment period, and a 24-week safety follow-up period that began 8 weeks after the final SC injection. The co-primary efficacy endpoints were change in Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC© 1996 Nicholas Bellamy. WOMAC® is a registered trademark of Nicholas Bellamy [CDN, EU, USA]) Pain, WOMAC Physical Function, and Patient Global Assessment of Osteoarthritis (PGA-OA) at week 16 for each tanezumab group versus the NSAID group. The full methodology for this study has been reported previously [10].
Study sample
Eligibility criteria included age ≥ 18 years, body mass index (BMI) ≤ 39 kg/m², American College of Rheumatology classification criteria for hip or knee OA, a Kellgren-Lawrence (KL) grade ≥ 2 in the index joint (most painful hip or knee) as confirmed by the central reader, WOMAC Pain and WOMAC Physical Function scores of ≥ 5 at baseline, and a PGA-OA rating of “fair,” “poor,” or “very poor” at baseline. A documented history of inadequate pain relief or intolerance to standard OA pain treatment (inadequate pain relief with acetaminophen, tramadol, or non-tramadol opioid analgesic; intolerance or contraindication to tramadol or non-tramadol opioid; or unwillingness to take a non-tramadol opioid analgesic) was also required. Finally, participants were required to be receiving treatment with a qualifying NSAID regimen averaging ≥ 5 days per week during the 30 days prior to the screening visit. Exclusion criteria included radiographic evidence in any joint of prespecified bone or joint conditions (e.g., destructive arthropathy characteristic of rapidly progressive OA [RPOA], atrophic OA, subchondral insufficiency fracture, primary osteonecrosis, or pathologic or stress fracture) as determined by a central musculoskeletal radiologist; history of osteonecrosis or osteoporotic fracture, or significant trauma or surgery to a knee, hip, or shoulder within the previous year; history or presence of clinically significant neurological, cardiovascular, or psychiatric disorders, cancer (except certain skin cancers), fibromyalgia, or sciatica; oral or intramuscular corticosteroid within 30 days or intra-articular corticosteroid injection in the index joint within 12 weeks or in any other joint within approximately 30 days of the IPAP; or intra-articular hyaluronic acid injection in the index joint within 30 days or long-acting hyaluronic acid formulation injection in the index joint within approximately 18 weeks of the IPAP.
Interventions
Participants received a stable, open-label dosing regimen of oral NSAID (naproxen 500 mg twice daily [BID], celecoxib 100 mg BID, or diclofenac extended-release 75 mg BID) for at least the last 2 weeks of the screening period. Participants receiving stable naproxen, celecoxib, or diclofenac prior to screening received the same NSAID during screening, while those receiving other stable NSAIDs prior to screening were assigned to naproxen, celecoxib, or diclofenac at the investigator’s discretion during screening. After the screening period ended, participants were randomized in a 1:1:1 ratio to tanezumab 2.5 mg SC plus oral placebo, tanezumab 5 mg SC plus oral placebo, or oral NSAID (the same NSAID regimen received during screening) plus SC placebo. SC study medication was administered by site staff every 8 weeks (up to 7 doses) through week 56. Participants self-administered oral study medication BID for up to 56 weeks.
Except for the 24-h period prior to study visits, rescue medication (acetaminophen) was permitted for participants requiring additional pain relief at doses ≤ 3000 mg/day, for up to 3 days/week up to week 16 and daily after week 16. The use of non-assigned NSAIDs was prohibited through week 64, but occasional use of other analgesics was permitted for self-limiting conditions unrelated to OA.
To continue receiving SC study medication beyond week 16, participants had to meet pre-specified efficacy criteria including a ≥ 15% reduction in WOMAC Pain subscale score from baseline to weeks 2, 4, or 8, and a ≥ 30% reduction in WOMAC Pain score from baseline in the index joint at week 16. Participants not meeting both these criteria were discontinued (reason defined as “met pain criteria for discontinuation”) from the treatment period at week 16 and entered a 24-week early termination follow-up period.
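The week 16 continuation rule can be illustrated with a short sketch. This is not the study's actual programming: the function and argument names are hypothetical, and treating a missing score as not meeting a criterion is an assumption made only for illustration.

```python
def percent_reduction(baseline, score):
    """Percent reduction from baseline (positive values mean less pain)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - score) / baseline


def may_continue_sc_dosing(baseline, early_scores, week16_score):
    """Return True if both WOMAC Pain continuation criteria are met.

    early_scores: dict mapping study week (2, 4, 8) to WOMAC Pain score.
    Missing scores (None) are treated as not meeting the criterion;
    this handling is an assumption, not taken from the protocol.
    """
    # Criterion 1: >= 15% reduction from baseline at week 2, 4, or 8.
    early_ok = any(
        percent_reduction(baseline, score) >= 15.0
        for week, score in early_scores.items()
        if week in (2, 4, 8) and score is not None
    )
    # Criterion 2: >= 30% reduction from baseline at week 16.
    week16_ok = (
        week16_score is not None
        and percent_reduction(baseline, week16_score) >= 30.0
    )
    return early_ok and week16_ok
```

For example, a participant with a baseline score of 8.0, a week 4 score of 6.0 (25% reduction), and a week 16 score of 5.0 (37.5% reduction) would satisfy both criteria under this sketch.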
Efficacy measures
Participants completed WOMAC and PGA-OA questionnaires at baseline and weeks 2, 4, 8, 16, 24, 32, 40, 48, 56, and 64 [13]. The 1-item PGA-OA assesses current overall OA status on a 5-point Likert scale from 1 = very good to 5 = very poor. Change from baseline was analyzed and presented for these measures at each assessment timepoint up to week 56; week 64 data were only used to assess response following treatment discontinuation.
Participants used an electronic diary (eDiary) to assess average pain in the index joint over the past 24 h (on a numeric rating scale [NRS] from 0 = no pain to 10 = worst possible pain) daily to week 16 and then weekly until week 80. Change from baseline was analyzed for days 1–7 and for weeks 1, 2, 3, 4, 6, 8, 10, 12, 16, 20, 24, 32, 40, 48, and 56; week 64 data (not shown) were only used to assess response following treatment discontinuation.
The proportions of participants achieving ≥ 30% (moderate), ≥ 50% (substantial), ≥ 70%, or ≥ 90% improvement from baseline in WOMAC Pain and WOMAC Physical Function, and the proportion meeting Outcome Measures in Rheumatology-Osteoarthritis Research Society International (OMERACT-OARSI) treatment response criteria (see Fig. 5 footnote for definition), were analyzed at weeks 2, 4, 8, 16, 24, 32, 40, 48, and 56 [14,15,16]. The proportion of participants achieving ≥ 30%, ≥ 50%, ≥ 70%, or ≥ 90% improvement from baseline in average pain in the index joint was analyzed at weeks 1, 2, 3, 4, 6, 8, 10, 12, 16, 20, 24, 32, 40, 48, and 56. To characterize time course and long-term maintenance of effect, these responder data are presented for weeks 2, 4, 8, 16, and 56. The ≥ 30% and ≥ 50% thresholds were evaluated because they represent moderate and substantial clinically meaningful improvement, respectively, in patients with chronic pain [14]. To provide a more complete assessment of treatment efficacy, we also evaluated thresholds of ≥ 70% and ≥ 90% to explore whether even greater levels of improvement can be achieved with tanezumab or NSAID treatment.
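As an illustration of how the responder thresholds translate into proportions, the following minimal sketch classifies each participant by percent improvement from baseline and tallies the share meeting each threshold. The names are hypothetical and the data are invented for the example. The sketch ignores missing-data handling and the regression modeling used in the actual analysis.

```python
THRESHOLDS = (30, 50, 70, 90)  # percent improvement cut-offs from the text


def percent_improvement(baseline, score):
    """Percent improvement from baseline; lower scores mean less pain."""
    return 100.0 * (baseline - score) / baseline


def responder_proportions(pairs):
    """pairs: list of (baseline, follow_up) scores, one per participant.

    Returns a dict mapping each threshold to the proportion of
    participants whose improvement meets or exceeds it.
    """
    counts = {t: 0 for t in THRESHOLDS}
    for baseline, score in pairs:
        improvement = percent_improvement(baseline, score)
        for t in THRESHOLDS:
            if improvement >= t:
                counts[t] += 1
    return {t: counts[t] / len(pairs) for t in THRESHOLDS}


# Invented example: improvements of 20%, 40%, 60%, and 95%.
example = [(10, 8), (10, 6), (10, 4), (10, 0.5)]
proportions = responder_proportions(example)
```

Because the thresholds are nested, a participant counted at ≥ 90% is also counted at every lower threshold, so the proportions can only decrease as the threshold rises.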
Minimum Clinically Important Improvement (MCII) and Patient Acceptable Symptom State (PASS) are categorical patient-reported outcomes that aim to define treatment response at the individual/patient level. MCII is defined as the smallest change in measurement that signifies an important improvement in patient symptoms, and PASS is defined as the value beyond which participants consider themselves well. Both endpoints were assessed at weeks 16 and 56 and were based on objective response criteria (changes in average pain in the index joint, WOMAC Physical Function, and PGA-OA scores; see footnote to Supplementary Fig. 1 for explanation of response criteria). The proportion of participants achieving an early sustained MCII or PASS response, defined as meeting the respective criteria at weeks 4 through 16, was also assessed. The week 4 timepoint was chosen on the basis of tanezumab pharmacokinetics (achievement of steady state) and because it was the first timepoint at which a large proportion of participants experienced meaningful (≥ 30%) symptom improvement; week 16 was chosen because it was the prespecified primary efficacy timepoint and because rescue medication use was limited up to this timepoint.
Participants recorded rescue medication use in the eDiary daily through week 16 and then weekly through week 80. The incidence of rescue medication use and the mean/median number of days of rescue medication use per week were analyzed and presented for weeks 2, 4, 8, 16, 24, 32, 40, 48, and 56. The amount (mg) of rescue medication taken was analyzed for weeks 2, 4, 8, and 16 only, since the amount of rescue medication was not assessed daily after week 16.
The modified Patient-Reported Treatment Impact (mPRTI) questionnaire and the Treatment Satisfaction Questionnaire for Medication V.II (TSQM) were completed at weeks 16 and 56 to assess study treatment preference and satisfaction.
Safety measures
Musculoskeletal and neurological examinations, monitoring of AEs, and review of joint pain scores were conducted by investigators throughout the study. Radiographs of the bilateral hips, knees, and shoulders (obtained at screening and weeks 24, 56, and 80) were evaluated by trained central readers to monitor for possible joint safety events. Possible joint safety events, identified post-screening, were adjudicated by a blinded external committee of experts, and prespecified joint safety events were included in a composite joint safety endpoint (Supplementary Text 1). Full details of the safety measures used in the study and their results have been published previously [10].
Statistical methods
The sample size was established primarily to have a high probability of observing participants with any component of the primary composite joint safety endpoint, assuming a low event rate, rather than for efficacy assessments. However, a sample size of approximately 1000 participants per group was estimated to provide 76% power for comparison of the co-primary efficacy endpoints (change in WOMAC Pain, WOMAC Physical Function, and PGA-OA at week 16) for both tanezumab groups versus the NSAID group. The intent-to-treat population (all randomized participants who received ≥ 1 dose of SC study medication) was the primary analysis set for efficacy and safety.
Changes from baseline in WOMAC Pain, WOMAC Physical Function, and PGA-OA were prespecified co-primary (week 16) or secondary (other study weeks) efficacy endpoints. All other assessments were prespecified secondary endpoints except for analyses of responder rates for MCII, PASS, and average pain in the index joint, which were exploratory post hoc endpoints.
As reported previously, the co-primary and key secondary (proportion of participants with ≥ 50% improvement in WOMAC Pain at week 16) endpoints were included in a multiple testing procedure to control the family-wise type 1 error, using a graphical gatekeeping strategy [10]. Since the tanezumab 5 mg dose failed to achieve statistical significance for one co-primary endpoint (PGA-OA at week 16), hypothesis testing of the tanezumab 2.5 mg dose for the co-primary endpoints and of both doses for the key secondary endpoint could not be performed [10]. Since the objective of the current manuscript is to evaluate the time course of treatment effect and clinical importance of response using a mixture of primary, key secondary, other secondary, and post hoc endpoints, data in this manuscript are presented with unadjusted p values for directional guidance and consistency across timepoints.
An analysis of covariance (ANCOVA) model was used for the analysis of change from baseline in WOMAC Pain, WOMAC Physical Function, PGA-OA, and average pain in the index joint scores with a multiple imputation approach for missing data (dependent on the reason for missing data). Responder rates for WOMAC Pain, WOMAC Physical Function, average pain in the index joint, OMERACT-OARSI, MCII, and PASS were analyzed using logistic regression with a mixed baseline-observation-carried-forward (BOCF)/last-observation-carried-forward (LOCF) approach to missing data. Rescue medication use was analyzed using logistic regression (incidence) or negative binomial (amount and number of days) models with a LOCF approach to missing data. mPRTI (treatment preference) scores were analyzed using a Cochran-Mantel-Haenszel test (stratified by combinations of index joint, highest KL grade, and NSAID) using observed data. TSQM (treatment satisfaction) scores were analyzed using an ANCOVA model with observed data.
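The general idea of a mixed BOCF/LOCF imputation can be sketched as follows. The text above does not state which reasons for missing data were handled with BOCF and which with LOCF, so the mapping in `BOCF_REASONS` is an assumption for illustration only, as are the function name and the reason labels.

```python
# ASSUMPTION: reasons imputed with baseline (BOCF). The source does not
# specify this mapping; the set below is purely illustrative.
BOCF_REASONS = {"lack_of_efficacy", "adverse_event", "death"}


def impute_mixed_bocf_locf(baseline, observed, reason):
    """Fill missing post-baseline scores.

    baseline: baseline score.
    observed: post-baseline scores in visit order, None where missing.
    reason:   reason for missing data (string) or None.

    Missing values are replaced by the baseline score when the reason is
    in BOCF_REASONS, otherwise by the last observed value (which is the
    baseline score if nothing has been observed yet).
    """
    imputed = []
    last_seen = baseline
    for value in observed:
        if value is not None:
            last_seen = value
            imputed.append(value)
        elif reason in BOCF_REASONS:
            imputed.append(baseline)
        else:
            imputed.append(last_seen)
    return imputed
```

Under this sketch, a participant with a baseline of 7.0 and observed scores of 5.0 and 4.0 who then discontinued would be assigned 7.0 at later visits under BOCF (no credit for earlier improvement) and 4.0 under LOCF.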