Introduction
The ATLAS trial, a randomized controlled trial designed to “determine whether treatment with a continence pessary is as effective as behavioural therapy for reducing SUI 12 weeks after randomization” [1], is perhaps the most frequently cited trial when the effectiveness of Kegel exercises versus continence pessaries is discussed. The trial included three treatment arms: incontinence pessary; behavioural therapy (consisting primarily of pelvic floor muscle training, with the addition of skills for active use of the pelvic muscles to prevent stress incontinence); and combination therapy. The two primary outcomes were the Patient Global Impression of Improvement (PGI-I) and the stress incontinence subscale of the Urogenital Distress Inventory (UDI-SI). The two secondary outcomes were the number of incontinence episodes recorded in a 7-day bladder diary and a Patient Satisfaction Questionnaire. The sample size calculation was based on the PGI-I. The authors did not discuss how the results of each outcome measure would be weighted relative to the others.
In the conclusion of the abstract of their paper, published in 2010, the authors state that “Behavioural therapy resulted in greater patient satisfaction and fewer bothersome incontinence symptoms than pessary at 3 months, but the differences did not persist at 12 months.” [2] In their discussion of the findings, the authors opine, “Thus, the 1-year data support the consideration of pessary as a reasonable alternative for women wishing to avoid or defer stress incontinence surgery and not interested in or able to adhere to behavioural therapy.” [2] This statement implicitly establishes a hierarchy of efficacy that places surgery first, followed by pelvic exercises and finally pessaries. It is the contention of this review that this hierarchy does not reflect the authors’ own findings in the ATLAS study. Favouring the findings of the intention-to-treat (ITT) analysis over those of the per-protocol (PP) analysis did a disservice to the evidence from the study and does not fairly represent the outcomes that can be expected when the treatment interventions are followed as prescribed.
The ATLAS trial represents an excellent effort to answer the question its authors posed. Nevertheless, faults in the design, analysis, interpretation and reporting of the study compromise the integrity and accuracy of the reported findings.
Analysis
The data
The randomization process resulted in groups with acceptably similar baseline characteristics. However, after randomization, the rates of drop-out and loss to follow-up differed significantly between the groups: behavioural therapy 15%, pessary 26% and combined treatment 12%. The participants lost to follow-up in the pessary arm deserve closer examination.
Two subgroups within the loss to follow-up category of the pessary group are of particular interest. They are listed in Table 2 as follows: unwilling to continue to participate, and wanted other treatment arm. Participants in the first subgroup did not engage fully in the study, and those in the second withdrew before receiving any treatment, so neither subgroup received any meaningful therapy. Together they represent 15% of the participants in the pessary group. Despite this obvious reason to exclude them from an analysis of treatment effect, the intention-to-treat analysis classified these participants as failures on all of the outcome measures.
The analysis
Consider the two methods of outcome analysis used in the ATLAS study: intention-to-treat and per-protocol. The intention-to-treat analysis included all participants in their originally assigned group, regardless of whether they received alternate therapy, withdrew or were lost to follow-up (in this analysis, these participants were considered failures). The per-protocol analysis used data only from those participants who adhered to the assigned treatment.
When should each of these approaches be used, and what can be expected from each?
The ITT approach is often considered to apply the most stringent requirements to a study analysis: it preserves the benefit of randomization, which circumvents selection bias by ensuring that participant characteristics that could affect the study outcome are equally distributed among the groups. However, because it analyzes results solely by assigned group, regardless of whether the participant received the assigned treatment, its scope is limited: “The ITT provides ‘an (unbiased) answer’ only to one question, specifically, ‘What are the expected outcomes for a typical patient instructed, in the context of the trial, to take the treatment to which he was assigned?’ In summary, the ITT is not indicating what happens when the intervention is actually used” [5].
By contrast, the PP analysis includes only those participants who adhered to the assigned treatment protocol without violations. The PP approach is more likely to detect a true difference between the experimental and control groups, because a treatment effect can only be detected when adherence to the protocol is optimal. Readers of any research report, whether healthcare professionals or patients, expect the reported results to represent the treatment effects of the interventions.
Unfortunately, the authors of the ATLAS study focused their discussion on the results of the ITT analysis. This analysis was compromised by the attrition bias that arose when participants lost to follow-up were counted as failures. The number of losses to follow-up in the pessary group was disproportionate, and closer examination shows that these losses were made up of:
- Women who were not successfully fitted with a pessary
- Women who wanted the other treatment arm after randomization
- Women who were unwilling to continue to participate in the study
The problem with including women in the first two categories is that they do not represent failures of the intervention, since these women never used the intervention.
In addition to introducing attrition bias, the authors’ emphasis on the results of the ITT analysis was inappropriate given their study design. ITT analysis is indicated where a study seeks to find a difference between interventions (a superiority study). If the rate of loss to follow-up is high, the strategy used to handle these cases can misrepresent the true effect of the study interventions within each group. If the loss to follow-up is also significantly higher in one group than in the others, whether these cases are treated as successes or failures can significantly bias the comparison of treatment effects, as discussed above.
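The arithmetic behind this concern can be illustrated with a minimal sketch. It uses the loss rates reported above (15%, 26% and 12%), but everything else is an assumption made for illustration only: the arm size of 200, the identical 50% success rate among participants who completed follow-up, and the names `Arm`, `itt_rate_missing_as_failure` and `completer_rate` are all hypothetical and are not taken from the ATLAS report. The completer rate is also a simplification of a true per-protocol analysis, which would additionally exclude protocol violators.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    name: str
    randomized: int  # hypothetical number randomized to the arm
    lost: int        # dropped out or lost to follow-up
    successes: int   # hypothetical successes among those who completed

    @property
    def completers(self) -> int:
        return self.randomized - self.lost

    def itt_rate_missing_as_failure(self) -> float:
        # Denominator is everyone randomized, so each lost
        # participant is counted as a treatment failure.
        return self.successes / self.randomized

    def completer_rate(self) -> float:
        # Denominator is only participants with outcome data.
        return self.successes / self.completers


# Loss counts follow the reported rates (15%, 26%, 12%) for a
# hypothetical arm size of 200. Successes are set to exactly half
# of the completers in every arm, i.e. no true difference.
arms = [
    Arm("behavioural", randomized=200, lost=30, successes=85),
    Arm("pessary", randomized=200, lost=52, successes=74),
    Arm("combined", randomized=200, lost=24, successes=88),
]

for arm in arms:
    print(
        f"{arm.name:12s} "
        f"ITT (lost = failure): {arm.itt_rate_missing_as_failure():.1%}  "
        f"completers only: {arm.completer_rate():.1%}"
    )
```

Under these assumed numbers, every arm has the same 50.0% success rate among completers, yet counting losses as failures yields 42.5% for behavioural therapy, 37.0% for pessary and 44.0% for combined treatment. The apparent disadvantage of the pessary arm in this sketch arises entirely from its higher loss to follow-up.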
Where a study is aiming to test if two treatments are equally effective (an equivalence study), both ITT and PP should be examined and should agree before a conclusion is drawn. As Catalin Tufanaru states, “I declare that there are serious limitations of ITT for informing healthcare practice. I urge systematic reviewers and users of systematic reviews to acknowledge the real necessity of appropriate, unbiased, statistically valid analyses of the effects of the actual treatments received.”
The ATLAS trial was an equivalence study. As such, the ITT and PP analyses should have agreed before a conclusion of superiority or non-inferiority was drawn. As is evident from the reported results of the ATLAS study, the two analyses did not agree: the ITT analysis found a difference while the PP analysis did not.
The authors’ report that their trial found a difference between the treatment arms is therefore problematic. Given the paucity of research comparing pessaries with other interventions, this study has prompted a widespread assumption that pelvic exercises are superior to pessaries. The authors did not address the relative merits of ITT versus PP analysis in the discussion section of their report. We invite the authors to undertake a full discussion of the implications of their study in light of the shortcomings we have identified.
About the author
Dr. Scott A. Farrell is a Professor Emeritus of Obstetrics and Gynecology at Dalhousie University and a leading expert in women’s pelvic health. Over his 40-year clinical and academic career, he trained countless physicians and published extensively on urogynecology, pelvic floor disorders, and women’s continence care. Dr. Farrell has served as President of the Society of Obstetricians and Gynaecologists of Canada and has been recognized with multiple teaching and research awards. He also holds several patents for medical innovations, including devices that improve the treatment of urinary incontinence.