Development of treatments for rare diseases is challenging due to the limited number of patients available for participation. Learning about treatment effectiveness with a view to treating patients in the larger outside population, as in the traditional fixed randomised design, may not be a plausible goal. An alternative goal is to treat the patients within the trial as effectively as possible. Using the framework of finite-horizon Markov decision processes and dynamic programming (DP), a novel randomised response-adaptive design is proposed which maximises the total number of patient successes in the trial. Several performance measures of the proposed design are evaluated and compared to alternative designs through extensive simulation studies. For simplicity, a two-armed trial with binary endpoints and immediate responses is considered. However, further evaluations illustrate how the design behaves when patient responses are delayed, and modifications are made to improve its performance in this more realistic setting.
Simulation results for the proposed design show that: (i) the percentage of patients allocated to the superior treatment is much higher than in the traditional fixed randomised design; (ii) the power is substantially higher than that of the optimal DP design; and (iii) the corresponding treatment effect estimator exhibits only a very small bias and mean squared error. Furthermore, this design is fully randomised, which is an advantage from a practical point of view because it protects the trial against various sources of bias.
Overall, the proposed design strikes a very good balance between power and patient benefit, which greatly increases the prospects of a Bayesian bandit-based design being implemented in practice, particularly for trials involving rare diseases and small populations.
Keywords: Clinical trials; Rare diseases; Bayesian adaptive designs; Sequential allocation; Bandit models; Dynamic programming; Delayed responses.
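
For orientation, the finite-horizon DP formulation referred to above can be illustrated with a minimal sketch of the classical, non-randomised two-armed Bernoulli bandit, which corresponds to the "optimal DP design" used as a comparator rather than to the randomised design proposed here. The function name `optimal_expected_successes`, the state layout and the uniform Beta(1, 1) priors are assumptions made for this illustration only.

```python
from functools import lru_cache


def optimal_expected_successes(n, prior=(1, 1, 1, 1)):
    """Expected total successes under the classical optimal DP allocation.

    n     : trial size (number of patients, treated one at a time).
    prior : (a_A, b_A, a_B, b_B), Beta prior parameters for arms A and B.
            Uniform Beta(1, 1) priors are an assumption of this sketch.
    """
    a_A, b_A, a_B, b_B = prior

    @lru_cache(maxsize=None)
    def value(sA, fA, sB, fB):
        # State: successes and failures observed so far on each arm.
        remaining = n - (sA + fA + sB + fB)
        if remaining == 0:
            return 0.0
        # Posterior mean success probability of each arm.
        pA = (a_A + sA) / (a_A + b_A + sA + fA)
        pB = (a_B + sB) / (a_B + b_B + sB + fB)
        # Expected successes-to-go if the next patient receives A or B.
        qA = pA * (1 + value(sA + 1, fA, sB, fB)) + (1 - pA) * value(sA, fA + 1, sB, fB)
        qB = pB * (1 + value(sA, fA, sB + 1, fB)) + (1 - pB) * value(sA, fA, sB, fB + 1)
        # Deterministic optimal rule: choose the arm with the larger value.
        return max(qA, qB)

    return value(0, 0, 0, 0)


if __name__ == "__main__":
    for n in (1, 2, 10, 25):
        print(n, round(optimal_expected_successes(n), 4))
```

The recursion depth equals the trial size, so this form is only suitable for small trials; a larger trial would call for an iterative backward induction over the same states. A randomised variant would replace the `max` step with an allocation rule that gives each arm a non-zero probability of being chosen, which this sketch does not attempt.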