Statistical Models, Design and Data Analysis:
A Conference to Celebrate Anthony Atkinson's 70th Birthday
ABSTRACTS and SLIDES
Design of Dose-escalation Trials
R. A. Bailey
In one form of dose-escalation trial, several cohorts of subjects are recruited. Each cohort takes part in a different time period. The doses are ordinally labelled 0, 1, ..., where 0 denotes placebo. Because higher doses may have more adverse side-effects, no subject can be exposed to dose i until some information is obtained about the effect of dose i - 1.
One possibility is to use dose i for everyone in cohort i. Then there is no blinding; moreover, dose effects are completely confounded with cohort effects and period effects. A modification of this uses a certain number of placebo subjects in each cohort. If there are no cohort effects then the proportion of placebo in each cohort should be such that the design is equireplicate if it proceeds to the planned largest dose. If there are cohort effects, then more precise comparisons between doses can be made if half of each cohort receives placebo.
I shall discuss a new design that does at least as well as both of these, whether or not there are cohort effects, and whether or not within-cohort information is combined with between-cohort information.
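The equireplication condition for the placebo-augmented design can be checked with a short calculation. The sketch below assumes one particular reading of that design, which the abstract does not spell out: there are n cohorts of equal size m, and cohort i contains c placebo subjects with the remaining m - c subjects on dose i. Under that assumption placebo is replicated n*c times and each dose m - c times, so the design is equireplicate when c/m = 1/(n + 1). The function names are illustrative only.

```python
from collections import Counter
from fractions import Fraction


def placebo_design(n_doses, cohort_size, n_placebo):
    """Assumed layout: cohort i gets n_placebo subjects on dose 0, the rest on dose i."""
    return [[0] * n_placebo + [i] * (cohort_size - n_placebo)
            for i in range(1, n_doses + 1)]


def replications(design):
    """Count how many subjects receive each dose over all cohorts."""
    return Counter(dose for cohort in design for dose in cohort)


def equireplicate_placebo_fraction(n_doses):
    """Solve n*c = m - c for c/m, giving 1/(n + 1)."""
    return Fraction(1, n_doses + 1)


# Three active doses, cohorts of eight: a quarter of each cohort on placebo.
fraction = equireplicate_placebo_fraction(3)
design = placebo_design(n_doses=3, cohort_size=8, n_placebo=int(8 * fraction))
reps = replications(design)
print(fraction, dict(reps))
```

With cohort effects in the model, the abstract's alternative recommendation corresponds to `n_placebo = cohort_size // 2` in the same function.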
Dimension Reduction Paradigms for Regression
R. D. Cook
Dimension reduction for regression, represented primarily by principal components, is ubiquitous in the applied sciences. This is an old idea that has moved to a position of prominence in recent years because technological advances now allow scientists to routinely formulate regressions in which the number p of predictors is considerably larger than in the past. Although large p regressions are perhaps mainly responsible for renewed interest, dimension reduction methodology can be useful regardless of the size of p.
Starting with a little history and a definition of sufficient reductions, we will consider a variety of models for dimension reduction in regression. The models start from one in which maximum likelihood estimation produces principal components, step along a few incremental expansions, and end with forms that have the potential to improve on some standard methodology. This development provides remedies for two concerns that have dogged principal components in regression: principal components are typically computed from the predictors alone and so make no apparent use of the response, and they are not invariant under full rank linear transformation of the predictors.
Penalized Designs of Multi-response Experiments
V. V. Fedorov
The major theme of the presentation is optimal design of dose-response experiments when several potentially correlated responses are observed simultaneously for every experimental unit. While the narration is built around two endpoints, the generalization to higher dimensions is also discussed. I also address some ethical aspects of dose-response experiments. In the traditional optimal design setting one tries to gain as much information as possible without explicit concern about patients in the trial, i.e. doing what is best for the targeted population (collective ethics). The currently popular procedures gravitate to individual ethics: doing what is best (according to current knowledge) for a newly arriving patient. To compromise between these two extremes we maximize information per unit of a penalty, which depends on efficacy and toxicity. Necessary and sufficient conditions, algorithms and software are developed and discussed for locally optimal, composite and adaptive designs.
Doubly Adaptive Designs for Ethical Allocation of Treatments in Clinical Trials
A large class of designs for clinical trials in the statistical literature arises as follows: an optimality criterion is defined, a target allocation is found, namely an allocation of the treatments to be compared which appears to be satisfactory according to the prescribed criterion, and an experiment is identified which approaches the target. Usually the chosen design is tested by means of simulations.
The nature of the criterion may vary. For instance, from an ethical point of view one may want to minimize the number of patients that are allocated to inefficient treatments, or the expected number of failures, or the total number of patients in the trial; on the other hand, sound statistical inference calls for maximizing the precision of the statistical tools, e.g. the estimation of treatment differences or the power of the usual statistical tests. These objectives often clash, and the conflict between ethics and information is one of the main problems of clinical experimentation. Furthermore, whether the criterion be ethical or inferential, the derived target allocation in general depends on the unknown parameters of the statistical models and thus is unknown.
A possible solution is to proceed sequentially, in order to redress assignments to the unknown target as we go along. Sequential experiments are said to be adaptive if the observed responses are used to modify the experiment along the way; it seems natural to call doubly adaptive all the designs where the past design history too is taken into account.
After a review of the properties of existing adaptive and doubly adaptive designs, this talk proposes a class of compound criteria that allow one to choose the specific weights of the two components, ethics and information. A related target is derived, and a class of suitable doubly adaptive procedures is introduced, based on sequential Maximum Likelihood estimation, which can be proved to be asymptotically optimal.
The relative weights will be hard to decide in advance, especially when they themselves appear to depend on the state of nature. The method can be extended to include a compromise function with adaptive weights, and a design that at each step readjusts the relative importance of ethics and information on the basis of the knowledge acquired up to that time.
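A doubly adaptive procedure of the general kind described above can be sketched for two treatments with binary responses. The sketch is not the procedure of the talk: the compound target below is a hypothetical convex combination, with weight w, of the limiting allocation of the randomized play-the-winner rule (standing in for the ethical component) and Neyman allocation (standing in for the inferential component). The allocation function is the one proposed by Hu and Zhang for doubly adaptive biased coin designs, and the success probabilities are estimated with a smoothed version of the maximum likelihood estimate so that the target stays strictly inside (0, 1). All function and parameter names are illustrative.

```python
import math
import random


def compound_target(p_a, p_b, w):
    """Hypothetical compound target: w * ethical + (1 - w) * inferential."""
    ethical = (1.0 - p_b) / ((1.0 - p_a) + (1.0 - p_b))  # play-the-winner limit
    s_a = math.sqrt(p_a * (1.0 - p_a))
    s_b = math.sqrt(p_b * (1.0 - p_b))
    inferential = s_a / (s_a + s_b)                       # Neyman allocation
    return w * ethical + (1.0 - w) * inferential


def hu_zhang_allocation(x, rho, gamma=2.0):
    """Probability of assigning A, given current proportion x on A and target rho."""
    if x <= 0.0:
        return 1.0
    if x >= 1.0:
        return 0.0
    a = rho * (rho / x) ** gamma
    b = (1.0 - rho) * ((1.0 - rho) / (1.0 - x)) ** gamma
    return a / (a + b)


def simulate(n, p_a, p_b, w, gamma=2.0, seed=0):
    """Allocate n patients sequentially; return (patients, successes) per arm."""
    rng = random.Random(seed)
    true_p = (p_a, p_b)
    n_arm = [0, 0]
    s_arm = [0, 0]
    for k in range(n):
        # Smoothed estimates (assumption): keep estimates away from 0 and 1.
        est = [(s_arm[j] + 1) / (n_arm[j] + 2) for j in (0, 1)]
        rho = compound_target(est[0], est[1], w)  # uses past responses
        x = n_arm[0] / k if k > 0 else 0.5        # uses past design history
        arm = 0 if rng.random() < hu_zhang_allocation(x, rho, gamma) else 1
        n_arm[arm] += 1
        s_arm[arm] += int(rng.random() < true_p[arm])
    return n_arm, s_arm


n_arm, s_arm = simulate(n=200, p_a=0.7, p_b=0.4, w=0.5)
print(n_arm, s_arm)
```

The two sources of adaptation are visible in the loop: the target is re-estimated from the responses observed so far, and the assignment probability also depends on the current allocation proportion. Replacing the constant `w` by a function of the current estimates would give a simple instance of the adaptive-weight extension mentioned above.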
15 Years of Joint Research with Anthony C. Atkinson
M. Riani(1), A. Cerioli(1), F. Laurini(1) and A. Corbellini(2)
of this talk is to summarize joint research between the
Recent developments in this field of research concern three topics: new tools for identifying the number of clusters in complex data and for confirming their cluster membership (Atkinson, Riani and Cerioli, 2006; Atkinson and Riani, 2007); new theoretical arguments for constructing the envelopes of the statistics during the forward search (Atkinson and Riani, 2006); and an automatic procedure for outlier detection (Riani, Atkinson and Cerioli, 2007), which takes simultaneity into account, provides tests with good size and shows better power than traditional methods such as MCD.
One current direction of joint research with Anthony concerns a modification of AIC for robust model selection in regression (Atkinson and Riani, 2008) and for time series. A second is an automatic procedure for robust cluster analysis which does not necessarily force all units to be clustered and can cope with highly asymmetric data and/or highly overlapping density regions.
Atkinson, A.C. and Riani, M. (2000). Robust Diagnostic Regression Analysis.
Atkinson, A.C. and Riani, M. (2002). Forward search added variable t tests and the effect of masked outliers on model selection. Biometrika, 89, 939-946.
Atkinson, A.C. and Riani, M. (2006). Distribution theory and simulations for tests of outliers in regression. Journal of Computational and Graphical Statistics, pp. 1-17.
Atkinson, A.C. and Riani, M. (2007). Exploratory tools for clustering multivariate data. Computational Statistics and Data Analysis. (doi:10.1016/j.csda.2006.12.034).
Atkinson, A.C. and Riani, M. (2008). A robust and diagnostic information criterion for selecting regression models. Journal of the Japanese Statistical Society. To appear.
Atkinson, A.C., Riani, M. and Cerioli, A. (2004). Exploring Multivariate Data with the Forward Search.
Atkinson, A.C., Riani, M. and Cerioli, A. (2006). Random start forward searches with envelopes for detecting clusters in multivariate data. In: Zani, S., Cerioli, A., Riani, M. and Vichi, M. (eds.), Data Analysis, Classification and the Forward Search.
Riani, M., Atkinson, A.C. and Cerioli, A. (2007). Results in finding an unknown number of multivariate outliers in large data sets. Research Report 140, LSE, Department of Statistics.
Quantiles, Expectiles and Splines
G. De Rossi and A. Harvey
A time-varying quantile can be fitted to a sequence of observations by formulating a time series model for the corresponding population quantile and iteratively applying a suitably modified state space signal extraction algorithm. It is shown that such time-varying quantiles satisfy the defining property of fixed quantiles in having the appropriate number of observations above and below. Expectiles are similar to quantiles except that they are defined by tail expectations. Like quantiles, time-varying expectiles can be estimated by a state space signal extraction algorithm and they satisfy properties that generalize the moment conditions associated with fixed expectiles. Time-varying quantiles and expectiles provide information on various aspects of a time series, such as dispersion and asymmetry, while estimates at the end of the series provide the basis for forecasting. Because the state space form can handle irregularly spaced observations, the proposed algorithms can be easily adapted to provide a viable means of computing spline-based non-parametric quantile and expectile regressions.
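For reference, the moment condition for a fixed expectile can be made concrete in a few lines. The sketch below covers only the fixed (not time-varying) case and is not the state space algorithm of the talk: it computes the sample tau-expectile by iteratively reweighted averaging, giving weight tau to observations above the current value and 1 - tau to the rest. The function names are illustrative.

```python
def expectile(ys, tau, tol=1e-12, max_iter=1000):
    """Sample tau-expectile by iteratively reweighted averaging."""
    mu = sum(ys) / len(ys)  # the 0.5-expectile is the mean, a natural start
    for _ in range(max_iter):
        weights = [tau if y > mu else 1.0 - tau for y in ys]
        new_mu = sum(w * y for w, y in zip(weights, ys)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


def moment_gap(ys, mu, tau):
    """tau * sum of (y - mu)+ minus (1 - tau) * sum of (mu - y)+; zero at the expectile."""
    above = sum(y - mu for y in ys if y > mu)
    below = sum(mu - y for y in ys if y <= mu)
    return tau * above - (1.0 - tau) * below


data = [1.0, 2.0, 3.0, 4.0, 10.0]
mu_90 = expectile(data, 0.9)
print(mu_90, moment_gap(data, mu_90, 0.9))
```

The analogous defining property for a fixed quantile is a count rather than a weighted sum: a proportion tau of the observations lies at or below it.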
An Optimal Scanning Sensor Activation Policy for Parameter Estimation of Distributed Systems
A method is developed to solve an optimal node activation problem in sensor networks whose measurements are to be used to estimate unknown parameters of the underlying process model, which takes the form of a partial differential equation. Given a partition of the observation horizon into a finite number of consecutive intervals, the problem is set up to select nodes which will be active over each interval while the others remain dormant. The optimal solution maximizes the log-determinant of the resulting Fisher information matrix associated with the estimated parameters. The search for this solution is performed using a branch-and-bound method in which an extremely simple and efficient technique is employed to produce an upper bound on the maximum of the objective function. The idea is to solve a relaxed problem through the application of a simplicial decomposition algorithm in which the restricted master problem is solved using a multiplicative algorithm for D-optimal design. Additional insight into the performance of the technique is provided by the results of simulations.
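The multiplicative algorithm mentioned above has a compact generic form, sketched here on a toy problem rather than on the sensor network setting. The sketch assumes a two-parameter linear model with regressors f(x) = (1, x) on a finite candidate set, and it covers only the plain weight update on the probability simplex; the branch-and-bound search, the relaxation and the simplicial decomposition are not shown. Each weight is multiplied by d(x_i, w) / p, where d(x, w) = f(x)^T M(w)^{-1} f(x), M(w) is the information matrix and p is the number of parameters.

```python
def d_optimal_weights(points, n_iter=200):
    """Multiplicative algorithm for D-optimality with regressors f(x) = (1, x)."""
    p = 2  # number of parameters
    w = [1.0 / len(points)] * len(points)
    for _ in range(n_iter):
        # Information matrix M(w) = sum_i w_i f(x_i) f(x_i)^T, stored entrywise.
        m00 = sum(w)
        m01 = sum(wi * x for wi, x in zip(w, points))
        m11 = sum(wi * x * x for wi, x in zip(w, points))
        det = m00 * m11 - m01 * m01
        # d(x, w) = f(x)^T M(w)^{-1} f(x), using the explicit 2x2 inverse.
        d = [(m11 - 2.0 * m01 * x + m00 * x * x) / det for x in points]
        # The update keeps the weights summing to one, since sum_i w_i d_i = p.
        w = [wi * di / p for wi, di in zip(w, d)]
    return w


candidates = [-1.0, -0.5, 0.0, 0.5, 1.0]
weights = d_optimal_weights(candidates)
print([round(wi, 4) for wi in weights])
```

For this model the D-optimal design on [-1, 1] is known to put equal weight on the two end points, so the weights of the interior candidates shrink towards zero as the iteration proceeds.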