Statistical Models, Design and Data Analysis: A Conference to Celebrate Anthony Atkinson's 70th Birthday
ABSTRACTS and SLIDES
Design of Dose-escalation Trials
R. A. Bailey
Queen Mary,
ABSTRACT
In one form of dose-escalation trial, several cohorts of subjects are recruited.
Each cohort takes part at a different time period. The doses are ordinally
labelled 0, 1, ..., where 0 denotes placebo. Because higher doses may have more
adverse side-effects, no subject can be exposed to dose i until some information is obtained about the effect of dose i-1.
One possibility is to use dose i for everyone in cohort i. Then there is
no blinding; moreover, dose effects are completely confounded with cohort
effects and period effects. A modification of this uses a certain number of
placebo subjects in each cohort. If there are no cohort effects then the
proportion of placebo in each cohort should be such that the design is
equireplicate if it proceeds to the planned largest dose. If there are cohort
effects, then more precise comparisons between doses can be made if half of
each cohort receives placebo.
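As a rough illustration of the replication argument above, the sketch below builds the placebo-augmented design under one possible reading: cohort i receives dose i plus some placebo subjects, all cohorts have the same size, and the trial runs to the planned largest dose. The function names, the cohort size and the number of cohorts are illustrative assumptions, not details taken from the talk.

```python
from collections import Counter
from fractions import Fraction


def placebo_design(n_cohorts, cohort_size, n_placebo):
    """Assumed layout: cohort i (i = 1..n_cohorts) has n_placebo subjects
    on dose 0 (placebo) and the remaining subjects on dose i."""
    return {
        cohort: [0] * n_placebo + [cohort] * (cohort_size - n_placebo)
        for cohort in range(1, n_cohorts + 1)
    }


def replications(design):
    """Number of subjects receiving each dose over the whole trial."""
    return Counter(dose for doses in design.values() for dose in doses)


def equireplicate_placebo_fraction(n_cohorts):
    """With n cohorts of size m and p placebo subjects per cohort, placebo is
    used n*p times and each active dose m - p times; these agree when
    p/m = 1/(n + 1)."""
    return Fraction(1, n_cohorts + 1)


if __name__ == "__main__":
    n, m = 3, 8                                        # hypothetical trial size
    p = int(equireplicate_placebo_fraction(n) * m)     # 2 placebo subjects per cohort
    print(replications(placebo_design(n, m, p)))       # every dose used 6 times
    print(replications(placebo_design(n, m, m // 2)))  # half of each cohort on placebo
```

Under these assumptions the equireplicate proportion of placebo is 1/(n+1) per cohort, whereas the half-placebo variant replicates placebo far more often than any single active dose.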
I shall discuss a new design that does at least as well as both of these, whether
or not there are cohort effects, and whether or not within-cohort information
is combined with between-cohort information.
Dimension Reduction Paradigms for Regression
R. D. Cook
ABSTRACT
Dimension reduction for regression, represented primarily by principal components, is
ubiquitous in the applied sciences. This is an old idea that has moved to a
position of prominence in recent years because technological advances now allow
scientists to routinely formulate regressions in which the number p of predictors is considerably larger
than in the past. Although large p
regressions are perhaps mainly responsible for renewed interest, dimension
reduction methodology can be useful regardless of the size of p.
Starting with a little history and a definition of
sufficient reductions, we will consider a variety of models for dimension
reduction in regression. The models start from one in which maximum likelihood
estimation produces principal components, step along a few incremental
expansions, and end with forms that have the potential to improve on some
standard methodology. This development provides remedies for two concerns that
have dogged principal components in regression: principal components are
typically computed from the predictors alone and then do not make apparent use
of the response, and they are not invariant under full rank linear
transformation of the predictors.
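The lack of invariance mentioned above can be seen in a small numerical sketch. The simulated predictors, the rescaling matrix A and the function names are all made up for illustration; the point is only that the first principal component of X and that of XA can select quite different reductions, and that the response plays no part in either.

```python
import numpy as np


def first_pc_scores(X):
    """Scores on the first principal component of the predictors X (n x p)."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    return Xc @ eigvecs[:, -1]               # direction of largest variance


def pc_score_correlation(n=500, seed=0):
    """Absolute correlation between first-PC scores before and after a
    full-rank linear transformation of the predictors."""
    rng = np.random.default_rng(seed)
    # Two independent predictors with standard deviations 2 and 1 (made-up data).
    X = rng.normal(size=(n, 2)) * np.array([2.0, 1.0])
    A = np.diag([1.0, 10.0])                 # full-rank rescaling of the second predictor
    s_original = first_pc_scores(X)
    s_transformed = first_pc_scores(X @ A)
    return abs(np.corrcoef(s_original, s_transformed)[0, 1])


if __name__ == "__main__":
    # An invariant reduction would give a value of 1 here.
    print(pc_score_correlation())
```

In this toy setting the first component follows the first predictor before rescaling and the second predictor afterwards, so the two sets of scores are nearly uncorrelated.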
Penalized Designs of Multi-response Experiments
V. V. Fedorov
ABSTRACT
The major theme of the
presentation is optimal design of dose-response experiments when several
potentially correlated responses are observed simultaneously for every
experimental unit. Although the exposition is built around two endpoints, the
generalization to higher dimensions is also discussed. I also address some
ethical aspects of dose-response experiments. In the traditional optimal design
setting one tries to gain as much information as possible without explicit
concern for the patients in the trial, i.e. doing what is best for the targeted
population (collective ethics). The currently popular procedures gravitate towards
individual ethics: doing what is best (according to current knowledge) for a
newly arriving patient. To compromise between these two extremes we maximize
information per unit of a penalty, which depends on efficacy and toxicity.
Necessary and sufficient conditions, algorithms and software are developed and
discussed for locally optimal, composite and adaptive designs.
Doubly Adaptive Designs for Ethical Allocation of Treatments in Clinical Trials
A. Giovagnoli
ABSTRACT
A large class of designs for clinical trials in the
statistical literature arises as follows: an optimality criterion is defined; a
target allocation is found, namely an allocation of the treatments to be
compared which appears to be satisfactory according to the prescribed
criterion; and an experiment is identified which approaches the target. Usually
the chosen design is tested by means of simulations.
The nature of the
criterion may vary. For instance, from an ethical point of view one may want to
minimize the number of patients that are allocated to inefficient treatments, or the expected number of failures, or the total
number of patients in the trial; on the other hand, sound statistical
inference calls for maximizing the precision of the statistical tools,
e.g. the estimation of treatment differences or the power of the usual
statistical tests. These objectives often clash, and the conflict between ethics and
information is one of the main problems of clinical experimentation.
Furthermore, whether the criterion be ethical or inferential, the derived
target allocation in general depends on the unknown parameters of the
statistical models and thus is unknown.
A possible solution is to proceed sequentially, steering the assignments towards the
unknown target as we go along. Sequential experiments are said to be adaptive
if the observed responses are used to modify the experiment along the way; it seems
natural to call doubly adaptive all the designs in which the past design
history is taken into account as well.
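To make the distinction concrete, here is a minimal simulation sketch of one doubly adaptive scheme for two treatments with binary responses. It is not the procedure proposed in the talk. It assumes a square-root target allocation and an allocation function of the Hu and Zhang type with tuning parameter gamma. The burn-in length, the smoothing of the success-rate estimates and all function names are illustrative assumptions.

```python
import numpy as np


def target_allocation(p_a, p_b):
    """Assumed target proportion on treatment A (square-root rule)."""
    return np.sqrt(p_a) / (np.sqrt(p_a) + np.sqrt(p_b))


def allocation_probability(x, rho, gamma=2.0):
    """Probability of assigning A, given the current proportion x already on A
    (past design history) and the estimated target rho (past responses)."""
    if x <= 0.0:
        return 1.0
    if x >= 1.0:
        return 0.0
    a = rho * (rho / x) ** gamma
    b = (1.0 - rho) * ((1.0 - rho) / (1.0 - x)) ** gamma
    return a / (a + b)


def simulate_trial(p_a, p_b, n_patients, burn_in=10, seed=0):
    """Return the final proportion of patients assigned to treatment A."""
    rng = np.random.default_rng(seed)
    true_p = (p_a, p_b)
    n_arm = np.zeros(2)
    successes = np.zeros(2)
    for j in range(n_patients):
        if j < burn_in:
            arm = j % 2                      # alternate treatments to start
        else:
            # Success-rate estimates, nudged away from 0 and 1 (an assumption
            # made here so that the target is always well defined).
            p_hat = (successes + 0.5) / (n_arm + 1.0)
            rho_hat = target_allocation(p_hat[0], p_hat[1])
            x = n_arm[0] / j
            arm = 0 if rng.random() < allocation_probability(x, rho_hat) else 1
        n_arm[arm] += 1
        successes[arm] += rng.random() < true_p[arm]
    return n_arm[0] / n_patients


if __name__ == "__main__":
    print(simulate_trial(0.7, 0.4, 2000), target_allocation(0.7, 0.4))
```

The design is adaptive because rho_hat depends on the observed responses, and doubly adaptive because the assignment probability also depends on x, the allocation proportion reached so far.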
After a review of the properties of existing adaptive
and doubly adaptive designs, this talk puts forward a
class of compound criteria that allow one to choose the specific weights of the
two components, ethics and information. A related target is derived, and a
class of suitable doubly adaptive procedures is introduced, based on sequential
maximum likelihood estimation, which can be proved to be asymptotically
optimal.
The relative weights will be hard to decide in advance, especially when they themselves appear to
depend on the state of nature. The method can be extended to include a
compromise function with adaptive weights, and a design that at each step
readjusts the relative importance of ethics and information on the basis of the
knowledge acquired up to that time.
15 Years of Joint Research with Anthony C. Atkinson
M. Riani(1), A. Cerioli(1), F. Laurini(1)
and A. Corbellini(2)
(1)
ABSTRACT
The purpose of this talk is to summarize joint research between the
Recent developments in this field of research concern the introduction of new tools
for identifying the number of clusters in complex data and for confirming their
cluster membership (Atkinson, Riani and Cerioli, 2006; Atkinson and Riani,
2007), new theoretical arguments for constructing the envelopes of the
statistics during the forward search (Atkinson and Riani, 2006) and an
automatic procedure for outlier detection which takes simultaneity into account
(Riani, Atkinson and Cerioli, 2007), provides tests with good size and shows
better power than traditional methods such as the MCD.
One current direction of joint research with Anthony
concerns a modification of AIC for robust model selection in regression
(Atkinson and Riani, 2008) and for time series. A second one is an automatic
procedure for robust cluster analysis which does not necessarily force all
units to be clustered and can cope with highly asymmetric data and/or
highly overlapping density regions.
Bibliography
Atkinson, A.C. and Riani, M. (2000). Robust Diagnostic Regression Analysis.
Atkinson, A.C. and Riani, M. (2002). Forward search added variable t tests and the effect of masked outliers on model selection. Biometrika, 89, pp. 939-946.
Atkinson, A.C. and Riani, M. (2006). Distribution theory and simulations for tests of outliers in regression. Journal of Computational and Graphical Statistics, pp. 1-17.
Atkinson, A.C. and Riani, M. (2007). Exploratory tools for clustering multivariate data. Computational Statistics and Data Analysis (doi:10.1016/j.csda.2006.12.034).
Atkinson, A.C. and Riani, M. (2008). A robust and diagnostic information criterion for selecting regression models. Journal of the Japanese Statistical Society. To appear.
Atkinson, A.C., Riani, M. and Cerioli, A. (2004). Exploring Multivariate Data with the Forward Search.
Atkinson, A.C., Riani, M. and Cerioli, A. (2006). Random start forward searches with envelopes for detecting clusters in multivariate data. In: Zani, S., Cerioli, A., Riani, M., Vichi, M., editors, Data Analysis, Classification and the Forward Search.
Riani, M., Atkinson, A.C. and Cerioli, A. (2007). Results in finding an unknown number of multivariate outliers in large data sets. Research Report 140, LSE, Department of Statistics.
Quantiles, Expectiles and Splines
G. De Rossi and A. Harvey
ABSTRACT
A time-varying quantile can be fitted to a sequence of observations by
formulating a time series model for the corresponding population quantile and
iteratively applying a suitably modified state space signal extraction
algorithm. It is shown that such time-varying quantiles satisfy the defining
property of fixed quantiles in having the appropriate number of observations
above and below. Expectiles are similar to quantiles except that they are
defined by tail expectations. Like quantiles, time-varying expectiles can be
estimated by a state space signal extraction algorithm and they satisfy
properties that generalize the moment conditions associated with fixed
expectiles. Time-varying quantiles and expectiles provide information on
various aspects of a time series, such as dispersion and asymmetry, while
estimates at the end of the series provide the basis for forecasting. Because
the state space form can handle irregularly spaced observations, the proposed
algorithms can be easily adapted to provide a viable means of computing
spline-based non-parametric quantile and expectile regressions.
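For readers unfamiliar with expectiles, the following sketch computes a fixed (not time-varying) sample expectile and checks the moment condition that defines it. It uses plain iterated weighted means, i.e. asymmetric least squares, rather than the state space signal extraction algorithm of the talk; the function names and the simulated data are illustrative assumptions.

```python
import numpy as np


def sample_expectile(y, omega, tol=1e-12, max_iter=1000):
    """Fixed sample expectile for 0 < omega < 1: observations above the current
    value get weight omega, the others weight 1 - omega, and the value is
    updated to the weighted mean until it stops changing."""
    y = np.asarray(y, dtype=float)
    mu = y.mean()
    for _ in range(max_iter):
        w = np.where(y > mu, omega, 1.0 - omega)
        new_mu = np.sum(w * y) / np.sum(w)
        if abs(new_mu - mu) < tol:
            return float(new_mu)
        mu = new_mu
    return float(mu)


def expectile_moment_gap(y, mu, omega):
    """omega * (sum of deviations above mu) minus
    (1 - omega) * (sum of deviations below mu); zero at the expectile."""
    y = np.asarray(y, dtype=float)
    above = np.sum(np.clip(y - mu, 0.0, None))
    below = np.sum(np.clip(mu - y, 0.0, None))
    return float(omega * above - (1.0 - omega) * below)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    y = rng.normal(size=200)                 # made-up data
    for omega in (0.1, 0.5, 0.9):
        mu = sample_expectile(y, omega)
        print(omega, mu, expectile_moment_gap(y, mu, omega))
```

With omega = 0.5 the expectile is the sample mean, in the same way that the 0.5 quantile is the median; other values of omega weight the two tails asymmetrically.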
An Optimal Scanning Sensor Activation Policy for Parameter Estimation of Distributed Systems
D. Uciński
ABSTRACT
A method is developed to solve an optimal node activation problem
in sensor networks whose measurements are used to estimate
unknown parameters of an underlying process model in the form of a
partial differential equation. Given a partition of the observation horizon
into a finite number of consecutive intervals, the problem is set up to select
nodes which will be active over each interval while the others will remain
dormant. The optimal solution maximizes the log-determinant of the resulting
Fisher information matrix associated with the estimated parameters. The search
for this solution is performed using a branch-and-bound method in which an
extremely simple and efficient technique is employed to produce an upper bound
on the maximum of the objective function. The idea is to solve a relaxed problem
through the application of a simplicial decomposition algorithm in which the
restricted master problem is solved using a multiplicative algorithm for
D-optimal design. Additional insight on the performance of the technique is
provided by the results of simulations.
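As background for the last step, the sketch below shows a multiplicative algorithm for D-optimal design weights in its common textbook form, where each weight is rescaled by its variance function divided by the number of parameters. It is applied to a toy quadratic regression on five candidate points, not to a sensor network or a partial differential equation model, and it leaves out the branch-and-bound and simplicial decomposition layers entirely. The function names, the candidate points and the iteration count are illustrative assumptions.

```python
import numpy as np


def multiplicative_d_optimal(F, n_iter=500):
    """F is an (n_candidates x m) matrix whose rows are regressor vectors f(x_i).
    Returns design weights on the candidates after n_iter multiplicative updates."""
    n, m = F.shape
    w = np.full(n, 1.0 / n)                       # start from the uniform design
    for _ in range(n_iter):
        M = F.T @ (w[:, None] * F)                # information matrix of the design
        # Variance function d_i = f_i' M^{-1} f_i for every candidate point.
        d = np.einsum("ij,jk,ik->i", F, np.linalg.inv(M), F)
        w = w * d / m                             # weights keep summing to one
    return w


def log_det_information(F, w):
    """Log-determinant of the information matrix, the D-optimality criterion."""
    sign, logdet = np.linalg.slogdet(F.T @ (w[:, None] * F))
    return float(logdet)


if __name__ == "__main__":
    x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])     # hypothetical candidate points
    F = np.vander(x, 3, increasing=True)          # rows (1, x, x^2)
    w = multiplicative_d_optimal(F)
    print(np.round(w, 4), log_det_information(F, w))
```

For this toy model the weights settle on the three points -1, 0 and 1 with equal mass, which is the known D-optimal design for quadratic regression on [-1, 1].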