
From Biometrics Bulletin, 2001

CLARK, I. and HARPER, W.V. Practical Geostatistics 2000. Ecosse North America Llc, Columbus, Ohio, 2000. xii + 342 pp. $60.00 (with CD $100.00). ISBN 0-9703317-0-3.

As one major part of spatial statistics, geostatistics has a central question: ‘how can we estimate the likely value at an unsampled location given measured values at neighbouring sampled locations?’ (p. 247 of this book). That question is not only statistically engaging, but often eminently practical: in mining, knowing where to drill and not to drill is the key to success. Good answers to the question have been known for some decades: most simply, use a weighted average, but choose weights that reflect the structure of spatial variation, especially the pattern of local dependence. This and other modern techniques have multiple roots, including South African mining geology and French mathematical statistics, although not surprisingly Kolmogorov worked on the problem several years before most other researchers.
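The idea of estimating by weighted average can be illustrated with a minimal sketch. It uses inverse-distance weights, which are an assumption made here for simplicity and are not kriging: kriging would instead derive the weights from a fitted model of spatial dependence. The function name `idw_estimate`, the `power` parameter and the sample values are all hypothetical.

```python
from math import hypot


def idw_estimate(samples, target, power=2.0):
    """Estimate the value at `target` as an inverse-distance weighted average.

    samples: list of ((x, y), value) pairs at sampled locations.
    target:  (x, y) of the unsampled location.
    power:   exponent applied to distance (an illustrative choice).
    """
    weighted = []
    for (x, y), value in samples:
        distance = hypot(x - target[0], y - target[1])
        if distance == 0:
            # The target coincides with a sampled location: return its value.
            return value
        weighted.append((1.0 / distance ** power, value))
    total_weight = sum(w for w, _ in weighted)
    return sum(w * v for w, v in weighted) / total_weight


# Hypothetical grades measured at two locations.
samples = [((0.0, 0.0), 10.0), ((2.0, 0.0), 20.0)]
midpoint_estimate = idw_estimate(samples, (1.0, 0.0))
```

At a point equidistant from both samples the weights are equal, so the estimate is the plain mean of the two values.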

Many if not most of the people who are interested in such techniques are non-statisticians, and there is a continuing need for texts at all levels. Isobel Clark and William Harper’s self-published book is a substantial revision of a shorter text published by Clark (1979), which received a mixed review in this journal from Sibson (Biometrics, 36, 743, 1980). Like its predecessor it aims to take its readers from no statistical knowledge to the rudiments of geostatistics. The audience is presumed competent in ordinary algebra with modest calculus and some matrix algebra: this mix evidently fits people like mining engineers, although I suspect that it is atypical of non-statisticians who might be interested in these techniques.

Almost all of the authors’ experience appears to be in mining geology, which provides most of their examples, although there are some from ecology and environmental science. They use ‘sample’ in the field scientist’s sense of ‘lump of material which is analysed’: thus ‘number of samples’ is what in statistics is normally ‘number in sample’. Only halfway through the text are spatial aspects touched on directly, after chapters ranging from basic summary measures and graphs to regression. The chapters on the simplest kinds of kriging (weighted averaging done smartly) are the most successful, although the sometimes very slow step-by-step explanations would have been better followed by terse end-of-chapter summaries, and many statistical readers might regret the authors’ advertised aversion to matrices (p.275). There are many detailed worked examples, and even those already familiar with geostatistical techniques may find useful hints on explaining them to others. Particularly welcome is the advice from hard-earned experience, as on p.215, which presents a useful list of cautionary notes. The last chapter pointing to more advanced techniques is, however, too compressed to be very useful.

Unfortunately, there are also many questionable sections and some outright errors. The authors make much more use of skewness and kurtosis than is common, without adequately warning of the many difficulties in interpreting such measures from small samples (or even large ones). At one point we are told that the skewness is positive ‘because the data is skewed to the lower values’ (p.21) and other similarly confusing statements are made. Fitting a three-parameter lognormal or Sichel's compound Poisson distribution to small samples I would classify as dangerous even for experts: it is surprising to see these tasks posed in several exercises, while fitting a lognormal to a counted variable with many zeros requires more comment (pp.86, 88). Using n-1 as divisor to get unbiased estimates of variance from a sample of n does not guarantee unbiased estimates of standard deviation, contrary to a repeated assertion. Depending on parameter values, the lognormal may be more skewed than the Weibull (p.86). To clarify some vexed historical questions: Gauss did not discover the Gaussian (p.31), and Poisson did not discover the Poisson (p.113) (de Moivre was there earlier in both cases), nor did Snedecor discover the F distribution (his notation honours Fisher) (p.138).
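The point about divisors can be checked by exact enumeration rather than simulation. The sketch below assumes a hypothetical population of two equally likely values, 0 and 1, and lists every possible sample of size 2; the population and variable names are illustrative only. Averaged over all samples, the variance with divisor n-1 matches the population variance, while the corresponding standard deviation falls short of the population standard deviation.

```python
from itertools import product
from statistics import mean, pstdev, pvariance, stdev, variance

# Hypothetical population: two equally likely values.
population = [0, 1]
n = 2

# Every ordered sample of size n drawn with replacement, each equally likely.
samples = list(product(population, repeat=n))

# statistics.variance and statistics.stdev both use the n-1 divisor.
mean_sample_variance = mean(variance(s) for s in samples)
mean_sample_sd = mean(stdev(s) for s in samples)

population_variance = pvariance(population)
population_sd = pstdev(population)
```

The square root is a concave function, so the average of the square roots is less than the square root of the average whenever the sample variance actually varies.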

The graphics in the book are in many respects not state of the art. Most crucially, probability plots of observed quantiles versus normal quantiles use for the latter a probability of rank/n, so that the highest value for probability 1 is unplottable. Thus the highest observed value is never shown, especially alarming whenever, as often, it is an outlier. Some standard plotting position such as (rank-0.5)/n should have been used. Despite much effort, I have never found histograms with bars superimposed or juxtaposed, which are used in several examples, as effective for comparing distributions as quantile-quantile plots. More cosmetically, point symbols on plots are often coarse and filled and thus obliterate their neighbours unnecessarily; there is overuse of vibrating line shading and stippling; and many vertical axis titles are spelled out character by character.
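The difference between the two plotting positions can be seen in a short sketch using Python's standard library. The function names and the `offset` parameter are hypothetical; an offset of 0 gives rank/n, and an offset of 0.5 gives (rank-0.5)/n.

```python
from statistics import NormalDist, StatisticsError


def normal_quantiles(n, offset=0.5):
    """Standard normal quantiles at plotting positions (rank - offset) / n."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf((rank - offset) / n) for rank in range(1, n + 1)]


def highest_value_plottable(n, offset):
    """Report whether the largest of n values has a finite normal quantile."""
    probability = (n - offset) / n
    try:
        NormalDist().inv_cdf(probability)
    except StatisticsError:
        # inv_cdf rejects probabilities outside the open interval (0, 1).
        return False
    return True


quantiles = normal_quantiles(10)              # positions 0.05, 0.15, ..., 0.95
rank_over_n_ok = highest_value_plottable(10, 0.0)
rank_minus_half_ok = highest_value_plottable(10, 0.5)
```

With rank/n the top rank maps to probability 1, which has no finite normal quantile, so the highest observation cannot be placed; with (rank-0.5)/n every observation gets a position, and the positions are symmetric about zero.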

The expectation is that readers are most likely to be using a spreadsheet, although many computer illustrations were produced using the authors’ own software: a demo for Geostokos Toolkit is included on an accompanying CD. There is no comment on what is best for the much bigger data sets said to be common in mining geology (‘Databases in RSA gold mines can run to millions of samples’: p.327).

A great advantage of publishing a book oneself is presumably freedom from the interference of publishers, referees or copy-editors with their own ideas of what is proper and correct. Independence allows the authors to exhibit their own broad senses of humour, and a breezy, jokey exuberance pervades, which will not be to all tastes. More seriously, outsiders do normally impart some quality control, often lacking here, not only in the mistakes and dubious content noted already, but also in several mangled or missing references, hundreds of typos (mostly trivial, to be sure) and occasionally messy typesetting of equations.

On the back cover, the authors warn that their book is ‘not intended for specialist mathematicians or statisticians’, who indeed would be better advised to consult some existing standard text such as that by Cressie (1993). Neither is this the book I would recommend first even to those in its target audience. Among various possibilities, Webster and Oliver (1990) is a well-balanced guide from authors who have statistical expertise and much experience with spatial data and in interacting with non-statisticians.

REFERENCES

Clark, I. (1979) Practical Geostatistics. London: Applied Science Publishers.

Cressie, N.A.C. (1993) Statistics for Spatial Data. New York: John Wiley.

Webster, R. and Oliver, M.A. (1990) Statistical Methods in Soil and Land Resource Survey. Oxford: Oxford University Press.

N. COX

Department of Geography
University of Durham, UK
