"Does Geostatistics Work?", 16th International
APCOM Symposium, T.J. O'Neil (Ed),
DOES GEOSTATISTICS WORK?
Isobel Clark
Lecturer, Dept. of Mineral Resources Engineering,
Imperial
INTRODUCTION
The problem of the estimation of ore reserves is essentially a simple one to describe. Given a set of sample data at specified locations within a deposit, how can we estimate the value (or quantity) of ore in unsampled areas? The question may be posed of a whole deposit (global), or it may be more desirable to estimate smaller volumes within the planned mine area, e.g. stopes or blocks (local). Many methods have been developed to tackle this problem, from hand-contouring of values by the mine geologist to extremely sophisticated geomathematical analysis.
The main difficulty with any estimation method in this context is that it is almost impossible to check whether the "predicted" value bears a close relationship to the "actual" value. The only reliable way to decide whether an estimation method has "worked" is to test the eventual value of the ore against the predicted value. This is, more often than not, impossible, even allowing for the time factor involved. To know the "actual" value of a mined block, it must be sampled intensively and these values, with their positions, recorded accurately. Alternatively, blocks may be passed through the concentration process individually and the results recorded. In most producing mines neither of these alternatives is practicable or seen to be important. Another problem with comparing (so-called) mineable in-situ reserves with production figures is the introduction of many variables in the concentration process: dilution, wastage, possible stockpiling, mill recoveries and so on.
This paper describes a study undertaken to answer the question "Does Geostatistics Work?". To remove ambiguity, Geostatistics here refers to the practical application of the Theory of Regionalised Variables as developed by Georges Matheron in the 1960s.
Perhaps it would be best to begin by defining the question a little more precisely. What is really meant by "work"? We have (say) a prediction of grade for some given volume. This will differ from the "true" value over that volume: an error is incurred by the estimation procedure. The main selling point of Geostatistics is that, whatever estimator is used, the "standard error" of that estimate can be calculated using the semi-variogram. Our question is, then: does the "standard error" produced by geostatistical theory really reflect the accuracy of the estimate produced? If the theory matches up with practical experience, then we can use geostatistical methods with confidence in other applications. For example, if we know that geostatistics measures the standard error of an estimate accurately, then we can be confident that the estimation technique giving the smallest standard error (Kriging) must be the best one.
This paper attempts to answer the question "Does
geostatistics genuinely measure the accuracy of predicted values?".
CASE STUDY
The Mine
Geevor Tin Mines Ltd. in its present form was registered in 1911, although there is evidence that mining in the Geevor lease area extends well back into the eighteenth century.
The Data
The study described in this paper was carried out on the Simms Lode of Geevor Mine. The cassiterite vein is almost vertical, and its total developed length (as of 1976) was just under 7,000 feet in a north-west/south-east direction. The lode lies about a kilometre from Victory Shaft, and comprises nine developed levels from 600 feet below surface to 1,400 feet. The cassiterite vein is divided into several major sections by "guides" -- quartz-filled faults -- and minor branchings. The area selected for this detailed study is in the central portion of the lode, and it has been shown by other studies (geological and statistical) that this area may be considered as one continuous unit. Figure 1 shows a section drawing of the study area. The vein is narrow enough and continuous enough to be treated as a reasonably flat plane in two dimensions, and all sections are drawn on this understanding. The different modes of shading merely denote the years in which the data were added to the continuing study. Although some development was available on level 14 (1,400 feet below surface), it has been omitted for simplicity.

The lode is sampled by chipping across the vein (but not the gangue) in the hangingwall of the drive or the stope. The vein averages about two feet wide, and the normal stoping width is forty inches. The bagged sample is returned to surface for vanning assay, which produces a value for 'content of recoverable cassiterite', expressed in pounds of cassiterite per ton of ore (lb/ton). The width of the lode is measured in inches on site. Prior to March 1971 development drives were sampled at five-foot intervals; this was changed to ten feet, and in 1976 to three metres.
In the study area all development, except on level 6 and minor westward extensions on levels 10, 11 and 12, was completed prior to March 1971. Thus virtually all drive sampling is available at five-foot intervals. Stopes are usually sampled at fortnightly intervals, which produces a very approximate grid of samples within the stoped area. Figure 2 shows an enlarged subsection of levels 12 and 13; each cross denotes a chip sample. Note how closely the drive is sampled compared with the stope.
The sections were originally hand-drawn in the Survey Office at Geevor. For this study each sample was assigned two co-ordinates relative to an arbitrary origin in order to position it on the two-dimensional plane. Figure 2 is actually a computer plot of the sub-area shown. In all there are about 2,700 stope samples and 1,900 drive samples in the study area. It must be assumed that all samples are comparable, i.e. have the same 'support'.

It is common practice to value vein deposits such as this using the "accumulations" of the samples, that is, the assay value multiplied by the width. This has not been done in the study described here: the variable discussed throughout this paper is "grade", or assay value. Justification for this decision may be found in the author's thesis (Clark, 1979).
SEMI-VARIOGRAMS
The semi-variogram is an essential tool in any geostatistical analysis. It provides a graphical and numerical measure of the "continuity" of the mineral values within the deposit. In a normal reserve estimation only development samples would be available to construct the "experimental" semi-variograms. This reduces the calculation to a very simple one, since the sampling is done at regular intervals along each drive, at either five or ten feet, except for the occasional cross-cut from the shaft. Experimental values can be calculated for any multiple of five feet; a decision has to be made on how far to carry this calculation. For various reasons, the experimental semi-variograms were produced for distances up to 250 feet on each level of the lode separately. At first glance there seems to be little or no correlation between the assay values along the drives, and there are vast differences between levels in the lode.
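For a single drive sampled at a regular spacing, the experimental semi-variogram reduces to half the mean squared difference between samples a given number of spacings apart. The following is a minimal sketch of that calculation (in Python, not the language of the original 1979 programs); it assumes one unbroken run of samples with no gaps, which the real drive data would not always provide.

```python
import numpy as np

def experimental_semivariogram(values, spacing=5.0, max_lag=250.0):
    """Experimental semi-variogram g*(h) for samples at a regular spacing
    along a drive: g*(h) = mean of 0.5 * (v(x) - v(x + h))^2."""
    values = np.asarray(values, dtype=float)
    lags = np.arange(1, int(max_lag / spacing) + 1)
    gamma = np.array([0.5 * np.mean((values[k:] - values[:-k]) ** 2)
                      for k in lags])
    return lags * spacing, gamma
```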
An immediate question arises as to whether these differences between the levels are simply due to "sampling variation" or whether they are due to some inherent structure in the lode. Some authorities (David, Guarascio) have suggested that such differences may be due to the samples following a log-normal type of distribution. In the log-normal distribution the standard deviation of a set of samples is directly proportional to the arithmetic mean of those samples. Thus if the "local mean" shifts slightly, the standard deviation will shift accordingly. Since the sill of a semi-variogram (if it has one) is equal to the variance of the sample values, a shift in the 'local mean' will have an even greater effect on the semi-variogram sill.

Table 1 shows the mean and standard deviation of the sample grades for each level separately. It can easily be seen that the standard deviations rank in the same order as the experimental semi-variograms in Figure 3. A graph of mean versus standard deviation illustrates the proportional effect very well: the correlation coefficient is, in fact, 0.95.
Table 1: Sample means and standard deviations by level.

level   arithmetic mean   standard deviation
  6            34                 34
  7            40                 51
  8            56                 78
  9            46                 48
 10            57                 73
 11            77                 89
 12            79                105
 13            53                 63
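The quoted correlation can be checked directly from the rounded figures in Table 1. A quick calculation gives a value close to the quoted 0.95; the small discrepancy presumably reflects rounding in the tabulated statistics.

```python
import numpy as np

mean = np.array([34, 40, 56, 46, 57, 77, 79, 53])    # Table 1, levels 6-13
sd   = np.array([34, 51, 78, 48, 73, 89, 105, 63])
r = np.corrcoef(mean, sd)[0, 1]
print(round(r, 2))   # about 0.96 from these rounded figures
```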

It would seem from this evidence that the difference between the level semi-variograms is due to the proportional effect, and presumably to the log-normality of the grade distributions. In actual fact the samples are not log-normally distributed, but may be characterised by a mixture of two log-normals, so we must presume that proportionality also holds in this circumstance.

According to the authorities (op. cit.), the next stage in the variogram analysis is to "standardise" each level semi-variogram by the square of the local mean. That is, each calculated semi-variogram is divided by the square of the arithmetic mean of the samples used in its calculation. This was done, and the resulting semi-variograms are shown in Figure 5. The picture is now much more homogeneous: the highest and lowest semi-variograms now differ only by a factor of two, whereas in the "unstandardised" graph the highest and lowest differed by a factor of ten. Since the levels now appear to have comparable structures, we can combine them to form one overall experimental semi-variogram for the whole section, and fit a model for use in the later estimation procedures.
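As a sketch of this standardisation step: each level's experimental semi-variogram is divided by the square of that level's mean, and the levels are then averaged into one curve. Weighting each lag by its number of sample pairs is an assumption here, a common convention rather than anything stated above.

```python
import numpy as np

def standardised_overall(gammas, means, pairs):
    """Combine per-level semi-variograms (dicts keyed by level, each an
    array over common lags) after dividing by the squared level mean;
    the average is weighted by the pair counts at each lag."""
    num = sum(pairs[lev] * gammas[lev] / means[lev] ** 2 for lev in gammas)
    den = sum(pairs[lev] for lev in gammas)
    return num / den
```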

The overall "standardised" experimental
semi-variogram is shown as the dots in Figure 6. The
dashed line is the compound value of the variance divided by the square of the
arithmetic mean. This would be a first
estimate at the value of the sill of the model (if it exists). Attempts to fit one of the simple models for
semi-variograms, i.e. linear, spherical, de Wijsian etc. fail completely. It is clear that a large nugget effect is
present, at a level of about 0.70. Note that this semi-variogram is
dimensionless, since all the values have been divided
by the square of the mean grade. Thus this nugget variance must be read as "0.70
multiplied by the square of the local mean". The sill clearly exists and takes the value
1.55. However, the curve between the two (intercept and sill) cannot be
approximated by a spherical model. It is
possible to fit a model which is a mixture of two
spherical models. A fitted model is shown as the solid line in Figure 6. It has two spherical
components with ranges of 15 feet and 125 feet respectively, and with sills of
0.60 and 0.25. This implies that samples within 125 feet of one another are
correlated, but to a fairly weak extent. Samples within 15 feet are much more highly
correlated, but about 45% of the total variation in the sample values seems to
be due purely to random behaviour - perhaps a
reflection of the type of mineralisation.
This tends to imply that no matter how many samples are
taken, there will always be a large unpredictable component in whatever
we try to estimate.
The semi-variogram model to be used in the estimation procedure is therefore:

for distances h less than 15 feet:
    g(h) = 0.70 + 0.60 [1.5(h/15) - 0.5(h/15)^3] + 0.25 [1.5(h/125) - 0.5(h/125)^3]

for distances between 15 and 125 feet:
    g(h) = 1.30 + 0.25 [1.5(h/125) - 0.5(h/125)^3]

and for distances greater than 125 feet:
    g(h) = 1.55

with g(0) = 0.
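The same model written as a function, using the standard spherical formula 1.5(h/a) - 0.5(h/a)^3 up to each range, as a small sketch for use in the kriging test below.

```python
import numpy as np

def sph(h, a):
    """Unit-sill spherical structure with range a (feet)."""
    h = np.minimum(np.abs(h), a)
    return 1.5 * (h / a) - 0.5 * (h / a) ** 3

def g(h):
    """Fitted standardised model: nugget 0.70 plus spherical components
    (sill 0.60, range 15 ft) and (sill 0.25, range 125 ft); total sill
    1.55, with g(0) = 0 by convention."""
    h = np.abs(np.asarray(h, dtype=float))
    return np.where(h > 0, 0.70 + 0.60 * sph(h, 15.0) + 0.25 * sph(h, 125.0), 0.0)
```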
TESTING THE ESTIMATION PROCEDURE
Having produced a model for the semi-variogram of this section of the lode, we can now embark upon an estimation procedure. Suppose we wish to estimate a volume which possesses an actual (unknown) grade, say T. Suppose we use a linear estimator, that is, one which is a simple linear combination of the grades of the surrounding samples. We may denote this by T*. The prediction incurs an error equal to T - T*. Using the semi-variogram we can produce a measure of the reliability of the estimator in the form of the "estimation variance", σ_e^2. Since we are using a "standardised" semi-variogram, the estimation variance produced will be "relative to the local mean"; that is, the calculated variance must be multiplied by the square of the local mean before it becomes a true measure of the spread of the errors. The calculated (relative) variance may be denoted by σ_r^2, so that:

    σ_e^2 = (local mean)^2 σ_r^2
The only problem now is to define exactly what is meant by the local mean. The published authorities are not very explicit on this point, but presumably it should be interpreted as the "expected value at this point". This must mean the actual value at that point (or over that volume), i.e. T. Thus the standard error of the estimate T* should be given by

    σ_e = T σ_r

In practice, of course, we would not know the value of T, so the standard error of T* would be estimated by

    σ_e = T* σ_r
To test the estimation, three quantities are needed: T, T* and σ_e. The estimator T* will (according to the theory) be drawn from some kind of probability distribution with mean T and standard deviation σ_e. That is, for any estimation we have one sample from such a distribution. Unless it is possible to carry out several estimations which result in the same standard error, it is difficult to test whether the procedure is "working" or not. Suppose a new variable is proposed:

    z = (T - T*) / σ_e

No matter what kind of distribution T* follows, the z variable will have a distribution with a mean of zero and a standard deviation of 1. Now, if several estimations are carried out, each will yield a z value. If the theory and the method of application are correct, then all of these z values will come from a distribution with mean zero and standard deviation 1. This produces a situation which can easily be tested: carry out several estimations, calculate the z values, and see if they have a mean of zero and a standard deviation of 1.
The Case Study
In the study area under consideration the only "actual" values which we know with reasonable certainty are the measured values at the sample sites within the stopes and development drives. Since the drive samples were used to calculate the experimental semi-variograms, they will be excluded from the testing of the estimation technique. This should ensure an objective and unbiased test, and leaves upwards of 2,700 "known" T values to use in the testing. The next problem is to decide which estimator to use. Any linear estimator may be chosen; in this case the choice was the Kriging estimator. The procedure followed was:
1) Consider each stope sample in turn;
2) Find all other stope samples within a given distance (dx) in the horizontal direction, and within the same stope;
3) Use these stope samples to produce a Kriging estimate and Kriging variance for the "point" under consideration;
4) Correct the Kriging variance for the local mean;
5) Calculate the corresponding z value;
6) Include this z value in the calculation of the mean and standard deviation, and in the histogram.
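A minimal sketch of this six-step loop is given below, using ordinary point kriging under the fitted model (the g() function above). The array names (xy, grades, stope_id) are hypothetical, the "same stope" test is reduced to an identifier match, and the histogram bookkeeping of the original program is replaced by simply collecting the z values.

```python
import numpy as np

def krige(xy, vals, target):
    """Ordinary kriging of one point: solve the kriging system in
    semi-variogram form with a Lagrange multiplier; returns the
    estimate and the kriging variance."""
    n = len(vals)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = g(d)
    A[n, n] = 0.0
    b = np.append(g(np.linalg.norm(xy - target, axis=1)), 1.0)
    w = np.linalg.solve(A, b)
    return w[:n] @ vals, w @ b        # variance = sum(w_i * g_i0) + mu

def z_statistics(xy, grades, stope_id, dx=25.0):
    zs = []
    for i in range(len(grades)):                              # step 1
        near = ((np.abs(xy[:, 0] - xy[i, 0]) <= dx)           # step 2
                & (stope_id == stope_id[i]))
        near[i] = False
        if not near.any() or grades[i] <= 0:
            continue
        est, kvar = krige(xy[near], grades[near], xy[i])      # step 3
        se = grades[i] * np.sqrt(kvar)                        # step 4: local mean = T
        zs.append((grades[i] - est) / se)                     # step 5
    zs = np.array(zs)                                         # step 6
    return zs.mean(), zs.std(), zs
```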
In the computer program written for this analysis the histogram was centred on zero and scaled so that the total range ran from -3 to +3; the individual z values were not stored at all. A decision had to be made on the size of dx: the larger it is, the longer the program takes to run. Mini-tests using level 13 only (726 samples) produced virtually the same output for dx ranging between 25 feet and 65 feet. Distances over 65 feet were considered impracticable, especially since the dominant range of influence is only 15 feet. The final results given in this paper were all calculated for dx equal to 25 feet. In correcting the Kriging variance for the local mean, the "true" value of the stope sample was used, so that
    σ_e = T σ_r
for each stope sample. The histogram of the resulting z values is shown as Figure 7. Negative values of z imply that T* has over-estimated T, and positive values imply under-estimation. For clarity, all samples below -2.75 have been included in the group -3 to -2.75; similarly, all samples greater than +3 are included in the group +3 to +3.25. The mean of the z values is -1.60 and the standard deviation is 2.96. Table 2 shows the means and standard deviations by level, from which it may be seen that these values are consistent over the whole lode.

Table 2: Parameters of transformed errors by level.

level   arithmetic mean   standard deviation
  7         -1.28               3.09
  8         -1.86               2.94
  9         -1.29               2.10
 12         -1.76               3.31
 13         -1.63               3.14
The only conclusion which can be drawn from this histogram and these parameters is that the Kriging variances do not reflect the true accuracy of the estimation procedure. A standard deviation of 3 for the z values implies that (in general) the supposed Kriging standard deviations are too small by a factor of 3; that is, the Kriging variances are a factor of 9 too small. Closer inspection of the histogram allows us to break this conclusion down into more detail. The computer program did not record the extreme values of z, so we do not know how long the negative tail is. However, we can see that the positive side of the graph is truncated at a value of +1. This implies that for samples which have been under-estimated, the Kriging variance given by theory is far too large; for samples which have been over-estimated, the supposed variance is far too small. In other words, we will be vastly over-optimistic about over-estimated samples and rather pessimistic about under-estimated ones. This is not only undesirable but could prove disastrous!
A Retrial
In practice we would not know the value of T which we were trying to estimate, but only T*. Suppose in step (4) above the Kriging variance is corrected by the estimate of the local mean, i.e. T*, so that

    σ_e = T* σ_r
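In terms of the sketch above, this retrial changes only the scaling line in step 4, for example:

```python
se = est * np.sqrt(kvar)    # retrial: local mean taken as the estimate T*
```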
The above procedure was repeated on this basis, and the resulting histogram is shown in Figure 8. The overall mean of the z values is 0.09 and the standard deviation is 1.45. Table 3 gives the level-by-level parameters, again showing that these parameters are consistent throughout.

Table 3: Parameters of 'estimated' transformed errors by level.

level   arithmetic mean   standard deviation
  7          0.22               2.46
  8          0.11               1.47
  9          0.08               1.29
 12          0.06               1.28
 13          0.06               1.17
The picture is a lot more encouraging than the previous one. The mean 'transformed' error is very close to zero, although consistently positive on all levels. The standard deviation is still somewhat larger than the expected 1.00. There is again a truncation, this time at the lower end, and there is still rather a long tail, though not nearly so remarkable as that in Figure 7. We now seem to be over-optimistic about the accuracy of under-estimated samples and vice versa. On the whole, the Kriging standard deviations appear to be under-estimated by about 45%.
This latter analysis, taken in isolation, would seem to support the "working" of the application of geostatistics. However, it is somewhat disturbing when taken in context with the previous analysis. If the theory is correct, then surely the theoretical analysis (with T known) should give a much better picture than the practical analysis (with T unknown). The case study so far has resulted in entirely the opposite situation. Should we therefore conclude that geostatistics does not work according to theory?
AN ALTERNATIVE APPROACH
In the analyses above we have followed the stages of the accepted form of geostatistical analysis. Having calculated "local" experimental semi-variograms and demonstrated that the "proportional effect" was present, we standardised each level by the square of the local mean, combined the individual levels, and emerged with an overall experimental standardised semi-variogram. To this we fitted a model which was 'relative to the local mean'. We then followed the usual Kriging procedure to estimate each stope sample in turn and compared the errors in prediction by means of the z transformation. The conclusion reached was that if we take the actual value of the stope sample (T) as the "local mean", the resulting estimation variances bear very little relationship to reality. If we take the estimated value (T*) in place of the local mean, the resulting estimation variances give an acceptable picture of the reliability of the errors, although the standard deviations are under-estimated by about 45% and the shape of the histogram is decidedly odd.
Suppose we take a rather different approach and return to the original 'raw' experimental semi-variograms. Each level is calculated from several hundred inter-related sample values, and we have illustrated that in each level the sill is directly proportional to the square of the sample mean. Suppose an alternative hypothesis is put forward: that the variation of the mean and standard deviation of the sample grades between the levels is due simply to their being different sets of samples from the same log-normal type of distribution. Under this hypothesis the semi-variograms are not standardised but are combined directly to form one overall semi-variogram -- this might be referred to as the 'raw' experimental semi-variogram. This combined semi-variogram is shown as the dots in Figure 9, with the overall sample variance drawn in as the dashed line.

This graph exhibits a strange zigzag behaviour, which is presumably the proportional effect differentiating between those levels with five-foot sampling and those with ten-foot. A sill appears to be present at about 5,675 (lb/ton)², with a nugget variance of 2,575 (lb/ton)². This nugget effect represents 45% of the total sill value, exactly the same proportion as in the standardised model. Bearing this in mind, a model was proposed with two ranges of 15 and 125 feet, and with sills of 2,200 (39% of the total sill) and 900 (16%) respectively. That is, exactly the same model as before, but with a total sill of 5,675 (lb/ton)² instead of 1.55. This is shown as the solid line in Figure 9, and appears to be a perfectly adequate fit. This, perhaps, supports the alternative hypothesis, since the fitting of essentially the same model would tend to suggest that 'standardisation' is superfluous. However, it can be seen that deriving such a complex model from the zigzag semi-variogram in Figure 9 would have been rather more difficult if we had not had the standardised model to refer to.
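Written out in full from the stated parameters (the standardised model scaled by 5,675/1.55, about 3,660), the 'raw' model is:

for 0 < h < 15 ft:
    g(h) = 2,575 + 2,200 [1.5(h/15) - 0.5(h/15)^3] + 900 [1.5(h/125) - 0.5(h/125)^3]

for 15 <= h < 125 ft:
    g(h) = 4,775 + 900 [1.5(h/125) - 0.5(h/125)^3]

and for h >= 125 ft:
    g(h) = 5,675

all in (lb/ton)², with g(0) = 0.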
Since the models are the same except for a multiplying factor, the estimates produced will be identical; the only difference will be in the estimation variances. Whereas the standardised model gave standard errors directly proportional to the value being estimated, the "raw" model gives standard errors which are independent of the size of either the true value or the estimator. In other words, the former approach assumed that low grades could be estimated more accurately than high ones; the present approach assumes that the accuracy of the estimation depends only on the sample configuration, not the sample grades.

The six-point analysis described previously was repeated with the "raw" model, omitting step (4). The histogram of transformed z values is shown in Figure 10, and possesses a mean of 0.00 and an overall standard deviation of 0.76. Table 4 shows the parameters by level. In this case the mean error on all levels is zero, but the standard deviations vary between 0.60 and 1.10 in no recognisable pattern. This is an encouraging sign, since it suggests that the variation is stochastic and not due to some consistent bias. The histogram of the transformed errors is much more intuitively pleasing than either of its predecessors: whilst still slightly skewed, there are no truncation points or excessive concentrations of values. This evidence leads us to conclude that, in this case, standardising the experimental semi-variograms has led us into an erroneous picture of the standard errors of the estimation procedure.
Table 4: Parameters of errors by level.

level   arithmetic mean   standard deviation
  7          0.01               0.60
  8          0.00               1.10
  9          0.00               0.63
 12          0.00               0.84
 13          0.00               0.94
CONFIDENCE INTERVALS
It is common, in quoting standard errors for estimates, to attempt to convert these into confidence intervals. The main difficulty in this process lies in what distribution to assume for the errors. Most workers happily assume that their prediction errors follow an approximately Normal distribution, so that a 95% confidence interval for the true value is given by T* ± 2σ_e. An empirical 95% confidence interval can instead be found from the histogram of the transformed errors in our case study. This is done by taking the centre 95% of the histogram, and using the z values at either end of this interval instead of the ±2 above.
Empirical intervals were produced for each of the three approaches in turn:

1) standard errors calculated using the standardised semi-variogram multiplied by the 'true' mean (T);
2) standard errors multiplied by the estimated mean (T*);
3) standard errors calculated using the 'raw' semi-variogram.
This lends weight to the conclusion that the 'raw' semi-variogram leads to a better evaluation of the actual errors incurred in the estimation. Assuming a 95% confidence interval of ±2σ_e in either of the standardised cases could lead to disastrous consequences, whereas in the "raw" case it gives a fairly accurate representation of the actual confidence interval.
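As a sketch, such an empirical interval is read from the percentiles of the z values rather than assumed from Normal theory. This presumes the individual z values are retained, which the original program did not do.

```python
import numpy as np

def empirical_95(z_values):
    """2.5th and 97.5th percentiles of z; since z = (T - T*) / se, the
    true value T lies in [T* + lo*se, T* + hi*se] for about 95% of
    estimates."""
    lo, hi = np.percentile(z_values, [2.5, 97.5])
    return lo, hi
```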
CONCLUSION
The question posed by this paper was "Does Geostatistics Work?". We have considered one case study only, and the conclusion which must be drawn is "Yes, provided the so-called proportional effect is ignored". If the "raw" semi-variogram is used, then the standard error predicted by geostatistical theory accurately measures the reliability of the estimation method. We have also seen that the usual assumption of the Central Limit Theorem -- i.e. that a 95% confidence interval may be produced by taking ±2σ_e -- is also fairly safe if the "raw" semi-variogram is utilised.
ACKNOWLEDGEMENTS
The author wishes to thank the management of Geevor Tin Mines Ltd. for the use of the Simms Lode data in this and the rest of the research for the degree of Ph.D. obtained in January 1979. The computer programs were developed and implemented on the CDC Cyber 175 at Imperial College Computer Centre.
REFERENCES
Clark, I., "Geostatistical Reserve Estimation in Cornish Tin Mining", Ph.D. Thesis, Imperial College, University of London, 1979.

David, M., Geostatistical Ore Reserve Estimation, Elsevier, Amsterdam, 1977.

Guarascio, M., "Improving the Uranium Deposit Estimations, the Novazza Case", in Advanced Geostatistics in the Mining Industry, Reidel, Dordrecht, 1976.

Journel, A.G. and Huijbregts, Ch.J., Mining Geostatistics, Academic Press, London, 1978.