CHAPTER 4: Estimation
So far we have used the basic concepts and assumptions of Geostatistics to
build ourselves a ‘model’ of the structure and continuity within the deposit.
We have also (in Chapter 3) seen how this can lead to the production of
‘theoretical’ grade/tonnage curves and the study of how mining block size can
influence final production figures. It is now time we returned to our original
problem of the estimation of ore reserves. The discussion in this (and the
next) chapter will be confined to ‘local’ estimation, i.e. interest is confined
to one portion of the deposit at a time. However, it should be borne in mind
that the same techniques can be applied on a global scale, i.e. to the whole
deposit at once. It should also be remembered that block-by-block or
stope-by-stope estimates will lead inevitably to global estimates.
Let us, then, define the situation which is of interest to us. There is a point
or an area or a volume of ground over which we do not know the grade (or value),
but we wish to estimate it. Let us call this ‘unknown’ grade T, and the area (or
point, or volume) of interest A. In order to produce an estimator we must have
some information, usually in the form of samples. To be completely general, let
us suppose n samples with values g1, g2, g3, ..., gn. This set of samples is
generally denoted by S. From these samples we can form a ‘linear’ type of
estimator, that is, a weighted average. We must restrict ourselves to this type
of estimator at this stage. The estimator is denoted by T* and is equal to:
$$T^* = w_1 g_1 + w_2 g_2 + w_3 g_3 + \cdots + w_n g_n$$
where w1, w2, w3, ..., wn are the weights assigned to each
sample. Most currently used local estimation techniques use a weighted average
approach --- inverse distance techniques and so on. The simplest case of all is
when all of the weights are the same, and T* is just the arithmetic mean of the
sample values.
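As a minimal sketch of the weighted-average estimator, the Python below forms T* from a list of sample grades and a list of weights. The function name `linear_estimate`, the grades, the distances and the inverse-distance weights are all illustrative assumptions and are not taken from Table 4.1. Rescaling the weights so that they sum to one is a design choice of this sketch.

```python
def linear_estimate(grades, weights):
    """Weighted average T* = sum(w_i * g_i), with weights rescaled to sum to 1."""
    if len(grades) != len(weights):
        raise ValueError("need exactly one weight per sample")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return sum(w * g for w, g in zip(weights, grades)) / total


# Hypothetical sample grades (p.p.m.) and distances (ft) to the unknown point.
grades = [400.0, 380.0, 450.0, 280.0, 320.0]
distances = [20.0, 25.0, 30.0, 35.0, 40.0]

equal_weights = [1.0] * len(grades)               # gives the arithmetic mean
inverse_distance = [1.0 / d for d in distances]   # one common weighting scheme

mean_estimate = linear_estimate(grades, equal_weights)
idw_estimate = linear_estimate(grades, inverse_distance)
print(mean_estimate, idw_estimate)
```

With equal weights the function reduces to the arithmetic mean of the grades; any other set of positive weights gives a value between the smallest and largest sample grade.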
Fig. 4.1. Hypothetical sampling and estimation situation: a uranium deposit.
Table 4.1. Positions and values in the hypothetical uranium estimation problem.
Now consider the setup of samples and ‘unknown’ which we originally discussed in
the first chapter. Figure 4.1 shows the point of interest, which lies at
position A, and we have five ‘point’ samples lying around this position. The
co-ordinates of these six points and the values of the samples are given in
Table 4.1. The hypothetical deposit is a low-grade, large-tonnage uranium one,
which is assumed to be isotropic. The semi-variogram model fitted to this
deposit is a spherical one with a range of influence of 100 ft, a sill value (C)
of 700 (p.p.m.)² and a nugget effect of 100 (p.p.m.)².
Let us take the simplest possible estimation procedure. Take the value at the
closest sample position (1) and ‘extend’ this to the unknown point. In doing so
we incur an estimation error, $\varepsilon$, which will be equal to the
difference between the actual value T and the estimated value T*, which in this
case equals g1. That is:
$$\varepsilon = T - T^* = T - g_1$$
It is not too difficult to show that if there is no trend (at least locally),
this estimator is unbiased. That is, if we make lots of similar estimations the
average error will be zero:
$$\bar{\varepsilon} = 0$$
The ‘reliability’ of the estimation can be measured by
looking at the spread of the
errors. If the errors take values consistently close to zero, then the
estimator is a ‘good’ one. If the spread of values is large, then the estimator
will be unreliable. The simplest stable measure of spread (statistically) is
the standard deviation. The standard deviation of an estimation error --- or
standard error as it is referred to in ordinary statistics --- will therefore
measure the reliability of that estimator.
No matter how many estimations we perform, we cannot calculate the standard
deviation of the errors, since we never know the value of the error made.
Therefore we must look at the ‘theoretical’ form of the variance of the
estimation error, i.e. the estimation variance:
$$\sigma_\varepsilon^2 = \text{average of } (T - T^*)^2 = \text{average of } (T - g_1)^2$$
The average would be made (theoretically) over the whole
deposit. That is, the same estimation situation would be repeated over the
whole deposit and the variance found. This cannot be done in practice, of
course, so let us look closer at the form of this variance. It is found by
taking the grade at point A, subtracting the grade at point 1,
squaring the result, repeating the process over all possible pairs of such
points and then averaging the values. This sounds exactly like the definition
of a variogram. In fact, it is the variogram between the two points A and (1).
Given the distance between them (h) we can evaluate this estimation variance
simply by reading a value from the semi-variogram model ($\gamma$) and
multiplying it by 2. This is one of the reasons why it is good policy to avoid
confusing the variogram and the semi-variogram. Thus:
$$\sigma_\varepsilon^2 = 2\gamma(h)$$
In the case of our particular example given in Fig. 4.1:

Given our knowledge about this deposit, i.e. the
semi-variogram model, we can state (without too much fear of error) that the
estimator used has a standard error of 25.4 p.p.m. Turning this standard error
into a confidence interval, however, requires the assumption of some kind of
probability distribution for the deposit. For instance, if we hope that the
Central Limit Theorem holds, we can say that a 95% confidence interval for T
would be given by $T^* \pm 1.96\,\sigma_\varepsilon$, i.e. (350 p.p.m., 450 p.p.m.). On
the other hand, if we were to assume a log-normal distribution for the errors,
the 95% confidence interval would be given by (354 p.p.m., 453 p.p.m.).
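The point-to-point calculation can be sketched in a few lines of Python. The model parameters (range 100 ft, C = 700, nugget 100) come from the text. The distance `h = 21.54` ft and the sample grade `t_star = 400` p.p.m. are assumptions made for this sketch, chosen because they are consistent with the standard error and the interval quoted above; the true co-ordinates and grades belong to Table 4.1. The function name `spherical` is also an assumption.

```python
import math


def spherical(h, c0, c, a):
    """Spherical semi-variogram with nugget c0, sill component c and range a."""
    if h == 0:
        return 0.0                 # gamma(0) is zero by definition
    if h >= a:
        return c0 + c              # beyond the range the model sits on its sill
    r = h / a
    return c0 + c * (1.5 * r - 0.5 * r ** 3)


# Model parameters quoted in the text.
C0, C, A_RANGE = 100.0, 700.0, 100.0

h = 21.54        # ASSUMED distance (ft) between point A and sample 1
t_star = 400.0   # ASSUMED grade of sample 1 (p.p.m.)

variance = 2.0 * spherical(h, C0, C, A_RANGE)    # sigma^2 = 2 * gamma(h)
std_error = math.sqrt(variance)

# 95% interval under the Normal assumption discussed in the text.
interval = (t_star - 1.96 * std_error, t_star + 1.96 * std_error)
print(round(std_error, 1), [round(x) for x in interval])
```

Changing `h` shows how quickly the standard error grows as the nearest sample moves away from the point being estimated.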

Fig. 4.2. More realistic estimation --- the
value of the block is required (uranium deposit).
Now, let us complicate the procedure a little. Instead of estimating the value
at the point A, in a more realistic situation (at least in mining) we would be
interested in the average grade over an area or block or some mining unit. In
Fig. 4.2, a ‘panel’ 60 ft by 30 ft has been centred on the original point A.
The estimation procedure then becomes:
$$T = \text{average grade over the panel } A, \qquad T^* = g_1, \qquad \varepsilon = T - T^* = T - g_1$$
The same arguments as previously still hold. The average error can be shown to
be zero if there is no local trend. The estimation variance is still a
variogram, but it is now the variogram between the grade at sample point (1) and
the average grade over the panel A. We saw in Chapter 3 that we could cope with
average grades over samples if we wanted the semi-variogram between samples of
the same size, but so far we have not considered the possibility of having two
different sizes to compare. The model semi-variogram describes the difference
in grades between two points. We could find the value of the semi-variogram
between the sample point and every point within the panel A, and we could
average those values. Let us define this quantity as $\bar{\gamma}(S,A)$, read
as ‘gamma-bar between the sample and every point in the panel’. The ‘bar’
notation is the standard one for arithmetic mean. This gamma-bar term will take
the place of the $\gamma(h)$ in our previous relationship. However, what we
really need is the semi-variogram between the average grade of the panel and the
sample, not between all the individual points within the panel and the sample.
$2\bar{\gamma}(S,A)$ would be the variance of the error made if we tried to
estimate every point within the panel. To correct for this difference in
emphasis we need to take into account the variation of the grades at points
within the panel.
This was discussed in Chapter 3, and we evaluated it using the auxiliary
function F(l,b). This was the average semi-variogram between all possible pairs
of points within the panel. We can rewrite this in a more general way using the
gamma-bar notation. That is, $\bar{\gamma}(A,A)$ will be the average
semi-variogram value between every point in the panel and every point in the
panel. In the case shown in Fig. 4.2, then, when using the value at sample point
(1) to estimate the average grade of the panel, the estimation variance becomes:
$$\sigma_\varepsilon^2 = 2\bar{\gamma}(S,A) - \bar{\gamma}(A,A)$$
The calculation of these gamma-bar terms will be discussed
more fully later.
Now, let us complicate the mathematics still further. We actually have more than
one sample available to us, so why not use them in the estimation procedure?
Suppose we use the arithmetic mean of the samples as our T*. This gives us the
simplest form of the weighted average type of estimator. That is:
$$T^* = \frac{1}{5}\left(g_1 + g_2 + g_3 + g_4 + g_5\right)$$
In this case the term $\bar{\gamma}(S,A)$ is the average semi-variogram value
between each point in the ‘sample set’ S and each point in the panel A. The term
$\bar{\gamma}(A,A)$ is still the average semi-variogram between each point in
the panel and each point in the panel. However, now we have yet another source
of spurious variation. We only consider the average grade of the samples as the
estimator, but $\bar{\gamma}(S,A)$ takes the individual grades into account.
Thus we have also to subtract a $\bar{\gamma}(S,S)$ term from the variance,
where this is the average semi-variogram value between each point in the sample
set and each point in the sample set (i.e. 25 ‘pairs’ of samples). The final
version of the estimation variance then becomes:
$$\sigma_e^2 = 2\bar{\gamma}(S,A) - \bar{\gamma}(S,S) - \bar{\gamma}(A,A)$$
The arithmetic mean is often known in Geostatistics as an extension estimator,
and the above variance is referred to as the extension variance. To distinguish
this variance from the more general estimation variance for a weighted average,
the subscript e is used rather than the general $\varepsilon$.
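The three gamma-bar terms can be approximated numerically by replacing the panel with a fine grid of points and averaging the semi-variogram over every pair. This grid approach is a numerical stand-in, not the auxiliary-function method used in this book, and the sample co-ordinates below are made up for illustration (Table 4.1 is not reproduced). The names `gamma_bar`, `panel_points` and `extension_variance` are likewise assumptions of this sketch, which relies on NumPy.

```python
import numpy as np


def spherical(h, c0=100.0, c=700.0, a=100.0):
    """Spherical semi-variogram with nugget c0; gamma(0) is exactly zero."""
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / a, 1.0)
    gamma = c0 + c * (1.5 * r - 0.5 * r ** 3)
    return np.where(h > 0, gamma, 0.0)


def gamma_bar(p, q, gamma=spherical):
    """Average semi-variogram over every pair (one point from p, one from q)."""
    p = np.atleast_2d(p)
    q = np.atleast_2d(q)
    h = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=2)
    return float(gamma(h).mean())


def panel_points(cx, cy, width, height, n=20):
    """Centres of an n x n grid of cells approximating a rectangular panel."""
    xs = cx - width / 2 + (np.arange(n) + 0.5) / n * width
    ys = cy - height / 2 + (np.arange(n) + 0.5) / n * height
    gx, gy = np.meshgrid(xs, ys)
    return np.column_stack([gx.ravel(), gy.ravel()])


def extension_variance(samples, block, gamma=spherical):
    """sigma_e^2 = 2*gbar(S,A) - gbar(S,S) - gbar(A,A)."""
    return (2.0 * gamma_bar(samples, block, gamma)
            - gamma_bar(samples, samples, gamma)
            - gamma_bar(block, block, gamma))


# ASSUMED layout: 60 ft x 30 ft panel centred at the origin, five made-up samples.
block = panel_points(0.0, 0.0, 60.0, 30.0)
samples = np.array([[20.0, 8.0], [-25.0, 15.0], [10.0, -30.0],
                    [-35.0, -20.0], [45.0, 25.0]])

var_one = extension_variance(samples[:1], block)   # nearest sample only
var_all = extension_variance(samples, block)       # arithmetic mean of all five
print(var_one, var_all)
```

A finer grid (larger `n`) brings the approximation closer to the continuous averages, at the cost of an array of pairwise distances that grows with the fourth power of `n`.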
CALCULATION OF GAMMA-BAR TERMS
Having produced a formula for the extension variance, it only remains to
explain how to calculate such terms as $\bar{\gamma}(S,A)$ in practice. For the
sake of our (too) simplistic approach, we will consider for the moment only
simple idealistic cases, and these only in one or two dimensions. Generalisation
will be discussed later.

Fig.4.3. Example of using a peripheral point to
estimate the average value of the line segment.
Consider, as an example, the setup in Fig. 4.3. There is a length of, say,
drive, l metres long, whose grade is unknown. We have at our disposal a single
sample, perhaps at a development heading, whose value is known. In our previous
notation T is the average grade over l, T* is the grade at the sample position,
A is the length and S is the single sample point. The reliability of this
estimator is given by:
$$\sigma_e^2 = 2\bar{\gamma}(S,A) - \bar{\gamma}(S,S) - \bar{\gamma}(A,A)$$
$\bar{\gamma}(S,S)$ is the semi-variogram between the sample point and itself,
which is zero because the sample is a ‘point’. $\bar{\gamma}(A,A)$ is none other
than the F(l) function encountered in Chapter 3. Our problem arises with
$\bar{\gamma}(S,A)$, which has been defined as the average semi-variogram
between the sample point and every point in the line. That is, we take M as a
fixed point (the sample) and M’ can be anywhere on the line. We take all such
pairs that are possible, calculate the value of the semi-variogram for each
pair, sum these (using an integration), and average this sum. Because the ‘sum’
is being performed over a continuous length, we cannot divide it by the ‘number
of points’ in the sum. Instead we divide by the length of the line itself, l.
This produces another auxiliary function, which is called $\chi(l)$ and deals
with the specific case of points on the end of lines. Thus our extension
variance becomes:
$$\sigma_e^2 = 2\chi(l) - F(l)$$
It remains only to determine the function $\chi(l)$ for the particular model in use and the
standard error is immediately available. The one-dimensional auxiliary
functions are given below for the three ‘common’ models. Semi-variograms
comprising more than one component model are easily handled. The auxiliary
function for each component is evaluated and then the component auxiliary
functions added together.
Auxiliary functions
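Linear model for the semi-variogram, $\gamma(h) = ph$:
$$\chi(l) = \frac{pl}{2}, \qquad F(l) = \frac{pl}{3}$$
Exponential model for the semi-variogram, $\gamma(h) = C\left[1 - e^{-h/a}\right]$:
$$\chi(l) = C\left[1 - \frac{a}{l}\left(1 - e^{-l/a}\right)\right], \qquad F(l) = C\left\{1 - \frac{2a}{l}\left[1 - \frac{a}{l}\left(1 - e^{-l/a}\right)\right]\right\}$$
Spherical model for the semi-variogram, $\gamma(h) = C\left(\frac{3h}{2a} - \frac{h^3}{2a^3}\right)$ for $h \le a$ and $\gamma(h) = C$ for $h \ge a$:
$$\chi(l) = C\left(\frac{3l}{4a} - \frac{l^3}{8a^3}\right), \qquad F(l) = C\left(\frac{l}{2a} - \frac{l^3}{20a^3}\right) \qquad (l \le a)$$
$$\chi(l) = C\left(1 - \frac{3a}{8l}\right), \qquad F(l) = C\left(1 - \frac{3a}{4l} + \frac{a^2}{5l^2}\right) \qquad (l \ge a)$$
Here p is the slope of the linear model, C the sill and a the distance parameter
of the model (the range of influence in the spherical case).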
Thus in our example above, if we have a linear
semi-variogram, the extension variance for the setup in Fig. 4.3 becomes:
$$\sigma_e^2 = 2\chi(l) - F(l) = 2\,\frac{pl}{2} - \frac{pl}{3} = \frac{2pl}{3}$$
For any specific problem, we need specify only the length
of the line l and the slope of the semi-variogram, p.
Let us now consider a slightly more interesting example,
such as that shown in Fig. 4.4.

Fig 4.4. Example of using a central point to estimate the average value of the line segment.
Here the point sample is in the middle of the line, but otherwise the situation
remains the same. In:
$$\sigma_e^2 = 2\bar{\gamma}(S,A) - \bar{\gamma}(S,S) - \bar{\gamma}(A,A)$$
only the first term, $\bar{\gamma}(S,A)$, has changed. Rather than invent a new
auxiliary function, or have to do the integration all over again, we can use
the existing $\chi(l)$ function to produce the required term.
The term we require is as follows:
$\bar{\gamma}(S,A)$
= the average semi-variogram value between the sample point and every point along the line
= (sum of all semi-variogram values between the sample point and every point along the line)/l
= (sum of all the semi-variogram values between the sample point and every point in the left hand half of the line + sum of all the values between the sample and the right hand half of the line)/l

Fig 4.5. Simplifying the central point problem
to allow the use of auxiliary functions.
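Figure 4.5 illustrates the ‘splitting’ of the line so as to put the sample
point at the end of two shorter lines. Now, $\chi(l/2)$ would give us the
average of all the semi-variogram values between M (the sample point) and the
M’ on the left hand half of the line. Returning to the definition of the
$\chi$ function, it can easily be seen that the sum of all the semi-variogram
values between M and M’ will be the average multiplied by the length of line
under consideration, here $(l/2)\,\chi(l/2)$ for each half.
Thus:
$$\bar{\gamma}(S,A) = \frac{1}{l}\left[\frac{l}{2}\,\chi\!\left(\frac{l}{2}\right) + \frac{l}{2}\,\chi\!\left(\frac{l}{2}\right)\right] = \chi\!\left(\frac{l}{2}\right)$$
so that
$$\sigma_e^2 = 2\chi\!\left(\frac{l}{2}\right) - F(l)$$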
In a particular case the user may substitute his own model for the
semi-variogram, and hence the appropriate auxiliary functions. Before moving on
let us compare this result with the previous situation, where the sample lay at
the end of the line. In the former case the extension variance was:
$$\sigma_e^2 = 2\chi(l) - F(l)$$
By definition $\chi(l)$ must be greater than (or at least equal
to) $\chi(l/2)$. The conclusion? If you can only
take one sample, it is better to take it in the middle of what you are trying
to estimate. It is reassuring to find that so-called common sense has a sound
mathematical background.
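To put numbers on the comparison, the sketch below evaluates both extension variances for a spherical model. The line length, sill and range are illustrative assumptions (no nugget effect is included), and the helper names are hypothetical; the closed forms for $\chi$ and F are those obtained by integrating the spherical model over a line no longer than the range.

```python
def chi_spherical(l, c, a):
    """End-point auxiliary function for the spherical model, valid for l <= a."""
    return c * (3.0 * l / (4.0 * a) - l ** 3 / (8.0 * a ** 3))


def f_spherical(l, c, a):
    """Within-line auxiliary function F(l) for the spherical model, l <= a."""
    return c * (l / (2.0 * a) - l ** 3 / (20.0 * a ** 3))


def variance_end(l, c, a):
    """Sample at the end of the line: 2*chi(l) - F(l)."""
    return 2.0 * chi_spherical(l, c, a) - f_spherical(l, c, a)


def variance_centre(l, c, a):
    """Sample at the centre of the line: 2*chi(l/2) - F(l)."""
    return 2.0 * chi_spherical(l / 2.0, c, a) - f_spherical(l, c, a)


# ASSUMED values: a 50-unit line, sill 700, range of influence 100.
l, c, a = 50.0, 700.0, 100.0
end = variance_end(l, c, a)
centre = variance_centre(l, c, a)
print(end, centre, end / centre)
```

Under these assumed values the central sample gives an extension variance of roughly a quarter of that from the end sample.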

Fig. 4.6 Generalisation of the ‘central’ point problem.
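Using the same sort of logic on Fig. 4.6, with the sample lying a distance b
from one end of the line (and hence l-b from the other), you should be able to
deduce that:
$$\bar{\gamma}(S,A) = \frac{b\,\chi(b) + (l-b)\,\chi(l-b)}{l}$$
so that
$$\sigma_e^2 = \frac{2}{l}\left[b\,\chi(b) + (l-b)\,\chi(l-b)\right] - F(l)$$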

Fig. 4.7. Extrapolation of the peripheral point
problem.
Figure 4.7 at first sight seems to be a different kettle of fish. However, let
us follow the same procedure and see where it leads. The point lies on the end
of a ‘line’ of length l+b. The expression $(l+b)\,\chi(l+b)$ would give us the
sum of all the semi-variogram values between the sample and the length l+b.
However, we do not require the points corresponding to M’ within the length b,
so we may subtract those in the form $b\,\chi(b)$. That is:
$$\bar{\gamma}(S,A) = \frac{(l+b)\,\chi(l+b) - b\,\chi(b)}{l}$$
so that
$$\sigma_e^2 = \frac{2}{l}\left[(l+b)\,\chi(l+b) - b\,\chi(b)\right] - F(l)$$
For the linear model, for example, this would be:
$$\sigma_e^2 = \frac{p}{l}\left[(l+b)^2 - b^2\right] - \frac{pl}{3} = \frac{2pl}{3} + 2pb$$
This is obviously larger than the expression when the point
was on the end of the line, as would be expected.
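The splitting and subtracting arguments used for the central, off-centre and extrapolated samples can be gathered into one small routine. In this sketch the line runs from 0 to l and the sample sits at position x, which may be on the line or beyond either end; that co-ordinate convention, the function names, and the linear-model values `p = 2` and `l = 30` are assumptions made for illustration.

```python
def gamma_bar_point_line(x, l, chi):
    """Average semi-variogram between a point at x and the line from 0 to l.

    chi(d) must return the end-point auxiliary function for a length d.
    """
    def total(d):
        # "Sum" of semi-variogram values between the point and a length d.
        return d * chi(d) if d > 0 else 0.0

    if x < 0:                       # beyond the left end, gap b = -x
        return (total(l - x) - total(-x)) / l
    if x > l:                       # beyond the right end, gap b = x - l
        return (total(x) - total(x - l)) / l
    return (total(x) + total(l - x)) / l    # on the line: split at the sample


def extension_variance(x, l, chi, f):
    """sigma_e^2 = 2*gbar(S,A) - F(l) for a single point sample."""
    return 2.0 * gamma_bar_point_line(x, l, chi) - f(l)


# Linear model with ASSUMED slope p and line length l.
p, l = 2.0, 30.0
chi = lambda d: p * d / 2.0
f = lambda d: p * d / 3.0

at_end = extension_variance(0.0, l, chi, f)          # expect 2pl/3
at_centre = extension_variance(l / 2.0, l, chi, f)   # expect pl/6
beyond = extension_variance(l + 10.0, l, chi, f)     # expect 2pl/3 + 2pb, b = 10
print(at_end, at_centre, beyond)
```

Passing a different pair of `chi` and `f` functions switches the calculation to another semi-variogram model without changing the geometry logic.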
One last example before we abandon one-dimensional
examples: Fig. 4.8 shows the ‘same’ line, which now contains three samples.

Fig. 4.8. More complex problem when three samples are available to estimate the line segment.
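We shall use the arithmetic mean of the three grades to estimate the length,
i.e. T* = (g1+g2+g3)/3. Then our extension variance is
$$\sigma_e^2 = 2\bar{\gamma}(S,A) - \bar{\gamma}(S,S) - \bar{\gamma}(A,A)$$
where S is now a set of three points. $\bar{\gamma}(A,A)$ remains unchanged,
equal to F(l), since we have not changed the length to be estimated at all.
However, $\bar{\gamma}(S,A)$ is now the average semi-variogram value between
each of the three points and the line, so that
$$\bar{\gamma}(S,A) = \frac{1}{3}\left[\bar{\gamma}(S_1,A) + \bar{\gamma}(S_2,A) + \bar{\gamma}(S_3,A)\right]$$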
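where S1 represents sample 1 and so on. Now $\bar{\gamma}(S_1,A)$ is simply
$\chi(l)$, as is $\bar{\gamma}(S_3,A)$. The term $\bar{\gamma}(S_2,A)$ is the
same situation as that in Fig. 4.4, so this equals $\chi(l/2)$. Thus,
$$\bar{\gamma}(S,A) = \frac{1}{3}\left[\chi(l) + \chi\!\left(\frac{l}{2}\right) + \chi(l)\right] = \frac{1}{3}\left[2\chi(l) + \chi\!\left(\frac{l}{2}\right)\right]$$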
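The middle term of the variance, $\bar{\gamma}(S,S)$, requires us to take each
point in the sample set with each point in the sample set. Since there are three
points in the set, there are nine such pairs of points:
$$\bar{\gamma}(S,S) = \frac{1}{9}\left[\gamma(S_1,S_1) + \gamma(S_1,S_2) + \gamma(S_1,S_3) + \gamma(S_2,S_1) + \gamma(S_2,S_2) + \gamma(S_2,S_3) + \gamma(S_3,S_1) + \gamma(S_3,S_2) + \gamma(S_3,S_3)\right]$$
Each of the individual terms is simply the semi-variogram between a pair of
points. Three of the terms, $\gamma(S_1,S_1)$, $\gamma(S_2,S_2)$ and
$\gamma(S_3,S_3)$, are automatically zero since the samples are points. The
terms $\gamma(S_1,S_2)$, $\gamma(S_2,S_1)$, $\gamma(S_2,S_3)$ and
$\gamma(S_3,S_2)$ are all equal to $\gamma(l/2)$, whilst $\gamma(S_1,S_3)$ and
$\gamma(S_3,S_1)$ are equal to $\gamma$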