CHAPTER 5: Kriging
Let us turn, now, to a much more common, and probably more realistic, approach
to the estimation of ‘local’ values. Until now we have considered only the
operation of averaging all the local samples and applying this value as the
estimate of the area under consideration. There are cases in which it would not
seem sensible to weight all the samples equally, since some will be a great distance
from the ‘unknown’ area A, whilst others will be much closer
to it --- if not inside. It would seem more sensible to use a weighted average
of the sample values, with the ‘closer’ samples having more weight. The new
estimator will be of the form:
![]()
where the
weightings sum to 1. If this condition is met, and there is no trend (locally) then T* is an unbiased estimator. This means that
over a lot of estimations the average error will be zero. This type of
estimator is called a ‘linear’ estimator because it is a linear combination of
the sample values. The arithmetic mean is simply a special case where all of
the weights are equal. It can be shown that the estimation variance for the
general ‘unbiased linear’ estimator is:

Where we previously calculated
(S,A) which was the average semi-variogram
between each sample and the unknown area, we now form a weighted average for
each individual sample with the area A,
(Si,A), in the same way as the actual estimator.
For example, if we have n samples taken around the area A, the first term in the variance becomes:

The last term in the variance,
(A,A), does not change its form, since we have only changed
the form of the estimator, not the area being estimated. The term
(S,S) which previously measured the variation
in values between the samples, must now take into account the different weights
associated with each sample. Thus, if we take, e.g., sample 4 we must remember
that it has a weight w4. If we take sample 4 with, say, sample
2, then we must include both weights w2 and w4, so that the term becomes
w2 w4
(S2,S4)
To form the equivalent
(S,S), then, each
(Si,Sj) is multiplied by the corresponding
wiwj before being added into the sum.

Fig. 5.1. Three
samples to be used to estimate the line segment.
As a simple example, let us consider the setup in Fig. 5.1.
We calculated the extension variance for this setup in Chapter 4. The
arithmetic mean gave an extension variance of
pl/18 when we used a linear
semi-variogram with the form g(h) =ph. Now suppose we allocate a set of weights
to these three samples instead of treating them all the same. Let us, for this
example, put three-quarters of the ‘weight’ at the central point, and an eighth
on each end point. The ‘new’ estimator is therefore:

The
reliability of this estimator will be given by the general form of se². The term
(A,A) is given by the function F(l), by definition, which for the linear
semi-variogram takes the form pl/3. The central term --- the ‘within
samples’ term --- is:

where each sample is taken with each sample in turn ---
including itself --- and the combination is multiplied by both weights. Since
the samples in this case are all points, all the
(Si,Sj) terms can be found directly from
the semi-variogram g(h). This gives:

Since the semi-variogram model is linear, this becomes:

This leaves only the first term --- the between sample and
area term --- to be evaluated. This is:
![]()
Sample 1 is at one end of the length l, so that
(S1,A) =c(l). Similarly
(S3,A) =c(l). Sample 2 is the central point of the
length l, so that
(S2,A) =c( l/2). For the linear model c(l) =pl/2, so that

Putting all these together, gives an estimation variance
of:

The specified set of weights, (1/8, 3/4, 1/8) when used with a linear semi-variogram model
give an estimation variance of 0.0729pl. The arithmetic mean of the
samples gave an extension variance of pl/18=0.0556pl, which is three-quarters of the size of
the estimation variance above. It is fairly obvious in this case that the
specified weighted average gives a worse estimator than simply using the
arithmetic mean.

Fig. 5.2. Estimation of the block value is required from the five scattered samples --- uranium example.
As a two-dimensional example, let us return to the
ubiquitous uranium example as illustrated in Fig. 5.2. We shall allocate
weights to each sample according to its distance from the centre of the block A. Table 5.1 shows the ‘inverse distance’ calculation
for the weights in this example.
Table 5.1. Calculation of Inverse Distance weightings for hypothetical Uranium estimation problem
|
Sample |
Distance from |
Inverse |
Corrected |
|
number |
centre (ft) |
distance |
weight |
|
1 |
21.54 |
0.0464 |
0.319 |
|
2 |
50.00 |
0.0200 |
0.137 |
|
3 |
31.62 |
0.0316 |
0.217 |
|
4 |
30.00 |
0.0333 |
0.229 |
|
5 |
70.00 |
0.0143 |
0.098 |
|
Total |
|
0.1457 |
1.000 |
The
estimator T* becomes:

The three terms which go into the variance are:

The individual values of
(S1,A) and so on are the same as those evaluated
when considering the extension variance, and are equal to 356.7, 572.4, 456.9,
446.8 and 696.1 respectively. Thus the first term in the variance is equal to
461.9 (p.p.m.)²--- as opposed to 505.8 (p.p.m.)² in the extension variance. The second term in the
variance of the weighted mean turns out to be 441.2 (p.p.m.)². In the extension case it was 504.7 (p.p.m.)². The final term in both variances is 344.0 (p.p.m.)² as given by the auxiliary function F(l,b). Putting these Figures together gives us
an estimation variance for the inverse distance estimator of 138.6 (p.p.m.)². This is equivalent to a ‘standard error’ of 11.8 p.p.m.
Thus, using the inverse distance weights we obtain an estimate of 372.8 p.p.m.
U3O8 for the panel, and we
can say that this has a standard error of 11.8 p.p.m., which has been derived
from our knowledge of the semi-variogram of this deposit. If we are willing to
make the additional assumption of Normality of the errors, we could say that a
95% confidence interval for the ‘true’ value of the panel is (349 p.p.m., 396
p.p.m.). This should be compared to the extension estimate of 366 p.p.m., with
a standard error of 12.8 p.p.m., and a 95% confidence interval of (340 p.p.m.,
392 p.p.m.). It can quickly be seen that the inverse distance estimation
produces a (slightly) more accurate result than the arithmetic mean --- this
seems quite sensible.
Suppose we change the situation slightly.
Suppose that sample 3 was not north of the block but south, i.e. northing 2310,
as in Fig. 5.3.

Fig. 5.3. Sample 3 is
now located South of the block to be estimated.
The inverse distance weights are unchanged,
because they depend on the distance between the sample and the block centre.
However,
the estimation standard error rises sharply to 14.3 p.p.m., indicating a loss
in accuracy. The change in the estimation variance is caused solely by a
decrease in the size of the ‘within samples’ term --- the amount of information
contained in the sample set --- since the other two terms remain the same as
before. In actual Figures the term falls from 441.2(p.p.m.)² to 376.2(p.p.m.)². It does not really seem very
sensible to use the same weights for Fig. 5.3 as we did for Fig. 5.2, since
sample 3 now gives us a lot less ‘information’ than it did previously. We could
suggest a new set of weights and calculate the estimation variance. If this
were smaller than the ‘inverse distance’ set, we could say our ‘new’ estimator
was (in some sense) better than the inverse distance one. Alternatively, we
could use a different method to produce the weights --- inverse distance
squared, for example. It would seem a lot more desirable to find a direct
method of producing the ‘best’ estimate, given our knowledge of the deposit.
We have decided to use a ‘linear’ type of estimator --- a
weighted average of the sample values. We know that it is an unbiased estimator
if the weights add up to 1. There is an infinite number of such linear unbiased
estimators, so we will search for the ‘best’ one, and we will define ‘best’ as
‘having the smallest estimation variance’. The expression for the estimation
variance depends on three things: the basic geometry of samples and unknown
area, the form of the semi-variogram, and the weighting allocated to each
sample. For any given setup the variance can only be changed by altering the
values of the weights. Thus we
wish to minimise the estimation variance with respect to the weights. The
variance is a simple (?) function of the weights, so to minimise it we must
differentiate and set the differential equal to zero:
![]()
This will
provide n equations in n unknowns (w1,
w2, w3, w4...wn). These weights will provide an
estimator which has the minimum value of the estimation variance. However, they
will not necessarily add up to one. There is nothing in the above system of
equations that constrains the weights in this way. Effectively, we also need to
satisfy the equation Swi=1. Thus, to obtain the ‘Best Linear
Unbiased Estimator’ we actually have to satisfy (n+1) equations. However, we only have n unknowns, so far, which is not a very desirable
condition. To rectify this we must introduce another unknown, in the form of a
Lagrangian Multiplier, to balance up the system. Therefore, instead of
minimising the estimation variance, we actually minimise:
![]()
with respect to w1,
w2, w3, w4... wn, and l. This last produces the equation:
S wi – 1 = 0
as is required. After the differentiation has been
performed and the equations tidied up, the following system results:

Although this looks slightly fearsome in its complexity, if
you look a little closer, you will find that most of the elements are (I hope)
by now familiar to you. Consider the first equation. The right hand side merely
requires the average semi-variogram value between sample 1 and the unknown
area. The left hand side contains the n+1 unknowns, wi and l, and the average semi-variogram
value between sample 1 and each of the other samples in turn. All of these
terms are identical to those which we would
work out for an extension or an estimation variance.
The second equation is identical to the first except that
it is sample 2 which is present right along the equation. The third has sample
3 all the way along, and so on until the nth equation with Sn all the way along. Finally, we
have the necessary condition for the sum of the weights. The solution to this
set of equations will produce a set of weights giving the ‘Best Linear Unbiased
Estimator’ --- sometimes referred to as BLUE. This process was named Kriging by
Georges Matheron after Danie Krige, who has done a tremendous amount of
empirical work on weighted averages. Pronunciation of the word is a
controversial topic --- I personally prefer ‘kridging’ (as in bridging). The
system of equations is generally referred to as the kriging system, and the
estimator produced is the kriging estimator. The variance of the kriging
estimator could be found by substitution of the weights into the general
estimation variance equation. However, it can be shown that the kriging
variance can be written as:
![]()