Contents Page

Previous section

Isobel’s Home Page

Practical Geostatistics 2000

Courses

 

CHAPTER 5: Kriging

Let us turn, now, to a much more common, and probably more realistic, approach to the estimation of ‘local’ values. Until now we have considered only the operation of averaging all the local samples and applying this value as the estimate of the area under consideration. There are cases in which it would not seem sensible to weight all the samples equally, since some will be a great distance from the ‘unknown’ area
A, whilst others will be much closer to it --- if not inside. It would seem more sensible to use a weighted average of the sample values, with the ‘closer’ samples having more weight. The new estimator will be of the form:

where the weightings sum to 1. If this condition is met, and there is no trend (locally) then T* is an unbiased estimator. This means that over a lot of estimations the average error will be zero. This type of estimator is called a ‘linear’ estimator because it is a linear combination of the sample values. The arithmetic mean is simply a special case where all of the weights are equal. It can be shown that the estimation variance for the general ‘unbiased linear’ estimator is:

 


Where we previously calculated
(S,A) which was the average semi-variogram between each sample and the unknown area, we now form a weighted average for each individual sample with the area A, (Si,A), in the same way as the actual estimator. For example, if we have n samples taken around the area A, the first term in the variance becomes:

 


The last term in the variance,
(A,A), does not change its form, since we have only changed the form of the estimator, not the area being estimated. The term (S,S) which previously measured the variation in values between the samples, must now take into account the different weights associated with each sample. Thus, if we take, e.g., sample 4 we must remember that it has a weight w4. If we take sample 4 with, say, sample 2, then we must include both weights w2 and w4, so that the term becomes

 

w2 w4 (S2,S4)


To form the equivalent
(S,S), then, each (Si,Sj) is multiplied by the corresponding wiwj before being added into the sum.

Fig. 5.1. Three samples to be used to estimate the line segment.


As a simple example, let us consider the setup in Fig. 5.1. We calculated the extension variance for this setup in Chapter 4. The arithmetic mean gave an extension variance of 
pl/18 when we used a linear semi-variogram with the form g(h) =ph. Now suppose we allocate a set of weights to these three samples instead of treating them all the same. Let us, for this example, put three-quarters of the ‘weight’ at the central point, and an eighth on each end point. The ‘new’ estimator is therefore:

The reliability of this estimator will be given by the general form of se². The term (A,A) is given by the function F(l), by definition, which for the linear semi-variogram takes the form pl/3. The central term --- the ‘within samples’ term --- is:

 


where each sample is taken with each sample in turn --- including itself --- and the combination is multiplied by both weights. Since the samples in this case are all points, all the
(Si,Sj) terms can be found directly from the semi-variogram g(h). This gives:

 


Since the semi-variogram model is linear, this becomes:

 


This leaves only the first term --- the between sample and area term --- to be evaluated. This is:

 


Sample 1 is at one end of the length
l, so that (S1,A) =c(l). Similarly (S3,A) =c(l). Sample 2 is the central point of the length l, so that (S2,A) =c( l/2). For the linear model c(l) =pl/2, so that

 


Putting all these together, gives an estimation variance of:

 


The specified set of weights, (1/8, 3/4, 1/8)  when used with a linear semi-variogram model give an estimation variance of 0.0729
pl. The arithmetic mean of the samples gave an extension variance of pl/18=0.0556pl, which is three-quarters of the size of the estimation variance above. It is fairly obvious in this case that the specified weighted average gives a worse estimator than simply using the arithmetic mean.

 

Fig. 5.2. Estimation of the block value is required from the five scattered samples --- uranium example.


As a two-dimensional example, let us return to the ubiquitous uranium example as illustrated in Fig. 5.2. We shall allocate weights to each sample according to its distance from the centre of the block
A. Table 5.1 shows the ‘inverse distance’ calculation for the weights in this example.

 

Table 5.1. Calculation of Inverse Distance weightings for hypothetical Uranium estimation problem

Sample

Distance from

Inverse

Corrected

number

centre (ft)

distance

weight

1

21.54

0.0464

0.319

2

50.00

0.0200

0.137

3

31.62

0.0316

0.217

4

30.00

0.0333

0.229

5

70.00

0.0143

0.098

Total

 

0.1457

1.000

 

The estimator T* becomes:

 


The three terms which go into the variance are:

The individual values of (S1,A) and so on are the same as those evaluated when considering the extension variance, and are equal to 356.7, 572.4, 456.9, 446.8 and 696.1 respectively. Thus the first term in the variance is equal to 461.9 (p.p.m.)²--- as opposed to 505.8 (p.p.m.)² in the extension variance. The second term in the variance of the weighted mean turns out to be 441.2 (p.p.m.)². In the extension case it was 504.7 (p.p.m.)². The final term in both variances is 344.0 (p.p.m.)² as given by the auxiliary function F(l,b). Putting these Figures together gives us an estimation variance for the inverse distance estimator of 138.6 (p.p.m.)². This is equivalent to a ‘standard error’ of 11.8 p.p.m. Thus, using the inverse distance weights we obtain an estimate of 372.8 p.p.m. U3O8 for the panel, and we can say that this has a standard error of 11.8 p.p.m., which has been derived from our knowledge of the semi-variogram of this deposit. If we are willing to make the additional assumption of Normality of the errors, we could say that a 95% confidence interval for the ‘true’ value of the panel is (349 p.p.m., 396 p.p.m.). This should be compared to the extension estimate of 366 p.p.m., with a standard error of 12.8 p.p.m., and a 95% confidence interval of (340 p.p.m., 392 p.p.m.). It can quickly be seen that the inverse distance estimation produces a (slightly) more accurate result than the arithmetic mean --- this seems quite sensible.

Suppose we change the situation slightly. Suppose that sample 3 was not north of the block but south, i.e. northing 2310, as in Fig. 5.3.

Fig. 5.3. Sample 3 is now located South of the block to be estimated.

The inverse distance weights are unchanged, because they depend on the distance between the sample and the block centre.

However, the estimation standard error rises sharply to 14.3 p.p.m., indicating a loss in accuracy. The change in the estimation variance is caused solely by a decrease in the size of the ‘within samples’ term --- the amount of information contained in the sample set --- since the other two terms remain the same as before. In actual Figures the term falls from 441.2(p.p.m.)² to 376.2(p.p.m.)². It does not really seem very sensible to use the same weights for Fig. 5.3 as we did for Fig. 5.2, since sample 3 now gives us a lot less ‘information’ than it did previously. We could suggest a new set of weights and calculate the estimation variance. If this were smaller than the ‘inverse distance’ set, we could say our ‘new’ estimator was (in some sense) better than the inverse distance one. Alternatively, we could use a different method to produce the weights --- inverse distance squared, for example. It would seem a lot more desirable to find a direct method of producing the ‘best’ estimate, given our knowledge of the deposit.


We have decided to use a ‘linear’ type of estimator --- a weighted average of the sample values. We know that it is an unbiased estimator if the weights add up to 1. There is an infinite number of such linear unbiased estimators, so we will search for the ‘best’ one, and we will define ‘best’ as ‘having the smallest estimation variance’. The expression for the estimation variance depends on three things: the basic geometry of samples and unknown area, the form of the semi-variogram, and the weighting allocated to each sample. For any given setup the variance can only be changed by altering the values of the
weights. Thus we wish to minimise the estimation variance with respect to the weights. The variance is a simple (?) function of the weights, so to minimise it we must differentiate and set the differential equal to zero:

This will provide n equations in n unknowns (w1, w2, w3, w4...wn). These weights will provide an estimator which has the minimum value of the estimation variance. However, they will not necessarily add up to one. There is nothing in the above system of equations that constrains the weights in this way. Effectively, we also need to satisfy the equation Swi=1. Thus, to obtain the ‘Best Linear Unbiased Estimator’ we actually have to satisfy (n+1) equations. However, we only have n unknowns, so far, which is not a very desirable condition. To rectify this we must introduce another unknown, in the form of a Lagrangian Multiplier, to balance up the system. Therefore, instead of minimising the estimation variance, we actually minimise:

 


with respect to
w1, w2, w3, w4... wn, and l. This last produces the equation:

 

S wi – 1 = 0


as is required. After the differentiation has been performed and the equations tidied up, the following system results:

 


Although this looks slightly fearsome in its complexity, if you look a little closer, you will find that most of the elements are (I hope) by now familiar to you. Consider the first equation. The right hand side merely requires the average semi-variogram value between sample 1 and the unknown area. The left hand side contains the
n+1 unknowns, wi and l, and the average semi-variogram value between sample 1 and each of the other samples in turn. All of these  terms are identical to those which we would work out for an extension or an estimation variance.


The second equation is identical to the first except that it is sample 2 which is present right along the equation. The third has sample 3 all the way along, and so on until the
nth equation with Sn all the way along. Finally, we have the necessary condition for the sum of the weights. The solution to this set of equations will produce a set of weights giving the ‘Best Linear Unbiased Estimator’ --- sometimes referred to as BLUE. This process was named Kriging by Georges Matheron after Danie Krige, who has done a tremendous amount of empirical work on weighted averages. Pronunciation of the word is a controversial topic --- I personally prefer ‘kridging’ (as in bridging). The system of equations is generally referred to as the kriging system, and the estimator produced is the kriging estimator. The variance of the kriging estimator could be found by substitution of the weights into the general estimation variance equation. However, it can be shown that the kriging variance can be written as:

 

KRIGING EXAMPLES

Earlier in this chapter we took the setup in Fig.5.1, a semi-variogram of the form g(h)=ph, and allocated a set of weights to each of the three samples. We showed that in that case the estimation variance was 0.0729pl. Now let us see if we can find the ‘best’ set of weights using the kriging system. Since we have three weights, there are four equations which in the general form are:

 


and the kriging variance would be:

 


The right hand side of the equations are
(S1,A) , (S2,A)  and (S3,A). (S1,A) is the average semi-variogram value between a line of length l and a point on the end of it. This is the definition of c(l). So is (S3,A). (S2,A) is the average semi-variogram between the line and a central point. This example was tackled in Chapter 4 and found to be c(l/2). Thus we have the right hand side of the equations and most of the variance. The left hand side of the equations is made up of the individual sample-with-sample terms. Since all of the samples in this case are supposed to be points, all of these ‘left hand side’ relationships are given by the semi-variogram model. It remains only to calculate the distances between the pairs and use the model to produce the terms. The diagonal terms, (S1,S1), (S2,S2) and (S3,S3) are all zero, since g(0) =0 by definition, (S1,S2)  is equal to (S2,S1), and is g(h) ( l/2) . So is (S2,S3) and (S3,S2). (S1,S3) is g (l) as is (S3,S1). Finally, (A,A) is F(l) by definition. Putting these together gives:

 

 

w2 γ(l/2)

+ w2 γ(l)

+ l

=

c(l)

 w1 γ(l/2)

 

+ w2 γ(l/2)

+ l

=

c(l/2)

 w1 γ(l)

+ w2 γ(l/2)

 

+ l

=

c(l)

 w1

+ w2

+ w3

 

=

 1

 

and the kriging variance is

 


Up until this point the system does not depend on the actual model for the semi-variogram. Our model was
g(h) =ph and for this example we will take p=4. For the linear semi-variogram, c(l) =pl/2, so in this case c(l) =2l. Similarly, F(l) =4l/3. Substituting in the above system:

 

 

2lw2

+ 4lw2

+ l

=

2l

(1)

 2lw1

 

+ 2lw2

+ l

=

l

(2)

 4lw1

+ 2lw2

 

+ l

=

2l

(3)

 w1

+ w2

+ w3

 

=

 1

(4)

 

and

 


Adding equations (1)  and (3)  gives

 


whilst equation (4)  gives

 


These two together show that
in this case l=0. Thus we can eliminate l from the first three equations, and divide all of them through by l. This suggests that the results --- the final values of the weights --- do not rely on the length being estimated. We have then:

 

 

2w2

+ 4w2

=

2

(5)

 2w1

 

+ 2w2

=

1

(6)

 4w1

+ 2w2

 

=

2

(7)

 

Subtracting equation (5) from equation (7) gives:

 


so that equation (6) gives:

 


Therefore,
w3 = 1/4 and w2 = 1/2. The ‘optimal’ set of weights for the problem in Fig. 5.1 is therefore (1/4, 1/2, 1/4)  and this estimator gives a kriging variance of:

 


The final result is therefore that:

the BLUE has weights of (1/4, 1/2, 1/4) and a kriging variance of  l/6.


In our previous studies of this particular setup, we found that the arithmetic mean gave an extension variance of
pl/18 and that the set of weights (1/8, 3/4, 1/8)  gave 7pl/96. To match the example above we should set p=4, so that the variances become 2l/9 and 7l/24 respectively. The kriging procedure has improved over the arithmetic mean by about 25%, this being the difference in magnitude between the extension variance and the kriging variance. The spurious set of weights which we tried earlier in this chapter give a variance almost twice that of the optimal estimator. Note that in this case, since we have used a linear semi-variogram model, the set of weights is independent of the length being estimated, but the variance is directly proportional to it. For an exercise, see if you can show that the set of weights is also independent of the slope of the semi-variogram, p. Show also that the kriging variance is directly proportional to p, which seems only sensible.

 

TWO-DIMENSIONAL EXAMPLE

Now let us return to the familiar uranium example shown in Fig. 5.2. The data --- position co-ordinates and grade values --- were given in Table 4.1. We have five samples, so there will be six equations in the kriging system. The (Si,Sj) terms on the left hand side of the equations are all ‘point-with-point’ relationships, and can be evaluated directly from the semi-variogram model. The model was spherical, with a range of influence of 100ft, a sill of 700 (p.p.m.)² and a nugget effect of 100 (p.p.m.)². The (Si,A) terms are all ‘point-with-panel’ relationships and hence are simple combinations of the H(l,b) auxiliary function. The (A,A) term is, of course, F(l,b).


The kriging system turns out to be:

 


and the kriging variance would be:

 


Solving this system of equations (by computer) gives:

 


This kriging standard deviation compares favourably with the standard error of the ‘inverse distance’ weights. The main difference in the weighting is perhaps a surprising one at first sight. Sample 2 is allocated an almost zero weight. In fact, it receives only 20% of the weight on sample 5, which is considerably farther from the block centre. This is because the kriging system automatically takes account of the relationship amongst the samples. Sample 2 provides little extra information about the block in the presence of sample 1, and to a lesser extent samples 3 and 4. To illustrate this point further, let us consider the situation in Fig. 5.3, where sample 3 was moved to the south of the block. The kriging system now becomes:

Note that the only difference between this and the previous system is in row 3 and column 3 on the left hand side. The right hand side is unchanged since we have not changed the relationship of each individual sample to the panel. The new weights are:

The weight on the third sample is now less than that of sample 2. The main movement in weight has been towards the ‘northern’ samples, 1, 2 and 5, although the largest increase is, naturally, in sample 1. The really striking change is in the value of the estimator which has dropped by 17 p.p.m.

As a third example let us consider the same sampling situation as before, but now with the block rotated through 90 degrees, as in Fig. 5.4.

 

Fig. 5.4. Block to be estimated is rotated through 90º.

 

The kriging system for this setup has the same left hand side as the first situation in Fig. 5.2. However, all of the terms on the right hand side have changed, to:

 


The weights allocated to the samples change drastically:

The estimator T* takes the value 371.9 p.p.m., and has a standard error of 10.7 p.p.m. Sample 2 has been completely screened out by sample 1, and the influence of sample 5 has also declined somewhat. The biggest surprise, perhaps, is that sample 1 no longer has anything like the importance it had in the previous two examples. There is no method other than kriging in which this shift in ‘information value’ can be quantified and utilised.

 

SUMMARY OF MAJOR POINTS

1.    We can evaluate the accuracy of any linear estimator if we have a model for the semi-variogram.

2.    We can produce the minimum variance unbiased linear estimator using the kriging technique, if we have a model for the semi-variogram.



The standard litany of the advantages of kriging can be found in numerous publications. The points of major importance are:

 

a)      Given the basic assumptions, no trend, and a model for the semi-variogram, kriging always produces the Best Linear Unbiased Estimator.

b)      If the proper models are used for the semi-variogram, and the system is set up correctly, there is always a unique solution to the kriging system.

c)      If you try to ‘estimate’ the value at a location which has been sampled, the kriging system will return the sample value as the estimator, and a kriging variance of zero. In other words, you already know that value. This is usually referred to as an ‘exact interpolator’.

d)      If you have regular sampling, and hence the same sampling/block setup at many different positions within the deposit, it is not necessary to recalculate the kriging system each time.

 

 

SIMULATED IRON ORE EXAMPLE

To round out this chapter, let us return to the simulated iron ore deposit mentioned at the end of Chapter 4. Two sets of samples had been taken. The 50 ‘random’ samples were shown in Fig. 4.14 and the regular grid in Fig. 4.17.

The regular grid actually comprises 41 samples. As before, the 400-metre-square area was divided into 50-metre-square blocks. However, this time the block values were estimated by the method of kriging. The range of influence of the semi-variogram model for this deposit was 100m, so all samples within this distance of a block were included in its estimation. The results are shown in Fig. 5.5, and once again the upper Figure in each block is the estimated value, whilst the lower is the kriging standard deviation, or standard error. For comparison, Fig. 5.6 shows the kriging solution for the case where the area is divided into 100-metre-square blocks. Notice that the kriging standard deviations in all cases are much lower than for the 50-metre blocks.

Fig. 5.5. Simulated iron ore deposit --- block averages kriged from the set of random samples and the corresponding kriging standard deviations --- 50 metre blocks.

Fig. 5.6. Simulated iron ore deposit --- block averages kriged from the set of random samples and the corresponding kriging standard deviations --- 100 metre blocks.

 

This once again bears out the principle that it is ‘easier’ to estimate large areas rather than small ones. Figure 5.7 shows the estimated values for the blocks when using the sample values from the regular grid. The kriging standard deviation for all of these blocks is 2.4%Fe. This is not markedly different from the extension standard deviation of 2.5%Fe. Perhaps our conclusion here must be that with a regular grid of this particular size, the arithmetic mean of the two ‘corner’ samples is as good an estimator as a weighted average of all samples within 100m of the block. The exterior samples would seem to be superfluous in the circumstances. This conclusion does not hold for the irregular sampling, for which some great improvements in accuracy result from the application of kriging.


Fig. 5.7. Simulated iron ore deposit --- block averages kriged from the set of regular samples --- 50 metre blocks.

 

The usual requirement in ore reserve estimation is the production of block values. However, in many other possible applications of kriging --- such as geochemistry or hydrology --- the estimation desired is in the form of ‘point’ values, or a contour map of the variable of interest. Kriging can be used to produce the close grid of values necessary to the plotting of contour maps. In fact, the procedure is very much easier than ‘area’ estimation, since all the ‘average semi-variogram values’ reduce to simple values of the semi-variogram model itself. Since all of the observations are made at specified points, the left hand side of the kriging system is ‘point-with-point’ semi-variogram values. Since the value to be estimated is also at a point, the right hand side is also ‘point-with-point’ semi-variogram values. Figure 5.8 shows the contour map produced using the 50 randomly chosen samples. The blacked-out area at the top of the map is outside the range of influence of any sample, and hence cannot be estimated.

Fig. 5.8. Simulated iron ore deposit --- contour map kriged from random samples.

Fig. 5.9. Simulated iron ore deposit --- kriged standard deviation map for the kriged contour map from random samples.


One of the advantages of kriging as an interpolation technique is that every estimate is accompanied by a corresponding kriging standard deviation. Thus, for any contour map of values, a companion map of ‘reliability’ can be produced. This is shown in Fig. 5.9. The location of the sample points can easily be seen by the concentration of the low value contours in Fig. 5.9 --- the 1%Fe and 2%Fe contours. The highest possible contour value would be 5
Ï2=7.07%Fe, which is the boundary around the blacked out area. This corresponds to trying to estimate the value at a point which is just on the range of influence away from the nearest sample.

The ‘unreliable’ areas are quite clearly outlined by the 5%Fe contour (the sample standard deviation). A standard error which is larger than the original sample standard deviation denotes a rather unreliable prediction.

Fig. 5.10. Simulated iron ore deposit --- contour map kriged from regular samples.

Fig. 5.11. Simulated iron ore deposit --- map of kriging standard deviations for map kriged from regular samples.


Figure 5.10 shows the interpolated contour map using the regular samples, and Fig. 5.11 the corresponding map of the kriged standard deviation. Note that, even though only 41 samples are available, and even though these are on a very large grid (71m), the highest contour value in Fig. 5.11 is only 4%Fe.

An additional advantage of kriging as an estimation technique is that the maps and/or calculations of the ‘standard errors’ can be produced without actually taking the samples. For example, if infill drilling were proposed on the regular grid, it is fairly obvious from Fig. 5.11 where the new samples should be taken. If a decision was taken to reduce the grid to 50m, i.e. put a hole in the centre of each 4%Fe contour, a complete new map of the resulting standard error could be drawn before setting foot in the field.

Contents Page

Next section

Isobel’s Home Page

Practical Geostatistics 2000

Courses