CHAPTER 3: The Volume -- Variance Relationship

In the previous chapters we have discussed semi-variograms, calculated experimental semi-variograms and fitted models to these as if the samples had no characteristics other than ‘position’. We have ignored the size and shape of the sample, the way in which it may have been taken and/or measured, and so on. We have effectively assumed that the sample values were located at ‘points’ within the deposit. In this chapter we will see what effect those other characteristics -- collectively called ‘support’ -- have on the sample value itself, and hence on the semi-variogram.


Let us consider the lead/zinc example which was discussed in Chapter 2. Although the cores were actually 1.52m long, we ignored this fact and calculated the experimental semi-variogram as if they were points. Suppose, however, that the cores had been sectioned into 3.04m lengths instead of 1.52m. What effect would this have on the sample values and on the semi-variogram? Table 3.1 shows the ‘borehole log’ for 1.52m and for 3.04m samples. The experimental semi-variogram was also calculated for the 3.04m cores, with the results shown in Table 3.2.

Fig. 3.1. Experimental semi-variograms constructed on various lengths of core --- lead/zinc example.

Figure 3.1 shows the ‘new’ experimental semi-variogram alongside the 1.52m one for comparison. For good measure, both tables and the figure also show the resulting values for cores of 4.56m. It can be seen immediately that the 3.04m semi-variogram is always lower than the 1.52m, and the 4.56m one is considerably lower than both.

Let us return to the basic assumptions of Geostatistics and try to explain this behaviour. We must recall two facts from Chapter 1. The first is the basic definition of the semi-variogram: it is half the average squared difference in grade between two samples a given distance apart. If those samples were ‘points’ then the grade is assumed to be measured ‘at a point’; if they are cores then the grade measured is the average grade over the core length. Thus we are not comparing two individual grades, e.g. $g_1$ and $g_2$; we are comparing two average grades $\bar{g}_1$ and $\bar{g}_2$. We cannot reasonably expect the average grade over 1.52m of core to have the same behaviour as the grade of a ‘teaspoonful’ of ore. Similarly, if we take the grade and average it over 3.04m we would expect different behaviour again. The question is how to characterise this difference in behaviour.

The second fact to recall from Chapter 1 is that the sill of the semi-variogram (if one exists) is equal to the ordinary sample variance. If we are dealing with ‘point’ samples, then we can estimate the sample variance $s^2$ and compare this value with the sill $C$. That is, $C=s^2$ (ideally).

Now, if the samples are cores of a certain length $l$ (e.g. 1.52m) and we measure the average grade over that length, then we have smoothed out some of the ‘point’ variation. We have replaced a large number of individual ‘points’ with one average value. The variance of the averages will therefore be less than the variance of the ‘points’, so that

$$C_l < C$$

In a similar way $C_{3.04}$ will be less than $C_{1.52}$ and so on. If we have a model for the semi-variogram for the point samples we could produce the model for any other size of sample, by employing the mathematical relationship between the point model ($\gamma$) and the model for samples of length $l$ ($\gamma_l$). Since we are only using a limited number of simple models for the point semi-variogram, it is not too difficult to state this relationship.
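The smoothing effect of averaging can be checked with a few lines of arithmetic. The following is a minimal sketch, not a calculation from the lead/zinc data: the grades in `short_cores` are made up for illustration, and the helper name `composite` is hypothetical. It composites adjacent 1.52m sections into 3.04m sections and compares the two variances.

```python
from statistics import pvariance

# Made-up grades (%) for ten consecutive 1.52 m core sections; illustrative only.
short_cores = [4.1, 6.3, 9.8, 7.2, 3.5, 2.9, 5.6, 8.4, 10.1, 6.7]


def composite(values, k):
    """Average each run of k adjacent equal-length sections into one longer core."""
    usable = len(values) - len(values) % k      # drop an incomplete final run
    return [sum(values[i:i + k]) / k for i in range(0, usable, k)]


long_cores = composite(short_cores, 2)          # 3.04 m sections

var_short = pvariance(short_cores)
var_long = pvariance(long_cores)
print(f"variance of 1.52 m sections: {var_short:.2f}")
print(f"variance of 3.04 m sections: {var_long:.2f}")
```

Because every short section falls in exactly one composite, the variance of the composites can never exceed that of the short sections; the part that disappears is the variation within each composite.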


If we have a linear model for the point samples, $\gamma(h)=ph$, where $p$ is the slope of the semi-variogram line, then the semi-variogram for samples of length $l$ is given by:

$$\gamma_l(h) = p\left(h - \frac{l}{3}\right), \qquad h \ge l$$

This is illustrated in Fig. 3.2, with the point model for comparison.

 

Fig. 3.2. Regularisation of a linear semi-variogram by core lengths.

 

In practice, we generally have an experimental semi-variogram for samples of length $l$, that is $\gamma_l^*$, and we need to find the model for the point samples ($\gamma$) for use in the later chapters. Since the slope, $p$, of the core model is the same as that of the point model, simply measuring the slope of our experimental $\gamma_l^*$ will give a value for $p$, and hence the point model, $\gamma$. One complication arises if the point model is actually a linear model plus a nugget effect. Taking core samples lowers the line, but a nugget effect will raise it again. From the above formula, if no nugget effect is present, extending the line of the core model back until it intersects the semi-variogram axis should produce an intercept of $-pl/3$. Once an estimate of $p$ has been made this can easily be checked, and if necessary a nugget effect $C_0$ added to the model.
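The intercept check can be sketched as follows. The function names and the numbers are hypothetical, and reading the excess over $-pl/3$ as a nugget effect at the core support is an assumption of this sketch rather than a rule stated in the text.

```python
def expected_core_intercept(slope, core_length):
    """Intercept of the regularised linear model p*(h - l/3) with no nugget effect."""
    return -slope * core_length / 3.0


def apparent_nugget(slope, observed_intercept, core_length):
    """Amount by which the observed intercept sits above -p*l/3, floored at zero."""
    excess = observed_intercept - expected_core_intercept(slope, core_length)
    return max(0.0, excess)


# Hypothetical readings taken from a plotted experimental semi-variogram.
p = 0.6            # slope of the fitted line, (%)^2 per metre
l = 1.5            # core length in metres
intercept = 0.2    # where the fitted line cuts the semi-variogram axis

print("intercept expected without nugget:", expected_core_intercept(p, l))
print("apparent nugget effect:", apparent_nugget(p, intercept, l))
```

If the observed intercept is close to $-pl/3$, the second value is near zero and no nugget term is needed.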


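Now suppose our deposit followed an exponential model, with sill $C$ for ‘point’ samples, i.e.

$$\gamma(h) = C\left[1 - \exp\left(-\frac{h}{a}\right)\right]$$

For cores of length $l$, with sill $C_l$, the theoretical model becomes:

$$\gamma_l(h) = C_l - C\,\frac{a^2}{l^2}\left(e^{l/a} - 1\right)\left(1 - e^{-l/a}\right)e^{-h/a}, \qquad h \ge l$$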

with a rather more complex form for distances less than the length of the core ($h<l$). Since we are unlikely to have values of an experimental semi-variogram for distances less than the sample length, the form of it seems rather academic. Figure 3.3 shows a point exponential model and the corresponding ‘regularised’ curve for a sample of length $l$. It can easily be seen that $C_l$ is lower than $C$. In fact:

$$C_l = C\,\frac{2a}{l}\left[1 - \frac{a}{l}\left(1 - e^{-l/a}\right)\right]$$

so that a sample which was, say, one-fifth of the range of influence, would produce a sill:

$$C_l = C \times 2 \times 5 \times \left[1 - 5\left(1 - e^{-0.2}\right)\right] = 0.94\,C$$

That is, the new sill will only be 94% as high as that of the point model. It will also be noticed from Fig. 3.3 that extending the ‘linear’ part of the core model (close to the origin) until it intersects the sill produces an estimate of the range of influence for the cores, $a_l$, which is longer than that of the points, $a$.

 

Fig. 3.3. Regularisation of an exponential semi-variogram by core lengths.


In fact, $a_l = a + l$. This seems quite sensible if you remember that cores will have to be just that bit further apart before they become independent.
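The size of the sill reduction for an exponential model is easy to tabulate. This is a minimal sketch assuming the relation $C_l/C = (2a/l)\,[1 - (a/l)(1 - e^{-l/a})]$, which reproduces the 94% figure quoted for a core one-fifth of the range of influence; the function name is hypothetical.

```python
import math


def exponential_sill_ratio(l_over_a):
    """C_l / C for an exponential point model regularised over a core of length l."""
    r = 1.0 / l_over_a                          # a / l
    return 2.0 * r * (1.0 - r * (1.0 - math.exp(-l_over_a)))


for fraction in (0.1, 0.2, 0.5, 1.0):
    ratio = exponential_sill_ratio(fraction)
    print(f"l/a = {fraction:.1f}: C_l/C = {ratio:.3f}")
```

The ratio falls steadily as the core gets longer relative to the range, which is the volume-variance effect expressed for a single sill.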


The above arguments and formulae apply to the situation where you know the ‘point’ model and you wish to find the ‘regularised’ model. In practice the situation is generally reversed. We usually have an experimental semi-variogram which has been calculated on cores of a given length, and we need to find the point model for use in the estimation techniques. Suppose, then, that we have a graph of the experimental semi-variogram $\gamma_l^*$, and we have decided that our deposit follows an exponential model. The first step is to guess the two parameters $C_l$ and $a_l$. Since the model is exponential, the sill $C_l$ will be greater than most of the experimental points on the graph. Having guessed $C_l$, produce a line up through the first two or three points on the graph until it cuts the sill. This will give a first guess at $a_l$. We know that $a=a_l-l$, so we have a first estimate of $a$. Using this in the above formula for $C_l$, we can reverse the equation and produce a value for $C$, the point sill.

We now have guesses at the values of $a$ and $C$ which govern the point model. The next question is whether these are ‘good’ guesses. We have already stated that if we know the point model, we can produce the corresponding model for cores of any given length, i.e. $\gamma_l(h)$. If our guesses are good ones then this theoretical model for $\gamma_l(h)$ should match the experimental semi-variogram, $\gamma_l^*(h)$. Substituting values for $h$, $l$, $a$ and $C$ produces a smooth curve like the lower one in Fig. 3.3, and this can be compared to the data. If necessary, $a$ and $C$ can be altered until the ‘model’ values become a good fit to the ‘data’ values. In effect, this is the same procedure as was used in Chapter 2, but with an additional consideration of the sample length.
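The guess-and-check procedure can be sketched in code. Everything numeric here is hypothetical, as are the function names. The regularised curve used for $h \ge l$ is obtained by averaging the point exponential model over two collinear cores; that is an assumption of this sketch, not necessarily the form printed in the original.

```python
import math


def sill_ratio(a, l):
    """C_l / C for an exponential point model (range parameter a, core length l)."""
    return (2.0 * a / l) * (1.0 - (a / l) * (1.0 - math.exp(-l / a)))


def point_parameters(core_sill_guess, core_range_guess, l):
    """Turn guesses at (C_l, a_l) into point-model parameters (a, C)."""
    a = core_range_guess - l                    # a = a_l - l
    c = core_sill_guess / sill_ratio(a, l)      # reverse the sill relation
    return a, c


def core_model(h, a, c, l):
    """Regularised exponential model, valid only for h >= l."""
    if h < l:
        raise ValueError("this sketch only covers h >= l")
    damp = (a / l) ** 2 * (math.exp(l / a) - 1.0) * (1.0 - math.exp(-l / a))
    return c * sill_ratio(a, l) - c * damp * math.exp(-h / a)


# Hypothetical guesses read from a graph; units are arbitrary.
l = 2.0
a, c = point_parameters(core_sill_guess=9.0, core_range_guess=12.0, l=l)
print(f"point model: a = {a:.1f}, C = {c:.2f}")
for h in (2.0, 4.0, 8.0, 16.0, 32.0):
    print(f"h = {h:5.1f}  model value = {core_model(h, a, c, l):.2f}")
```

The printed model values would be set beside the experimental ones at the same distances, and the two guesses adjusted until the agreement is acceptable.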


Let us now turn to the most common model, the spherical model. This will be influenced in the same sort of way as the exponential. The sill for the cores will be lower than that for the ‘points’, and:

$$C_l = C\left(1 - \frac{l}{2a} + \frac{l^3}{20a^3}\right), \qquad l \le a$$

The formula for the semi-variogram of the cores is extremely complex because of the ‘discontinuity’ in the model but an example is shown in Fig. 3.4. A subroutine to evaluate the formula has been published. If the calculations are to be done by hand (or hand calculator) then it is easier to use tables such as Table 3.3.
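Where neither the published subroutine nor a table is to hand, the regularised spherical model can be approximated numerically. The sketch below is not that subroutine: it assumes the two cores lie end to end along the same borehole, replaces each core by `n` equally spaced points, and takes the average semi-variogram between the cores minus the average within one core. All function names are hypothetical.

```python
def spherical(h, a, c=1.0):
    """Point spherical semi-variogram with range a and sill c."""
    h = abs(h)
    if h >= a:
        return c
    return c * (1.5 * h / a - 0.5 * (h / a) ** 3)


def mean_gamma(offset, l, a, c=1.0, n=50):
    """Average semi-variogram between two collinear cores whose starts are offset apart.

    Each core is represented by the midpoints of n equal slices.
    """
    pts = [(i + 0.5) * l / n for i in range(n)]
    total = sum(spherical(offset + u - v, a, c) for u in pts for v in pts)
    return total / (n * n)


def regularised_spherical(h, l, a, c=1.0, n=50):
    """Approximate gamma_l(h): between-core average minus within-core average."""
    return mean_gamma(h, l, a, c, n) - mean_gamma(0.0, l, a, c, n)


l = 1.0
a = 8.5 * l          # a/l = 8.5 with a sill of 1
for k in (1, 2, 3):
    print(f"h/l = {k}: {regularised_spherical(k * l, l, a):.3f}")
```

Values produced this way should fall close to tabulated ones for the same $a/l$, although small differences from rounding or from the method of calculation may remain.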

Fig. 3.4. Regularisation of a spherical semi-variogram by core lengths.

This table shows the form of the ‘regularised’ semi-variogram for a core of length $l$ if the original point semi-variogram had a range of influence $a$, and a sill of 1. The use of this table is best illustrated by an example. We can now return to the example shown in Fig. 3.1 of the zinc values measured over core lengths of 1.52m. In Chapter 2 we guessed that the sill lay at about 10.5(%)². This is our first approximation of $C_l$. Producing the line through the first two points on the experimental semi-variogram gives $2a_l/3=9.6$m (approximately). That is, $a_l=14.4$m, and hence $a=a_l-l=12.9$m. Using the formula:

$$C = \frac{C_l}{1 - \dfrac{l}{2a} + \dfrac{l^3}{20a^3}} = \frac{10.5}{0.941} = 11.2\,(\%)^2$$

The first estimates, then, for the parameters of the point model are $a=12.9$m and $C=11.2$(%)². We must find the row in Table 3.3 which corresponds to our value of $a/l$, i.e. 8.5. The entries along this line correspond to multiples of the sample length $l$. That is, $h/l=1$ means $h=1.52$m, $h/l=2$ means $h=3.04$m and so on. We see that at $h/l=1$ the table gives a value of 0.116. This would be for a semi-variogram with a sill of 1. Since we have a sill of 11.2(%)², the value we require is $0.116 \times 11.2 = 1.30$(%)². This is now a ‘model’ value for the semi-variogram of cores of length 1.52m and can be plotted on the graph next to the ‘observed’ value of 1.33(%)².

A second point on the model would be at $h=3.04$m, i.e. $h/l=2$. The table gives a value of 0.288 for $C=1$, so that our model value is $0.288 \times 11.2 = 3.23$(%)². This can be compared with the experimental value of 3.09(%)². This process is repeated until we have a model value to compare with each observed value. The resulting model curve has been plotted in Fig. 3.5. This seems to be rather a good fit to the experimental semi-variogram, if we accept the sill at 10.5(%)².
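The arithmetic of this worked example can be retraced in a few lines. The sketch assumes the spherical sill relation $C_l = C\,(1 - l/2a + l^3/20a^3)$, which is consistent with the figures quoted; the variable names are hypothetical, and the two table entries are the ones quoted in the text.

```python
core_length = 1.52        # l, in metres
core_sill = 10.5          # first guess at C_l, in (%)^2
tangent_hits_sill = 9.6   # metres; equals 2*a_l/3 for a spherical model

core_range = 1.5 * tangent_hits_sill            # a_l
point_range = core_range - core_length          # a = a_l - l

ratio = core_length / point_range
sill_factor = 1.0 - ratio / 2.0 + ratio ** 3 / 20.0    # C_l / C, valid for l <= a
point_sill = core_sill / sill_factor                    # C

# Table 3.3 entries quoted in the text for a/l = 8.5 and a sill of 1.
table_entries = {1: 0.116, 2: 0.288}
model = {k: v * round(point_sill, 1) for k, v in table_entries.items()}

print(f"a_l = {core_range:.1f} m, a = {point_range:.1f} m, C = {point_sill:.1f} (%)^2")
for k, value in model.items():
    print(f"h = {k * core_length:.2f} m: model value = {value:.2f} (%)^2")
```

The point sill is rounded to one decimal place before scaling the table entries, as is done in the hand calculation.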

Fig. 3.5. Fitted regularised model to the lead/zinc example --- 1.52m cores.

Adjustments could be made if the sill was thought to be too low, by raising C and a. Suppose we accept this ‘point’ model with a=12.9m and C=11.2(%)². We can run a secondary check by comparing the models for core lengths 3.04m and 4.56m. For the former, a/l=4.25 so that we must interpolate in the table between a/l=4.00 and a/l=4.50. Linear interpolation is generally sufficient for this sort of exercise. Figure 3.6 shows the experimental and model curves for each sample length, and the point model for comparison.

Fig. 3.6. Fitted models to the lead/zinc example --- 3.04m and 4.56m cores and the ‘point’ model.

The model seems to be a good fit to the 3.04m semi-variogram, especially to the first four points. However, after the first point on the 4.56m semi-variogram the model here is consistently considerably higher than the experimental semi-variogram until h is about 41m. This could perhaps be neglected in view of the fact that each of these experimental values is calculated on 15 or fewer pairs. All in all, the spherical model as estimated seems to be a pretty good fit.

 

VOLUME--VARIANCE CALCULATIONS

This process of the semi-variogram changing with different ‘support’ is usually known in the literature as ‘regularisation’ --- on the basis that the samples get more regular as the sample size increases. We have seen that we can handle experimental semi-variograms for core samples, and still derive the supposed point model. However, this leads us on to another problem of the ‘volume--variance’ relationship and the influence of sample size on the sort of distribution encountered. Suppose at the pre-feasibility stage of investigating a deposit the management requests a grade/tonnage calculation.

That is, given an economic cutoff grade (or list thereof) can we evaluate (i) the tonnage of ore in the deposit which is above cutoff and (ii) the average grade of that ore? Let us take an example to illustrate the problem which arises. A hydrothermal tin vein has been sampled by means of nine development drives approximately 100ft apart in the plane of the lode. Chip samples are taken every 10ft along these drives. The sampling setup is shown in Fig. 3.7.

Fig. 3.7. Typical sampling situation in Cornish tin example.

These chip samples may be considered as ‘points’ since they have a very small volume. Figure 3.8 shows a histogram of the 2730 chip samples taken from the development drives in this lode. Suppose we now specify a ‘cutoff grade’ of 25lb/ton for this lode. The histogram shows that about 44% of the chip samples lie below 25lb/ton. We could (possibly) make the statement that we therefore believe that 44% of the ore in the lode lies below 25lb/ton.

Fig. 3.8. Histogram of chip samples taken from the drives in the cassiterite vein.

Now, the usual method of estimating the value in the stopes is to delineate a block (say 125ft long) between the drives and allocate to that block the average of all the peripheral development samples. It is this estimate which determines whether a stope block enters ‘reserves’ or not. Figure 3.9 shows the corresponding histogram of the estimates of 125ft by 100ft stoping blocks, i.e. the averages of the drive samples over two lengths of 125ft each.

Fig. 3.9. Histogram of estimates of stope values in the cassiterite vein.

We have seen from the previous exercise that we expect averages over lengths to be somewhat less variable than ‘point’ samples. This is adequately borne out by the behaviour of these estimates. Whereas the point values range up to 300lb/ton or more, the drive averages seldom exceed about 150lb/ton. Whilst 44% of the point samples lie below 25lb/ton something less than 8.5% of the ‘block estimates’ do so. Should we now say that 8.5% of the ore lies below 25lb/ton? What we really need to do is to redefine the phrase ‘of the ore’. In the first case what we meant was that if the deposit were divided into chip samples, we could reject 44% of these as being below cutoff. In the second it was 8.5% of the drive averages below cutoff. That is, if the deposit were divided into pairs of 125-ft strips 100ft apart, 8.5% of these would be below cutoff. Or alternatively, by my estimate 8.5% of the stope blocks would be below cutoff. In other words, we cannot define how much ore we have after selection unless we define a unit of selection in terms of size and shape. The real question is ‘how many 125 by 100ft stope panels are below cutoff?’ To answer this question we must determine what sort of distribution these panels would follow. The full answer will depend on (i) the distribution of the original samples and (ii) the semi-variogram of the deposit.

Let us make a general statement of the problem and see how it leads to a solution. The original sample data has a ‘support’ of, say, $l$; it has a semi-variogram $\gamma_l(h)$ with a sill $C_l$; it has a distribution of grades which can to some extent be characterised by the histogram and which has a mean $\bar{g}_l$ and variance $C_l$. The panels or blocks being estimated will have a support of, say, $v$; a semi-variogram $\gamma_v(h)$ with sill $C_v$; a distribution with mean $\bar{g}_v$ and variance $C_v$. The first thing we can say is that $\bar{g}_l$ and $\bar{g}_v$ should be the same, since both describe the average grade of ore over the whole deposit. Thus we can replace them both by $\bar{g}$, the average of ‘point’ samples. The second thing we can say is that if we have a model for the point semi-variogram we can state the relationship between the point sill $C$ and the ‘core’ sill $C_l$, and between $C$ and $C_v$ for any defined volume $v$. Suppose we take the simple example of a core of length $l$ which can be represented as a straight line (since the diameter is very much smaller than the length).


Fig. 3.10. Derivation of the variance of grades within a ‘line’ segment.

This is illustrated in Fig. 3.10. Consider two points on this line, $M$ and $M'$. We could calculate from the model semi-variogram the ‘difference’ between the grades at these two points. Now suppose we took all possible pairs $(M,M')$ which exist within the line, including the case when $M=M'$. In this way we could get a measure of the ‘variability’ of the grades within the line. If we take the average of the semi-variogram values $\gamma(M-M')$ over all possible pairs, then we obtain the variance of the grades within the length $l$.

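This is the variance which is removed from the system if we only consider the average grade over the length $l$, i.e. the difference between the point sill and the regularised sill, $C-C_l$. Mathematically:

$$C - C_l = F(l) = \frac{1}{l^2}\int_0^l\!\!\int_0^l \gamma\left(|x - x'|\right)\,dx\,dx'$$

where $F(l)$ defines the variance of grades within the length $l$, and $x$ and $x'$ are the positions of $M$ and $M'$ along the line. Although this looks fearsome, it reduces to:

$$F(l) = \frac{pl}{3}$$

for the linear model,

$$F(l) = C\left\{1 - \frac{2a}{l}\left[1 - \frac{a}{l}\left(1 - e^{-l/a}\right)\right]\right\}$$

for the exponential model, and for the spherical model:

$$F(l) = C\left(\frac{l}{2a} - \frac{l^3}{20a^3}\right), \qquad l \le a$$
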
These, of course, correspond exactly with the difference between the point and regularised semi-variograms. Now suppose we want to consider a two-dimensional panel such as that shown in Fig. 3.11.
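The closed forms for $F(l)$ can be checked against the definition, that is, by averaging the semi-variogram over many pairs of points on the line. This is a minimal sketch with hypothetical function names. The expressions used are $pl/3$ (linear), $C\{1 - (2a/l)[1 - (a/l)(1 - e^{-l/a})]\}$ (exponential) and $C(l/2a - l^3/20a^3)$ (spherical, $l \le a$); the branch for $l > a$ is derived here and is not taken from the text.

```python
import math


def f_linear(l, p):
    return p * l / 3.0


def f_exponential(l, a, c):
    r = a / l
    return c * (1.0 - 2.0 * r * (1.0 - r * (1.0 - math.exp(-l / a))))


def f_spherical(l, a, c):
    if l <= a:
        return c * (l / (2.0 * a) - l ** 3 / (20.0 * a ** 3))
    # Assumption of this sketch: form for a line longer than the range.
    return c * (1.0 - 3.0 * a / (4.0 * l) + a ** 2 / (5.0 * l ** 2))


def f_numeric(gamma, l, n=200):
    """Average gamma over all pairs of n midpoints along the line, M = M' included."""
    pts = [(i + 0.5) * l / n for i in range(n)]
    return sum(gamma(abs(u - v)) for u in pts for v in pts) / (n * n)


# Spherical parameters quoted earlier in the chapter for the lead/zinc example.
a, c, l = 12.9, 11.2, 1.52


def point_model(h):
    return c if h >= a else c * (1.5 * h / a - 0.5 * (h / a) ** 3)


print(f"analytic F(l) = {f_spherical(l, a, c):.3f}")
print(f"numeric  F(l) = {f_numeric(point_model, l):.3f}")
print(f"C - F(l)      = {c - f_spherical(l, a, c):.2f}")
```

The last line should come out close to the core sill of 10.5(%)² that the fitting started from.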


Fig. 3.11. Derivation of the variance of grades within a panel.

The $F$ function now becomes $F(d,b)$ to show that it has two dimensions. This would be a quadruple integral, since the points $M$ and $M'$ can now move throughout the whole panel. The formulae get complicated, but not impossible, and as an example of the type of values encountered, Table 3.4 has been produced. This table shows the $F(d,b)$ function for a spherical model with range equal to 1 and sill of 1. This is a ‘standardised’ spherical model, in the same sense as a ‘Standard’ Normal distribution. This table can be used to produce the corresponding value of the $F$ function for any spherical model, as follows:

 

i. divide the lengths of the sides of the panel by the range of influence $a$;

ii. read off the corresponding entry in the table;

iii. multiply this value by $C$.

 

 

Examples of such calculations are given later in this chapter. Similar tables may be produced for the linear and exponential models.


In three dimensions the problem of calculating the $F(l,b,d)$ function analytically appears to be insurmountable. It is necessary to resort to a numerical approximation using a computer. The easiest way to do this is to go back to the definition of the $F$ function: we take pairs of points $(M,M')$ within the block; consider all such pairs; calculate the semi-variogram value between $M$ and $M'$; sum all these values and average them. This gives the $F$ value. Now, suppose we do not take all of the pairs but only a few ‘representative’ ones. That is, instead of considering the block as an infinite number of points we consider it to be a ‘grid’ containing a finite number of points, say on a 5 by 5 by 5 grid. Some authors suggest taking ‘randomly’ distributed points, but there seems little sense in that. Using such a method, Table 3.5 was produced for the ‘standardised’ spherical model. In order to produce only one table, it has been necessary to insist that two sides of the block have the same length. This table is used in the same way as the two-dimensional one.
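The grid approximation is short to program. The sketch below is one possible reading of the method, not the routine used to build the tables: it assumes the grid points sit at the centres of equal slices along each side and that every ordered pair, including $M=M'$, enters the average. Function names and the example block are hypothetical.

```python
from itertools import product
import math


def spherical(h, a=1.0, c=1.0):
    """Point spherical semi-variogram with range a and sill c."""
    if h >= a:
        return c
    return c * (1.5 * h / a - 0.5 * (h / a) ** 3)


def f_block(sides, a=1.0, c=1.0, n=5):
    """Approximate F for a rectangular block using n grid points along each side."""
    axes = [[(i + 0.5) * s / n for i in range(n)] for s in sides]
    points = list(product(*axes))
    total = sum(spherical(math.dist(m, m2), a, c) for m in points for m2 in points)
    return total / len(points) ** 2


# Hypothetical block with sides 0.25, 0.25 and 0.125 of the range of influence.
standardised = f_block((0.25, 0.25, 0.125))
print(f"F for the standardised model: {standardised:.3f}")

# Dividing the sides by the range and multiplying by the sill gives the same
# answer as working in the original units.
print(f"F for a = 400, C = 25: {f_block((100.0, 100.0, 50.0), a=400.0, c=25.0):.3f}")
```

Passing two side lengths instead of three gives the corresponding approximation for a two-dimensional panel. A finer grid changes the result slightly, so values will not necessarily agree with a published table to the last digit.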

 

GRADE/TONNAGE CURVES

So, we now know how to calculate the function $F$ for one, two and three dimensions, and hence can state the difference between the ‘point’ variance and the ‘regularised’ variance of regular shaped areas and volumes. This will give us a numerical quantity for the reduction in the variance, but unless we make some assumptions about the distribution of the samples, we cannot actually quantify the change in the ‘tonnage above cutoff’ and so on. There are two ways to approach the problem:

i. Assume that the histogram of the samples represents the whole deposit accurately.

ii. Assume that the histogram represents a set of samples from the whole deposit, and as such contains some random variation from the ‘population’ distribution.


The first approach declares that the samples are ‘typical’ of the whole deposit, and leads to graphical anamorphisms and transfer functions. The second approach declares the belief that if we could measure the grade at every point in the deposit we would end up with a smooth curve of a fairly simple form. This is a much simpler approach, and generally seems to be sufficient for most deposits.

To start with a simple example, let us consider an iron ore deposit which is known to follow a Normal distribution with a mean of 48%Fe and a standard deviation of 5%Fe. This distribution has been established on samples small enough to be called ‘points’. We also know that the deposit follows a point semi-variogram model which is spherical with a range of influence of 400ft. Now, suppose that the mine plan is to be constructed on blocks which are 100ft by 100ft by 50ft. What will the distribution of these blocks look like? The first thing we can say is that it will probably be Normally distributed. It will certainly have the same mean (48%Fe) as the ‘points’. The only change will be in the standard deviation of the distribution. We need to evaluate the function $F(100,100,50)$ for a spherical model with $a=400$ and $C=25$. To use Table 3.5 we must ‘standardise’ the situation so that the range of influence becomes 1. That is, $F(100,100,50)$ for $a=400$ is the same as $F(0.25,0.25,0.125)$ for $a=1$. Table 3.5 gives a value of 0.209 for these arguments, but this is for a model whose sill is 1. For our model the required value is $0.209 \times 25 = 5.225$(%Fe)². This is the difference between the point variance and the block variance. Therefore the variance of the block values will be $25-5.225=19.775$(%Fe)², leading to a block standard deviation, $s_v$, of 4.45%Fe. This is slightly over 10% less than the point standard deviation, as would be expected with such a ‘small’ block. Thus we have two distributions to be considered, both Normal, as follows: