Tutorial
Session Five – Universal Kriging
The example session with EcoSSe described in this part and in Part 1 is
intended to familiarise the user with the package. It illustrates one possible
set of analyses which may be carried out. One of the most neglected aspects of
statistical analysis, especially of spatial data, is the purely visual
assessment of the sample data. This session takes you through the following
sequence of analyses:
• Cross-validation of the semi-variogram model
• Kriging a grid of point values for mapping
There
are many other facilities within the package, which are given as alternative
options on the menus. This part of the documentation assumes that you have
worked through Tutorial Four.
Cross validation of the semi-variogram model
For
this Tutorial, we have decided to continue with some geostatistical estimation
using the model which we have fitted.

As
you can see, I have chosen the option to
. There
is a bit of confusion in the literature in the naming of the process. Some
authors call this jack-knifing. This
nomenclature is misleading, since the procedure bears little relationship to
what statisticians would expect by jack-knifing. Other authors use the two
words hyphenated or as a single word. We have chosen the above form to emphasise
the meaning of the procedure. We attempt to validate
our semi-variogram by dropping out each sample value and (cross) estimating the
value at that location from the neighbouring samples. We then compare the
estimated value with the actual value, and the difference between them with the
supposed geostatistical error.
There
are different ways of comparing the actual error with the Kriging error. We
have chosen a simple method by calculating the ratio between the two --- i.e.
actual error divided by Kriging standard deviation. If certain basic
assumptions are satisfied, and we have
chosen the correct semi-variogram model, these (error) statistics should
average zero and have a standard deviation of one. We use the mnemonic XVAL for cross validation. Running this option will also
give you an idea of how long Kriging is likely to take on your computer.
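The procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not EcoSSe's own code: it uses Ordinary Kriging (no trend), uses every remaining sample rather than a search radius, takes the error as actual minus estimated, and runs on made-up coordinates and values. The function names and the semi-variogram parameters are hypothetical.

```python
import numpy as np

def spherical(h, nugget, sill, a):
    """Spherical semi-variogram with range of influence a; zero at h = 0."""
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / a, 1.0)
    g = nugget + sill * (1.5 * r - 0.5 * r**3)
    return np.where(h > 0.0, g, 0.0)

def ordinary_krige(xy, z, target, gamma):
    """Estimate and Kriging standard deviation at `target` (no trend)."""
    n = len(z)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))          # last row/column: weights sum to one
    A[:n, :n] = gamma(d)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = gamma(np.linalg.norm(xy - target, axis=1))
    sol = np.linalg.solve(A, b)
    weights, mu = sol[:n], sol[n]        # mu is the Lagrange multiplier
    variance = weights @ b[:n] + mu
    return weights @ z, np.sqrt(max(variance, 0.0))

def cross_validate(xy, z, gamma):
    """Drop each sample in turn and return the error statistics."""
    stats = np.empty(len(z))
    for i in range(len(z)):
        keep = np.arange(len(z)) != i
        est, sd = ordinary_krige(xy[keep], z[keep], xy[i], gamma)
        stats[i] = (z[i] - est) / sd     # actual error / Kriging std deviation
    return stats

# Made-up data and model parameters, for illustration only.
rng = np.random.default_rng(0)
xy = rng.uniform(0.0, 100.0, size=(30, 2))
z = 2000.0 + rng.normal(0.0, 50.0, size=30)
gamma = lambda h: spherical(h, 500.0, 2000.0, 60.0)

stats = cross_validate(xy, z, gamma)
print("mean error statistic:", stats.mean())
print("standard deviation:  ", stats.std(ddof=1))
```

With a well-chosen model the two printed figures should be close to zero and one respectively.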
EcoSSe remembers which variables you were studying. You can also come to this
routine directly, without going through the whole procedure to get this far. In
that case you will have to select the variables in the same way that you did
for the semi-variogram analysis.

To
carry out cross validation --- which includes kriging the estimates --- you
need a semi-variogram model. In this Tutorial we have already fitted a model,
so EcoSSe
will remember this and offer it to you as the base model. However, for kriging
there are one or two extra parameters which need to be defined. If you have a
significant trend (as we have with the Wolfcamp) that should be defined so that
the kriging can allow for it. In the presence of trend, we use Universal
Kriging.
It
is worth taking a moment to discuss the inclusion of “trend” in the kriging
system. We found that a Quadratic Trend Surface was the most significant one
earlier on in this run. However, that trend
was a surface which described the whole study area. When we start Kriging, we
need to know what sort of trend exists on
the scale of estimation. For example, the extent of the WOLFCAMP study area is around 250 miles. However, we
will not use all of the samples in each individual estimation. In fact, EcoSSe
will suggest a search radius of 60 miles, since that is our range of influence.
If a high order of trend is fitted to a large area, we can often drop one or
two orders when working at the estimation scale. We could have fitted our
semi-variogram model to the (say) Quadratic residuals - on the large scale - and still decided to use only a Linear trend
at the “local” scale.
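To make the role of the local trend order concrete, here is a minimal sketch of how polynomial trend terms can enter the Kriging equations. It is an illustration only, assuming a Spherical model and made-up coordinates. The function names are hypothetical and EcoSSe's internal formulation is not documented in this Tutorial. Order 0 gives Ordinary Kriging, order 1 a Linear trend and order 2 a Quadratic trend.

```python
import numpy as np

def spherical(h, nugget, sill, a):
    """Spherical semi-variogram with range of influence a; zero at h = 0."""
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / a, 1.0)
    return np.where(h > 0.0, nugget + sill * (1.5 * r - 0.5 * r**3), 0.0)

def drift_terms(xy, order):
    """Polynomial trend functions of the coordinates, without the constant."""
    x, y = xy[:, 0], xy[:, 1]
    cols = []
    if order >= 1:
        cols += [x, y]
    if order >= 2:
        cols += [x * x, x * y, y * y]
    return np.column_stack(cols) if cols else np.empty((len(xy), 0))

def universal_krige_weights(xy, target, gamma, order):
    """Weights which are unbiased for any trend of the given order."""
    n = len(xy)
    F = np.column_stack([np.ones(n), drift_terms(xy, order)])
    f0 = np.concatenate([[1.0], drift_terms(target[None, :], order)[0]])
    p = F.shape[1]
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    A = np.zeros((n + p, n + p))
    A[:n, :n] = gamma(d)
    A[:n, n:] = F                        # one extra constraint per trend term
    A[n:, :n] = F.T
    b = np.concatenate([gamma(np.linalg.norm(xy - target, axis=1)), f0])
    return np.linalg.solve(A, b)[:n]

# Made-up sample layout; the values follow an exact linear trend.
rng = np.random.default_rng(1)
xy = rng.uniform(0.0, 100.0, size=(12, 2))
z = 2.0 + 3.0 * xy[:, 0] - xy[:, 1]
target = np.array([40.0, 55.0])
gamma = lambda h: spherical(h, 100.0, 1000.0, 60.0)

w = universal_krige_weights(xy, target, gamma, order=1)
print("sum of weights:", w.sum())
print("estimate:", w @ z, " true value:", 2.0 + 3.0 * 40.0 - 55.0)
```

Each extra trend term adds one more constraint on the weights, which is why a higher order needs more samples inside the search circle.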
Always bear in mind that when there is a trend in the original sample data, all
parameters refer to the semi-variogram of the residuals. Although the values in the WOLFCAMP area are obviously anisotropic, the
semi-variogram we fitted to the residuals
is isotropic. The lower boxes in the dialog allow you to specify simple
geometric anisotropy. Click on the
button and you will get the following
information:

Close
the box (
)when
you have seen enough and click on
to get the routine to accept the
semi-variogram model for kriging. Now we have variables to study and a
semi-variogram model which tells the software how the values are related to one
another. For cross validation, we will take each sample in turn and remove it
from the data set. The neighbouring samples will be used to produce an estimate
at this location. We can then compare estimated value with the actual value
found in the sample at that location.
Before we can go any further, we need to define the “neighbourhood”. That is,
how far do we want the software to search for samples to be included in the
estimation process? Since we have an “isotropic” semi-variogram model, it seems
sensible
to select an isotropic search radius. Since we have a Spherical model, EcoSSe
will suggest that we use the range of influence of the model as a default
search radius.
The
default search radius is always the range of influence of the first component
fitted --- providing the model is Spherical. For Exponential and Gaussian
models the search radius is adjusted to a realistic distance. For models
without a sill, EcoSSe
cannot guess what an appropriate search radius would be.
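The Tutorial does not say how the “realistic distance” is obtained for Exponential and Gaussian models. One common convention, shown here purely as an assumption, is the distance at which the model reaches about 95% of its sill: three times the range parameter for an Exponential written as 1 - exp(-h/a), and the square root of three times the range parameter for a Gaussian written as 1 - exp(-h²/a²). The function name below is hypothetical.

```python
import math

def default_search_radius(model, range_param):
    """Suggest a search radius from the first model component.

    Assumes the 95%-of-sill convention for models which only approach
    their sill; this is not confirmed to be what EcoSSe does.
    """
    if model == "spherical":
        return range_param                    # sill reached exactly at the range
    if model == "exponential":
        return 3.0 * range_param              # 1 - exp(-3) is about 0.95
    if model == "gaussian":
        return math.sqrt(3.0) * range_param   # 1 - exp(-3) again
    return None                               # no sill: no sensible default

print(default_search_radius("spherical", 60.0))
print(default_search_radius("exponential", 20.0))
print(default_search_radius("linear", 1.0))
```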
The
default search radius, given our semi-variogram model, is 60 miles. However, we
need to ensure that enough samples lie within our search circle to characterise
the trend as well as provide a weighted average for the estimated value. With
so few samples, we have chosen to enlarge the circle to 75 miles to make sure
we get enough samples.

If
you answer “yes” to this question, you will be prompted for an output file
name. The default name is that of the original data file with the extension .XVL. You can change the data file name, the extension
or both if you so wish. This file will be written in the correct format to be
read back in as a data file. The cross validation outputs a table of values on
the ghost.lis file as the estimation
proceeds. The final column “error statistic” is the ratio of the actual error
to the Kriging standard error.

As
the cross validation is carried out, a post plot will be drawn of the “error
statistics”. The contour levels for this graph have been chosen so that a value
in the highest (+2.5) and the lowest (-2.5) contour bands should
occur one time in one hundred. This plot is an excellent device for visually
spotting outliers in the sample data. These need not be outliers in the usual
statistical sense. That is, they may be quite acceptable values as such. What the cross validation will show is whether they
are acceptable values in the context of
the neighbouring samples.
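The “one time in one hundred” figure can be checked directly if we assume that the error statistics follow a standard Normal distribution (mean zero, standard deviation one). A short sketch using only the Python standard library:

```python
import math

def two_sided_tail(k):
    """Probability that a standard Normal value lies beyond +k or below -k."""
    return math.erfc(k / math.sqrt(2.0))

p = two_sided_tail(2.5)
print(f"P(|error statistic| > 2.5) = {p:.4f}")   # roughly 0.012
```

Under that assumption slightly more than one value in a hundred should fall in the two extreme bands combined, so several such values in a small data set deserve a closer look.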
The
left hand box on the screen summarises the various calculated values. A direct
comparison can be made between the average actual value (2002 feet) and the
average Kriged value (2004 feet). The standard deviations of actual and
estimated values are also very close in this instance. The average (typical?)
Kriging standard error is 153 feet, although the individual standard errors
vary widely around this value. Finally, looking at the all-important “error statistics”,
we find an average of 0.1029 and a standard deviation of 1.1893. Ideally, we
are looking for zero and one. It is your decision as to whether 1.19 is close
enough to 1.0 to be accepted (sic). Please refer to full documentation for
further discussion of this point.
After
many iterations in the real world, including re-assessment of the trend
residuals, we ended up with the following semi-variogram model for the Wolfcamp data:

The
cross validation statistics from this model are 0.0045 and 1.0191 respectively,
much closer to the ideal 0 and 1. This model can be exported to file for future
use, using the facility on the Service option on the main menu.
For
this Tutorial we will accept this improved model. If you do not wish to accept,
choose “cross validation” again and change the semi-variogram model. To reduce
the standard deviation of the errors you will need to raise the Sill or nugget
effect, or reduce the range of influence.
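As a worked example of the first of these adjustments: multiplying the nugget effect and every sill by the same factor k leaves the kriging weights unchanged and multiplies the kriging variance by k, so the error statistics are divided by the square root of k. The sketch below uses the standard deviation of 1.1893 quoted earlier. The nugget and sill values are made up, and the function name is hypothetical. Changing the range of influence alters the weights as well, so it has no such simple rule.

```python
def rescale_model(nugget, sill, error_stat_sd):
    """Scale nugget and sill together so the error statistics have sd near 1."""
    k = error_stat_sd ** 2
    return nugget * k, sill * k

k = 1.1893 ** 2
print("scaling factor:", round(k, 4))           # 1.4144

# Made-up starting parameters, for illustration only.
new_nugget, new_sill = rescale_model(1000.0, 30000.0, 1.1893)
print("rescaled nugget:", round(new_nugget, 1))
print("rescaled sill:  ", round(new_sill, 1))
```

This changes only the spread of the error statistics, not their average, and it is no substitute for re-examining the experimental semi-variogram.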
If
you choose to store your cross validation results on a file, this file can be
read back into EcoSSe. You can produce scattergrams of, for example, estimated
values versus actual values to see how well the kriging is performing. You can
also do a probability plot of the ‘error statistics’ to see if
Interpolating a map with kriging
Interpolating
a grid of points with kriging will produce an estimated (or “predicted”) map of
the values over the study area. This map reflects the actual values measured at
the actual sample locations and uses a weighted average estimator for grid
points which have not been sampled.

Weights
are determined by a set of equations which combine:
• the spatial continuity as modelled by the semi-variogram
• any anisotropy identified and modelled in the semi-variogram
• any trend component identified and defined by the user
• the spatial layout of the samples relative to the points being estimated
• the spatial layout amongst the samples themselves (clusters, irregularities etc)
The
chosen weights will minimise the “estimation variance”, which may be
interpreted as a measure of the estimation error.
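The following sketch shows how such a set of equations can be built and solved for one point, including simple geometric anisotropy. It is an illustration under assumptions: no trend component, a Spherical model, an anisotropy angle measured anticlockwise from the X axis, and a ratio equal to the major range divided by the minor range. None of these conventions is confirmed for EcoSSe and the function names are hypothetical.

```python
import numpy as np

def spherical(h, nugget, sill, a):
    """Spherical semi-variogram with range of influence a; zero at h = 0."""
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / a, 1.0)
    return np.where(h > 0.0, nugget + sill * (1.5 * r - 0.5 * r**3), 0.0)

def aniso_distance(p, q, angle_deg, ratio):
    """Distance after rotating to the major axis and stretching across it."""
    t = np.radians(angle_deg)
    d = np.asarray(q, dtype=float) - np.asarray(p, dtype=float)
    u = d[..., 0] * np.cos(t) + d[..., 1] * np.sin(t)    # along the major axis
    v = -d[..., 0] * np.sin(t) + d[..., 1] * np.cos(t)   # across the major axis
    return np.sqrt(u**2 + (ratio * v) ** 2)

def kriging_weights(xy, target, gamma, angle_deg=0.0, ratio=1.0):
    """Weights and estimation variance for one unsampled point."""
    n = len(xy)
    A = np.ones((n + 1, n + 1))          # last row/column: weights sum to one
    A[:n, :n] = gamma(aniso_distance(xy[:, None, :], xy[None, :, :],
                                     angle_deg, ratio))
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = gamma(aniso_distance(xy, target, angle_deg, ratio))
    sol = np.linalg.solve(A, b)
    w, mu = sol[:n], sol[n]
    return w, w @ b[:n] + mu

# Two made-up samples at the same distance from the point being estimated:
# one along the direction of greatest continuity, one across it.
xy = np.array([[10.0, 0.0], [0.0, 10.0]])
target = np.array([0.0, 0.0])
gamma = lambda h: spherical(h, 100.0, 1000.0, 60.0)

w, var = kriging_weights(xy, target, gamma, angle_deg=0.0, ratio=2.0)
print("weights:", w, " estimation variance:", var)
```

The sample lying along the direction of greatest continuity receives the larger weight, even though both samples are equally far from the point being estimated.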
EcoSSe will remember everything
which has been defined during this run. We have already defined which variables
we have been analysing:

Click
on
to proceed. The routine also needs contour
levels:

and
to know whether you want the results stored on a “grid” file:

The
default name for a grid file is the original data file name with the extension .GEA. EcoSSe will suggest contour levels based on the
variability of the sample values. You can change these if you so desire.
Alternatively you can run with the default contours and draw prettier maps by
reading the grid files back in. Please note that “grid” files are not in the same
format as “data” files. If you want to read them back in, you must use the
option:

You
need to confirm semi-variogram model, search parameters and the area which is
to be studied.
The
semi-variogram we defined previously in the cross validation section. If you
come directly to the kriging routines without passing through any others, you
will need to respecify your semi-variogram model including the local trend
component.
You need to define search parameters and the area which is to be studied. The
neighbouring samples will be used to produce an estimate at each unsampled grid
point. Before we can go any further, we need to define the “neighbourhood”.
That is, how far do we want the software to search for samples to be included
in the estimation process?
EcoSSe cannot always guess what an appropriate search radius would be. For
models with a sill, such as the Spherical, a default based on the
semi-variogram parameters is offered. When the value at a specified
grid point is being estimated, all samples within this circle of the point will
be used in the Kriging process. If there are too many samples within this
circle, those closest to the “unsampled” location will be selected.
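The selection rule just described can be sketched as follows. The limit on the number of samples (`max_samples`) is a hypothetical parameter, since the Tutorial does not say how many samples count as “too many”. The coordinates are made up.

```python
import numpy as np

def select_neighbours(xy, target, radius, max_samples):
    """Indices of samples inside the search circle, nearest first."""
    d = np.linalg.norm(xy - target, axis=1)
    inside = np.flatnonzero(d <= radius)
    nearest_first = inside[np.argsort(d[inside])]
    return nearest_first[:max_samples]   # keep only the closest ones

# Made-up sample locations at 10, 80, 30 and 50 miles from the target.
xy = np.array([[10.0, 0.0], [80.0, 0.0], [0.0, 30.0], [-50.0, 0.0]])
target = np.array([0.0, 0.0])

print(select_neighbours(xy, target, radius=75.0, max_samples=2))   # [0 2]
```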
The
default search radius, given our semi-variogram model, is 60 miles. However, we
need to ensure that enough samples lie within our search circle to characterise
the trend as well as provide a weighted average for the estimated value. With
so few samples, we have chosen to enlarge the circle to 75 miles to make sure
we get enough samples.
In
this run, we already defined a boundary of interest to us. If you wish to
change this boundary and, say, look at the whole Wolfcamp area, simply click on
.

Once
you have chosen the area to be studied, you must define the grid spacing to be
used. Points will be calculated at each grid node and represented on the screen
as a shaded rectangle of the appropriate size.
Since
we have previously specified a grid spacing or number of grid points, the
software offers these parameters once more.
We can alter the grid spacing by changing the number in the relevant
box:

If
you make a change and want to check how many grid points you have before
proceeding, click on
and the rest of the parameters will be
updated. You may also change minimum and maximum X and Y values at this stage.
Once you click on
the map parameters will be defined.
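The relationship between the map limits, the grid spacing and the number of grid points can be sketched as below. This assumes that the first node sits at the minimum X and Y and that nodes are placed at every whole multiple of the spacing up to the maximum. EcoSSe may position its nodes differently, and the function name is hypothetical.

```python
import math

def grid_nodes(xmin, xmax, ymin, ymax, spacing):
    """Return (nx, ny, nodes) for a regular grid covering the map limits."""
    nx = int(math.floor((xmax - xmin) / spacing + 1e-9)) + 1
    ny = int(math.floor((ymax - ymin) / spacing + 1e-9)) + 1
    nodes = [(xmin + i * spacing, ymin + j * spacing)
             for j in range(ny) for i in range(nx)]
    return nx, ny, nodes

# Made-up map limits, for illustration only.
nx, ny, nodes = grid_nodes(0.0, 100.0, 0.0, 50.0, 10.0)
print(nx, "x", ny, "=", len(nodes), "grid points")   # 11 x 6 = 66 grid points
```

Halving the spacing roughly quadruples the number of points to be kriged, and with it the running time.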
Interpolating
a grid of points produces a sketch map on the screen. The shading information for the contour
levels will appear in the left hand box and the map itself in the right. A
shaded square will be displayed on the map to show you which point is being
estimated in addition to the information in the prompt box. You may copy the
screen to your printer at any stage during the estimation process.

When
the Kriging has been completed, press the
When
the Kriging has been completed, you have the following options:
To
display the data locations, click on
.
If
you click
the “error” map will be displayed showing the
standard errors associated with the estimated grid points. To display the data
locations on the standard error map, click on
.
You
can copy the plots with
+
and
paste them into another application. Some systems (notably Windows NT) require
pressing
+
. This will place a copy of the Window in the
clipboard. You can import the picture
into a Word processing application such as Microsoft Word, a spreadsheet
application like Lotus or Excel, or paste
+
into
many applications, such as MSPaint.

If
you have elected to write a grid file, these values are also stored on the .GEA file so that you can redraw the maps with different
contours by reading back the grid file.
Finishing up

Clicking
on this menu item or on
will end your run with the software. You will
see the closing down dialog box:

The
above Tutorial session should serve only to illustrate a possible use of the
various routines from EcoSSe. Try running the program again, choosing
your own responses. Try looking at reef width instead of grade. This variable
has a standard two parameter lognormal distribution. Try reading in one of the
other data files which are provided, say, samples.dat.
General Notes
There
are a few points which you may have noted in following the Tutorial session
above. Most of the routines communicate between themselves, without you having
to worry about getting the right information from one to the other. For
example, after you read in the complete contents of the data file, the routines
ask which of the variables you actually want to analyse. This information is
then stored internally and may be accessed by any of the other routines. This
is a feature of most of EcoSSe, in that it will recall what you chose
previously and ask whether this is to change or not. You should bear this in
mind if you are analysing more than one data file in a single run. In
particular, the boundary used in mapping will be remembered. If you change data
file, or even which variables you analyse, the boundary will not update
automatically.
A
copy of this run should have been made on a file called GHOST.LIS. Send this file to your printer as a record
of the analysis or look at it with Wordpad or Notepad. When you start another
run, a new file GHOST.LIS will be started and the old
one will be destroyed. If you want to keep it, use ‘Save As’ and give it
another name.
EcoSSe ---
like any computer software --- is not completely error-free. Neither is it
fool-proof. You can always get out of the software by pressing the
,
and
keys at the same time. This will invoke the ‘End
Task’ facility to close the Window without damaging the rest of your system. If
you cannot figure out what went wrong, note down as much information as you can
about the program you were running, the data you were using and exactly where
it broke down. Contact your supplier locally or Geostokos direct for
assistance. Send us the ghost.lis file and (if you can) the
data you were analysing at the time.