Practical Geostatistics 2000 Data Sets
These data sets are featured in Practical Geostatistics 2000 and can be analysed with any of the demo programs.

Small data files [81KB zipped]

Large data files >10,000 samples [482KB zipped]

 

1.3 Data sets

The applications presented in the book are mainly geological, with some hydrological and environmental case studies. Potential applications include any form of measurable spatial data, and some which cannot be given a quantitative measure, such as rock type or land use. We have included applications of geostatistical techniques in the following fields (so far):

    • Coal: a simulated set of data based on a real coal seam in Southern Africa. Boreholes drilled into the coal seam are measured for thickness of coal (metres), energy content or 'calorific value' of coal (Megajoules per tonne), ash content (%) and sulphur content (%). Three co-ordinates in metres are available for the top of the coal seam where intersected by the drillhole.
    • GASA: this data set is named for the Geostatistical Association of South Africa and was used in an illustration of geostatistical techniques at a meeting in April 1987 in Johannesburg. The sample data are taken from deep boreholes drilled into a typical Witwatersrand type gold reef. The measurements of interest are the grade of the gold in grams per tonne of rock (parts per million) and the thickness of the reef intersection in the borehole (centimetres). The 27 boreholes lie approximately 1 kilometre apart and constitute a typical data set for the planning and design of a new Wits gold mine. The values have been disguised by a factor but are otherwise unaltered. Co-ordinates are in metres.
    • Samples: this data set is based on a Wits type gold mine some decades into production. The samples are chipped from the face of the reef in a working section of the mine (stope). As the face advances, new chip samples are taken. Values within a stope are traditionally estimated using the sample values from the face. These data are totally fictitious except for the locations of the samples, which are taken from a real Wits type gold mine.
    • Copper: a simulation based on a stockpile of mined material in the former Soviet Union. Boreholes have been drilled into the dump. The drill core is cut every 5 metres and assayed for copper and cobalt content in percentage by weight. This is the only three-dimensional set of tutorial data. Co-ordinates are in metres.
    • Geevor: this is sample data from a hydrothermal tin deposit in Cornwall, England. The mineralisation appears as a continuous vein which is sub-vertical. Samples of around 1 kg are chipped across the vein, which averages about 24 inches wide. Measurements are grade of tin in pounds of black tin (SnO2) per ton of rock. The thickness of the vein or 'lode' is measured to the nearest inch. Co-ordinates are in feet along section and elevation above an arbitrary base level. Reference: Clark, I., 1979, "Does geostatistics work?", Proc. 16th APCOM, Thomas J O'Neil, Ed., Society of Mining Engineers of AIME Inc, New York, 213-225.
    • Wolfcamp: measurements of water pressure (potentiometric level) in 85 water wells in the Texas panhandle. This data set was part of a study carried out by the Office of Nuclear Waste Isolation in the mid-1980s on a potential site for a high-level nuclear waste repository. The Wolfcamp aquifer underlies the planned repository. One aspect of repository planning is to quantify the risks inherent in a breach of the storage facility. Should radionuclides leak into the local aquifers, the scope and speed of potential contamination have to be assessed. The pressure of fluid within the aquifer was one of several variables used to determine the travel path and speed of travel for escaped radionuclides.

Reference: Harper, W.V., and Furr, J.M., 1986. "Geostatistical analysis of potentiometric data in the Wolfcamp Aquifer of the Palo Duro Basin, Texas", BMI/ONWI-587, April, Office of Nuclear Waste Isolation, Battelle Memorial Institute, Columbus, Ohio.

    • Scallops: Scallop data were collected during a 1990 survey cruise off the east coast of North America. Scallop counts were obtained using a dredge. Any scallop smaller than 70 mm was termed a prerecruit. Total catch is the sum of prerecruits and recruits. Measurements included in the data file are:
      • National Marine Fisheries Service (NMFS) 4 digit strata designator in which the sample was taken;
      • sample number per year ranging from 1 to approximately 450;
      • location in terms of latitude and longitude of each sample in the Atlantic Ocean;
      • total number of scallops caught at the sample location;
      • number of scallops whose shell length is smaller than 70 millimeters;
      • number of scallops whose shell length is 70 millimeters or larger.

Reference: Ecker, M.D., and Heltshe, J.F. 1994. "Geostatistical estimates of Scallop Abundance". In: Case Studies in Biometry, Lange et al., editors. Wiley, New York.

    • Dioxin: A truck transporting dioxin-contaminated residues dumped an unknown quantity of these wastes onto a farm road in Missouri. In November 1983, the U.S. EPA collected samples at the site. In order to reduce the number of samples required, samples were composited along transects. The transects run parallel to the highway, and this direction is designated as the X-direction. The direction perpendicular to the highway is designated as the Y-direction. Data are TCDD concentration (tetrachlorodibenzo-p-dioxin) in micrograms per kilogram (µg/kg). Co-ordinates and transect length are given in feet. Reference: Zirschky, J.H., and Harris, D.J. 1986. "Geostatistical analysis of hazardous waste site data". Journal of Environmental Engineering, 112:770-784.
    • Organics: Data are Soil Organic Matter values (in grams per kilogram) derived from soil samples taken in a research field at the University of Nebraska West Central Research and Extension Center near North Platte, Nebraska, USA. Data were taken as part of experiments on variable-rate fertilizer technology. Co-ordinates are in metres. Reference: Gotway, C.A., and Hergert, G.W. 1997. "Incorporating Spatial Trends and Anisotropy in Geostatistical Mapping of Soil Properties". Soil Science Society of America Journal, 61:298-309.
    • Velvetlf: Subsample of the number of velvetleaf weeds counted in a 7 m² area in a field in Nebraska. Data were collected by Gregg Johnson (see the second reference below), as part of a research program in weed management at the University of Nebraska.

References: Data set taken from: Gotway, C.A., and Stroup, W.W. 1997. "A generalized linear model approach to spatial data analysis and prediction". Journal of Agricultural, Biological, and Environmental Statistics, 2:157-178.

Data collected by: Johnson, G.A., Mortensen, D.A., and Gotway, C.A. 1996. "Spatial and temporal analysis of weed seedling populations using geostatistics". Weed Science, 44:704-710.

All of the above case studies appear somewhere within the text.
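
The Scallops description states that total catch is the sum of prerecruits and recruits, so a quick consistency check is possible before any analysis. The following is a minimal sketch in Python, not part of the book or its demo programs. The column names and the three sample rows are invented for illustration and are not taken from the actual data file, whose layout may differ.

```python
import csv
import io

# ASSUMPTION: column names and values are made up for illustration.
# They are NOT copied from the real Scallops data file.
SAMPLE_CSV = """strata,sample,latitude,longitude,total,prerecruits,recruits
6350,1,40.55,-71.55,0,0,0
6350,2,40.47,-71.52,12,5,7
6310,3,40.52,-71.78,30,11,18
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def inconsistent_samples(rows):
    """Return sample numbers where total != prerecruits + recruits."""
    bad = []
    for row in rows:
        total = int(row["total"])
        parts = int(row["prerecruits"]) + int(row["recruits"])
        if total != parts:
            bad.append(int(row["sample"]))
    return bad


rows = read_rows(SAMPLE_CSV)
print(inconsistent_samples(rows))
```

In this invented sample, only the third row is flagged, because 11 + 18 does not equal the recorded total of 30. With the real file, the column names would need to be adjusted to match its actual header.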
