James A. Kuiper

Grid-Based Modeling for Land Use Planning and Environmental Resource Mapping *


Presented at the Nineteenth Annual ESRI User Conference, San Diego, California, USA
July 26-30, 1999
Sponsored by Environmental Systems Research Institute (ESRI)



The submitted manuscript has been created by the University of Chicago as Operator of Argonne National Laboratory ("Argonne") under Contract No. W-31-109-ENG-38 with the U.S. Department of Energy. The U.S. Government retains for itself, and others acting on its behalf, a paid-up, nonexclusive, irrevocable worldwide license in said article to reproduce, prepare derivative works, distribute copies to the public, and perform publicly and display publicly, by or on behalf of the Government.

* This work was supported under a military interdepartmental purchase request from the U.S. Department of Defense, U.S. Army, through U.S. Department of Energy contract W-31-109-Eng-38, and by the U.S. Department of Energy, Chicago Operations Office, under contract W-31-109-Eng-38.


ABSTRACT

Geographic Information System (GIS) technology is used by land managers and natural resource planners for examining resource distribution and conducting project planning, often by visually interpreting spatial data representing environmental or regulatory variables. Frequently, many variables influence the decision-making process, and modeling can improve results with even a small investment of time and effort. Presented are several grid-based GIS modeling projects, including: (1) land use optimization under environmental and regulatory constraints; (2) identification of suitable wetland mitigation sites; and (3) predictive mapping of prehistoric cultural resource sites. As different as the applications are, each follows a similar process of problem conceptualization, implementation of a practical grid-based GIS model, and evaluation of results.


INTRODUCTION

Geographic Information System (GIS) software providers have made major advances in providing powerful analysis capabilities and increasingly streamlined and friendly user interfaces. GIS data sources are rapidly increasing in availability, coverage, accessibility, compatibility, and quality, and in many cases their costs are decreasing. Many land managers and natural resource planners have access to a GIS for their decision making and regularly see advertisements showing that their software can support site suitability analysis, predictive modeling, and land use analysis. Despite this, it is still fairly common to find that the GIS tools are being used simply to view spatial data and produce maps, rather than for analysis and modeling tasks. Some of the reasons for this include limited user knowledge of the GIS software capabilities, practical limitations on data availability for a particular problem, short time frames for project planning, and budget constraints. This paper describes a general purpose approach to grid-based environmental constraint and resource mapping and briefly presents three application studies. The examples demonstrate the versatility of the technique and show how the results can contribute to management or planning activities.

The first case study describes a model to examine "limits to growth" of four types of activities at a government installation. The work was done in conjunction with a programmatic environmental impact statement project (Kuiper 1996) that included production of an extensive, environmentally focused GIS (DSHE 1996). Fourteen environmental and regulatory constraints were considered in the model, which used weighted summation as the calculation method.

In the second case study, a weighted summation model was used to identify potential wetland mitigation sites at Argonne National Laboratory in northeastern Illinois. The analysis was performed quickly and took good advantage of available GIS data. The model successfully identified seven alternative sites that had a variety of characteristics.

A map depicting the potential for prehistoric archaeological sites was produced in the third case study. The model in this case assigned statistically calculated measures of site potential to locations in the GIS having specific combinations of environmental variables. Nearly all of the known sites in the study area were identified as high or medium potential by the model. This work is also described in detail in Wescott and Kuiper (1999) and Kuiper and Wescott (1999).

THE MODELING APPROACH

Environmental constraint mapping, site-suitability analysis, predictive modeling, cost-surface modeling, multivariate analysis, and land use analysis are all terms used to describe methods for modeling resource distribution in GISs. These methods, usually conducted with grid data, allow the GIS to generate useful answers from many layers of information related to the distribution of a resource or siting of a project. Each layer represents the spatial distribution of a variable that influences the resource being considered. In many cases, the original data are in vector format, but gridding the data to perform the analysis in grid format usually requires much less computation time. Modeling approaches include grid algebra, linear or logistic regression, and supervised or unsupervised classification. For many problems, grid algebra is an efficient technique that will yield useful answers even under limited budgets and tight schedules. This paper is focused on projects that fit this description; however, many of the model formulation steps described here apply to the other, more complex methods listed. Some of the other methods are more statistically defensible or answer different questions, but they may also require specific types of input data, additional assumptions, or a specific sampling design.

Problem Definition

To formulate a meaningful environmental constraint model, the factors that contribute to locating the resource in question must be defined carefully and specifically. This process can range from simply identifying contributing variables on the basis of intuition or experience, to statistically analyzing measured variables to identify the most significant ones. Once identified, some critical variables may not be appropriate for GIS analysis. For example, in the wetland mitigation study, development costs were a major planning factor, but difficult to model in the GIS. Cost was more efficiently addressed on a site-by-site basis after the model identified good candidate sites on the basis of other factors.

Each factor must also be considered from the perspectives of GIS data availability, source scale, accuracy, applicability, and spatial extent. For example, in the archaeological model, soil type may have been a useful variable, but was eliminated primarily because the only readily available source for soils data was a general 1927 map that did not cover the full extent of the study area. In contrast, topographic setting was determined to be essential to the model and was included despite the considerable amount of processing needed to produce it.

The method used for calculation of the model should be determined on the basis of the input data and the desired analysis result. This method must usually be decided in advance because it determines the way values are coded in the input GIS layers. In many cases, input layers contain categorical data, such as the presence or absence of a regulatory constraint or several types of land cover. For calculation of a weighted-sum or weighted-mean model, these layers must be assigned values relative to their costs or benefits to the issue being analyzed. For a model based on co-occurrence of unique combinations of environmental factors, it may be necessary to convert ratio data, such as elevation or distance to water, to discrete ranges or to collapse categorical data into fewer categories.
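The recoding of ratio data into discrete ranges can be sketched as follows. This is a minimal illustration that assumes NumPy arrays stand in for grid layers; the elevation values and the single 20-foot class break are hypothetical and are not taken from the project data.

```python
import numpy as np

# Hypothetical elevation grid in feet (values are illustrative only).
elevation = np.array([[3.0, 12.0, 25.0],
                      [18.0, 20.0, 41.0]])

# The class break is an assumption. With right=True the bins are
# right-inclusive, so 20 ft falls in class 0 ("20 feet or less")
# and anything higher falls in class 1.
breaks = [20.0]
elev_class = np.digitize(elevation, breaks, right=True)

print(elev_class)
```

Adding more values to `breaks` would produce more classes, at the cost of more unique combinations to score later.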

Data Preparation

Factors to be used in the model must be represented as distinct grid layers that are co-registered and have the same cell resolution. The extent of the study area and the grid cell size to be used for modeling are important considerations at this step. All layers must cover the full study area with an identical boundary so the model can give consistent results for the whole area. This condition is more difficult to accomplish than it may seem. For example, in the archaeological predictive model case study, the zero elevations in the digital elevation model (DEM) did not exactly match the Chesapeake Bay shoreline edges derived from aerial photography. Because the DEM was generated from a more general source than the shoreline, the zero-elevation pixels along the shoreline edge were set to a 1-foot elevation.

Similarly, many factors influence the selection of cell size. These factors include the resolution of the source data layers, storage space, processing time, the size of the resource being studied, spatial error introduced by the gridding process, and the desired resolution of the model results.

The coding scheme for values in the input layers must be selected so as to be compatible with the modeling method being used. The layers might contain (1) 1 or 0 to represent the presence or absence of a constraint, (2) a code representing the type of a resource at that location, or (3) a value between -1 and 1 representing the relative cost or benefit of a resource. Once these decisions are made, production of the grid layers can begin. The many possible editing and processing steps to prepare the grid layers are beyond the scope of this paper.

Model Implementation

Once the input layers are ready for modeling, the final details for the model, such as weights or site potential values, must be developed. In some cases, these factors can be estimated on the basis of sample data, such as in the archaeological predictive model case study, but often they must be intuitively assigned according to qualitative considerations. Three commonly used grid algebra models are weighted summation, weighted averaging, and assigning scores to unique combinations of values.

The equation for weighted summation is:

$$S_{xy} = \sum_{i=1}^{n} w_i \, x_{xyi}$$

where $S_{xy}$ is the model result at location xy, n is the number of environmental layers, $x_{xyi}$ is the value for environmental layer i of a grid at location xy, and $w_i$ is the weight associated with the ith environmental layer. The calculation is performed for all occurrences of x and y. In Arc/Info grid notation, a weighted summation model for wildlife habitat suitability might be

suitability = (0.5 * foodsource) + (0.25 * cover) + (0.25 * watersource),

where foodsource, cover, and watersource are grids coded with values of 1 and 0 to indicate the presence or absence of the resource. The higher weight for food source gives it a larger influence on the result. In this situation, the output values will range from 1 (highly suitable) to 0 (unsuitable) and be stored as a floating point grid. Note that file size of the model output can be reduced significantly by using integer weights. For example, if the weights were 10, 5, and 5 respectively, the output would range from 20 to 0, which can be stored as byte rather than floating point data.
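The effect of integer weights on storage can be illustrated with a small sketch. It assumes NumPy arrays in place of Arc/Info grids, and the 2 x 2 presence/absence values are hypothetical; only the weights come from the example above.

```python
import numpy as np

# Hypothetical presence (1) / absence (0) grids; values are illustrative.
foodsource = np.array([[1, 1], [0, 0]], dtype=np.uint8)
cover = np.array([[1, 0], [1, 0]], dtype=np.uint8)
watersource = np.array([[1, 0], [0, 0]], dtype=np.uint8)

# Floating-point weights: output ranges from 0.0 to 1.0.
suit_float = 0.5 * foodsource + 0.25 * cover + 0.25 * watersource

# Integer weights in the same 2:1:1 ratio: output ranges from 0 to 20
# and fits in a single byte per cell.
suit_byte = (10 * foodsource + 5 * cover + 5 * watersource).astype(np.uint8)

print(suit_float)
print(suit_byte)
print(suit_float.itemsize, suit_byte.itemsize)  # bytes per cell
```

The two outputs rank cells identically; only the scale and the storage type differ.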

In some cases, input layers with more than one category or with ratio data might be needed. Food sources of different importance could be coded as values from 1 to 10, and water source could be represented as a distance from shoreline, with a value of 0 for locations within water bodies. These ranges can be scaled to ranges of 0 to 1 (or -1 to 1 if negative values are used) with a multiplicative factor. Multiplying this scaling factor by the layer's relative importance weight produces a single weight that adjusts for both range differences and relative importance, so that each input layer has the proper influence on the output sum.
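A minimal sketch of combining a range-scaling factor with a relative importance weight is given below. The layer values, the 0.6 and 0.4 importance values, and the variable names are assumptions chosen for illustration; NumPy arrays stand in for grids.

```python
import numpy as np

# Hypothetical layers: food quality coded 0-10, cover as presence/absence.
food = np.array([[10, 5], [2, 0]], dtype=float)
cover = np.array([[1, 0], [1, 0]], dtype=float)

# Relative importance (assumed values) and the factor that rescales
# each layer's range to 0-1.
importance = {"food": 0.6, "cover": 0.4}
scale = {"food": 1.0 / 10.0, "cover": 1.0}

# One combined weight per layer = importance * range-scaling factor.
w_food = importance["food"] * scale["food"]
w_cover = importance["cover"] * scale["cover"]

suitability = w_food * food + w_cover * cover
print(suitability)
```

Without the scaling factor, the 0-10 food layer would dominate the sum regardless of the importance values assigned.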

Use of a weighted-mean computation can make it easier to conceptualize the model and control the range of the output values; however, if the weights sum to 1.0, a weighted sum is equal to a weighted mean. The weighted mean is calculated as:

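$$\bar{S}_{xy} = \frac{\sum_{i=1}^{n} w_i \, x_{xyi}}{\sum_{i=1}^{n} w_i}$$

where $\bar{S}_{xy}$ is the weighted mean at location xy, and n, $x_{xyi}$, and $w_i$ are as defined for the weighted summation.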
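The relationship between the weighted sum and the weighted mean can be checked numerically. The sketch below assumes NumPy arrays in place of grids and uses hypothetical 0/1 layers with the integer weights 10, 5, and 5 from the habitat example.

```python
import numpy as np

# Hypothetical 0/1 layers stacked along axis 0: food, cover, water.
layers = np.array([
    [[1, 1], [0, 0]],
    [[1, 0], [1, 0]],
    [[1, 0], [0, 0]],
], dtype=float)

weights = np.array([10.0, 5.0, 5.0])  # need not sum to 1

# Weighted mean: sum(w_i * x_i) / sum(w_i), computed cell by cell.
weighted_sum = np.tensordot(weights, layers, axes=1)
weighted_mean = weighted_sum / weights.sum()

# With weights rescaled to sum to 1.0, the weighted sum equals the mean.
unit_weights = weights / weights.sum()
same = np.allclose(np.tensordot(unit_weights, layers, axes=1), weighted_mean)

print(weighted_mean)
print(same)
```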

Another approach that works well for categorical data is to assign scores to unique combinations of input values. This procedure is especially useful where the co-occurrence of several factors is more significant than simply the presence of a number of positive factors. Continuing with the habitat suitability example, there may be one or more combinations of food source, cover, and water source that are essential for good habitat. In this situation, results of a weighted-sum model could be misleading if a location having a good food source and good cover, but no water source resulted in a high output score.

To run a unique combination model in Arc/Info, the Grid COMBINE function is run against the set of input layers, producing an output image with a different value for each unique combination of input values. Scores can be assigned to these combinations by populating a field in the grid attribute table using a relational join to a table of parameters. In practice, the number of possible unique combinations rapidly increases with the number of input layers and the number of groups in each layer, thus increasing the complexity of assigning scores intuitively or diluting the sample size available from supporting data sources. In the archaeological site potential study described below, the scores were determined statistically by analysis of environmental data collected at previously discovered archaeological sites in the region. The final model was limited to four layers with two levels each so that meaningful scores could be derived from the supporting database of known archaeological sites.
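The idea of scoring unique combinations can be sketched outside Arc/Info as follows. This is only loosely analogous to a combine-and-join workflow and is not the Arc/Info implementation; the two input grids, their names, and the score table are hypothetical.

```python
import numpy as np

# Hypothetical two-class (0/1) input grids; values are illustrative.
near_water = np.array([[1, 1, 0], [0, 1, 0]])
low_elev = np.array([[1, 0, 0], [1, 1, 0]])

# Pair the layers cell by cell and give every distinct
# (near_water, low_elev) combination its own ID.
pairs = np.stack([near_water.ravel(), low_elev.ravel()], axis=1)
combos, combo_id = np.unique(pairs, axis=0, return_inverse=True)
combo_id = combo_id.reshape(near_water.shape)

# Assumed score table keyed by combination; real scores would come from
# expert judgment or from supporting site data.
score_table = {(1, 1): 3, (1, 0): 2, (0, 1): 2, (0, 0): 1}
scores = np.array([score_table[tuple(c.tolist())] for c in combos])

# Look up the score of each cell's combination (the "join" step).
potential = scores[combo_id]
print(potential)
```

With k two-class layers there are up to 2^k combinations to score, which is why the number of layers and classes has to be kept small.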

The unique combination model uses an approach similar to logistic regression, which assigns probabilities to unique combinations of input data on the basis of their observed values. The difference is that logistic regression requires both presence and absence data as inputs, while the unique combination model uses only presence. For example, a logistic regression archaeological site potential model would require environmental data from surveys both where sites were discovered and where they were not. This added "non-site" information allows the model to better separate environmental factors that are indicative of sites rather than indicative of the survey area in general. Unfortunately, available archaeological data used in the study described below did not include non-site information, so a unique combination model was implemented rather than logistic regression.

Linear regression and supervised or unsupervised classification are additional techniques useful for developing resource-distribution models. These techniques require interval or ratio data for the independent variables and include assumptions of normal statistical distributions.

Visualization, Validation, and Revision

Once a model has been run, visualization and validation are used to assess the results. In Arc/Info, the grid is usually displayed with the GRIDSHADES or GRIDPAINT commands using the "linear" option to ramp the values from 1 to 256. Shade sets with graded colors, such as rainbow.shd, are useful for color schemes. Model input and output cell values can be examined by using MAKESTACK to group them and CELLVALUE to interactively query cells in the map display. The Grid HISTOGRAM command can be used to examine the distribution of values in the model result and to determine useful color schemes for classifying the results into discrete categories. In some cases, it may be useful to visualize the result as a cost surface using TIN functions to view the model in 3D.

Visualization provides an intuitive level of validation, but ground-truth data are usually essential to determine the usefulness of the results. In the predictive archaeology model, known sites within the study area were omitted when calculating model parameters and subsequently were used as a validation data source to examine modeling results. In this case, the proportion of the map covered by high and medium potential was calculated, along with the percentage of known sites falling in these areas. Kvamme's Gain Statistic (1 - [% area / % known sites]) (Kvamme 1988) was then used to estimate how far the model deviated from a random distribution. In the wetlands mitigation case study, ground truthing took the form of field work performed in the areas identified by the model to gain a better understanding of these areas as candidates for wetlands mitigation and to study them at a higher level of detail.
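The gain statistic quoted above is simple to compute once the two percentages are known. The sketch below implements the formula as stated in the text; the function name and the input figures are hypothetical and are not results from the study.

```python
def kvamme_gain(pct_area: float, pct_sites: float) -> float:
    """Gain = 1 - (% of area in the zone / % of known sites in the zone)."""
    if pct_sites <= 0:
        raise ValueError("pct_sites must be positive")
    return 1.0 - (pct_area / pct_sites)


# Hypothetical figures, not results from the study.
print(kvamme_gain(30.0, 90.0))  # most sites fall in a small share of the area
print(kvamme_gain(50.0, 50.0))  # 0.0: no better than a random distribution
```

Values approaching 1 indicate that a small share of the map captures a large share of the known sites, while values near 0 indicate little improvement over chance.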

In many cases, visualization and validation of the initial model run simply help point the analyst to a better solution or approach. The need for model revisions and improvements should be an expected part of the process.

CASE STUDIES

Environmental and Regulatory Limits to Growth in Activities at a Federal Installation

An environmental impact statement (EIS) is a document reporting on the environmental consequences of two or more alternative actions. EISs typically focus on one or more well-defined projects or activities, and usually result in collecting a large body of environmental information that could be used for other purposes. For a land manager, this information can be used to investigate more general questions, such as, "Which areas of my installation will present me the fewest regulatory, land use, and environmental constraints for a particular activity?" or "How much of a particular activity can my installation support before regulatory and environmental impacts make it infeasible?" The model described in this case study was designed to investigate such issues.

For Aberdeen Proving Ground (APG), Argonne National Laboratory compiled an extensive, environmentally focused GIS (Kuiper 1996; DSHE 1996). APG is a U.S. Army testing and evaluation facility of over 39,000 acres located in the Upper Chesapeake Bay region of Maryland. Limits to growth in missions or activities at APG and most other federal installations are typically determined by a set of environmental, land use, regulatory, and engineering factors. Most of these factors have a spatial component that can be represented as a layer in a GIS. Environmental constraints can include known areas of contamination, existence of protected animal and plant species, locations of important habitats, noise and air-quality impact areas, and cultural resources sites. Land use constraints include current infrastructure development, planned land use areas (zoning), safety zones, and floodplains. Many of these factors are addressed by regulations that must be followed in the course of planning and implementing a project or activity. Engineering factors such as utility availability, transportation accessibility, soil characteristics, and the like are also important for this type of analysis. These factors were beyond the scope of the study, however, because the supporting data were not readily available, and the analysis was geared to a more general overview of the facility rather than the details of a particular project.

Limits to growth of four types of activities were examined: residential housing; administrative; industrial research and development and vehicle test tracks; and range activities. This paper highlights the residential housing analysis; however, the modeling approach was similar for the other three cases. Fourteen constraints were identified (Table 1). For each, a GIS layer depicting the extent of the constraint was produced, and weights were assigned to the layers to adjust for the importance of the constraint. Weights ranged from 0 (no constraint) to 10 (prohibitive) and were qualitatively determined by researchers familiar with the installation and the constraint categories. For example, a weight of 10 was assigned to safety zones, which are by definition incompatible with residential housing, while a weight of 3 was assigned to some air quality zones since their effects could be easily mitigated in most cases. Use of a range of integers from 0 to 10 rather than floating point numbers from 0 to 1 significantly reduced the file sizes of the input layers and model.

Table 1: Constraint layers used for residential housing limits to growth analysis at Aberdeen Proving Ground with weighting factors and percent of total land area.

| Constraint layer | Weighting factor | Percent of land area | Comments |
|---|---|---|---|
| Current development | 5 | 11.4 | Existing buildings and roads. |
| Incompatible land use | 10 | 92.7 | Categories included airfield, airfield accident potential zones, research and development, restricted buildings, storage, test ranges, and vehicle test tracks. |
| Safety zones | 10 | 60.0 | Zones around munition storage and similar areas, and range safety areas. |
| IRP study areas | 5 - 10 | 100.0 | Installation restoration program general study areas (5) and specific study area with known sites (10). |
| Wetlands | 6 | 35.3 | Based on U.S. Fish and Wildlife Service National Wetlands Inventory data. |
| 100-yr floodplain | 10 | 37.6 | Developed from digital elevation model and Federal Emergency Management Agency data. |
| Maryland critical areas | 7 | 53.2 | Potential zone is a 1000-ft buffer from tidal wetlands and shoreline, based on the Maryland Critical Area Act. (This regulation does not affect federal lands, but was included as a good management practice.) |
| Noise zones | 5 - 10 | 59.2 | Results from aircraft and blast noise models. Weights depend on noise level. |
| Protected species | 10 | 8.0 | 500-meter buffer around bald eagle nest and roost sites. |
| Sensitive species | 6 | 0.6 | Waterfowl sanctuary area and 250-meter buffer around blue heron rookeries. |
| Air quality | 3 - 5 | 11.2 | Variable buffer sizes around buildings known to emit toxic air pollutants and results from air quality modeling for particulate matter around test tracks. |
| Riparian areas | 6 | 15.6 | Forested areas within 300 meters of perennial streams and shorelines. |
| Large forest tracts | 6 | 26.2 | Forest tracts larger than 100 acres. |
| Cultural resource sites | 3 | 7.0 | Historic structures, known archaeological sites and 200-ft buffers around potential archaeological sites. |

A weighted summation model was calculated using the 14 weighted layers. A portion of the resulting constraint map is shown in Figure 1. Results from this model indicated that only 0.2% of the study area had no significant constraint to development for residential housing. The lowest output value was 5, in locations only affected by the installation restoration program study areas, which cover the whole site. The highest score of 89 occurred for a 0.75-acre area of the installation. Eleven of the 14 constraints occurred in this location.
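The bounds on the output score follow directly from the weighting factors in Table 1 and can be checked with a short sketch. The weights are transcribed from Table 1; the upper bound assumes, purely for illustration, that all 14 constraints could coincide at their highest weights, which may not occur anywhere on the installation.

```python
# Weighting factors from Table 1; each entry is (lowest, highest) weight.
weights = {
    "Current development": (5, 5),
    "Incompatible land use": (10, 10),
    "Safety zones": (10, 10),
    "IRP study areas": (5, 10),
    "Wetlands": (6, 6),
    "100-yr floodplain": (10, 10),
    "Maryland critical areas": (7, 7),
    "Noise zones": (5, 10),
    "Protected species": (10, 10),
    "Sensitive species": (6, 6),
    "Air quality": (3, 5),
    "Riparian areas": (6, 6),
    "Large forest tracts": (6, 6),
    "Cultural resource sites": (3, 3),
}

# Upper bound: every constraint present at its highest weight.
max_score = sum(high for _, high in weights.values())
# Lower bound: only the IRP general study area, which covers the whole site.
min_score = weights["IRP study areas"][0]

print(len(weights), min_score, max_score)  # 14 5 104
```

Because the theoretical ceiling of 104 is below 255, the summed result fits in a single byte per cell, consistent with the integer weighting scheme described earlier.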


Figure 1: Results of limits to growth model for residential housing at Aberdeen Proving Ground.

The modeling results are useful as a guide for areas of the site more or less suitable for residential housing, but further examination of the identified areas would be required before specific land use decisions could be made. The modeling approach allowed results to be obtained quickly, and adjustments such as changing the input layers and weights for specific projects or activities can be made easily.

Locating Potential Wetland Mitigation Sites

U.S. wetlands are protected under the Clean Water Act, and Executive Order 11990 requires federal installations to avoid or reduce impacts to any wetlands on their sites to the extent practical. The U.S. Department of Energy (DOE) provides guidance for implementing this order at its facilities under Title 10, Part 1022 of the Code of Federal Regulations. Unless a compromise agreement is reached with the U.S. Army Corps of Engineers, wetlands lost or degraded must be mitigated by creating or improving other wetlands so that they are similar in area, type, and function. As a DOE research laboratory, Argonne National Laboratory is subject to each of these regulations. As part of compliance activities, GIS modeling was conducted to generate a map of wetland suitability at Argonne.

Argonne is located southwest of Chicago, Illinois. Approximately 55% of the 1,480-acre site is developed, with the remainder composed of relatively undisturbed woodlands, old fields, and wetlands. Most of Argonne is relatively level, with slopes from 2 to 5 percent, and numerous shallow depressions support the development of wetland ecosystems. Thirty-five wetland areas totaling 44.6 acres were identified on the site in 1993 (Van Lonkhuyzen and LaGory 1994). The laboratory property is surrounded by Waterfall Glen Forest Preserve, a 2,470-acre greenbelt with a character similar to the undeveloped Argonne lands.

Many factors influence where new wetlands should be located and how successfully the new or improved wetlands will become established. In this project, a team of ecologists collaborated with the author to identify critical environmental factors and compare them with available GIS data. These factors included surface and groundwater conditions, topography, soil characteristics, vegetative cover, presence of historical wetlands, cost, and land zoning restrictions. Some GIS layers, such as one depicting the locations of wetlands before the Laboratory site was established, were added to the GIS for the analysis, but most were already in the Argonne GIS database. Because of the complex issues contributing to cost, it was not included as an input layer to the model, but instead was addressed on a site-by-site basis after suitable sites were identified. The GIS data sources used for producing the model input layers included 2-ft elevation contours, buildings, roads, surface hydrology, land use, contaminated areas, and vegetative cover, all primarily developed from a digital aerial photograph from 1995. Soil information was based on a 1976 U.S. Soil Conservation Service (SCS) soil survey (Mapes 1979). Locations of historic wetlands and local depressions were determined from 1946 and 1932 U.S. Geological Survey 7.5-minute quadrangle maps. Some data existed for depth to groundwater, but they were too general to be applicable for the modeling.

A one-to-one correspondence between available GIS layers and the desired inputs to the model did not exist, so it was necessary to combine and edit layers to produce the final model input layers. Grid cells in the model input layers were coded with 1 or 0 indicating the presence or absence of the factor being mapped. Each layer was also assigned a value between -1 and 1 indicating both its contribution to wetland suitability and its importance relative to the other layers. For example, areas with hydric soils were given a high positive weight, while areas zoned for intensive use were given a negative weight. Table 2 summarizes the layers used for modeling, with weighting factors, percent of land area, and data sources noted.

Table 2: Input layers for modeling wetland suitability at Argonne National Laboratory.

| Suitability layer | Suitability value | Percent of land area | Comments |
|---|---|---|---|
| Land use - suitable | 0.5 | 23.2 | Open space and environmentally sensitive categories from land use map. |
| Land use - unsuitable | -0.3 | 76.8 | All other land use categories from land use map. |
| Hydrology - streams | 0.5 | 11.3 | Streams, buffered to 50 feet. |
| Hydrology - local depressions | 0.7 | 4.6 | Local depressions in 1995 2-ft contours, buffered to 20 feet. |
| Soil - hydric | 0.5 | 16.2 | Hydric soils from SCS maps. |
| Soil - flooded | 0.4 | 0.6 | Water and intermittent water from SCS maps. |
| Vegetation - forested | -0.3 | 51.8 | Vegetative cover layer. |
| Vegetation - scrub / shrub | -0.1 | 2.9 | (Same) |
| Vegetation - marsh | 0.3 | 3.3 | (Same) |
| Vegetation - old field | 0.5 | 16.5 | (Same) |
| Historic wetlands | 1.0 | 0.3 | USGS maps |
| Historic local depression | 0.7 | 0.3 | (Same) |

A weighted summation was performed with the input layers and assigned suitability values, and then locations of buildings, roads, and contaminated areas were removed from the map to rule them out for consideration as possible wetland mitigation sites. The map shown in Figure 2 was produced from the results, and Table 3 shows input layer values and model output for the five locations indicated on the map.
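The exclusion step can be sketched as a mask applied after the weighted summation. The sketch assumes NumPy arrays in place of grids; the 2 x 2 suitability values and exclusion layers are hypothetical, and NaN is used here only as a stand-in for a no-data value.

```python
import numpy as np

# Hypothetical weighted-sum result and exclusion layers (1 = present).
suitability = np.array([[3.0, 0.4], [1.5, 1.2]])
buildings = np.array([[0, 1], [0, 0]])
roads = np.array([[0, 0], [0, 0]])
contaminated = np.array([[0, 0], [1, 0]])

# A cell is excluded if any exclusion layer is present there.
excluded = (buildings | roads | contaminated).astype(bool)

# NaN stands in for "no data" in this sketch.
candidates = np.where(excluded, np.nan, suitability)
print(candidates)
```

Masking after the summation, rather than assigning large negative weights, keeps excluded cells from appearing in the ranked results at all.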

Figure 2: Results of wetland mitigation suitability model for Argonne National Laboratory.

Table 3: Model input values and results for the five locations shown in Figure 2.

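| Layer | Site 1 | Site 2 | Site 3 | Site 4 | Site 5 |
|---|---|---|---|---|---|
| Land use | 0.5 | -0.3 | 0.5 | -0.3 | 0.5 |
| Hydrology | 0.7 | 0.5 | 0.5 | 0.7 | 0.0 |
| Soil | 0.5 | 0.5 | 0.0 | 0.5 | 0.0 |
| Vegetation | 0.3 | -0.3 | 0.5 | 0.3 | 0.0 |
| Historic | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| Model output | 3.0 | 0.4 | 1.5 | 1.2 | 0.5 |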

The locations highlighted in Figure 2 and Table 3 show how co-location of multiple suitability factors determined the outcome. Site 1, with the highest score, had favorable values in all categories, including suitable land use, a local depression, hydric soils, marsh vegetation, and an historic wetland occurring at that location. Factors at Site 2 were mixed, with favorable hydrology and soil conditions, but with forest cover, no historic wetland, and an unsuitable land use. Site 5 shows lower potential since the only positive suitability factor was land use.
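The model outputs in Table 3 can be reproduced by summing the layer values for each site. The values below are transcribed from Table 3; the variable names are illustrative.

```python
# Suitability values per layer for Sites 1-5, transcribed from Table 3.
table3 = {
    "Land use": [0.5, -0.3, 0.5, -0.3, 0.5],
    "Hydrology": [0.7, 0.5, 0.5, 0.7, 0.0],
    "Soil": [0.5, 0.5, 0.0, 0.5, 0.0],
    "Vegetation": [0.3, -0.3, 0.5, 0.3, 0.0],
    "Historic": [1.0, 0.0, 0.0, 0.0, 0.0],
}

# Model output is the column-wise sum across layers, rounded to one
# decimal place to match the precision reported in the table.
outputs = [round(sum(col), 1) for col in zip(*table3.values())]
print(outputs)  # [3.0, 0.4, 1.5, 1.2, 0.5]
```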

The modeling approach allowed a rapid assessment of the entire site, requiring only readily available data. Although the model was fairly simple and weighting factors were qualitatively assigned, the results were satisfactory, fitting well with known wetlands on the site and with field work conducted during the project. On the basis of field survey and modeling results, seven areas were selected for further analysis, including preliminary site designs and cost estimates.

Predicting Prehistoric Site Locations

Another model developed for Aberdeen Proving Ground was designed to predict the occurrence of prehistoric archaeological sites. This work is documented in greater detail in Wescott and Kuiper (1999) and Kuiper and Wescott (1999), so only the necessary details to compare and contrast the work with the other case studies are presented here. Modeling to locate unrecorded prehistoric archaeological sites using GIS is becoming increasingly popular among archaeologists, but there are some significant assumptions that must be made that are unique to this problem. Generally, these techniques assume that the selection of sites by original inhabitants was at least partially based on a set of favorable environmental factors, such as distance to water or topographic setting. Another assumption is that current GIS layers consistently characterize changes from the prehistoric condition of the region sufficiently well that they can be used to help discover additional sites. This suggests a greater need for supporting data than the case studies presented so far.

Aberdeen Proving Ground proved to be a good candidate for modeling for several reasons. Many known prehistoric sites are located in the area, and archaeological surveys in the region have included systematic collection of the environmental characteristics at known sites. Restricted access to much of the installation, low levels of disturbance, and protective regulations have produced a uniquely protected setting that contrasts sharply with the higher rates of development in areas surrounding the installation. A modeling approach is also appropriate because of the difficulty of conducting surveys at APG. Barriers include presence of unexploded ordnance and extensive wetland areas, and limited access to range activity areas. Past intensive surveys at APG have been limited to shorelines.

Model formulation began with the development of a database of 572 recorded sites in the Upper Chesapeake Bay region. Site data included a polygon location in the GIS, site type, distance to water, type of water source (brackish or fresh), soil type, topographic setting, slope, elevation, aspect, geomorphic setting, time period, dimensions, and contents. The data were examined with statistical software to better understand the information and to identify patterns that would be useful for modeling purposes.

GIS layers were produced for each of the significant environmental variables in the archaeological site database. Most layers were derived from existing line or polygon layers, but some required a number of steps to produce the final result. In particular, topographic setting and type of nearest water source involved the most work to compile.

Next, descriptive statistics from both the database of known sites and 500 random points generated from the GIS layers were calculated and examined to determine the most significant environmental layers. The known-sites database was divided into two parts (shell midden and non-shell midden) to decrease the influence of opposite environmental factors for different types of sites. For example, shell middens would tend to occur at low elevations because of the proximity of shellfish sources, while a hunting camp might be found on a bluff to maximize the range of view. The 46 known sites occurring within APG lands were omitted during this analysis for later use in model validation. The analysis resulted in the selection of four relevant environmental factors.

Because the available data did not meet the sample size or presence/absence requirements of logistic regression, a similar but less rigorous model was implemented. Rather than using a weighted mean, it was determined that it would be more powerful to assign site potential on the basis of the co-occurrence of specific combinations of the four environmental variables. The four variables were limited to two groups each to maintain an adequate sample size. As can be seen in Table 4, with this arrangement 37.5% of the known shell sites were within 500 feet of water, with nearest water type brackish, elevation 20 feet or less, and topographic setting a floodplain or flat. Frequencies were much lower for combinations that would be intuitively unlikely to be associated with shell midden sites. A similar set of statistics was developed for non-shell sites.

Table 4: Frequencies of unique combinations for shell prehistoric sites in the Upper Chesapeake Bay Region (Wescott and Kuiper 1999).

Distance to water (ft)  Water type  Elevation (ft)  Topography       Frequency  Percentage
0-500                   Brackish    0-20            Terrace/Bluff           75        34.7
0-500                   Brackish    0-20            Floodplain/Flat         81        37.5
0-500                   Brackish    > 20            Terrace/Bluff           14         6.5
0-500                   Brackish    > 20            Floodplain/Flat          2         0.9
0-500                   Fresh       0-20            Terrace/Bluff           24        11.1
0-500                   Fresh       0-20            Floodplain/Flat         10         4.6
0-500                   Fresh       > 20            Terrace/Bluff            4         1.9
> 500                   Brackish    0-20            Terrace/Bluff            4         1.9
> 500                   Fresh       0-20            Terrace/Bluff            2         0.9
Totals                                                                     216       100.0
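
A frequency table of this kind can be tallied directly from site records. The following Python sketch is illustrative only: the four records are hypothetical (they are not drawn from the site database), and the function name combination_frequencies is an assumption introduced here, not part of the original workflow.

```python
from collections import Counter

# Hypothetical site records, one tuple per site:
# (distance-to-water class, water type, elevation class, topography).
sites = [
    ("0-500", "Brackish", "0-20", "Floodplain/Flat"),
    ("0-500", "Brackish", "0-20", "Floodplain/Flat"),
    ("0-500", "Brackish", "0-20", "Terrace/Bluff"),
    ("0-500", "Fresh", "0-20", "Terrace/Bluff"),
]


def combination_frequencies(records):
    """Return {combination: (count, percent of all records)}."""
    counts = Counter(records)
    total = sum(counts.values())
    return {combo: (n, 100.0 * n / total) for combo, n in counts.items()}


freqs = combination_frequencies(sites)

# List combinations from most to least frequent.
for combo, (n, pct) in sorted(freqs.items(), key=lambda kv: -kv[1][0]):
    print(combo, n, round(pct, 1))
```

Combinations that never occur in the records simply do not appear in the result, which is consistent with Table 4 listing only the combinations that were observed.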

Similar to logistic regression, the model associated site potential with the occurrence of a unique combination of environmental factors. Instead of calculating a probability, however, it simply used the observed frequency of the unique combination as the measure of site potential. To locate cells with these unique combinations in the Arc/Info GIS, the Grid COMBINE function was used to produce a GIS layer with a value for each unique combination of the environmental variables. A high potential was assigned to unique combinations occurring over 20% of the time, medium potential to those occurring 6.25% to 20% of the time, and low or no potential to those occurring less than 6.25% of the time. (The sixteen possible combinations of four two-class variables would each comprise 6.25% of the total if they were equal in distribution.) These site potential levels were linked to the grid value attribute table (VAT) using a relational join in the database. This assigned the site potential to each cell in the map, and results could be queried and visualized. The same approach was used to produce a non-shell site potential map; then the results of the two models were combined using the Grid MAX function. Results of the combined model for a portion of APG are shown in Figure 3.
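
The combine, classify, join, and maximum steps can be sketched outside Arc/Info with NumPy arrays. This is a minimal illustration under stated assumptions, not the implementation used in the study: the 2 x 3 grids are made up, the bit-packing in combine() is only a stand-in for the Grid COMBINE function, np.maximum stands in for Grid MAX, and the non-shell percentages are invented for the example. The four shell percentages are taken from Table 4.

```python
import numpy as np

HIGH, MEDIUM, LOW = 3, 2, 1


def classify(percent):
    """Thresholds from the text: over 20% high, 6.25-20% medium, else low."""
    if percent > 20.0:
        return HIGH
    if percent >= 6.25:
        return MEDIUM
    return LOW


def combine(near_water, brackish, low_elev, flat_topo):
    """Pack four binary grids into one combination code per cell (0-15)."""
    return near_water * 8 + brackish * 4 + low_elev * 2 + flat_topo


def potential_grid(code_grid, percent_by_code):
    """Attach a potential class to every cell via a 16-entry lookup table."""
    lookup = np.full(16, LOW)  # combinations never observed default to low
    for code, pct in percent_by_code.items():
        lookup[code] = classify(pct)
    return lookup[code_grid]


# Toy binary layers (1 = within 500 ft of water, brackish, 0-20 ft, flat).
near_water = np.array([[1, 1, 0], [1, 0, 0]])
brackish = np.array([[1, 0, 0], [1, 1, 0]])
low_elev = np.array([[1, 1, 1], [0, 1, 0]])
flat_topo = np.array([[1, 0, 1], [0, 0, 0]])

codes = combine(near_water, brackish, low_elev, flat_topo)

# Shell percentages from Table 4, keyed by combination code.
shell_pct = {15: 37.5, 14: 34.7, 12: 6.5, 10: 11.1}
# Hypothetical non-shell percentages, invented for this sketch.
nonshell_pct = {0: 25.0, 3: 10.0}

shell = potential_grid(codes, shell_pct)
nonshell = potential_grid(codes, nonshell_pct)

# Each cell keeps the higher of its two potentials.
combined = np.maximum(shell, nonshell)
print(combined)
```

Taking the cell-by-cell maximum means a location rated high by either model stays high in the combined map, which is why ordering the classes as integers (low < medium < high) matters in this sketch.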

Figure 3 Results of predictive archaeological model for prehistoric shell midden sites at Aberdeen Proving Ground (Kuiper and Wescott 1999).

With the availability of the known site database and the 46 records set aside for validation, a more detailed assessment of the model was possible in this case study. The locations of known sites were compared to the model output, and Kvamme's Gain Statistic (1 - [% area / % known sites]) (Kvamme 1988) was calculated. Results are shown in Table 5. The shell model matched well with known sites in the region, which can be attributed to several factors. The environmental characteristics associated with these sites are well defined, the model was focused specifically on these types of sites, and the mix of environmental conditions identified in the model matches well with shoreline surveys used to discover the sites. For a variety of reasons, results were less successful for the non-shell sites.

Table 5: Summary of validation results for the Aberdeen Proving Ground archaeological site potential model (Kuiper and Wescott 1999).

                  Percent of site area          Number of known sites         Kvamme's Gain Statistic
Site potential    Shell  Non-shell  Combined    Shell  Non-shell  Combined    Shell  Non-shell  Combined
High               16.5        2.7      19.2       12          2        42     0.82       0.55      0.79
Medium              2.5       44.3      29.0        0         30         4     0.80       0.52      0.52
Low                81.0       53.0      51.8        1          1         0        -          -         -
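
As a worked check of the gain formula, the short Python sketch below recomputes the high-potential values from Table 5. It assumes that the percentage of known sites is the number of validation sites in the zone divided by the column total (13 shell, 33 non-shell, 46 combined); that reading is inferred from the table, and the function name is introduced here for illustration.

```python
def kvamme_gain(percent_area, percent_sites):
    """Kvamme's Gain Statistic: 1 - (% area / % known sites)."""
    if percent_sites <= 0:
        raise ValueError("percent_sites must be positive")
    return 1.0 - percent_area / percent_sites


# High-potential rows of Table 5: (percent of area, sites in zone, total sites).
high_rows = {
    "shell": (16.5, 12, 13),
    "non-shell": (2.7, 2, 33),
    "combined": (19.2, 42, 46),
}

gains = {
    name: kvamme_gain(area, 100.0 * in_zone / total)
    for name, (area, in_zone, total) in high_rows.items()
}

for name, gain in gains.items():
    print(name, round(gain, 2))
# shell 0.82
# non-shell 0.55
# combined 0.79
```

A gain near 1 indicates that a small share of the area captures a large share of the known sites; a gain near 0 indicates the zone performs no better than chance.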

The results of the study provided a map useful for refining and reducing the areas considered to have high potential for prehistoric archaeological sites; however, ground truthing is still necessary to better validate the results. The modeling does not take the place of intensive archaeological survey to discover sites, but it does provide planners with a guide showing areas that would likely require less time, effort, and money to develop from a cultural resources compliance standpoint. Priority areas for evaluation, monitoring, or mitigation are augmented by the model results.

DISCUSSION

The modeling approaches described and the case studies presented show that grid-based environmental constraint or resource mapping models can be designed and computed with a relatively small amount of effort, yet still yield useful results. In the case studies, similar modeling approaches were used to model three substantially different problems, each giving useful results with available data. These results provide information to resource planners and land managers in a concise and informative format.

For effective modeling, an understanding of the significant factors contributing to the resource in question, knowledge of available GIS data sources, and familiarity with grid-based modeling are all necessary. Resourcefulness and creativity are needed to develop input layers that will support the analysis. Weighting factors or other parameters must be designed to adjust for unique characteristics of the input layers, such as importance or numerical range differences.

Once a model is developed, validation and ground truthing are still essential to achieve a complete answer to the problem in question; however, each model presented gave results that could be used to focus or refine field data collection. After initial results are completed, adjustments to a model can be made quickly as new information and insights are gained. More complex or statistically defensible models, such as logistic regression, multivariate regression, or classification, can also be implemented when supporting data are available. Opportunities for improved modeling will also occur as data collection and survey techniques are increasingly designed with the goal of supporting GIS modeling.

ACKNOWLEDGMENTS

I would like to thank Konnie Wescott of Argonne, and Reed MacMillian, David Blick, and other staff members of the Directorate of Safety, Health and Environment at Aberdeen Proving Ground for their support and encouragement during the archaeological survey modeling project. Thanks also to John Hoffecker for his guidance and interest in the archaeological aspects of this work. Richard Olsen and Gary Williams made significant contributions to the limits to growth modeling, and Richard also provided management support. The Aberdeen case studies were sponsored by the U.S. Army as part of a larger environmental analysis for Aberdeen Proving Ground, supported under a military interdepartmental purchase request from the U.S. Department of Defense, U.S. Army, through U.S. Department of Energy contract W-31-109-Eng-38. For the Argonne National Laboratory wetlands modeling, Mark Kamiya sponsored the work, and Robert Van Lonkhuyzen and Kirk LaGory were instrumental in the analysis. Kirk LaGory also provided a constructive review of the initial draft of this document. Rob Hrabak deserves thanks for his ongoing support of the Argonne GIS which made the modeling possible. The Argonne portion of the work was supported by the U.S. Department of Energy, Chicago Operations Office, under contract W-31-109-Eng-38. Finally, Joan Meyer and Margaret Greaney provided GIS support to each case study.

DISTRIBUTION STATEMENT AND DISCLAIMER

Distribution Restriction Statement: Approved for public release: Distribution is unlimited. #3160-A-5

Neither the U.S. Army, nor any of its employees or officers, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights.

Reference to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the U.S. Army. The views and opinions of the authors do not necessarily state or reflect those of the U.S. Army.

REFERENCES

Directorate of Safety, Health and Environment (DSHE), 1996, Geographic Information System database, U.S. Army, Aberdeen Proving Ground, MD.

Kuiper, J.A., 1996, Producing a Programmatic Environmental Impact Statement for a Large Federal Facility: A GIS Technical Leader's Perspective, paper presented at the 16th Annual ESRI User Conference, Palm Springs, CA.

Kuiper, J.A., and K.L. Wescott, 1999, A GIS Approach for Predicting Prehistoric Site Locations, presented at the 19th Annual ESRI User Conference, San Diego, CA.

Kvamme, K., 1988, Development and testing of quantitative models. In W.J. Judge and L. Sebastian (eds), Quantifying the Present and Predicting the Past: Theory, Method, and Application of Archaeological Predictive Modeling, U.S. Department of the Interior, Bureau of Land Management Service Center, Denver, CO, pp. 325-428.

Mapes, D.R., 1979, Soil Survey of DuPage and Part of Cook Counties, Illinois, U.S. Department of Agriculture, Soil Conservation Service.

Wescott, K.L., and J.A. Kuiper, 1999, Using a GIS to Model Prehistoric Site Distributions in the Upper Chesapeake Bay. In K.L. Wescott and R.J. Brandon (eds), Practical Applications of GIS for Archaeologists: A Predictive Modeling Tool-Kit. London: Taylor & Francis (in press).

Van Lonkhuyzen, R.A., and K.E. LaGory, 1994, Wetlands of Argonne National Laboratory-East, DuPage County, Illinois, ANL/EAD/TM-12, Argonne National Laboratory, IL.


AUTHOR INFORMATION

James A. Kuiper (jkuiper@anl.gov): Biogeographer / GIS Analyst
Environmental Assessment Division - 900/D11
Argonne National Laboratory
9700 South Cass Avenue
Argonne, IL 60439-4832
Office: (708) 252-6206
FAX: (708) 252-3659
Internet: www.ead.anl.gov