GEOG 303: Introductory GIS

Rough lecture notes

Week 2


Trends in GIS

The Three Schools of GIS

Solving real world problems

Using GIS to solve problems in the real world requires interaction between the real world, the GIS and the users.

The real world needs to be represented within a GIS.  Users perceive the real world in a manner related to their problem, and hence need to be able to communicate with the GIS in terms related to that problem (ie. data, functionality, etc.).

How do we represent the real world?

Geographic features in the real world can be represented in a number of ways as follows:

1. Analog map

2. Digital map

3. GIS

Abstraction and generalisation


The process for obtaining a representation of the real world follows the cartographic process for abstraction and generalisation.  The process involves the steps of selection, classification, simplification and symbolisation.

The process for obtaining a GIS representation must consider the purpose, content and detail of the database.  This is similar to the cartographic map-making process in which the purpose, content, cartographic scale and presentation must be considered in producing a map.

Steps of the generalisation process 

The steps of the process of abstraction and generalisation are described as follows:

  • Selection. Involves decisions regarding the geographic space to be mapped, map scale, map coordinates and projection, data variables to be mapped, data gathering/sampling techniques.
  • Classification. Process in which objects are placed in groups according to similar properties. This reduces the complexity and improves the organisation of a map.
  • Simplification. Map features can be simplified by smoothing curves and straightening paths to eliminate unnecessary detail. For example, a straight line between two cities can indicate that the cities are connected; the exact position of the road may be irrelevant for a particular application.
  • Symbolisation. A set of marks or symbols is used to represent real world phenomena on a map. Such symbolisation involves defining size, shape, pattern, and color for points, lines, and polygons (areas).
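
Two of the steps above, classification and simplification, can be sketched in a few lines of Python. This is only an illustration: the class breaks, labels and tolerance are made-up values, and the simplification uses the Douglas-Peucker algorithm, which is one common line-simplification technique and is not named in these notes.

```python
import math

def classify(value, breaks, labels):
    """Classification: return the label of the first class whose upper break is not exceeded."""
    for upper, label in zip(breaks, labels):
        if value <= upper:
            return label
    return labels[-1]  # above the last break

def _distance_to_line(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / length

def simplify(points, tolerance):
    """Simplification (Douglas-Peucker): drop vertices within `tolerance` of the trend line."""
    if len(points) < 3:
        return list(points)
    distances = [_distance_to_line(p, points[0], points[-1]) for p in points[1:-1]]
    farthest = max(distances)
    if farthest <= tolerance:
        return [points[0], points[-1]]
    split = distances.index(farthest) + 1
    left = simplify(points[:split + 1], tolerance)
    right = simplify(points[split:], tolerance)
    return left[:-1] + right  # the split vertex appears in both halves

# Hypothetical settlement classes by population
breaks = [10_000, 100_000]
labels = ["town", "city", "metropolis"]

# A slightly wiggly road between two cities
road = [(0, 0), (1, 0.1), (2, -0.1), (3, 0.05), (4, 0)]
straight = simplify(road, tolerance=0.5)
```

With a tolerance of 0.5 the wiggly road collapses to a straight line between its end points, which is the "connectivity only" representation described above.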

Data representation with GIS 

In many ways, GIS have retained the notion of the map, and many map concepts reappear in GIS. However, the manner in which GIS handle and analyse data is very different from that for maps, even though much of the data input into GIS is derived from maps.

Within GIS, data is often structured in layers, mirroring the way in which maps have traditionally been handled. Each layer, also known as a coverage, contains some specific data such as a theme (eg. roads, vegetation cover, soils, etc.), a time period (eg. years 1970, 1980, 1990) or a vertical slice (eg. ground floor, first floor, etc. of a building).


Geographic data includes both spatial data and descriptive (or attribute) data. Spatial data deals with location, shape and relationships among features. Attribute data deals with the characteristics of the features.
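
As a rough illustration of the split between spatial and attribute data, a feature can be held as a geometry plus a dictionary of attributes. The `Feature` class, its field names and the sample road are hypothetical and are not taken from any particular GIS package.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Feature:
    geometry_type: str                               # "point", "line" or "polygon"
    coordinates: list                                # spatial data: (x, y) tuples
    attributes: dict = field(default_factory=dict)   # descriptive (attribute) data

def line_length(feature):
    """A spatial question: how long is the line, summed over its segments?"""
    pts = feature.coordinates
    return sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))

road = Feature(
    "line",
    [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)],
    {"name": "Main Rd", "surface": "sealed"},
)

length = line_length(road)            # answered from the spatial data
surface = road.attributes["surface"]  # answered from the attribute data
```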

 

Essential GIS components

Every GIS must include: 

The database is the heart of the GIS.  It must be structured so that the data can be accessed by functions initiated by users.  In the following sections, we will consider the structure of the data as well as the functions that operate on the data.

Structure of geographic data

The following chart illustrates the structure of geographic data.

The spatial component consists of locational information (ie. absolute or relative X,Y coordinates), geometry (ie. the shape of point, line and polygon features, or raster cells) and topology (ie. relationships between points, lines and polygons: adjacency, connectivity and containment).  Attribute data can consist of both descriptive data and cartographic attributes (eg. line color and thickness, point symbol, etc.).  A third component is temporal data, which is sometimes considered a further dimension (eg. a fourth dimension) but is often included as another attribute of the data. Never forget metadata!
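
Topology can be made concrete with a small sketch that derives adjacency between polygons. It assumes a simplification: two polygons count as adjacent only if they share an edge with identical end points. The parcel names and coordinates are invented for the example.

```python
from itertools import combinations

def edges(ring):
    """Set of undirected edges of a polygon given as a list of (x, y) vertices."""
    n = len(ring)
    return {frozenset((ring[i], ring[(i + 1) % n])) for i in range(n)}

def adjacency(polygons):
    """Map each polygon name to the names of polygons sharing an edge with it."""
    adjacent = {name: set() for name in polygons}
    for (a, ring_a), (b, ring_b) in combinations(polygons.items(), 2):
        if edges(ring_a) & edges(ring_b):
            adjacent[a].add(b)
            adjacent[b].add(a)
    return adjacent

# Three hypothetical land parcels: A and B share a boundary, C stands alone
parcels = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],
    "C": [(3, 0), (4, 0), (4, 1), (3, 1)],
}
neighbours = adjacency(parcels)
```

Here the relationship "A is next to B" is not stored in the coordinates as such; it is derived from them and could then be stored explicitly as topology.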

Types of GIS data

Two broad types of data can be identified:
  1. Field-based data - The information is a collection of spatial distributions and is referred to as continuous data.  Examples include altitude, rainfall, temperature, etc.
  2. Object-based data - The information is composed of identifiable entities and is referred to as discrete data.  Examples include roads, rivers, land parcels, etc.

GIS data models

The two types of data - field-based and object-based - are implemented in two geographic data models: 
  1. Raster - stores the space "around" the objects (features)
    • stores pixels (picture elements) in an image and cells in a grid
    • most closely represents continuous data
  2. Vector - stores the objects "in" space
    • stores points, lines and polygons to represent the features
    • most closely represents object-based data
Note, however, that either data model can be used to store field-based or object-based data.  Both models define a "discretisation" of the features (ie. grid cells or vector objects).  In other words, even continuous features are represented discretely.
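
To see the two models side by side, the sketch below stores one feature as a vector polygon and then converts it to a raster by testing each cell centre. The cell-centre rule, the ray-casting test and the row-0-at-the-top convention are assumptions made for the example; real software may use other rules.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: is (x, y) inside the polygon with vertices `ring`?"""
    inside = False
    n = len(ring)
    for i in range(n):
        (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def rasterise(ring, n_rows, n_cols, cell_size):
    """Raster version: 1 where the cell centre falls inside the polygon, else 0."""
    grid = []
    for row in range(n_rows):
        y = (n_rows - row - 0.5) * cell_size  # row 0 is the top row
        grid.append([
            1 if point_in_polygon((col + 0.5) * cell_size, y, ring) else 0
            for col in range(n_cols)
        ])
    return grid

# Vector version: three vertices describe the triangle
triangle = [(0, 0), (4, 0), (0, 3)]

# Raster version: sixteen cells approximate it
cells = rasterise(triangle, n_rows=4, n_cols=4, cell_size=1.0)
for row in cells:
    print(row)
```

The sloping edge of the triangle becomes a staircase of cells, which is the discretisation referred to above.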

The type of data model used within a GIS will affect not only the database but also the functionality and the user interface.  We will explore the functions for each type of data model in the following sections.

Scale of measurement


The meaning or semantics of the data values stored in a geographic database depend on the scale of measurement chosen:

  • Ratio:
    • values can be meaningfully multiplied and divided (an absolute scale with a true zero (0))
    • eg. rainfall of Region 1 is twice that of Region 2
  • Interval:
    • values can be meaningfully added and subtracted (a relative scale)
    • eg. Region 1 is 10 degrees warmer than Region 2
  • Ordinal:
    • values establish order (ranking) only
    • eg. Region A is most suitable (eg. value of 1) and Region B least suitable (eg. value of 5)
  • Nominal:
    • numbers establish identity only
    • eg. lot numbers, postal code zones, etc.

Note that the amount of information carried by the values decreases from ratio to interval to ordinal to nominal.
Different scales of measurement can be used for the same phenomenon. 

Consider, for example, data representing petrol stations.

Note how the scale of measurement cannot be determined from observing the values alone.

Ratio: 72.9, 68.5, 67.9, 61.3,...
(petrol prices)

Interval: 25, 29, 30, 27,...
(average temperature of petrol)

Ordinal: 1, 2, 3, 4,...
(ranked by decreasing price of petrol)

Nominal: 1, 2, 3, 4,...
(1=BP, 2=Ampol, 3=Caltex, etc...)
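
Because the numbers alone do not reveal the scale, the scale has to be recorded alongside the data and used to decide which calculations make sense. The sketch below follows the usual textbook pairing of scales and summary statistics; the `PERMITTED` table and `summarise` function are hypothetical names introduced for the illustration.

```python
from statistics import mean, median, mode

# Which summaries are meaningful at each scale (each scale adds to the one before)
PERMITTED = {
    "nominal": {"mode"},
    "ordinal": {"mode", "median"},
    "interval": {"mode", "median", "mean"},
    "ratio": {"mode", "median", "mean", "ratio"},
}

def summarise(values, scale):
    """Return only the summaries that are meaningful for the given scale."""
    allowed = PERMITTED[scale]
    result = {}
    if "mode" in allowed:
        result["mode"] = mode(values)
    if "median" in allowed:
        result["median"] = median(values)
    if "mean" in allowed:
        result["mean"] = mean(values)
    if "ratio" in allowed:
        result["max_to_min_ratio"] = max(values) / min(values)
    return result

# The same stored numbers, interpreted under two different scales
values = [1, 2, 2, 4]
as_brand_codes = summarise(values, "nominal")  # eg. 1=BP, 2=Ampol, ...
as_prices = summarise(values, "ratio")         # eg. prices in some unit
```

Treated as brand codes, the only sensible summary is the most frequent code; a mean brand of 2.25 would be meaningless, even though it can be calculated.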

Grid GIS

A grid GIS is based on the raster data model. The foundational unit of storage is the grid cell. Square grid cells are most commonly used to store grid data. 
Each cell specifies the type or value of an attribute.  Only one value is stored per grid cell.  Note that a value must still be stored for a grid cell with no recorded data - usually a zero (0) or a special "no data" symbol.  A group of contiguous cells having an identical value is referred to as a region.
Data is arranged in a matrix and located by coordinates which relate to the row and column numbers. Generally speaking, grid cells (matrices) are easy to store, manipulate and display.
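
A minimal sketch of locating cells by row and column follows. It assumes square cells, an origin at the top-left corner of the grid and row 0 as the top row; the `NODATA` value of -9999 is just one possible convention, and actual software may differ on all of these points.

```python
NODATA = -9999  # assumed "no data" marker

# A tiny grid of 2 rows x 3 columns; each cell holds exactly one value
grid = [
    [3, 3, NODATA],
    [3, 7, 7],
]

def cell_index(x, y, x_min, y_max, cell_size):
    """Row and column of the cell containing map coordinate (x, y)."""
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)
    return row, col

def cell_centre(row, col, x_min, y_max, cell_size):
    """Map coordinate of the centre of cell (row, col)."""
    return (x_min + (col + 0.5) * cell_size, y_max - (row + 0.5) * cell_size)

def value_at(grid, x, y, x_min, y_max, cell_size):
    """Value stored in the cell containing (x, y)."""
    row, col = cell_index(x, y, x_min, y_max, cell_size)
    return grid[row][col]

# Grid covers x from 0 to 300 and y from 0 to 200 with 100-unit cells
value = value_at(grid, 150.0, 50.0, x_min=0.0, y_max=200.0, cell_size=100.0)
```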

Grid database and layering

Creating a grid database essentially involves overlaying an empty grid on the original data, reading off the data values for each grid cell and storing them in a matrix. 
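
The "overlay and read off" procedure might be sketched as below. The function `soil_at` is a hypothetical stand-in for the original map, and reading the value at the cell centre is only one possible rule for choosing a cell value.

```python
def build_grid(read_value, n_rows, n_cols, cell_size):
    """Overlay an empty grid on the source and read one value per cell (at its centre)."""
    y_max = n_rows * cell_size
    grid = []
    for row in range(n_rows):
        y = y_max - (row + 0.5) * cell_size  # row 0 is the top row
        grid.append([read_value((col + 0.5) * cell_size, y) for col in range(n_cols)])
    return grid

def soil_at(x, y):
    """Hypothetical source map: clay west of x = 1.2, sand elsewhere."""
    return "clay" if x < 1.2 else "sand"

soil_grid = build_grid(soil_at, n_rows=2, n_cols=4, cell_size=1.0)
```

The second column of cells spans x = 1 to x = 2 and so contains both soils, but only "sand" is stored because that is the value at the cell centre.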

Because only one value is stored per grid cell, how do we store multiple values for a specific location? We use layering.

Data is stored using the layered concept - a theme or closely-related group of data items is stored in one layer. Hence, a grid database may consist of a number of layers, each representing some theme of information (eg. soils, roads, drill holes, etc.).

Each cell can contain one, and only one, data value for a given layer. Therefore, if multiple attributes are recorded for a particular theme of data (eg. soil type and pH value for the same soil area), then these attributes must be separated into two or more layers (eg. one for soil types, the other for pH values). 
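
The soil example can be sketched as two layers that share the same grid geometry. The layer names and values are invented; the point is only that each layer holds one value per cell and that a location is described by reading the same row and column in every layer.

```python
# One matrix per layer; all layers cover the same rows and columns
layers = {
    "soil_type": [["clay", "clay"], ["sand", "loam"]],
    "soil_ph": [[6.5, 6.4], [7.1, 5.9]],
}

def same_shape(layers):
    """Check that every layer has the same number of rows and columns."""
    shapes = {(len(g), tuple(len(row) for row in g)) for g in layers.values()}
    return len(shapes) == 1

def values_at(layers, row, col):
    """Collect the value of every layer at one cell."""
    return {name: grid[row][col] for name, grid in layers.items()}

cell = values_at(layers, row=1, col=0)
```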

Grid cell data

The types of features that can be represented within a grid cell include the following:
  • punctual data: 0-dimensional data
  • lineal data: 1-D
  • areal data: 2-D
  • surficial data: 3-D (or more accurately 2.5-D)

How do you identify the cell location?  (depends on the software)

  • Center of the cell?
  • Top right-hand corner?
  • Bottom left-hand corner?

What value do we give to the cell?

Grid resolution

The resolution of a grid database is dependent on the grid cell size. Grid cell size IS EXTREMELY IMPORTANT and must be chosen with care when the database is designed. Once set, it is difficult or even impossible to change. Changing the resolution affects a whole range of properties including: 

Cell values

A number of types of operations are available for grid data and may involve one or more layers resulting in a new layer being formed. The following list indicates some of the basic functions of a grid GIS.

Grid GIS functionality

Operations on one layer:
  • display all/individual cells
  • reclassify cells
  • neighbourhood operations
    • local
      • statistical (mean, max, etc.)
      • calculate slope and aspect
      • filtering data
    • extended
      • create buffer zones
      • obtain drainage paths
      • interpolation
  • measurement
    • distance, size, shape
  • transformations/projections
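
Two of the single-layer operations, reclassification and a local neighbourhood statistic, can be sketched on small matrices. The 3x3 window and the treatment of edge cells (using only the neighbours that exist) are assumptions made for the example; the land cover codes are invented.

```python
def reclassify(grid, mapping, default=None):
    """Reclassify: produce a new layer by replacing each cell value via a lookup table."""
    return [[mapping.get(value, default) for value in row] for row in grid]

def focal_mean(grid):
    """Neighbourhood mean over a 3x3 window, using only cells inside the grid."""
    n_rows, n_cols = len(grid), len(grid[0])
    out = []
    for r in range(n_rows):
        out_row = []
        for c in range(n_cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(n_rows, r + 2))
                for cc in range(max(0, c - 1), min(n_cols, c + 2))
            ]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out

land_cover = [[1, 2], [3, 1]]  # hypothetical codes
named = reclassify(land_cover, {1: "forest", 2: "water", 3: "urban"})

elevation = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
smoothed = focal_mean(elevation)
```

Both functions return a new layer and leave the input layer unchanged.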
Operations on multiple layers:
  • overlay
    • boolean
    • weighted
  • spreading (buffering) through:
    • barriers
    • friction (impedance) surfaces
  • viewshed analysis (intervisibility)
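
Boolean and weighted overlay both combine the cells at the same row and column in two or more layers. The sketch below assumes all layers have identical dimensions; the layer names, scores and weights are invented for the illustration.

```python
def boolean_overlay(layer_a, layer_b):
    """Boolean (AND) overlay: a cell is True only where both input cells are true."""
    return [
        [bool(a) and bool(b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(layer_a, layer_b)
    ]

def weighted_overlay(layers, weights):
    """Weighted overlay: cell-by-cell weighted sum of several score layers."""
    n_rows, n_cols = len(layers[0]), len(layers[0][0])
    return [
        [sum(w * layer[r][c] for layer, w in zip(layers, weights)) for c in range(n_cols)]
        for r in range(n_rows)
    ]

# Hypothetical suitability inputs (1 = condition met, 0 = not met)
good_soil = [[1, 0], [1, 1]]
near_road = [[1, 1], [0, 1]]
suitable = boolean_overlay(good_soil, near_road)

# Hypothetical score layers combined with equal weights
slope_score = [[1, 3], [5, 2]]
soil_score = [[4, 4], [2, 1]]
combined = weighted_overlay([slope_score, soil_score], [0.5, 0.5])
```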


Week 3