Home > Courses > Archaeological GIS | Dean Snow

 


Lecture 08 : Introduction to GIS (link to Powerpoint file)

Lab 9 : Basics of Microsoft Access

What Is A Database?

Database is loosely defined as any organized collection of data
What can be a Database?
A paper-based collection of site forms can be thought of as a manual form of a database
A collection of computer files (word processor, spreadsheets) can be considered to be a rudimentary computer based database
A simple table of data documenting site information is a very simple computer database
Problem with all the above examples is that the data is either hard to manage and/or is very inefficient
A more efficient way to store data is through the use of a Database Management System (DBMS)

Early computer databases used a file system approach
Data was stored in a group of unrelated files
Custom software programs were created to access and integrate the files and perform analyses

Problems with File System Databases

Data is Unstructured
Data formats can radically vary in the different data files (i.e. one can be a word processor file, another file might be a spreadsheet file)
High Data Dependence
Because any data file format is possible the only way to create an analysis program is through a thorough knowledge of the data files
High Maintenance Costs
High data dependence means a large learning curve for any new program development
Difficulties in Developing New Applications
Developing new applications will often necessitate adding new data which will alter the format of the data in files and therefore force reprogramming of existing programs

Database Management Systems (DBMS)

Data dependency issues of file systems led to DBMS development
What is a DBMS?
A generalized tool for managing and accessing information in a database
Collection of interrelated data
Set of programs to access the data
Data dependency issues minimized by enforcing data standardization
DBMS provides an environment that is both convenient and efficient to use.
Archaeological Database Applications:
Regional Survey: Manage Site Inventory
Site Based: Store information on features & artifacts
Artifact Analysis: Store artifact inventory & attributes
Archaeological DBMS’s allow us to manage data more efficiently than ever before. But to do so a database must be constructed properly.

Benefits of a Database Management System

Application development independent from data
Speedy data retrieval
Ensure data integrity
Enforcement of data standards
Easily modifiable

Flat File Databases

Early database management systems used a flat file approach
Flat file databases use a standard tabular approach
Data is stored a 2-dimensional system of columns and rows
Flat file databases have the “look” of a simple spreadsheet worksheet
Each row contains all the information for the object being cataloged

Basic Database Terminology I

Column, aka Field, aka Attribute
Columns in a 2-dimensional table represent the attributes of the data set (e.g. site ID, site name, site area, artifact count)
Domain
Domain of a column defines the valid data values the field can take
Site Area must be a number greater than 0
Artifact material must be ceramic, lithic, bone - north is not a valid value
Data Type
One aspect of a field’s domain is the type of data it can contain
Common Data Types
Integers: 0, 1, 2, 3….
Floating numbers: 0, 0.2, 1.752
Text (aka Character, aka String): North, Bowl, Burial
Date
Boolean: Yes, No

Basic Database Terminology II

Row, aka Record, aka Tuple
Each row contains all the attributes of a given entity (e.g. the ID, name, area, # of artifacts of a single site)

 

Table
A collection of records (e.g. data on all sites in a region)
Database
A collection of tables

 

Problems with Flat File Databases

Tables can contain lots of wasted space
In the example many records have no information under certain fields
Tables can have an exceedingly large number of fields
Collecting information on ceramic vessel diameters would necessitate adding vessel diameter fields for each ceramic type
Hard to perform certain searches
Finding the sites with ceramics necessitates checking the Bowls, Jars, and Plates fields to see which records have ceramics
Hard to summarize the data
Calculating the total number of artifacts would necessitate adding the data from 8 fields (more if we added new artifact types!)
e.g. Total # = #Mortars + #Pestles + #Bowls + #Jars + #Pl….

Relational Database Systems

The relational database model was developed to overcome the limitations of simple flat file databases.
Relational databases divide the data from a single flat file into two or more tables that can be combined and related as needed

Flat File to Relational Database

In the preceding flat file site inventory database what are the major types of data?
Site Information
Artifact Information
Ground Stone
Ceramics
Lithics
If we split the data into 4 separate tables how could we relate the information from one table to another?
Could use site name but could get 2 sites with same name
Site ID should be unique to each site so this would be the best candidate

Relational Database I Critique

In what way is our first try at a relational database more complicated than the original flat file?
We have added a level of indirection to get from the site information to the artifact information (ie. now need to navigate from one table to the others based on the value in the Site_ID field)
How is the new relational database better than the original flat file?
It really isn’t any better, just more complicated
Still have cells with blank values
Still have to add more columns if we wanted to add a new artifact type
How could we further refine the database?
Modify the artifact tables to have artifact type and quantity fields to replace the separate artifact type fields in each table.
Example: for Ceramics remove the Bowls, Jars, and Plates fields and replace them with a Ceramic type field that with domain Bowl, Jar, and Plate and put the number of artifacts in a field called Quantity

Relational Database II Critique

How is the new database still more complicated than the flat file version?
Still have a level of indirection to get from the site to artifact information
How is the new relational database better than the original flat file?
No more cells with blank values
Can easily add a new artifact type to any of the artifact tables by simply adding a new option to the domain of the type field.
example: if we found a ground stone celt from a site all we have to do is enter celt under the Type field of the Ground Stone table
Can easily collect additional data for artifacts
example: to collect ceramic vessel diameters all we have to do is add a single diameter field to the ceramics table
Much easier for the computer to calculate summary information
Although it may not look like it at first, we will see that it’s much easier to calculate artifact totals when quantity data is contained in a single field


© 2003 MATRIX
Project Director: Anne Pyburn
Indiana University Bloomington