2. Introduction
Data and its storage may be considered to be the heart of any
information system. Data has to be up-to-date, accurate, accessible
in the required form and available to one or perhaps many users
at the same time.
For data to be of value it must be presented in a form which supports
the various operational, financial, managerial, decision making,
administrative and clerical activities within an organisation.
To meet these objectives data needs to be stored efficiently -
to avoid lengthy access times - and with minimal duplication -
to avoid lengthy update times and the possibility of inconsistency
and inaccuracy. For the data stored by a given organisation to
have any value at all its integrity (consistency and accuracy)
must always be assured.
In this section we are going to consider what is know as the conceptual
view (user view) of stored data. As such we do not need to be
concerned with how data is physically stored on specific types
of storage media; neither do we need to consider the various storage
structures and access methods applicable to retrieval of data
from such media.
2.1 Data Storage.
The data of an organisation takes the form of an abstract representation
of objects and events which occur within the organisation's environment.
Within the context of an airline company for example, such objects
might include Aircraft, Passengers and Airports, and include events
such as Flights and the issuing of Tickets.
There are two principle approaches to the storage of data in a
computer-based information system. Data may be stored in separate
files belonging to isolated information systems operating within
individual departments, or data may be stored in a database which
may serve as a resource available across all departments and functional
areas. The following sections consider these two approaches and
their relative merits.
2.2 File Based Systems.
Before we consider the issues arising from file based storage
let us firstly, establish some general concepts and definitions
relating to the storage of data in files.
2.2.1 Data Files.
A file is a complete, named collection of information and the
basic unit of storage that enables a computer to distinguish one
set of information from another. For example a file named "Aircraft"
might contain information about the different types of aircraft
used by a particular airline company and a file named "Airport"
might hold details of all the airports from which the airline
operates.
2.2.2 Records.
The data held within a file are organised into structured groups
of related elements called records. For example, a record describing
an individual aircraft might be composed of the data elements:
"identifying number", "name of manufacturer";
"description", "classification"
(turbo-prop, jet, etc) "seating capacity" and
so on. The aircraft file then contains zero, one, or many such
records; where each record describes an individual aircraft.
2.2.3 Fields.
The individual elements of a record are referred to as fields.
Hence form the example above, "identifying number",
"name of manufacturer", "description",
"classification" and "seating capacity",
each represent an individual field (element) of the aircraft record.
2.2.4 Data Types.
The data to be held within each field of a given record will possess
certain characteristics in terms of size (length measured in characters
or digits) and type (numeric, alphabetic, dates, etc). Each field
of a record is allocated a particular data type which describes
the allowed characteristics of the data to be held by the field
and further indicates the range of operations which can be carried
out on the field; for example, arithmetic operations would be
valid on fields containing numeric data but not on fields containing
an address or a narrative description.
2.2.5 Keys.
A key is a field or combination of fields used to identify a record.
When a key uniquely identifies a record it is referred to as the
primary key.
Continuing with the example of the Aircraft record, if a given
value of the field "identifying number" identifies
an individual aircraft then it could serve as the primary key.
Other fields such as "manufacturer" for example,
could also serve as alternative keys (secondary keys) by which
a set of records (eg all aircraft from a particular
manufacturer) could be identified.
2.3 Issues Arising From File Based Systems.
2.3.1 Data Duplication.
The use of individual data files each serving separate information
system tends to lead to situations in which an organisation maintains
many copies of the same basic information.
For example let us consider a sales department which calculates
bonuses payable to individual sales personnel on the value of
their sales; with actual bonus payments being made only after
confirmation that the sales invoices on which bonuses have been
calculated have been fully paid.
If the sales department was to operate its bonus scheme based
on its own sales and receipts files then the sales department
would undoubtedly be holding copies of data such as employee name,
payroll number, invoice numbers and amounts received, which would
already be recorded elsewhere in the organisation by the personnel
department and by the accounts department for example.
A change in marital status resulting in a change of surname would
therefore result in the need to update several files and queries
against invoices resulting in debit or credit notes being raised
would also require the updating of several files.
2.3.2 Data Inconsistency and Integrity.
Where several discrete files exist serving the immediate requirements
of individual departments there is a strong likelihood that the
common or duplicated data held by these files will get out of
step with each other resulting in different versions of data being
held by the organisation. If the accounts receivable system in
the example above issues a credit note against an incorrectly
totalled invoice and fails to advise the sales department of this
then the integrity of the sales department's files would be lost
and as a result incorrect bonus payments would be made.
2.3.3 File Design.
Not withstanding the above consequences of file based information
systems, taken from the view point of an individual information
system, it is a relatively easy matter to ensure that the required
files are designed to "perfectly" suite user/application
needs. As such, taken in isolation, such information systems are
capable of presenting information in exactly the form required
by their users and also of providing highly efficient usage of
storage and rapid retrieval times.
2.4 Databases and Database Management Systems.
For a generalised consideration of databases we may continue to
apply the data storage concepts (files, records, keys, etc) previously
introduced for file based systems. However, databases are not
just a collection of files; through specific access controls provided
by the Database Management System (DBMS) databases are able to
provide a central resource of data which can be shared between
users on an organisation wide basis.
2.4.1 Database Design.
To satisfy the information needs of users across an organisation
the database has to be designed (in terms of "files"
and "record layouts") in a generalised manner.
Considering the personnel and sales departments views of personnel
records for example. The sales department is only interested in
a specific type of personnel record; those records for sales men
and women. The sales department might also only be interested
in fields such as payroll number, name and bonus. Whilst the personnel
department on the other hand would be interested in all personnel
records irrespective of job description ( a field in the
personnel record) and would want record fields in addition to
those of interest to the sales department, such as home-address,
marital status, date-of-appointment, department, salary-scale
and scale-point, national insurance number and so on.
2.4.2 User Views.
It is the role of the DBMS to provide facilities which enable
data ( from a generalised definition ) to be presented in the
form required by specific users. So for example the DBMS should
provide the sales department with just those record fields which
they require about sales personnel whilst at the same time providing
the personnel department with their requirements.
2.4.3 Database Security and Integrity.
There are two major consequences arising from the shared usage
of data, namely, security and integrity.
Firstly, the DBMS must ensure that users are only allowed access
to data which they are authorised to access. In addition, access
authorisation must also restrict the type of access; limiting
some users to read-only access for certain instances of data for
example. So, if only the accounts receivable users are permitted
to adjust the value of invoices, by the issuing of credit notes
for instance, then the sales department, whilst interested to
see when full payment of an invoices has been received, should
not be permitted to change the invoice records in any way.
Secondly, the DBMS must also ensure that conflicting updates do
not occur. In a stock control systems for example, a user updating
the database for a customer order, must be given exclusive access
to the records of the ordered items so their "quantity
on hand" fields may be updated. Such exclusive access
should prevent anyone from looking at out of date quantity on
hand figures and should also prevent two or more users from trying
to update the same quantity on hand values at the same time. This
serialisation of record updating is controlled through
the DBMS, which in the case of the above example, would issue
locks on the records required by each customer order transaction.
2.4.4 Database Performance.
Because databases maintain data in a generalised form, converting
this generalised form into a series of user views as required,
they are generally less efficient in terms of data storage usage
and in terms of access times compared with their individual counterpart
file based systems.
Return to Contents
Go on to Chapter 3 Relational Databases
© Copyright 1997 CHL. All rights reserved.