SQL: Chapter 2: Files, Databases and Database Management Systems

SQL: Chapter 2: Files, Databases and Database Management Systems


2. Introduction

Data and its storage may be considered to be the heart of any information system. Data has to be up-to-date, accurate, accessible in the required form and available to one or perhaps many users at the same time.

For data to be of value it must be presented in a form which supports the various operational, financial, managerial, decision making, administrative and clerical activities within an organisation.

To meet these objectives data needs to be stored efficiently - to avoid lengthy access times - and with minimal duplication - to avoid lengthy update times and the possibility of inconsistency and inaccuracy. For the data stored by a given organisation to have any value at all its integrity (consistency and accuracy) must always be assured.

In this section we are going to consider what is know as the conceptual view (user view) of stored data. As such we do not need to be concerned with how data is physically stored on specific types of storage media; neither do we need to consider the various storage structures and access methods applicable to retrieval of data from such media.

2.1 Data Storage.

The data of an organisation takes the form of an abstract representation of objects and events which occur within the organisation's environment. Within the context of an airline company for example, such objects might include Aircraft, Passengers and Airports, and include events such as Flights and the issuing of Tickets.

There are two principle approaches to the storage of data in a computer-based information system. Data may be stored in separate files belonging to isolated information systems operating within individual departments, or data may be stored in a database which may serve as a resource available across all departments and functional areas. The following sections consider these two approaches and their relative merits.

2.2 File Based Systems.

Before we consider the issues arising from file based storage let us firstly, establish some general concepts and definitions relating to the storage of data in files.

2.2.1 Data Files.

A file is a complete, named collection of information and the basic unit of storage that enables a computer to distinguish one set of information from another. For example a file named "Aircraft" might contain information about the different types of aircraft used by a particular airline company and a file named "Airport" might hold details of all the airports from which the airline operates.

2.2.2 Records.

The data held within a file are organised into structured groups of related elements called records. For example, a record describing an individual aircraft might be composed of the data elements: "identifying number", "name of manufacturer"; "description", "classification" (turbo-prop, jet, etc) "seating capacity" and so on. The aircraft file then contains zero, one, or many such records; where each record describes an individual aircraft.

2.2.3 Fields.

The individual elements of a record are referred to as fields. Hence form the example above, "identifying number", "name of manufacturer", "description", "classification" and "seating capacity", each represent an individual field (element) of the aircraft record.

2.2.4 Data Types.

The data to be held within each field of a given record will possess certain characteristics in terms of size (length measured in characters or digits) and type (numeric, alphabetic, dates, etc). Each field of a record is allocated a particular data type which describes the allowed characteristics of the data to be held by the field and further indicates the range of operations which can be carried out on the field; for example, arithmetic operations would be valid on fields containing numeric data but not on fields containing an address or a narrative description.

2.2.5 Keys.

A key is a field or combination of fields used to identify a record. When a key uniquely identifies a record it is referred to as the primary key.

Continuing with the example of the Aircraft record, if a given value of the field "identifying number" identifies an individual aircraft then it could serve as the primary key. Other fields such as "manufacturer" for example, could also serve as alternative keys (secondary keys) by which a set of records (eg all aircraft from a particular manufacturer) could be identified.

2.3 Issues Arising From File Based Systems.

2.3.1 Data Duplication.

The use of individual data files each serving separate information system tends to lead to situations in which an organisation maintains many copies of the same basic information.

For example let us consider a sales department which calculates bonuses payable to individual sales personnel on the value of their sales; with actual bonus payments being made only after confirmation that the sales invoices on which bonuses have been calculated have been fully paid.

If the sales department was to operate its bonus scheme based on its own sales and receipts files then the sales department would undoubtedly be holding copies of data such as employee name, payroll number, invoice numbers and amounts received, which would already be recorded elsewhere in the organisation by the personnel department and by the accounts department for example.

A change in marital status resulting in a change of surname would therefore result in the need to update several files and queries against invoices resulting in debit or credit notes being raised would also require the updating of several files.

2.3.2 Data Inconsistency and Integrity.

Where several discrete files exist serving the immediate requirements of individual departments there is a strong likelihood that the common or duplicated data held by these files will get out of step with each other resulting in different versions of data being held by the organisation. If the accounts receivable system in the example above issues a credit note against an incorrectly totalled invoice and fails to advise the sales department of this then the integrity of the sales department's files would be lost and as a result incorrect bonus payments would be made.

2.3.3 File Design.

Not withstanding the above consequences of file based information systems, taken from the view point of an individual information system, it is a relatively easy matter to ensure that the required files are designed to "perfectly" suite user/application needs. As such, taken in isolation, such information systems are capable of presenting information in exactly the form required by their users and also of providing highly efficient usage of storage and rapid retrieval times.

2.4 Databases and Database Management Systems.

For a generalised consideration of databases we may continue to apply the data storage concepts (files, records, keys, etc) previously introduced for file based systems. However, databases are not just a collection of files; through specific access controls provided by the Database Management System (DBMS) databases are able to provide a central resource of data which can be shared between users on an organisation wide basis.

2.4.1 Database Design.

To satisfy the information needs of users across an organisation the database has to be designed (in terms of "files" and "record layouts") in a generalised manner.

Considering the personnel and sales departments views of personnel records for example. The sales department is only interested in a specific type of personnel record; those records for sales men and women. The sales department might also only be interested in fields such as payroll number, name and bonus. Whilst the personnel department on the other hand would be interested in all personnel records irrespective of job description ( a field in the personnel record) and would want record fields in addition to those of interest to the sales department, such as home-address, marital status, date-of-appointment, department, salary-scale and scale-point, national insurance number and so on.

2.4.2 User Views.

It is the role of the DBMS to provide facilities which enable data ( from a generalised definition ) to be presented in the form required by specific users. So for example the DBMS should provide the sales department with just those record fields which they require about sales personnel whilst at the same time providing the personnel department with their requirements.

2.4.3 Database Security and Integrity.

There are two major consequences arising from the shared usage of data, namely, security and integrity.

Firstly, the DBMS must ensure that users are only allowed access to data which they are authorised to access. In addition, access authorisation must also restrict the type of access; limiting some users to read-only access for certain instances of data for example. So, if only the accounts receivable users are permitted to adjust the value of invoices, by the issuing of credit notes for instance, then the sales department, whilst interested to see when full payment of an invoices has been received, should not be permitted to change the invoice records in any way.

Secondly, the DBMS must also ensure that conflicting updates do not occur. In a stock control systems for example, a user updating the database for a customer order, must be given exclusive access to the records of the ordered items so their "quantity on hand" fields may be updated. Such exclusive access should prevent anyone from looking at out of date quantity on hand figures and should also prevent two or more users from trying to update the same quantity on hand values at the same time. This serialisation of record updating is controlled through the DBMS, which in the case of the above example, would issue locks on the records required by each customer order transaction.

2.4.4 Database Performance.

Because databases maintain data in a generalised form, converting this generalised form into a series of user views as required, they are generally less efficient in terms of data storage usage and in terms of access times compared with their individual counterpart file based systems.


Return to Contents
Go on to Chapter 3 Relational Databases


Home Download Previous Next Email Top

© Copyright 1997 CHL. All rights reserved.