The effective process of capturing, processing, analysing and storing data is pivotal to the success or failure of many organisations, with patterns and trends being identified and forecasts being made based on current levels of performance.
Chapter 1
Data Analysis and Design
Databases are an important, almost integral, part of many business systems. They provide extensive storage, processing and analytical capabilities.
This chapter will introduce you to a range of database concepts and techniques. You will become familiar with the mechanics of databases in terms of how they have evolved from paper-based and flat file systems to fully relational systems. A range of design methodologies will be examined, one example being logical data modelling, which can then be applied to your own database designs.
This chapter will provide you with the knowledge and skills required to support you in your own database analysis and design.
The chapter is structured around the following learning outcomes:
- Know modelling methodologies and techniques.
- Understand the tools and documentation required in a logical data modelling methodology.
- Be able to create a logical data model.
- Be able to test a logical data model.
Know Modelling Methodologies and Techniques
When designing a system you would normally follow some sort of framework or method that will provide a structured walk-through for each of the steps and stages involved. When designing a database there are various approaches and techniques that can be applied to ensure that the design meets the needs of the end-user, that it functions, and that it is dynamic and robust.
Database Types
Databases have evolved as users and developers have learned to understand the semantics of data sets and to communicate that understanding clearly and logically. To facilitate this, a specific data model (or models) can be used as a framework for examining and understanding the entities and attributes of data sets and the relationships between them.
Data models can be broken down into three categories:
- object-based models: entity relationship, semantic, object-orientated and functional
- record-based models: hierarchical, network and relational
- physical data models.
Flat File
A flat file is a database in which all of the data is stored in a single table. A flat file holds records, but there are no structured relationships between those records and no links to other tables.
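As a minimal sketch of the idea, assume a hypothetical order list held as one comma-separated table (the column names and data below are invented for illustration). Because nothing links the table to any other table, a customer's details have to be written out again on every row that mentions that customer.

```python
import csv
import io

# Hypothetical flat file: one table, no links to any other table.
FLAT_FILE = """order_id,customer_name,customer_city,product
1,Ann Lee,Leeds,Keyboard
2,Ann Lee,Leeds,Mouse
3,Raj Patel,Bristol,Monitor
"""

# Read every row of the single table into a list of dictionaries.
rows = list(csv.DictReader(io.StringIO(FLAT_FILE)))

# With no separate customer table, Ann Lee's details appear once per order.
ann_rows = [row for row in rows if row["customer_name"] == "Ann Lee"]

print(len(rows))      # 3
print(len(ann_rows))  # 2
```

If Ann Lee moved to another city, every one of her rows would need to be updated, which is one reason later models separate data into related tables.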
Hierarchical Model
The hierarchical data model is so called because of the way in which the data is arranged. The model is based on a tree structure with a single table as the root and the remaining tables forming the branches, as shown in Figure 1.1.
Figure 1.1 Hierarchical data model.
The relationships within this structure are described as parents and children, where a parent can have multiple children but a child can have only one parent. Parents and children are linked together through the use of ‘pointers’: each parent holds a list of pointers, one to each of its children.
The child–parent rule ensures that data is systematically accessible. In terms of navigation, to access a low-level table you would start at the root and work down the tree until you reached the target.
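The parent and child rules described above can be sketched in a few lines. This is an illustration only, not how any particular hierarchical database is implemented: the `Node` class, the `find_path` function and the table names are all hypothetical, and a plain Python list stands in for the list of pointers.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One table in the hierarchy; `children` stands in for the pointer list."""
    name: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None

    def add_child(self, child: "Node") -> "Node":
        # Enforce the rule that a child can have only one parent.
        if child.parent is not None:
            raise ValueError(f"{child.name} already has a parent")
        child.parent = self
        self.children.append(child)
        return child


def find_path(node: Node, target: str) -> Optional[List[str]]:
    """Start at `node` and work down the tree until `target` is reached."""
    if node.name == target:
        return [node.name]
    for child in node.children:
        path = find_path(child, target)
        if path is not None:
            return [node.name] + path
    return None


# Hypothetical tree: Company is the root table.
root = Node("Company")
sales = root.add_child(Node("Sales"))
root.add_child(Node("Finance"))
sales.add_child(Node("Orders"))

print(find_path(root, "Orders"))  # ['Company', 'Sales', 'Orders']
```

Every lookup begins at the root, so reaching `Orders` means passing through `Company` and `Sales` first.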
There are a number of problems with the hierarchical structure, including:
- The user must have a good knowledge about how the tree is structured in order to find anything.
- A record cannot be added to a child table until the corresponding record exists in the parent table.
- There will be repetition of data within the database.
- Data redundancy occurs, owing to the fact that a hierarchical database can cope with 1:M relationships but not M:N relationships because a child can have only one parent.
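The redundancy problem can be illustrated with a small sketch, assuming a hypothetical hierarchy in which each course is a parent and its students are the children. A student who takes two courses cannot have two parents, so the student's details must be stored once under each course.

```python
from collections import Counter

# Hypothetical hierarchy: each course (parent) owns its own list of students (children).
hierarchy = {
    "Databases": [
        {"student_id": 1, "name": "Ann Lee"},
        {"student_id": 2, "name": "Raj Patel"},
    ],
    "Networking": [
        {"student_id": 1, "name": "Ann Lee"},  # second copy of the same student
    ],
}

# Count how many times each student is stored across the whole tree.
copies = Counter(
    student["student_id"]
    for students in hierarchy.values()
    for student in students
)

print(copies[1])  # 2
print(copies[2])  # 1
```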
As a result of these problems a different data model was designed to overcome some of the defects attributed to the hierarchical structure.
The Network Database Model
The network model originates from the Conference on Data Systems Languages (CODASYL) and was designed to solve some of the more serious problems attributed to the hierarchical database model.
There are similarities between the two models; however, instead of using a single-parent tree hierarchy, the network model uses set theory to provide a tree-like structure. Child tables can have more than one parent, thus supporting many-to-many relationships.
The design of the network database looks like several trees that share branches, so that children can have multiple parents and vice versa, as shown in Figure 1.2.
Figure 1.2 Network model structure.
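The shared-branch idea can be sketched as follows. This is a simplified illustration rather than the CODASYL set construct itself: the `Network` class, its `link` method and the course and student names are hypothetical. Each link is recorded in both directions, so a child can belong to several parents while being stored only once.

```python
from collections import defaultdict


class Network:
    """Hypothetical in-memory stand-in for parent and child links in a network model."""

    def __init__(self):
        self.children = defaultdict(set)  # parent name -> names of its children
        self.parents = defaultdict(set)   # child name -> names of its parents

    def link(self, parent, child):
        # Record the relationship in both directions.
        self.children[parent].add(child)
        self.parents[child].add(parent)


net = Network()
net.link("Databases", "Ann Lee")
net.link("Networking", "Ann Lee")
net.link("Databases", "Raj Patel")

# One child with two parents, and one parent with two children.
print(sorted(net.parents["Ann Lee"]))     # ['Databases', 'Networking']
print(sorted(net.children["Databases"]))  # ['Ann Lee', 'Raj Patel']
```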
Although an improvement on the hierarchical model, the network model still had some intrinsic problems. The major problem was that the model was difficult to implement and maintain, and most implementations were used by computer programmers rather than end-users. A less complex database model was required that could ...