The aim of this literature review is to develop the range of study skills necessary to successfully participate and succeed in the foundation degree. It is supported by the need to become a reflective practitioner and to be able to apply learning from different contexts.
In view of the study skills discussed above, it will be applied to the Database Management System (DBMS 1200) unit criteria by describing the fundamental concepts, characteristics and structures of the relational model and other database models. Sources, such as the textbooks, eBooks, Internet, computer magazines and journals will be researched. It will start with a history and evolution of database systems, descriptions of some basic concepts and structures of database models. In the second section of this essay will be practically looked at the Database Management System. How does the Database Management System provide methods to organize, store, retrieve and manipulate with the data in the database? Comparing and providing detailed and structured treatment of conceptual data model diagram with the logical data model diagram. Also, will be shown the main differences between them. The third section of this essay will introduce to the query language SQL by showing an example of creating and implementing fully functional college’s database, and components of SQL, such as Data Definition Language (DDL) and Data Manipulation Language (DML). Demonstrate knowledge of SQL by designing, manipulating, implementing and executing complex queries. In the concluding section will be discussed of advantages and disadvantages of the relational DBMS. Also, in this section will be analysed the result of findings.
History and evolution of Database Systems
Why did the Database System develop?
Human beings began to place in an archive the data very long time ago. Recent research on the history of Database Systems (Quick Base, 2017) and (Ward, 2008) have shown the evolution of database. Before the 1950s, the data were stored as paper records. A lot of manpower was involved, and a lot of time was wasted, for example when searching specific data. It was inefficient, so the revolution had begun in the early 1960s. The data processing used the magnetic tapes for storage and late 1960s and 1970s used the magnetic hard disk allowing direct access to data. The data was stored in a computer file and the files were processed by computer programs (application software), known as File Processing System. Filed based system was a system of small applications. Different applications could be contained similar data and the data could be kept in different files. The File based Systems had many drawbacks, such as separation and isolation of data, the same data were held by different locations. If one item in a file needed updating then it needed to be updated I all relevant files: otherwise, the inconsistencies were developed. As a result of inconsistencies, data redundancy, separation, and isolation of data, weak security in File Processing System was required a better way to handle the data. This requirement had led to the first-generation DBMS: the hierarchical and network models (Hoffer at el, 2011).
A database model (Lucidchart, 2017) shows the logical structure of a database, relationships and constraints. Some of the most common data models are Hierarchical database model and Network Model, Relational model, Object-oriented database model, and the Object-relational model.
The introduction of shared files solved some of the problems because the different applications shared some of the same files. It was an early system where the user viewed the data as a Hierarchical Trees (Figure 1).
Figure 1: Unknown (2013). Hierarchical Tree Diagram.
This model had organized data in a structure of a tree, where a record has had only a single root, such as one parent can have many children, but one child can have only one parent. It has been a good model in some points to describe the real-world relationships, groups of data were related to each other and related data could be viewed together, reducing redundancy. The critical point of data as a Hierarchical Tree from Tsichrtzisand and Lochovsky (1976, p.123) have been stated:
· Of the strict hierarchical ordering, operations such as insertion and deletion become quite complex.
· A delete operation can lead to the loss of information present in the descendants if null records are not permitted.
· Consequently, users have to be careful when performing a delete operation.
· It is sometimes not possible to answer symmetrical queries easily in a hierarchical system.
However, the IBM (International Business Machines) company, the largest computer company in the world, originated in 1911, had used the Hierarchical Model in the 1960 and 1970.
By the late 1960s, had been developed a different data model. It was Network Model. The Network Model has had many-to-many relationships, where all records got a direct relationship with each other, every record can directly communicate with each other (Figure 2).
Figure 2: Razorbliss (2012). Network Model Diagram.
Network Model had given more data relationship types and more efficient and flexible data access, but in both (hierarchical and network) models were required to write the programs to create, access and change the data in the database. Both models were suffered from the same drawbacks as the File Processing System, such as limited data independence and lengthy for application development (Hoffer et al, 2011). As result of this limitations had developed second-generation DBMS, the relational data model.
Codd (1970), from International Business Corporation (IBM), introduced the Relational Model. Codd was a British computer scientist who worked for IBM at their research centre in San Jose, California. Relational Database Systems replaced the Hierarchical and Network-based systems and became widespread commercial acceptance in the 1980s. The model of Relational Database System is based on the concepts of tables, rows and columns (Figure 3).
Figure 3: Data Integration Glossary (2012). Relational Model.
Also, in the 1970s was developed the Structured Query Language (SQL) to support relational products and in 1980s SQL became the standard query language.
The Relational model (Ward, 2008) is the most common model and the conceptual basis of relational databases. A method of structures data using relations uses terminology taken from mathematics, especially theory and predicate logic. The data must be stored in tables, also known as relations (the mathematical term). The table consists of columns and rows, each column lists an attribute of the entity (Figure 4).
Figure 4: Kai Kivimaki (2007). The concepts of the relational database.
The junction of one column and one row is given a unique value, which called a tuple. All attributes in a relation are called domain, the number of the attributes which contains in it is called the degree of a relationship and the number of tuples which contains in it is called the cardinality of relation. The types of relationships between the tables have included one-to-one (1:1), one-to-many (1: M) or many-to-many (M: M relationship is not accepted in RDBMS to avoid having data redundancy).
One more very important characteristic of the relational model is the usage of keys. Primary key (an attribute or combination of attribute) is one of the most important keys that uniquely identify each tuple in the relation. A foreign key is an attribute or combination of attributes is chosen as a primary key in another relation. The table with a primary key which is referenced this primary key as a foreign key in another table called a primary table or parent table and another table with a foreign key called a related table or child table.
The key with two or more attributes (whether the primary key or foreign key) is known as a composite key or compound key. These concepts are very important for the relational model. Besides, how the data are structured in the relational model, relational model has set of rules to enforce data integrity, known as integrity constraints and how the data will be manipulated, known as relational calculus.
Object-oriented database model, a third generation, was introduced the late 1980s (Hoffer et al, 2011). The object-oriented database model has stored the objects (for example, patient in hospital database), where data and their relationships in a single structure, instead of tables and rows, or reusable software elements, or media, such as images, which impossible to store in the relational database. This model consists of an attribute, piece of data, and method, software program, for what to do with the data. The Object-oriented database management system (OODBMS) is based on this model and worked with Java and C++ programming languages. The main advantages of this model over relational are at a speed to store the data, and no any primary and foreign keys.
Today companies are used Object-oriented model for situations, where need high performance with complex data and Relational model for the traditional functions.
Object-relational model is a hybrid model, similar to a relational, but with object-oriented model and implements both the concepts in one single model. This model was developed to manage both types of data.
Concepts and characteristics of DBMS
Bergholt at el (1998, p.7) that:
A database is a large, persistent, integrated collection of dynamic data that provides some operations to describe, establish, manipulate, and access this data. A database management system (DBMS) supports a database by providing languages and services that make it possible for database administrators and users to build, maintain, and efficiently query and update the database. A database application stores its data in a database and uses the services of a database management system to retrieve and update the data and to protect and maintain its integrity.
Originally, Database Management Systems software was very large and expensive, running on very large computers. Nowadays, as a common tool for the even small machine.
Database design has started from a high-level view to the implementation level, where the details have increased for each level. The ANSI/SPARC (American National Standards Institute /Standards Planning and Requirements Committee) in 1975 had introduced the ANSI/SPARC three-layer architecture, as an abstract design standard between the users and the storage device in DBMS (Figure 5).
Figure 5: Shannon (2007). The ANSI-SPARC Three-level architecture.
Almost all the commercial databases are based on this architecture, however, it never became a standard.
Definitions of conceptual and logical data design
Data modelling is a representation of the data which collected and stored in a database, according to the business requirements. Data modelling is the first step process to design database, in which has shown how the data is connected to each other.
A conceptual data model is a high-level overview, when has only the relationships and entities, when are no attributes and no any primary keys, shown only the entities that have described the data and the relationship between them (Figure 6).
Figure 6: Gupta (2012). Conceptual Model Design.
A logical data model is the next high-level overview, but in this level, have shown all entities, all attributes for each entity and specified the primary key for each entity, also a foreign key, identifying the relationship between entities (Figure 7).
Figure 7: Gupta (2012). Logical Model Design.
The complexity has increased from conceptual model to logical model, shown from the first step in a conceptual model where all entities have related to each other, then in the logical model, where shown the details of all entities. The conceptual model and logical model sometimes are reviewed as a single move. Differences between the models and increasing complexity have shown in Figure 8.
Figure 8: Gupta (2012). Compare the different features of data models.
Conceptual and logical data modelling have collected information about the business needs, represented these needs and requirements on ERD (Entity relationship diagram), reviewed by clients, management, business company to see and discuss if more information needs before moving to physical modelling.
Introduction to SQL, as the standard query language.
Example of designing and implementing fully functional college’s database to maintain details of its lecturers’ subject area skills, according to the client’s requirement from college’s scenario.
From the college’s scenario has been identified the entities and the relationships between them, and created boxes for each entity with lines between them, which represent their relationships. From a college’s scenario has been identified four entities: department, lecturer, subject and timetable. (Figure 9).
Figure 9: Conceptual data model.
This is a high-level overview or conceptual data model of the database, according to a given scenario. In the next high-level overview or logical data model have been identified the entities that have direct relationship by finding the common attributes in different entities, identified the cardinality for each relationship by finding how many instances of each entity is related to the number of instances of another related entity and defined primary (PK), foreign (FK) and compound/composite keys(CK) in Figure 10. At this stage has been identified all entities with their attributes:
ü Department with attributes: Depart_ID (PK) and Depart_name.
ü Lecturer with attributes: Lecturer_ID (PK), Lecturer_name, Lecturer_grade, Hire_date, Salary, Depart_ID (FK).
ü Subject with attributes: Subject_code (PK), Subject_name and Level.
ü Timetable with attributes: Lecturer_ID (CK), Subject_code (CK) and Duration.
Figure 10: Logical data model.
In next step is used the standard query language of DBMS, SQL, which let interact and quickly extract the result from the database. The components of SQL are Data Definition Language (DDL) and Data Manipulation Language (DML). The DDL is used to create tables, views, indexes and control access to the database, using CREATE and ALTER statements, whereas DML is used to manipulate and query the data in the database using INSERT, UPDATE, DELETE and SELECT statements (Ward, 2008).
All the tables have been created using MS Access and inserted all records into those tables using SQL commands.
The queries have been used to create table “Department”:
Create table query “Department”
Insert query into “Department” table
Create table Department ( Depart_ID Text(10), Depart_Name Text (30), Primary Key ( Depart_ID));
Insert into Department (Depart_ID, Depart_Name) Values (‘CS08′,’Computer Science’);
As the result has been gotten the table “Department” in Figure 11.
Figure 11: Table “Department”.
The queries have been used to create table “Lecturer”:
Create table query “Lecturer”
Insert query into “Lecturer” table
Create table Lecturer ( Lecturer_ID Text (20) Primary Key Not Null, Lecturer_Name Text (50), Lecturer_Grade Int, Hire_Date Date, Salary Currency, Depart_ID Text (10), Foreign Key (Depart_ID) References Department (Depart_ID));
Insert into Lecturer (Lecturer_ID, Lecturer_Name, Lecturer_Grade, Hire_Date, Salary, Depart_ID) Values (‘STC3355′,’Barbara Dean’,2,’04-02-2010′,’FC11′);
As the result has been gotten the table “Lecturer” in Figure 12.
Figure 12: Table “Lecturer”.
The queries have been used to create table “Subject”:
Create table query “Subject”
Insert query into “Subject” table
Create table Subject (Subject_Code Text (10) Primary Key, Subject_Name Text (50), Level Int);
Insert into Subject (Subject_Code, Subject_Name, Level) Values (‘Co5′,’ICT System Support L5’, 5);
As the result has been gotten the table “Subject” in Figure 13.
Figure 13: Table “Subject”.
The queries have been used to create table “Timetable”:
Create table query “Timetable”
Insert query into “Timetable”
Create table Timetable (Lecturer_ID Text (10), Subject_Code Text (10), Duration Int, Primary Key (Lecturer_ID, Subject_Code), Foreign Key (Lecturer_ID) References Lecturer (Lecturer_Id), Foreign Key (Subject_Code) References Subject (Subject_Code));
Insert into Timetable (Lecturer_ID, Subject_Code, Duration) Values (‘STC8811′,’Co5’, 2);
As the result has been gotten the table “Timetable” in Figure 14.
Figure 14: Table “Timetable”.
In next step has been written and executed the SQL complex queries using the created college’s database.
Query 1: Show full details of all the lecturers started working between 10/01/2015 and 12/06/2015
Select * From Lecturer Where Hire_Date Between # 01/10/2015 and # 06/12/2015#;
The result has been shown in Figure 15.
Figure15: Executed the SQL query 1.
Query 2: Show all the lecturer numbers and names, subject number, lesson duration, subject name and the education level where the education level is less than four.
Select Lecturer.Lecturer_ID, Lecturer.Lecturer_Name, Subject.Subject_Code, Subject.Subject_Name, Subject.Level, Timetable.Duration From (Timetable Inner Join Lecturer on Lecturer_ID = Timetable.Lecturer_ID) Inner Join Subject on Timetable.Subject_Code = Subject.Subject_Code Where Level