Introduction to the Database Management System
Database Management System (DBMS) is software which is used to create and manage databases. DBMS provides the user and programmers to create, retrieve, update and manage databases.
Database is a collection of organized information so that it can easily be managed and updated.
The Database & DMBSsoftware together they form Database System.
Data V/S Information
Data are simply facts or some bits of information but not information itself. When they are processed, organized, structured and presented to make them meaningful or useful then they are called Information.
What is File Processing System?
File Processing System was used to store data before DBMS was invented. In this system, all data are stored in a permanent memory i.e Secondary Memory. Many programs and software are written for extracting data from these files and to add record to these files.
- Data Redundancy and Inconsistency
- Data isolation & Data integrity
- Difficult in accessing data
- Security problems
- Concurrent access is not possible
Advantages of DBMS over File Processing System
- Redundancy is controlled
- Unauthorized access is restricted
- Providing multiple user interface
- Provide backup and recovery
- Enforcing integrity constrains
What is Unstructured & Structured Data?
Most IT workers are used to structured data. It is written in a format that’s easy for machines to understand, though it baffles most people unless they’re programmers. Structured data is easily searchable by basic algorithms. Examples include spreadsheets and data from machine sensors.
Unstructured data is more like human languages. It doesn’t fit nicely into relational databases like SQL, and searching it based on the old algorithms ranges from difficult to completely impossible. Examples include emails, text documents (Word docs, PDFs, etc.), social media posts, videos, audio files, and images.
Three Levels of Data Abstraction & Data Independence
Data Abstraction is simply used to hide irrelevant details from the user.
Three levels of Data Abstraction are:
- Physical Level (Lower Level): It tells how the data are actually stored in a memory.
Suppose we need to store details of an employee. Blocks of storage and the amount of memory used for these purpose is hidden from the user.
- Logical Level (Next Higher Level): In this level, all the information is stored in a database in tabular form. It also stored the relationship between different data entities.
- View Level (Highest Level): Only a part of actual database is viewed by the users. This level is used to ease the accessibility of the database by an individual user. User views data in the form of Row and Column. Multiple view of the same database may exist.
Data Independence means if we manipulate one level then other levels should not be affected.
These are the two types of level:
- Physical Independence: The changes occur in Physical layer of database and due to that logical layer should not be affected.
- Logical Independence: The changes occur in Logical layer of database and due to that view layer should not be affected.
What is a View?
View is basically a virtual table that does not really exist on its own but it is derived from one or more underlying base tables. A view can contain all or selected rows of a table. It is stored in a database.
- A view can represent a subset of a data from the table.
- Views can join & simplify multiple tables into a single virtual table
- Views can be used as aggregated tables
- Views can hide complexity of data
- Views takes very little space to store
- Views can provide extra security depending on SQL engine used.
Types of Views:
- Simple View
- Complex View
|Simple view||Complex View|
|Created from one table||Created from one or more tables|
|Doesn’t contain functions||Contain functions|
|Doesn’t contain group of data||Contain group of data|
Some important terms to remember:
- Checkpoint: Checkpoint is a mechanism where all the previous logs are removed from the system and stored permanently in a storage disk. Checkpoint declares a point before which the DBMS was in consistent state, and all the transactions were committed.
- Schema: The database schema of a database system is its structure described in a formal language supported by the database management system (DBMS). The term “schema” refers to the organization of data as a blueprint of how the database is constructed (divided into database tables in the case of relational databases).
- Superkey: A superkey is a combination of columns that uniquely identifies any row within a relational database management system (RDBMS) table. A candidate keyis a closely related concept where the superkey is reduced to the minimum number of columns required to uniquely identify each row.
- Primary Key: A primary key is a special relational database table column (or combination of columns) designated to uniquely identify all table records. A primary key’smain features are: It must contain a unique value for each row of data. It cannot contain null values.
- Unique Constraints: A unique constraint is a type of column restriction within a table, which dictates that all values in that column must be unique though may be null.
To ensure that a column is UNIQUE and cannot contain null values, the column must be specified as NOT NULL. Interestingly, these are a primary key’s two main attributes. Defining both attributes in a newly-created column should be given serious consideration for the primary key designation.
- Atomicity: In database systems, atomicity (or atomicness; from Greek atomos, undividable) is one of the ACID (Atomicity, Consistency, Isolation, Durability) transaction properties. An atomic transaction is an indivisible and irreducible series of database operations such that either all occur, or nothing occurs.
- Aggregations: Data aggregation is any process in which information is gathered and expressed in a summary form, for purposes such as statistical analysis. A common aggregation purpose is to get more information about particular groups based on specific variables such as age, profession, or income.
- Data Independence: It means we change the physical storage/level without affecting the conceptual or external view of the data. The new changes are absorbed by mapping techniques. Logical data independence is the ability to modify the logical schema without causing application program to be rewritten.