The design of a distributed database system is a complex task. Fragmentation is one of the core design techniques in a distributed database system.
Fragmentation is the task of dividing a table into a set of smaller tables. The subsets of the table are called fragments. Fragmentation can be of three types: horizontal, vertical, and hybrid (combination of horizontal and vertical). Horizontal fragmentation can further be classified into two techniques: primary horizontal fragmentation and derived horizontal fragmentation.
Fragmentation should be done in such a way that the original table can be reconstructed from the fragments whenever required. This requirement is called “reconstructiveness.”
Advantages of Fragmentation:
• Since data is stored close to the site of usage, efficiency of the database system is increased.
• Local query optimization techniques are sufficient for most queries since data is locally available.
• Since irrelevant data is not available at the sites, security and privacy of the database system can be maintained.
Disadvantages of Fragmentation:
• When data from different fragments is required, access may be very slow, since the fragments have to be fetched from several sites and combined.
• In case of recursive fragmentation, reconstruction requires expensive techniques.
• Lack of back-up copies of data in different sites may render the database ineffective in case of failure of a site.
Types of Fragmentation: There are three types of fragmentation in a distributed database system:
1. Vertical fragmentation.
2. Horizontal fragmentation.
3. Hybrid fragmentation.
Vertical Fragmentation:
In vertical fragmentation, the fields or columns of a table are grouped into fragments. In order to maintain reconstructiveness, each fragment should contain the primary key field(s) of the table. Vertical fragmentation can be used to enforce privacy of data.
For example, let us consider that a University database keeps records of all registered students in a Student table having the following schema.
STUDENT
Regd_No Name Course Address Semester Fees Marks
Now, the fee details are maintained in the accounts section. In this case, the designer will fragment the table as follows −
CREATE TABLE STD_FEES AS
SELECT Regd_No, Fees
FROM STUDENT;
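The reconstructiveness rule for vertical fragments can be checked with a small sketch. The following uses Python's built-in sqlite3 module purely for illustration; the column types, the sample rows, and the complementary fragment name STD_INFO are assumptions that do not appear in the schema above. Because both fragments carry the primary key Regd_No, joining them on that key should give back the original table.

```python
import sqlite3

# In-memory database; column types and sample rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT (
        Regd_No  INTEGER PRIMARY KEY,
        Name     TEXT,
        Course   TEXT,
        Address  TEXT,
        Semester INTEGER,
        Fees     REAL,
        Marks    REAL
    );
    INSERT INTO STUDENT VALUES
        (1, 'Asha', 'Computer Science', 'Pune',  3, 1200.0, 81.5),
        (2, 'Ravi', 'Physics',          'Delhi', 1,  900.0, 74.0);

    -- Fragment for the accounts section: primary key plus the fee column.
    CREATE TABLE STD_FEES AS
        SELECT Regd_No, Fees FROM STUDENT;

    -- Complementary fragment (hypothetical name): primary key plus the rest.
    CREATE TABLE STD_INFO AS
        SELECT Regd_No, Name, Course, Address, Semester, Marks FROM STUDENT;
""")

# Reconstruction: join the vertical fragments on the shared primary key.
reconstructed = conn.execute("""
    SELECT i.Regd_No, i.Name, i.Course, i.Address, i.Semester, f.Fees, i.Marks
    FROM STD_INFO AS i
    JOIN STD_FEES AS f ON f.Regd_No = i.Regd_No
    ORDER BY 1
""").fetchall()

original = conn.execute("SELECT * FROM STUDENT ORDER BY Regd_No").fetchall()
```

If a fragment omitted Regd_No, there would be no reliable column to join on, which is why each vertical fragment keeps the key.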
Horizontal Fragmentation:
Horizontal fragmentation groups the tuples of a table according to the values of one or more fields. Horizontal fragmentation should also conform to the rule of reconstructiveness. Each horizontal fragment must have all columns of the original base table.
For example, in the student schema, if the details of all students of the Computer Science course need to be maintained at the School of Computer Science, then the designer will horizontally fragment the table as follows −
CREATE TABLE COMP_STD AS
SELECT * FROM STUDENT
WHERE Course = 'Computer Science';
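For horizontal fragments, reconstruction is a union of the fragments rather than a join. The sketch below, again using Python's sqlite3 module, assumes a second fragment named OTHER_STD (a hypothetical name) that holds every row not selected into COMP_STD; the column types and sample rows are likewise invented for illustration. The fragment predicates are chosen so that together they cover every row and no row lands in both fragments.

```python
import sqlite3

# In-memory database; column types and sample rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT (
        Regd_No  INTEGER PRIMARY KEY,
        Name     TEXT,
        Course   TEXT,
        Address  TEXT,
        Semester INTEGER,
        Fees     REAL,
        Marks    REAL
    );
    INSERT INTO STUDENT VALUES
        (1, 'Asha', 'Computer Science', 'Pune',   3, 1200.0, 81.5),
        (2, 'Ravi', 'Physics',          'Delhi',  1,  900.0, 74.0),
        (3, 'Mina', 'Computer Science', 'Mumbai', 5, 1200.0, 88.0);

    -- Fragment kept at the School of Computer Science.
    CREATE TABLE COMP_STD AS
        SELECT * FROM STUDENT
        WHERE Course = 'Computer Science';

    -- Complementary fragment (hypothetical name). The IS NULL branch keeps
    -- rows with an unknown course, which '<>' alone would silently drop.
    CREATE TABLE OTHER_STD AS
        SELECT * FROM STUDENT
        WHERE Course <> 'Computer Science' OR Course IS NULL;
""")

# Reconstruction: union of the horizontal fragments.
rebuilt = conn.execute("""
    SELECT * FROM COMP_STD
    UNION ALL
    SELECT * FROM OTHER_STD
    ORDER BY 1
""").fetchall()

original = conn.execute("SELECT * FROM STUDENT ORDER BY Regd_No").fetchall()

# Disjointness check: no student should appear in both fragments.
overlap = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT Regd_No FROM COMP_STD
        INTERSECT
        SELECT Regd_No FROM OTHER_STD
    )
""").fetchone()[0]
```

UNION ALL is enough here because the fragments are disjoint; with overlapping fragments a plain UNION would be needed to remove duplicates.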
Hybrid Fragmentation:
In hybrid fragmentation, a combination of horizontal and vertical fragmentation techniques is used. This is the most flexible fragmentation technique since it generates fragments with minimal extraneous information. However, reconstruction of the original table is often an expensive task.
Hybrid fragmentation can be done in two alternative ways −
• At first, generate a set of horizontal fragments; then generate vertical fragments from one or more of the horizontal fragments.
• At first, generate a set of vertical fragments; then generate horizontal fragments from one or more of the vertical fragments.
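The first alternative (horizontal first, then vertical) can be sketched as follows, using Python's sqlite3 module. The fragment names OTHER_STD, COMP_STD_FEES and COMP_STD_INFO, the column types, and the sample rows are all assumptions made for illustration. Reconstruction has to undo the steps in reverse order: first join the vertical pieces, then union the result with the remaining horizontal fragment, which hints at why hybrid reconstruction tends to be more expensive.

```python
import sqlite3

# In-memory database; column types and sample rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STUDENT (
        Regd_No  INTEGER PRIMARY KEY,
        Name     TEXT,
        Course   TEXT,
        Address  TEXT,
        Semester INTEGER,
        Fees     REAL,
        Marks    REAL
    );
    INSERT INTO STUDENT VALUES
        (1, 'Asha', 'Computer Science', 'Pune',   3, 1200.0, 81.5),
        (2, 'Ravi', 'Physics',          'Delhi',  1,  900.0, 74.0),
        (3, 'Mina', 'Computer Science', 'Mumbai', 5, 1200.0, 88.0);

    -- Step 1: horizontal fragments by course.
    CREATE TABLE COMP_STD AS
        SELECT * FROM STUDENT WHERE Course = 'Computer Science';
    CREATE TABLE OTHER_STD AS
        SELECT * FROM STUDENT
        WHERE Course <> 'Computer Science' OR Course IS NULL;

    -- Step 2: vertical fragments of one horizontal fragment,
    -- each keeping the primary key Regd_No.
    CREATE TABLE COMP_STD_FEES AS
        SELECT Regd_No, Fees FROM COMP_STD;
    CREATE TABLE COMP_STD_INFO AS
        SELECT Regd_No, Name, Course, Address, Semester, Marks FROM COMP_STD;

    -- The intermediate fragment is no longer stored anywhere.
    DROP TABLE COMP_STD;
""")

# Reconstruction: join the vertical pieces, then union with the other
# horizontal fragment. Columns are listed explicitly so both branches
# of the union use the original column order.
rebuilt = conn.execute("""
    SELECT i.Regd_No, i.Name, i.Course, i.Address, i.Semester, f.Fees, i.Marks
    FROM COMP_STD_INFO AS i
    JOIN COMP_STD_FEES AS f ON f.Regd_No = i.Regd_No
    UNION ALL
    SELECT Regd_No, Name, Course, Address, Semester, Fees, Marks
    FROM OTHER_STD
    ORDER BY 1
""").fetchall()

original = conn.execute("SELECT * FROM STUDENT ORDER BY Regd_No").fetchall()
```

The second alternative would mirror this: create the vertical fragments first, split one of them by a row predicate, and reconstruct with a union followed by a join.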