Green Database

An overview

In the early days, the database developers who managed analytical systems had little experience of working in a data warehouse environment. Their skill set centred on optimising performance through database parameters and the associated models. That situation has changed. The focus is now on finding efficient methods that save resources, increase speed and improve throughput. Applied to database systems, these methods are collectively termed the “Green Database”. This concept, and the factors behind it, are what we discuss in the sections that follow.

A green database can be implemented in various ways, ranging from query optimisation and schema optimisation to database virtualisation and cloud computing.

In this section we discuss two of these methods in detail, database virtualisation and query optimisation, together with their potential drawbacks and solutions.

  1. Database Virtualisation: In layman’s terms, virtualisation enables the creation of a virtual version of an actual entity. It is broadly classified into hardware and software virtualisation. When dealing with software, and especially with an application such as a database system, it is appropriate to talk about virtualising the resources allocated to that application (Soror et al., 2007). In this work, the authors propose a virtualisation approach for database systems that aims to curtail the total cost of ownership. Rather than having a single database system on a single processor, they argue that resources are better utilised by running multiple database systems on the same machine, each in its own virtual machine, a widely known concept called server consolidation. It is also effective to exploit the ability of virtual machines to save images of the systems they hold: if a database system image can be saved and then used on another platform, this reduces costs and simplifies software distribution and deployment. The authors note that such ready-to-deploy virtual machine images are known as “software appliances” (Soror et al., 2007) and suggest that a similar appliance for database systems would be effective. It is interesting to note that while these authors describe database virtual machines as software appliances, another group of researchers (Minhas et al., 2008) use the term “virtual appliance”. In their experimental study, they report that the total average overhead of deploying and running a database system on a virtual machine “is less than 10%”.

     Drawbacks of database virtualisation techniques: One potential impediment is the allocation of physical system resources to a virtualised database environment. When a database system is virtualised, we need not only to allocate physical resources such as CPU and I/O bandwidth to the virtual machine, but also to set database parameters such as memory and buffer pool sizes. Database systems do, however, have their own built-in query optimisers, which the proposed solution builds on.

     Solution: The authors propose a Combinatorial Cost Modelling solution in which they first make the database system’s query optimiser aware of the virtual machine’s resource allocation, and then use the optimiser to estimate the cost of executing the workload at multiple levels of resource allocation in the virtual machines.
  2. Query Optimisation: Queries are how users interact with a database and retrieve results. Much work has been done on database query optimisation and several techniques have been proposed. Experiments have shown that traditional queries involving joins, partial joins, indexes and skew tend to impose a high overhead on the processor when executed, as discussed by various researchers (Bini et al., 2009; Horng et al., 1994; Sellis & Shapiro, 1991; Li et al., 2010), and various algorithms have been designed to address these issues. Traditionally, query optimisers have worked on a single query at a time, as single queries are the most common execution tasks in a traditional database system such as a relational database system. Single-query optimisation strives to curtail the time required to compute the output of a given query. Here we briefly discuss the Genetic Algorithm (GA) (Bayir et al., 2007).

     As the name suggests, the GA derives its conception from biology. The main data structure is a vector (called a chromosome) that represents a candidate solution to a problem. The elements of the chromosome (called genes) each contain a part of that solution. The quality of a solution is measured by a fitness function, which reflects its proximity to the optimal solution.

     The GA searches iteratively for an optimum solution by applying evolutionary operations (also termed genetic operators). In the beginning, a pool of random chromosomes is generated, representing a set of different solutions. Genetic operators are then applied to the pool, generating a set of new, refined chromosomes from the existing ones for the next iteration. The genetic operators are:

     a) Crossover operation: Sections of the parent chromosomes are exchanged with each other to create a new chromosome.

     b) Mutation operation: A random modification of existing genes within a chromosome creates a mutated chromosome.

     How is a fit chromosome selected? Selection techniques are applied for this, and one such technique is the “Roulette Wheel” (Bayir et al., 2007). As the name suggests, each chromosome is assigned a section of a roulette wheel, with the area of the section corresponding to that chromosome’s fitness. Random numbers are then generated to pick points on the wheel, and the chromosomes whose sections contain those points are selected for the next generation.
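
The genetic algorithm steps described in item 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation from Bayir et al. (2007): the task names, costs and candidate plans are invented toy data, and every function name (total_cost, roulette_select, crossover, mutate, genetic_search) is hypothetical. Each gene picks one candidate plan for one query, and a task shared by several chosen plans is counted once, so a chromosome is fitter when its plans share more work.

```python
import random

# Hypothetical toy data (not from any cited paper): cost of each task.
TASK_COST = {"scan_a": 4, "scan_b": 5, "scan_c": 3,
             "join_ab": 6, "join_bc": 7, "sort_a": 2}

# PLANS[q] lists the alternative plans for query q; a plan is a set of tasks.
PLANS = [
    [{"scan_a", "scan_b", "join_ab"}, {"scan_a", "sort_a"}],              # query 0
    [{"scan_b", "scan_c", "join_bc"}, {"scan_a", "scan_b", "join_ab"}],   # query 1
    [{"scan_c"}, {"scan_a", "sort_a"}],                                   # query 2
]


def total_cost(chromosome):
    """Cost of the chosen plans; a task shared between plans is paid once."""
    tasks = set()
    for query, gene in enumerate(chromosome):
        tasks |= PLANS[query][gene]
    return sum(TASK_COST[t] for t in tasks)


def fitness(chromosome):
    """Higher is better: cheaper plan combinations get a larger wheel section."""
    return 1.0 / total_cost(chromosome)


def roulette_select(population, rng):
    """Roulette wheel selection: pick one chromosome with probability
    proportional to its fitness."""
    weights = [fitness(c) for c in population]
    return rng.choices(population, weights=weights, k=1)[0]


def crossover(parent1, parent2, rng):
    """Single-point crossover: head of one parent, tail of the other."""
    point = rng.randint(1, len(parent1) - 1)
    return parent1[:point] + parent2[point:]


def mutate(chromosome, rng, rate=0.1):
    """With probability `rate`, replace a gene by a random plan index."""
    return [rng.randrange(len(PLANS[q])) if rng.random() < rate else gene
            for q, gene in enumerate(chromosome)]


def genetic_search(pop_size=20, generations=30, seed=0):
    """Evolve a pool of random chromosomes and return (best, cost of best)."""
    rng = random.Random(seed)
    population = [[rng.randrange(len(plans)) for plans in PLANS]
                  for _ in range(pop_size)]
    best = min(population, key=total_cost)
    for _ in range(generations):
        population = [
            mutate(crossover(roulette_select(population, rng),
                             roulette_select(population, rng), rng), rng)
            for _ in range(pop_size)
        ]
        candidate = min(population, key=total_cost)
        if total_cost(candidate) < total_cost(best):
            best = candidate
    return best, total_cost(best)
```

Calling genetic_search() returns the best chromosome found and its cost. For this toy data, checking all eight possible chromosomes by hand gives a minimum cost of 17, reached when the chosen plans share the scan_a, scan_b and join_ab tasks. A real optimiser would take its costs from the database's own cost model rather than from a fixed table.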

Conclusion

So, as we can see, merely managing a database is a thing of the past. It is now pertinent not only to manage it but also to ensure that the database consumes far fewer resources, without compromising on database integrity.

References

Horng, Jorng-Tzong; Kao, Cheng-Yan; Liu, Baw-Jhiune (1994). A genetic algorithm for database query optimization. In: Evolutionary Computation, 1994. IEEE World Congress on Computational Intelligence, Proceedings of the First IEEE Conference on. 27-29 June 1994. Orlando, FL, USA. p. 350–355, vol. 1.

Bini, T.A.; Lange, A.; Sunye, M.S.; Silva, F. (2009). Stableness in large join query optimization. In: Computer and Information Sciences, 2009. ISCIS 2009. 24th International Symposium on. 14-16 Sept. 2009. Guzelyurt. p. 639–644.

Sellis, T.K.; Shapiro, L. (1991). Query optimization for nontraditional database applications. Software Engineering, IEEE Transactions on. Volume: 17, Issue: 1. p. 77–86.

Soror, A.A.; Aboulnaga, A.; Salem, K. (2007). Database Virtualization: A New Frontier for Database Tuning and Physical Design. In: Data Engineering Workshop, 2007 IEEE 23rd International Conference on. 17-20 April 2007. Istanbul. p. 388–394.

Minhas, U.F.; Yadav, J.; Aboulnaga, A.; Salem, K. (2008). Database systems on virtual machines: How much do you lose? In: Data Engineering Workshop, 2008. ICDEW 2008. IEEE 24th International Conference on. 7-12 April 2008. Cancun. p. 35–41.

Bayir, Murat Ali; Toroslu, Ismail H.; Cosar, Ahmet (2007). Genetic Algorithm for the Multiple-Query Optimization Problem. Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on. Volume: 37, Issue: 1. p. 147–153.
