
Tuesday 3 May 2011

cs614 idea solution




Assignment No. 02
SEMESTER SPRING 2011
CS614- Data Warehousing



Please don’t copy the solution; rewrite it in your own words. Thanks.



Total Marks: 20



Question 1: [13 marks]

In this paper [1] hierarchical de-normalization is used to optimize the performance of data warehouse design. Study the paper and prepare a summary of how the technique is used for optimization.



Answer:

De-normalization means allowing redundancy in a table: in a de-normalized database, data and information can be duplicated. A de-normalized schema avoids joins because it places all the data in the same place. The drawback is that a full table scan over a de-normalized table is slower than over a normalized one.
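The trade-off described above can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration, not the schema from the paper: the table names (region, sales, sales_denorm) and the sample values are assumptions made up for the example.

```python
import sqlite3

# In-memory database; all table names and values here are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Normalized design: each region name is stored once, in its own table.
cur.execute("CREATE TABLE region (region_id INTEGER PRIMARY KEY, region_name TEXT)")
cur.execute("CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, region_id INTEGER, amount REAL)")
cur.executemany("INSERT INTO region VALUES (?, ?)", [(1, "North"), (2, "South")])
cur.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, 1, 100.0), (2, 1, 50.0), (3, 2, 70.0)],
)

# De-normalized design: the region name is repeated on every sales row.
cur.execute("CREATE TABLE sales_denorm (sale_id INTEGER PRIMARY KEY, region_name TEXT, amount REAL)")
cur.execute(
    "INSERT INTO sales_denorm "
    "SELECT s.sale_id, r.region_name, s.amount "
    "FROM sales s JOIN region r ON r.region_id = s.region_id"
)

# The normalized query needs a join to reach the region name.
normalized_totals = cur.execute(
    "SELECT r.region_name, SUM(s.amount) "
    "FROM sales s JOIN region r ON r.region_id = s.region_id "
    "GROUP BY r.region_name ORDER BY r.region_name"
).fetchall()

# The de-normalized query reads a single table, with no join.
denormalized_totals = cur.execute(
    "SELECT region_name, SUM(amount) FROM sales_denorm "
    "GROUP BY region_name ORDER BY region_name"
).fetchall()

print(normalized_totals)
print(denormalized_totals)
```

Both queries return the same totals; the difference is that "North" is stored twice in sales_denorm, which is the redundancy that de-normalization accepts in exchange for avoiding the join.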



Hierarchical de-normalization is a complex method. The dimensions (categories) are placed in hierarchical levels, i.e. step-by-step categories such as day, week, month and year.



Hierarchical de-normalization uses bitmap indexes on columns, which indicate which values should be taken from those columns. The method joins the categories in an organized order, moving from the highest-level category to the lowest-level category. Each element of a category is called a node. The technique is used for optimization because it gives efficient results and helps to drill down to the lowest level of information. It also reduces query response time.
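To make the idea of a bitmap index concrete, here is a minimal sketch in Python. It is not the implementation from the paper: the helper names (build_bitmap_index, matching_rows) and the sample year and month columns are assumptions for illustration. Each distinct value gets an integer used as a bitmap, where bit i is set when row i holds that value, and drilling down from a higher level to a lower level becomes a bitwise AND.

```python
from collections import defaultdict


def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set when row i has that value."""
    index = defaultdict(int)
    for row_number, value in enumerate(values):
        index[value] |= 1 << row_number
    return dict(index)


def matching_rows(bitmap):
    """Return the row numbers whose bit is set in the bitmap."""
    rows = []
    row_number = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row_number)
        bitmap >>= 1
        row_number += 1
    return rows


# Hypothetical de-normalized table with two hierarchy levels stored as columns.
years = [2010, 2010, 2011, 2011, 2011]
months = ["Dec", "Dec", "Jan", "Jan", "Feb"]

year_index = build_bitmap_index(years)
month_index = build_bitmap_index(months)

# Highest level first: all rows for the year 2011.
rows_2011 = matching_rows(year_index[2011])

# Drill down one level: rows for January 2011, found by AND-ing the two bitmaps.
rows_jan_2011 = matching_rows(year_index[2011] & month_index["Jan"])

print(rows_2011)
print(rows_jan_2011)
```

Real database systems store bitmaps in compressed form, but the principle of combining one bitmap per level with cheap bitwise operations is the same.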

Organizations define their own hierarchical structures, according to their needs, for use in the data warehouse.





Question 2: [07 marks]



Compare the results of one dimensional and multidimensional query, given in this paper [1]. Give reasons, why the response time of multidimensional queries for de-normalized schema is less than that for normalized schema. Justify your answer.

Answer:

A query is a question, written as code, that is sent to a database in order to get information back from it. It is the way information is retrieved from the database.

Dimensions are categories, e.g. Country, Province, Division, Region, etc.

A one-dimensional query asks the database for a single type of record, selecting on one column at a time. A multidimensional query asks for two or more types of records, so more records are required to compute the answer. In a normalized schema, the response time of multidimensional queries is greater because the records are selected from different tables of the database. In a de-normalized schema, the response time of multidimensional queries is lower because the records are selected from the same table, so less time is needed to retrieve and manipulate the values.
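The difference between the two kinds of query can be sketched with Python's sqlite3 module. The tables (dim_region, dim_time, fact_sales, flat_sales) and their values are hypothetical and are not taken from the paper; the sketch only shows how many tables each query has to touch, and it does not measure response time.

```python
import sqlite3

# All table names and sample rows below are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE fact_sales (region_id INTEGER, time_id INTEGER, amount REAL);

INSERT INTO dim_region VALUES (1, 'Pakistan'), (2, 'India');
INSERT INTO dim_time VALUES (1, 2010), (2, 2011);
INSERT INTO fact_sales VALUES (1, 1, 10.0), (1, 2, 20.0), (2, 2, 30.0), (1, 2, 5.0);

-- De-normalized copy: both dimensions are stored on every sales row.
CREATE TABLE flat_sales AS
  SELECT r.country, t.year, f.amount
  FROM fact_sales f
  JOIN dim_region r ON r.region_id = f.region_id
  JOIN dim_time t ON t.time_id = f.time_id;
""")

# One-dimensional query: filters on a single dimension (country).
one_dim_total = conn.execute(
    "SELECT SUM(amount) FROM flat_sales WHERE country = 'Pakistan'"
).fetchone()[0]

# Multidimensional query on the normalized schema: two joins are required.
multi_dim_normalized = conn.execute(
    "SELECT SUM(f.amount) FROM fact_sales f "
    "JOIN dim_region r ON r.region_id = f.region_id "
    "JOIN dim_time t ON t.time_id = f.time_id "
    "WHERE r.country = 'Pakistan' AND t.year = 2011"
).fetchone()[0]

# The same multidimensional query on the de-normalized table: no joins.
multi_dim_flat = conn.execute(
    "SELECT SUM(amount) FROM flat_sales "
    "WHERE country = 'Pakistan' AND year = 2011"
).fetchone()[0]

print(one_dim_total, multi_dim_normalized, multi_dim_flat)
```

Each extra dimension adds another join to the normalized query, while the de-normalized query only gains another condition on the same table.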
