Monday, December 1, 2008

DATAWAREHOUSING

DATAWAREHOUSING

ERP is meant to store and facilitate movement of data seamlessly between various departments,
whereas BI is meant for effective and efficient retrieval or data for analytical purposes.

Examples of ERP software’s include SAP, Oracle 11i, JD Edwards, PeopleSoft .
Examples of BI software’s include Business Objects, Cognos, Hyperion, ProClarity,
Microsoft SSAS & SSRS and SAP BIW.

-----------------------------------------------------------------------------------
Dimensional Modeling

The traditional data model called ER model is how most data is stored in RDBMS or typical databases;
however this model does not support fast retrieval of information. ER model, i.e. Entity Relationship
model is not optimized for query purposes. It is meant to store the data. Hence it is converted into
Dimensional model which is optimized for query.

A Dimensional model essentially comprises of schemas. Schemas are the representation of objects and data.
There are various kinds of schemas in Dimensional model, e.g. Star schema, Snowflake schema, Galaxy schema,
Hybrid schema and others. Star and snowflake are the two types which are broadly used.


Star Schema –
Any source system is highly normalized. We de-normalize source database with the help of star schema.
Star schema is the de-normalized version which is presented in the form of a star. The centre of the
star schema is the fact table which comprises the measures of the business. The fact table is connected to
various dimensions, thus giving it the shape of a star.

Snowflake schema -
Snowflake schema can be called as an extension of the Star schema. It is in normalized form. For example,
if we have a dimension called Product Dimension associated with fact table, this Product Dimension will be
further sub-divided into multiple tables to form a Snowflake schema.

The difference between Star and Snowflake schema is that, in Star schema, dimensions are less and hence
complexity of business also reduces. It does not involve granular level analysis. On the other hand, there
is large number of sub dimensions with millions of records in these tables in snowflake schema. Dimensions
change rapidly in this type of schema and it requires granular level analysis. Since dimensions are split
into sub tables, it subsequently leads to increase in the number of joins in Snowflake schema. The
processing time or query time also reduces as the consequence of having a lot of joins.


The core advantage of Snowflake schema is that, we can choose the sub dimension which is required for
analysis, without taking the entire parent dimensions at one shot. This makes the whole process simpler
and easier for the business users.
In star schema, we do not enjoy the same privilege.

In most cases, the combination of these two schema's is prevalent in the
implementation of data warehouses. We can come to the conclusion that, the purpose of creating Dimensional
models is to convert data into a query friendly environment where retrieval is much faster as compared to that in a conventional ER model.

Data Warehousing are of two types one is OLTP (Online Transaction Processing) and OLAP (Online Analytic Processing) .

OLTP is used mainly in Banking Doamin and Share Trading domain .
OLAP is used to getting Reports for example Sales Report of a particular month

1 comment:

  1. Man, Your blog is turning out to be a really good one.

    Nice techie articles.

    ReplyDelete