Introduction to OLAP
OLAP (or Online Analytical Processing) has been growing in popularity due to the increase in data volumes and the recognition of the business value of analytics. Until the mid-nineties, performing OLAP analysis was an extremely costly process mainly restricted to larger organizations.
The major OLAP vendor are Hyperion, Cognos, Business Objects, MicroStrategy. The cost per seat were in the range of $1500 to $5000 per annum. The setting up of the environment to perform OLAP analysis would also require substantial investments in time and monetary resources.
This has changed as the major database vendor have started to incorporate OLAP modules within their database offering - Microsoft SQL Server 2000 with Analysis Services, Oracle with Express and Darwin, and IBM with DB2.
What is OLAP?
OLAP allows business users to slice and dice data at will. Normally data in an organization is distributed in multiple data sources and are incompatible with each other. A retail example: Point-of-sales data and sales made via call-center or the Web are stored in different location and formats. It would a time consuming process for an executive to obtain OLAP reports such as - What are the most popular products purchased by customers between the ages 15 to 30?
Part of the OLAP implementation process involves extracting data from the various data repositories and making them compatible. Making data compatible involves ensuring that the meaning of the data in one repository matches all other repositories. An example of incompatible data: Customer ages can be stored as birth date for purchases made over the web and stored as age categories (i.e. between 15 and 30) for in store sales.
It is not always necessary to create a data warehouse for OLAP analysis. Data stored by operational systems, such as point-of-sales, are in types of databases called OLTPs. OLTP, Online Transaction Process, databases do not have any difference from a structural perspective from any other databases. The main difference, and only, difference is the way in which data is stored.
Examples of OLTPs can include ERP, CRM, SCM, Point-of-Sale applications, Call Center.
OLTPs are designed for optimal transaction speed. When a consumer makes a purchase online, they expect the transactions to occur instantaneously. With a database design, call data modeling, optimized for transactions the record 'Consumer name, Address, Telephone, Order Number, Order Name, Price, Payment Method' is created quickly on the database and the results can be recalled by managers equally quickly if needed.
Figure 1. Data Model for OLTP
Data are not typically stored for an extended period on OLTPs for storage cost and transaction speed reasons.
OLAPs have a different mandate from OLTPs. OLAPs are designed to give an overview analysis of what happened. Hence the data storage (i.e. data modeling) has to be set up differently. The most common method is called the star design.
Figure 2. Star Data Model for OLAP
The central table in an OLAP start data model is called the fact table. The surrounding tables are called the dimensions. Using the above data model, it is possible to build reports that answer questions such as:
• The supervisor that gave the most discounts.
• The quantity shipped on a particular date, month, year or quarter.
• In which zip code did product A sell the most.
To obtain answers, such as the ones above, from a data model OLAP cubes are created. OLAP cubes are not strictly cuboids - it is the name given to the process of linking data from the different dimensions. The cubes can be developed along business units such as sales or marketing. Or a giant cube can be formed with all the dimensions.
Figure 3. OLAP Cube with Time, Customer and Product Dimensions
OLAP can be a valuable and rewarding business tool. Aside from producing reports, OLAP analysis can aid an organization evaluate balanced scorecard targets.
Figure 4. Steps in the OLAP Creation Process
No comments:
Post a Comment