Creating the correct Cube structure
Posted by Vladimir Lyutetsky on 26 November 2009 02:46 PM
Recently one of our regular customers called our tech support complaining of the Direct-version's slow performance: an OLAP-slice of a thousand rows and a few columns took several minutes to get calculated.
A Few Things about Data Sources
The vast majority of the desktop-OLAP clients currently at the market are able to process just one table as data source, while the real data for OLAP-analysis is spread among several ones (for the details of OLAP-data schemas see here).
A Few Things about Cube Structure Elements
Any OLAP-report consists of two basic parts: the headers area and the data (fact) area. The fact area contains measures, the headers area - dimensions, hierarchies (Parent-Child, multilevel and flat), and attributes. In general, the structure of the RadarCube Direct elements looks like the elements structure conventional for Microsoft Analysis Services, so if you're acquainted with that, just skip this part and go on reading from the next chapter. However, for those, whose mind has not yet been clouded with the professional OLAP-vocabulary, I'm going to brief you:
Flat and Multilevel Hierarchies
Let's assume we have a couple of tables, describing products: Products and Categories - related by a foreign key:
In this case, in the Cube Editor, we'll create two flat (attribute) hierarchies ("Categories" and "Products") and one multilevel ("Categories-Products") that will combine the flat ones. So, as a result we'll get the following:
In this case, the root node of the Cube Structure Tree is the "Products" dimension node. It's also worth mentioning that multilevel hierarchies may (even must) be created, if the fields that form them, are in the same table. For instance, there's a "Geography" table with the "Country" and "City" fields. It's obvious a city like, say, Seattle cannot be in several countries at the same time, like "USA", "UK" and "Canada". Hence, in this case the "Country" and "City" fields are interdependent and should be organized into a multilevel hierarchy "Country-City".
Parent-Child HierarchiesHierarchies of this type are created on the base of self-referencing tables. By the way, they cannot be created in case, when all OLAP-data are presented in the single table. So, for example, we have an "Employees" table with the data of the company workers. The self-referencing key in this case is the company organization structure. To create a Parent-Child hierarchy, you need to fill in the hierarchy properties marked by pink square in the Cube Editor.
As a result, the Parent-Child hierarchy in OLAP Grid will look like this:
In our "Employees" table, in addition to the EmployeeName field and the key fields, describing the hierarchical structure of the table, there're some auxiliary fields, like Address, City, Country, HomePhone etc. that contain additional personal information. In OLAP Grid it looks like this:
The attributes are created in the Cube Editor for any hierarchy from the flat hierarchies set in the same dimension. All that needs to be done here is creating a set of flat hierarchies in the dimension, and then choosing the one, you want to create attributes for, right-clicking it and selecting "Info attributes editor..." from the menu. In OLAP Grid switching on/off hierarchy attributes is available from the context menus of the Grid cells.
Guidelines for Creating a Fast Cube
So, here comes the crux of the matter - all this article was written for the sake of this paragraph, and all the things discussed above were to make it clear ;)
Clue #1: Create and use multilevel hierarchies, wherever possible. Analyze the data schema, locate the interdependent fields and make up multilevel hierarchies on their basis. Remember, RadarCube is able to process a multilevel hierarchy dozens times as fast as a set of flat lists that comprise it.
Now I can explain how I solved the Cube performance problem for our customer: