MDX Scope VS Pre Aggregated measures.

MDX Scope VS Pre Aggregated measures.

So I looked into the use of Scope in MDX and compared it to using pre aggregated measures in the cube.

I am a big fan of handling any type of row level business rules in SQL and any type of set level business rules in MDX calculated measures. Period to Date sales should definitely be coded up in MDX calculated measures whereas calculating discount for products based on what category they belong to should be done at row level in SQL before loading the data into the cube.

Scope seems to be quite good at handling the row level type calculations though, there are pros and cons to both approaches though.

The example I will use is not a good “real life” example, but it’s as good as any for a technical example.

On the “Internet order Quantity” measure in the AdventureWorks 2008 R2 cube database, we will “discount” the quantity based on which category the products belong to.

The Scope approach.

1.)      On the Internet sales measure group, duplicate the “Internet order Quantity” measure and call it “Internet Order Quantity2”

2.)      Add the following code to the cube in the calculations tab:


–-***Accessories 0 % discount

–-***15% Discount on Bikes

SCOPE({[Product].[Product Categories].[Category].&[1].FirstChild.FirstChild:
[Product].[Product Categories].[Category].&[1].LastChild.LastChild}, 
[Measures].[Internet Order Quantity2]);

THIS = [Measures].[Internet Order Quantity2] * 0.85; 

END SCOPE; 

–-***10% Discount on Clothing

SCOPE({[Product].[Product Categories].[Category].&[3].FirstChild.FirstChild:
[Product].[Product Categories].[Category].&[3].LastChild.LastChild}, 
[Measures].[Internet Order Quantity2]);

THIS = [Measures].[Internet Order Quantity2] * 0.9; 

END SCOPE; 

–-***5% Discount on Components

SCOPE({[Product].[Product Categories].[Category].&[2].FirstChild.FirstChild:
[Product].[Product Categories].[Category].&[2].LastChild.LastChild}, 
[Measures].[Internet Order Quantity2]);

THIS = [Measures].[Internet Order Quantity2] * 0.95; 

END SCOPE; 

 

3.)      Save, publish and deploy the cube.

4.)      Open a new mdx query connection in Management Studio and run the following code:


SELECT
{
      [Measures].[Internet Order Quantity],
      [Measures].[Internet Order Quantity2]
} ON 0,
NON EMPTY
(
{
      {[Product].[Product Categories].[All Products].FirstChild.FirstChild.FirstChild:
       [Product].[Product Categories].[All Products].LastChild.LastChild.LastChild}
},
{
      [Date].[Calendar].Members
}
)
 ON 1
FROM
      [Adventure Works]

 

I have included both variables (Original and discounted using scope) in the query to confirm that the results are as desired. The rows at the top have no difference, but that is because they belong to the Accessories department which wasn’t discounted, the rows further down show the clear difference.

What the scope code is doing: Quite simple really, inside each scope statement for each of the Poroducts derived by getting the Grand children of category (FirstChild.FirstChild to lastChild.LastChild) it is replacing the value of “Internet Order Quantity” with whatever code is specified in the piece of code followed by “This =”

The pre-aggregated approach.

1.)      Create a view in the sql database, based on the FactInternetSales table using the same logic used in the cube, the code should look like this:

 


CREATE VIEW dbo.vw_FactInternetSales
AS
SELECT
      fis.*,
      CASE
      WHEN dpc.ProductCategoryKey = 1 THEN fis.OrderQuantity * 0.85 
      WHEN dpc.ProductCategoryKey = 2 THEN fis.OrderQuantity * 0.95
      WHEN dpc.ProductCategoryKey = 3 THEN fis.OrderQuantity * 0.9
      ELSE
          fis.OrderQuantity
      End AS OrderQuantity3 
FROM
      dbo.FactInternetSales  fis
      INNER JOIN dbo.DimProduct dp
      ON    fis.ProductKey = dp.ProductKey 
      INNER JOIN dbo.DimProductSubcategory dpsc
      ON    dp.ProductSubcategoryKey = dpsc.ProductSubcategoryKey 
      INNER JOIN dbo.DimProductCategory dpc 
      ON    dpsc.ProductCategoryKey = dpc.ProductCategoryKey 

     

2.)      In the data source view of the cube replace the existing internet sales table with the view created in step 1.

3.)      Add OrderQuantity3 to the measure group, call it “Internet order Quantity3”, set up the format string and other attributes to be the same as Internet Order Quantity2.

4.)      Change the code in the partitions to include the “Internet order Quantity3” column and to use the view as opposed to the table it used to have (For this example just use select *)

5.)      Publish and process the cube.

6.)      Add “Internet Order Quantity3” to the column list being selected in the MDX statement executed in Management studio earlier, it should now look like this:

 


SELECT
{
      [Measures].[Internet Order Quantity],
      [Measures].[Internet Order Quantity2],
      [Measures].[Internet Order Quantity3]
} ON 0,
NON EMPTY
(
{
      {[Product].[Product Categories].[All Products].FirstChild.FirstChild.FirstChild:
       [Product].[Product Categories].[All Products].LastChild.LastChild.LastChild}
},
{
      [Date].[Calendar].Members
}
)
 ON 1
FROM
      [Adventure Works]

 

 

As you can see the pre aggregated approach returns exactly the same result as the scope version.

I tested the performance in MDX of both versions and i found that the pre aggregated version seems to perform a bit better. I did the tests with this code (I tried to make the query as granular as possible for a good test):

First clear the cache by following these instructions:

http://msdn.microsoft.com/en-us/library/hh230974.aspx

Then run this query in MDX to test the Scope approach:


SELECT
{
      [Measures].[Internet Order Quantity2]
} ON 0,
NON EMPTY
(
{
      [Promotion].[Promotions].[All Promotions].FirstChild.FirstChild.FirstChild:
      [Promotion].[Promotions].[All Promotions].LastChild.LastChild.LastChild
},
{
      [Customer].[Customer Geography].[All Customers].FirstChild.FirstChild.FirstChild:
      [Customer].[Customer Geography].[All Customers].LastChild.LastChild.LastChild
},
{
      {[Product].[Product Categories].[All Products].FirstChild.FirstChild.FirstChild:
       [Product].[Product Categories].[All Products].LastChild.LastChild.LastChild}
},
{
      [Date].[Calendar].Members
}
)
 ON 1
FROM
      [Adventure Works]

 

This query took 58 seconds on my pc. Now if i run the clear cache query again and change the “Internet Order Quantity2” measure to “Internet order Quantity3” the query completes in 47 seconds, not much of a saving, but it does make a differences.

Both approaches I believe are good, but I personally prefer the pre-aggregated in sql approach for a few reasons:

1.)      Performance

2.)      Deployment into production (a view in sql is deployed a lot easier than a cube), in both cases the cube needs to be processed though.

3.)      The sql approach will not be affected by a design change to hierarchies on the dimensions.

The scope approach however can change logic at any level in the hierarchies ie different logic at category level to product level, this will be a lot more difficult to achieve in sql though (probably need to do a bit in both parts cube and sql).

Advertisements

About ditchie2013

A Microsoft SQL Server Analysis Services developer.
Aside | This entry was posted in Analysis Services, MDX Performance, SQL, TSQL. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s