- 1. Use CONTINUE over IF
- 2. Simplify the comparison
- 3. Replace ATTR functions with DB
- 4. Replace DB with DIMIX
- 5. Dimension relationship functions (ELPAR, ELISPAR, ELISANC) are faster than DB on small dimensions
- ELISCOMP vs ELISPAR vs ELPAR
- DB vs ELPAR
- DB vs ELISANC
- 6. Replace DB with [‘element’]
- 7. Replace [‘element’] with [‘dimension’:’element’]
- Conclusion
- READ MORE:
Sep 1, 2019
7 tips to writing faster IBM TM1 and Planning Analytics rules
One of the reasons why IBM TM1/Planning Analytics is so fast at performing complex calculations is because it gives developers many options to optimise the calculation rules.
The way developers write a rule has a huge impact in term of performance so before writing a rule, there are some key questions TM1 developer should be aware of such as:
- Should I use CONTINUE over IF statements?
- Should I get an attribute value with ATTRS or DB?
- Should I specify the dimension name in an area statement?
Thanks to the }StatsByRule cube introduced with IBM Planning Analytics, we can now find out the evaluation time of each rule and give a definitive answer to these questions.
In this article we evaluated the performance of the most popular functions to give you some guidance about writing faster rules.
* To calculate the percentages below, we used an average of the total time taken for the rule to be evaluated from the }StatsByRule cube across different cubes sizes with IBM Planning Analytics v2.0.7. Every model is different, you might not get the same percentages in your model as the calculation time depends on factors such as IBM Planning Analytics server version, number of elements per dimension, number of dimensions in the cube, dimension ordering, cubes sizes, etc…
1. Use CONTINUE over IF
To improve the performance of a complex rule, you should first try to simplify it. One of the ways to do that is to use the CONTINUE statement.
If you see multiple nested IF statements in a single rule you could try to split this rule into multiple rules, one rule per IF statement.
Use rule priority to order the conditions of the original rule: TM1 executes rule from top to bottom. For a given data point in a cube once the left-hand side of the rule is satisfied the right-hand of the rule is evaluate. The rule calculation engine in TM1 stops evaluating any further rules below the qualifying rules unless instructed by a CONTINUE statement to keep evaluating subsequent rule statements
We compared one rule with two nested IF versus two rules with a CONTINUE statement:
The time to evaluate the two simple rules is 40% quicker than evaluating one rule with two IF statements.
Even though CONTINUE statements can reduce the calculation time and the readability of the cube rules, too many CONTINUE statements will create some performance issues.
Reducing the number of IF statements without using too many CONTINUE can be achieved by adding more constraints to the area statement on the left-hand side of the rule to reduce the slice of the cube that the rule applies to.
2. Simplify the comparison
Now that we simplified the rule, the next step would be to try to simplify the condition inside the IF statement.
Numeric comparison is generally much faster than string comparison, and the length of the text will also have an impact on the comparison:
The example above just confirmed some common practices in most programming languages. This is not TM1 specific but the same in any ‘language’.
String comparison, even if it is one character, if that character is the same there is a second comparison to see if the end of the string is reached whereas a numeric comparison is just a single comparison operation. The last one has no comparison at all which, as in any language, would explain the difference in execution time.
Actually, in TM1 what takes most of the time when comparing values is not the comparison itself but the function used to return the value used inside the comparison.
Let’s now have a look at these functions. We are going to start by comparing ATTRN, ATTRS and SUBST. These three functions are probably the most popular functions when you want to limit the rule to a subset of elements of one dimension.
We’ve found out that the ATTRS function is faster than ATTRN even though the first one requires a string comparison. The SUBST function which returns a sub-string of the element is 75% faster than ATTRN!
The difference between the 3rd and 4th is that the SUBST function is a “starts-with” operation and therefore incurs extra overhead whereas the 4th is a simple element name equals operations.
3. Replace ATTR functions with DB
Using the ATTRS(dimension, element, attribute) or ATTRN(dimension, element, attribute) function looks like the more logic way to get an attribute value, but actually it is much slower than getting the value directly from the }ElementAttributes_ cube using the DB function, DB(‘}ElementAttributes_Dimension’, !Dimension, attribute). On average in our environment, we have found out that the DB function is 70% faster:
Even though ATTR and DB will retrieve the same value of the attribute, it is important to be aware of the differences between these functions:
- ATTRN will return a numeric value, ATTRS will return a string value, DB can return either string or numeric value.
- Using ATTRS to return an alias will return either the alias value or the principle name if the alias is not populated. Using DB will only return alias when the alias is populated in the }ElementAttributes_ cube.
4. Replace DB with DIMIX
To compare two elements between each other, especially when working with dates, you could either use the DIMIX(dimension, element) function to get the index number or get the value from an attribute.
Even though it is a rare use-case, using the DIMIX function can reduce by 50% the evaluation time of the rule:
* You should use the DIMIX function with care, use it only when you can guarantee the index of the element in the natural order of the elements in the dimension won’t change.
5. Dimension relationship functions (ELPAR, ELISPAR, ELISANC) are faster than DB on small dimensions
ELISCOMP vs ELISPAR vs ELPAR
A common requirement when writing a rule is to determine if an element is part of a group or a specific parent. To find out if one element is the child of one parent, you could use any of the following:
- ELISCOMP(dimension, element1, element2) to find out if element1 is a child of element 2
- ELISPAR(dimension, element1, element2) to find out if element1 is a parent of element 2
- ELPAR(dimension, element1, 1) @= element2 to find out if element2 is the first parent of element1
During our tests, ELISPAR and ELISCOMP take around the same time to calculate, it is ELPAR which is 60 % faster:
ELISPAR is an iterator over the collection of parents of checking each and every one of them if they happen to be ‘Total P’ whereas in the ELPAR case, it is only checking if the first of the parents of ‘N Element’ is equal to ‘Total P’.
Although there is some overhead to ‘iterating’ vs retrieving a single element, the difference is proportional to the number of parents. The deeper your hierarchy is, the bigger the variance will be between ELISPAR and ELPAR.
DB vs ELPAR
Another way to find out if an element belongs to a specific parent is to create an attribute and using DB to evaluate that attribute value. Alternatively, create a rollup using the value and making the element a member of the rollup.
For small dimensions with less than 100,000 elements, the ELPAR function is 60% faster than DB.
However, in our environment with dimensions with more than 100,000 elements, the DB function will start to be faster by more than 10%.
The gain on performance is correlated with the dimension size, the gain will be bigger when the size and complexity grow.
DB vs ELISANC
To find out if element2 is an ancestor of element1, you could either use the ELISANC(dimension, element1, element2) function or an attribute value with the DB function as above. In our cube, the ELISANC function is 40% faster on small dimensions:
However, on large dimensions, the DB function will be faster by 12% in our environment:
6. Replace DB with [‘element’]
When referring to another relative data-point in the same cube the rule can be expressed two ways:
- The short form is qualifying only the element(s) that are different in the left-hand side of the rule; [ ‘element’].
- The long form DB statement can be used to explicit reference all the dimension of the cube. Use combination of !bang and qualified elements to reference the other part of the cube.
Using the short form is 40% faster than the DB function. The performance difference will vary based on number of dimensions in the cube vs number of elements specified in the short form.
There is simply less overhead to the short form as TM1 only retrieves or overwrites the elements specified in the ‘current cursor’ whereas DB builds a completely new ‘cursor’ identifying a cell. The performance difference will vary based on the number of dimensions in the cube vs the number of elements specified in the short form.
7. Replace [‘element’] with [‘dimension’:’element’]
In an area statement, TM1 will require you to specify the dimension name in the statement only if there is an element with the same name in more than one dimension. Specifying the dimension name can decrease the evaluation time of the rule by 10%:
Including the dimension reference is faster because this constrains the element search to a single dimension instead of searching over all dimensions.
For small cubes, there may be no real gain but for cubes with many dimensions, or cubes with few really large dimensions then the gain can be significant as TM1 doesn’t have to search all dimensions in the cube for the element.
Especially for cubes with very big dimensions it is recommended to always qualify the dimension. A nice side effect is that qualifying the dimension improves the readability of the rule.
Conclusion
The }StatsbyRule cube is a very powerful feature, it enables developers to test different rule expressions to achieve better performance.
It is important to keep in mind that the performance gain depends on many factors such as IBM Planning Analytics version, cubes sizes, dimensions sizes, dimensions order… that is why you might not get the same results in different environments.
One of the reasons business users love TM1, is because they can easily understand what they see. Even though simplifying the rule is likely to improve the performance, it is important to keep in mind that a TM1 developer should write rules for the end-user that balances these three core principles:
- Design for Functionality: The solution must fulfill its design intent, make the solution intuitive.
- Design for Maintainability: Think beyond the immediate requirements. Adhere to coherent designs standard.
- Design for Performance: Leverage TM1’s unique strengths, avoid known performance bottlenecks.