What can you do when your users complain that Excel or TM1 Web is hanging? How do you deal with this situation? Hopefully it does not happen often in the TM1 world, but it can happen if:
- A process which updates metadata is running.
- A user is opening a very large spreadsheet.
- A user is opening a large cube view.
- A user is spreading data across a very large dimension.
Pulse has lots of features which can help you troubleshoot slow performance. What is more, it gives you the tools to find out quickly what could be causing the issue.
1 - Check what's happening
Pulse has a Live Monitor feature which shows you what's happening on the server. It tells you the amount of CPU and memory used by your TM1 application, the number of users logged in and the wait time. In the example below we can see that Oliver Martin is running a process and Anthony Taylor is waiting:
2 - Analyse what happened
With the Pulse Web client, you can see what's happening right now, but if you want to know what happened a few minutes ago, you can use the Pulse Thick Client. It allows you to pause and rewind the threads to investigate what happened in the past, so you can analyse the event step by step:
In our case, the TM1 process "z.Dim.UpdateLargeDimension" is causing the lock. Now we need to investigate whether this TM1 process took more time than usual and, if so, find out why.
3 - Check Process Execution History
Pulse keeps lots of information about your TM1 objects. In the Process History tab, you can find the minimum, average and maximum run time for every TI process. In our example, the minimum run time of "z.Dim.UpdateLargeDimension" is less than 1 sec and the maximum run time is 40 sec:
With this information, we can investigate whether the increase was caused by a change in the TI process or in its data source.
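To make the min/avg/max comparison concrete, here is a minimal Python sketch. It is not how Pulse computes its figures: the run times are hypothetical values chosen to match the example (under 1 sec up to 40 sec), and `summarize` is an illustrative helper.

```python
from statistics import mean

# Hypothetical run times in seconds for one process (not exported from Pulse).
run_times = [0.6, 0.8, 0.7, 0.9, 30.0, 40.0]

def summarize(durations):
    """Return the minimum, average and maximum of a list of run times."""
    return {"min": min(durations), "avg": mean(durations), "max": max(durations)}

summary = summarize(run_times)

# A large gap between the fastest and slowest run hints that something changed.
spread_ratio = summary["max"] / summary["min"]
print(summary, round(spread_ratio, 1))
```

A maximum many times larger than the minimum, as here, is the signal that justifies looking at individual executions next.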
4 - Check each time a process is executed
In the Query tab, you can pick a process or a chore and see each time it was executed between two dates. In our example, we can see that the execution time increased on 21/08/2016 from less than 1 sec to 30 sec:
This significant change in run time can be caused by a change to the TI process or to its data source. The next section shows how to investigate changes in the code.
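The same reasoning can be sketched in Python: given a list of executions, find the first date on which the run time jumped. The log below is hypothetical (only the 21/08/2016 jump to 30 sec comes from the example), and the `factor` threshold is an arbitrary assumption.

```python
from datetime import date

# Hypothetical execution log: (run date, duration in seconds).
executions = [
    (date(2016, 8, 18), 0.7),
    (date(2016, 8, 19), 0.8),
    (date(2016, 8, 20), 0.6),
    (date(2016, 8, 21), 30.0),
    (date(2016, 8, 22), 31.5),
]

def first_jump(runs, factor=5.0):
    """Return the date of the first run more than `factor` times slower
    than the run before it, or None if there is no such run."""
    for (_, previous), (day, current) in zip(runs, runs[1:]):
        if previous > 0 and current > previous * factor:
            return day
    return None

jump_day = first_jump(executions)
print(jump_day)
```

Knowing the date narrows the search: only changes made on or shortly before that day need to be reviewed.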
5 - Look for changes in the code
Pulse has a Change Tracking feature: every change that you make to a TI or a chore is stored in the Pulse Version Control System. In other words, every time an object is changed by a user or the system, it is tracked and logged in Pulse.
In the Change History tab, we can see all the changes that happened in our TM1 instance CXMD. You can filter by TM1 object type in order to see only Processes:
6 - Check the history of changes
Clicking on the "z.Dim.UpdateLargeDimension" process displays the history of changes for the process:
Now we can go through the changes and try to find out which part of the code could be causing the increase in execution time. Pulse highlights lines added to the code in green and deleted lines in red.
In this example, the issue comes from a new line of code which runs another process:
The beauty here is that we do not have to go through all the code to find what could have changed. With Pulse you can easily see what has been updated and when it has been updated.
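For readers who want to see the idea behind the green/red highlighting, here is a small Python sketch using the standard `difflib` module. It does not reflect how Pulse stores or compares versions. Both script versions are invented for illustration, and the called process name `z.Hypothetical.SlowProcess` is hypothetical.

```python
import difflib

# Hypothetical previous and current versions of a TI script.
old_version = [
    "DimensionElementInsert(sDim, '', sElement, 'N');",
    "DimensionElementComponentAdd(sDim, sParent, sElement, 1);",
]
new_version = [
    "DimensionElementInsert(sDim, '', sElement, 'N');",
    "ExecuteProcess('z.Hypothetical.SlowProcess');",
    "DimensionElementComponentAdd(sDim, sParent, sElement, 1);",
]

diff = list(difflib.unified_diff(
    old_version, new_version,
    fromfile="previous", tofile="current", lineterm="",
))

# Lines starting with "+" were added, lines starting with "-" were removed;
# the "+++" and "---" lines are file headers and are skipped.
added = [line[1:] for line in diff
         if line.startswith("+") and not line.startswith("+++")]
removed = [line[1:] for line in diff
           if line.startswith("-") and not line.startswith("---")]

print("\n".join(diff))
```

Here `added` contains only the new `ExecuteProcess` call and `removed` is empty, which mirrors the single green line in the example.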
Now that we have found the issue, we can either open the TM1 process and remove the change, or select a previous version of the TI and click on the Rollback button. Rollback overwrites the current version of the TI with the selected previous version. This feature can be very helpful if there are many lines of code to update:
Exercise caution when using the Rollback feature on rule files in Production. This feature saves the rule after the rollback, and saving a rule creates a lock on the TM1 instance. Depending on how long it takes to save the rules, users might be locked.
7 - Set up alerts
To anticipate these types of situations, Pulse gives you the ability to set up Alerts which send you emails or even kill a thread when a threshold is reached. For example, you can set up an Alert that fires if the wait time exceeds 60 sec or if a process takes more than 100 sec; Pulse will then send an email, kill the thread, or both.
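The threshold logic can be illustrated with a short Python sketch. This is not Pulse's alert configuration format: the `AlertRule` class, the metric names and the action names are all hypothetical, and only the 60 sec and 100 sec thresholds come from the example above.

```python
from dataclasses import dataclass

@dataclass
class AlertRule:
    metric: str           # hypothetical metric name, e.g. "wait_time"
    threshold_sec: float  # the alert fires when the observed value exceeds this
    actions: tuple        # hypothetical action names: "email", "kill_thread"

# Thresholds from the example; the rule format itself is an assumption.
rules = [
    AlertRule("wait_time", 60, ("email",)),
    AlertRule("process_run_time", 100, ("email", "kill_thread")),
]

def triggered_actions(rules, observed):
    """Return (metric, action) pairs for every rule whose threshold is exceeded."""
    hits = []
    for rule in rules:
        value = observed.get(rule.metric)
        if value is not None and value > rule.threshold_sec:
            hits.extend((rule.metric, action) for action in rule.actions)
    return hits

result = triggered_actions(rules, {"wait_time": 75, "process_run_time": 40})
print(result)
```

With a wait time of 75 sec and a process run time of 40 sec, only the wait-time rule fires, so the only resulting action is an email.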
Pulse helped us to understand what was happening on the server. We've found out that one user was waiting because another user was running a TI. Thanks to the change tracking capability, we were able to find out which line of code was causing the issue.