Feb 2, 2020

4 ways to speed up your processes with parallel loading in IBM TM1 and Planning Analytics

What is parallel loading?

TurboIntegrator is the built-in ETL (Extract, Transform, Load) tool of IBM TM1 and Planning Analytics. TurboIntegrator processes (TIs) are very powerful and fast: in just a few seconds, millions of records or cells can be loaded or copied into your cubes. You can make them even faster by running them in parallel.

Instead of running one process loading one year of data, you could run 12 processes in parallel (each loading one month of data).

Why run processes in parallel?

The main reason for running processes in parallel is to improve your user experience. To get the best experience out of their IBM TM1 and Planning Analytics application, users should wait as little as possible. Especially if some TIs need to run on a daily basis, you need these TIs to be as fast as possible.

Context

In this article we go through a very common use case: copying one year of data from one cube to another. Our process copying data from the Employee cube to the Employee Reporting cube takes around 80 seconds.

We will see in this article how to reduce the run time to 20 seconds by running one process per month in parallel.

How to run processes in parallel

This article covers four ways to execute processes in parallel: TM1RunTI, Hustle, RushTI and RunProcess.

There are two other ways which are not covered in this article as they are either too cumbersome (setting up TM1 chores manually for each TI) or highly technical (making your own TM1 REST API calls).

1. TM1RunTI

TM1RunTI is a command line interface tool that can initiate an IBM TM1 TurboIntegrator (TI) process or chore from within any application capable of issuing operating system commands.

The TM1RunTI executable file (tm1runti.exe) can be found in the bin directory of a TM1 server install (Program Files\IBM\cognos\tm1\bin).

Create a config file

The first step is to create a configuration file (tm1runti.config) to store the TM1 user credentials (TM1RunTI will require one TM1 user to authenticate to the TM1 server).

TM1 includes a utility, TM1Crypt, to encrypt your password. TM1RunTI will then use an encryption key and a password file to authenticate to TM1.

To encrypt your password, you just need to run the following command line in a Windows Command Prompt:

"C:\Program Files\ibm\cognos\tm1_64\bin\tm1crypt.exe" -keyfile C:\TM1\tm1srv01\Scripts\tm1key.dat -outfile C:\TM1\tm1srv01\Scripts\tm1cipher.dat -validate

When you execute the command, you will be prompted to enter the password, and that’s it!

TM1Crypt will create two new files:

  • C:\TM1\tm1srv01\Scripts\tm1key.dat: The key to decrypt the password
  • C:\TM1\tm1srv01\Scripts\tm1cipher.dat: The encrypted password

Here is the tm1runti.config file we are using:

[TM1RunTI]
adminhost="localhost"
server="tm1srv01"
user="admin"
passwordkeyfile="C:\TM1\tm1srv01\Scripts\tm1key.dat"
passwordfile="C:\TM1\tm1srv01\Scripts\tm1cipher.dat"

To test your configuration, try to execute one TM1RunTI command from a Windows Command Prompt as below:

"C:\Program Files\ibm\cognos\tm1_64\bin64\tm1runti.exe" -i "C:\TM1\tm1srv01\Scripts\tm1runti.config" -process Cub.EmployeeReporting.CopyFrom.Employee.ByMonth

If no errors are returned, TM1RunTI successfully executed the process.

Executing TM1RunTI from a TI

Once the configuration is correct, let’s write a TM1 process to execute multiple TM1RunTI commands.

To execute the TM1RunTI command line from a TI, we use the ExecuteCommand(sCMD,0) function. The second argument, 0, means that the TI does not wait for the command to finish before moving on to the next statement.

The code for one ExecuteCommand will look like this:

sTM1RunTIExe = '"C:\Program Files\ibm\cognos\tm1_64\bin64\tm1runti.exe"';
sLoginInfo = ' -i "C:\TM1\tm1srv01\Scripts\tm1runti.config"';
sProcess = ' -process Cub.EmployeeReporting.CopyFrom.Employee.ByMonth';
sParameters = ' pVersion=Budget pYear=2018 pMonth=Jan';
sCMD = sTM1RunTIExe | sLoginInfo | sProcess | sParameters;
ExecuteCommand(sCMD,0);

Now we just need to add as many ExecuteCommand calls as we need. In our example, we execute 12 TM1RunTI commands, one per month.

When the process is executed, the 12 processes can be seen in the sessions, all running at the same time.
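The twelve command lines differ only in the month, so they can be generated rather than typed by hand. The sketch below is written in Python purely for illustration (inside TM1 the same pattern would be a loop in the TI itself). The month names after Jan are assumed to be three-letter abbreviations, and build_commands is a hypothetical helper, not part of TM1RunTI.

```python
# Illustrative only: builds the same command strings as the TI snippet above.
TM1RUNTI_EXE = r'"C:\Program Files\ibm\cognos\tm1_64\bin64\tm1runti.exe"'
CONFIG_FILE = r'"C:\TM1\tm1srv01\Scripts\tm1runti.config"'
PROCESS = "Cub.EmployeeReporting.CopyFrom.Employee.ByMonth"

# Assumption: the Month elements are three-letter abbreviations.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def build_commands(version, year):
    """Return one TM1RunTI command line per month (hypothetical helper)."""
    commands = []
    for month in MONTHS:
        commands.append(
            f"{TM1RUNTI_EXE} -i {CONFIG_FILE} -process {PROCESS}"
            f" pVersion={version} pYear={year} pMonth={month}"
        )
    return commands


commands = build_commands("Budget", 2018)
for command in commands:
    print(command)
```

Each printed line corresponds to one ExecuteCommand(sCMD,0) call in the master TI.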

When to use TM1RunTI?

Even though TM1RunTI is fairly easy to use, its main limitation is that it does not manage threads. The number of processes running with TM1RunTI should never exceed the number of CPUs that TM1 is allowed to use. If you hit this limitation, instead of trying to manage the processes with your own code, you could try Hustle.

2. Hustle

Hustle is a small utility that can be used to manage threads when executing command line tools. The tool was built to take advantage of parallel loading in IBM TM1 and Planning Analytics, specifically TM1RunTI.

Hustle enables you to specify the number of concurrent processes you want to be executed at any one time and pass a batch of commands to be executed on these threads.

For example, if TM1 is allowed to use 10 CPUs, you can have a file which contains 50 TM1RunTI jobs and pass in 10 as the number of cores to use. Hustle will launch the first 10 threads and watch the queue; as each thread finishes, it starts a new one, keeping 10 cores busy until all 50 jobs are completed.
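The scheduling idea described above (a fixed number of worker threads pulling jobs from a queue) can be sketched in a few lines of Python. This is not Hustle's actual implementation, only an illustration of the behaviour: the jobs are dummy functions that sleep briefly instead of calling TM1RunTI, and run_batch and fake_job are hypothetical names.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_batch(jobs, max_threads):
    """Run all jobs with at most max_threads at once; return results and peak concurrency."""
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}

    def worker(job):
        # Track how many jobs are running at the same time.
        with lock:
            state["running"] += 1
            state["peak"] = max(state["peak"], state["running"])
        try:
            return job()
        finally:
            with lock:
                state["running"] -= 1

    # The pool starts a new job as soon as one of its threads becomes free.
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        results = list(pool.map(worker, jobs))
    return results, state["peak"]


def fake_job(number):
    """Stand-in for one TM1RunTI command (assumption: real jobs would shell out)."""
    def job():
        time.sleep(0.01)
        return number
    return job


jobs = [fake_job(n) for n in range(50)]
results, peak = run_batch(jobs, max_threads=10)
print(f"{len(results)} jobs completed, at most {peak} running at once")
```

All 50 jobs complete, but the number running at any moment never goes above the limit of 10.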

Download Hustle

First you will need to download Hustle and put it on the TM1 server (no installation required).

Make sure TM1RunTI is set up correctly

To execute TM1 processes, Hustle uses TM1RunTI, so before using Hustle, you need to make sure that TM1RunTI is configured correctly (just follow the steps described above).

Executing Hustle from a TM1 process

First we need to create a file to store all the TM1RunTI commands. To do that we are going to use the ASCIIOutput function as below:

sTM1RunTIExe = '"C:\Program Files\ibm\cognos\tm1_64\bin64\tm1runti.exe"';
sConfigFile = ' -i "C:\TM1\tm1srv01\Scripts\tm1runti.config"';
sProcess = ' -process Cub.EmployeeReporting.CopyFrom.Employee.ByMonth';
sCmd = sTM1RunTIExe | sConfigFile | sProcess;
sParameters = ' pVersion=Budget pYear=2018 pMonth=Jan';
ASCIIOutput(sHustleFile, sCmd | sParameters);

Once the process has run, the tasks file contains the TM1RunTI commands to be executed.

Executing Hustle

Hustle is very simple to use. The command line tool takes two arguments:

  • The path to a text file that contains the commands to be executed (sHustleFile)
  • The maximum number of threads to be used (pNbThreads)

The command line in our TI will look like this:

sHustleExe = 'C:\Hustle\Hustle.exe';
sHustleFile = 'C:\Hustle\RunTIBatchByMonth.txt';
sCmdHustle = sHustleExe | ' ' | sHustleFile | ' ' | pNbThreads;

Then to execute the Hustle command line in a process, we use the ExecuteCommand function as below:

ExecuteCommand(sCmdHustle, 1);

When you execute the process, you will see the main process running and 4 other processes running at the same time. Once one process finishes, Hustle starts the next one.

When to use Hustle?

Hustle is very popular in the TM1 community. You should not hesitate to choose Hustle over TM1RunTI alone, as it removes the overhead of managing the threads yourself. The two main reasons why you would not use it are:

  1. Hustle has to be located on the TM1 server. If you are not allowed to add this small utility to your server, then you can’t use it.
  2. Each TM1RunTI command executed needs to authenticate to TM1. If you need to execute hundreds of them, this authentication step (especially with CAM security) might take a bit of time.

These two limitations can be overcome with RushTI.

3. RushTI

RushTI is a Python script that lets you execute IBM TM1 and Planning Analytics processes in parallel through a single connection to the TM1 REST API.

RushTI manages parallel threads in a similar way to Hustle: you specify the maximum number of threads to run at the same time.

Setting up RushTI

RushTI requires TM1py and uses the TM1 REST API to connect to any TM1 instance, assuming the REST API is enabled.

All steps to install RushTI can be found in this article:

In this example we downloaded RushTI and put it in the C:\TM1py\RushTI folder.

Configuration file

The details needed to authenticate to TM1 must be added to the config.ini file in the RushTI folder. To check your credentials, you can run the check.py script as it is described in this article:

Our TM1 instance is using security mode 1, so we just need to reference the user name and password as below:

[tm1srv01]
address=localhost
port=8352
user=Admin
password=apple
decode_b64=False
ssl=True

TM1py gives you two ways to avoid storing your password in clear text in a file:

  • With TM1 security mode 1, you could encode the password using a base64 encoder (this obscures the password but does not encrypt it).
  • With CAM Security, TM1py can log in using the CAM gateway as below:

[tm1srv01]
address=localhost
port=8352
namespace=CUBEWISE
gateway=http://localhost:80/ibmcognos/cgi-bin/cognos.cgi
ssl=True
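For the first option, the base64 value can be produced with Python's standard library. The sketch below encodes the sample password used earlier in this article; encode_password is a hypothetical helper name, and UTF-8 is assumed as the text encoding.

```python
import base64


def encode_password(plain_text):
    """Return the base64 form of a password (encoding, not encryption)."""
    return base64.b64encode(plain_text.encode("utf-8")).decode("ascii")


encoded = encode_password("apple")
print(encoded)  # YXBwbGU=
```

The encoded value would then go into the password entry of config.ini, with decode_b64 set to True so that the value is decoded before logging in. Anyone who can read the file can still decode it, so file permissions remain important.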

Create tasks.txt

Once RushTI is set up, we need, as with Hustle, to store the list of processes to execute in parallel in a text file. To do this we use the ASCIIOutput function as below:

sTaskFile = 'C:\TM1py\RushTI\Tasks.txt';
sCmd = 'instance="tm1srv01" process="Cub.EmployeeReporting.CopyFrom.Employee.ByMonth" ';
sParameters = ' pVersion=Budget pYear=2018 pMonth=Jan';
ASCIIOutput(sTaskFile, sCmd | sParameters);

Then to execute RushTI, we are using the ExecuteCommand function:

sRushTI = 'C:\TM1py\RushTI\RushTI.py';
sCommand = 'python ' | sRushTI | ' ' | sTaskFile | ' ' | pNbThreads;
ExecuteCommand(sCommand,1);

pNbThreads is the maximum number of threads that RushTI will execute.

The process first creates the Tasks.txt file.

Then RushTI handles the threads in a similar way to Hustle. It makes sure the number of threads running at the same time never exceeds the number of threads specified.
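As a rough illustration of what the task file contains, the sketch below builds one line per month in the same instance/process/parameters layout as the ASCIIOutput call above and writes them to a temporary file. It is written in Python for convenience; the month names after Jan are assumed, and build_task_lines and write_task_file are hypothetical helpers.

```python
import os
import tempfile

INSTANCE = "tm1srv01"
PROCESS = "Cub.EmployeeReporting.CopyFrom.Employee.ByMonth"

# Assumption: the Month elements are three-letter abbreviations.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def build_task_lines(version, year):
    """Return one task line per month, mirroring the TI snippet above."""
    return [
        f'instance="{INSTANCE}" process="{PROCESS}" '
        f"pVersion={version} pYear={year} pMonth={month}"
        for month in MONTHS
    ]


def write_task_file(path, lines):
    """Write one task per line."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")


task_lines = build_task_lines("Budget", 2018)
# A temporary folder is used here so the sketch runs anywhere.
task_file = os.path.join(tempfile.mkdtemp(), "Tasks.txt")
write_task_file(task_file, task_lines)
print(task_lines[0])
```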

RushTI vs Hustle

Even though at first, RushTI and Hustle look very similar, RushTI has three main advantages.

RushTI uses one connection only

Hustle uses TM1RunTI and therefore has the same limitation: each thread first needs to connect to TM1. In certain TM1 applications with CAM Security, TM1 developers had to set up a different user per thread to avoid locking.

RushTI does the same as Hustle except that it connects to TM1 only once at the beginning and then reuses the same connection to run all threads. By using RushTI you save the authentication time and remove some potential locking issues.

RushTI can execute processes on different instances

RushTI uses the TM1 REST API and therefore does not have to sit on the TM1 server. One RushTI script can connect to any TM1 instance as long as the TM1 REST API is enabled.

Ordering the threads

RushTI enables you to set a predecessor for each process. RushTI will then wait until the “predecessor” process has finished before executing it.

This can be very useful when running allocations, where you might need to wait until one set of processes has finished before running the next set.

In our example we split the processes by department and month. We set each department as the predecessor of the next one, making sure that all data for one department has been copied before running the next department.

All processes with id="2" wait until all processes with id="1" have finished, because they have predecessor="1".
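The ordering rule can be illustrated with a small sketch that groups task ids into "waves": a wave can start only once every predecessor id has finished. This is not RushTI's code, only a model of the rule described above. The dictionary layout and the execution_waves function are hypothetical, and the third id is added just to show a chain.

```python
def execution_waves(predecessors):
    """Group task ids into waves; each wave runs after all earlier waves finish.

    predecessors maps a task id to the list of ids it must wait for.
    """
    done = set()
    remaining = dict(predecessors)
    waves = []
    while remaining:
        # A task id is ready when all of its predecessors are done.
        ready = sorted(
            task_id for task_id, needs in remaining.items()
            if set(needs) <= done
        )
        if not ready:
            raise ValueError("circular dependency between task ids")
        waves.append(ready)
        done.update(ready)
        for task_id in ready:
            del remaining[task_id]
    return waves


# id "1" has no predecessor, id "2" waits for "1", id "3" waits for "2".
task_groups = {"1": [], "2": ["1"], "3": ["2"]}
waves = execution_waves(task_groups)
print(waves)
```

Within a wave, all processes sharing the same id can still run in parallel, up to the maximum number of threads.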

4. RunProcess

IBM Planning Analytics (TM1 v11) introduced a new function called RunProcess. “RunProcess lets you run TurboIntegrator processes in parallel, each on its own thread that is managed by TM1® Server.”

The great advantage of using RunProcess is that it does not require any setup. You just need to call the function multiple times in a master process, as below:

RunProcess('Cub.EmployeeReporting.CopyFrom.Employee.ByMonth',
    'pVersion','Budget',
    'pYear','2018',
    'pMonth','Jan');
RunProcess('Cub.EmployeeReporting.CopyFrom.Employee.ByMonth',
    'pVersion','Budget',
    'pYear','2018',
    'pMonth','Feb');

In our example, we execute 12 RunProcess functions inside the same master process. After executing the master process, you will see 12 new processes running, each in its own thread.

RunProcess does not come with a native way to manage threads. If the number of processes you need to run in parallel exceeds the number of CPUs available, you will need to write some smart logic to make sure the number of processes running never exceeds the number of CPUs TM1 is allowed to use.
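One simple form such logic could take is to split the work into batches no larger than the CPU limit and start the next batch only when the previous one has finished. The sketch below shows only the batching step, in Python for illustration. How the master process detects that a batch has finished is deliberately left out, the limit of 4 is an arbitrary example, and split_into_batches is a hypothetical helper.

```python
def split_into_batches(items, max_parallel):
    """Split items into consecutive batches of at most max_parallel entries."""
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    return [
        items[start:start + max_parallel]
        for start in range(0, len(items), max_parallel)
    ]


# Assumption: the Month elements are three-letter abbreviations.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

batches = split_into_batches(months, max_parallel=4)
for number, batch in enumerate(batches, start=1):
    print(f"batch {number}: {batch}")
```

Fixed batches are easy to reason about but leave CPUs idle while the slowest process of a batch finishes; a queue that refills as soon as one process ends, as Hustle and RushTI do, keeps them busier.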

ExecuteProcess vs RunProcess

All sub-processes called from the same master process through a chain of ExecuteProcess calls are part of the same transaction. Until the master process finishes its epilog and commits, any data changes made by processes within the transaction will be available only to other processes that are part of the transaction. From an external perspective no changes made during the transaction are available until the final commit.

A process started with RunProcess is not part of the same transaction. The data it has available is therefore the state of the data model BEFORE the commencement of the master process.

Optimizing parallel processing

Once you start running processes in parallel, the next question you could ask yourself is how to make your processes run even faster!

How many threads to run in parallel

The first question you could ask yourself is: how many processes can I run in parallel? As explained before, the number of threads running in parallel should not exceed the number of CPUs available. If you reach this limit, you could increase the number of CPUs on your server to run more TIs, but increasing the number of CPUs will not necessarily increase the speed. As explained in the Mastering MTQ article, there is a tipping point beyond which splitting your processes won’t gain you much speed.
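A back-of-the-envelope way to see why such a tipping point exists is Amdahl's law, which assumes that a fixed share of the work cannot be parallelised. The sketch below applies it to the timings from this article (80 seconds sequentially, 20 seconds with 12 parallel processes). It is a simplified model, not a measurement: real TM1 behaviour also depends on locking, memory and disk, and the projected figures are only indicative.

```python
def parallel_fraction(speedup, threads):
    """Share of the work that runs in parallel, inferred from an observed speedup."""
    return (1 - 1 / speedup) / (1 - 1 / threads)


def predicted_speedup(fraction, threads):
    """Amdahl's law: speedup for a given parallel fraction and thread count."""
    return 1 / ((1 - fraction) + fraction / threads)


sequential_seconds = 80   # single process, from the article
parallel_seconds = 20     # 12 processes in parallel, from the article
threads = 12

observed = sequential_seconds / parallel_seconds
fraction = parallel_fraction(observed, threads)

for n in (4, 12, 24, 48):
    estimate = sequential_seconds / predicted_speedup(fraction, n)
    print(f"{n:>2} threads: about {estimate:.1f} s (model estimate)")
```

Under this model, doubling the threads from 12 to 24 would save only a few seconds, and no number of threads could bring the run time below roughly 15 seconds.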

Dimension(s) to split the process

The next question you could ask yourself is which dimension should drive the splitting of processes?

In our example, the Period dimension drives the splitting (one process per month). What if, instead of running one process per month, we ran one process per Region (14 TIs), one per Department (9 TIs) or one per combination of month and department (12 * 9 = 108 TIs)?

The only way to find out is to do some testing!

In our example, running 12 months in parallel took 20 seconds. Splitting by department took 52 seconds, and splitting by month and department took 40 seconds.

Reordering dimensions for speed

If your processes are copying data from one cube to another, you could reduce the execution time by improving the query time of your source cube. To improve the query time, you could reorder its dimensions.

Finding the optimal dimension order will require some testing, but putting the dimension that drives the split first should improve the query time.
