Triggering TBSM Rules

May 13th, 2016


Tivoli Business Service Manager can calculate amazing things for you, whenever you need them. This is thanks to the powerful rules engine that is a key part of TBSM, as well as the Netcool/Impact policy engine running under the hood of every TBSM edition. You can later present your calculation results on a dashboard or in reports, depending on whether you need a real-time scorecard or historical KPI reports.

In this article, I'll show how you can make TBSM process its inputs by triggering various template rules in various ways. It is something that isn't really well documented, or at least not in a single place.


Status, numerical and text rule triggers

In this chapter I'll show three kinds of rules (status, numerical and text) and how TBSM triggers them, that is, how it processes the input data, runs the rules and returns the outputs.

In general, these three techniques always kick off TBSM rules based on the same two conditions: time and a new value. Here they are:

  1. OMNIbus Event Reader service (for incoming status, numerical or text rules based on the OMNIbus ObjectServer as the data feed)
  2. TBSM data fetchers (for numerical or text rules with a fetcher-based data feed)
  3. TBSM/Impact services of type Policy activator (using an Impact policy with PassToTBSM function calls to send data to numerical or text rules)
Figure 1. Three techniques of triggering status, numerical and text rules in TBSM


Make note. There are also ITM policy fetchers as well as highly undocumented any-policy fetchers configurable in TBSM. I'll not comment on them in this material; however, their basis is just like that of any other fetcher: time.

Let's take a look at the first and most popular type of rule trigger, the OMNIbus Event Reader service in Impact, widely used in TBSM to process events stored in the ObjectServer against the service trees.

OMNIbus Event Reader

The OMNIbus Event Reader is an Impact service that runs regularly, by default every 3 seconds, connecting to Netcool/OMNIbus to get events stored in the ObjectServer memory that might affect TBSM service tree elements. It selects the events based on the following default filter:

(Class <> 12000) AND (Type <> 2) AND ((Severity <> RAD_RawInputLastValue) or (RAD_FunctionType = '')) 
AND (RAD_SeenByTBSM = 0) AND (BSM_Identity <> '')

(Severity <> RAD_RawInputLastValue) is the condition ensuring that an event is tested for carrying a new value in the Severity field compared to the previous event.

The Event Reader itself can be found in the Impact UI server on the Services page in TIP, among other services included in the Impact project called TBSM_BASE:

Figure 2. Configuration of TBSMOMNIbusEventReader service


Make note. TBSM allows you to configure other event readers, but you can use only one at a time.

All incoming status rules use this Event Reader by default. There's an embedded mapping between the name of the Event Reader and the Data source field in status/text/numerical rules, hence only the "ObjectServer" caption appears in the new rule form:

Figure 3. Screenshot of New Incoming status rule form


Policy activators

Impact services of type Policy activator simply run a defined policy at a defined time interval.

Figure 4. Screenshot of a policy activator service configuration for TBSMTreeRuleHeartbeat


The policy needs to be created first. In order to trigger TBSM rules, it must call the PassToTBSM() function with an object as the argument. Let's say this is my TBSMTreeRulesHeartbeat policy:

Seconds = GetDate();
Time = LocalTime(Seconds, "HH:mm:ss");
randNum = Random(100000);
ev.timestamp = String(Time);
ev.bsm_identity = "AnyChild";
ev.randNum = randNum;
// Send the assembled object to TBSM to trigger the matching rules
PassToTBSM(ev);


In my example, a new value is generated every time the policy is activated, using the GetDate() and Random() functions. Pay attention to the field called ev.bsm_identity; I'll be referring to it later on. For simplicity, this field always has the value "AnyChild".

Make note. Unlike the TBSM OMNIbus Event Reader, neither the policy-activating services nor the policies themselves have to be included in the TBSM_BASE Impact project.

Netcool/Impact policies give you the freedom to reach any data source, via SQL, SOAP, XML, REST, the command line or any other way you like, and to process any data you find useful to process in TBSM. The only requirement is passing that data to TBSM via the PassToTBSM function.
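As a sketch of that idea (the data source name MYDATASOURCE, the kpi_table table and its columns are hypothetical names of my own, not part of this material), an Impact policy could query an SQL source and forward each row to TBSM:

// Hypothetical sketch: pull rows from any SQL data source and forward
// them to TBSM rules. MYDATASOURCE, kpi_table and the column names are
// made-up; DirectSQL places the resulting rows in DataItems.
DirectSQL('MYDATASOURCE', "select service_id, kpi_value from kpi_table", false);

count = Length(DataItems);
i = 0;
while (i < count) {
    ev = NewObject();
    ev.bsm_identity = DataItems[i].service_id;
    ev.kpi = DataItems[i].kpi_value;
    // Each object sent here can feed a numerical or text rule in TBSM
    PassToTBSM(ev);
    i = i + 1;
}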


TBSM data fetchers

TBSM data fetchers combine an Impact policy with a DirectSQL function call and an Impact policy activator service. Additionally, data fetchers have a minimal concept of a schedule: you can set their run interval, or set a specific hour and minute and run them once a day (e.g. 12:00 AM daily). They also allow postponing or advancing the next run time in case of long-running SQL queries.

Make note. Data fetchers can be SQL fetchers, ITM policy fetchers or any-policy fetchers. Unfortunately, the TBSM GUI was never fully adjusted to reconfigure ITM policy fetchers and was never enabled to configure any-policy fetchers, so for those two you have to perform a few manual, command-line level steps instead. Fortunately, the PassToTBSM function available in Impact 6.x policies can be used instead, so the any-policy fetchers aren't that useful anymore.

Every fetcher by default runs every 5 minutes.

Figure 5. Screenshot of data fetcher Heartbeat fetcher


In the example presented on the screenshot above, the data fetcher connects to a DB2 database and runs a DB2 SQL query. A new value is ensured on every run by calling the native DB2 random function, and the whole query is the following:

select 100000*rand() as "randNum", 'AnyChild' as "bsm_identity" from sysibm.sysdummy1

Pay attention to the bsm_identity field. It always returns the same value, "AnyChild", just like the policy explained before.
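For comparison, if the fetcher ran against Oracle rather than DB2, a hypothetical equivalent query (using Oracle's dual dummy table and its dbms_random package; this variant is my own illustration, not part of this material) could look like this:

select 100000*dbms_random.value as "randNum", 'AnyChild' as "bsm_identity" from dual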


Triggering status, numerical, text and formula rules

In the previous chapter I presented various rule triggering methods. It's now time to show how those triggers work in practice. I'll create a template with five numerical and text rules (I don't want to change any instance status, so I won't create any incoming status rule this time), plus three supporting formula rules, and I'll present the output values of all those rules on a scorecard in Tivoli Integrated Portal. Below you can see my intended template with rules:

Figure 6. Screenshot of t_triggerrulestest template with rules


OMNIbus Event Reader-based rules

Let's start with the two rules utilizing the TBSM OMNIbus Event Reader service. Like I said, I won't use a status rule; instead I'll use one text and one numerical rule, in order to return the last event's severity and the last event's summary. Before I do that, let me configure my SimNet probe, which will be sending random events to my ObjectServer. My future service instance implementing the t_triggerrulestest template will be called TestInstance (or at least it will have such a value in its event identifiers). I want one event of each type, with a different probability for each event being sent:

# Format:
#       Node Type Probability
# where Node        => Node name for alarm
#       Type        =>  0 - Link Up/Down
#                      1 - Machine Up/Down
#                      2 - Disk space
#                      3 - Port Error
#       Probability => Percentage (0 - 100%)

TestInstance 0 10
TestInstance 1 15
TestInstance 2 20
TestInstance 3 25


Let’s see if it works:

Figure 7. Screenshot of Event Viewer presenting test events sent by simnet probe.


Now to my two rules; I marked the important settings in green. My Data feed is ObjectServer, the source is the SimNet probe, the field containing service identifiers is Node, and the output value taken back is Severity:

Figure 8. Screenshot of the rule numr_event_lastseverity form


The same applies here, except this time the rule is a text rule and I have a fancy output expression:

'AlertGroup: '+AlertGroup+', AlertKey: '+AlertKey+', Summary: '+Summary
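For example, assuming a hypothetical event with AlertGroup "SimnetMonitor", AlertKey "link" and Summary "Link Down" (sample values of my own, not taken from the screenshots), the expression would produce:

AlertGroup: SimnetMonitor, AlertKey: link, Summary: Link Down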

The rule itself:

Figure 9. Screenshot of the rule txtr_event_lastsummary


Fetcher-based rule

That was easy. Now for something a little more complicated: the data fetcher. I already have my data fetcher created and shown above in this material. Let's check that it works fine, i.e. that it fetches data every 5 minutes. The log shows the fetcher is fine:

1463089289272[HeartbeatFetcher]Fetching from TBSMComponentRegistry has started on Thu May 12 23:41:29 CEST 2016
1463089289287[HeartbeatFetcher]Fetched successfully on Thu May 12 23:41:29 CEST 2016 with 1 row(s)
1463089289287[HeartbeatFetcher]Fetching duration: 00:00:00s
1463089289412[HeartbeatFetcher]1 row(s) processed successfully on Thu May 12 23:41:29 CEST 2016. Duration: 00:00:00s. The entire process took 00:00:00s
1463089589412[HeartbeatFetcher]Fetching from TBSMComponentRegistry has started on Thu May 12 23:46:29 CEST 2016
1463089589427[HeartbeatFetcher]Fetched successfully on Thu May 12 23:46:29 CEST 2016 with 1 row(s)
1463089589427[HeartbeatFetcher]Fetching duration: 00:00:00s
1463089589558[HeartbeatFetcher]1 row(s) processed successfully on Thu May 12 23:46:29 CEST 2016. Duration: 00:00:00s. The entire process took 00:00:00s


And the data preview looks good too:

Figure 10. The Heartbeat fetcher output data preview


My next rule will be just one numerical rule returning the randNum value. I marked the important settings in green again: I select HeartbeatFetcher as the Data Feed, bsm_identity as the service event identifier, and randNum as the output value:

Figure 11. Screenshot of numr_fetcher_randNum rule


Policy activated rules

Last but not least, I will create two rules getting data from my policy activated by my custom Impact service. I showed the policy and the service in the previous chapter; let's just make sure they both work OK. This is how the service works: every 5 minutes my policy gets activated, and every time it returns a different value in the randNum field:

12 maj 2016 23:41:29,652: [TBSMTreeRulesHeartbeat][pool-7-thread-87]Parser log: (PollerName=TBSMTreeRuleHeartbeatService, randNum=71648, timestamp=23:41:29, bsm_identity=AnyChild)
12 maj 2016 23:46:29,673: [TBSMTreeRulesHeartbeat][pool-7-thread-87]Parser log: (PollerName=TBSMTreeRuleHeartbeatService, randNum=8997, timestamp=23:46:29, bsm_identity=AnyChild)
12 maj 2016 23:51:29,674: [TBSMTreeRulesHeartbeat][pool-7-thread-91]Parser log: (PollerName=TBSMTreeRuleHeartbeatService, randNum=73560, timestamp=23:51:29, bsm_identity=AnyChild)
12 maj 2016 23:56:29,700: [TBSMTreeRulesHeartbeat][pool-7-thread-91]Parser log: (PollerName=TBSMTreeRuleHeartbeatService, randNum=60770, timestamp=23:56:29, bsm_identity=AnyChild)
13 maj 2016 00:01:29,724: [TBSMTreeRulesHeartbeat][pool-7-thread-92]Parser log: (PollerName=TBSMTreeRuleHeartbeatService, randNum=55928, timestamp=00:01:29, bsm_identity=AnyChild)


Let's then create the rules. I will have two rules again, one numerical and one text. The numerical rule will have TBSMTreeRuleHeartbeatService as the Data Feed, the bsm_identity field selected as the service event identifier field, and the randNum field as my output:

Figure 12. Screenshot of numr_heartbeat_randNum rule


Make note. Every time you add another field to the policy activated by your service, make sure that new field is mapped to the right data type in the Customize Fields form. You will need to add that field first:

Figure 13. Screenshot of CustomizedFields form


And the second rule looks as follows; this time it is a text rule and it returns the timestamp value:

Figure 14. Screenshot of txtr_heartbeat_lasttime rule


Formula rules

The last rules I'll create will be three policy-based formula text rules. Each of them will attach to rules created previously and "spy" on their activity. Let's see the first example:

Figure 15. Screenshot of nfr_triggered_by_events rule


This rule will use a policy and will be a text rule. It is important to mark those fields before continuing; afterwards the Text Rule field greys out and becomes inactive. After ticking both fields, I click the Edit policy button. All three rules will look the same at this level, hence I won't include all three screenshots, as only their names differ. I'll create a separate IPL policy for each of them. Here's the mapping:

Rule name                   Policy name
nfr_triggered_by_events     p_triggered_by_events
nfr_triggered_by_fetcher    p_triggered_by_fetcher
nfr_triggered_by_service    p_triggered_by_service
Each of those three policies will look similar; each just watches different rules created so far. The p_triggered_by_events policy will do this:

// Trigger numr_event_lastseverity
// Trigger txtr_event_lastsummary

Seconds = GetDate();
Time = LocalTime(Seconds, "HH:mm:ss");

Status = Time;
log("TestInstance triggered by SimNet events at "+Time);
log("Output value of numr_event_lastseverity: "+InstanceNode.numr_event_lastseverity.Value);
log("Output value of txtr_event_lastsummary: "+InstanceNode.txtr_event_lastsummary.Value);

Policy p_triggered_by_fetcher will do this:

// Trigger numr_fetcher_randnum

Seconds = GetDate();
Time = LocalTime(Seconds, "HH:mm:ss");
Status = Time;
log("TestInstance triggered by HeartbeatFetcher at "+Time);
log("Output value of numr_fetcher_randnum: "+InstanceNode.numr_fetcher_randnum.Value);

And policy p_triggered_by_service this:

// Trigger numr_heartbeat_randnum
// Trigger txtr_heartbeat_lasttime

Seconds = GetDate();
Time = LocalTime(Seconds, "HH:mm:ss");
Status = Time;
log("TestInstance triggered by TBSMHeartbeatService at "+Time);
log("Output value of numr_heartbeat_randnum: "+InstanceNode.numr_heartbeat_randnum.Value);
log("Output value of txtr_heartbeat_lasttime: "+InstanceNode.txtr_heartbeat_lasttime.Value);

You may notice that each policy starts with a comment section. This is important: this is how the formula rules get triggered. It is enough to mention another rule by name in a comment to trigger your formula every time that referred rule returns a new output value. This is why we have the randNum-related rules in every formula: those rules are designed to return a new value every time they run. Only the first rule isn't like that, but I assume it will trigger every time the combination of the Summary, AlertGroup and AlertKey field values in the source event changes.

The trigger numerical or text rules are also mentioned later, when these policies call them and obtain their output values, e.g. to put those values into the log file. But that is not necessary to trigger my formulas; I log those trigger rules' outputs for troubleshooting purposes only.

The purpose of these three policies and three formulas is to report the time when the numerical or text rules last worked.

Below you can see an example of one of the policies in actual form.

Figure 16. Screenshot of one of the policies text


Testing the triggers

Now it's time to test the trigger rules and the triggers, and to troubleshoot if needed.

Triggering rules in normal operations

To do that, we will need a service instance implementing our newly created template. I call it TestInstance and this is its base configuration:

Figure 17. Screenshot of configuration of service instance TestInstance – templates


It is important to make sure that the right event identifiers are selected in the Identification Fields tab. I need to remember what bsm_identity values I set in all the rules: AnyChild (the policy and the fetcher) and TestInstance (the SimNet probe).

Figure 18. Screenshot of configuration of service instance TestInstance – identifiers


Make note. In real life your instance will have its individual identifiers, like a TADDM GUID or a ServiceNow sys_id. It is important to find a match between that value and the affecting events or matching KPIs, and, if necessary, to define new identifiers that will ensure such a match.

Let's see if it works in general. I created a scorecard and a page to present all values of my new instance. I'll also put fragments of my formula-related policy logs on top, to see whether the data returned by the policies and the timestamps match:

Figure 19. Screenshot of the scorecard with policy logs on top


Let's take a closer look at the first section. The same event arrived just once, but since the formula is triggered by two rules, it was triggered twice in a row. In general, the last event arrived at 20:27:00, its severity was 4 (major) and the summary was about a Link Down. Both rules, numr_event_lastseverity and txtr_event_lastsummary, triggered my formula correctly.

The next section is about the fetcher: the latest random number is 16589,861, and the rule numr_fetcher_randnum triggered my formula correctly.

The last section covers the policy-activated rules and formula. This time I have two rules again, and they both triggered the formula correctly; the last run was at 20:26:30. I have two different randNum values in the two runs. This is caused by referring to the numerical rule twice in my formula policy.

Triggering rules after TBSM server restart

I'll now show a problem that TBSM has with rules that aren't triggered by anything. Like I said in the previous chapters, TBSM needs rules to be triggered every now and then, and also needs the value to change between triggers, in order to return the value again.

This causes some issues in TBSM server restart situations. If a value hasn't changed before a server restart and is still the same after the restart, TBSM may be unable to display or return it correctly if the rule used to return it is not triggered. A server restart clears TBSM memory, so no rule output values are preserved across the restart.

Here’s an example. I’ll create one new formula rule with this policy in my test template:

Status = ServiceInstance.NUMCHILDREN;
log("Service instance "+ServiceInstance.SERVICEINSTANCENAME+" ("+ServiceInstance.DISPLAYNAME+") has children: "+Status);

Here’s the rule itself:

Figure 20. Screenshot of nfr_numchildren rule configuration


As a next step, I add one more column to my scorecard to show the output of the newly created rule. I also created three service instances and made them children of the TestInstance instance.


Figure 21. Screenshot of the scorecard shows 3 children count

Also my formula policy log will return number 3:

13 maj 2016 12:17:56,664: [p_numchildren][pool-7-thread-34 [TransBlockRunner-2]]Parser log: Service instance TestInstance (TestInstance) has children: 3

Now, if I simply restart the TBSM server, the value shown will be 0 and I will see no new entry in the log:


Figure 22. Screenshot of the scorecard after server restart shows 0 children

I can change this situation by taking one of three actions:

  1. Adding new or removing old child instances from TestInstance
  2. Modifying the formula policy
  3. Introducing a trigger to the formula policy

However, the first two options don't protect me from another server restart.

Let's say I add another child instance. This is how the scorecard will look:

Before the restart: TriggeringTBSMRules_23
After the restart: TriggeringTBSMRules_24

Or I may want to modify my rule. After saving my changes, the value will display correctly. However another server restart will reset it back to 0 again anyway.

So let’s say I change my policy to this:

Status = ServiceInstance.NUMCHILDREN;
log("Service instance "+ServiceInstance.SERVICEINSTANCENAME+" ("+ServiceInstance.DISPLAYNAME+") has children: "+Status);
log("Service instance ID: "+ServiceInstance.SERVICEINSTANCEID);

And my policy log now contains two entries per run:

13 maj 2016 12:56:07,023: [p_numchildren][pool-7-thread-4 [TransBlockRunner-1]]Parser log: Service instance TestInstance (TestInstance) has children: 4
13 maj 2016 12:56:07,023: [p_numchildren][pool-7-thread-4 [TransBlockRunner-1]]Parser log: Service instance ID: 163

But the situation before and after the restart is the same:

Before the restart: TriggeringTBSMRules_23
After the restart: TriggeringTBSMRules_24

It's not a frequent situation, though. If your rules are normally event-triggered or data fetcher-triggered, you can expect frequent updates to your output values even after your TBSM server restarts. But in case you want to present an output value from a rule that normally is not triggered, make sure you include a reference to a trigger in your rule. Let's use the triggers we configured previously in my new formula policy:

// Trigger by numr_fetcher_randnum
// Trigger by numr_heartbeat_randnum

Status = ServiceInstance.NUMCHILDREN;
log("Service instance "+ServiceInstance.SERVICEINSTANCENAME+" ("+ServiceInstance.DISPLAYNAME+") has children: "+Status);
log("Service instance ID: "+ServiceInstance.SERVICEINSTANCEID);

By following the policy log, you can already notice that there are now multiple entries: precisely as many as the number of times the formula was triggered by one of the trigger rules.

The first pair of entries was added after saving the rule. The next two pairs were added as a result of the triggers working fine:

13 maj 2016 13:22:36,837: [p_numchildren][pool-7-thread-3 [TransBlockRunner-1]]Parser log: Service instance TestInstance (TestInstance) has children: 4
13 maj 2016 13:22:36,837: [p_numchildren][pool-7-thread-3 [TransBlockRunner-1]]Parser log: Service instance ID: 163
13 maj 2016 13:24:12,833: [p_numchildren][pool-7-thread-3 [TransBlockRunner-1]]Parser log: Service instance TestInstance (TestInstance) has children: 4
13 maj 2016 13:24:12,833: [p_numchildren][pool-7-thread-3 [TransBlockRunner-1]]Parser log: Service instance ID: 163
13 maj 2016 13:24:18,465: [p_numchildren][pool-7-thread-3 [TransBlockRunner-1]]Parser log: Service instance TestInstance (TestInstance) has children: 4
13 maj 2016 13:24:18,465: [p_numchildren][pool-7-thread-3 [TransBlockRunner-1]]Parser log: Service instance ID: 163

Let's make the final test, a TBSM server restart:

Before the restart: TriggeringTBSMRules_23
After the restart: TriggeringTBSMRules_23

It seems to be working fine now!

This exercise ends my material for tonight. I'll continue in another post on triggering the status propagation rules and numerical aggregation rules. See you soon!



Unique Grand Children Count in TBSM

May 10th, 2016


Tivoli Business Service Manager can calculate amazing things for you, whenever you need them. This is thanks to the powerful rules engine that is a key part of TBSM, as well as the Netcool/Impact policy engine running under the hood of every TBSM edition. You can later present your calculation results on a dashboard or in reports, depending on whether you need a real-time scorecard or historical KPI reports.

In this article, I'll show how you can use the TBSM rules engine to calculate a unique children count for a grandparent-level service instance. It is something that isn't really documented at all, and the case isn't very popular, but should you need it, you can find it here in this material.

In this material I will use the following hierarchy of three templates:

  • T_NetworkSite – acting as grandparent template level
  • T_Interface – acting as parent template level
  • T_Router – acting as child template level

An Interface as a parent to a Router? – you may ask. It is not really what's being promoted in various documents, and definitely not something documented here:

Well, this very much depends on what you want to present in TBSM dashboards and how, so it depends on what your business service is about. The example in the article I mention above concentrates more on VPN services:


Figure 1. Source:

In my example, I'm concentrating on Layer 2 connectivity; in other words, I cannot connect to my network site, or it is unavailable, if all router interfaces are down. All router interfaces can be down while the routers themselves are up – it doesn't matter, it means the same thing for the service: an outage. And if whole routers get switched off, the interfaces will automatically be switched off too, so my network site will be unavailable as well.

Figure 2. Templates hierarchy used in this material

The desired effect is the following:

  • There is one grandparent KrakowSite
  • There are 2 routers in total
  • There are 4 interfaces in total, 2 per each of the routers

Figure 3. Access to Krakow network site – business service sample diagram

In other words, KrakowSite should report 4 installed interfaces but only 2 router devices. The scorecard below is what we will be building during this exercise.

Figure 4. Target scorecard to build

Before I continue, I need to introduce the Heartbeat and PassToTBSM concepts.

PassToTBSM and Heartbeat

PassToTBSM is an Impact function that can be used to send any data from a Netcool/Impact policy straight to TBSM. It doesn't have to be the same Impact instance that runs jointly with TBSM on the same server; it can be a standalone Impact server too (but I haven't tried that). It can also be either Impact 6.1 or Impact 7.1 (the latter was announced not to have PassToTBSM, but I hear it's still there; not tested by myself though).

A policy that sends data to TBSM with the PassToTBSM function can be as follows:

Seconds = GetDate();
Time = LocalTime(Seconds, "HH:mm:ss");
ev.timestamp = String(Time);
ev.bsm_identity = "AnyChild";
// The object must actually be sent to TBSM to trigger the rules
PassToTBSM(ev);

So we construct an IPL policy in which we take the current time (it is important to have at least one changing value; I explain why in another article on this blog) and specify the service instance identifier that the affected service instance is expected to have defined for its incoming status or numerical rules. Because I'm going to affect two routers, RouterA and RouterB, I specify something generic like "AnyChild". I could also send two events to TBSM, one with ev.bsm_identity="RouterA" and the other with ev.bsm_identity="RouterB". In the case of large implementations, it is easier to specify something generic like AnyChild and add such an identifier to every service instance automatically during an import process via the SCR API/XMLtoolkit.
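The per-router alternative mentioned above, one event per concrete identifier instead of a generic one, could be sketched like this (the array literal with the two router names is my own illustration, not code from this article):

// Sketch: send one heartbeat event per concrete router identifier.
// The routers array is a made-up illustration of the alternative above.
Seconds = GetDate();
Time = LocalTime(Seconds, "HH:mm:ss");
routers = {"RouterA", "RouterB"};
i = 0;
while (i < Length(routers)) {
    ev = NewObject();
    ev.timestamp = String(Time);
    ev.bsm_identity = routers[i];
    PassToTBSM(ev);
    i = i + 1;
}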
Let me call the policy TBSMTreeRulesHeartbeat.

Such a policy now needs to be called by an Impact service:

Figure 5. Impact service to run the heartbeat policy

Make note. Alternatively, a data fetcher could be used, which can also be scheduled to run every 30 seconds, or even once a day at 12:00 AM or another time. However, I wanted to show the PassToTBSM function in action, and in large solutions you may not want to involve an SQL SELECT statement against a database simply to run such a heartbeat function. You could also create a policy fetcher, but that requires more skills, since there's no UI for it in TBSM.

Make note. Such a service doesn't really need to be added to any Impact project.

Now, in order to use such a service and policy in a numerical rule in TBSM, you do two things: you set that service as the data source and you set the field mapping. I have created my HeartbeatRule in TBSM with the following settings:

Figure 6. Numerical Rule with heartbeat service as data feed

Then in Customize Fields form you should have:

Figure 7. Custom fields mapping

Save this rule to your LEAF template:

Figure 8. Heartbeat rule in the LEAF template definition

And the last thing: don't forget to make sure your service instances have the "AnyChild" instance identifier specified:

Figure 9. Adding new instance ID – AnyChild

What is this all for, you may ask?

The answer is: we will be calculating the unique number of grandchildren in one of the TBSM functions. All functions in TBSM need a trigger, i.e. an input value that changes, in order to return a fresh value. If the input value doesn't change, you won't see a new value on the output. The output can be the same value, but your rule won't work if you don't trigger it from outside somehow.

Example? Sure.

On the next level in the templates hierarchy there will be a NumberOfRouters rule defined (and the heartbeat rule too):

Figure 10. T_Interface template's rules list

Let’s see inside the NumberOfRouters rule:

Figure 11. NumberOfRouters rule definition

This rule will return the output value of the function NumberOfAllChildren, defined in the policy NumericalAttributeFunctions.ipl, every time the HeartbeatRule triggers it.

In other words, the number of routers below the interfaces won't change in the output of this function, even if it really changes (grows or shrinks), unless the rule is kicked again. So you need that extra rule on the children level, like HeartbeatRule, running periodically every 30 seconds and returning a fresh timestamp every time, to ensure a different output value on every run.

Why so much hassle, you may say?

Why not use ServiceInstance.NUMCHILDREN inside a policy-based numerical formula?
Well, first of all, a numerical formula is also a rule that needs a trigger to run. Every rule in TBSM needs a trigger to run; I could dedicate a special post to that topic.
Second of all, I do use ServiceInstance.NUMCHILDREN – check out my policy function:

function NumberOfAllChildren(ChildrenStatusArray, AllChildrenArray, ServiceInstance, Status) {
    Status = ServiceInstance.NUMCHILDREN;
}

So this policy, or rather this function, will return the NUMCHILDREN value any time you trigger the rule.

The main reason for all that hassle is that, unfortunately, you cannot use NUMCHILDREN directly on a scorecard; you can only return it through rules. And rules need a trigger. NUMCHILDREN isn't an additional attribute either, which could be shown directly in a JazzSM dashboard.

Is it clear? I know, it's a bit weird, but only at first sight.

You may also wonder: why am I using ServiceInstance.NUMCHILDREN? Is there any other attribute that returns the same value? And why am I using TIP, not JazzSM, in my examples at all? The answer is: there's no additional attribute that you could return straight in JazzSM, without wrapping it in a rule, that gives anything like the number of children (and you cannot return an additional attribute without packing it into a rule in TIP either). So you have two choices:

  1. Use ServiceInstance object’s field NUMCHILDREN – see above
  2. Use a policy that will iterate through an array of children objects of your service instance and return the array’s length.

As you can see, it is still a policy either way, so a numerical aggregation rule or a numerical formula rule must still be used. There's really no other way: rules are your way, and you need to trigger them.
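Option 2 could be sketched as follows (a hypothetical function of my own; I'm assuming, based on the NumberOfAllChildren signature shown above, that AllChildrenArray holds one element per child, which this article doesn't confirm):

// Sketch of option 2: derive the children count from the array length
// instead of reading ServiceInstance.NUMCHILDREN. The assumption that
// AllChildrenArray has one element per child is mine, not the article's.
function NumberOfAllChildrenByIteration(ChildrenStatusArray, AllChildrenArray, ServiceInstance, Status) {
    Status = Length(AllChildrenArray);
}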


Recalculate correct number of objects after server restart

There's an alternative to the heartbeat rule: from TBSM 6.1.1 FP2 you can run this policy and associate it with server start, run it manually from time to time, or schedule it with an Impact service. There are actually two policies: one for all nodes and the other just for leaf nodes.

Leaf nodes only

log("Recalc Leaf Node Only. Policy Start.");
GetByFilter(Type, Filter, false);
log("Recalc Leaf Node Only. Policy Finish.");

All nodes

log("Recalc All Nodes. Policy Start.");
GetByFilter(Type, Filter, false);
log("Recalc All Nodes. Policy Finish.");

This alternative is documented here:

The difference between my heartbeat solution and the policy documented above is that my heartbeat function is selective: I decide which elements of the service tree get recalculated (not just leaves, but also not the entire service tree) and when (not just during a restart, but every now and then). This matters because the number of children on some intermediate level may change independently of changes on the leaf level, and I still need to trigger that change. At the same time, recalculating the whole tree is a real effort for TBSM, especially when I have 100k instances in my service tree.

That’s why I prefer to make it selective, so I use Heartbeat concept.


Unique grandchildren count rule

Now that we have the children count rule created and triggered, it's time for the unique grandchildren count rule.
What's the difference?
It's simple: you don't want to just sum your children's children counts, because every shared Interface gets counted once per Router parent, which gives you 4 while the true number of unique interfaces is just 2.
So you need a smart Impact policy that will calculate that for you.
Since we’re clear on what rules need to be created on the Router level and the Interfaces level, it’s time to present rules on the NetworkSite template level:

Figure 12. Rules defined inside the T_NetworkSite template

The NumberOfInterfaces rule simply calculates the number of interfaces below the network site; inside that rule, the same NumberOfAllChildren function is called from NumericAttributeFunctions.ipl. The trigger should again be a heartbeat rule, since the number of interfaces inside the site may change independently. As you saw above, I defined a heartbeat rule inside the T_Interface template and called it HeartbeatRuleIfc.

The more interesting rule is UniqueGrandChildren, which runs another function from the NumericAttributeFunctions policy, called NumberOfUniqueGrandChildren:

function NumberOfUniqueGrandChildren(ChildrenStatusArray, AllChildrenArray, ServiceInstance, Status) {
    i = 0;
    uniquegrandchildrenarray = {};
    log("MP: " + ServiceInstance);
    while (i < length(ServiceInstance.CHILDINSTANCEBEANS)) {
        child = ServiceInstance.CHILDINSTANCEBEANS[i];
        log("Child " + child.DISPLAYNAME + " of grand parent " + ServiceInstance.DISPLAYNAME + " was found.");
        j = 0;
        while (j < length(child.CHILDINSTANCEBEANS)) {
            grandchild = child.CHILDINSTANCEBEANS[j];
            log("Child " + grandchild.DISPLAYNAME + " of child " + child.DISPLAYNAME + " was found.");
            // Test whether the currently analyzed grandchild has already occurred
            k = 0;
            occurence = 0;
            while (k < length(uniquegrandchildrenarray)) {
                if (uniquegrandchildrenarray[k].SERVICEINSTANCEID == grandchild.SERVICEINSTANCEID) {
                    // If yes, mark occurence = 1 (true)
                    occurence = 1;
                    // k = length(uniquegrandchildrenarray); // uncomment this line to exit early for large child arrays
                    log("Duplicate found: " + uniquegrandchildrenarray[k].SERVICEINSTANCEID + " and " + grandchild.SERVICEINSTANCEID + ". Skipping.");
                }
                k = k + 1;
            }
            if (occurence == 0) {
                uniquegrandchildrenarray = uniquegrandchildrenarray + grandchild;
                log("Unique grand child found: " + grandchild.DISPLAYNAME + ". Added to the list.");
            }
            j = j + 1;
        }
        i = i + 1;
    }
    Status = length(uniquegrandchildrenarray);
    log("Grand parent " + ServiceInstance.DISPLAYNAME + " has # unique grand children " + Status);
}

So basically the function traverses the service tree two levels down, to the grandchildren level, and tracks each grandchild by its service instance ID. Every reoccurring ID is skipped; every new ID adds an item to an array. The size of that array is the returned value.
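Conceptually, the deduplication the IPL function performs can be sketched in a few lines of Python (the dict shapes are illustrative, not TBSM APIs). IPL has no set type, which is why the policy scans an array instead:

```python
def unique_grandchildren_count(service_instance):
    """Count distinct grandchildren of a service instance.

    service_instance is a hypothetical dict with a 'children' list;
    each child carries its own 'children' list of dicts holding a
    'SERVICEINSTANCEID' (mirroring the IPL beans). An interface shared
    by several routers is counted only once."""
    seen = set()
    for child in service_instance["children"]:
        for grandchild in child["children"]:
            seen.add(grandchild["SERVICEINSTANCEID"])
    return len(seen)

# Two routers sharing the same two interfaces:
if1, if2 = {"SERVICEINSTANCEID": 1}, {"SERVICEINSTANCEID": 2}
site = {"children": [{"children": [if1, if2]}, {"children": [if1, if2]}]}
assert unique_grandchildren_count(site) == 2  # not 4
```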

Is it simple? Not quite, but it's probably one of those functions you implement once and use forever, so it's worth learning. Let's see the rule in the end:

Figure 13. NumberOfUniqueGrandChildren rule

So this is your desired effect:

Figure 14. UniqueGrandChildren count on the scorecard


I hope you like this type of small hint on how to achieve something useful in TBSM. If so, please comment and I'll try to post as many of this kind as I can. Thanks!

TBSM 6.1.1 FP4 released

April 29th, 2016

Fix pack 4 was released tonight, see the full list of APARs and improvements, also get the downloads here:

Please make note that there are a few manual actions to take, apart from installing the fix, to fully benefit from a few of the new features included, which are:

RADEVENTSTORE index creation – to prevent TBSM Event reader from hanging
Installation of the new right-click context menu item “Delete Service Instance”
SLAPrunePolicyActivator – Impact policy activator service to prune RADEVENTSTORE and 6 other tables in TBSM DB.
TIP version 3 installation – now certified for use with TBSM


Total Event Count in TBSM

April 26th, 2016


Tivoli Business Service Manager can calculate amazing things for you, if only you need them. This is thanks to the powerful rules engine at the core of TBSM, as well as the Netcool/Impact policy engine running just under the hood of every TBSM edition. You can later present your calculation results on a dashboard or in reports, whether you need a real-time scorecard or historical KPI reports.

In this article, I'll show how to calculate a total event count throughout a multi-level service tree. This is something TBSM doesn't do right after a fresh install, because it doesn't provide the right rules out of the box; then again, TBSM doesn't ship with a predefined service tree either, so in order to see this working you'd need to do both: add the rules to your service templates, and import or hand-create a service tree structure to test with.

In this material, I'll create a simple, multi-level service tree consisting of 3 levels of instances and use my own template T_Regions, but to repeat this exercise you can also simply reuse the SCR_ServiceComponentRawStatusTemplate template, which comes with every TBSM installation and is widely used in integrations with Tivoli Application Dependency Discovery Manager (TADDM). The key thing is that your template must:

  • Have at least 1 incoming status rule
  • Be in use across the whole service tree, so that all service instances on all levels of your service tree implement that template.

Figure 1. Template settings for this exercise

Figure 2. Incoming Status Rule body used in this exercise

Figure 3. Simple service tree used in this material

Make note. This document reimplements already existing functionality, namely calculating the total number of events on every service tree level, whose result is stored in the numRawEventsInt parameter. That parameter is visible as the last value in the RAD_prototype widget, typically used on Custom Canvases on TBSM dashboards created in Tivoli Integrated Portal. But that parameter value isn't accessible to numerical rules or policies for further processing.

Figure 4. numRawEventsInt value used on the RAD_prototype widget

The newest add-on to TBSM, the debug Spy tools, also offers a parameter for every service tree level, called Matching Events. While that value is correct too, it likewise isn't accessible from numerical rules or policies.

Make note. There is a BSM Accelerator template called BSMAccelerator_EventCount which was designed to present the correct number of events for every service instance; however, it was tailored to the BSM Accelerator's needs and service tree structure and doesn't scale to arbitrarily deep service trees. Some of the concepts introduced to support the BSM Accelerator package will nevertheless be covered in this document. If you want to read more, see this document:

Make note. TBSM 6.1.1 FP3 is a prerequisite for all rules described in this material to work correctly. However it is highly recommended to install Fix Pack 4 or higher for ensuring the latest improvements.

What is the multi-level events count?

TBSM runs an Impact service called TBSMOMNIbusEventReader, which comes with the product out of the box and is responsible for reading events from Netcool/OMNIbus on a regular basis (every 3000 milliseconds by default) and finding events to be processed by TBSM using a special predefined filter.

Here’s the default filter:

(Class <> 12000) AND (Type <> 2) AND ((Severity <> RAD_RawInputLastValue) or (RAD_FunctionType = '')) AND (RAD_SeenByTBSM = 0) AND (BSM_Identity <> '')

All events which pass that filter get processed further by TBSM service template rules, specifically by a special kind called Incoming Status Rules. The most typical incoming status rule, predefined inside the SCR_ServiceComponentRawStatusTemplate template and called ComponentRawEventStatusRule, has a precondition called a discriminator, which filters out all events (previously let in by the event reader) that don't have one of the following classes:

  • TPC Rules(89200),
  • IBM Tivoli Monitoring Agent(87723),
  • Predictive Events(89300),
  • IBM Tivoli Monitoring(87722),
  • Default Class(0),
  • TME10tecad(6601),
  • Tivoli Application Dependency Discovery Manager(87721),
  • Precision [Start](8000),
  • MTTrapd(300),
  • Precision [End](8049)

Make note. In my example my Incoming Status rule will simply expect just Default Class (0) in all my test events.

This is not the end; there's one more filter. It is called the event identification field, and by default TBSM looks for its value in the event field called BSM_Identity. The value expected in that field comes from each service instance's event identifier, which by default is the same as the service instance name. So the event identifiers for my simple service tree are the following:

Service instance name    Event identifier
Europe                   Europe
Poland                   Poland
Malopolska               Malopolska

I will not discuss in this material how to maintain event identifiers, how many event identifiers you can have, or how to set up event identifiers in the XMLtoolkit configuration files (if you're interested in those topics, please see my private blog entry: ). I will also not discuss here how event severity may affect service instance status; I go with the defaults in my example and will not focus on that area in this material at this time.

To sum it up: there are 3 filters your event has to pass before it affects your service instance:

  1. The TBSMOMNIbusEventReader’s filter
  2. The Incoming Status Rule discriminator / event class filter
  3. The event identifier

If your event made it through all the filters, you can call it a service instance affecting event.

It doesn't necessarily mean your event changed your service instance's status; it only means that the event was processed by the Incoming Status Rule implemented in your service instance's template. If you use TBSM 6.1.1 FP4, you can use the Service Model Spy tool to see that your Incoming Status Rule updated various attributes, like Matching Events (a number), Max Event Status (the event's severity), and a timestamp of when the rule processed the event.

The Matching Events parameter is what I’ll be calling in this material the EventCount.
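The three-stage gatekeeping described above can be sketched in Python (a conceptual illustration only: the reader filter is simplified, and the dict field names mirror the event fields discussed, not a real TBSM API):

```python
def passes_reader_filter(event):
    # Stage 1: simplified sketch of the TBSMOMNIbusEventReader filter
    return (event.get("Class") != 12000 and event.get("Type") != 2
            and event.get("RAD_SeenByTBSM", 0) == 0
            and event.get("BSM_Identity", "") != "")

def passes_discriminator(event, allowed_classes=frozenset({0})):
    # Stage 2: incoming status rule discriminator (here: Default Class only)
    return event.get("Class") in allowed_classes

def matches_identifier(event, event_identifiers):
    # Stage 3: BSM_Identity must match a known service instance's identifier
    return event.get("BSM_Identity") in event_identifiers

identifiers = {"Europe", "Poland", "Malopolska"}
event = {"Class": 0, "Type": 1, "RAD_SeenByTBSM": 0, "BSM_Identity": "Poland"}
affecting = (passes_reader_filter(event)
             and passes_discriminator(event)
             and matches_identifier(event, identifiers))
assert affecting  # a "service instance affecting event"
```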

Now, why Multilevel event count?

Every service instance can have its own individual EventCount. Every level of the service tree can contain more than one service instance, and the best way to sum them up is to calculate their sum on the parent level. The parent service instance may itself implement a template with an Incoming Status Rule, and therefore have its own individual EventCount. And that parent can be one of many parents, so the best way of summing those up is to calculate a TotalEventCount on the grandparent level. And so on. The multi-level event count is thus a feature that calculates the total number of events processed by TBSM across the whole service tree.
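The level-by-level summation just described amounts to a recursive sum over the tree. Here's a tiny Python sketch (the dict shapes are hypothetical, not TBSM objects):

```python
def total_event_count(instance):
    """Sum OwnEvents across an entire (sub)tree.

    instance is an illustrative dict: {'own': <int>, 'children': [...]}.
    Each node contributes its own event count plus the totals of its
    children, which is what the multi-level event count computes from
    the leaves upward."""
    return instance["own"] + sum(total_event_count(c)
                                 for c in instance["children"])

tree = {"own": 1, "children": [            # e.g. Europe
    {"own": 2, "children": [               # e.g. Poland
        {"own": 2, "children": []}]}]}     # e.g. Malopolska
assert total_event_count(tree) == 5
```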

Why would you need it? There are several use cases possible:

  • Your service tree consistency check and verification – in a development phase, to see if all levels of your service tree get processed correctly
  • Statistics – to see the current and true load on TBSM by source, class, alert type, any event field in order to perform some further analysis of event storms and their reasons
  • To monitor the operations – for example to compare total events count to total acknowledged events count to total count of events escalated by opening an incident etc.
  • To monitor service component quality – especially important when service components are managed or provided by a 3rd-party provider – you can assess how much trouble they give your company or your operations team

Once the use case is agreed, you may want to use this material to start collecting your Total event counts in order to present them on a dashboard or in a report. Let me now explain to you how to set it up.



As the first step, let's make sure I'm collecting the event count for each of my service tree elements. Let me create my new rule: OwnEvents.

Make note. This step has a prerequisite: I need to have my Incoming Status rule already created.

This is perhaps not well documented, but every Incoming Status Rule can be used in a Numerical Formula rule to get the number of events processed. It is documented in this technote:

So let me do exactly what the technote does. This is my numerical formula, a rule called OwnEvents, which returns only the non-clear event count via the Incoming Status Rule's parameter NumEventsSevGE2 (available by default since TBSM 6.1.1 FP1). Whenever my Incoming Status Rule processes another event with severity 1 or higher, the output of my numerical formula refreshes and increases by 1.

Figure 5. OwnEvents rule settings

And on my scorecard:

Figure 6. OwnEvents in a scorecard

Let’s send a test event to the last level now:

Figure 7. Sending a test event

Figure 8. Test event settings

Figure 9. OwnEvents after sending a test event

As you can see, the event's severity was passed up through the whole service tree; that is why the icon in the Events column changed color to purple from the bottom level right up to the top one.

After sending a critical event to the 2nd level, the icons from the 2nd level up to the top one changed their color to red.

Figure 10. OwnEvents after sending a 2nd test event

Make note. In order to perform this exercise, I haven’t created a status propagation rule. And I will not!

Take a look at the OwnEvents column. Even though status was propagated through the service tree from bottom to top, the OwnEvents rule worked for every level individually. Europe's Events icon shows a bad status, but its OwnEvents column shows that 0 events affected that level.

Now, let’s try to make every level aware of events happening on the level below it.

Prepare such a policy:

/* trigger_totalevents */
log("Triggered: " + ServiceInstance.STATEMODELNODE.trigger_totalevents.Value);
Status = 0;

si = ServiceInstance.SERVICEINSTANCENAME + " (" + ServiceInstance.DISPLAYNAME + ")";

if (ServiceInstance.STATEMODELNODE.count_ownevents.Value <> NULL) {
    Status = Int(ServiceInstance.STATEMODELNODE.count_ownevents.Value);
}

log("Service instance: " + si + " own events count: " + Status);

i = 0;
while (ServiceInstance.CHILDINSTANCEBEANS[i] <> NULL) {

    child = ServiceInstance.CHILDINSTANCEBEANS[i];
    ci = child.SERVICEINSTANCENAME + " (" + child.DISPLAYNAME + ")";

    if (child.STATEMODELNODE.count_totalevents.Value <> NULL) {
        // Non-leaf child: it already aggregates its whole subtree
        grandChildEvents = Int(child.STATEMODELNODE.count_totalevents.Value);
        log("Service instance: " + si + ", child: " + ci + " children events: " + grandChildEvents);
        Status = Status + grandChildEvents;
    } else {
        // Leaf child: fall back to its own event count
        childOwnEvents = 0;
        if (child.STATEMODELNODE.count_ownevents.Value <> NULL) {
            childOwnEvents = Int(child.STATEMODELNODE.count_ownevents.Value);
        }
        log("Service instance: " + si + ", child: " + ci + " own events: " + childOwnEvents);
        Status = Status + childOwnEvents;
    }

    i = i + 1;
}

log("Service instance: " + si + " total events count: " + Status);

I called this policy count_totalevents_policy_1 and saved it within a numerical formula rule called count_totalevents.

Figure 11. TotalEvents rule settings

At the same time, create another rule, a numerical aggregation rule, in which you point to the just-created rule within the same template. Make sure you name this rule exactly the same way as indicated in the header of the policy in the numerical formula created a moment ago.

Figure 12. TriggerTotalEvents rule settings

By the end you should have the following list of rules in your template:

Figure 13. T_Regions template complete rules set

Make note. After creating a template rule that points to the same template as a child template, the template will disappear from the templates list in the Service Navigator portlet. To fix this, add the template to any other template by associating it via any type of status propagation rule:

Figure 14. T_Regions template associated to templateFinder

And this is the result that should appear in your scorecard at the end:

Figure 15. TotalEvents column in a scorecard

It looks like the concept works fine. Let’s try it further. Let’s send another event from every level, starting from Malopolska to Poland and to Europe.

Figure 16. TotalEvents column after sending more test events

It looks correct: every level's OwnEvents count increased by 1 and I have 5 events in total in the entire tree – 2 on the leaf, another 2 in the middle and 1 on the root level.

Let's add a new level below Malopolska and call it Krakow. This will simulate expanding the service tree, e.g. after a fresh import from TADDM or a CMDB.

Figure 17. OwnEvents and TotalEvents after adding a new child service

Let’s now send a new event, Severity 3 to Krakow:

Figure 18. OwnEvents and TotalEvents after sending a test event to the new child service

The new event affected Krakow and was correctly included in the TotalEvents calculations on all levels. Let's now create one level above them all, called Earth:

Figure 19. OwnEvents and TotalEvents after adding a new root service

Adding Earth didn't change the TotalEvents count, of course, but the current maximum was reflected on the new top/root level. Let's send another event to Poland:

Figure 20. OwnEvents and TotalEvents after sending test events to the new root service

The total event count increased by 1 again. Only Europe’s OwnEvents column value increased by 1.

Let’s now remove Krakow from the Leaf level to see if the TotalEvents count will decrease by 1 now:

Figure 21. OwnEvents and TotalEvents after removing the child service from the tree

So it is correct again: after removing Krakow with its 1 event, the overall TotalEvents count dropped by 1 too and now equals 6.

This is it; if you like this post, let me know, also in case of questions.

Take care to the next one!


BSM and productivity

February 16th, 2016

Have you ever had a problem with productivity at work? I'm sure you have. A bad day, a malfunctioning air conditioner, your boss's bad day, maybe some private situation at home. How about your IT tools? Right – your desktop computer's slowness, a flaky network, your mail client blocked because your mailbox reached its tiny quota, your support team not responding to your requests immediately. Your dependencies. You have a pretty Gantt chart for your project, created in your favourite PM tool, and it's all for nothing if something is impacting your productivity.

Productivity is – I dare say – everything in your work. It's the key. You want your work done. There's always something to do, new tasks and new projects, so you'd better get them done as soon as possible. There's life outside work too – your family and friends, your hobbies, your free time, your duties at home, your vacations. You want to deal with all the weaknesses you have, all the dependencies you don't control and all the tasks you simply can do but that need time, in the best manner and quickly. You'll feel passion for some of your tasks, a sense of mission or duty about others, and something less positive about the rest – but no matter what you feel, you're paid to do your job. And get the bill.

Life is short. Life is quick. Life is brutal. You've got a high-priority task to do – you won't feel much interest in anything else. You won't admire the new laptop you've been given as an upgrade from your old box until you're done – who would want to be stuck mid-migration between two computers: one too new to have all the applications you need yet, and the other too old and slow to be used anymore. It's like knowing you're going to get your new car soon: the old scrap heap stinks in the garage, you kind of don't want to use the oldie, but the new one isn't really available yet. Stuck. That's your feeling.

I don't really want to spend much time discussing productivity issues and solutions here; I think I've made my point. We all know them. They're everywhere, including in BSM.

BSM gives you information on the applications you're responsible for (somehow; let's assume that's your role) before you otherwise would know something went bad. You want to know proactively. You want to know the root cause and have a solution: an automated script solving your application's issues, or a team to delegate to that solves the issue for good. A process. I mean – you don't have just one application to deal with, do you? There must be a pattern to follow every time a new issue arises! There must be a way to sort out those thousands of critical events! You don't want to open thousands of incidents!

How to make things easier? Is there a way? Is there a quick solution? Is there one for all times?

To be productive here, the answers must be yes. But how? How to make sure my applications are in good enough shape to serve my business services to users within SLA margins? Notice – I didn't say: for 99%. I said: good enough. So that people using them stay productive. So they can do the work they're paid for. On time. Good enough. Does such a KPI exist? If yes, how do you measure it? Well, every technology has its own methodology for measuring performance. But does a sum of satisfactory KPIs equal end-user satisfaction? What if the user complains? Will you say: it works for me?

BSM puts the priority on the services you offer. That's what counts. Is the client happy? Does he or she have any reason to complain about unmet SLA conditions? Is he or she stuck? Does your infrastructure offer any redundancy? Have you started using it? How reactive vs. proactive are you, and how long has your application's time to recovery been recently? How well have your application's outages fit into the allowed maintenance windows recently?

No wonder – miracles won't happen; you won't be 100% ready in 100% of cases. But then – can you explain what happened, and present a recovery plan, a solution or an improvement to avoid such issues in the future?

Can you do all of that? Can you do it with one tool? One UI at least? Do you keep your service or application model up to date so you know its gaps and weaknesses? Do you have your event catalog? An overall outage report – not resource-oriented, but oriented to the whole service? Are you service-oriented? Are you productive in learning about the status of the service you're responsible for?

This is BSM. It might be referred to as APM in some cases. Whatever you call it, it needs to address your use cases; and since there's no magical way to put all service dependencies together other than by creating a service model – you need one. You need a template if your model is something repeatable. You need to spend some time bringing it to life, and find a way to keep it up to date, so it works for you and informs you of all aspects of your applications quickly and precisely. So you stay productive.


TBSM A.D.2016

January 28th, 2016

Hi everyone and Happy New Year 2016!

I hope that everyone celebrated their Christmas or days off happily, with family and/or dear friends, and had time for other cool things, not just IT! 😉

Today, a quick look at the future of the IBM TBSM product in 2016. Nothing has changed – TBSM remains an important product for relationship-driven calculations of status and other metrics, and everything you need to use it is still to plug it into a CMDB-like source and a monitoring-like source (meaning KPI monitoring connects to TBSM, or event sources connect to OMNIbus, which connects to TBSM). The latest version of TBSM is still 6.1.1 FP3, with fix pack 4 expected in Q1.

What is important to keep saying (it has been true for a while now) is that you need more than just TBSM to make a successful BSM solution work. For a while now my projects have consisted of:

– TBSM as a core calculation engine

– CMDB – as the service model source; it can be TADDM 7.3 with its new modeling capabilities, or a full-blown CMDB like Maximo CCMDB, ServiceNow or HP uCMDB

– Netcool/Impact – which is part of TBSM anyway, but can also be a standalone server, especially for load balancing; e.g. it can be dedicated to event analytics and correlations, or to driving dashboard content

– JazzSM/DASH – for dashboards

– JazzSM/TCR/Cognos – for reports

– Netcool/OMNIbus – for events management and plugging in all other event sources like IBM Monitoring or Microsoft SCOM or IBM Network Manager or IBM TotalStorage Productivity Center or SAP Solution Manager or Oracle Enterprise Manager or others.

– any other source of measurements to take into account, it can be performance data source of any sort, up to your scope of work

TBSM will also need a self-monitoring capability, built in conjunction with but not limited to the KR9 agent.

That's your BSM, more or less. TBSM itself is powerful but needs partner products. And your ideas for service models, in the end.



TBSM multiple event identifiers

August 14th, 2015

This blog entry explains how to set up and use multiple event identifiers in IBM Tivoli Business Service Manager 6.1.1 FP3 and previous releases.



The official documentation doesn’t say much about multiple event identifiers.


So let's do a quick summary of what we can do:
– we can set multiple event identifiers in an incoming status rule
– we can set multiple event identifiers in EventIdentifierRules.xml or any artifact of category eventidentifiers in the XMLtoolkit


But how do we make sure they match, and how do we know they do?



Multiple event identifiers in incoming status rules are logically associated as if there were a logical AND operator between them:

So this definition of the CAM_FailedRequestsStatusRule_TDW rule should be understood as: all rows returned by CAM_RRT_SubTrans_DataFetcher will affect my service if the data returned in the following fields has the following values:



At the same time, if I have multiple values for the same label, it means OR.
For an incoming status rule like:


So my instance expects that the CAM_BSM_Identity_OMNI rule can catch all events with two alternative BSM_Identity values:

– MyApp#t#MyTrans#s#MySubTrans OR
– MySubTrans(5D21DD108FD43941892543AA0872D0EA)-cdm:process.Activity


If we’re looking at EventIdentifierRules.xml, there’s a concept of policies and rules, for example:

<Policy name="ITM6.x">
    <Rule name="ManagedSystemName">
        <Token keyword="ATTR" value="cdm:ManagedSystemName"/>
    </Rule>
</Policy>
<Mapping policy="ITM6.x" class="%" />

You can have many policies mapped onto many classes (which can be mapped onto many templates), and you can have many rules within every policy.


In our case, for ITCAM Tx subtransactions class we have one policy with many rules:

<Policy name="CAM_SubTransaction_Activity">
    <Rule name="CAM_GetBSM_Identity">
        <Token keyword="ATTR" value="cdm:ActivityName" />
        <condition operator="like" value="%#s#%" />
    </Rule>
    <Rule name="CAM_GetApplicationName" field="APPLICATION">
        <Relationship relationship="cdm:uses" />
        <Relationship relationship="cdm:federates" />
        <Token keyword="ATTR" value="cdm:ActivityName" />
    </Rule>
    <Rule name="CAM_GetTransactionName" field="TRANSACTIONS">
        <Relationship relationship="cdm:uses" />
        <Token keyword="ATTR" value="cdm:Label" />
    </Rule>
    <Rule name="CAM_GetSubTransactionName" field="SUBTRANSACTION">
        <Token keyword="ATTR" value="cdm:Label" />
        <condition operator="like" value="%#s#%" />
    </Rule>
</Policy>


and one mapping of that policy on a class:
<Mapping policy="CAM_SubTransaction_Activity" class="cdm:process.Activity" />


But one class can have many policies mapped onto it:
<Mapping policy="CAM_Transaction_Activity" class="cdm:process.Activity" />
<Mapping policy="CAM_SubTransaction_Activity" class="cdm:process.Activity" />
<Mapping policy="CAM_TT_Object" class="%" />


This means every mapping of a policy onto a class is an element of a logical OR operation, while every rule is an element of a logical AND operation with the other rules within the same policy.
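The OR-across-mappings / AND-across-rules semantics can be sketched in Python (a conceptual illustration, not the XMLtoolkit's actual evaluation code; the field names are hypothetical):

```python
def event_matches(event, policies):
    """Sketch of the mapping semantics: OR across policies mapped onto a
    class, AND across the rules within one policy.

    policies: a list of policies; each policy is a list of
    (field, expected_value) rules."""
    def policy_matches(rules):
        # AND: every rule in the policy must match the event
        return all(event.get(field) == value for field, value in rules)
    # OR: any one matching policy is enough
    return any(policy_matches(rules) for rules in policies)

policies = [
    [("APPLICATION", "MyApp"), ("SUBTRANSACTION", "MySubTrans")],  # policy 1
    [("BSM_Identity", "MySubTrans-cdm:process.Activity")],         # policy 2
]
event = {"APPLICATION": "MyApp", "SUBTRANSACTION": "MySubTrans"}
assert event_matches(event, policies)   # matched via policy 1 (AND inside it)
assert not event_matches({}, policies)  # no policy matches
```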

It is all conditional, though, because here comes an additional aspect: the field parameter of the <Rule> tag.


The field parameter.

The field parameter of a rule (within a policy in an eventidentifiers artifact) ensures that the rule is used only if a field with the same name is also specified as a service instance name field in the incoming status rule.


So there's no AND operator involving those rules (in a policy in EventIdentifierRules.xml) whose fields haven't been specified in the incoming status rule of the template.


On the other hand, no value will be assigned to the service instance name fields selected in the incoming status rule of Template A if the corresponding fields haven't been configured in the rules of the policies mapped onto the class (itself mapped onto Template A in CDM_TO_TBSM4x_MAP_Templates.xml) in EventIdentifierRules.xml.



There are two places you need to go to and configure your event identifiers:

  1. Templates and incoming status rules / numerical rules / text rules – the Service Instance Name Fields
  2. The XMLtoolkit artifact EventIdentifierRules.xml (or any custom artifact from the category eventidentifiers) – the field parameters in the rules defined within policies

Additionally, don't forget: the policies defined in your eventidentifiers artifacts must be mapped onto CDM or custom classes that have a mapping definition stored in CDM_TO_TBSM4x_MAP_Templates.xml and map onto the same template that has the incoming status rule (or numerical/text rule) with the Service Instance Name fields you want.

Otherwise your events, or KPIs from fetchers, won't affect your service tree elements; you will not show correct status or availability on dashboards, and your outage reports will miss data and generate false monthly results!

Drop me a note if you experience any trouble reading this article or applying what's written here, thanks!


Event identifiers – default BSM_Identity vs custom event identifiers.

August 14th, 2015

TBSM is still quite a powerful tool to track your service components' availability, calculate your service outage durations and report on them.

One of the key elements to make it all happen is mapping events or KPIs onto the service tree items.

There’s a default mechanism for that, or rather a default field called BSM_Identity.

I have made a finding that I wanted to share with you, as it’s not documented too well in the official documentation.

In short: BSM_Identity is an event identifier – the name of a field in an event or a SELECT statement that is expected to contain a value identifying one and only one service instance in the TBSM service tree. You can have one or more event identifiers (sometimes you need several at the same time in order to achieve uniqueness) and they can have names other than BSM_Identity, but BSM_Identity is the default one, as stated here:

But how does it work in real life?

We do have to specify the names of event identifiers in Incoming Status Rules. We can basically choose from all existing fields in alerts.status when the ObjectServer is the data feed, all fields returned by the SQL SELECT in a data fetcher, or all fields returned via a policy fetcher. There are no defaults; we have to choose one or more fields.

On the other hand, if we integrate with TADDM or CMDBs via the SCR API, we don’t have to specify any field names in the eventidentifiers artifacts if we don’t want to. In that case, the names will default to BSM_Identity.

It means that if we want to work with the defaults, we can only achieve it by ensuring that we’re:

a) adding a BSM_Identity field to the alerts.status table in the ObjectServer,

b) using BSM_Identity as an alias in the SQL SELECT statements of data fetchers,

c) using BSM_Identity as the label in the Impact DSA used in policy fetchers.
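As a sketch of points (a) and (b) – the table and column names below are hypothetical, only the BSM_Identity name itself matters:

```sql
-- (a) ObjectServer SQL: add the default identifier column to alerts.status
alter table alerts.status add column BSM_Identity varchar(255);

-- (b) Data fetcher SQL: alias the identifying column as BSM_Identity
--     (MYSCHEMA.KPI_VALUES and its columns are made up for illustration)
SELECT HOSTNAME AS BSM_Identity,
       KPI_NAME,
       KPI_VALUE
FROM   MYSCHEMA.KPI_VALUES;
```

With the alias in place, the fetcher output carries a BSM_Identity column and the default mapping can match each row to exactly one service instance.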

Simple? 😉

I can post more examples if people are interested.

TBSM, JazzSM and eDayTrader

March 1st, 2015

Technically today.

The eDayTrader sample application for testing TBSM and JazzSM has been around since before the JazzSM era, when we wanted to demonstrate TBSM, TADDM and ITM/ITCAM capabilities all together. The eDayTrader application was deployed onto WAS and DB2, and thanks to TADDM you could discover its components, import them into TBSM and then monitor them with ITM. At least I remember such a TADDM class from long ago. In any case, the eDayTrader J2EE application can give you quite a good sense of what the BSM methodology might be about when it comes to service modeling.

Today you can download the eDayTrader sample code for TBSM and, without needing the application deployed on your WAS, generate a service dependency tree in TBSM, run the predefined data fetchers to fetch metric data for the eDayTrader component service instances from a sample DB2 table, and present that data right on your JazzSM/DASH dashboards. This is very handy for two reasons. First, it helps with something I still, after all these years, find difficult for our users: understanding how TBSM and the BSM products (today that’s NOI – Netcool Operations Insight, TADDM, TBSM, JazzSM and OMNIbus/Monitoring) can really be used, meaning not only for monitoring applications but also for monitoring business metrics. It helps only to some extent, of course, as the eDayTrader templates and fetchers are taken out of context – I’m not sure they come from any real deployment, although they are surely based on real J2EE application examples – but it is still a good exercise to play around with in TBSM. Secondly, the eDayTrader sample code familiarizes users with JazzSM/DASH, the first really new dashboards editor of the post-TBSM-custom-canvas era. So I truly encourage all readers who haven’t had a chance to try that exercise to spend some time with it. You can download the package here, from developerWorks (you’ll need an IBM ID to do that):

The PDF describing what can be done with the contents in JazzSM after you deploy them to TBSM is here:


I found a few minor issues that might get in some readers’ way to completing the exercise.

  1. The database setup on Windows. I have one of my TBSMs on Windows and used that one for the installation of the eDayTrader contents. Perhaps I misread the instructions and did something wrong, but I believe the case of the administrative user could have been documented a bit better. Namely, you’re supposed to install the eDayTrader database with an administrative user that will later be used for accessing the database tables from within the TBSM data fetchers. On Windows the default administrative user is db2admin, and on Linux/Unix it is db2inst1. I passed db2admin everywhere, but I executed the *.bat script in my administrative DB2 CLI window, which I open as another user (I will not unveil its name here). So what happened: the tables were created fine, the schema was set fine, I could see the tables and data via SELECT statements, but I couldn’t run the same SELECTs in TBSM/Impact queries – the JDBC connection using the db2admin user couldn’t see them. I received SQLCODE=-204, SQLSTATE=42704 in my data fetcher’s View Data dialog, meaning an invalid schema in the query. I almost went crazy about it, but I found a few useful commands to check what’s up with the schema, which I share here:

>db2 "select substr(GRANTEE,1,16), GRANTEETYPE, substr(tabschema || '.' || tabname,1,64), CONTROLAUTH, ALTERAUTH, DELETEAUTH, INDEXAUTH, INSERTAUTH, REFAUTH, SELECTAUTH, UPDATEAUTH from syscat.TABAUTH where tabschema like 'DAYTRADE'"

should give you something like:

(the wide output wrapped badly when pasted; in essence, each of the three records listed DB2ADMIN as the grantee of a DAYTRADE table, with CONTROLAUTH = Y and the remaining authorities granted as G)

3 record(s) selected.

so DB2ADMIN is the grantee. If you’ve got another user there, use its name in your TBSM JDBC connection setup, or, rerun the DayTrade database creation with this change to create_daytrade_schema.sql:

CONNECT TO DAYTRADE USER db2admin USING <yourpasswordhere>;

So in this case simply drop the DAYTRADE database (use FORCE APPLICATION ALL to release connections if DB2 doesn’t let you drop it because of active connections) and rerun your installation script (Dashboards-Sample-Data_v1.1-Artifacts\Step2_TBSMConfiguration>daytrade_configuration.bat).

  2. I had a nasty issue with locked contents in Impact: the service list and the data source list. As part of the eDayTrader package installation you’ve got to import a TBSM/Impact project, and if there are any SVN locks on any Impact data, such as those lists, that step will fail. When you run the second step of the installation process, you may see friendly messages like:

[exec] Please make sure no locks exist in "IMPACT_HOME\etc\<servername>_versioncontrol.locks"

which is ok, unless you see:

[echo] Importing using nci_import
[exec] Please make sure no locks exist in "IMPACT_HOME\etc\<servername>_ver
[exec] Check impactserver.log for more details
[exec] Final result – Failure : 1

Also, if you’ve got errors and exceptions like:

File etc/TBSM_datasourcelist is locked by user SYSTEM

you’re in trouble.

How to deal with it? I’m not sure what eventually worked. I went to the TBSM/Impact UI and unlocked all data sources and services. Then I found an SVN command to use:

C:\Program Files\IBM\tivoli\tbsm\platform\win\svn\bin>svn cleanup “C:\Program Files\IBM\tivoli\tbsm\etc”

but it didn’t help. In the end I edited the version control locks file and removed all entries, leaving only header lines like:

#This file was written by server.
#Sat Feb 28 23:26:45 CET 2015



and nothing else.

That didn’t help either, so I restarted the TBSM WAS profile, and that did help. So probably the combination of the server restart and releasing the locks, via the file edit or the SVN command, did the trick.


So here we go. I installed the eDayTrader package from the DW site fine and can move on to playing around with it more, and to showing it to my customers if necessary. Nice.


New TBSM 6.1.1 fix pack available – FP2

February 26th, 2015

You can find the installation files on Fix Central:


FP1 and FP2 solve a number of OutOfMemory issues as well as NetcoolTimeoutExceptions related to the massive use of numerical rules and numerical aggregations, among many other problems. It’s worth installing not only in response to some of your PMRs but in general.