
WLDF Smart Rules in WebLogic

Nicolas Fonnegra

The WebLogic Diagnostic Framework (WLDF) is built into WebLogic and provides functionality to measure, track and monitor a domain at runtime. With WLDF, administrators can set alarms on specific incidents and automatically execute tasks when certain events are triggered.

One good starting point to learn about WLDF is Radu Dobrinescu’s blog. It provides a comprehensive tutorial on how to monitor common events in WebLogic. The newest version of WebLogic (12.2.1) changes and improves WLDF by introducing smart rules.

Smart rules are a set of predefined rules that cover the most commonly monitored events in WebLogic domains. This post shows how to monitor stuck threads and overloaded datasources and send an email when one of these events occurs.

1. Create a mail session

The first task required is to create a mail session in order to reference it in the WLDF watches. The following steps explain how to define a mail session.

  • In the domain structure go to Services -> Mail Sessions -> New
  • Enter a name, a JNDI name and your user account in WebLogic. In the JavaMail properties enter at least the host and port of the SMTP server and the sending email account, like this:

mail.smtp.port=2525
mail.smtp.from=soa_admin@example.com
mail.smtp.host=localhost

  • Press “Next”.
  • Select the server where you want to target this mail session and press “Finish”.
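
Before referencing the mail session from WLDF, it can help to confirm that the SMTP server accepts mail with these settings. The following Python sketch is not part of WebLogic; it parses the same three properties and builds a test message. The recipient address and the helper names (`parse_properties`, `build_test_message`, `send`) are assumptions made for illustration.

```python
import smtplib
from email.message import EmailMessage

# Same values as the JavaMail properties entered for the mail session.
JAVAMAIL_PROPERTIES = """\
mail.smtp.port=2525
mail.smtp.from=soa_admin@example.com
mail.smtp.host=localhost
"""


def parse_properties(text):
    """Parse simple key=value lines into a dict, skipping blanks and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def build_test_message(props, recipient):
    """Build a short test mail that uses the configured sender address."""
    msg = EmailMessage()
    msg["From"] = props["mail.smtp.from"]
    msg["To"] = recipient
    msg["Subject"] = "WLDF mail session test"
    msg.set_content("Test message for the WLDF mail session settings.")
    return msg


def send(props, msg):
    """Send the message through the configured SMTP host and port."""
    with smtplib.SMTP(props["mail.smtp.host"], int(props["mail.smtp.port"])) as smtp:
        smtp.send_message(msg)


props = parse_properties(JAVAMAIL_PROPERTIES)
# "admin@example.com" is a placeholder recipient, not taken from the setup above.
message = build_test_message(props, "admin@example.com")
```

Calling `send(props, message)` on a machine that can reach the SMTP server should deliver the test mail; if it raises a connection error, the host or port in the mail session is probably wrong as well.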

2. Create a Diagnostic Module

A diagnostic module is an administrative unit that contains collected metrics, instrumentation, policies and actions. The following steps show how to define a diagnostic module.

  • In the domain structure go to Diagnostic Modules -> New
  • Enter a name and press “OK”.
  • Click on the newly created diagnostic module and select the “Targets” tab.
  • Select the server where you want to target this diagnostic module and press “Save”

 

3. Create an action

Actions are tightly bound to policies. They are the processes that get executed when a policy is triggered. In our case we are going to define an action that sends an email if stuck threads or overloaded datasources are detected.

  • In the domain structure go to Diagnostic Modules and click on the diagnostic module created in step 2.
  • Go to “Configuration” -> Policies and Actions -> Action tab and press “New”
  • Select “SMTP (Email)” and press “Next”
  • Enter a name for the action and press “Next”
  • Select the mail session from step 1, enter a mail recipient and press “Finish”


4. Create a policy for stuck threads

The new smart rules provide several predefined events that can be monitored. One of them, “High Average Stuck Threads”, triggers when the average number of stuck threads over a defined period of time reaches a threshold. The following steps explain how to create this kind of rule:

  • In the domain structure go to Diagnostic Modules and click on the diagnostic module created in step 2
  • Go to “Configuration” -> Policies and Actions -> Policies tab and press “New”
  • Enter a name, choose “Smart Rule” and press “Next”
  • Choose “High Average Stuck Threads” and press “Next”
  • Enter “30 seconds” for the Sampling Rate, “10 minutes” for the Sample Retention Period and “5” for the Average Stuck Threads. This configuration can be interpreted as “Trigger if the average value of StuckThreadCount is greater than or equal to 5 over a 10 minute window, collected at 30 second intervals”. Press “Next”
  • Choose “Every N seconds”. Press “Next”
  • Enter “15” for “Repeat”. Press “Next”
  • Choose “Use an automatic reset alarm” and enter “300”. In our case, this means that while the policy condition holds, an email is sent at most once every 300 seconds. Press “Next”

  • Select the action from step 3 in “Diagnostic Actions” and press “Finish”
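
To make the numbers in this policy more concrete, the following Python sketch models the two ideas involved: an average over a retention window and an alarm that resets automatically. It is an illustration only, not how WLDF is implemented; the class names are hypothetical, and the model evaluates the rule on a partially filled window and only once per sample, which may differ from WLDF's behavior.

```python
from collections import deque


class AverageThresholdRule:
    """Illustrative model of an average-over-window rule (not WLDF code)."""

    def __init__(self, sampling_seconds, retention_seconds, threshold):
        # 10 minutes retained at 30-second intervals gives 20 samples.
        self.window = deque(maxlen=retention_seconds // sampling_seconds)
        self.threshold = threshold

    def add_sample(self, value):
        """Record one sample; the oldest one is dropped when the window is full."""
        self.window.append(value)

    def average(self):
        return sum(self.window) / len(self.window) if self.window else 0.0

    def is_triggered(self):
        return bool(self.window) and self.average() >= self.threshold


class AutoResetAlarm:
    """Suppresses repeated notifications until the reset period has passed."""

    def __init__(self, reset_seconds):
        self.reset_seconds = reset_seconds
        self.raised_at = None

    def should_notify(self, triggered, now):
        if self.raised_at is not None and now - self.raised_at >= self.reset_seconds:
            self.raised_at = None  # the alarm resets itself
        if triggered and self.raised_at is None:
            self.raised_at = now
            return True
        return False


rule = AverageThresholdRule(sampling_seconds=30, retention_seconds=600, threshold=5)
alarm = AutoResetAlarm(reset_seconds=300)
notify_times = []
for i in range(60):  # 30 minutes of samples, one every 30 seconds
    now = i * 30
    rule.add_sample(8)  # hypothetical load: 8 stuck threads at every sample
    if alarm.should_notify(rule.is_triggered(), now):
        notify_times.append(now)
print(notify_times)
```

With a constant load above the threshold, the model notifies once and then stays silent until 300 seconds have passed, which is the effect the automatic reset alarm is meant to have on the email action.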

5. Create a policy for overloaded datasources

Overloaded datasources are those that cannot hand out any more connections to requesting processes because their pool has no connections available. The attribute “WaitingForConnectionHighCount” of the JDBCDataSourceRuntime MBean can be queried to determine whether processes had to wait for a connection because none was available. A smart rule can be used for this policy as well, but this time we are going to choose “Generic Metric Smart Rule”, a generic rule that can be used to query MBeans.

  • In the domain structure go to Diagnostic Modules and click on the diagnostic module created in step 2
  • Go to “Configuration” -> Policies and Actions -> Policies tab and press “New”
  • Enter a name, choose “Smart Rule” and press “Next”
  • Choose “Generic Metric Smart Rule” and press “Next”
  • Enter “com.bea:Name=LocalSvcTblDataSource,ServerRuntime=AdminServer,Type=JDBCDataSourceRuntime” for Instance Name Pattern, “WaitingForConnectionHighCount” for Attribute Expression, “>” for Comparison Operator, “0” for Threshold Value, “30 seconds” for Sampling Rate and “10 minutes” for Sample Retention Period. This configuration can be interpreted as “Trigger if the average value of WaitingForConnectionHighCount on the JDBCDataSourceRuntime, over the last 10 minutes, is greater than 0 on this server instance, collected at 30 second intervals”. Press “Next”
  • Choose “Every N seconds”. Press “Next”
  • Enter “15” for “Repeat”. Press “Next”
  • Choose “Use an automatic reset alarm” and enter “300”. In our case, this means that while the policy condition holds, an email is sent at most once every 300 seconds. Press “Next”
  • Select the action from step 3 in “Diagnostic Actions” and press “Finish”
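
The instance name pattern and the comparison in this policy can be sketched in Python as follows. This is a simplified model under stated assumptions, not WLDF code: the helper names are hypothetical, the operator table is illustrative and may not match the operators the console offers, and the sample values are made up.

```python
import operator

# Illustrative mapping from comparison strings to Python functions.
OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def datasource_runtime_name(datasource, server):
    """Build the instance name used in the policy for a data source and server."""
    return (
        f"com.bea:Name={datasource},ServerRuntime={server},"
        "Type=JDBCDataSourceRuntime"
    )


def parse_object_name(name):
    """Split 'domain:key=value,...' into the domain and a dict of key properties."""
    domain, _, keys = name.partition(":")
    props = dict(pair.split("=", 1) for pair in keys.split(","))
    return domain, props


def generic_rule_triggered(samples, comparison, threshold):
    """Compare the average of the retained samples with the threshold."""
    if not samples:
        return False
    average = sum(samples) / len(samples)
    return OPERATORS[comparison](average, threshold)


name = datasource_runtime_name("LocalSvcTblDataSource", "AdminServer")
domain, props = parse_object_name(name)

# Hypothetical WaitingForConnectionHighCount samples from the retention window.
samples = [0, 0, 1, 1]
triggered = generic_rule_triggered(samples, ">", 0)
print(domain, props["Name"], triggered)
```

To monitor a different data source or a managed server, only the `Name` and `ServerRuntime` parts of the instance name change; the attribute, operator and threshold stay the same.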

6. Triggering a rule

In our case we are going to generate a stuck thread condition in our WebLogic server. Once we reach the raise condition, we will get an email like this one:

[Screenshot: notification email sent by the WLDF action]

Conclusion: WLDF smart rules

These two rules are just two examples of what can be achieved with the new WLDF smart rules. In previous versions it was necessary to define rules that monitored the WebLogic logs in order to detect an interesting event. These rules were sometimes very complicated, because the administrator had to know exactly how the error was written to the log in order to parse it effectively. Now, with smart rules, it is easier to monitor standard events like stuck threads, or even to use the MBeans provided by WebLogic to trigger alert actions.

References

  • Dealing with Stuck Threads in WebLogic
  • https://docs.oracle.com/middleware/1221/wls/WLDFC/appendix_smartrules.htm#WLDFC649