Thursday, 6 December 2012

Exception handling in SOA Suite 10g and SOA Suite 11g

Introduction

Sometimes, the longer you think about how to solve a problem, the less complex the solution becomes. Error handling in SOA Suite 11g is one of those examples. It is tempting to implement your own mechanism for exception/error handling (for example http://javaoraclesoa.blogspot.com/2012/05/re-enqueueing-faulted-bpel-messages.html), although an extensive fault management framework is already part of the SOA Suite. In this post I describe the method used in SOA Suite 10g to implement fault-policies using a custom Java class. I then implement a similar exception handling mechanism in Oracle SOA Suite 11g.

Marcel Bellinga has provided most of the code in the example below.

Challenges to tackle

Some of the challenges involved when implementing exception handling;
- how do I make it easy for the people monitoring and maintaining the application to detect and recover from errors?
- how do I make sure no messages are lost?
- how do I make sure the order in which messages are offered to the application does not change when exceptions occur?
- how do I prevent 'hammering' a system (continuously retrying faulted messages)?

With these questions in mind, the following solution provides a good option.

A bit of background

Oracle BPEL 10g has the option to use fault-policies and fault-bindings (and use custom Java classes in the policies), which are put on the application server and referred to by a bpel process in the bpel.xml. See; http://docs.oracle.com/cd/E14101_01/doc.1013/e15342/bpelrn.htm#BABCHCED.

Oracle SOA Suite 11g has (in addition to the method described above) the option to deploy custom Java classes, fault-policies and fault-bindings as part of the composite to the application server. This mechanism makes it easier to use the fault management framework on a per-composite basis. See http://docs.oracle.com/cd/E12839_01/integration.1111/e10224/bp_faults.htm

Keep in mind when using the fault management framework that fault-policies are triggered before a catch branch defined in a BPEL process. If you want the catch branch to be activated, the policy must include an action that rethrows the fault.

Solution in short

The solution for handling faults, taking into account the above questions, uses the following method;
- in Oracle BPEL 10g, a custom Java class and a specific policy XML file are deployed on the application server
- the bpel.xml file refers to the policy defined in the specific policy XML file
- the custom Java class first deactivates the activation agents of the process and then retires the process (avoiding the issue that messages are picked up while the process is already retired, causing loss of messages)
- the faulted message is put in manual recovery mode so the error hospital can be used to recover (retry) the message after the problem is fixed
- if the problem is fixed, the process can be activated again
- the ORABPEL schema tables can be monitored for messages which can be recovered, or to alert someone that something has gone wrong and a recovery action is required

In Oracle SOA Suite 11g the method is similar, however, the activation agents do not need to be deactivated explicitly, the API calls are a bit different (due to the SCA implementation) and the error handling is deployed as part of the composite (in this example, see http://mazanatti.info/index.php?/archives/75-SOA-Fault-Framework-Creating-and-using-a-Java-action-fault-policy.html for an example on how to deploy custom Java code centrally on the server).

Implementation

Implementation BPEL 10g exception handling

Custom Java action

Create a new Java project and include the orabpel.jar from your BPEL distribution in the root folder of your project. Update the project libraries to include the library.

Create a new Java class. I've used the following;

package testapi;

import com.oracle.bpel.client.BPELProcessMetaData;
import com.oracle.bpel.client.IBPELProcessConstants;
import com.oracle.bpel.client.IBPELProcessHandle;
import com.oracle.bpel.client.Locator;
import com.oracle.bpel.client.config.faultpolicy.IFaultRecoveryContext;
import com.oracle.bpel.client.config.faultpolicy.IFaultRecoveryJavaClass;

public class RetireProcess implements IFaultRecoveryJavaClass {
    public RetireProcess() {
    }

    /**
     * This method is called by the BPEL Error Hospital framework when this
     * action is selected as retrySuccessAction (with the retry option) or
     * when this action is selected as successor in the human intervention
     * screen in the BPEL Console.
     *
     * @param iFaultRecoveryContext the fault recovery context supplied by the framework
     */
    public void handleRetrySuccess(IFaultRecoveryContext iFaultRecoveryContext) {
        System.out.println("RetireProcess RetrySuccess start");
        setLifeCycle(iFaultRecoveryContext,
                     IBPELProcessConstants.LIFECYCLE_ACTIVE);
        System.out.println("RetireProcess RetrySuccess end");
    }

    /**
     * This method is called by the BPEL Error Hospital framework when this
     * class is configured as action in the fault handling policy.
     *
     * @param iFaultRecoveryContext the fault recovery context supplied by the framework
     * @return String that can be used to influence choice for next action (not used in this case)
     */
    public String handleBPELFault(IFaultRecoveryContext iFaultRecoveryContext) {
        System.out.println("RetireProcess HandleFault start");
        setLifeCycle(iFaultRecoveryContext,
                     IBPELProcessConstants.LIFECYCLE_RETIRED);
        System.out.println("RetireProcess HandleFault end");
        return null;
    }

    private void setLifeCycle(IFaultRecoveryContext iFaultRecoveryContext,
                              int status) {
        IBPELProcessHandle procHandle = null;
        Locator loc = null;
        BPELProcessMetaData bpelProcessMetadata = null;
        String processName;
        String revision;

        try {
            processName = iFaultRecoveryContext.getProcessId().getProcessId();
            revision = iFaultRecoveryContext.getProcessId().getRevisionTag();
            /*
                 * get Locator Instance
                */
            loc = iFaultRecoveryContext.getLocator();
            /*
                 * Lookup Process. Revision optional.
                */
            if (revision == null || revision.trim().equals("")) {
                procHandle = loc.lookupProcess(processName);
            } else {
                procHandle = loc.lookupProcess(processName, revision);
            }
            if (procHandle == null) {
                throw new Exception("Unable to find process: " + processName);
            }

            System.out.println("RetireProcess setting lifecycle to " + status);
            /*
             * Get metadata of the process.
             */
            bpelProcessMetadata = procHandle.getMetaData();
            if (bpelProcessMetadata.getLifecycle() != status) {
                /*
                 * Set the lifecycle to the requested status (retired or active).
                 * Use setState(IBPELProcessConstants.STATE_OFF) to change process state to off.
                 */
                bpelProcessMetadata.setLifecycle(status);
                System.out.println("RetireProcess lifecycle set to " + status);

                /*
                 * Stop the activation agents when retiring; start them when activating.
                 */
                if (status == IBPELProcessConstants.LIFECYCLE_RETIRED) {
                    procHandle.stopAllActivationAgents();
                } else {
                    procHandle.startAllActivationAgents();
                }
                /*
                 * Finally update the process with the modified metadata.
                 */
                procHandle.updateMetaData(bpelProcessMetadata);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}



Noteworthy here is the method used to retire the process: obtain a locator, use the locator to get a process handle, use the process handle to get the metadata, and update the metadata. The process handle can also be used to stop the activation agents. Compile the project using JDK 1.5.0.06.

Place this class in;
[ORACLE_HOME]/bpel/system/classes/

Fault policy and fault binding

Create a fault policy, for example;

<?xml version="1.0" encoding="UTF-8"?>
<faultPolicy version="2.0.1" id="RetireProcessPolicy" xmlns:env="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://schemas.oracle.com/bpel/faultpolicy" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <!-- This section describes fault conditions. Build more conditions with faultName, test and action -->
    <Conditions>
        <!-- Fault ALL -->
        <faultName>
            <condition>
                <action ref="RetireProcess"/>
            </condition>
        </faultName>
    </Conditions>
    <Actions>
        <!-- This action will attempt 8 retries at increasing intervals of 2, 4, 8, 16, 32, 64, 128, and 256 seconds. -->
        <Action id="ora-retry">
            <retry>
                <retryCount>8</retryCount>
                <retryInterval>2</retryInterval>
                <retryFailureAction ref="ora-terminate"/>
                <exponentialBackoff/>
            </retry>
        </Action>
        <!-- This action will cause a replay scope fault -->
        <Action id="ora-replay-scope">
            <replayScope/>
        </Action>
        <!-- This action will bubble up the fault -->
        <Action id="ora-rethrow-fault">
            <rethrowFault/>
        </Action>
        <!-- This action will mark the work item as "pending recovery from console" -->
        <Action id="ora-human-intervention">
            <humanIntervention/>
        </Action>
        <!-- This action will cause the instance to terminate-->
        <Action id="ora-terminate">
            <abort/>
        </Action>
        <Action id="RetireProcess">
            <javaAction className="testapi.RetireProcess" defaultAction="ora-human-intervention"/>
        </Action>
    </Actions>
</faultPolicy>
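
The comment on the ora-retry action above describes eight retries at doubling intervals. As a small sanity check of that arithmetic, the sketch below reproduces the sequence from retryInterval=2 and retryCount=8. The class and method names are hypothetical and this is not the fault framework's own code. Note also that in this policy every fault is routed to RetireProcess, so ora-retry is defined but not used.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: reproduces the interval sequence described in the
// policy comment; it is not the fault framework's implementation.
class RetryBackoff {

    /** Returns the wait (in seconds) before each retry, doubling every time. */
    static List<Long> intervals(long retryInterval, int retryCount) {
        List<Long> result = new ArrayList<>();
        long wait = retryInterval;
        for (int i = 0; i < retryCount; i++) {
            result.add(wait);
            wait *= 2;
        }
        return result;
    }

    public static void main(String[] args) {
        List<Long> waits = intervals(2, 8);
        long total = 0;
        for (long w : waits) {
            total += w;
        }
        // prints [2, 4, 8, 16, 32, 64, 128, 256] total=510s
        System.out.println(waits + " total=" + total + "s");
    }
}
```

With these values a message would be retried for roughly eight and a half minutes before the retryFailureAction is taken.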


Place the fault policy in
[ORACLE_HOME]/bpel/domains/{domain}/config/fault-policies

Create a reference to the faultpolicy in the bpel.xml of the process like;
(below </activationAgents>)
     <faultPolicyBindings>
         <process faultPolicy="RetireProcessPolicy"/>
         <partnerLink faultPolicy="RetireProcessPolicy"/>
     </faultPolicyBindings>


Noteworthy in this policy is the defaultAction. My custom Java class returns null. This triggers the defaultAction which is set to ora-human-intervention. This causes the invoke to be visible in the error hospital (Activities tab in the process manager). From the error hospital it is also possible to specify an on retry success method to be executed (by clicking the specific error).
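
To make the role of the defaultAction concrete, here is a minimal plain-Java model of how a return value from the custom class could select the next action. It is a sketch under assumptions: the class JavaActionOutcome is hypothetical, the "RETHROW" mapping is invented for illustration, and treating an unmapped return value the same as null is an assumption rather than documented behavior.

```java
import java.util.HashMap;
import java.util.Map;

// A simplified model of how a javaAction's return value selects the next
// action. Hypothetical class; not part of any Oracle API.
class JavaActionOutcome {
    private final String defaultAction;
    private final Map<String, String> returnValueToAction = new HashMap<>();

    JavaActionOutcome(String defaultAction) {
        this.defaultAction = defaultAction;
    }

    /** Registers an action to run for a specific return value. */
    JavaActionOutcome map(String returnValue, String actionRef) {
        returnValueToAction.put(returnValue, actionRef);
        return this;
    }

    /** A null or unmapped return value falls back to the default action (assumed). */
    String nextAction(String returnValue) {
        if (returnValue == null) {
            return defaultAction;
        }
        return returnValueToAction.getOrDefault(returnValue, defaultAction);
    }

    public static void main(String[] args) {
        JavaActionOutcome outcome = new JavaActionOutcome("ora-human-intervention")
                .map("RETHROW", "ora-rethrow-fault"); // hypothetical mapping
        System.out.println(outcome.nextAction(null));      // prints ora-human-intervention
        System.out.println(outcome.nextAction("RETHROW")); // prints ora-rethrow-fault
    }
}
```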







Result

When an error occurs, the failed messages arrive (in order) in the error hospital (usually a small number before the process is retired). The process instances which have faulted remain open. The process is retired. You can retry the activities to check whether the error is fixed. If the error is fixed, the process can be activated again, resuming normal operation. This way the order of messages is guaranteed and there is no useless hammering by retrying the action which has failed. The process can be activated when the problem is fixed, avoiding a lot of manual re-offering of messages.
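
The ordering argument can be illustrated with a small, self-contained simulation. This is only a model in plain Java (all names are hypothetical, no BPEL API is involved), and it parks exactly one message before "retiring", whereas in practice a few messages may fault before the process is actually retired.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy model of "retire on first fault": not BPEL code, just the ordering idea.
class RetireOnFaultDemo {
    final Deque<String> sourceQueue = new ArrayDeque<>();
    final List<String> delivered = new ArrayList<>();
    final List<String> pendingRecovery = new ArrayList<>();
    boolean active = true;
    boolean targetAvailable = true;

    /** Consumes messages in order until the queue is empty or the process is retired. */
    void consume() {
        while (active && !sourceQueue.isEmpty()) {
            String message = sourceQueue.poll();
            if (targetAvailable) {
                delivered.add(message);
            } else {
                pendingRecovery.add(message); // waits for a manual retry
                active = false;               // "retire": stop picking up messages
            }
        }
    }

    /** Retries parked messages in order, then reactivates and resumes consuming. */
    void recover() {
        delivered.addAll(pendingRecovery);
        pendingRecovery.clear();
        active = true;
        consume();
    }

    public static void main(String[] args) {
        RetireOnFaultDemo demo = new RetireOnFaultDemo();
        demo.sourceQueue.addAll(Arrays.asList("m1", "m2", "m3", "m4"));
        demo.targetAvailable = false; // target system is down
        demo.consume();
        // prints parked=[m1] waiting=[m2, m3, m4]
        System.out.println("parked=" + demo.pendingRecovery + " waiting=" + demo.sourceQueue);
        demo.targetAvailable = true;  // problem fixed
        demo.recover();
        // prints delivered=[m1, m2, m3, m4]
        System.out.println("delivered=" + demo.delivered);
    }
}
```

Because nothing is consumed while the process is retired, the remaining messages stay in the source queue in their original order and are only picked up after the parked message has been retried.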

Implementation BPEL 11g Exception handling

The 11g implementation is very similar to the 10g implementation. Deployment does not require any server side configuration. You can download the example project here; http://dl.dropbox.com/u/6693935/blog/TestExceptionHandling.zip. If you encounter errors deploying the project, you should remove the MDS entry in .adf\META-INF\adf-config.xml causing the issue. The example project requires the setup as described in; http://javaoraclesoa.blogspot.com/2012/05/re-enqueueing-faulted-bpel-messages.html. Also mind that when importing the project, your MDS configuration might differ. Remove the entries not relevant for your configuration from the .adf/META-INF/adf-config.xml file.

Custom Java class


I've used the following Java class (created in SCA-INF/src). No additional project configuration (like including libraries) is required in 11g.


package ms.testapp.exceptionhandling;
import com.collaxa.cube.engine.fp.BPELFaultRecoveryContextImpl;
import java.util.logging.Logger;
import oracle.integration.platform.faultpolicy.IFaultRecoveryContext;
import oracle.integration.platform.faultpolicy.IFaultRecoveryJavaClass;
import oracle.soa.management.facade.Composite;
import oracle.soa.management.facade.Locator;
import oracle.soa.management.facade.LocatorFactory;
public class RetireProcess implements IFaultRecoveryJavaClass {
    private final static Logger logger = Logger.getLogger(RetireProcess.class.getName());
    public RetireProcess() {
        super();
    }
    public void handleRetrySuccess(IFaultRecoveryContext iFaultRecoveryContext) {
    }
    public String handleFault(IFaultRecoveryContext iFaultRecoveryContext) {
        System.out.println("handleFault started");
        BPELFaultRecoveryContextImpl bpelCtx =
            (BPELFaultRecoveryContextImpl)iFaultRecoveryContext;
        try {
            Locator loc = LocatorFactory.createLocator();
            System.out.println("locator obtained");
            Composite comp = loc.lookupComposite(bpelCtx.getProcessDN().getCompositeDN());
            System.out.println("composite found");
            comp.retire();
            //bpelCtx.addAuditTrailEntry("retired " + comp.getDN());
            System.out.println("process retired");
            logger.info("retired " + comp.getDN());
        } catch (Exception e) {
            System.out.println("fault in handler");
            //bpelCtx.addAuditTrailEntry("Error in FaultHandler " + RetireProcess.class.getName());
            logger.severe("Error in FaultHandler " + RetireProcess.class.getName());
            e.printStackTrace();
        }
        // Returning null selects the defaultAction configured in the fault policy.
        return null;
    }
}

Fault policy and fault binding

My fault-policy file is called fault-policies.xml (the composite.xml picks that file by default but a different file can be specified in the composite.xml file if required) and it looks like;

<?xml version="1.0" encoding="UTF-8"?>
<faultPolicies xmlns="http://schemas.oracle.com/bpel/faultpolicy"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <faultPolicy version="2.1.3" id="ConnectionFaults">
    <Conditions>
      <faultName>
        <condition>
          <action ref="handle-fault-through-custom-java"/>
        </condition>
      </faultName>
    </Conditions>
    <Actions>
      <Action id="handle-fault-through-custom-java">
        <javaAction className="ms.testapp.exceptionhandling.RetireProcess"
                    defaultAction="ora-human-intervention">
        </javaAction>
      </Action>
      <Action id="ora-rethrow-fault">
        <rethrowFault/>
      </Action>
      <Action id="ora-human-intervention">
        <humanIntervention/>
      </Action>
    </Actions>
  </faultPolicy>
</faultPolicies>


My fault-bindings.xml looks like;



<?xml version="1.0" encoding="UTF-8"?>
<faultPolicyBindings version="2.0.1"
                     xmlns="http://schemas.oracle.com/bpel/faultpolicy"
                     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <composite  faultPolicy="ConnectionFaults"/>
</faultPolicyBindings>



These files are placed in the same folder as the composite.xml.

Result

The behavior in 11g is similar to the behavior described for 10g. One thing to notice is that the API works on composite level and I've not found a way to directly stop or start the activation agents. I did however not encounter the 10g error that the JCA adapter tried to start a retired process.
First the correct situation. Use the test script to enqueue a message.

DECLARE
  queue_options DBMS_AQ.ENQUEUE_OPTIONS_T;
  message_properties DBMS_AQ.MESSAGE_PROPERTIES_T;
  recipients          DBMS_AQ.aq$_recipient_list_t;
  message_id RAW(16);
  message SYS.XMLType;
BEGIN
  recipients(1) := sys.aq$_agent('EXCEPTIONTEST', NULL, NULL);
  message_properties.recipient_list := recipients;
  message := sys.XMLType.createXML('<itemCollectionArray xmlns:msg_out="http://test.ms/itemcollections" xmlns="http://test.ms/itemcollections"><msg_out:itemsCollection><msg_out:item><msg_out:name>Name</msg_out:name><msg_out:value>Piet</msg_out:value></msg_out:item></msg_out:itemsCollection></itemCollectionArray>');
  DBMS_AQ.ENQUEUE( queue_name => 'TESTUSER.TEST_SOURCE_QUEUE',
                   enqueue_options => queue_options,
                   message_properties => message_properties,
                   payload => message,
                   msgid => message_id);
  COMMIT;
END;

The result is a correct execution of the process;
Next disable the TEST_TARGET_QUEUE


Again submit a test message and confirm the error handler has activated in the Enterprise Manager.

Conclusion

Error handling in SOA Suite 11g is more extensive (has more options) than error handling in SOA Suite 10g. Also SOA Suite 11g provides options for implementing fault handling on a per process basis. This was absent in SOA Suite 10g. For accessing the API, there have been many changes going from 10g to 11g. The most significant changes have been caused by the implementation of the SCA framework. SOA Suite 11g makes it a lot easier to use the Java API.

Also a lesson learned is to think about error handling very early on in a project and not start with the implementation which seems logical to a single developer but discuss the different options and requirements with the customer and other developers. In this case a relatively simple solution using standard Oracle functionality causes many requirements to be met. However if the purpose is to make as many hours as possible and tackling every requirement as a new change, then this solution is not for you!

Oracle SOA 11g 11.1.1.6 Performance Tuning of BPEL Processes:BPEL Properties Set Inside a Composite


This section lists the config properties of some sections of the deployment descriptor. For each configuration property parameter, a description is given, as well as the expected behavior of the engine when it is changed.
All the properties set in this section affect the behavior of the component containing the BPEL process only. Each BPEL process can be created as a component of a composite. These properties can be modified in composite.xml or in the System MBean Browser of Oracle Enterprise Manager Fusion Middleware Control. 
Some Concepts First
As a general practice, it is better to design your BPEL processes as transient instead of durable if performance is a concern. Note that this may not always be possible due to the nature of your process, but keep the following points in mind.

The dehydration store is used to maintain long-running asynchronous BPEL instances, storing state information as they wait for asynchronous callbacks. This ensures the reliability of these processes in the event of server or network loss.
Oracle BPEL Process Manager supports two types of processes; transient and durable.

Transient Processes
Transient processes do not incur dehydration during their process execution. If an executing process experiences an unhandled fault or the server crashes, instances of a transient process do not leave a trace in the system. Thus, these instances cannot be saved in-flight, regardless of whether they complete normally or abnormally. Transient processes are typically short-lived, request-response style processes. Synchronous processes are examples of transient processes.

Durable Processes
Durable processes incur one or more dehydration points in the database during execution. Dehydration is triggered by one of the following activities:
  • Receive activity
  • OnMessage branch in a pick activity
  • OnAlarm branch in a pick activity
  • Wait activity
  • Reply activity
  • checkPoint() within a <bpelx:exec> activity

inMemoryOptimization

This property indicates to Oracle BPEL Server that this process is a transient process and dehydration of the instance is not required. When set to True, the completionPersistPolicy is used to determine persistence behavior. This property can only be set to True for transient processes or processes that do not contain any dehydration points such as receive, wait, onMessage and onAlarm activities. The inMemoryOptimization property is set at the BPEL component level. When set to True, dehydration is disabled, which can improve performance in some use cases.
Values:
This property has the following values:
  • False (default): Instances are persisted completely and recorded in the dehydration store database.
  • True: The completionPersistPolicy is used to determine persistence behavior.


completionPersistPolicy

This property configures how the instance data is saved. It can only be set at the BPEL component level. The completionPersistPolicy property can only be used when inMemoryOptimization is set to be True (transient processes). Note that this parameter may affect database growth and throughput (due to reduced I/O).
Values:
  • On (default): The completed instance is saved normally.
  • Deferred: The completed instance is saved, but with a different thread and in another transaction.
  • Faulted: Only the faulted instances are saved. Note: When an unhandled fault occurs, regardless of these flags, audit information of the instance is persisted within the cube_instance table.
  • Off: No instances of this process are saved.
<component name="BPELProcess">
   <implementation.bpel src="BPELProcess.bpel" />

   <property name="bpel.config.completionPersistPolicy">faulted</property>
   <property name="bpel.config.inMemoryOptimization">true</property>
   ...
</component>
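
How the two properties combine can be summarised as a small decision function. The following is a plain-Java sketch of the rules as described in this section, with hypothetical names; it models only whether a completed instance is saved and deliberately ignores the audit information that is persisted for unhandled faults.

```java
// Models the persistence rules described above; not engine code.
class PersistenceDecision {
    enum CompletionPersistPolicy { ON, DEFERRED, FAULTED, OFF }

    /** Returns true when a completed instance would be saved under the described rules. */
    static boolean isSaved(boolean inMemoryOptimization,
                           CompletionPersistPolicy policy,
                           boolean faulted) {
        if (!inMemoryOptimization) {
            return true; // policy is not consulted; the instance is persisted completely
        }
        switch (policy) {
            case ON:
            case DEFERRED:
                return true;    // deferred only changes the thread and transaction used
            case FAULTED:
                return faulted; // only faulted instances are kept
            default:
                return false;   // OFF
        }
    }

    public static void main(String[] args) {
        // Settings from the composite.xml fragment: inMemoryOptimization=true, policy=faulted
        System.out.println(isSaved(true, CompletionPersistPolicy.FAULTED, false)); // prints false
        System.out.println(isSaved(true, CompletionPersistPolicy.FAULTED, true));  // prints true
    }
}
```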

auditLevel

You can set the audit level for a BPEL process service component. This setting takes precedence over audit level settings at the SOA Infrastructure, service engine, and SOA composite application levels.
Set the bpel.config.auditLevel property to an appropriate value in the composite.xml file of your SOA project as shown in the example below:
<component name="BPELProcess">
<implementation.bpel src="BPELProcess.bpel" />
<property name="bpel.config.auditLevel">Off</property>
</component>
Values:
  • Inherit: Inherits the audit level from infrastructure level.
  • Off: No audit events (activity execution information) are persisted and no logging is performed; this can result in a slight performance boost for processing instances.
  • Minimal: All events are logged; however, no audit details (variable content) are logged.
  • Error: Logs only serious problems that require immediate attention from the administrator and are not caused by a bug in the product. Using this level can help performance.
  • Production: All events are logged. The audit details for assign activities are not logged; the details for all other activities are logged.
  • Development: All events are logged; all audit details for all activities are logged.

Partner Link Property

You can dynamically configure a partner link at runtime in BPEL. This is useful for scenarios in which the target service that BPEL wants to invoke is not known until runtime. The following Partner Link properties can be tuned for performance:

idempotent

An idempotent activity is an activity that can be retried (for example, an assign activity or an invoke activity). Oracle BPEL Server saves the instance after a nonidempotent activity. This property is applicable to both durable and transient processes.
Values:
This property has the following values:
  • False: Activity is dehydrated immediately after execution and recorded in the dehydration store. When idempotent is set to False, it provides better failover protection, but may impact performance if the BPEL process accesses the dehydration store frequently.
  • True (default): If Oracle BPEL Server fails, it performs the activity again after restarting. This is because the server does not dehydrate immediately after the invoke and no record exists that the activity executed. Some examples of where this property can be set to True are: read-only services (for example, CreditRatingService) or local EJB/WSIF invocations that share the instance's transaction.

A BPEL invoke activity is by default an idempotent activity, meaning that the BPEL process does not dehydrate instances immediately after invoke activities. Therefore, if idempotent is set to true and Oracle BPEL Server fails right after an invoke activity executes, Oracle BPEL Server performs the invoke again after restarting. This is because no record exists that the invoke activity has executed. This property is applicable to both durable and transient processes.
If idempotent is set to false, the invoke activity is dehydrated immediately after execution and recorded in the dehydration store. If Oracle BPEL Server then fails and is restarted, the invoke activity is not repeated, because Oracle BPEL Process Manager sees that the invoke already executed.
When idempotent is set to false, it provides better failover protection, but at the cost of some performance, since the BPEL process accesses the dehydration store much more frequently. This setting can be configured for each partner link in the bpel.xml file.



nonBlockingInvoke

By default, Oracle BPEL Process Manager executes in a single thread by executing the branches sequentially instead of in parallel. When this property is set to True, the process manager creates a new thread to perform each branch's invoke activity in parallel. This property is applicable to both durable and transient processes.
Consider setting this property to True if you have invoke activities in multiple flow or flowN branches. This is especially effective if the parallel invoke activities are two-way, but some benefits can be realized for parallel one-way invokes as well.
Note:
Invocations to the same partner link will happen in sequence and not in parallel. If you invoke different partner links each time with nonBlockingInvoke set to True, then each link will work in parallel even if all of the partner links point to the same source.
Values:
  • True: Oracle BPEL Server spawns a new thread to execute the invocation.
  • False (default): Oracle BPEL Server executes the invoke activity in the single process thread.
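
The difference between the two settings is analogous to the difference between calling branches one after another on a single thread and handing each branch to its own thread. The sketch below shows that analogy with java.util.concurrent only; it is not how the BPEL engine is implemented, and the branch names are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Analogy only: sequential versus parallel execution of flow branches.
class FlowInvokeDemo {

    /** Runs each branch on the calling thread, one after another. */
    static List<String> invokeSequentially(List<Callable<String>> branches) throws Exception {
        List<String> replies = new ArrayList<>();
        for (Callable<String> branch : branches) {
            replies.add(branch.call());
        }
        return replies;
    }

    /** Runs each branch on its own pool thread and waits for all replies. */
    static List<String> invokeInParallel(List<Callable<String>> branches) throws Exception {
        // Requires at least one branch: a pool of size 0 is rejected.
        ExecutorService pool = Executors.newFixedThreadPool(branches.size());
        try {
            List<String> replies = new ArrayList<>();
            // invokeAll returns futures in the same order as the input list
            for (Future<String> reply : pool.invokeAll(branches)) {
                replies.add(reply.get());
            }
            return replies;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<String>> branches = new ArrayList<>();
        branches.add(() -> "creditRating"); // hypothetical partner link replies
        branches.add(() -> "inventory");
        System.out.println(invokeSequentially(branches)); // prints [creditRating, inventory]
        System.out.println(invokeInParallel(branches));   // prints [creditRating, inventory]
    }
}
```

With two-way invokes that each wait on a remote service, the parallel variant takes roughly as long as the slowest branch instead of the sum of all branches, which is why the setting helps most in that case.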

validateXML

Enables message boundary validation. Note that additional validation can impact performance by consuming extra CPU and memory resources.
Values:
  • True: When set to True, the engine validates the XML message against the XML schema during <receive> and <invoke> for this partner link. If the XML message is invalid, then a bpelx:invalidVariables run time BPEL fault is thrown. This overrides the domain level validateXML property.
  • False (default): Disables XML validation.


General Recommendations:
1. If your synchronous process exceeds, say, 1000 instances per hour, it is better to set inMemoryOptimization to true and completionPersistPolicy to faulted. This gives better throughput because only faulted instances are dehydrated in the database, and it also goes easy on the purge (purging historical instance data from the database).
2. Do not include anything that forces your process to persist (such as dehydrate, mid-process receive, wait or onMessage).
3. Have good logging on your BPEL Process, so that you can see log messages in the diagnostic log files for troubleshooting.
What should you do?
  • If the design of the process allows it, design your BPEL processes as short-lived, synchronous transactions.
  • If the design of the process allows it, avoid the activities listed above.
Any time your process is dehydrated to the dehydration store, this naturally impacts the performance of the process, and becomes a concern particularly in high volume environments.

Oracle SOA 11g 11.1.1.6 Performance Tuning of BPEL Processes:BPEL Threading Model


When the dispatcher must schedule a dispatch message for execution, it can enqueue the message into a thread pool. Each dispatch set can contain a thread pool (java.util.concurrent.ThreadPoolExecutor). The BPEL thread pool implementation notifies the threads when a message has been enqueued and ensures the appropriate number of threads are instantiated in the pool.

Dispatcher System Threads

The dspSystemThreads property specifies the total number of threads allocated to process system dispatcher messages. System dispatcher messages are general clean-up tasks that are typically processed quickly by the server (for example, releasing stateful message beans back to the pool). 

Typically, only a small number of threads are required to handle the number of system dispatch messages generated during run time.

The minimum number of threads for this thread pool is 1 and it cannot be set to 0 or a negative number.
The default value is 2. Any value less than 1 thread is changed to the default.

Dispatcher Invoke Threads

The dspInvokeThreads property specifies the total number of threads allocated to process invocation dispatcher messages. Invocation dispatcher messages are generated for each payload received and are meant to instantiate a new instance.
If the majority of requests processed by the engine are instance invocations (as opposed to instance callbacks), greater performance may be achieved by increasing the number of invocation threads. Higher thread counts may cause greater CPU utilization due to higher context switching costs.

The minimum number of threads for this thread pool is 1 and it cannot be set to 0 or a negative number.
The default value is 20 threads. Any value less than 1 thread is changed to the default.



Dispatcher Engine Threads

The dspEngineThreads property specifies the total number of threads allocated to process engine dispatcher messages. Engine dispatcher messages are generated whenever an activity must be processed asynchronously. If the majority of processes deployed are durable with a large number of dehydration points (mid-process receive, onMessage, onAlarm, and wait activities), greater performance may be achieved by increasing the number of engine threads.


Note that higher thread counts can cause greater CPU utilization due to higher context switching costs.

The minimum number of threads for this thread pool is 1 and it cannot be set to 0 or a negative number.
The default value is 30 threads. Any value less than 1 thread is changed to the default.
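
The "less than 1 falls back to the default" rule and the use of java.util.concurrent.ThreadPoolExecutor can be sketched as follows. The class DispatcherPools and its methods are hypothetical, and the fixed-size pool with an unbounded queue is an assumption made for the example, not a description of the engine's actual pool configuration.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; the BPEL engine's own dispatcher classes are not shown here.
class DispatcherPools {
    static final int DEFAULT_SYSTEM_THREADS = 2;
    static final int DEFAULT_INVOKE_THREADS = 20;
    static final int DEFAULT_ENGINE_THREADS = 30;

    /** A configured value below 1 falls back to the default. */
    static int effectiveThreads(int configured, int defaultValue) {
        return configured < 1 ? defaultValue : configured;
    }

    /** Builds a fixed-size pool backed by an unbounded queue (an assumption). */
    static ThreadPoolExecutor newPool(int configured, int defaultValue) {
        int size = effectiveThreads(configured, defaultValue);
        return new ThreadPoolExecutor(size, size, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    public static void main(String[] args) {
        ThreadPoolExecutor invokePool = newPool(0, DEFAULT_INVOKE_THREADS);  // invalid, uses default
        ThreadPoolExecutor enginePool = newPool(50, DEFAULT_ENGINE_THREADS); // explicit value kept
        // prints invoke=20 engine=50
        System.out.println("invoke=" + invokePool.getCorePoolSize()
                + " engine=" + enginePool.getCorePoolSize());
        invokePool.shutdown();
        enginePool.shutdown();
    }
}
```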



Dispatcher Maximum Request Depth

The dspMaxRequestDepth property sets the maximum number of in-memory activities to process within the same request. After processing an activity request, Oracle BPEL Process Manager attempts to process as many subsequent activities as possible without jeopardizing the validity of the request. Once the activity processing chain has reached this depth, the instance is dehydrated and the next activity is performed in a separate transaction.
If the request depth is too large, the total request time can exceed the application server transaction timeout limit. This property is applicable to durable processes.

The default value is 600 activities.
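
As a rough way to reason about this setting, the sketch below counts how many transactions a long chain of in-memory activities would be split into for a given depth. It is a deliberately simplified model with hypothetical names: it assumes the chain is cut only by the depth limit and ignores dehydration points in the process itself.

```java
// Simplified model: counts how many transactions a chain of in-memory
// activities is split into when it is cut every maxRequestDepth activities.
class RequestDepthDemo {
    static final int DEFAULT_MAX_REQUEST_DEPTH = 600;

    static int transactionsNeeded(int activities, int maxRequestDepth) {
        if (maxRequestDepth < 1) {
            throw new IllegalArgumentException("maxRequestDepth must be at least 1");
        }
        if (activities <= 0) {
            return 0;
        }
        return (activities + maxRequestDepth - 1) / maxRequestDepth; // ceiling division
    }

    public static void main(String[] args) {
        System.out.println(transactionsNeeded(1500, DEFAULT_MAX_REQUEST_DEPTH)); // prints 3
        System.out.println(transactionsNeeded(1500, 100));                       // prints 15
    }
}
```

A smaller depth means shorter transactions (less risk of hitting the transaction timeout) at the price of more dehydrations.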








Oracle SOA 11g 11.1.1.6 Performance Tuning of BPEL Processes:Audit

AuditLevel

The auditLevel property sets the audit trail logging level. This configuration property is applicable to both durable and transient processes. This property controls the amount of audit events that are logged by a process. Audit events result in more database inserts into the audit_trail table which may impact performance. Audit information is used only for viewing the state of the process from Oracle Enterprise Manager Console.
Value: Description
- Inherit: Inherits the audit level from infrastructure level.
- Off: No audit events (activity execution information) are persisted and no logging is performed; this can result in a slight performance boost for processing instances.
- Minimal: All events are logged; however, no audit details (variable content) are logged.
- Error: Logs only serious problems that require immediate attention from the administrator and are not caused by a bug in the product. Using this level can help performance.
- Production: All events are logged. The audit details for assign activities are not logged; the details for all other activities are logged.
- Development: All events are logged; all audit details for all activities are logged.

AuditDetailThreshold

The auditdetailthreshold property sets the maximum size (in kilobytes) of an audit trail details string before it is stored separately from the audit trail. If an audit trail details string is larger than the threshold setting, it is not immediately loaded when the audit trail is initially retrieved; a link is displayed with the size of the details string. Strings larger than the threshold setting are stored in the audit_details table, instead of the audit_trail table.
The details string typically contains the contents of a BPEL variable. In cases where the variable is very large, performance can be severely impacted by logging it to the audit trail.
The default value is 50000 (50 kilobytes).

AuditStorePolicy

This property specifies the strategy to persist the BPEL audit data.
Value: Description
- syncSingleWrite (default): AuditTrail and dehydration are persisted to DB in one transaction.
- syncMultipleWrite: AuditTrail and dehydration are persisted in the same thread but separate transactions.
- async: AuditTrail and dehydration are persisted by separate threads and separate transactions.
By default, audit messages are stored as part of the main BPEL transaction. A BPEL instance holds on to the audit messages until the flow reaches dehydration. In some use cases, for example when you have a large loop, and there is no dehydration point in the loop, a large number of audit logs are accumulated. This could lead to an out-of-memory issue and BPEL main transaction can experience timeout errors. You may consider using syncMultipleWrite or async to store the audit message separately from the main transaction.
When you use syncMultipleWrite and async auditStorePolicy, there are a few other properties that need to be considered. Please see the sections below.


AuditFlushByteThreshold

This property controls how often the engine flushes the audit events. After adding an event to the current batch, the engine checks whether the byte size of the current batch is greater than this value.
Consider tuning this property when the async or syncMultipleWrite audit strategies are used. This size needs to be tuned based on the application.


AuditFlushEventThreshold

This property controls how often the engine flushes the audit events. When the number of events in the current batch reaches this limit, the engine triggers the store call.
Consider tuning this property when the async or syncMultipleWrite audit strategies are used. This limit needs to be tuned based on the application.
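How the two flush thresholds interact can be sketched as follows. This is only an illustration of the batching idea described above, not engine code: the class AuditBatch, its store callback, and the choice to measure event size as UTF-8 bytes are assumptions made for the example.

```python
class AuditBatch:
    """Hypothetical batch that flushes on a byte or an event-count threshold."""

    def __init__(self, byte_threshold, event_threshold, store):
        self.byte_threshold = byte_threshold    # cf. AuditFlushByteThreshold
        self.event_threshold = event_threshold  # cf. AuditFlushEventThreshold
        self.store = store                      # callable that persists a list of events
        self.events = []
        self.size = 0

    def add(self, event):
        self.events.append(event)
        self.size += len(event.encode("utf-8"))
        # Flush when the batch is larger than the byte threshold
        # or when the number of events reaches the event threshold.
        if self.size > self.byte_threshold or len(self.events) >= self.event_threshold:
            self.flush()

    def flush(self):
        if self.events:
            self.store(list(self.events))
            self.events.clear()
            self.size = 0


flushed = []
batch = AuditBatch(byte_threshold=10, event_threshold=3, store=flushed.append)
for event in ["aaaa", "bbbb", "cccc", "d", "e"]:
    batch.add(event)
print(flushed)       # prints [['aaaa', 'bbbb', 'cccc']]
print(batch.events)  # prints ['d', 'e']
```

Low thresholds mean many small store calls; high thresholds keep more audit data in memory between flushes, which is why both values depend on the application.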

Oracle SOA 11g 11.1.1.6 Performance Tuning of BPEL: Significant Tips

OneWayDeliveryPolicy

The oneWayDeliveryPolicy property corresponds to the Oracle 10g configuration property deliveryPersistencePolicy.
The new configuration property name is bpel.config.oneWayDeliveryPolicy.
The oneWayDeliveryPolicy property controls database persistence of messages entering Oracle BPEL Server. By default, incoming requests are saved in the delivery service database table dlv_message. These requests are later acquired by Oracle BPEL Server worker threads and delivered to the targeted BPEL process. This property persists delivery messages and is applicable to durable processes.
One-way invocation messages are stored in the delivery cache until delivered. When the oneWayDeliveryPolicy property is set to async.cache, messages may be lost if the rate at which one-way messages arrive is much higher than the rate at which Oracle BPEL Server delivers them, or if the server fails. In addition, the system can become overloaded (messages become backlogged in the scheduled queue) and you may receive out-of-memory errors. Consult your own use case scenarios to determine if this setting is appropriate.
Value: Description
- async.persist (default): Delivery messages are persisted in the database. With this setting, reliability is obtained with some performance impact on the database. In some cases, overall system performance can be impacted.
- async.cache: Incoming delivery messages are kept only in the in-memory cache. If performance is preferred over reliability, this setting should be considered.
- sync: Directs Oracle BPEL Server to bypass the scheduling of messages in the invoke queue, and invokes the BPEL instance synchronously. In some cases this setting can improve database performance.


MaximumNumberOfInvokeMessagesInCache

This property specifies the number of invoke messages that can be kept in the in-memory cache. Once the engine hits this limit, it no longer pushes the message to the dispatcher in-memory cache; instead, it saves the message in the database, and these saved messages can be recovered using the recovery job. You can use the value -1 to disable this property.
The default value is 100000 messages.
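The overflow behaviour can be pictured with a minimal sketch. It is a toy model under stated assumptions: the class InvokeDispatcher and its two lists are hypothetical stand-ins for the in-memory cache and the database, and it assumes that -1 means the limit is not enforced.

```python
class InvokeDispatcher:
    """Hypothetical model of a bounded in-memory cache that overflows to the database."""

    def __init__(self, max_in_cache=100000):
        self.max_in_cache = max_in_cache
        self.cache = []      # stand-in for the dispatcher in-memory cache
        self.persisted = []  # stand-in for messages saved in the database

    def accept(self, message):
        limit_disabled = self.max_in_cache == -1  # assumption: -1 switches the limit off
        if limit_disabled or len(self.cache) < self.max_in_cache:
            self.cache.append(message)
            return "cache"
        # Limit reached: keep the message only in the database,
        # where a recovery job can pick it up later.
        self.persisted.append(message)
        return "db"


dispatcher = InvokeDispatcher(max_in_cache=2)
print([dispatcher.accept(m) for m in ["m1", "m2", "m3"]])  # prints ['cache', 'cache', 'db']
```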



StatsLastN

The StatsLastN property sets the size of the most-recently processed request list. After each request is finished, statistics for the request are kept in a request list. A value less than or equal to 0 disables statistics gathering. To optimize performance, consider disabling statistics collection if you do not need them.
This property is applicable to both durable and transient processes.
The default value is -1.


LargeDocumentThreshold

The largedocumentthreshold property sets the large XML document persistence threshold. This is the maximum size (in kilobytes) of a BPEL variable before it is stored in a separate table from the rest of the instance scope data.
This property is applicable to both durable and transient processes.
Large XML documents impact the performance of the entire Oracle BPEL Server if they are constantly read in and written out whenever processing on an instance must be performed.
The default value is 10000 (100 kilobytes).


Validate XML

The validateXML property validates incoming and outgoing XML documents. If set to True, the Oracle BPEL Process Manager applies schema validation for incoming and outgoing XML documents. Nonschema-compliant payload data is intercepted and displayed as a fault.
This setting is independent of the SOA composite application and SOA Infrastructure payload validation level settings. If payload validation is enabled at both the service engine and SOA Infrastructure levels, data is checked twice: once when it enters the SOA Infrastructure, and again when it enters the service engine.
CAUTION: Enabling XML payload validation can impact performance.
This property is applicable to both durable and transient processes.
The default value is False.



SyncMaxWaitTime

The SyncMaxWaitTime property sets the maximum time the process result receiver waits for a result before returning. Results from asynchronous BPEL processes are retrieved synchronously by a receiver that waits for a result from Oracle BPEL Server.
The default value is 45 seconds.


InstanceKeyBlockSize

The InstanceKeyBlockSize property controls the instance ID range size. Oracle BPEL Server creates instance keys (a range of process instance IDs) in batches using the value specified. After creating this range of in-memory IDs, the next range is updated and saved in the ci_id_range table.
For example, if instanceKeyBlockSize is set to 100, Oracle BPEL Server creates a range of instance keys in-memory (100 keys, which are later inserted into the cube_instance table as cikey). To maintain optimal performance, ensure that the block size is larger than the number of updates to the ci_id_range table.
The default value is 10000.
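The block allocation described above can be sketched as follows. This is a simplified model, not the server's implementation: the class InstanceKeyAllocator is hypothetical, the attribute next_range_start stands in for the value kept in the ci_id_range table, and the counter range_updates only exists to show how often that table would be written.

```python
class InstanceKeyAllocator:
    """Hypothetical allocator that reserves instance keys in blocks."""

    def __init__(self, block_size=10000):
        self.block_size = block_size
        self.next_range_start = 1  # stand-in for the value stored in ci_id_range
        self.range_updates = 0     # how often the stored range was updated
        self._next_key = 0
        self._range_end = 0        # exclusive upper bound of the in-memory block

    def _reserve_block(self):
        # Take the next block of IDs into memory and save the start of the
        # following block (one update of the range table per block).
        self._next_key = self.next_range_start
        self._range_end = self._next_key + self.block_size
        self.next_range_start = self._range_end
        self.range_updates += 1

    def next_key(self):
        if self._next_key >= self._range_end:
            self._reserve_block()
        key = self._next_key
        self._next_key += 1
        return key


allocator = InstanceKeyAllocator(block_size=100)
keys = [allocator.next_key() for _ in range(250)]
print(keys[0], keys[-1], allocator.range_updates)  # prints 1 250 3
```

With a block size of 100, creating 250 keys costs three range updates in this model; a larger block size reduces the number of updates for the same number of instances.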


MaxRecoverAttempt

The MaxRecoverAttempt property configures the number of automatic recovery attempts to submit in the same recoverable instance. The value you provide specifies the maximum number of times invoke and callback messages are recovered. Once the number of recovery attempts on a message exceeds the specified value, the message is marked as nonrecoverable.
When a BPEL instance makes a call to another server using invokeMessage, and that call fails because the server is down or because of a validation error or security exception, the invokeMessage is placed in a recovery queue and BPEL attempts to retry those messages. When there are many messages, and a majority of them are being sent to the same target, the target can become overloaded. Setting an appropriate value for MaxRecoverAttempt prevents excessive load on servers that are targeted by BPEL web service calls.
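The capped retry behaviour can be sketched in a few lines. This is an illustration only: the function recover and the deliver callback are hypothetical, and the real recovery job works on queued messages rather than in a simple loop.

```python
def recover(message_id, deliver, max_recover_attempt):
    """Retry delivery up to max_recover_attempt times (hypothetical model)."""
    for attempt in range(1, max_recover_attempt + 1):
        if deliver(message_id):
            return ("delivered", attempt)
    # All allowed attempts are used up: stop retrying so the target
    # is no longer hit with the same message.
    return ("nonrecoverable", max_recover_attempt)


calls = {"count": 0}

def flaky_target(message_id):
    # Example target that fails twice and then accepts the message.
    calls["count"] += 1
    return calls["count"] >= 3

print(recover("msg-1", flaky_target, max_recover_attempt=2))  # prints ('nonrecoverable', 2)
print(recover("msg-1", flaky_target, max_recover_attempt=2))  # prints ('delivered', 1)
```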




Oracle SOA Suite 11.1.1.6: Monitoring SOA Composite Behavior with SQL Queries


The standard information obtained from Oracle Enterprise Manager Fusion Middleware Control might not be sufficient for fine-grained monitoring. By querying some core product tables in the [PREFIX]_SOAINFRA schema, such as the COMPOSITE_INSTANCE, CUBE_INSTANCE, and MEDIATOR_INSTANCE tables, you can get detailed metrics that include success/failure counts, composite instance performance, and durations of invokes. Here, we provide two main queries to obtain performance metrics on BPEL processes and Mediator services, specifically the time that each component took. Though Oracle typically does not recommend querying the product tables directly (since the structure of the tables may change after a patch or upgrade), the queries below run fine on Oracle SOA Suite 11g PS3 (11.1.1.4), PS4 (11.1.1.5), and PS5 (11.1.1.6).


The following query outputs a list of all BPEL component instances, their state,
average, minimum, and maximum durations, as well as counts:

-- Duration statistics (in seconds) per BPEL component and state.
-- Hours, minutes, and seconds are cut out of the interval string by position.
SELECT DOMAIN_NAME PARTITION, COMPONENT_NAME,
DECODE(STATE,'1','RUNNING','5','COMPLETED','6','FAULTED','9','STALE') STATE,
TO_CHAR(AVG((TO_NUMBER(SUBSTR(TO_CHAR(MODIFY_DATE-CREATION_DATE),12,2))*60*60) +
(TO_NUMBER(SUBSTR(TO_CHAR(MODIFY_DATE-CREATION_DATE),15,2))*60) +
TO_NUMBER(SUBSTR(TO_CHAR(MODIFY_DATE-CREATION_DATE),18,4))),'999990.000') AVG,
TO_CHAR(MIN((TO_NUMBER(SUBSTR(TO_CHAR(MODIFY_DATE-CREATION_DATE),12,2))*60*60) +
(TO_NUMBER(SUBSTR(TO_CHAR(MODIFY_DATE-CREATION_DATE),15,2))*60) +
TO_NUMBER(SUBSTR(TO_CHAR(MODIFY_DATE-CREATION_DATE),18,4))),'999990.000') MIN,
TO_CHAR(MAX((TO_NUMBER(SUBSTR(TO_CHAR(MODIFY_DATE-CREATION_DATE),12,2))*60*60) +
(TO_NUMBER(SUBSTR(TO_CHAR(MODIFY_DATE-CREATION_DATE),15,2))*60) +
TO_NUMBER(SUBSTR(TO_CHAR(MODIFY_DATE-CREATION_DATE),18,4))),'999990.000') MAX,
COUNT(1) COUNT
FROM CUBE_INSTANCE
GROUP BY DOMAIN_NAME, COMPONENT_NAME, STATE
ORDER BY COMPONENT_NAME, STATE
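The SUBSTR offsets in the query are easier to follow with a worked example. The sketch below mirrors the same arithmetic in Python; it assumes that the difference between the two timestamps is rendered as an interval string of the form '+000000000 01:02:03.456789' (nine day digits), and the function name interval_to_seconds is made up for this illustration.

```python
def interval_to_seconds(interval_text):
    """Mirror the query's SUBSTR arithmetic on an interval string (assumed format)."""
    hours = int(interval_text[11:13])      # SUBSTR(..., 12, 2)
    minutes = int(interval_text[14:16])    # SUBSTR(..., 15, 2)
    seconds = float(interval_text[17:21])  # SUBSTR(..., 18, 4), e.g. '03.4'
    return hours * 60 * 60 + minutes * 60 + seconds


print(interval_to_seconds("+000000000 01:02:03.456789"))
```

Two properties of this approach follow directly from the offsets: the day part of the interval is ignored, so instances that run longer than 24 hours are under-reported, and the seconds are cut off after one decimal.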


The following query displays a list of all Mediator component instances, their state,
average, minimum, and maximum durations, as well as counts:

-- Duration statistics (in seconds) per Mediator component, action, and state.
SELECT SUBSTR(COMPONENT_NAME, 1, INSTR(COMPONENT_NAME,'/')-1) PARTITION,
SUBSTR(COMPONENT_NAME, INSTR(COMPONENT_NAME,'/')+1,
INSTR(COMPONENT_NAME,'!')-INSTR(COMPONENT_NAME,'/')-1) COMPONENT,
SOURCE_ACTION_NAME ACTION,
DECODE(COMPONENT_STATE,'0','COMPLETED','2','FAULTED','3','ABORTED',
'4','RECOVERY NEEDED','8','RUNNING','16','STALE') STATE,
TO_CHAR(AVG((TO_NUMBER(SUBSTR(TO_CHAR(UPDATED_TIME-CREATED_TIME),12,2))*60*60) +
(TO_NUMBER(SUBSTR(TO_CHAR(UPDATED_TIME-CREATED_TIME),15,2))*60) +
TO_NUMBER(SUBSTR(TO_CHAR(UPDATED_TIME-CREATED_TIME),18,4))),'999990.000') AVG,
TO_CHAR(MIN((TO_NUMBER(SUBSTR(TO_CHAR(UPDATED_TIME-CREATED_TIME),12,2))*60*60) +
(TO_NUMBER(SUBSTR(TO_CHAR(UPDATED_TIME-CREATED_TIME),15,2))*60) +
TO_NUMBER(SUBSTR(TO_CHAR(UPDATED_TIME-CREATED_TIME),18,4))),'999990.000') MIN,
TO_CHAR(MAX((TO_NUMBER(SUBSTR(TO_CHAR(UPDATED_TIME-CREATED_TIME),12,2))*60*60) +
(TO_NUMBER(SUBSTR(TO_CHAR(UPDATED_TIME-CREATED_TIME),15,2))*60) +
TO_NUMBER(SUBSTR(TO_CHAR(UPDATED_TIME-CREATED_TIME),18,4))),'999990.000') MAX,
COUNT(1) COUNT
FROM MEDIATOR_INSTANCE
GROUP BY COMPONENT_NAME, SOURCE_ACTION_NAME, COMPONENT_STATE
ORDER BY COMPONENT_NAME, SOURCE_ACTION_NAME, COMPONENT_STATE