Thursday, December 12, 2013

First steps into the Oracle Database Cloud

Oracle provides a Database Cloud Service. In a previous post I've looked at the Oracle Java Cloud Service. The database is of course also an important component used in most applications. In this blog post I'll describe my first experiences with the Oracle Database Cloud Service. I've used two methods to connect to the Oracle Database Cloud Service. The first is from SQL Developer. Next I created a webservice, deployed it to the Oracle Java Cloud Service and fetched data from the Oracle Database Cloud Service with it.

There are of course other methods to interact with the Oracle Database Cloud Service. It is for example possible to use SQL Workshop from the APEX interface to expose RESTful services for accessing the database. By default, calls to the Oracle Cloud are encrypted, which needs to be taken into account when calling these services.

Friday, December 6, 2013

Oracle BAM. Looking at different integration options

In order to monitor process flow, Oracle Business Activity Monitoring (BAM) can be used. Oracle BAM is part of the Oracle SOA Suite. What are the options for implementing Oracle BAM in the context of Oracle SOA composites, and when should you use which method to integrate with BAM? In this post I'll describe the different methods of integration with Oracle BAM and the work required to get each of them working. The screenshot below illustrates what an Oracle BAM dashboard looks like.

Sunday, December 1, 2013

First steps into the Oracle Java Cloud

On-premise integration has been around for quite a while, and Oracle SOA Suite provides many tools to help customers accomplish it. The area of integration however is expanding; more customers are starting to use cloud services. Integration with cloud services differs in several aspects from on-premise integration. Management of servers/accounts differs: usually there is a limited interface which the cloud provider offers to customize the behavior/scaling of virtual servers/services. Development also differs: you deploy 'to the cloud' and not to a local (on the same network) server. Automation of a business process expands beyond the borders of the business firewall, so security and identity management become more important.

Since this is my first blog post about the Oracle Cloud, I will not go into much detail here but will describe my experience with creating a trial account for the Oracle Java Cloud, deploying a simple helloworld webservice and calling it from outside the cloud.

Friday, November 15, 2013

Analyzing instances in an Oracle SOA environment: linking composite instances via references

To analyse a running environment, it is useful to know which process calls which other process and how often it does so (which instance is initiated by which reference from which other process). This provides insight into how composites are linked and thus how process flows are implemented. In a previous post I've looked at Oracle Business Transaction Management (BTM) to achieve this insight. Oracle BTM can't analyse local invocations and requires quite some work to install and set up correctly. Also a license is required, for example the SOA Management Pack Enterprise Edition, in order to use this product. To make monitoring composite instances and their relations possible without this product, the dehydration store can be queried. In this post I'll describe a method for how this can be done. Mind that this method is not fully tested and certainly not supported by Oracle. Use with caution!

Saturday, November 2, 2013

A first look at Oracle Business Transaction Management (BTM)

Monitoring and debugging Oracle SOA Suite environments is often a topic not paid much attention to. The people who write the software are often not much involved in the maintenance of the running software and the maintenance people do not have much application knowledge.

Usually Oracle SOA applications are composed of several components which interact with each other. These components are often of different technologies. Personally I tend to use databases, BPEL processes and Java Webservices a lot. For the maintenance people it is difficult to understand all application call chains and debug these in case of problems.

I was curious whether the Oracle BTM product would provide a solution for this; BTM is part of the SOA Management Pack Enterprise Edition. It used to be a product from AmberPoint before Oracle bought them in 2010. I've used SOA Suite with BTM on an XE 11gR2 database for this tryout. I did not read much documentation about this product nor follow any courses, so it would also be a test of intuitiveness.

Saturday, October 19, 2013

Java Unit testing: Mocking JNDI, final classes, statics and private parts

To improve quality of an application, it is a good practice to provide unit tests together with the code which is written. This makes it easier to determine the impact of changes and fix them early during development.

Writing unit tests can be challenging. Often applications run inside an application server which provides services, while a unit test runs outside of the server. Applications can depend on the services provided by the application server, which makes testing outside of this scope difficult. One of those services is JNDI. JNDI makes it possible for an application to look up, for example, a datasource. Luckily JNDI can also be provided outside of application server scope.

To make mocking methods in unit tests easier, the Mockito framework can be used. Mockito however uses a subclassing mechanism which does not work when dealing with final classes. Static and private methods cannot be mocked either because of this. PowerMock provides solutions for those issues.

Monday, September 23, 2013

Sending HTML reports of JMeter integration test results from Maven

In this post I'll describe how I automated executing integration tests by using JMeter for several environments and sending integration test reports (HTML).

To accomplish this I've used Maven with several plugins;

- jmeter-maven-plugin for executing JMeter tests from Maven
- xml-maven-plugin for transforming the resulting JTL file to a readable HTML report
- exec-maven-plugin for calling scripts
- maven-postman-plugin for sending e-mails

I've also used conditional profiles which act on the result of previous build steps to only send a mail in case of errors. It has been argued that you shouldn't do this ('Using a profile should not result in a different artifact than building without the profile.'), but I did it anyway because it seemed useful in this case.

I encountered several challenges which I will describe in this post and how I fixed them. The sample code can be downloaded here;

Friday, September 6, 2013

Inform the DBA if Weblogic can't connect to a database. Automated by using Jenkins, Maven and WLST.

As described in a previous post, by using a WLST script it is possible to check if a WebLogic server can connect to all databases which are configured in datasources. In this post I'll describe how this script can be used in Jenkins to automatically inform the DBA to fix their database issues. The focus will be on the left part of the below image.
The SSHExec Ant task is used to connect to a Linux/Unix machine where the WebLogic server is running. The SSHExec Ant task then executes the WLST script which checks datasource/database availability (for every managed server). To call the task with different parameters for the different environments, Maven profiles are used in a wrapper POM file. This POM is then called by using the Maven invoker plugin for every profile. In order to not make the build fail on the first database which is down, the log file in Jenkins is parsed afterwards using the Logparser plugin.

The Maven project, script and log parsing rules can be downloaded here;

Friday, August 16, 2013

WLST; obtaining parameters, recovering JDBC database user passwords and testing database connections

WLST (Weblogic Scripting Tool) is very powerful. Most of the things (and more) which can be done with the Weblogic console can also be done by means of WLST scripting.

I've already written a post describing two options for datasource monitoring. The methods described in that post have some drawbacks:
- by using a servlet, you are exposing server datasource status and you are using a custom developed servlet to achieve the functionality. The servlet does not show connection errors, just OK or NOK. Also, it does not take into account different managed servers and datasource targets.
- by using WLST as in the example in that post, you're not actually testing creating a connection but are just monitoring the current status.

What the script should do

At a customer I noticed database availability was an issue. This often caused data sources to go to the 'Suspended' state. Also, when the database was available again, we often encountered connection errors like 'ORA-12514: TNS:listener does not currently know of service requested in connect descriptor', 'ORA-01033: ORACLE initialization or shutdown in progress' and 'ORA-01017: invalid username/password; logon denied'. The customer used a multitude of datasources. We wanted a quick way to resume every one of them and determine connection exceptions in order to inform the DBA to fix them. Since the script would run on several machines, it should be able to determine the IP to connect to and the required paths (to for example SerializedSystemIni.dat) on its own.

In the below image, the 'call chain' is illustrated. This post focuses on the WLST script. In a second post I'll describe how we automated the process of calling the script over SSH by using an Ant plugin in Maven on several environments, so we could schedule this to run every morning in Jenkins and automatically mail the DBAs to go and fix their databases.
How the script is implemented

Obtaining local information
Since the script needed to be as environment neutral as possible, I obtained several pieces of information from the machine the script runs on.

Obtaining the physical interface IP address
I obtained the IP address of the local machine by using the ip command;

intf = 'eth0'
intf_ip = commands.getoutput("/sbin/ip address show dev " + intf).split()
intf_ip = intf_ip[intf_ip.index('inet') + 1].split('/')[0]
print 'Using IP: ',intf_ip

Obtaining the path to SerializedSystemIni.dat
I needed to obtain the path to SerializedSystemIni.dat for decrypting passwords (see later in this post). I obtained the path after connecting;


secdir is the directory where the SerializedSystemIni.dat file is usually located.
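A minimal sketch of deriving this path, assuming the conventional <domain home>/security location. The DOMAIN_HOME environment variable and the fallback path below are illustrative; in a live WLST session the domain home can be read from the connected domain instead:

```python
import os

# Hypothetical sketch: SerializedSystemIni.dat conventionally lives in
# <domain home>/security. The DOMAIN_HOME variable and the fallback path
# are illustrative, not taken from the original script.
domain_home = os.environ.get('DOMAIN_HOME', '/u01/app/oracle/domains/mydomain')
secdir = os.path.join(domain_home, 'security')
secfile = os.path.join(secdir, 'SerializedSystemIni.dat')
```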

Obtaining parameters
I wanted my script to be flexible. I wanted to have the option to use a properties file (useful for the development environment), to use command-line arguments (useful when calling the script from Maven), and to have the script ask for login details otherwise (useful for a DBA executing the script). This required three methods of obtaining parameters. The logic used was as follows;

- do we have a file? If so, use it; else continue
- do we have command-line arguments? If so, use them; if not, ask for them

Command line arguments
In WebLogic 10.3.6, Python/Jython 2.2.1 is used. This limits the use of several of the more recent Python libraries that make working with parameters easier (such as optparse and argparse). We can however use getopt;
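A minimal getopt sketch of this kind of parsing, as available under Jython 2.2.1 (the example values are illustrative):

```python
import getopt

# Sketch of getopt-based credential parsing; option names mirror the
# script later in this post, the argument values are illustrative.
def parse_credentials(argv):
    var_user, var_pass = None, None
    opts, args = getopt.getopt(argv, "u:p:", ["username=", "password="])
    for o, a in opts:
        if o in ("-u", "--username"):
            var_user = a
        elif o in ("-p", "--password"):
            var_pass = a
    return var_user, var_pass

user, pw = parse_credentials(["--username", "weblogic", "--password", "welcome1"])
```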

The approach described above also covers asking for the parameters if they were not supplied. If this fails for whatever reason, raw_input can be used.

Properties file
Using a properties file was relatively easy;

Because we are using Jython with WLST, we can use the Properties class from java.util. No need to reinvent the wheel here.
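Under WLST/Jython the parsing is done by java.util.Properties loaded from a FileInputStream. The sketch below shows the same key=value idea in plain Python for illustration (the file contents and keys are made up):

```python
# Plain-Python sketch of what java.util.Properties.load() does for the
# simple key=value case; under WLST/Jython the Properties class itself
# would be used. The sample contents and keys are illustrative.
def load_properties(lines):
    props = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#') or '=' not in line:
            continue
        key, value = line.split('=', 1)
        props[key.strip()] = value.strip()
    return props

sample = [
    "# credentials for the development environment",
    "username=weblogic",
    "password=welcome1",
]
props = load_properties(sample)
```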

Obtaining login details and testing connections
When a datasource is suspended and the connection pool is tested, you can get the following exception;
Connection test failed with the following exception: weblogic.common.resourcepool.ResourceDisabledException: Pool testuserNonXa is Suspended, cannot allocate resources to applications.

When however a connection cannot be made, you can resume the datasource and test it and nothing will appear to be wrong. In the log file however you might see one of the previously mentioned exceptions; you won't be able to use the datasource to get to the database. So testing just the datasource from the Weblogic console is not enough to confirm it is working.

To determine if a connection could be made, I wanted to create a new connection to the database while not using the datasource, but with the same connection details. For this I needed to obtain the connection information the datasource was using. Then I encountered the following challenge; the cleartext passwords could not be read due to WebLogic server policies: "Access to sensitive attribute in clear text is not allowed due to the setting of ClearTextCredentialAccessEnabled attribute in SecurityConfigurationMBean".

I did not want to change this policy as it would introduce a security vulnerability, so I had to decode the passwords. Based on several blog posts I could recover the database user passwords (which is of course very useful...). As you can see in the code below, you can uncomment a line to display the passwords in plain text on the console.

First I specified the path where the SerializedSystemIni.dat file could be found. Then I used that to decrypt the encrypted passwords I obtained from the MBeans. Then I used zxJDBC to connect to the database using the obtained credentials;

The script

Mind the indentation! It's Jython. You should execute this with the script in your WebLogic server installation.

import commands
import os
import sys
import traceback
import getopt

from com.ziclix.python.sql import zxJDBC
from import FileInputStream
from java.util import Properties
from weblogic.security.internal import SerializedSystemIni
from weblogic.security.internal.encryption import ClearOrEncryptedService

# Determine the IP address of this machine from the physical interface
intf = 'eth0'
intf_ip = commands.getoutput("/sbin/ip address show dev " + intf).split()
intf_ip = intf_ip[intf_ip.index('inet') + 1].split('/')[0]
print 'Using IP: ', intf_ip

# Obtain the WebLogic credentials: first try a file, then
# --username/--password command-line arguments, otherwise ask for them
    fh = open("", "r")
    print 'Using'
    propInputStream = FileInputStream("")
    configProps = Properties()
    var_user = configProps.getProperty("username")
    var_pass = configProps.getProperty("password")
except IOError:
        opts, args = getopt.getopt(sys.argv[1:], "", ["username=", "password="])
        var_user = None
        var_pass = None
        for o, a in opts:
            if o == "--username":
                var_user = a
                print 'User: ', var_user
            elif o == "--password":
                var_pass = a
                print 'Pass: ', var_pass
            else:
                assert False, "unhandled option"
        if var_user is None or var_pass is None:
            raise getopt.GetoptError("missing --username or --password")
    except getopt.GetoptError, err:
        print 'No --username and --password command-line arguments and no'
        var_user = raw_input("Enter user: ")
        var_pass = raw_input("Enter pass: ")

# Connect to the AdminServer and prepare the decryption service based on
# SerializedSystemIni.dat in the domain security directory
connect(var_user, var_pass, 't3://' + intf_ip + ':7001')
secdir = os.path.join(cmo.getRootDirectory(), 'security')
encryptionService = SerializedSystemIni.getEncryptionService(secdir)
ces = ClearOrEncryptedService(encryptionService)

domainRuntime()
allServers = domainRuntimeService.getServerRuntimes()

if (len(allServers) > 0):
    for tempServer in allServers:
        print 'Processing: ', tempServer.getName()
        jdbcServiceRT = tempServer.getJDBCServiceRuntime()
        dataSources = jdbcServiceRT.getJDBCDataSourceRuntimeMBeans()
        if (len(dataSources) > 0):
            for dataSource in dataSources:
                #print 'Resuming: ', dataSource.getName()
                    dataSource.resume()
                    pass  # already running
                # Obtain the connection details from the datasource configuration
                serverConfig()
                cd('/JDBCSystemResources/' + dataSource.getName() + '/JDBCResource/' + dataSource.getName() + '/JDBCDriverParams/' + dataSource.getName() + '/Properties/' + dataSource.getName())
                dbuser = cmo.lookupProperty('user').getValue()
                #print 'User: ', dbuser
                cd('/JDBCSystemResources/' + dataSource.getName() + '/JDBCResource/' + dataSource.getName() + '/JDBCDriverParams/' + dataSource.getName())
                dburl = cmo.getUrl()
                #print 'DbUrl: ', dburl
                dbpassword = cmo.getPasswordEncrypted()
                dbpassword_decrypted = str(ces.decrypt("".join(map(chr, dbpassword))))
                #print 'DbPassword: ', dbpassword_decrypted
                dbdriver = cmo.getDriverName()
                #print 'DbDriverName: ', dbdriver
                # Create a fresh connection with the same details to confirm
                # the database behind the datasource can actually be reached
                    dbconn = zxJDBC.connect(dburl, dbuser, dbpassword_decrypted, dbdriver)
                    cursor = dbconn.cursor()
                    result = cursor.execute('select sysdate from dual')
                except zxJDBC.DatabaseError:
                    print 'ERROR: Url: ', dburl, ' User: ', dbuser

Saturday, August 10, 2013

Book review; Oracle SOA Suite 11g Performance Tuning Cookbook

When implementing Oracle SOA 11g, performance is an often encountered topic. I have been involved at several customers in performance taskforces and I've read several books and a lot of Oracle documentation related to the topic. As such I was very interested in the book Oracle SOA Suite 11g Performance Tuning Cookbook. I was curious as to whether this book would contribute much to the topic and if I would learn new and interesting things.

I can be a bit of a nit-picker at times, and I personally have not written any books yet. For every chapter I've noted something positive and something which I probably would have done differently (or, once I know the reason why things are done a certain way, I might agree with it). This is not meant as personal criticism of the authors, as they clearly know what they are writing about.

Review per chapter

Chapter 1 Identifying problems
The first chapter is focused on how to identify problems. Problems can be high CPU usage (due to for example garbage collection) and for example slow database queries.

VisualVM was new to me, and the analysis of stuck threads is something I might also use in the near future. JRockit Mission Control was already familiar to me, but it will be useful for others.

Because the book is in cookbook format and every recipe stands more or less on its own, there is some repetition in, for example, how to use jps and jstat. I would have combined those recipes. What I was missing a bit here is concrete examples: what does the output of the commands look like? Also, 'the survivor spaces and Eden' are briefly introduced here while they are a major topic in Chapter 5. The discussion on how to find problems with datasources could have been moved to Chapter 7.

Chapter 2 Monitoring Oracle SOA Suite
This chapter describes how VMware's vFabric Hyperic HQ can be used to monitor WebLogic instances. Hyperic has an open source version.

The choice for Hyperic is well supported by arguments, and the description of how to get it installed and configured is thorough. It looks like an interesting tool and works by installing an agent on the system to monitor. I'm currently using Nagios to monitor environments on a per (web)service level. I would have to try out Hyperic in order to determine if it can do that as well.

Monitoring is focused on OS/WebLogic health and not specifically on Oracle SOA Suite. The DMS servlet is also discussed, though only briefly. I would either have left it out of the chapter to focus on Hyperic, or expanded it a little, for example by adding how it can be purged.

Chapter 3 Performance testing
JMeter is used to do performance testing on webservices.

Interesting topics were mainly distributed tests and the part on testing Amazon cloud services. Both were new to me.

I'm used to using SoapUI for webservice tests and HermesJMS for JMS tests. Neither is discussed; it's most likely a matter of taste. The choice for JMeter is not supported by arguments. I also missed performance testing of JMS queues here.

Chapter 4 JVM Memory
This chapter describes how the JVM uses memory and how this can be monitored/tuned.

This chapter contains a lot of explanation on the memory settings. This is something which I found often lacking in other books. 'A good rule of thumb is to stick to powers of 2', which is a good tip. Also; 'Note that by default the flag JAVA_USE_64BIT is set to false in %WL_HOME%\common\bin\commEnv.cmd.' is something I'll check out. The picture on JVM memory was clarifying. The division in JVM settings for JRockit and Hotspot is also useful.

This chapter also contains parts on JRockit Mission Control and VisualVM. Although these are very useful, it could have been combined with the part in Chapter 1.

Chapter 5 JVM Garbage Collection Tuning
The different garbage collection strategies for JRockit and Hotspot are explained here and how they can be tuned.

Disabling the RMI garbage collector was something new to me. Also disabling explicit garbage collection will be interesting to try and see if it makes a difference. Although applications shouldn't be using the System.gc() command, there are always programmers who sneak such code in.

Little to remark here; I liked reading the chapter. There was a bit more focus on Hotspot than on JRockit.

Chapter 6 Platform Tuning
This chapter describes options for tuning the platform. It is difficult to find a common denominator or focus point for the topics discussed in this chapter. The title fits.

JTA max transactions, HTTP accept backlog, Stuck Thread Time and HugePages are settings I'll try tuning at customers. Reducing OS swappiness is interesting, as I've found that swap file usage can reduce performance dramatically. I was also reminded that the OS requires specific settings; I'll check if my customers adhere to the Oracle suggested settings. Of course the more often mentioned settings, such as tuning EJB timeouts, are also present in this book.

First the stack is explained: SOA Suite runs on an application server, the application server runs on a JVM, the JVM runs on an OS, and the OS runs on hardware. The chapter then does not follow a top-down or bottom-up approach in describing tuning options (or I've missed this). Suggesting to upgrade the JVM and use for example Java 1.7 is dangerous, as I'm not sure Oracle supports that in combination with Oracle SOA Suite 11g. Describing tuning options on OS level could have been expanded a bit; for example, I'm curious about specific tuning options for different Linux distributions.

Chapter 7 Data Sources and JMS
This chapter describes tuning of JMS and datasources.

Good to know; the settings for datasource testing can reduce stale connection issues. Also when tuning XA datasources, the database has to be tuned for this also; distributed_lock_timeout. The part on JMS tuning was relatively new to me but I am certain to use the knowledge when the need arises. Also a lot on JDBC datasource tuning was already familiar to me but useful for others.

It would have helped if the error messages which can occur when certain settings are incorrectly or sub-optimally configured had been added. This might make it easier to find the right recipe when a certain problem occurs.

Chapter 8 BPEL and BPMN Engine Tuning
This chapter describes tuning of the BPEL and BPMN engine.

Not much new here for me. The most important settings are covered in my opinion. The tuning of the Db settings is something I'll go check at customers as the SOAINFRA performance is often a bottleneck.

Purging is a hot topic and could have been more thoroughly explained. It would have been nice if the enhancements were also discussed, but since they are relatively new, one can't blame the authors for that. SOAINFRA tablespace defragmentation is not described (the rebuild index, shrink space, enable row movement kind of statements). There were no BPMN engine specific tuning suggestions.

Chapter 9 Mediator and BAM
This chapter describes tuning options for the Mediator and BAM.

BAM tuning is not described often. There are a lot of familiar settings for the Mediator, but they are useful for people who are new to it.

It is a relatively short chapter and could have been combined with Chapter 10.

Chapter 10 Rules and Human Workflow
Optimization options for business rules and human workflow are discussed here. I have little experience with the components.

Choosing the right client to access the workflow component is interesting. Also how the rule engine can be used efficiently is something I will take with me when the need arises to use the component.

Not much to remark here except the same as in Chapter 9.

Chapter 11 SOA Application Design
Application design has a great impact on performance. Several suggestions are given in this chapter which can help.

Although this chapter does not appear to have been the main focus of the book, it does touch several of the more important areas to look for performance improvements.

I could write an entire book on the topic if time allowed me. I have also done several measurements in order to check whether specific settings actually increase performance. The training I've recently followed also gave several best practices on the topic. What I found lacking was programming practices: which patterns to use and which to avoid, and for example how to process large batches efficiently. I also missed how to code processes with transaction boundaries in mind.

Chapter 12 High Performance Configuration
This chapter gives suggestions for how a configuration can help increase performance. It is focused on OHS, JMS, Virtualization and hardware.

The parts on virtualization and hardware requirements I will use as arguments when talking to the people responsible for that part at my current customer, as some improvements might be a good idea.

The JMS part could have been put in the JMS chapter. Clustering issues are barely touched such as local optimization settings and issues with the EIS adapters. I would have expected to find them in this chapter.


I liked this book. I have learned some new things I can use at customers. This book also does not repeat the manuals and other books much, which makes it interesting. It is the first book I have seen that focuses completely on performance and tuning of Oracle SOA Suite 11g. Not only does the book contain the usual performance tips and several new ones, but it also provides suggestions for tuning in a virtualized environment (including a suggestion on how to measure cloud performance) and hardware.

I am no fan of the recipe format and would have grouped some things differently to describe a more bottom up or top down approach to tuning a SOA Suite installation. Especially JVM tuning gets a lot of attention. It is clear the authors know what they write about. Also the writing style makes this book an easy and enjoyable read. I did not get bored.

I missed tuning of the OSB a bit, and maybe some BPEL programming practices or patterns which help improve performance. What I also missed was tuning from a high level down to a single BPEL process: how to bridge the gap from JVM measures to BPEL process instances. There is not much focus on dehydration store maintenance, which can also be an important factor in improving performance. Clustering issues are barely touched.

On the whole this was an interesting read with a lot of useful suggestions.

Thursday, July 25, 2013

Oracle SOA Blackbelt training June 2013 Berlin

The Oracle SOA Blackbelt training in Berlin in June of 2013 has provided me with some valuable new insights into various topics related to Oracle SOA Suite. Below are some examples of things which I found interesting to share. These are mostly not literally from the slides but written down in my own words. Some examples have been expanded a bit by additional resources I've found. This is not a complete list as the training covers quite a lot of material. I've focused on topics which can relatively easily be implemented or considered. The training also covered a lot of background which is harder to summarize and make concrete in practices/suggestions. The topics are various and not written down in a particular order.

SOA Best practices

The presentation on SOA best practices contained a lot of good suggestions. These are a few of them.

Problem; Over usage of dehydration causes much overhead. Examples; synchronous non-idempotent services, multiple mid-process receives, dehydrate/wait activities in processes.
Recommendations; Avoid chattiness, design services to be idempotent, if possible avoid asynchronous services (callbacks cause thread/transaction overhead)

Problem; Usage of FlowN where N is unconstrained can cause resource problems and lack of control.
Recommendation; Do not base N in FlowN on the data. Design the process using the driver / worker pattern (driver hands small chunks to the worker and the worker processes this). This can for example be implemented by using queues for decoupling/performance.
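The driver/worker idea can be sketched in plain Python (chunk size and work items are illustrative); the point is that the fan-out is bounded by the chunking, not by the size of the data:

```python
# Plain-Python sketch of the driver/worker pattern: the driver hands
# bounded chunks to the worker instead of fanning out one branch per
# item as an unconstrained FlowN would. Chunk size and the work list
# are illustrative.
def driver(items, chunk_size, worker):
    results = []
    for i in range(0, len(items), chunk_size):
        results.append(worker(items[i:i + chunk_size]))
    return results

def worker(chunk):
    # The worker processes one bounded chunk at a time.
    return sum(chunk)

totals = driver(list(range(10)), 4, worker)  # -> [6, 22, 17]
```

In a composite, the hand-off from driver to worker would typically go over a queue for decoupling and performance, as the recommendation suggests.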

Problem; Asynchronous services cause overhead. This can become a problem if there are large numbers of asynchronous processes waiting for a response since for every callback, a new thread/transaction is needed and a callback needs to be matched to a correlation table which takes longer if there are a lot of open processes.
Recommendation; Design processes to be synchronous as much as possible. avoid nesting of asynchronous processes. also avoid synchronous processes calling asynchronous processes

Problem; A single BPEL process does batch processing of a large amount of messages. This takes a lot of memory and causes a lot of overhead for storing audit information.
Recommendations; Put the work to be done in a separate BPEL process and optimize this process. Design for worst case scenarios. Implement retry mechanisms in fault-policies. Implement your own scheduling mechanism to spread the load. If no message level processing is needed, ODI might be an option.

Problem; Scope variables are dehydrated and when the variables become large, this causes overhead.
Recommendations; Use local variables whenever possible. assign portions of the message to scope variables.

Problem; BPEL is meant for service orchestration. It's not a procedural programming language.
Recommendations; use declarative constructs instead of elaborate custom constructions. use the skip condition instead of if statements. use assertions before and after invokes. use pick activities to time responses

Problem; Identifying BPEL processes can be difficult due to lack of business content in the EM views.
Recommendation; Set the composite title to a business value; it is possible to search for this name. Business transaction keys and sensors (both have to be custom implemented) can be used to identify a flow instead of only the ECID, since the ECID cannot always be traced back to a business context.


I usually tend to look in the log files and in the Enterprise Manager if something goes wrong. There are however several other options;
- creating dumps
- collecting info from the MBean browser; Server/bpel: CubeDispatcher ReadXMLDispatcherTrace, and Server/BPELEngine: SyncProcessStats and AsyncProcessStats

Database growth (BPEL/BPM)

A concern for managing SOA Suite installations is the growth in database size. It is essential to think about a cleaning strategy. What I learned during the training is that with some programming practices the amount of information saved can also be reduced.

What causes growth;
- creating process instances
- updates to a message payload (workflow)
- asynchronous operations
- process scopes, task assignments. looping back to tasks
- audit (entry and exit of scope and model elements)

Thus it can be more efficient to build larger processes instead of multiple smaller ones. Also, scoping and the use of a lot of model elements cause the database size to increase. For example, it could (untested) be more efficient in terms of database space to create a single assign activity with a lot of actions in it instead of several smaller assign activities.

Authorization, authentication and policies

Oracle Platform Security Services provides an abstraction layer for authorization/authentication/role/group providers. In Oracle BPM, users/groups are used but also application roles. The users/groups can be stored by LDAP providers. If there are multiple LDAP providers, these providers can be virtualized by Oracle Virtual Directory (OVD). When OVD is not available at a customer, libOVD can be used to provide a limited lightweight alternative. Application roles are stored in a policy store which can also be LDAP based; another option is to have it database based. Identities can be queried via the browser, for example at http://localhost:7001/integration/services/IdentityService/identity.


The Blackbelt training contained a presentation on the BPEL engine internals. This provided some additional points to pay attention to when developing. There are 4 'types' of BPEL processes, which can be categorized by 2 properties: synchronous or asynchronous, and durable or transient. The different types behave differently with respect to transactions and threads. This has consequences for exception handling/propagation. The following should be avoided: a transient asynchronous process and a durable synchronous process. Transaction semantics can also have consequences for performance. See for example;

The Event Delivery Network

Events can be published with different settings; guaranteed delivery and once-and-only-once (OAOO). With the guaranteed delivery setting, local transactions are used (non-XA), the EDN_EVENT_QUEUE is used and there is the possibility of duplicate messages (in case of catastrophic failures). Also, the dequeue transaction is committed when all subscribers have received the message. Retries are not possible in case one subscriber fails to pick up the message. The once-and-only-once setting works differently. A second queue is used, EDN_OAOO_QUEUE. Each subscriber picks up the message in its own transaction. An XA connection is used with global transactions and the dequeue action can be retried. The EDN can be debugged in the following ways: by using the EDN servlet http://<host_name>:<port_number>/soa-infra/events/edn-db-log (when EDN is AQ based, which is the default). This servlet uses the EDN_LOG_MESSAGES table in the SOAINFRA schema. The following loggers are related: oracle.integration.platform.blocks.event, oracle.integration.platform.blocks.event.saq and oracle.integration.platform.blocks.event.jms. In the Enterprise Manager, the log level can be tuned. The delivery of messages can be paused by setting the 'Paused' property in the System MBean Browser.

Local Invocation Optimization

If certain criteria are met, the SOAP/HTTP layer can be skipped when calling a service. The criteria are: the processes have to be on the same server and client/server policies must allow it (this is a property in the policy file). The same-server requirement has implications for the use of load balancers in cluster configurations. Check the following part of the documentation for more details on how the 'same server check' is performed; You should also check out for some more information on how to make sure local optimization is used in case of a clustering/load balancing setup. A recommendation is to avoid too many small processes since this increases complexity (in order to achieve local optimization) and overhead.

BPEL fault handling best practices

The following best practices were mentioned in the training (I've rephrased them for brevity):
- always have a catch-all block (selectionFailure for example cannot be caught by using a fault policy)
- use named exceptions for business faults
- when using fault policies, always have a default action
- rethrow faults from fault policies in order to catch them in a BPEL process (when no fault handling action has been defined in the policy)
- notify the source system something has gone wrong
- think about how automatic recovery can impact transactions and business functionality. This can also be disabled;
- for asynchronous processes, after a mid-process receive, check the response (which can contain business faults) and terminate the process on error after sending a message to a notification service
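For illustration, a fault policy with a catch-all condition whose default action rethrows the fault (so it can still be caught inside the BPEL process) could look roughly like this. This is a sketch of the 11g fault-policies.xml format; namespace declarations are omitted and the policy/action ids are made up:

```xml
<faultPolicies>
  <faultPolicy version="2.0.1" id="DefaultPolicy">
    <Conditions>
      <!-- a faultName without a name attribute acts as a catch-all -->
      <faultName>
        <condition>
          <!-- default action: rethrow so the BPEL catch blocks see the fault -->
          <action ref="ora-rethrow"/>
        </condition>
      </faultName>
    </Conditions>
    <Actions>
      <Action id="ora-rethrow">
        <rethrowFault/>
      </Action>
    </Actions>
  </faultPolicy>
</faultPolicies>
```

A fault-bindings.xml file is still needed to attach the policy to a composite, component or reference.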


An efficient way to use the MDS is to use a local file-based repository during development and the database-based MDS at runtime. MDS configuration is stored in adf-config.xml files. MDS files can be referred to by a path: <Store_Root>/<Partition>/<Namespace>/<Resource>. When an MDS object changes, all dependent resources need to be recompiled. The MDS can be used to avoid server startup issues due to dependencies. See;
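As an example of such a path reference, an XSD stored in the MDS can be imported from a WSDL or another XSD via an oramds URL (a sketch; the partition, path and namespace below are made up):

```xml
<!-- imports a schema from the MDS repository; the resolution of oramds:
     paths is configured in adf-config.xml -->
<xsd:import namespace=""
            schemaLocation="oramds:/apps/xsd/Customer.xsd"/>
```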


The Mediator is the only product which has an out-of-the-box resequencer to provide ordering of messages. See for example; for details. Also it currently is the only component supporting Schematron based validations (see for example;). The Mediator has a heartbeat infrastructure: if in a clustered environment one instance of a Mediator fails (for whatever reason), this is detected and the other node in the cluster will process the message. Information on the message 'lease' is stored in the table MEDIATOR_CONTAINERID_LEASE. A message is locked when processing starts and released afterwards. The heartbeat framework can be configured. See; Sequential routing rules are executed in a single transaction and thread. Parallel routing rules use 3 thread types: inbound threads, locker threads and worker threads. Each worker thread uses its own transaction. Parallel routing rules can be debugged by looking at the MEDIATOR_DEFERRED_MESSAGE table; one row corresponds to one message in a parallel routing rule.

Oracle Service Bus

Local transport can be used for internal service chaining (for example when calling reusable components). Local transport services cannot be invoked from outside the service bus and are not published to a UDDI registry. See for example The split/join pattern which can be implemented in the OSB uses a BEA BPEL implementation and works in-memory (not persistent). The service bus is minimalistic: by default, most features are turned off and the focus is on performance/high throughput. Oracle BPEL, on the other hand, is more maximalistic; if you want high performance there, you should turn features off.

Cube engine internals

Work items for the same instance are not allowed to execute concurrently. This implies that the parallel execution in for example for-each/while loops is 'simulated' and not truly parallel. This knowledge helps in understanding the behavior of the nonBlockingInvoke setting. See; The nonBlockingInvoke setting creates new threads for invocations, but the invocations are still executed in sequence. In practice, this leads to performance degradation.

Tuesday, June 25, 2013

Oracle SOA 11g BPEL transaction semantics and performance

In this post I'll provide a simple integration example and some suggestions to optimize its performance. The optimization suggestions focus on transaction semantics. The purpose is to indicate the importance of taking various transaction management settings into account.

The base and inspiration for this post are the presentations and material from SOA Blackbelt training which was given by Oracle in Berlin this year from the 11th to the 14th of June. The training covered a lot of material in great depth. If you have the chance to follow it, I highly recommend it!

Test setup

First I enqueue 2000 messages on an Oracle AQ and take a timestamp which I write in a separate table in the same transaction. After a COMMIT, a BPEL process is triggered and picks up the messages (one instance per message). This process puts the message in a table. The moment of insertion is determined by having a default value on a field in the table. I then determine the time difference between the last message put in the table in the batch and the moment of insertion in the AQ. To avoid the overhead of audit logging, I turn this off for the specific process. The code used can be downloaded at the end of this post.

I will vary the bpel.config.oneWayDeliveryPolicy. I will try sync and async.persist (the default). async.persist will first put dequeued messages in the DLV_MESSAGE table before they are further processed in a separate transaction. sync will not do this and will invoke the BPEL process synchronously. I use three different datasource settings for this test. I will try both oneWayDeliveryPolicy settings with an XA datasource (which uses a two-phase commit) and two non-XA datasources. For the non-XA datasources I will test with and without global transaction support. I will test this using different datasources and the same datasource. I will also test all combinations with the bpel.config.transaction setting set to required and requiresNew.

Summarized: four different settings are varied, each in all combinations of the other settings. Two measures are taken with each combination.
- bpel.config.oneWayDeliveryPolicy (async.persist, sync)
- bpel.config.transaction (required, requiresNew)
- different datasource settings; XA, NonXA, NonXA no global transaction support
- using the same and different datasources with the same settings
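The two BPEL properties above are set on the BPEL component in composite.xml. A sketch, with a made-up component name:

```xml
<component name="TestProcess">
  <implementation.bpel src="TestProcess.bpel"/>
  <!-- sync: process the dequeued message in the same transaction;
       async.persist (default): persist to DLV_MESSAGE first -->
  <property name="bpel.config.oneWayDeliveryPolicy">sync</property>
  <!-- required: join the caller's transaction; requiresNew: always start one -->
  <property name="bpel.config.transaction">required</property>
</component>
```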

I created 6 datasources with the three different settings:
testuserXa, testuserXa2
testuserNonXa, testuserNonXa2
testuserNonXaGlobal, testuserNonXaGlobal2

Next I created 6 connection factories for the DbAdapter and 6 connection factories for the AqAdapter. I varied the datasources in the JCA files in the BPEL process I created. For every test I primed the datasources/engine with 100 messages. Next I took 2 measures of 2000 messages.


The * indicates the process failed with the following exception:
DBWriteInteractionSpec Execute Failed Exception.
insert failed. Descriptor name: [WriteToStore.TestStore].
Caused by java.sql.SQLException: Cannot call Connection.commit in distributed transaction.  Transaction Manager will commit the resource manager when the distributed transaction is committed..

As can be seen, using an XA datasource decreased performance, while using the sync property increased performance. In this example, little effect was visible when changing the transaction property. There was little difference between using two different datasources and a single datasource for processing. The non-XA datasource with global transaction support apparently executed an explicit commit, which conflicted with the distributed nature of the transaction. This happened in all cases when performing an insert action using this datasource. This also indicates the transaction was distributed in all cases (even when using a non-XA datasource without global transaction support).


When using the oneWayDeliveryPolicy setting of sync, the entire process is processed in a single transaction. When using async.persist, 2 transactions are involved. One to write to the DLV_MESSAGE table and one to call the DB insert.

The performance impact of writing to DLV_MESSAGE and the extra transaction was measurable. When calling a process synchronously, the effect would have been greater since then it would have been 4 transactions when using the async.persist setting.

Using an XA datasource means a two-phase commit is used. This has a slight overhead which is measurable in this example. It also provides a difference in behavior, mainly in the event of a fault. See for example;

Because of the process setup, I could not measure much effect of the transaction setting, since the process initialization by the Aq adapter always starts a new transaction. When a process is called from another process, I would have expected to see a performance gain with the 'required' setting, since I would have expected fewer transactions.

I would have liked to see if the DbAdapter and AqAdapter behaved differently when different datasources on the same database schema were used to connect instead of the same datasource. The only difference found, however, was that when using async.persist, requiresNew and the same non-XA datasource without global transaction support, errors occurred. These errors did not occur when using different datasources (also non-XA without global transaction support). Apparently with these settings, an explicit commit is executed when performing an insert using the DbAdapter. Also, the incoming message was read using the AqAdapter and the result was written using the DbAdapter. These adapters both have their own connection pools. Using the same adapter might have caused different behavior. Using different datasources might make it possible to have more open incoming connections. In this case, however, that was most likely limited by other settings such as invoker threads.

Something to keep in mind when considering the transaction related settings is their impact on fault handling. This is however not the focus of this post. See for example; and

Of course many other settings can be tuned to increase performance. The effects found when changing the settings might also differ with the nature of the process tested. In this case, invoker threads can be increased to allow messages to be picked up faster, and the maximum number of connections allowed by the datasources can be increased. The process can also be made transient. The list of other possible options to make this specific process go faster is long. The purpose of this example is however to indicate the impact transactions can have on the performance of a process, so the other factors are kept constant and the default settings from the Oracle SOA 11g PS5 image are used;

You can download the used code and test results below;

Wednesday, June 12, 2013

Oracle BPEL and Java. A comparison of different interaction options

When building BPEL processes in Oracle SOA Suite 11g, it sometimes happens that some of the required functionality can't easily be created using the provided activities, flow control options and adapters. Often there are Java libraries available which can fill these gaps. This blog post provides an evaluation of different options for making interaction between BPEL and Java possible.

In the examples provided in this post, I'm using the JsonPath library ( inside a BPEL process. A usecase for this could be that a webclient calls the BPEL process with a JSON message and BPEL needs to extract fields from that message.

The Java code to execute is the following;

package ms.testapp;

import com.jayway.jsonpath.JsonPath;

public class JsonPathUtils {

    public JsonPathUtils() {
    }

    // Evaluates the JsonPath expression against the JSON string and
    // returns the result as a String
    public String ExecuteJsonPath(String jsonstring, String jsonpath) {
        String result =, jsonpath).toString();
        return result;
    }
}

Of course small changes were necessary for the specific integration methods. I provided code samples at the end of this post for every method.

Integration methods


Oracle has provided an extension activity for BPEL which allows Java embedding. By using this activity, Java code can be used directly from BPEL. The JsonPath libraries to use in the embedding activity can be put in different locations such as the domain library directory or be deployed as part of the BPEL process. Different classloaders will be involved. To check whether this matters I've tried both locations.
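In a BPEL 2.0 process, such an embedding looks roughly as follows. This is a sketch: the activity and variable names are made up, the bpelx prefix is assumed to be declared on the process, and the XPath expressions depend on the actual message types:

```xml
<extensionActivity>
  <bpelx:exec name="ExecJsonPath" language="java">
    <![CDATA[
      // getVariableData/setVariableData are available inside the embedding
      String json = (String) getVariableData("inputVariable", "payload",
          "/client:process/client:input");
      ms.testapp.JsonPathUtils utils = new ms.testapp.JsonPathUtils();
      String result = utils.ExecuteJsonPath(json, "$");
      setVariableData("outputVariable", "payload",
          "/client:processResponse/client:result", result);
    ]]>
  </bpelx:exec>
</extensionActivity>
```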

The Java call happens within the same component engine. Below are measures from when using JSON libraries deployed as part of the BPEL process (in SCA-INF/lib).
Below are measures from when putting the libraries in the domain library folder.
As you can see from the measures, the performance is very comparable. The location where the BPEL process gets its classes from has no clearly measurable consequences for the performance.

Re-use potential
When the libraries are placed in the domain lib folder, they can be reused by almost everything deployed on the application server. This should be considered. When deploying as part of the composite, there is no re-use potential outside the composite, except possibly indirectly by calling the composite.

Maintenance considerations
Embedded Java code is difficult to maintain and debug. When deployed as part of a BPEL process, changes to the library require redeployment of the process. When libraries are put in the domain library directory, changes to them impact all applications using them and might require a restart.


XPath extension functions can be created and used in BPEL (+ other components) and JDeveloper. This is nicely described on;

The custom XPath library is included as part of the SOA infrastructure and does not leave this context. As can be seen, the performance is comparable to the Java embedding method.

Re-use potential
The reuse potential is high. The custom XPath library can be used in different locations/engines, dependent on the descriptor file.

Maintenance considerations
Reuse by different developers in JDeveloper requires minimal local configuration, which enables GUI support for the custom library. There are no specific changes to BPEL code, thus low maintenance overhead. Changing the library on the application server requires a restart of the server.

Spring component

The Java code can be called as a Spring component inside a composite. Here another component within the composite is called. The Java code is executed outside the BPEL engine.
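A minimal Spring context for such a composite could look like this. This is a sketch with namespace declarations omitted; IJsonPathService is a made-up Java interface declaring the ExecuteJsonPath method (the SCA service must be typed by an interface):

```xml
<beans>
  <!-- expose the bean as an SCA service so it can be wired to the
       BPEL component in composite.xml -->
  <sca:service name="JsonPathService" target="jsonPathBean"
               type="ms.testapp.IJsonPathService"/>
  <bean id="jsonPathBean" class="ms.testapp.JsonPathUtils"/>
</beans>
```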


Re-use potential
The following blog post links to options with the Spring Framework; When deployed inside a composite, reuse is limited to the composite. It is however possible to define global Spring beans, increasing re-use. The code can be created/debugged outside an embedding activity.

Maintenance considerations
The Spring component is relatively new to Oracle SOA Suite, so some developers might not know how to work with this option yet. Its maintenance options are better than those of the BPEL embedding activity. It is however still deployed as part of a composite.

External webservice

Java code can be externalized completely by for example providing it as a JAX-WS webservice or an EJB.

Performance is poor compared to the solutions described above. This is most likely due to the overhead of leaving soa-infra and the layers the message needs to pass to be able to be called from BPEL.

Re-use potential
Re-use potential is highest for this option. Even external processes can call the webservice so re-use is not limited to the same application server.

Maintenance considerations
Since the code is completely externalized, this option provides the best maintenance options. It can be developed separately by Java developers and provided to be used by an Oracle SOA developer. Also it can be replaced without requiring a server restart or composite redeploy.


The technical effort required to implement the different methods is comparable. Depending on the usecase/requirements, different options might be relevant. If performance is very important, embedding and XPath expressions might be your best choice. If maintenance and reuse are important, then externalizing the Java code in for example an external webservice might be the better option.

Summary of results
This is of course a personal opinion.

The used code with examples of all embedding options can be downloaded here;

Sunday, May 19, 2013

How to deal with services that don't support concurrency? Offer requests one at a time.

When developing service orchestrations using Oracle SOA Suite, an often encountered problem is dealing with unreliable services. These can be services which cannot handle multiple simultaneous requests (don't support concurrency) or don't have 100% availability (usually due to nightly batches or scheduled maintenance). One way to work with these services is having a good error handling or retry mechanism in place. For example, I've previously described a fault handling mechanism based on using Advanced Queues (AQ); Using this mechanism, you can maintain the order of processing for messages and retry faulted messages. It would however be better if we could avoid faults altogether. In case a service does not support concurrency (because of for example platform limitations or statefulness), messages will have to be offered one at a time.

If the service has a quick response time, you can make a process picking up messages from an AQ synchronous, and thus have only one running process at a time. This has been described at; It's a recommended read.

In this blog post I'll describe a mechanism which can be used if a synchronous solution does not suffice, for example in long running processes. The purpose of this blog post is to illustrate a mechanism and its components. It should not be used as-is in a production environment, but tweaked to business requirements.


I'll describe a database based mechanism which consists of several components;
- A database table holding process 'state'. In this example, CREATED, ENQUEUED, RUNNING, DONE
- A DBMS_SCHEDULER job which polls for changes. In my experience this is more stable than using the DbAdapter to do the same.
- A priority AQ to offer messages to BPEL in a specific order and allow loose coupling/flexibility/error handling mechanisms. In my experience this is very reliable.
- A BPEL process consuming AQ messages and calling the service which doesn't support concurrency. There should be only one running instance of this process at a time.
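The database side of the components above could be created along these lines. This is a sketch under stated assumptions: all table, queue and column names are made up, and a JMS text payload is assumed so the AqAdapter can consume it:

```sql
CREATE TABLE ms_process_state (
  proc_id    NUMBER       NOT NULL,
  proc_name  VARCHAR2(30) NOT NULL,  -- matches the AQ subscriber name
  state      VARCHAR2(10) NOT NULL,  -- CREATED, ENQUEUED, RUNNING, DONE
  state_date DATE DEFAULT SYSDATE NOT NULL
);

-- a priority queue so messages are offered to BPEL in a defined order
BEGIN
  dbms_aqadm.create_queue_table(
    queue_table        => 'MS_QUEUE_TABLE',
    queue_payload_type => 'SYS.AQ$_JMS_TEXT_MESSAGE',
    sort_list          => 'PRIORITY,ENQ_TIME',
    multiple_consumers => TRUE);
  dbms_aqadm.create_queue('MS_QUEUE', 'MS_QUEUE_TABLE');
  dbms_aqadm.start_queue('MS_QUEUE');
END;
/
```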

I've created a process state table which holds the process states and provides state history. I've also created a view on this table which only displays the current state. There is a column in the table PROC_NAME. This corresponds to the subscriber used in the BPEL process.

A database job polls for records every minute with state CREATED. If found and no other processes are in state ENQUEUED or RUNNING, a new message is enqueued. I've split the states ENQUEUED and RUNNING to be able to identify which messages have been picked up by the BPEL process and which haven't. There should only be one process in state RUNNING at a time.
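Such a polling job can be created with DBMS_SCHEDULER. A sketch; the package and procedure name below are made up and stand for the logic that checks for CREATED records and enqueues at most one message:

```sql
BEGIN
  dbms_scheduler.create_job(
    job_name        => 'MS_PROCESS_POLL_JOB',
    job_type        => 'STORED_PROCEDURE',
    -- enqueues a message only when no process is ENQUEUED or RUNNING
    job_action      => 'MS_PROCESS_PKG.POLL_AND_ENQUEUE',
    repeat_interval => 'FREQ=MINUTELY;INTERVAL=1',
    enabled         => TRUE);
END;
/
```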

I've created a simple HelloWorld BPEL process. This process polls for messages on the AQ. It picks up a message and informs the database that it has done so (sets the state to RUNNING). Next I've stubbed calling a service with a wait of one minute. After the period is over, the state is set to DONE. The process looks as follows;

At the end of this post you can download the code. To run the example however, the database needs to have a user TESTUSER with the correct grants to allow enqueueing/dequeueing (see the supplied script). Also, in Weblogic Server there needs to be a JDBC datasource configured and a connection factory (eis/AQ/testuser) defined in the AqAdapter. You can find an example for configuring the DbAdapter at; configuration for the AqAdapter is very similar.

Running the example

First you need to create the table, trigger, AQ, package, DBMS_SCHEDULER job. This can be done by executing the supplied script.

To start testing the mechanism you can execute the following;

insert into ms_process(proc_name,proc_comment) values('HELLOWORLD','Added new record for test 1');
insert into ms_process(proc_name,proc_comment) values('HELLOWORLD','Added new record for test 2');
insert into ms_process(proc_name,proc_comment) values('HELLOWORLD','Added new record for test 3');

This will insert 3 records in the process table. These messages will be picked up in order. For implementations in larger applications I recommend using the PROC_SEQ field in the process table to obtain the required information for processing.

After a couple of minutes, you can see the following in the process state table;

As you can see, the messages were created at approximately the same time. The messages are picked up in order of insertion (based on ProcessId). Also as can be seen from the table, when a process is running (the period between state RUNNING and DONE), no other processes are running; there is no overlap in time.

After processing, the process view indicates the latest process state for every process. All processes are done.

In the Enterprise Manager, three processes have been executed and completed.

AQ in a clustered environment

In a clustered environment you have to keep in mind that in an 11.2 database, AQ messages can be picked up twice from the same queue under load. Since this would break the mechanism, I suggest applying the workaround described below.

Bug: 13729601
Added: 20-February-2012
Platform: All
The dequeuer returns the same message in multiple threads in high concurrency environments when Oracle database 11.2 is used. This means that some messages are dequeued more than once. For example, in Oracle SOA Suite, if Service 1 suddenly raises a large number of business events that are subscribed to by Service 2, duplicate instances of Service 2 triggered by the same event may be seen in an intermittent fashion. The same behavior is not observed with a database or in an 11.2 database with event 10852 level 16384 set to disable the 11.2 dequeue optimizations.

Workaround: Perform the following steps:

    Log in to the 11.2 database:

    Specify the following SQL command in SQL*Plus to disable the 11.2 dequeue optimizations:
    SQL> alter system set event='10852 trace name context forever,
    level 16384' scope=spfile;


The mechanism described can be used to avoid parallel execution of processes. Even when the processes are long running and synchronous execution is not an option.


The mechanism contains polling components; the DBMS_SCHEDULER job and the AqAdapter. This has two major drawbacks;
- it will cause load even when the system is idle
- it allows a period between finishing of a process and starting of the next process

You could consider starting the BPEL process actively from the database (thus avoiding polling) by using for example UTL_DBWS (see for example This however requires that the URL of the BPEL process is known in the database and that the ACL (Access Control List) is configured correctly. Also error handling should be reconsidered. The overhead of polling is minor. If a delay of 1 minute + default AqAdapter polling frequency is acceptable, a solution based on the described mechanism can be considered. Also, the DBMS_SCHEDULER job polling frequency can be reduced and the AqAdapter polling behavior can be tweaked to reduce the lost time between polls.


Ending the process with a polling action that initiates the next message is not advisable since it raises several new questions;
- what to do if there are no messages waiting? having a polling mechanism together with this mechanism might break the 'only one process is running at the same time' rule
- what to do in case of errors, when the chain is broken


I've tried a mechanism which would retire a process at the start and then reactivate it after completion. This would disallow more than one process to be running at the same time. This appeared not to be a solid mechanism. Retiring and activating a process takes time, in which new messages could be picked up. Also, using the Oracle SOA API during process execution adversely affects performance.

Efficiently determining the current state

I've not tested this solution with a large number of processes. In that case I would reconsider how to keep a process history and get to the current state efficiently in a polling run. Most likely I'd use two tables: one for the current state, which I would update, and a separate table for the history, which I would fill with PL/SQL triggers on the current-state table.
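A sketch of that two-table approach (all names are made up): a trigger on the current-state table copies every change into the history table, so the polling job only ever scans the small current-state table.

```sql
CREATE OR REPLACE TRIGGER ms_process_current_hist_trg
AFTER INSERT OR UPDATE ON ms_process_current
FOR EACH ROW
BEGIN
  -- the history table grows, but is never read by the polling job
  INSERT INTO ms_process_history (proc_id, state, state_date)
  VALUES (:new.proc_id, :new.state, SYSDATE);
END;
/
```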


You can download the BPEL process here;

The database code can be downloaded here (you might want to tidy it up if, for example, you like CDM);

Wednesday, May 1, 2013

Cleaning up unused namespaces in Oracle SOA 11g BPEL processes by using a Python script

Composites are often created and afterwards changed/expanded to implement functionality or bugfixes. When adding new partnerlinks and variables, removing them, importing new XSDs, removing them, etc., it often occurs that there are namespace definitions inside for example BPEL processes which are no longer relevant because they are not used anymore. This can adversely affect performance/memory usage and increases the chance of errors when XSDs are changed, removed or added (such as inconsistent duplicate namespace definitions).

I took this issue as a nice opportunity to teach myself a bit of Python. Python is a popular scripting language ( and is used by several software vendors, such as Oracle for Weblogic Server; WLST ( and ESRI ( for GIS related programming. There is an official Python tutorial available on and I've also used to learn some basics.

First impression of the Python language
- I like the usage of indentation compared to the use of brackets or end statements
- code completion is far from perfect with the PyDev plug-in for Eclipse when using Python 3.3 (I needed to Google a lot for API documentation)
- even without a background in Python, you can quickly get something working after reading some tutorials (although I have some experience with other scripting languages like Perl, PHP and JavaScript; I usually use Perl for my regular scripting needs)



I wanted to create a script which would clean up unused namespaces in XML files. I started with a BPEL file as an example (the same example as used in; It can be downloaded here; The BPEL file had the following contents;

<?xml version = "1.0" encoding = "UTF-8" ?>
<!--
  Oracle JDeveloper BPEL Designer
  Created: Thu Apr 11 12:23:10 CEST 2013
  Author:  Maarten
  Type: BPEL 2.0 Process
  Purpose: Synchronous BPEL Process
-->
<process name="CallHelloWorld"

    <import namespace="" location="CallHelloWorld.wsdl" importType=""/>
    <!--
        List of services participating in this BPEL process
      The 'client' role represents the requester of this service. It is
      used for callback. The location and correlation information associated
      with the client role are automatically set using WS-Addressing.
    -->
    <partnerLink name="callhelloworld_client" partnerLinkType="client:CallHelloWorld" myRole="CallHelloWorldProvider"/>
    <partnerLink name="HelloWorld" partnerLinkType="ns1:HelloWorld"

      <!-- List of messages and XML documents used within this BPEL process -->
    <!-- Reference to the message passed as input during initiation -->
    <variable name="inputVariable" messageType="client:CallHelloWorldRequestMessage"/>

    <!-- Reference to the message that will be returned to the requester-->
    <variable name="outputVariable" messageType="client:CallHelloWorldResponseMessage"/>
    <variable name="InvokeHelloWorld_process_InputVariable"
    <variable name="InvokeHelloWorld_process_OutputVariable"

  <!--
     ORCHESTRATION LOGIC
     Set of activities coordinating the flow of messages across the
     services integrated within this business process
  -->
  <sequence name="main">

    <!-- Receive input from requestor. (Note: This maps to operation defined in CallHelloWorld.wsdl) -->
    <receive name="receiveInput" partnerLink="callhelloworld_client" portType="client:CallHelloWorld" operation="process" variable="inputVariable" createInstance="yes"/>
    <assign name="Assign1">
    <invoke name="InvokeHelloWorld"
            partnerLink="HelloWorld" portType="ns1:HelloWorld"
    <assign name="Assign2">
    <!-- Generate reply to synchronous request -->
    <reply name="replyOutput" partnerLink="callhelloworld_client" portType="client:CallHelloWorld" operation="process" variable="outputVariable"/>

The ora namespace was not used in this process but it is specified in the process tag.


Based on the following, I started with lxml; it has a function to easily clean unused namespaces, and there were many posts on lxml. To install lxml for Windows, I had to first download and install it from


My first try was the following;

import lxml.etree as et
tree = et.parse(filename_in)
et.cleanup_namespaces(tree.getroot())  # removes unused namespace declarations
tree.write(filename_out, xml_declaration=True, encoding='utf-8', pretty_print=True)

In the output I noticed that although my process definition had been cleaned up from

<process name="CallHelloWorld"

to
<process xmlns="" xmlns:bpelx="" name="CallHelloWorld" targetNamespace="">

Several namespaces were removed which were in use such as the ns1 namespace which was used in a partnerlink definition as part of an attribute value;

<partnerLink name="HelloWorld" partnerLinkType="ns1:HelloWorld" partnerRole="HelloWorldProvider"/>

My conclusion was that using prebuilt functions like the above would most likely not help solve this problem. I did not find a way to limit the functionality to specific namespaces.

Determining used namespaces

I tried a different approach: determine the namespaces declared on the root element, try to find them in different locations in the BPEL file, and then rewrite the root element.

import copy
import lxml.etree as et

tree = et.parse(filename_in)
root = tree.getroot()
nsmap = root.nsmap
nsmapnew = copy.deepcopy(nsmap)

print("Namespaces found: " + str(len(nsmap)))
for nsitem in nsmap:
    # the default namespace has prefix None; it is never removed
    if nsitem is not None:
        print("Processing prefix: " + nsitem + " Namespace: " + nsmap.get(nsitem))
        found = 0
        # process all elements
        for elt in tree.iter():
            # check the element's own namespace (comments/PIs have no QName)
            if isinstance(elt.tag, str):
                if et.QName(elt).namespace == nsmap.get(nsitem):
                    found = 1  # found namespace as element namespace
            if str(elt.text).find(nsitem + ":") != -1:
                found = 1  # found prefix in element text (e.g. XPath expressions)
            # check attributes
            for attribute in elt.attrib:
                if attribute.startswith("{" + nsmap.get(nsitem) + "}"):
                    found = 1  # found namespace as attribute name namespace
                if str(elt.attrib[attribute]).find(nsitem + ":") != -1:
                    found = 1  # found prefix in an attribute value
        if found == 0:
            print("Not found")
            del nsmapnew[nsitem]
print("Namespaces remaining: " + str(len(nsmapnew)))

# rebuild the root element, declaring only the namespaces still in use
new_root = et.Element(root.tag, attrib=root.attrib, nsmap=nsmapnew)
new_root[:] = root[:]
tree = et.ElementTree(new_root)

# re-attach the top-level comments that precede the root element
comments = []
node = root.getprevious()
while node is not None:
    comments.append(node)
    node = node.getprevious()
for node in reversed(comments):
    new_root.addprevious(copy.deepcopy(node))

tree.write(filename_out, xml_declaration=True, encoding='utf-8', pretty_print=True)

This rewrote my XML the way I wanted. Even when namespaces were only used in XPath expressions in the BPEL file, they were recognized as being used. One drawback however was that a namespace prefix was added to all subelements, even though these elements were in the default namespace. I considered using a transformation to fix this, but that would remove the comments from the file, so I decided against it. The script output was as follows:

Namespaces found: 6
Processing prefix: ns1 Namespace:
Processing prefix: ora Namespace:
Not found
Processing prefix: client Namespace:
Processing prefix: bpel Namespace:
Processing prefix: bpelx Namespace:
Namespaces remaining: 5

The created output file was as follows:

<?xml version='1.0' encoding='UTF-8'?>
<!--
  Oracle JDeveloper BPEL Designer
  Created: Thu Apr 11 12:23:10 CEST 2013
  Author:  Maarten
  Type: BPEL 2.0 Process
  Purpose: Synchronous BPEL Process
-->
<bpel:process xmlns:ns1="" xmlns:client="" xmlns:bpel="" xmlns:bpelx="" xmlns="" name="CallHelloWorld" targetNamespace=""><bpel:import namespace="" location="CallHelloWorld.wsdl" importType=""/>
    <!-- List of services participating in this BPEL process -->
    <!--
      The 'client' role represents the requester of this service. It is
      used for callback. The location and correlation information associated
      with the client role are automatically set using WS-Addressing.
    -->
    <bpel:partnerLink name="callhelloworld_client" partnerLinkType="client:CallHelloWorld" myRole="CallHelloWorldProvider"/>
    <bpel:partnerLink name="HelloWorld" partnerLinkType="ns1:HelloWorld" partnerRole="HelloWorldProvider"/>

    <!-- List of messages and XML documents used within this BPEL process -->
    <!-- Reference to the message passed as input during initiation -->
    <bpel:variable name="inputVariable" messageType="client:CallHelloWorldRequestMessage"/>

    <!-- Reference to the message that will be returned to the requester-->
    <bpel:variable name="outputVariable" messageType="client:CallHelloWorldResponseMessage"/>
    <bpel:variable name="InvokeHelloWorld_process_InputVariable" messageType="ns1:HelloWorldRequestMessage"/>
    <bpel:variable name="InvokeHelloWorld_process_OutputVariable" messageType="ns1:HelloWorldResponseMessage"/>

  <!--
     ORCHESTRATION LOGIC
     Set of activities coordinating the flow of messages across the
     services integrated within this business process
  -->
  <bpel:sequence name="main">

    <!-- Receive input from requestor. (Note: This maps to operation defined in CallHelloWorld.wsdl) -->
    <bpel:receive name="receiveInput" partnerLink="callhelloworld_client" portType="client:CallHelloWorld" operation="process" variable="inputVariable" createInstance="yes"/>
    <bpel:assign name="Assign1">
    <bpel:invoke name="InvokeHelloWorld" partnerLink="HelloWorld" portType="ns1:HelloWorld" operation="process" inputVariable="InvokeHelloWorld_process_InputVariable" outputVariable="InvokeHelloWorld_process_OutputVariable" bpelx:invokeAsDetail="no"/>
    <bpel:assign name="Assign2">
    <!-- Generate reply to synchronous request -->
    <bpel:reply name="replyOutput" partnerLink="callhelloworld_client" portType="client:CallHelloWorld" operation="process" variable="outputVariable"/>

The ora namespace had been removed from the process element and the file had become smaller. The ora namespace is however a default Oracle namespace, so to make sure I hadn't broken anything, I compiled and deployed the altered process. This was successful. I also tested the script with more complex processes; the resulting BPEL files were still fully functional.

Possible follow-ups could be:

- expand the script to recursively process multiple files and file types
- determine used and unused namespaces for every element, not just the root element
- link the unused-namespace results within a project to XSDs, which could also be removed from the project if no file in the project uses them
- allow excluding specific namespaces from cleaning, for example if problems start to occur with XPath expressions after removing default Oracle namespaces
- remove the namespace prefix for elements in the default namespace