Wednesday, May 1, 2013

Cleaning up unused namespaces in Oracle SOA 11g BPEL processes by using a Python script

Composites are often changed and expanded after they are created, to implement new functionality or to fix bugs. After partner links, variables and XSD imports have been added and removed a few times, files such as BPEL processes often contain namespace definitions that are no longer used. This can adversely affect performance and memory usage, and it increases the chance of errors (such as inconsistent duplicate namespace definitions) when XSDs are changed, removed or added.

I took this issue as a nice opportunity to teach myself a bit of Python. Python is a popular scripting language (http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html) and is used by several software vendors, such as Oracle for WLST, the WebLogic Server scripting tool (http://docs.oracle.com/cd/E14571_01/web.1111/e13813/quick_ref.htm), and ESRI (http://esripress.esri.com/display/index.cfm?fuseaction=display&websiteID=224&moduleID=0) for GIS-related programming. There is an official Python tutorial available at http://docs.python.org/3.3/tutorial/ and I've also used http://www.vogella.com/articles/Python/article.html to learn some basics.

First impression of the Python language
- I like the usage of indentation compared to the use of brackets or end statements
- code completion is far from perfect with the PyDev plug-in for Eclipse when using Python 3.3 (I needed to Google a lot for API documentation)
- even without a background in Python, you can quickly get something working after reading some tutorials (although I do have some experience with other scripting languages such as Perl, PHP and JavaScript; I usually use Perl for my regular scripting needs).

Implementation

Purpose

I wanted to create a script which would clean up unused namespaces in XML files. I started with a BPEL file as an example (the same example as used in http://javaoraclesoa.blogspot.nl/2013/04/soa-suite-ps6-11117-service-loose.html; it can be downloaded here: https://dl.dropboxusercontent.com/u/6693935/blog/HelloWorldTokens.zip). The BPEL file had the following contents:

<?xml version = "1.0" encoding = "UTF-8" ?>
<!--
////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
  Oracle JDeveloper BPEL Designer
 
  Created: Thu Apr 11 12:23:10 CEST 2013
  Author:  Maarten
  Type: BPEL 2.0 Process
  Purpose: Synchronous BPEL Process
////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
-->
<process name="CallHelloWorld"
               targetNamespace="http://xmlns.oracle.com/HelloWorld/CallHelloWorld/CallHelloWorld"
               xmlns="http://docs.oasis-open.org/wsbpel/2.0/process/executable"
               xmlns:client="http://xmlns.oracle.com/HelloWorld/CallHelloWorld/CallHelloWorld"
               xmlns:ora="http://schemas.oracle.com/xpath/extension"
               xmlns:bpelx="http://schemas.oracle.com/bpel/extension"
         xmlns:bpel="http://docs.oasis-open.org/wsbpel/2.0/process/executable"
         xmlns:ns1="http://xmlns.oracle.com/HelloWorld/HelloWorld/HelloWorld">

    <import namespace="http://xmlns.oracle.com/HelloWorld/CallHelloWorld/CallHelloWorld" location="CallHelloWorld.wsdl" importType="http://schemas.xmlsoap.org/wsdl/"/>
    <!--
      ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
        PARTNERLINKS                                                     
        List of services participating in this BPEL process              
      ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
    -->
  <partnerLinks>
    <!--
      The 'client' role represents the requester of this service. It is
      used for callback. The location and correlation information associated
      with the client role are automatically set using WS-Addressing.
    -->
    <partnerLink name="callhelloworld_client" partnerLinkType="client:CallHelloWorld" myRole="CallHelloWorldProvider"/>
    <partnerLink name="HelloWorld" partnerLinkType="ns1:HelloWorld"
                 partnerRole="HelloWorldProvider"/>
  </partnerLinks>

  <!--
    ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
      VARIABLES                                                       
      List of messages and XML documents used within this BPEL process
    ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
  -->
  <variables>
    <!-- Reference to the message passed as input during initiation -->
    <variable name="inputVariable" messageType="client:CallHelloWorldRequestMessage"/>

    <!-- Reference to the message that will be returned to the requester-->
    <variable name="outputVariable" messageType="client:CallHelloWorldResponseMessage"/>
    <variable name="InvokeHelloWorld_process_InputVariable"
              messageType="ns1:HelloWorldRequestMessage"/>
    <variable name="InvokeHelloWorld_process_OutputVariable"
              messageType="ns1:HelloWorldResponseMessage"/>
  </variables>

  <!--
    ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
     ORCHESTRATION LOGIC                                              
     Set of activities coordinating the flow of messages across the   
     services integrated within this business process                 
    ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
  -->
  <sequence name="main">

    <!-- Receive input from requestor. (Note: This maps to operation defined in CallHelloWorld.wsdl) -->
    <receive name="receiveInput" partnerLink="callhelloworld_client" portType="client:CallHelloWorld" operation="process" variable="inputVariable" createInstance="yes"/>
    <assign name="Assign1">
      <copy>
        <from>$inputVariable.payload/client:input</from>
        <to>$InvokeHelloWorld_process_InputVariable.payload/ns1:input</to>
      </copy>
    </assign>
    <invoke name="InvokeHelloWorld"
            partnerLink="HelloWorld" portType="ns1:HelloWorld"
            operation="process"
            inputVariable="InvokeHelloWorld_process_InputVariable"
            outputVariable="InvokeHelloWorld_process_OutputVariable"
            bpelx:invokeAsDetail="no"/>
    <assign name="Assign2">
      <copy>
        <from>$InvokeHelloWorld_process_OutputVariable.payload/ns1:result</from>
        <to>$outputVariable.payload/client:result</to>
      </copy>
    </assign>
    <!-- Generate reply to synchronous request -->
    <reply name="replyOutput" partnerLink="callhelloworld_client" portType="client:CallHelloWorld" operation="process" variable="outputVariable"/>
  </sequence>
</process>


The ora namespace was not used in this process but it is specified in the process tag.

lxml

I started with lxml because of its cleanup_namespaces function (http://lxml.de/api/lxml.etree-module.html#cleanup_namespaces), which makes it easy to remove unused namespaces, and because there were many posts about lxml on StackOverflow.com. To install lxml on Windows, I first had to download and install it from http://www.lfd.uci.edu/~gohlke/pythonlibs/#lxml

cleanup_namespaces

My first try was the following;

import lxml.etree as et
filename_in='D:\\dev\\HelloWorld\\CallHelloWorld\\CallHelloWorld.bpel'
filename_out='D:\\dev\\HelloWorld\\CallHelloWorld\\CallHelloWorld.bpel.out'
tree = et.parse(filename_in)
et.cleanup_namespaces(tree)
tree.write(filename_out)

In the output I noticed that although my process definition had been cleaned up from

<process name="CallHelloWorld"
               targetNamespace="http://xmlns.oracle.com/HelloWorld/CallHelloWorld/CallHelloWorld"
               xmlns="http://docs.oasis-open.org/wsbpel/2.0/process/executable"
               xmlns:client="http://xmlns.oracle.com/HelloWorld/CallHelloWorld/CallHelloWorld"
               xmlns:ora="http://schemas.oracle.com/xpath/extension"
               xmlns:bpelx="http://schemas.oracle.com/bpel/extension"
         xmlns:bpel="http://docs.oasis-open.org/wsbpel/2.0/process/executable"
         xmlns:ns1="http://xmlns.oracle.com/HelloWorld/HelloWorld/HelloWorld">

to

<process xmlns="http://docs.oasis-open.org/wsbpel/2.0/process/executable" xmlns:bpelx="http://schemas.oracle.com/bpel/extension" name="CallHelloWorld" targetNamespace="http://xmlns.oracle.com/HelloWorld/CallHelloWorld/CallHelloWorld">

several namespaces that were in use had been removed as well, such as the ns1 namespace, which is used in an attribute value of a partnerLink definition:

<partnerLink name="HelloWorld" partnerLinkType="ns1:HelloWorld" partnerRole="HelloWorldProvider"/>

My conclusion was that prebuilt functions like the one above would most likely not solve this problem. I did not find a way to limit the cleanup to specific namespaces.
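
The behaviour described above can be reproduced on a small document. The sketch below uses a hypothetical, shortened stand-in for the BPEL file (the urn:example URIs are made up for illustration). It also tries the keep_ns_prefixes argument of cleanup_namespaces, which as far as I know is only available in newer lxml releases (3.5 and later), so treat its availability as an assumption to check against your installed version.

```python
import lxml.etree as et

# Hypothetical, shortened stand-in for the BPEL file; the URIs are made up.
XML = (
    b'<process xmlns="urn:example:bpel" xmlns:ora="urn:example:ora" '
    b'xmlns:ns1="urn:example:hello">'
    b'<partnerLink name="HelloWorld" partnerLinkType="ns1:HelloWorld"/>'
    b'</process>'
)

# Plain call: ns1 only occurs inside an attribute value, so lxml treats it
# as unused and removes it together with ora.
root = et.fromstring(XML)
et.cleanup_namespaces(root)
plain_nsmap = dict(root.nsmap)

# Assumption: keep_ns_prefixes exists in the installed lxml (3.5 or later).
# The listed prefixes are kept even if no element or attribute name uses them.
root = et.fromstring(XML)
et.cleanup_namespaces(root, keep_ns_prefixes=["ns1"])
kept_nsmap = dict(root.nsmap)

print(plain_nsmap)
print(kept_nsmap)
```

The prefixes to keep still have to be determined somehow, because lxml does not look inside attribute values or element text.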

Determining used namespaces

I tried a different approach: determine the namespaces declared on the root element, look for each of them in different places in the BPEL file, and then rewrite the root element with only the namespaces that are in use.

import lxml.etree as et
import copy
filename_in='D:\\dev\\HelloWorld\\CallHelloWorld\\CallHelloWorld.bpel'
filename_out='D:\\dev\\HelloWorld\\CallHelloWorld\\CallHelloWorld.bpel.out'
tree = et.parse(filename_in)
root=tree.getroot()
nsmap=root.nsmap
nsmapnew= copy.deepcopy(nsmap)

#print (nsmap.values())
namespaces=set(nsmap.values())
print ("Namespaces found: "+str(len(nsmap)))
for nsitem in nsmap:
    found=0
    if nsitem is not None:
        print ("Processing prefix: "+nsitem+" Namespace: "+nsmap.get(nsitem))
        #processing all elements
        # iter() replaces the deprecated getiterator()
        walkAll = tree.iter()
        for elt in walkAll:
            #check element: namespace of this element, not of the document root
            eltns=elt.xpath('namespace-uri(.)')
            if eltns==nsmap.get(nsitem):
                found=1
                #print("Found namespace as element namespace")
                break
            if str(elt.text).find(nsitem+":") != -1:
                found=1
                #print("Found prefix as part of element text")
                break
            #check attributes
            for attribute in elt.attrib:
                if attribute.startswith("{"+nsmap.get(nsitem)+"}"):
                    found=1
                    #print ("Found namespace as attribute name namespace")
                    break
                if str(elt.attrib[attribute]).find(nsitem+":") != -1:
                    found=1
                    #print ("Found prefix as part of attribute value")
                    break
    else:
        #default namespace not removing
        found=1
    if found==0:
        print("Not found")
        del nsmapnew[nsitem]
print ("Namespaces remaining: "+str(len(nsmapnew)))

new_root = et.Element(root.tag, attrib=root.attrib,nsmap=nsmapnew)
new_root[:] = root[:]

#to add the top level comment
firstcomment=root.getprevious()
if firstcomment is not None:
    new_root.addprevious(firstcomment)

tree = et.ElementTree(new_root)

tree.write(filename_out, xml_declaration=True, encoding='utf-8',pretty_print=True) 
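
The checks in the script above can also be packaged as a reusable function. The following is a minimal sketch, not the script itself: the function name prefix_in_use and the sample document are hypothetical. It differs from the script in two ways, which are my own assumptions about what is desirable: it skips comments, and it uses a regular expression instead of find() so that a prefix such as ns1 is not matched inside a longer name such as xns1.

```python
import re
import lxml.etree as et

def prefix_in_use(root, prefix, uri):
    """Return True if the prefix or its namespace URI is referenced below root."""
    # The prefix followed by ':' and not preceded by another name character.
    pattern = re.compile(r'(?<![\w.\-])' + re.escape(prefix) + ':')
    for elt in root.iter():
        if not isinstance(elt.tag, str):
            continue  # skip comments and processing instructions
        if et.QName(elt).namespace == uri:
            return True  # namespace of the element itself
        if elt.text and pattern.search(elt.text):
            return True  # prefix inside element text, e.g. an XPath expression
        for name, value in elt.attrib.items():
            if name.startswith('{' + uri + '}') or pattern.search(value):
                return True  # attribute name namespace or attribute value
    return False

# Hypothetical sample document; the URIs are made up.
sample = et.fromstring(
    b'<process xmlns="urn:example:bpel" xmlns:ora="urn:example:ora" '
    b'xmlns:ns1="urn:example:hello">'
    b'<partnerLink partnerLinkType="ns1:HelloWorld"/>'
    b'</process>'
)
unused = sorted(
    prefix for prefix, uri in sample.nsmap.items()
    if prefix is not None and not prefix_in_use(sample, prefix, uri)
)
print(unused)
```

Returning from the function as soon as a match is found also avoids walking the rest of the tree, which the nested break statements in the script do not do for matches found in attributes.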


This rewrote my XML the way I wanted. Namespaces that were only used in XPath expressions in the BPEL file were also recognized as being in use. One drawback, however, was that a namespace prefix was added to all elements, even though these elements are in the default namespace. I considered using a transformation to fix this, but that would remove the comments from the file, so I decided against it. The script output was as follows:

Namespaces found: 6
Processing prefix: ns1 Namespace: http://xmlns.oracle.com/HelloWorld/HelloWorld/HelloWorld
Processing prefix: ora Namespace: http://schemas.oracle.com/xpath/extension
Not found
Processing prefix: client Namespace: http://xmlns.oracle.com/HelloWorld/CallHelloWorld/CallHelloWorld
Processing prefix: bpel Namespace: http://docs.oasis-open.org/wsbpel/2.0/process/executable
Processing prefix: bpelx Namespace: http://schemas.oracle.com/bpel/extension
Namespaces remaining: 5


The created output file was as follows:

<?xml version='1.0' encoding='UTF-8'?>
<!--
////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
  Oracle JDeveloper BPEL Designer
 
  Created: Thu Apr 11 12:23:10 CEST 2013
  Author:  Maarten
  Type: BPEL 2.0 Process
  Purpose: Synchronous BPEL Process
////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
-->
<bpel:process xmlns:ns1="http://xmlns.oracle.com/HelloWorld/HelloWorld/HelloWorld" xmlns:client="http://xmlns.oracle.com/HelloWorld/CallHelloWorld/CallHelloWorld" xmlns:bpel="http://docs.oasis-open.org/wsbpel/2.0/process/executable" xmlns:bpelx="http://schemas.oracle.com/bpel/extension" xmlns="http://docs.oasis-open.org/wsbpel/2.0/process/executable" name="CallHelloWorld" targetNamespace="http://xmlns.oracle.com/HelloWorld/CallHelloWorld/CallHelloWorld"><bpel:import namespace="http://xmlns.oracle.com/HelloWorld/CallHelloWorld/CallHelloWorld" location="CallHelloWorld.wsdl" importType="http://schemas.xmlsoap.org/wsdl/"/>
    <!--
      ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
        PARTNERLINKS                                                     
        List of services participating in this BPEL process              
      ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
    -->
  <bpel:partnerLinks>
    <!--
      The 'client' role represents the requester of this service. It is
      used for callback. The location and correlation information associated
      with the client role are automatically set using WS-Addressing.
    -->
    <bpel:partnerLink name="callhelloworld_client" partnerLinkType="client:CallHelloWorld" myRole="CallHelloWorldProvider"/>
    <bpel:partnerLink name="HelloWorld" partnerLinkType="ns1:HelloWorld" partnerRole="HelloWorldProvider"/>
  </bpel:partnerLinks>

  <!--
    ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
      VARIABLES                                                       
      List of messages and XML documents used within this BPEL process
    ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
  -->
  <bpel:variables>
    <!-- Reference to the message passed as input during initiation -->
    <bpel:variable name="inputVariable" messageType="client:CallHelloWorldRequestMessage"/>

    <!-- Reference to the message that will be returned to the requester-->
    <bpel:variable name="outputVariable" messageType="client:CallHelloWorldResponseMessage"/>
    <bpel:variable name="InvokeHelloWorld_process_InputVariable" messageType="ns1:HelloWorldRequestMessage"/>
    <bpel:variable name="InvokeHelloWorld_process_OutputVariable" messageType="ns1:HelloWorldResponseMessage"/>
  </bpel:variables>

  <!--
    ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
     ORCHESTRATION LOGIC                                              
     Set of activities coordinating the flow of messages across the   
     services integrated within this business process                 
    ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
  -->
  <bpel:sequence name="main">

    <!-- Receive input from requestor. (Note: This maps to operation defined in CallHelloWorld.wsdl) -->
    <bpel:receive name="receiveInput" partnerLink="callhelloworld_client" portType="client:CallHelloWorld" operation="process" variable="inputVariable" createInstance="yes"/>
    <bpel:assign name="Assign1">
      <bpel:copy>
        <bpel:from>$inputVariable.payload/client:input</bpel:from>
        <bpel:to>$InvokeHelloWorld_process_InputVariable.payload/ns1:input</bpel:to>
      </bpel:copy>
    </bpel:assign>
    <bpel:invoke name="InvokeHelloWorld" partnerLink="HelloWorld" portType="ns1:HelloWorld" operation="process" inputVariable="InvokeHelloWorld_process_InputVariable" outputVariable="InvokeHelloWorld_process_OutputVariable" bpelx:invokeAsDetail="no"/>
    <bpel:assign name="Assign2">
      <bpel:copy>
        <bpel:from>$InvokeHelloWorld_process_OutputVariable.payload/ns1:result</bpel:from>
        <bpel:to>$outputVariable.payload/client:result</bpel:to>
      </bpel:copy>
    </bpel:assign>
    <!-- Generate reply to synchronous request -->
    <bpel:reply name="replyOutput" partnerLink="callhelloworld_client" portType="client:CallHelloWorld" operation="process" variable="outputVariable"/>
  </bpel:sequence>
</bpel:process>


The ora namespace had been removed from the process element. The file had also become smaller. The ora namespace is, however, a default Oracle namespace, so to make sure I didn't break anything, I compiled and deployed the altered process. This was successful. I also tested the script with more complex processes; the resulting BPEL files were still fully functional.

Possible follow-ups:

- expand the script to recursively process multiple files and file types
- determine used and unused namespaces for every element, not just the root element
- link the unused namespaces within a project to XSDs, which could also be removed from the project if no file in the project uses them
- exclude specific namespaces from cleaning if, for example, problems start to occur with XPath expressions after removing default Oracle namespaces
- remove the namespace prefix for the default namespace
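
Some of these follow-ups could be approached by cleaning the parsed tree in place instead of rebuilding the root element. The sketch below is one possible way, under stated assumptions: it relies on the keep_ns_prefixes argument of cleanup_namespaces (as far as I know available in lxml 3.5 and later), and the names clean_namespaces, always_keep and the urn:example URIs are hypothetical. It collects everything that looks like a prefix in attribute values and element text, and tells lxml to keep those prefixes. Because the elements are not recreated, they stay in the default namespace without a prefix.

```python
import re
import lxml.etree as et

# Anything that looks like "prefix:" in a value. This over-matches (for
# example "http:" in a URL), which is harmless: names that are not declared
# prefixes are simply ignored by cleanup_namespaces.
QNAME_PREFIX = re.compile(r'(?<![\w.\-])([A-Za-z_][\w.\-]*):')

def clean_namespaces(root, always_keep=()):
    """Remove unused namespace declarations in place and return root."""
    keep = set(always_keep)  # prefixes excluded from cleaning, e.g. 'ora'
    for elt in root.iter():
        if not isinstance(elt.tag, str):
            continue  # skip comments and processing instructions
        values = list(elt.attrib.values())
        if elt.text:
            values.append(elt.text)
        for value in values:
            keep.update(QNAME_PREFIX.findall(value))
    # Assumption: keep_ns_prefixes is supported by the installed lxml.
    et.cleanup_namespaces(root, keep_ns_prefixes=sorted(keep))
    return root

# Hypothetical sample with a bpel prefix that duplicates the default namespace.
XML = (
    b'<process xmlns="urn:example:bpel" xmlns:bpel="urn:example:bpel" '
    b'xmlns:ora="urn:example:ora" xmlns:ns1="urn:example:hello" '
    b'xmlns:bpelx="urn:example:bpelx">'
    b'<invoke portType="ns1:HelloWorld" bpelx:invokeAsDetail="no"/>'
    b'</process>'
)
cleaned = clean_namespaces(et.fromstring(XML))
print(et.tostring(cleaned).decode())
```

Passing always_keep=('ora',) would leave the ora declaration alone. Since cleanup_namespaces visits every element, declarations on nested elements are handled as well, not only those on the root element.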

2 comments:

  1. Hi ,

    I am dealing with a scenario where I have to call a web service that accepts only an XML argument as input and returns only XML as output (my service runs on WebLogic and the service I am trying to call runs on GlassFish).
    ----------------------------------------------------------------------------------------------------------------------------------------------
    example:- my BPEL's inputs and outputs corresponding webservice inputs and outputs are like.

    @ BPEL input and outputs
    ---------------------------------------------
    *Input to my BPEL-->>
    request ---username type "string"
    ---password type "string"


    *output from my BPEL-->>
    response ---status type=" string"
    ---code type ="string"



    @ the webservice which i am trying to call: input and outputs

    inputXML


    outputXML
    ----------------------------------------------------------------------------------------------------------------------------------------------
    The reference side has only one input, the input XML. After a successful login it should return the response in a format my BPEL process can work with, so unless I find a way to convert the input of my BPEL process (input XSD) to XML, I will not be able to do this. I have tried the XPath function ora:parseEscapedXML() to take an XML string and convert it to DOM format for processing in BPEL, but I am getting an XPath error. Please help me out if you have any solutions.

    #Secondly
    To escape the above deadlock, we placed a wrapper layer between the two services (my services run on WebLogic and the service I am trying to call runs on GlassFish). With this approach we successfully hit the service, but instead of a response we get the error "error getting response;java.net.SocketTimeoutException:Read timed in weblogic". Please help me out if you have any solutions.

    Thanks and Regards
    Abhishek Ajral

    1. Hi,

      Your questions/problems are not very clear to me. If you want to convert a string to XML, you can use ora:parseEscapedXML() and assign the resulting XML to a variable of the same type. 'getting Xpath error' is not very specific. Which error are you getting and when are you getting it?

      If you want to convert an XML element to a string you can use ora:getContentAsString().

      I also do not understand your second question. Can you provide an image of what you are trying to accomplish? What do you want to do with the wrapper?

      Also you can't directly post XML in comments on this blog but have to use HTML escape characters (http://www.theukwebdesigncompany.com/articles/entity-escape-characters.php)

      With kind regards,
      Maarten
