Friday, October 26, 2012

Flexibility in generating PDFs from BPEL by using iText and XHTML

I needed a quick but flexible way to generate PDF files from BPEL. The project I was on was using the iText PDF library (http://itextpdf.com/). I had encountered this library before in the Oracle SOA Suite 11g Handbook by Lucas Jellema, where it is used, among other things, to demonstrate the Spring component. I decided to use this library but took a different approach from the one Lucas describes in his book, since I did not want to hardcode the layout of my PDF in Java code.

The iText library has code to convert XHTML to PDF. XHTML can be manipulated like any other XML in BPEL by using a transformation. This way I could put the layout logic in an XSL file, which makes layout work easy because there are few programmers who don't know HTML.

Implementation

You can download the code/samples by using the links in the Example part of the post.

GeneratePdf webservice

First I created a JAX-WS webservice to write my PDF to the file system and a servlet to download the PDF. The servlet is not secure: it serves any file on the file system that the server process can read to anyone who knows the filename. Do not use this in production environments!

I included the following JAR files for this:
itextpdf-5.3.1.jar
itextpdfa-5.3.1.jar
itext-xtra-5.3.1.jar
xmlworker-1.1.5.jar

The code I used was the following:

import com.itextpdf.text.Document;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.tool.xml.XMLWorkerHelper;

import java.io.File;
import java.io.FileOutputStream;
import java.io.PrintWriter;
import java.io.StringReader;

import java.io.StringWriter;

import javax.jws.WebMethod;
import javax.jws.WebParam;
import javax.jws.WebResult;
import javax.jws.WebService;

@WebService
public class GeneratePdf {
  public GeneratePdf() {
    super();
  }
 
  @WebMethod
  @WebResult(name = "status")
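  public String HtmlToPdf(@WebParam(name = "pdfpath") String pdfpath, @WebParam(name = "xhtml") String html) {
    Document document = new Document();
    FileOutputStream out = null;
    long size = 0;
    try {
      out = new FileOutputStream(pdfpath);
      PdfWriter writer = PdfWriter.getInstance(document, out);
      document.open();
      // Parse the XHTML string and add the resulting elements to the PDF document
      XMLWorkerHelper.getInstance().parseXHtml(writer, document, new StringReader(html));
      document.close();
      size = new File(pdfpath).length();
    } catch (Exception e) {
      // The stack trace is returned to the caller to ease debugging from BPEL
      return "NOK: " + stackTraceToString(e);
    } finally {
      // Release the file handle even when parsing fails; closing twice is harmless
      if (out != null) {
        try {
          out.close();
        } catch (Exception ignored) {
        }
      }
    }
    return "OK: " + "File created. Size: " + size;
  }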
 
  @WebMethod(exclude=true)
  public static String stackTraceToString(Throwable e) {
    StringWriter sw = new StringWriter();
    e.printStackTrace(new PrintWriter(sw));
    return sw.toString();

  }
  /*
  public static void main(String[] args) {
    GeneratePdf myPdf = new GeneratePdf();
    System.out.println(myPdf.HtmlToPdf("c:\\temp\\output3.pdf", "<html><head/><body><p>Hello world</p></body></html>"));
  }
  */
}


DownloadPdf servlet

The code I used for the servlet to download the PDF is the following:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

import java.io.InputStream;

import java.io.OutputStream;

import javax.servlet.ServletContext;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class DownloadPdf extends HttpServlet {
  private static final long serialVersionUID = 4537516508557609572L;

  public DownloadPdf() {
    super();
  }
 
  public void doGet(HttpServletRequest request, HttpServletResponse response)
    throws ServletException, IOException, FileNotFoundException {

    // WARNING: the filename parameter is used as-is, so any readable file can be downloaded
    String filename = request.getParameter("filename");
    System.out.println("parameter filename: " + filename);
    File file = new File(filename);
    long length = file.length();
    System.out.println("size: " + length);
    InputStream is = new FileInputStream(file);
    try {
      response.addHeader("content-disposition", "attachment; filename=mypdf.pdf;");
      response.setContentType("application/pdf");
      response.setContentLength(Long.valueOf(length).intValue());

      // Copy the file to the response in 1 KB chunks
      int read = 0;
      byte[] bytes = new byte[1024];
      OutputStream os = response.getOutputStream();
      while ((read = is.read(bytes)) != -1) {
        os.write(bytes, 0, read);
      }
      os.flush();
      os.close();
    } finally {
      // Always release the file handle, also when the client aborts the download
      is.close();
    }
  }
}
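
As noted above, the servlet serves any file whose path is passed in. One way to narrow this down is to accept only a bare file name and resolve it inside a fixed download directory. The sketch below is not part of the downloadable sample; the class name PdfPathGuard and its resolve method are made up for illustration, and it assumes the PDFs all live directly in one directory.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical helper, not part of the original servlet. */
final class PdfPathGuard {

  private PdfPathGuard() {
  }

  /**
   * Resolves requestedName inside baseDir and rejects anything that is not
   * a .pdf file or that ends up outside of baseDir.
   */
  static File resolve(File baseDir, String requestedName) {
    if (requestedName == null || !requestedName.toLowerCase().endsWith(".pdf")) {
      throw new IllegalArgumentException("Only PDF files can be downloaded");
    }
    try {
      File base = baseDir.getCanonicalFile();
      // getCanonicalFile() resolves ".." segments, so traversal attempts become visible
      File candidate = new File(base, requestedName).getCanonicalFile();
      if (!base.equals(candidate.getParentFile())) {
        throw new IllegalArgumentException("File is outside the download directory");
      }
      return candidate;
    } catch (IOException e) {
      throw new IllegalArgumentException("Cannot resolve " + requestedName, e);
    }
  }
}
```

In doGet the file could then be obtained with PdfPathGuard.resolve(new File("/tmp"), request.getParameter("filename")), where the filename parameter carries only the name of the PDF and no longer the full path. This narrows what can be downloaded but does not add authentication.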


Converting to XHTML

The above webservice and servlet are the groundwork for generating PDFs flexibly from BPEL. To transform messages in BPEL to XHTML, it helps that XSDs are available for XHTML. See for example http://www.w3.org/2002/08/xhtml/xhtml1-strict.xsd.

Of course this XSD is quite complex and it did not immediately work in JDeveloper, so I had to make some alterations. I removed the following import:

<xs:import namespace="http://www.w3.org/XML/1998/namespace" schemaLocation="http://www.w3.org/2001/xml.xsd"/>

I also removed all references to the xml namespace. Only the following two lines were involved, each occurring a small number of times:
<xs:attribute ref="xml:lang"/>
<xs:attribute ref="xml:space" fixed="preserve"/>

Now I was able to define a variable of the type specified in the XHTML XSD and use it as the target of a transformation. In this example I created a "Hello [name]" XHTML document, which I then passed to the webservice to create the PDF. The XSL transformation I created is shown below.

<?xml version="1.0" encoding="UTF-8" ?>
<?oracle-xsl-mapper
  <!-- SPECIFICATION OF MAP SOURCES AND TARGETS, DO NOT MODIFY. -->
  <mapSources>
    <source type="WSDL">
      <schema location="../HelloWorldPDF_BPEL.wsdl"/>
      <rootElement name="process" namespace="http://xmlns.oracle.com/PdfUtils/HelloWorldPDF/HelloWorldPDF_BPEL"/>
    </source>
  </mapSources>
  <mapTargets>
    <target type="WSDL">
      <schema location="../HelloWorldPDF_BPEL.wsdl"/>
      <rootElement name="html" namespace="http://www.w3.org/1999/xhtml"/>
    </target>
  </mapTargets>
  <!-- GENERATED BY ORACLE XSL MAPPER 11.1.1.6.0(build 111214.0600.1553) AT [FRI OCT 26 09:44:31 CEST 2012]. -->
?>
<xsl:stylesheet version="1.0"
                xmlns:bpws="http://schemas.xmlsoap.org/ws/2003/03/business-process/"
                xmlns:xp20="http://www.oracle.com/XSL/Transform/java/oracle.tip.pc.services.functions.Xpath20"
                xmlns:mhdr="http://www.oracle.com/XSL/Transform/java/oracle.tip.mediator.service.common.functions.MediatorExtnFunction"
                xmlns:bpel="http://docs.oasis-open.org/wsbpel/2.0/process/executable"
                xmlns:oraext="http://www.oracle.com/XSL/Transform/java/oracle.tip.pc.services.functions.ExtFunc"
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xmlns:dvm="http://www.oracle.com/XSL/Transform/java/oracle.tip.dvm.LookupValue"
                xmlns:hwf="http://xmlns.oracle.com/bpel/workflow/xpath"
                xmlns:plnk="http://docs.oasis-open.org/wsbpel/2.0/plnktype"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:med="http://schemas.oracle.com/mediator/xpath"
                xmlns:ids="http://xmlns.oracle.com/bpel/services/IdentityService/xpath"
                xmlns:bpm="http://xmlns.oracle.com/bpmn20/extensions"
                xmlns:xdk="http://schemas.oracle.com/bpel/extension/xpath/function/xdk"
                xmlns:xref="http://www.oracle.com/XSL/Transform/java/oracle.tip.xref.xpath.XRefXPathFunctions"
                xmlns:client="http://xmlns.oracle.com/PdfUtils/HelloWorldPDF/HelloWorldPDF_BPEL"
                xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                xmlns:ns1="http://www.w3.org/1999/xhtml"
                xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
                xmlns:bpmn="http://schemas.oracle.com/bpm/xpath"
                xmlns:ora="http://schemas.oracle.com/xpath/extension"
                xmlns:socket="http://www.oracle.com/XSL/Transform/java/oracle.tip.adapter.socket.ProtocolTranslator"
                xmlns:ldap="http://schemas.oracle.com/xpath/extension/ldap"
                exclude-result-prefixes="xsi xsl plnk client xsd ns1 wsdl bpws xp20 mhdr bpel oraext dvm hwf med ids bpm xdk xref bpmn ora socket ldap">
  <xsl:template match="/">
    <ns1:html>
      <ns1:body>
        <ns1:p>
          <xsl:value-of select="concat('Hello ',/client:process/client:input)"/>
        </ns1:p>
      </ns1:body>
    </ns1:html>
  </xsl:template>
</xsl:stylesheet>



You can also observe in the XSL that the format used for the actual layout of the PDF is XHTML.
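
To try such a stylesheet outside BPEL, the JDK's own XSLT support (javax.xml.transform) is enough. The following is a minimal sketch under a few assumptions: the stylesheet is a trimmed copy of the one above that keeps only the namespaces the template needs, the xsl:output element is my addition to suppress the XML declaration, and the class name HelloXhtml is made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical test harness for the Hello [name] stylesheet. */
final class HelloXhtml {

  static final String NS = "http://xmlns.oracle.com/PdfUtils/HelloWorldPDF/HelloWorldPDF_BPEL";

  // Trimmed copy of the stylesheet; xsl:output is an addition for this sketch
  static final String XSL =
      "<xsl:stylesheet version='1.0'"
    + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
    + " xmlns:client='" + NS + "'"
    + " xmlns:ns1='http://www.w3.org/1999/xhtml'"
    + " exclude-result-prefixes='client'>"
    + "<xsl:output omit-xml-declaration='yes'/>"
    + "<xsl:template match='/'>"
    + "<ns1:html><ns1:body><ns1:p>"
    + "<xsl:value-of select=\"concat('Hello ',/client:process/client:input)\"/>"
    + "</ns1:p></ns1:body></ns1:html>"
    + "</xsl:template></xsl:stylesheet>";

  private HelloXhtml() {
  }

  /** Builds the BPEL input message for the given name and transforms it to XHTML. */
  static String toXhtml(String name) {
    // For brevity the name is not XML-escaped; it must not contain markup characters
    String input = "<process xmlns='" + NS + "'><input>" + name + "</input></process>";
    try {
      Transformer transformer =
          TransformerFactory.newInstance().newTransformer(new StreamSource(new StringReader(XSL)));
      StringWriter out = new StringWriter();
      transformer.transform(new StreamSource(new StringReader(input)), new StreamResult(out));
      return out.toString();
    } catch (TransformerException e) {
      throw new IllegalStateException("Transformation failed", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(toXhtml("World"));
  }
}
```

The returned string is what would be passed as the xhtml parameter of the HtmlToPdf operation, which makes it easy to iterate on the layout without redeploying the BPEL process.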

Example

Now putting it all together:
Download and deploy the PDF webservice/servlet: https://dl.dropbox.com/u/6693935/blog/PdfUtils.zip

Download and deploy the sample BPEL project: https://dl.dropbox.com/u/6693935/blog/PdfUtilsBPEL.zip. Keep in mind that you will most likely need to change the URL of the PDF webservice.

Test the webservice

Check that the PDF has been created by looking in the /tmp folder of the server running SOA Suite. I used /tmp since I used an Oracle Enterprise Linux SOA Suite installation. You might want to change the path if you're running the server on Windows.

After you've determined the filename, you can look at the result. Download the PDF by going to (in my case): http://soabpm-vm:7001/PdfUtils//DownloadPdf.do?filename=/tmp/PDF2012-10-26T00:51:19.pdf
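
The URL above works as typed, but a file name containing spaces or ampersands would break the query string. If the download link is built in code, it is safer to URL-encode the filename parameter. A small sketch; the class name DownloadUrl is made up for illustration and the servlet URL is simply passed in.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper that builds a download link for the DownloadPdf servlet. */
final class DownloadUrl {

  private DownloadUrl() {
  }

  /** Appends the URL-encoded PDF path as the filename query parameter. */
  static String build(String servletUrl, String pdfPath) {
    try {
      return servletUrl + "?filename=" + URLEncoder.encode(pdfPath, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // Cannot happen: every Java platform is required to support UTF-8
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(build("http://soabpm-vm:7001/PdfUtils/DownloadPdf.do",
                             "/tmp/PDF2012-10-26T00:51:19.pdf"));
  }
}
```

The servlet container decodes the parameter again before request.getParameter("filename") returns it, so the servlet itself does not need to change.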

 
Final thoughts

The conversion from XHTML to PDF is not perfect. Complex layouts cause problems. I did some try-outs with tables and they were only partially successful. For more complex layouts, it's worthwhile to look at Apache FOP (http://xmlgraphics.apache.org/fop/) and base your PDF generation on that. Apache FOP can also be used with Oracle APEX. A drawback is that it introduces its own layout language (XSL-FO).

If you need to convert newlines to <br/> elements (for example when converting texts), you can look at http://www.danrigsby.com/blog/index.php/2008/01/03/preserving-line-breaks-in-xml-while-transforming-to-html-with-xslt/ for an XSL template to use.
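
To give an idea of what such a template looks like, here is a generic recursive XSLT 1.0 template wrapped in a small Java harness so it can be run with the JDK alone. It is a sketch of the general technique and not necessarily the template from the linked post; the names nl2br, LineBreaks and the text input element are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical demo of a recursive newline-to-br template. */
final class LineBreaks {

  static final String XSL =
      "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
    + "<xsl:output omit-xml-declaration='yes'/>"
    + "<xsl:template match='/text'>"
    + "<p><xsl:call-template name='nl2br'>"
    + "<xsl:with-param name='s' select='string(.)'/>"
    + "</xsl:call-template></p>"
    + "</xsl:template>"
    // Emit the text before the first newline, a br element, then recurse on the rest
    + "<xsl:template name='nl2br'>"
    + "<xsl:param name='s'/>"
    + "<xsl:choose>"
    + "<xsl:when test=\"contains($s, '&#10;')\">"
    + "<xsl:value-of select=\"substring-before($s, '&#10;')\"/>"
    + "<br/>"
    + "<xsl:call-template name='nl2br'>"
    + "<xsl:with-param name='s' select=\"substring-after($s, '&#10;')\"/>"
    + "</xsl:call-template>"
    + "</xsl:when>"
    + "<xsl:otherwise><xsl:value-of select='$s'/></xsl:otherwise>"
    + "</xsl:choose>"
    + "</xsl:template>"
    + "</xsl:stylesheet>";

  private LineBreaks() {
  }

  /** Applies the stylesheet to an XML document whose root element is text. */
  static String transform(String xml) {
    try {
      Transformer transformer =
          TransformerFactory.newInstance().newTransformer(new StreamSource(new StringReader(XSL)));
      StringWriter out = new StringWriter();
      transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
      return out.toString();
    } catch (TransformerException e) {
      throw new IllegalStateException("Transformation failed", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(transform("<text>line one\nline two</text>"));
  }
}
```

In the BPEL transformation only the nl2br template would be copied into the XSL file and called wherever multi-line text is written into the XHTML.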