Showing posts with label monitor. Show all posts

Sunday, April 29, 2018

A simple dashboard to monitor HTTP endpoints

To monitor different environments, it is not unusual to use a monitoring dashboard to obtain information about the status of different servers. This blog post describes some considerations for implementing a simple monitoring dashboard and some of the challenges I encountered. The simple-dashboard I've used in this post runs solely in a browser and does not have a server-side component.

Saturday, November 2, 2013

A first look at Oracle Business Transaction Management (BTM)


Monitoring and debugging Oracle SOA Suite environments is a topic that often does not get much attention. The people who write the software are often not much involved in the maintenance of the running software, and the maintenance people do not have much application knowledge.

Usually Oracle SOA applications are composed of several components which interact with each other. These components are often built with different technologies. Personally I tend to use databases, BPEL processes and Java web services a lot. For the maintenance people it is difficult to understand all application call chains and to debug them in case of problems.

I was curious whether the Oracle BTM product would provide a solution for this: http://www.oracle.com/technetwork/oem/btm-496775.html. BTM is part of the SOA Management Pack Enterprise Edition. It used to be a product from AmberPoint before Oracle bought them in 2010. I've used SOA Suite 11.1.1.7 with BTM 12.1.0.4.1 on an XE 11gR2 database for this tryout. I did not read much documentation about this product or follow any courses, so it would also be a test of intuitiveness.

Wednesday, March 27, 2013

Monitoring Oracle SOA Suite 11g composites using HTTP polling

This post is about efficient, vendor-neutral monitoring of Oracle SOA composites by providing a mechanism which allows HTTP polling for the state/mode on a per-composite basis. First I'll describe the reasons for monitoring composite state/mode. Then I'll provide code which allows HTTP polling of the state/mode in order to link it to various monitoring tools or custom dashboards.

What is composite state?

Two properties determine how a composite accepts requests and processes already running instances: composite mode and composite state. The mode can be active or retired and the state can be on or off. If the mode is active, the composite is ready to accept requests. If the mode is retired, running instances will finish but no new instances can be created. If the state is on, the composite is loaded. If the state is off, the composite is not loaded (shut down) and as such will not process running instances or pick up new requests. The mode and state can be changed manually from Enterprise Manager Fusion Middleware Control. They can also be changed through the API, for example as part of an exception handling mechanism.
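The rules above can be written down as a small truth table in code. The following is only an illustrative sketch: the class and method names (CompositeStatus, acceptsNewRequests, processesRunningInstances) are made up for this example and are not part of any Oracle API.

```java
/** Illustrative model of composite state and mode; not an Oracle API. */
final class CompositeStatus {
    enum State { ON, OFF }
    enum Mode { ACTIVE, RETIRED }

    private final State state;
    private final Mode mode;

    CompositeStatus(State state, Mode mode) {
        this.state = state;
        this.mode = mode;
    }

    /** New instances are only created when the composite is loaded and active. */
    boolean acceptsNewRequests() {
        return state == State.ON && mode == Mode.ACTIVE;
    }

    /** Running instances continue while the composite is loaded, also when it is retired. */
    boolean processesRunningInstances() {
        return state == State.ON;
    }

    public static void main(String[] args) {
        // Print all four combinations of state and mode.
        for (State s : State.values()) {
            for (Mode m : Mode.values()) {
                CompositeStatus c = new CompositeStatus(s, m);
                System.out.println("State: " + s + ", Mode: " + m
                        + " -> new requests: " + c.acceptsNewRequests()
                        + ", running instances: " + c.processesRunningInstances());
            }
        }
    }
}
```

The only combination in which the composite is fully available is state on with mode active, which is why both values are worth monitoring.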

In the image below you can see the state/mode of all the composites in a partition in Fusion Middleware Control:

In the image below you can see the buttons in Fusion Middleware Control to retire or shut down a composite. Once this is done, buttons for activating/starting the composite become visible.


Why can composite state be important to monitor?

Starting the server

When an Oracle SOA Suite server starts, resources can be pulled in and loaded as part of a composite. This is the case, for example, when external web services are called and remote XSDs and WSDLs are used (references). If resources cannot be loaded this way, the loading of the entire server might stall. In the WebLogic console, however, the server will have the state RUNNING, while the SOA server will not allow many actions from the web interface since it is still starting. How to check the SOA server state with a servlet is illustrated on http://javaoraclesoa.blogspot.nl/2012/11/soa-suite-cluster-deployments-and.html. While the server is starting, the composite which is being started will usually have state off. Composites which have not been started yet will have state unknown. If the server hangs during start, this can help identify the problem. Sometimes a manual undeploy of a faulty process is required. This is described on http://shrikworld.blogspot.nl/2011/04/how-to-undeploy-composite-manually.html. Monitoring the state/mode of individual composites, in addition to monitoring the entire SOA server state, might thus provide an indication of start-up issues.

Exception handling

When using certain methods of exception handling in Oracle SOA Suite 11g, composites can be put in retired mode in order to stop accepting new messages after a fault situation, for example when using the method described on http://javaoraclesoa.blogspot.nl/2012/06/exception-handling-fault-management-and.html. Usually an error event is triggered which is linked to sending an e-mail to the person responsible for solving the problem. This person, however, can take some time before checking or responding to the mail and taking action. In the meantime, the number of requests waiting to be processed can increase, so that after the composite is activated again, more time is needed before the system is capable of processing new requests. Sometimes this mechanism might be sufficient, but a little redundancy in informing people doesn't hurt.

Common monitoring tools

Organizations (especially the larger ones) often use monitoring tooling to determine the state of their server park. If a critical system fails, these monitoring tools immediately inform the person responsible for maintaining server stability, even at night or on weekends. This avoids the problem which can arise when the mail indicating a problem is overlooked. These monitoring tools also usually have dashboards which more people look at, so the chance that the problem is solved quickly increases.

Integrating the raising of the error event from a composite with specific monitoring tooling causes a direct dependency (tight coupling) between the monitoring tooling and the composite error handling mechanism. It will also often require different disciplines/departments to work together. Because of this, it is often not advisable.

The monitoring tools are not always from the same software vendor as the systems which need to be monitored, and thus specific components and states/modes are usually not fully supported. Certain common vendor-neutral monitoring mechanisms are often used to allow monitoring of diverse systems. Usually these tools support HTTP polling, for example HP SiteScope (http://www8.hp.com/us/en/software-solutions/software.html?compURI=1174244#.UVKxsleyJh4).

Monitoring composite state by using HTTP polling

To monitor composite state via an HTTP polling mechanism, an HttpServlet can be deployed on the Oracle SOA server. The servlet provided in this post accepts an HTTP POST or GET request with the following parameters: name, partition, revision.

It then selects the composite and returns its state in the response. Its code is based on the previously mentioned sample on http://javaoraclesoa.blogspot.nl/2012/11/soa-suite-cluster-deployments-and.html, where the code is also explained.

You can test the servlet after deployment with a URL like the following (where of course the hostname needs to be replaced and you should refer to a composite which is present in your environment):
http://soabpm-vm:7001/DetermineBPELProcessStatus/determinebpelprocessstatus?name=HelloWorld&revision=1.0&partition=default

The output in my case is State: on, Mode: active

To test its function, you can retire the composite and confirm that the servlet returns: State: on, Mode: retired. You can also check the presence of a composite. If you provide a name/revision/partition which does not refer to a valid composite, the State and Mode fields will remain empty.
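A monitoring tool or custom dashboard would poll this URL and translate the response into a status. Below is a minimal sketch of such a client in plain Java. It assumes the response body contains text of the form "State: on, Mode: active" as shown above; the class name CompositePoller and the status labels OK, WARNING, CRITICAL and UNKNOWN are my own choices for illustration and not something the servlet or any monitoring product prescribes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CompositePoller {
    // Assumed response format: "State: <state>, Mode: <mode>"; both values may be empty.
    private static final Pattern STATUS =
            Pattern.compile("State:\\s*(\\w*).*?Mode:\\s*(\\w*)", Pattern.DOTALL);

    /** Maps the servlet response to a (hypothetical) monitoring status. */
    static String classify(String body) {
        Matcher m = STATUS.matcher(body);
        if (!m.find() || m.group(1).isEmpty() || m.group(2).isEmpty()) {
            return "UNKNOWN";   // empty fields: composite not found
        }
        if (!m.group(1).equalsIgnoreCase("on")) {
            return "CRITICAL";  // composite is not loaded
        }
        if (!m.group(2).equalsIgnoreCase("active")) {
            return "WARNING";   // loaded, but retired: no new instances
        }
        return "OK";
    }

    /** Fetches the response body of the given URL with short timeouts. */
    static String fetch(String url) throws IOException {
        HttpURLConnection con = (HttpURLConnection) URI.create(url).toURL().openConnection();
        con.setConnectTimeout(5000);
        con.setReadTimeout(5000);
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(con.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder sb = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line).append('\n');
            }
            return sb.toString();
        } finally {
            con.disconnect();
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("Usage: java CompositePoller <servlet url>");
            return;
        }
        try {
            System.out.println(classify(fetch(args[0])));
        } catch (IOException e) {
            // An unreachable server is also worth an alert.
            System.out.println("CRITICAL: " + e.getMessage());
        }
    }
}
```

Passing the test URL from above as the only argument prints a single status word, which is the kind of output most polling tools can match on.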

You can download the code here; https://dl.dropbox.com/u/6693935/blog/DetermineCompositeStatus.zip

Friday, September 7, 2012

Monitoring DataSources on Weblogic


As an Oracle SOA developer, I've often heard the phrase 'BPEL doesn't work!'. Almost always the cause can be found in backend systems which do not function as expected. The error only becomes visible when a service is executed which uses a specific resource. When people start complaining about BPEL, this is usually an indication that you should work on process feedback and error handling so the responsible party can quickly be identified. Finding the cause by trial and error is, however, often not what you want. A dashboard or script to monitor backend databases can help prevent such issues.

Often development and system test databases are not monitored as thoroughly as acceptance test or production environments. To be able to quickly identify a database which is malfunctioning (for example one taken down for maintenance without informing the developers), it is useful to have some tools and scripts available which you can run when needed. Usually this is quicker than using the Enterprise Manager. This is especially useful in complex environments where multiple systems are linked. In these scripts/tools, it is not a good idea to have the databases/users/passwords hardcoded, because that would require maintenance of the scripts in case of changes, and as a lazy developer you of course don't want that.

In this article I will describe two possible options for monitoring DataSources on Weblogic servers.

- The first option is a servlet which uses JNDI to obtain JDBC DataSources. This has the drawback that if the DataSource is not loaded correctly or has been disabled, it cannot be looked up using JNDI and is not visible. It can however also be used when the Db/Aq adapter is not used. A servlet can be accessed by anyone, reducing the amount of technical knowledge required to monitor the databases.

- The second option is by using WLST to obtain DataSources defined in the DbAdapter/AqAdapter and provide statistics. This is specific to the Db/Aq adapter and it's a WLST script, so a Middleware installation and login credentials to the server are required in order to execute it.

Both methods query for available DataSources. Because the DataSource itself is used, no usernames, passwords, hostnames, SIDs, etc. are required.

Implementation

Java

The servlet below does a JNDI lookup of JDBC DataSources and executes 'select sysdate from dual' on them. If the DataSource is not available (cannot be looked up via JNDI), it will not appear in the list. If for example a tablespace is full or an account is locked, you will however see it in the list as NOK (short for Not OK). It has not been extensively tested in error situations!

Output of the servlet can be for example:

When I lock the test user account and reset the connection pool:


Below is the servlet code. It can of course easily be improved (some people like colors and nice layouts while I tend to focus on functionality).

package ms.testapp;

import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;

import java.sql.Connection;
import java.sql.Statement;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Date;
import java.util.Hashtable;

import javax.naming.Binding;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import javax.sql.DataSource;

public class CheckDb extends HttpServlet {

    @SuppressWarnings("compatibility:-5693855291723951046")
    private static final long serialVersionUID = 1L;
    public CheckDb() {
        super();
    }
    public void doGet(HttpServletRequest request,
        HttpServletResponse response) throws ServletException,
            IOException {
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        out.println("<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.0 " +
                    "Transitional//EN\">\n" +
                    "<HTML>\n" +
                    "<HEAD><TITLE>Datasource status</TITLE></HEAD>\n" +
                    "<BODY>\n" +
                    listJDBCContextTable() + "</BODY></HTML>");
    }
    private Context getContext() throws NamingException {
        Hashtable myCtx = new Hashtable();
        myCtx.put(Context.INITIAL_CONTEXT_FACTORY,
                 "weblogic.jndi.WLInitialContextFactory");
        Context ctx = new InitialContext(myCtx);
        return ctx;
    }
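    private String checkDataSource(DataSource ds) {
        Connection conn = null;
        Statement st = null;
        try {
            conn = ds.getConnection();
            st = conn.createStatement();
            // A trivial query shows whether the pool hands out a working connection.
            st.execute("select sysdate mydate from dual");
            st.getResultSet().next();
            Date mydate = st.getResultSet().getDate("mydate");
            String date = mydate.toString();
            // java.sql.Date prints as yyyy-mm-dd; anything else is treated as a failure.
            if (date.length() == 10 && date.indexOf("-") == 4 &&
                date.lastIndexOf("-") == 7) {
                return "OK";
            } else {
                return "NOK";
            }
        } catch (Exception e) {
            return "NOK"; //getStackTrace(e);
        } finally {
            // Always give the statement and connection back, also after a failure.
            if (st != null) { try { st.close(); } catch (Exception ignore) { } }
            if (conn != null) { try { conn.close(); } catch (Exception ignore) { } }
        }
    }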
    private static String getStackTrace(Throwable e) {
        StringWriter sw = new StringWriter();
        PrintWriter pw = new PrintWriter(sw);
        e.printStackTrace(pw);
        return sw.toString();
    }
    private String listJDBCContextTable() {
        String output = "<table>";
        ArrayList<String> tab = new ArrayList<String>();
        try {
            tab = listContext((Context)getContext().lookup("jdbc"), "", tab);
            Collections.sort(tab);
            for (int i = 0; i < tab.size(); i++) {
                output += tab.get(i);
            }
            output += "</table>";
            return output;
        } catch (NamingException e) {
            return getStackTrace(e);
        }
    }
    private ArrayList<String> listContext(Context ctx, String indent,
        ArrayList<String> output) throws NamingException {
        String name = "";
        try {
            NamingEnumeration list = ctx.listBindings("");
            while (list.hasMore()) {
                Binding item = (Binding)list.next();
                String className = item.getClassName();
                name = item.getName();
                if (!(item.getObject() instanceof DataSource)) {
                    //output = output+indent + className + " " + name+"\n";
                } else {
                    output.add("<tr><td>" + name + "</td><td>" +
                              checkDataSource((DataSource)item.getObject()) +
                              "</td></tr>");
                }
                Object o = item.getObject();
                if (o instanceof javax.naming.Context) {
                    listContext((Context)o, indent + " ", output);
                }
            }
        } catch (NamingException ex) {
            output.add("<tr><td>" + name + "</td><td>" + getStackTrace(ex) +
                      "</td></tr>");
        }
        return output;
    }
}

You can download the JDev 11.1.1.6 project here; https://dl.dropbox.com/u/6693935/blog/DbUtils.zip 

I also found that not all DataSources allow remote JDBC calls, the MDS DataSource for example. When using a servlet, this is not a problem since the Java code runs on the server. When running a piece of Java code locally (from your laptop for example), however, it will not work and will throw: java.lang.UnsupportedOperationException: Remote JDBC disabled.
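The servlet above reports NOK when the test query throws an exception, but a database which hangs instead of failing could keep the page waiting. One possible guard, which is not part of the downloadable project, is to run each check with a time limit. The sketch below uses only java.util.concurrent. The class name TimedCheck and the idea of treating a timeout as NOK are assumptions for illustration, and on an application server a container-managed thread pool would be preferable to creating an executor per check.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimedCheck {
    /** Runs a check and maps exceptions and timeouts to "NOK". */
    static String run(Callable<String> check, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<String> result = executor.submit(check);
        try {
            return result.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            result.cancel(true);        // interrupt the hanging check
            return "NOK";
        } catch (ExecutionException e) {
            return "NOK";               // the check itself threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "NOK";
        } finally {
            executor.shutdownNow();     // do not leave worker threads behind
        }
    }

    public static void main(String[] args) {
        // A check which answers immediately.
        System.out.println(run(() -> "OK", 1000));
        // A check which takes longer than the allowed 100 ms.
        System.out.println(run(() -> { Thread.sleep(5000); return "OK"; }, 100));
    }
}
```

In the servlet, the Callable would wrap the call to checkDataSource for one DataSource, so a single slow database cannot block the whole status page for long.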

WLST

The script below is based on http://albinoraclesoa.blogspot.nl/2012/06/monitoring-jca-adapters-through-wlst.html and http://davidmichaelkarr.blogspot.nl/2008/10/make-wlst-scripts-more-flexible-with.html and was created by Marcel Bellinga. It contains an example of how to pass arguments to a WLST script and how to query/test DataSources from a WLST script. The DataSources for which statistics are printed are determined by querying the DbAdapter and AqAdapter connection pools.

import sys
import os
from java.lang import System

import getopt

user        = ''
credential  = ''
host        = ''
port        = ''
targetServerName  = ''

def usage():
    print "Usage:"
    print "ResourceAdapterMonitor -u user -c credential -h host -p port -s serverName"
   

def monitorDBAdapter(serverName):
    cd("/ServerRuntimes/"+str(serverName)+"/ApplicationRuntimes/DbAdapter/ComponentRuntimes/DbAdapter/ConnectionPools")
    connectionPools = ls(returnMap='true')
    print '--------------------------------------------------------------------------------'
    print 'DBAdapter Runtime details for '+ serverName
    print '--------------------------------------------------------------------------------'

    print '%10s %13s %15s %18s' % ('Connection Pool', 'State', 'Current', 'Created')
    print '%10s %10s %24s %21s' % ('', '', 'Capacity', 'Connections')
    print '--------------------------------------------------------------------------------'
    for connectionPool in connectionPools:
       if connectionPool!='eis/DB/SOADemo':
          cd('/')
          cd("ServerRuntimes/"+str(serverName)+"/ApplicationRuntimes/DbAdapter/ComponentRuntimes/DbAdapter/ConnectionPools/"+str(connectionPool))
          print '%15s %15s %10s %20s' % (cmo.getName(), cmo.getState(), cmo.getCurrentCapacity(), cmo.getConnectionsCreatedTotalCount())

    print '--------------------------------------------------------------------------------'


def monitorAQAdapter(serverName):
    cd("/ServerRuntimes/"+str(serverName)+"/ApplicationRuntimes/AqAdapter/ComponentRuntimes/AqAdapter/ConnectionPools")
    connectionPools = ls(returnMap='true')
    print '--------------------------------------------------------------------------------'
    print 'AqAdapter Runtime details for '+ serverName
    print '--------------------------------------------------------------------------------'

    print '%10s %13s %15s %18s' % ('Connection Pool', 'State', 'Current', 'Created')
    print '%10s %10s %24s %21s' % ('', '', 'Capacity', 'Connections')
    print '--------------------------------------------------------------------------------'
    for connectionPool in connectionPools:
       if connectionPool!='eis/DB/SOADemo':
          cd('/')
          cd("ServerRuntimes/"+str(serverName)+"/ApplicationRuntimes/AqAdapter/ComponentRuntimes/AqAdapter/ConnectionPools/"+str(connectionPool))
          print '%15s %15s %10s %20s' % (cmo.getName(), cmo.getState(), cmo.getCurrentCapacity(), cmo.getConnectionsCreatedTotalCount())

    print '--------------------------------------------------------------------------------'   

def parameters():
      global user
      global credential
      global host
      global port
      global targetServerName
      try:
        opts, args    = getopt.getopt(sys.argv[1:], "u:c:h:p:s:",
                                  ["user=", "credential=", "host=", "port=",
                                   "targetServerName="])
      except getopt.GetoptError, err:
        print str(err)
        usage()
        sys.exit(2)

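      # Accept both the short and the long form of every option declared above.
      for opt, arg in opts:
        if opt in ("-u", "--user"):
            user        = arg
        elif opt in ("-c", "--credential"):
            credential  = arg
        elif opt in ("-h", "--host"):
            host        = arg
        elif opt in ("-p", "--port"):
            port        = arg
        elif opt in ("-s", "--targetServerName"):
            targetServerName  = arg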
      if user == "":
        print "Missing \"-u user\" parameter."
        usage()
        sys.exit(2)
      if credential == "":
        print "Missing \"-c credential\" parameter."
        usage()
        sys.exit(2)
      if host == "":
        print "Missing \"-h host\" parameter."
        usage()
        sys.exit(2)
      if port == "":
        print "Missing \"-p port\" parameter."
        usage()
        sys.exit(2)
      if targetServerName == "":
        print "Missing \"-s targetServerName\" parameter."
        usage()
        sys.exit(2)           

   
def main():
      parameters()
      #connect(username, password, admurl)
      connect(user,credential,'t3://'+host+':'+port)
      servers = cmo.getServers()
      domainRuntime()
      cd("/ServerLifeCycleRuntimes/" + targetServerName)
      if cmo.getState() == 'RUNNING': 
        monitorAQAdapter(targetServerName)
        monitorDBAdapter(targetServerName)               
      disconnect()
   
main()


Conclusion

There are various ways to monitor backend systems and databases. It is useful to create your own dashboards, especially when there are a lot of systems involved and you don't want to (or can't) log in to the Enterprise Manager on every one of them. Make sure, though, that such unsecured dashboards don't end up on production systems. Depending on the problem with a database, a JNDI lookup might or might not work. The DbAdapter and AqAdapter have JDBC DataSources configured. It is useful to create a script which determines the DataSources based on the DbAdapter/AqAdapter configuration, since that listing contains all DataSources used by the adapter, even if they are not loaded successfully. That is the list of DataSources that should be tested. This can be done with WLST as shown in this post. Using a servlet, however, is more convenient than using WLST scripts, since the URL of the servlet can be mailed to, for example, testers so they can monitor the databases. WLST requires a usable Middleware installation and connection properties, which are not always available. I might create a Java servlet which provides the functionality of the WLST script mentioned in this post in the near future.