Friday, September 30, 2022

Apache NiFi: Monitoring metrics and provenance events using Azure Log Analytics

There are several cases where you might want to use Azure Log Analytics to monitor your NiFi instances. An obvious one is when NiFi is running in Azure. Azure Log Analytics can also serve as a single monitoring/alerting solution for multiple applications, which makes operations easier by providing one interface for all of them. This is useful, for example, when you want to monitor business processes that span multiple applications end to end, say to identify bottlenecks.

In this blog post I'll show you how easy it is to achieve this using the AzureLogAnalyticsReportingTask and the AzureLogAnalyticsProvenanceReportingTask in NiFi, and what you need to configure in Azure Log Analytics to make this work.

Azure Log Analytics configuration part 1: preparations

In order to use Log Analytics, you first need to create a Log Analytics workspace. This is pretty straightforward and described here. After you've created a workspace, you can obtain a "Workspace ID" and a "Primary key" under "Agent Management". You will need this information to configure NiFi.
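If you prefer the command line over the portal, the Azure CLI can also create a workspace and fetch its keys. The resource group and workspace names below are placeholders:

  az monitor log-analytics workspace create \
      --resource-group my-nifi-rg \
      --workspace-name my-nifi-workspace

  # Retrieve the primary/secondary shared keys for the workspace
  az monitor log-analytics workspace get-shared-keys \
      --resource-group my-nifi-rg \
      --workspace-name my-nifi-workspace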


NiFi configuration

In NiFi you can add reporting tasks to provide an external system with metrics. First, go to the Controller Settings:


Next, in the Reporting Tasks tab, you can add the AzureLogAnalyticsReportingTask and/or the AzureLogAnalyticsProvenanceReportingTask.


These require some configuration:


For the Log Analytics Workspace Id you can fill in the previously obtained Workspace ID. For the Log Analytics Workspace Key you can fill in the Primary key obtained earlier. It is also possible to specify specific process groups to monitor. If not specified, the metrics and provenance data will be gathered at the global level, which might be of less value.
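For illustration, the configuration could look like the sketch below. The workspace values are placeholders, and the property for selecting process groups may be named slightly differently depending on your NiFi version:

  Log Analytics Workspace Id:   11111111-2222-3333-4444-555555555555
  Log Analytics Workspace Key:  <Primary key from Agent Management>
  Process group ID(s):          <optional comma-separated list of process group IDs>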

The AzureLogAnalyticsReportingTask provides metrics such as JVM metrics and FlowFilesQueued. The latter is a useful one to monitor: in most cases this value shouldn't be too high, since a high value can indicate a bottleneck or an error. The AzureLogAnalyticsProvenanceReportingTask provides provenance events. These indicate when FlowFiles are created, when they are changed, when they expire, and so on. This allows you to monitor more specifically what is happening in your environment.

Azure Log Analytics configuration part 2: alerting

Now NiFi can send information to Azure Log Analytics, but where do you go from there?

Inside Log Analytics, the metrics and provenance events become available under Logs, Custom Logs, as tables you can query. In the example below I'm querying the FlowFilesQueued metric.
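Such a query could look like the following sketch. I'm assuming the reporting task writes to the default custom log name, which Log Analytics exposes as the table nifimetrics_CL; the field names with their custom-log type suffixes (Name_s, Value_d, ProcessGroupName_s) are illustrative, so check the actual column names in your workspace:

  // Average number of queued FlowFiles per process group, per 5 minutes, last hour
  nifimetrics_CL
  | where TimeGenerated > ago(1h)
  | where Name_s == "FlowFilesQueued"
  | summarize avg(Value_d) by ProcessGroupName_s, bin(TimeGenerated, 5m)
  | order by TimeGenerated desc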


You can query the provenance events in a similar way:
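For example, the sketch below counts provenance events by type and component over the last hour. Here too the table name (nifiprovenance_CL) and field names are assumptions to verify against your workspace:

  // Which provenance event types occur, and on which components?
  nifiprovenance_CL
  | where TimeGenerated > ago(1h)
  | summarize count() by eventType_s, componentName_s
  | order by count_ desc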


Based on a query, you can also create Alerts. First you need to create an Action group. An Action group is a collection of notifications that can be triggered by an Alert rule (see more here).

Next, you can of course test the Action group.


After you've defined the Action group, you can create an Alert rule that runs a query and triggers the Action group when the condition is met.
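The query behind such an Alert rule could, for example, look like the sketch below (same assumed table and field names as before): it fires when the maximum number of queued FlowFiles in any 5-minute window exceeds a threshold. The threshold of 1000 is an arbitrary placeholder to tune for your flows:

  nifimetrics_CL
  | where Name_s == "FlowFilesQueued"
  | summarize maxQueued = max(Value_d) by bin(TimeGenerated, 5m)
  | where maxQueued > 1000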


When the Alert fires, you can receive an email or an SMS. You can also get an overview of alerts that have fired and their status.


Keep in mind that Alert rules are not free.


Finally

This setup gives a basic idea of how you can get NiFi metrics and provenance events into Azure Log Analytics. It is of course not the entire story. The example alert I've configured in this blog post does not contain enough information to pinpoint an issue, and you might want to tweak how often the alert triggers.

Depending on the metrics/provenance/logging solution NiFi needs to feed, the approach differs. See below, for example, how you can achieve something similar for logging to Azure Log Analytics, and what feeding Elasticsearch with the same information could look like.

Logfiles to Azure Log Analytics

NiFi logfiles do not end up in Azure using the above method. Consider whether you need them there: achieving this is more difficult than shipping metrics and provenance data (often more people are involved), since it requires installation and maintenance of an Azure agent.

This can be done using the Azure Monitor Agent (the Azure Log Analytics agent is a legacy product). Read some details about this here. For the installation of the agent, an Azure Arc-enabled server is required when you want to run the agent on Linux. In order to use this agent to collect data, some configuration is required, which is described here. A limitation of the agent is that 'the log file must not allow circular logging, log rotation where the file is overwritten with new entries, or the file is renamed and the same file name is reused for continued logging.' This can be a serious limitation, since you usually do not want your logfile to keep growing forever. As an alternative you can use syslog for this. NiFi can log to syslog since it uses Logback for logging, and Logback has a SyslogAppender (read here); a sketch of this follows below.
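A minimal sketch of such an appender in NiFi's conf/logback.xml could look like this. The host, facility, and pattern are assumptions to adapt to your environment:

  <!-- Send NiFi application logging to a local syslog daemon -->
  <appender name="SYSLOG" class="ch.qos.logback.classic.net.SyslogAppender">
      <syslogHost>localhost</syslogHost>
      <facility>LOCAL0</facility>
      <suffixPattern>nifi: [%thread] %logger{36} - %msg</suffixPattern>
  </appender>

  <root level="INFO">
      <appender-ref ref="SYSLOG"/>
  </root>

The syslog daemon can then write the messages to a file (or forward them) in a way the Azure Monitor Agent can handle.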

Elasticsearch

Suppose you want to send metrics to Elasticsearch instead of Log Analytics. You can use the PrometheusReportingTask, which exposes the information on an endpoint on the NiFi server. You can then poll that endpoint from an InvokeHTTP processor. The information can be converted to JSON using ReplaceText and regular expressions, and the result can be sent to Elasticsearch using the PutElasticsearchJson processor. There is an ElasticsearchReportingTask available here, but it does not have much traction, and people from the NiFi project themselves don't think this would be the way to go about it (read here).
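To give an idea of the ReplaceText step, a Prometheus exposition line and a possible regular-expression conversion are sketched below. The metric name is illustrative, and the regex is a simplification (it ignores comment lines and metrics without labels), so the exact expression will need tuning:

  # An illustrative line from the PrometheusReportingTask endpoint:
  nifi_amount_flowfiles_queued{instance="nifi01",component_name="MyGroup"} 5.0

  # ReplaceText configuration (sketch):
  Replacement Strategy:  Regex Replace
  Evaluation Mode:       Line-by-Line
  Search Value:          ^(\w+)\{(.*)\}\s+(.+)$
  Replacement Value:     {"metric":"$1","labels":"$2","value":$3}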

Suppose you want to send logfiles to Elasticsearch, then consider tools like Filebeat to achieve this. There is an Elasticsearch appender for Logback available here which you can use instead, but this would require adding additional libraries to the NiFi installation. I have not tried this myself.
