Articles containing tips, tricks and nice to knows related to IT stuff I find interesting. Also serves as online memory.
Showing posts with label events. Show all posts
Wednesday, December 28, 2022
Apache NiFi: Filter events and only let through the latest in a timeframe
In the IoT world, some devices generate large volumes of events that can be difficult for back-end systems to process in real time. Of course, you can use NiFi to throttle messages. However, throttling is not sufficient if the flow of events is consistently higher than what the back-end system can handle, because the backlog then keeps growing. A way to deal with this is to let Apache NiFi group and filter messages based on a specific attribute and only let through the latest message for a specific device within a certain timeframe. In this blog post I'll illustrate how you can do this. The trick is to merge several messages together using the MergeContent processor and then select the latest one using a Jolt transformation.
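To make the filtering idea concrete, here is a minimal sketch in plain Python. It is not NiFi configuration and not the MergeContent/Jolt setup from the post; it only illustrates the logic of grouping events per device and timeframe and keeping the newest one. The field names `device_id`, `timestamp` and `value`, the function name `latest_per_device`, and the fixed 60-second windows are assumptions made for the example.

```python
def latest_per_device(events, window_seconds):
    """Keep only the newest event per (device, time window).

    Assumes each event is a dict with a 'device_id' and a numeric
    'timestamp' in seconds (hypothetical field names).
    """
    latest = {}
    for event in events:
        # Fixed-size windows: all timestamps in the same window share a bucket.
        bucket = event["timestamp"] // window_seconds
        key = (event["device_id"], bucket)
        current = latest.get(key)
        if current is None or event["timestamp"] >= current["timestamp"]:
            latest[key] = event
    # Return the survivors in chronological order.
    return sorted(latest.values(), key=lambda e: (e["timestamp"], e["device_id"]))


events = [
    {"device_id": "sensor-1", "timestamp": 61, "value": 20.1},
    {"device_id": "sensor-1", "timestamp": 75, "value": 20.4},
    {"device_id": "sensor-2", "timestamp": 80, "value": 7.0},
    {"device_id": "sensor-1", "timestamp": 110, "value": 20.9},
    {"device_id": "sensor-1", "timestamp": 125, "value": 21.3},
]

filtered = latest_per_device(events, window_seconds=60)
for event in filtered:
    print(event["device_id"], event["timestamp"], event["value"])
```

Of the three `sensor-1` events that fall in the first window, only the one at timestamp 110 survives. In NiFi the window would be determined by how MergeContent bins messages rather than by fixed buckets like these, so treat the windowing here as a simplification.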
Labels:
apache nifi,
big data,
devices,
events,
filter,
iot,
jolt,
mergecontent,
nifi,
throttle,
throttling,
timeframe
Tuesday, February 26, 2019
Filesystem events to Elasticsearch / Kibana through Kafka Connect / Kafka
Filesystem events are useful to monitor. They can indicate a security breach. They can also help you understand how a complex system works by showing which files it reads and writes.
When monitoring events, you can expect a lot of data to be generated quickly. Different systems might want to process the events, each at its own pace. It would also be nice to be able to replay events from the start or from a specific moment. Enter Kafka. To put the filesystem events in Kafka (from an output file), the Kafka Connect FileSourceConnector is used. To get the data from Kafka to Elasticsearch, the Kafka Connect ElasticsearchSinkConnector is used. Both connectors can be used without an Enterprise license.
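As a rough sketch of what the two connectors could look like, the Python below builds the JSON payloads you would typically submit to the Kafka Connect REST API (`POST /connectors`). The connector names, file path, topic name and Elasticsearch URL are placeholders invented for this example. The class names used are the ones I know from the file connector that ships with Apache Kafka and from the Confluent Elasticsearch sink; the exact set of required properties depends on the connector version (older sink versions also expect `type.name`, for example), so check the documentation for your release.

```python
import json


def file_source_payload(name, path, topic):
    """Payload for a file source connector that tails `path` into `topic`."""
    return {
        "name": name,
        "config": {
            "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
            "tasks.max": "1",
            "file": path,
            "topic": topic,
        },
    }


def elasticsearch_sink_payload(name, topic, es_url):
    """Payload for an Elasticsearch sink connector reading from `topic`."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
            "tasks.max": "1",
            "topics": topic,
            "connection.url": es_url,
            # Plain text lines carry no key or schema, so ignore both.
            "key.ignore": "true",
            "schema.ignore": "true",
        },
    }


# Placeholder values, not taken from the original post.
source = file_source_payload("fs-events-source", "/tmp/fs-events.log", "fs-events")
sink = elasticsearch_sink_payload("fs-events-sink", "fs-events", "http://localhost:9200")

print(json.dumps(source, indent=2))
print(json.dumps(sink, indent=2))
```

Note that the source connector takes a single `topic` while the sink takes `topics`; both must name the same Kafka topic for events to flow from the file to Elasticsearch.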