Understanding the Supervisor and its specification in Apache Druid for real-time data ingestion from Apache Kafka

Although Apache Druid and Apache Kafka are both powerful open-source data processing tools, they serve different purposes. While Druid is a high-performance, column-oriented, real-time analytical database, Kafka is a distributed […]

Causes and remedies of poison pills in Apache Kafka

A poison pill is a message deliberately sent to a Kafka topic and designed to fail consistently when consumed, regardless of the number of consumption attempts. Poison pill scenarios are frequently […]

Apache Kafka’s built-in command-line tools – a hidden gem for inspecting internals

Several tools and scripts are included in the bin directory of the Apache Kafka binary installation. Although that directory contains a number of scripts, in this article I want to highlight […]

The significance of deep storage in Apache Druid

The phrase “deep storage” refers to the long-term storage system used by Apache Druid, where historical data segments are preserved for durability and later retrieval. Druid stores data […]

Forging Apache Druid with Apache Kafka for real-time streaming analytics

Apache Druid is a real-time analytics database designed for fast slice-and-dice analysis of massive data volumes. Druid works best with event-oriented data and is frequently used as the […]

Knowing and valuing Apache Kafka’s ISR (In-Sync Replicas)

To understand ISR in Apache Kafka more clearly, we should first carefully examine the replication process in the Kafka broker. In short, replication means having multiple copies of our […]

Handling bad messages via DLQ by configuring JDBC Kafka Sink Connector

Any trustworthy data streaming pipeline needs to be able to identify and handle faults. This is especially true when IoT devices continuously ingest critical data and events into permanent storage such as an RDBMS for future […]

Streaming Data to RDBMS via Kafka JDBC Sink Connector without leveraging Schema Registry

In today’s M2M (machine-to-machine) communications landscape, there is strong demand for streaming digital data from heterogeneous IoT devices to various RDBMSs for further analysis via […]

Resolving an Apache Kafka startup issue on a single-node or multi-node cluster

This short article explains how to resolve the error “ERROR Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer) kafka.common.InconsistentClusterIdException:” when starting an Apache Kafka instance installed and configured on a […]

A few intrinsics of Apache ZooKeeper and their importance

From a bird’s-eye view, Apache ZooKeeper provides coordination services for managing distributed applications. It is responsible for providing configuration information, naming, synchronization, and group services over […]