A few intrinsics of Apache ZooKeeper and their importance

From a bird’s-eye view, Apache ZooKeeper provides coordination services for managing distributed applications. It is responsible for providing configuration information, naming, synchronization, and group services over […]

Apache Zookeeper QuorumPeerMain

Resolving a startup issue with Apache ZooKeeper installed on a multi-node cluster

This short article explains how to resolve the error “Error: Could not find or load main class org.apache.zookeeper.server.quorum.QuorumPeerMain” when starting Apache ZooKeeper (apache-zookeeper-3.5.6.tar.gz) installed on a multi-node cluster. […]

Checksum HDFS

How checksums maintain data integrity in HDFS (Hadoop Distributed File System)

Ensuring data integrity is a basic necessity, the backbone of any big data processing environment, for achieving accurate outcomes. Of course, the same applies when executing any data-moving operation with […]

Manual procedure to add a new DataNode to an existing basic data lake, built using HDFS (Hadoop Distributed File System) on a multi-node cluster, without Apache Ambari or Cloudera Manager

This article highlights the essential steps for adding a new DataNode to an existing multi-node Hadoop cluster. Midsize or startup […]

Network topology for creating a multi-node hybrid cluster for Hadoop installation

This article outlines how to create a network topology for installing Hadoop on a multi-node hybrid cluster with limited hardware resources. This cluster would […]