Categories: Tech Threads

Resolve permission issue among DataNodes with NameNode to establish Secure Shell /SSH without passphrase

SSH issue

Sometimes it has been observed that when we configure and deploy multi-node Hadoop cluster or add new DataNodes, there is a SSH permission issue in communication with Hadoop daemons. In a fully distributed environment when the cluster is alive and running, the NameNode (Hadoop core services like NodeManager, YARN, etc) uses SSH for communication with DataNodes very frequently. Simply, in other words, we can say monitoring the heartbeats of every configured slaves or DataNodes. The error in the terminal console appears as “Permission Denied (public key, password)” once we start the Hadoop daemons at NameNode ($ sbin /.start-dfs.sh ) in that cluster.

Most of the time we suspect that there was an issue in public private RSA key pair generation followed by granting accurate permissions. And we keep repeating those steps to resolve the issue.
Even though key-pair generation and permission grant were correct to connect via SSH

    $ ssh-keygen -t rsa -P ” -f ~/.ssh/id_rsa
    $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    $ chmod 0600 ~/.ssh/authorized_keys

and able to ssh to all the systems (designated for DataNodes) without a passphrase from NameNode terminal (without starting Hadoop daemons), there could be an issue of permission denied as mentioned above. This issue ideally might come upon unknowing modification/changes in the sshd_config file in any DataNode or Secondary NameNode (if configured in a separate system ) in the cluster. This file is available in /etc/ssh/ in Ubuntu 14.04. Here are the following parameters in the sshd_config that we need to be careful.

1. “PubKeyAuthentication” key should be uncommented with value “yes”

2. “PasswordAuthentication” key should be uncommented with value “yes”

3. The key “UsePAM” should be uncommented with value “no”

After verification with necessary corrections, restart the ssh service or reboot the systems.

sudo service network-manager restart
sudo service ssh restart

And finally, restart the cluster after the successful format of NameNode. The error will disappear and successfully starts all the DataNode in the cluster. We used Ubuntu 14.04 as OS in the multi-node cluster.

Written by
Gautam Goswami

Can be contacted for real time POC development and hands-on technical training. Also to develop/support any Hadoop related project. Email:- gautam@onlineguwahati.com, gautambangalore@gmail.com. Gautam is a consultant as well as Educator. Prior to that, he worked as Sr. Technical Architect in multiple technologies and business domain across many countries. Currently, he is specializing in Big Data processing and analysis, Data lake creation, architecture etc. using HDFS. Besides, involved in HDFS maintenance and loading of multiple types of data from different sources, Design and development of real time use case development on client/customer demands to demonstrate how data can be leveraged for business transformation, profitability etc. He is passionate about sharing knowledge through blogs, training, seminars, presentations etc. on various Big Data related technologies, methodologies, real time projects with their architecture /design, multiple procedure of huge volume data ingestion, basic data lake creation etc.

Page: 1 2

Next How Data Analytics is transforming Sports Performance – Captivating »

Previous « Issue in starting node and resource manager in Apache Hadoop 3.2.0 with OpenJDK 11

Real-Time at Sea: Harnessing Data Stream Processing to Power Smarter Maritime Logistics

According to the International Chamber of Shipping, the maritime industry has increased fourfold in the… Read More

2 weeks ago

Tech Threads

Driving Streaming Intelligence On-Premises: Real-Time ML with Apache Kafka and Flink

Lately, companies, in their efforts to engage in real-time decision-making by exploiting big data, have… Read More

2 months ago

Tech Threads

Dark Data Demystified: The Role of Apache Iceberg

Lurking in the shadows of every organization is a silent giant—dark data. Undiscovered log files,… Read More

3 months ago

Tech Threads

The Role of Materialized Views in Modern Data Stream Processing Architectures + RisingWave

Incremental computation in data streaming means updating results as fresh data comes in, without redoing… Read More

6 months ago

Tech Threads

Unlocking the Power of Patterns in Event Stream Processing (ESP): The Critical Role of Apache Flink’s FlinkCEP Library

We call this an event when a button is pressed, a sensor detects a temperature… Read More

6 months ago

Tech Threads

Real-Time Redefined: Apache Flink and Apache Paimon Influence Data Streaming’s Future

Apache Paimon is made to function well with constantly flowing data, which is typical of… Read More

7 months ago

Resolve permission issue among DataNodes with NameNode to establish Secure Shell /SSH without passphrase

$ ssh-keygen -t rsa -P ” -f ~/.ssh/id_rsa $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys $ chmod 0600 ~/.ssh/authorized_keys

sudo service network-manager restart sudo service ssh restart

Related Post

Recent Posts