Setting up SSH for a Hadoop cluster


When setting up a Hadoop cluster, you'll need to designate one specific node as the master node. This server will typically host the NameNode and JobTracker daemons. It'll also serve as the base station, contacting and activating the DataNode and TaskTracker daemons on all of the slave nodes.

 

Hadoop uses passphraseless SSH for this purpose. SSH utilizes standard public key cryptography to create a pair of keys for user verification: one public, one private. The private key stays on the master node and is never transmitted; the public key is copied to every node in the cluster. When the master attempts to access a remote machine, it uses the private key to answer a cryptographic challenge, and the target machine validates the response against the stored public key before accepting the login.

 

1. Define a common account

This access is from a user account on one node to another user account on the target machine. For Hadoop, the accounts should have the same username on all of the nodes (we use hadoop-user in this book), and for security purposes we recommend it be a user-level (non-root) account. This account is used only for managing your Hadoop cluster. Once the cluster daemons are up and running, you'll be able to run your actual MapReduce jobs from other accounts.

 

2. Verify SSH installation

$ which ssh

$ which sshd

$ which ssh-keygen

 

If any of these are missing, install OpenSSH.
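The three `which` checks above can be rolled into one small script that reports every missing binary at once. A minimal sketch:

```shell
# Check that the SSH client, server, and key tool are all on the PATH.
# Note: sshd often lives in /usr/sbin, which may not be on a regular
# user's PATH; extend PATH (or run as root) before trusting a MISSING.
for prog in ssh sshd ssh-keygen; do
    if command -v "$prog" >/dev/null 2>&1; then
        echo "found: $prog"
    else
        echo "MISSING: $prog"
    fi
done
```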

 

3. Generate SSH key pair

Having verified that SSH is correctly installed on all nodes of the cluster, we use ssh-keygen on the master node to generate an RSA key pair. Be certain to

avoid entering a passphrase, or you'll have to manually enter that phrase every time the master node attempts to access another node.

 

$ ssh-keygen -t rsa
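If you're scripting cluster setup, the same key generation can be done non-interactively with ssh-keygen's `-P` and `-f` options:

```shell
# Non-interactive variant of the command above: -P '' sets an empty
# passphrase and -f names the output file, so no prompts appear.
# Caution: ssh-keygen will ask before overwriting an existing key file.
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
# The private key lands in ~/.ssh/id_rsa, the public key in ~/.ssh/id_rsa.pub.
```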

 

4. Distribute public key and validate logins

Albeit a bit tedious, you'll next need to copy the public key to every slave node as well as the master node:

[hadoop-user@master]$ scp ~/.ssh/id_rsa.pub hadoop-user@target:~/master_key

 

Manually log in to the target node and set the master key as an authorized key (or append to the list of authorized keys if you have others defined).

[hadoop-user@target]$ mkdir ~/.ssh
[hadoop-user@target]$ chmod 700 ~/.ssh

[hadoop-user@target]$ mv ~/master_key ~/.ssh/authorized_keys

[hadoop-user@target]$ chmod 600 ~/.ssh/authorized_keys
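The permission bits in the sequence above matter: with its default StrictModes setting, sshd silently ignores an authorized_keys file that is group- or world-accessible. A sketch that replays the same steps in a scratch directory (the paths and key content are placeholders; on a real slave they would be ~/.ssh and your actual public key):

```shell
# Replay of step 4 in a throwaway directory to show the required modes.
dir=$(mktemp -d)
mkdir "$dir/.ssh"
chmod 700 "$dir/.ssh"                      # directory must be owner-only
echo "ssh-rsa PLACEHOLDER hadoop-user@master" > "$dir/master_key"
mv "$dir/master_key" "$dir/.ssh/authorized_keys"
chmod 600 "$dir/.ssh/authorized_keys"      # file must be owner read/write only
stat -c '%a' "$dir/.ssh/authorized_keys"   # GNU stat; prints 600
```

Where it's available, `ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop-user@target` performs the copy, append, and permission setup in one step.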

 

After distributing the public key, you can verify that it's correctly set up by attempting to log in to the target node from the master:

[hadoop-user@master]$ ssh target

The authenticity of host 'target (xxx.xxx.xxx.xxx)' can’t be established.
RSA key fingerprint is 72:31:d8:1b:11:36:43:52:56:11:77:a4:ec:82:03:1d.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'target' (RSA) to the list of known hosts.
Last login: Sun Jan 4 15:32:22 2009 from master

 

After confirming the authenticity of a target node to the master node, you won’t be prompted upon subsequent login attempts.

[hadoop-user@master]$ ssh target
Last login: Sun Jan 4 15:32:49 2009 from master
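If you want this check to fail loudly rather than quietly falling back to a password prompt, OpenSSH's BatchMode option disables interactive password authentication for that one invocation. A sketch (`target` is a placeholder hostname):

```shell
# BatchMode=yes makes ssh fail immediately instead of prompting for a
# password, so this is a reliable scripted test of key-based login.
if ssh -o BatchMode=yes hadoop-user@target true; then
    echo "passwordless login to target works"
else
    echo "key-based login to target is NOT set up"
fi
```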

 

We’ve now set the groundwork for running Hadoop on your own cluster.

 

 
