When setting up a Hadoop cluster, you'll need to designate one specific node as the master node. This server will typically host the NameNode and JobTracker
daemons. It'll also serve as the base station contacting and activating the DataNode and TaskTracker daemons on all of the slave nodes.
Hadoop uses passphraseless SSH for this purpose. SSH employs standard public key cryptography to create a pair of keys for user verification: one public,
one private. The public key is stored on every node in the cluster, and when the master node attempts to access a remote machine it uses its private key to prove its identity. The target machine checks that proof against the stored public key to validate the login attempt; the private key itself is never transmitted.
1. Define a common account
This access is from a user account on one node to another user account on the target machine. For Hadoop, the accounts should have the same username on
all of the nodes (we use hadoop-user in this book), and for security purposes we recommend making it a user-level account. This account is only for managing your
Hadoop cluster. Once the cluster daemons are up and running, you'll be able to run your actual MapReduce jobs from other accounts.
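If the account doesn't already exist, you can create it on each node with the standard user-management tools. The sketch below assumes a Linux system where useradd and passwd are available and that you have sudo privileges; adapt it to your own environment:
$ sudo useradd -m hadoop-user    # create the account with a home directory
$ sudo passwd hadoop-user        # set an initial password for the account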
2. Verify SSH installation
Check that the ssh client, the sshd server daemon, and the ssh-keygen utility are available on every node:
$ which ssh
$ which sshd
$ which ssh-keygen
If any of these are missing, install OpenSSH before continuing.
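OpenSSH can usually be installed through your distribution's package manager; for example (exact package names vary by distribution):
$ sudo apt-get install openssh-client openssh-server    # Debian/Ubuntu
$ sudo yum install openssh-clients openssh-server       # RHEL/CentOS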
3. Generate SSH key pair
Having verified that SSH is correctly installed on all nodes of the cluster, we use ssh-keygen on the master node to generate an RSA key pair. Be certain to
avoid entering a passphrase, or you'll have to manually enter that phrase every time the master node attempts to access another node.
$ ssh-keygen -t rsa
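The command prompts for a key file location and a passphrase; accept the default location and leave the passphrase empty. If you'd rather skip the prompts, the same result can be obtained non-interactively by passing an empty passphrase and the key file explicitly:
$ ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa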
4. Distribute public key and validate logins
Albeit a bit tedious, you'll next need to copy the public key to every slave node as well as the master node:
[hadoop-user@master]$ scp ~/.ssh/id_rsa.pub hadoop-user@target:~/master_key
Manually log in to the target node and set the master key as an authorized key (or append it to the list of authorized keys if you have others defined).
[hadoop-user@target]$ mkdir -p ~/.ssh
[hadoop-user@target]$ chmod 700 ~/.ssh
[hadoop-user@target]$ mv ~/master_key ~/.ssh/authorized_keys
[hadoop-user@target]$ chmod 600 ~/.ssh/authorized_keys
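With more than a couple of slave nodes, repeating these steps by hand gets tedious. Where the ssh-copy-id helper is available, a short loop from the master can handle the distribution instead; the hostnames slave1 and slave2 below are placeholders for your own node names, and password authentication must still be enabled for this initial copy:
[hadoop-user@master]$ for target in slave1 slave2; do ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop-user@$target; done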
Having distributed the public key, you can verify that it's correctly set up by attempting to log in to the target node from the master:
[hadoop-user@master]$ ssh target
The authenticity of host 'target (xxx.xxx.xxx.xxx)' can’t be established.
RSA key fingerprint is 72:31:d8:1b:11:36:43:52:56:11:77:a4:ec:82:03:1d.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'target' (RSA) to the list of known hosts.
Last login: Sun Jan 4 15:32:22 2009 from master
After confirming the authenticity of a target node to the master node, you won’t be prompted upon subsequent login attempts.
[hadoop-user@master]$ ssh target
Last login: Sun Jan 4 15:32:49 2009 from master
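To confirm passphraseless access across the whole cluster in one pass, a quick loop from the master does the job (the hostnames below are placeholders for your own nodes); each node should print its hostname without prompting for a password:
[hadoop-user@master]$ for target in master slave1 slave2; do ssh $target hostname; done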
We’ve now set the groundwork for running Hadoop on your own cluster.