ZooKeeper cluster:
192.168.182.12 (bigdata12)
192.168.182.13 (bigdata13)
192.168.182.14 (bigdata14)
Hadoop cluster:
192.168.182.12 (bigdata12): NameNode1 (active master), ResourceManager1 (active master), JournalNode
192.168.182.13 (bigdata13): NameNode2 (standby master), ResourceManager2 (standby master), JournalNode
192.168.182.14 (bigdata14): DataNode1, NodeManager1
192.168.182.15 (bigdata15): DataNode2, NodeManager2
II. Preparation
1. Install the JDK (required on every machine)
I use the jdk-8u144-linux-x64.tar.gz package here.
Extract the JDK:
tar -zxvf jdk-8u144-linux-x64.tar.gz -C ~/training
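2. Configure environment variables
1) Add the Java environment variables to ~/.bash_profile:
vi ~/.bash_profile
export JAVA_HOME=/root/training/jdk1.8.0_144
export PATH=$JAVA_HOME/bin:$PATH
2) Apply the environment variables:
source ~/.bash_profile
3) Verify the installation:
java -version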
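Because this step is repeated on every machine, the export lines can also be appended from a script instead of typed in vi. The following is a minimal sketch; append_once is a hypothetical helper name introduced here, and it only adds a line when an identical line is not already in the file, so running it twice is harmless.

```shell
# append_once FILE LINE: add LINE to FILE unless an identical line is already there.
# (append_once is a hypothetical helper, not part of the JDK or Hadoop.)
append_once() {
    file=$1
    line=$2
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: treat LINE as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage, with the paths from the step above:
#   append_once ~/.bash_profile 'export JAVA_HOME=/root/training/jdk1.8.0_144'
#   append_once ~/.bash_profile 'export PATH=$JAVA_HOME/bin:$PATH'
```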
3. Map IP addresses to hostnames. This lets SSH and ping reach each node by name.
vi /etc/hosts
Enter:
192.168.182.12 bigdata12
192.168.182.13 bigdata13
192.168.182.14 bigdata14
192.168.182.15 bigdata15
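A forgotten hosts entry only shows up later as a failing ssh or scp, so it may be worth checking the file right away. The sketch below assumes the usual "IP name [alias ...]" layout of /etc/hosts; missing_hosts is a hypothetical helper name that prints every hostname without an entry.

```shell
# missing_hosts HOSTS_FILE NAME...: print each NAME that has no entry in HOSTS_FILE.
# (missing_hosts is a hypothetical helper name.)
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Skip comment lines; look for NAME in any column after the IP address.
        awk -v n="$name" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit(found ? 0 : 1) }
        ' "$hosts_file" || printf '%s\n' "$name"
    done
}

# Usage: no output means every node of the cluster resolves.
#   missing_hosts /etc/hosts bigdata12 bigdata13 bigdata14 bigdata15
```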
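4. Configure passwordless SSH login
1) Generate a public/private key pair on every machine:
ssh-keygen -t rsa
This generates an RSA key pair (a public key and a private key). RSA is an asymmetric algorithm, and SSH uses the pair for authentication.
2) On every machine, copy its own public key to the other machines.
Note: all four of the following commands must be run on every machine.
ssh-copy-id -i .ssh/id_rsa.pub root@bigdata12
ssh-copy-id -i .ssh/id_rsa.pub root@bigdata13
ssh-copy-id -i .ssh/id_rsa.pub root@bigdata14
ssh-copy-id -i .ssh/id_rsa.pub root@bigdata15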
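The four commands differ only in the target host, so they can be written as a loop. This is a sketch rather than a required step; copy_key_to_all and the HOSTS variable are names made up for this example. By default it only prints the commands (dry run), which makes it easy to review before anything is executed.

```shell
# Hosts from the cluster plan at the top of this article.
HOSTS="bigdata12 bigdata13 bigdata14 bigdata15"

# copy_key_to_all [RUNNER]: issue one ssh-copy-id per host.
# RUNNER defaults to "echo", so the commands are printed, not executed.
# Pass an empty string as RUNNER to really run them.
copy_key_to_all() {
    runner=${1-echo}
    for host in $HOSTS; do
        $runner ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "root@$host"
    done
}

# Usage:
#   copy_key_to_all        # dry run: print the four commands
#   copy_key_to_all ""     # execute them (asks for each root password once)
```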
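III. Install the ZooKeeper cluster (install on bigdata12)
Install and configure ZooKeeper on the master node (bigdata12).
I use the zookeeper-3.4.10.tar.gz package here.
1. Extract ZooKeeper:
tar -zxvf zookeeper-3.4.10.tar.gz -C ~/training
2. Configure and apply the environment variables (add the two export lines to ~/.bash_profile, then source it):
export ZOOKEEPER_HOME=/root/training/zookeeper-3.4.10
export PATH=$ZOOKEEPER_HOME/bin:$PATH
source ~/.bash_profile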
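3. Edit the zoo.cfg configuration file (if conf/zoo.cfg does not exist yet, copy conf/zoo_sample.cfg to conf/zoo.cfg first):
vi /root/training/zookeeper-3.4.10/conf/zoo.cfg
Change:
dataDir=/root/training/zookeeper-3.4.10/tmp
Append at the end of the file:
server.1=bigdata12:2888:3888
server.2=bigdata13:2888:3888
server.3=bigdata14:2888:3888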
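Each server.N line names one ensemble member; N must match that node's myid, 2888 is the port followers use to talk to the leader, and 3888 is the leader-election port. The lines can be generated from the host list, as in this sketch (zk_server_lines is a hypothetical helper name):

```shell
# zk_server_lines HOST...: print one "server.N=HOST:2888:3888" line per host,
# numbering from 1 in the order the hosts are given.
# (zk_server_lines is a hypothetical helper name.)
zk_server_lines() {
    id=1
    for host in "$@"; do
        printf 'server.%d=%s:2888:3888\n' "$id" "$host"
        id=$((id + 1))
    done
}

# Usage: append the ensemble definition to zoo.cfg.
#   zk_server_lines bigdata12 bigdata13 bigdata14 >> /root/training/zookeeper-3.4.10/conf/zoo.cfg
```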
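4. Create the myid file
Create the /root/training/zookeeper-3.4.10/tmp directory and write this node's ID (1 for bigdata12) into a file named myid inside it:
mkdir -p /root/training/zookeeper-3.4.10/tmp
echo 1 > /root/training/zookeeper-3.4.10/tmp/myid
5. Copy the configured ZooKeeper to the other nodes, then change the myid file on each of them
scp -r /root/training/zookeeper-3.4.10/ bigdata13:/root/training
scp -r /root/training/zookeeper-3.4.10/ bigdata14:/root/training
On bigdata13 and bigdata14, open the myid file and change the 1 to 2 and 3 respectively:
vi /root/training/zookeeper-3.4.10/tmp/myid
On bigdata13 enter: 2
On bigdata14 enter: 3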
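Since the myid value must equal the N of the node's own server.N line, it can be derived from zoo.cfg instead of edited by hand. The sketch below assumes the hostname passed in is spelled exactly as in zoo.cfg; myid_for is a hypothetical helper name.

```shell
# myid_for HOST ZOO_CFG: print the N of the "server.N=HOST:..." line in ZOO_CFG.
# (myid_for is a hypothetical helper name.)
myid_for() {
    awk -F= -v h="$1" '
        /^server\./ {
            split($2, addr, ":")          # addr[1] is the hostname
            if (addr[1] == h) {
                sub(/^server\./, "", $1)  # keep only the numeric ID
                print $1
            }
        }
    ' "$2"
}

# Usage on each ZooKeeper node:
#   myid_for "$(hostname)" /root/training/zookeeper-3.4.10/conf/zoo.cfg \
#       > /root/training/zookeeper-3.4.10/tmp/myid
```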
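IV. Install the Hadoop cluster (install on bigdata12)
1. Edit hadoop-env.sh
export JAVA_HOME=/root/training/jdk1.8.0_144
2. Edit core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://ns1</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/root/training/hadoop-2.7.3/tmp</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>bigdata12:2181,bigdata13:2181,bigdata14:2181</value>
</property>
</configuration>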
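3. Edit hdfs-site.xml (this defines the nameservice and which NameNodes belong to it)
<configuration>
<property>
<name>dfs.nameservices</name>
<value>ns1</value>
</property>
<property>
<name>dfs.ha.namenodes.ns1</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn1</name>
<value>bigdata12:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1.nn1</name>
<value>bigdata12:50070</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn2</name>
<value>bigdata13:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1.nn2</name>
<value>bigdata13:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://bigdata12:8485;bigdata13:8485/ns1</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/root/training/hadoop-2.7.3/journal</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.ns1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>
sshfence
shell(/bin/true)
</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
</configuration>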
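The value of dfs.namenode.shared.edits.dir has the form qjournal://host1:port;host2:port/journalId, with the JournalNode addresses separated by semicolons and the journal ID (here ns1) after the slash. A stray separator is easy to miss, so the following sketch splits the URI into its JournalNode addresses for a quick visual check; qjournal_hosts is a hypothetical helper name.

```shell
# qjournal_hosts URI: print one JournalNode "host:port" per line.
# (qjournal_hosts is a hypothetical helper name.)
qjournal_hosts() {
    uri=$1
    rest=${uri#qjournal://}    # drop the scheme
    hosts=${rest%%/*}          # drop "/journalId"
    # Split on ";" and discard empty entries left by stray separators.
    printf '%s\n' "$hosts" | tr ';' '\n' | sed '/^$/d'
}

# Usage:
#   qjournal_hosts 'qjournal://bigdata12:8485;bigdata13:8485/ns1'
```

The listed hosts should be exactly the nodes on which a JournalNode is started. With the two JournalNodes used in this article, a majority means both of them, so edits can only be written while both are up.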
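4. Edit mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Configure HA for YARN
5. Edit yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>yrc</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>bigdata12</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>bigdata13</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>bigdata12:2181,bigdata13:2181,bigdata14:2181</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>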
6. Edit slaves (the list of worker nodes)
bigdata14
bigdata15
7. Copy the configured Hadoop to the other nodes
scp -r /root/training/hadoop-2.7.3/ root@bigdata13:/root/training/
scp -r /root/training/hadoop-2.7.3/ root@bigdata14:/root/training/
scp -r /root/training/hadoop-2.7.3/ root@bigdata15:/root/training/
V. Start the ZooKeeper cluster
On each ZooKeeper machine (bigdata12, bigdata13, bigdata14), run:
zkServer.sh start
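Once all three servers are up, zkServer.sh status reports each node's role on a line starting with "Mode:". A healthy ensemble shows one leader and two followers. The sketch below extracts that role from the status output; zk_mode is a hypothetical helper name.

```shell
# zk_mode: read `zkServer.sh status` output on stdin and print the role
# that follows "Mode: " (for example leader or follower).
# (zk_mode is a hypothetical helper name.)
zk_mode() {
    sed -n 's/^Mode: *//p'
}

# Usage on each ZooKeeper node:
#   zkServer.sh status 2>&1 | zk_mode
```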
VI. Start the JournalNodes
Start the JournalNode daemon on both bigdata12 and bigdata13:
hadoop-daemon.sh start journalnode
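VII. Format HDFS and ZooKeeper (run on bigdata12)
Format HDFS:
hdfs namenode -format
Copy the contents of /root/training/hadoop-2.7.3/tmp to /root/training/hadoop-2.7.3/tmp on bigdata13, so that the standby NameNode starts from the same metadata (run this from /root/training/hadoop-2.7.3/tmp):
scp -r dfs/ root@bigdata13:/root/training/hadoop-2.7.3/tmp
Initialize the HA state in ZooKeeper:
hdfs zkfc -formatZK
Log: INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/ns1 in ZK.
This log line shows that the znode /hadoop-ha/ns1 was created in ZooKeeper's namespace; it holds the NameNode HA state.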
VIII. Start the Hadoop cluster (run on bigdata12)
The command to start the Hadoop cluster:
start-all.sh
Log:
Starting namenodes on [bigdata12 bigdata13]
bigdata12: starting namenode, logging to /root/training/hadoop-2.4.1/logs/hadoop-root-namenode-hadoop113.out
bigdata13: starting namenode, logging to /root/training/hadoop-2.4.1/logs/hadoop-root-namenode-hadoop112.out
bigdata14: starting datanode, logging to /root/training/hadoop-2.4.1/logs/hadoop-root-datanode-hadoop115.out
bigdata15: starting datanode, logging to /root/training/hadoop-2.4.1/logs/hadoop-root-datanode-hadoop114.out
bigdata13: starting zkfc, logging to /root/training/hadoop-2.7.3/logs/hadoop-root-zkfc-bigdata13.out
bigdata12: starting zkfc, logging to /root/training/hadoop-2.7.3/logs/hadoop-root-zkfc-bigdata12.out
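The startup log only says that the daemons were launched, not that they stayed up. Running jps on each node lists the Java processes that are actually alive. The sketch below compares that list with the daemons expected on a node according to the cluster plan; missing_daemons is a hypothetical helper name, and the process names are the class names jps prints for these daemons.

```shell
# missing_daemons "NAME NAME ...": read `jps` output ("PID ClassName" per line)
# on stdin and print each expected NAME that is not running.
# (missing_daemons is a hypothetical helper name.)
missing_daemons() {
    running=$(awk '{ print $2 }')
    for name in $1; do
        printf '%s\n' "$running" | grep -qxF -- "$name" || printf '%s\n' "$name"
    done
}

# Usage; no output means everything expected is running.
# On bigdata12:
#   jps | missing_daemons "QuorumPeerMain JournalNode NameNode DFSZKFailoverController ResourceManager"
# On bigdata14:
#   jps | missing_daemons "QuorumPeerMain DataNode NodeManager"
```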
On bigdata13, manually start a ResourceManager to act as the standby master for YARN:
yarn-daemon.sh start resourcemanager
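To confirm that HA is working, the state of each master can be queried with hdfs haadmin -getServiceState and yarn rmadmin -getServiceState, using the IDs configured earlier (nn1/nn2 in hdfs-site.xml, rm1/rm2 in yarn-site.xml). One of each pair should report active and the other standby. A small sketch that prints all four (ha_states is a hypothetical helper name):

```shell
# ha_states: print "id state" for both NameNodes and both ResourceManagers.
# (ha_states is a hypothetical helper name; the IDs come from the
# hdfs-site.xml and yarn-site.xml settings in this article.)
ha_states() {
    for nn_id in nn1 nn2; do
        printf '%s %s\n' "$nn_id" "$(hdfs haadmin -getServiceState "$nn_id")"
    done
    for rm_id in rm1 rm2; do
        printf '%s %s\n' "$rm_id" "$(yarn rmadmin -getServiceState "$rm_id")"
    done
}

# Usage on any node with the Hadoop configuration:
#   ha_states
```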
At this point, the HA architecture of the Hadoop cluster is up and running.
Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. http://www.cnblogs.com/lijinze-tsinghua/