Prepare a Linux machine with at least 4 GB of RAM and at least 40 GB of disk. I am using CentOS 7.7 64-bit (note: a 64-bit OS is required). The VM needs network access, the firewall and SELinux must be disabled, and JDK 8 must be installed.
Given these requirements, simply clone node1 into a new machine named node4, dedicated to compiling Hadoop.
Any Maven 3.x release should work, but avoid very recent versions; Maven 3.0.5 is strongly recommended.
Upload the Maven tarball to /export/software.
Then extract the Maven tarball to /export/server:
cd /export/software/
tar -zxvf apache-maven-3.0.5-bin.tar.gz -C ../server/
Configure the Maven environment variables:
vim /etc/profile
Add the following content:
export MAVEN_HOME=/export/server/apache-maven-3.0.5
export MAVEN_OPTS="-Xms4096m -Xmx4096m"
export PATH=$MAVEN_HOME/bin:$PATH
Apply the change immediately:
source /etc/profile
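As a quick optional check (not part of the original steps), confirm Maven is on the PATH and picked up JDK 8:
mvn -version
This should report Apache Maven 3.0.5 and the JDK 8 home.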
Extract the pre-seeded Maven local repository:
tar -zxvf mvnrepository.tar.gz -C /export/server/
Modify the Maven configuration file:
cd /export/server/apache-maven-3.0.5/conf
vim settings.xml
Specify the path of our local repository:
<localRepository>/export/server/mavenrepo</localRepository>
Add an Aliyun mirror (inside the <mirrors> element) so jar downloads are faster:
<mirror>
    <id>alimaven</id>
    <name>aliyun maven</name>
    <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
    <mirrorOf>central</mirrorOf>
</mirror>
Extract findbugs:
cd /export/software
tar -zxvf findbugs-1.3.9.tar.gz -C ../server/
Configure the findbugs environment variables:
vim /etc/profile
Add the findbugs entries (the Maven lines added earlier are shown again for context):
export MAVEN_HOME=/export/server/apache-maven-3.0.5
export PATH=$MAVEN_HOME/bin:$PATH
export FINDBUGS_HOME=/export/server/findbugs-1.3.9
export PATH=$FINDBUGS_HOME/bin:$PATH
Apply the change immediately:
source /etc/profile
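As an optional check that findbugs resolves from the PATH (the -version flag comes from the standard FindBugs command-line interface; not part of the original steps):
findbugs -version
This should print 1.3.9.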
Install the build dependencies online:
yum -y install autoconf automake libtool cmake
yum -y install ncurses-devel
yum -y install openssl-devel
yum -y install lzo-devel zlib-devel gcc gcc-c++
yum -y install bzip2-devel
Extract protobuf and compile it:
cd /export/software
tar -zxvf protobuf-2.5.0.tar.gz -C ../server/
cd /export/server/protobuf-2.5.0
./configure
make && make install
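Hadoop 2.7.5 requires protoc 2.5.0 exactly, so it is worth verifying the install (an optional check, not in the original steps):
protoc --version
Expected output: libprotoc 2.5.0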
Extract snappy and compile it:
cd /export/software/
tar -zxvf snappy-1.1.1.tar.gz -C ../server/
cd ../server/snappy-1.1.1/
./configure
make && make install
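With the default prefix, snappy installs under /usr/local/lib (an assumption if you changed the configure options); you can confirm with:
ls /usr/local/lib | grep snappy
which should list libsnappy.so among others.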
Compile the source code:
cd /export/software
tar -zxvf hadoop-2.7.5-src.tar.gz -C ../server/
cd /export/server/hadoop-2.7.5
Compile with snappy compression support (-Pdist,native builds the distribution with the native libraries; -e -X turns on verbose error output):
mvn package -DskipTests -Pdist,native -Dtar -Drequire.snappy -e -X
After the build completes, the tarball we need is in the path below; the generated file is named hadoop-2.7.5.tar.gz:
cd /export/server/hadoop-2.7.5/hadoop-dist/target
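To confirm snappy support actually made it into the native libraries (an optional check, run from the unpacked dist directory that sits next to the tarball):
cd /export/server/hadoop-2.7.5/hadoop-dist/target/hadoop-2.7.5
bin/hadoop checknative -a
The snappy line in the output should report true.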
Copy the compiled Hadoop package off the build machine; compilation is done.
We will use a fully distributed deployment with high availability for both the NameNode and the ResourceManager.
Cluster service layout: node1 and node2 run the NameNodes, node2 and node3 run the ResourceManagers (rm1/rm2), node3 runs the JobHistory server, and all three nodes run a DataNode, NodeManager, JournalNode, and ZooKeeper.
Stop all services of the previous Hadoop cluster, then re-extract the newly compiled Hadoop tarball.
Extract the tarball.
Run the following commands on node1 to extract it:
mkdir -p /opt/software
mkdir -p /opt/server
cd /opt/software
tar -zxvf hadoop-2.7.5.tar.gz -C /opt/server/
cd /opt/server/hadoop-2.7.5/etc/hadoop
All of the following configuration is done on node1.
Edit core-site.xml (vim core-site.xml):
<configuration>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>node1:2181,node2:2181,node3:2181</value>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://ns</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/server/hadoop-2.7.5/data/tmp</value>
    </property>
    <property>
        <name>fs.trash.interval</name>
        <value>10080</value>
    </property>
</configuration>
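Next, edit hdfs-site.xml (vim hdfs-site.xml). Below is a minimal NameNode-HA sketch consistent with this cluster layout — nameservice ns from fs.defaultFS, NameNodes nn1/nn2 on node1/node2, JournalNodes on all three nodes, and the data directories created later in this guide; the ports, the journalnode edits directory, and the fencing settings are assumptions, not an authoritative listing:
<configuration>
    <!-- Sketch only: values are inferred from the rest of this guide -->
    <property>
        <name>dfs.nameservices</name>
        <value>ns</value>
    </property>
    <property>
        <name>dfs.ha.namenodes.ns</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.ns.nn1</name>
        <value>node1:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.ns.nn2</name>
        <value>node2:8020</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.ns.nn1</name>
        <value>node1:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.ns.nn2</name>
        <value>node2:50070</value>
    </property>
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://node1:8485;node2:8485;node3:8485/ns</value>
    </property>
    <property>
        <!-- Assumed location; not given in the original guide -->
        <name>dfs.journalnode.edits.dir</name>
        <value>/opt/server/hadoop-2.7.5/data/dfs/journalnode</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///opt/server/hadoop-2.7.5/data/dfs/nn/name</value>
    </property>
    <property>
        <name>dfs.namenode.edits.dir</name>
        <value>file:///opt/server/hadoop-2.7.5/data/dfs/nn/edits</value>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.ns</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <property>
        <!-- Assumed fencing setup; adjust the key path to your environment -->
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>
</configuration>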
Edit yarn-site.xml (vim yarn-site.xml):
<configuration>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>mycluster</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>node2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>node3</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address.rm1</name>
        <value>node2:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address.rm1</name>
        <value>node2:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
        <value>node2:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address.rm1</name>
        <value>node2:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.rm1</name>
        <value>node2:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address.rm2</name>
        <value>node3:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address.rm2</name>
        <value>node3:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address.rm2</name>
        <value>node3:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address.rm2</name>
        <value>node3:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.rm2</name>
        <value>node3:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.id</name>
        <value>rm1</value>
        <description>If we want to launch more than one RM in single node, we need this configuration</description>
    </property>
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>node2:2181,node3:2181,node1:2181</value>
        <description>For multiple zk services, separate them with comma</description>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
        <value>true</value>
        <description>Enable automatic failover; By default, it is enabled only when HA is enabled.</description>
    </property>
    <property>
        <name>yarn.client.failover-proxy-provider</name>
        <value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>2</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>2048</value>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>1024</value>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>2048</value>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>2592000</value>
    </property>
    <property>
        <name>yarn.nodemanager.log.retain-seconds</name>
        <value>604800</value>
    </property>
    <property>
        <name>yarn.nodemanager.log-aggregation.compression-type</name>
        <value>gz</value>
    </property>
    <property>
        <name>yarn.nodemanager.local-dirs</name>
        <value>/opt/server/hadoop-2.7.5/yarn/local</value>
    </property>
    <property>
        <name>yarn.resourcemanager.max-completed-applications</name>
        <value>1000</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.connect.retry-interval.ms</name>
        <value>2000</value>
    </property>
</configuration>
Edit mapred-site.xml (vim mapred-site.xml); the JobHistory server runs on node3:
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>node3:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>node3:19888</value>
    </property>
    <property>
        <name>mapreduce.jobtracker.system.dir</name>
        <value>/opt/server/hadoop-2.7.5/data/system/jobtracker</value>
    </property>
    <property>
        <name>mapreduce.map.memory.mb</name>
        <value>1024</value>
    </property>
    <property>
        <name>mapreduce.reduce.memory.mb</name>
        <value>1024</value>
    </property>
    <property>
        <name>mapreduce.task.io.sort.mb</name>
        <value>100</value>
    </property>
    <property>
        <name>mapreduce.task.io.sort.factor</name>
        <value>10</value>
    </property>
    <property>
        <name>mapreduce.reduce.shuffle.parallelcopies</name>
        <value>15</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.command-opts</name>
        <value>-Xmx1024m</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.resource.mb</name>
        <value>1536</value>
    </property>
    <property>
        <name>mapreduce.cluster.local.dir</name>
        <value>/opt/server/hadoop-2.7.5/data/system/local</value>
    </property>
</configuration>
Edit the slaves file (vim slaves):
node1
node2
node3
Edit hadoop-env.sh (vim hadoop-env.sh) and set JAVA_HOME:
export JAVA_HOME=/export/server/jdk1.8.0_241
Send the installation directory from the first machine to the other machines.
Run the following commands on the first machine:
cd /opt/server
scp -r hadoop-2.7.5/ node2:$PWD
scp -r hadoop-2.7.5/ node3:$PWD
Create the data directories on all three machines.
Run the following commands on all three machines:
mkdir -p /opt/server/hadoop-2.7.5/data/dfs/nn/name
mkdir -p /opt/server/hadoop-2.7.5/data/dfs/nn/edits
Change the ResourceManager id to rm2 on node3.
Run the following commands on node3:
cd /opt/server/hadoop-2.7.5/etc/hadoop
vim yarn-site.xml
<property>
    <name>yarn.resourcemanager.ha.id</name>
    <value>rm2</value>
    <description>If we want to launch more than one RM in single node, we need this configuration</description>
</property>
Make sure the ZooKeeper cluster on node1, node2, and node3 is running, then run the following commands on node1:
cd /opt/server/hadoop-2.7.5
bin/hdfs zkfc -formatZK
sbin/hadoop-daemons.sh start journalnode
bin/hdfs namenode -format
bin/hdfs namenode -initializeSharedEdits -force
sbin/start-dfs.sh
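If everything started, a quick jps on node1 should show the HDFS daemons (the expected list below is an assumption based on the layout above, not output from the original guide):
jps
Expected on node1: NameNode, DataNode, JournalNode, DFSZKFailoverController, plus ZooKeeper's QuorumPeerMain.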
Run on node2:
cd /opt/server/hadoop-2.7.5
bin/hdfs namenode -bootstrapStandby
sbin/hadoop-daemon.sh start namenode
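To check which NameNode is currently active (nn1 and nn2 follow the hdfs-site.xml sketch above; substitute your own ids if they differ):
bin/hdfs haadmin -getServiceState nn1
bin/hdfs haadmin -getServiceState nn2
One should report active and the other standby.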
Start YARN. Run on node2:
cd /opt/server/hadoop-2.7.5
sbin/start-yarn.sh
Then run on node3 (this starts the second ResourceManager):
cd /opt/server/hadoop-2.7.5
sbin/start-yarn.sh
Check the ResourceManager states. Run on node2:
cd /opt/server/hadoop-2.7.5
bin/yarn rmadmin -getServiceState rm1
Run on node3:
cd /opt/server/hadoop-2.7.5
bin/yarn rmadmin -getServiceState rm2
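One of the two commands should print active and the other standby (which is which depends on startup order), for example:
active
standby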
Run the following commands on node3 to start the JobHistory server:
cd /opt/server/hadoop-2.7.5
sbin/mr-jobhistory-daemon.sh start historyserver
Check HDFS status for node1's NameNode:
http://192.168.88.161:50070/dfshealth.html#tab-overview
Check HDFS status for node2's NameNode:
http://192.168.88.162:50070/dfshealth.html#tab-overview
Check the YARN cluster:
http://192.168.88.163:8088/cluster
View the JobHistory page:
http://192.168.88.163:19888/jobhistory
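Finally, as an optional end-to-end smoke test (not part of the original steps), submit the bundled pi example from node1:
cd /opt/server/hadoop-2.7.5
bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.5.jar pi 2 10
If the job completes and appears in the ResourceManager UI and the JobHistory page, the HA cluster is working.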