[Big Data Engineer] [Hadoop] Installing Hadoop on CentOS 7

Author: 炼狱腾蛇Eric | Published 2019-07-12 15:21

1. Official website

https://hadoop.apache.org/

2. Download

https://hadoop.apache.org/releases.html

3. Environment

  • OS: CentOS 7.6
  • Java: OpenJDK 1.8 (installed as a full JDK below)

4. Configure the system environment

  • Edit /etc/hosts and add the lines below (if the hosts are already registered in DNS with A records, this step can be skipped)
192.168.56.201 node1
192.168.56.202 node2
192.168.56.203 node3
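    The same three entries must resolve on every node. One way to sync the file (an assumption; any copy method works) is to push it from node1, for example once the passwordless SSH from the next step is in place:
$ scp /etc/hosts node2:/etc/hosts
$ scp /etc/hosts node3:/etc/hosts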
  • Set up passwordless login between all machines; generate a public/private key pair with ssh-keygen
[root@node1 etc]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:vSs9kZ7yxb16okc8e40afXJkICUQRI4Ig0xP3sh2F7c root@node1
The key's randomart image is:
+---[RSA 2048]----+
| o..+   .o*o. .  |
|  o= = . = . o   |
|    * + o E . .  |
|   . . . .   . . |
|        S .o    o|
|          oo+o o |
|         o.++o+oo|
|        o =o+.+=.|
|         +++o*.  |
+----[SHA256]-----+

  • Copy the public key to the other machines with ssh-copy-id
[root@node1 etc]# ssh-copy-id node2
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'node2 (192.168.56.202)' can't be established.
ECDSA key fingerprint is SHA256:S+fUIkc5tzfLcZ8OKjyo5Gj89fYbiM9Q/r+K6k9LZkQ.
ECDSA key fingerprint is MD5:52:96:7a:52:d1:e7:3b:99:d9:b7:a0:2a:87:71:78:f3.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@node2's password:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'node2'"
and check to make sure that only the key(s) you wanted were added.
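    The same copy is needed for node3, and for node1 itself, because the start scripts also SSH into every host they manage. A quick check that key-based login works (it should print the remote hostname without prompting for a password):
$ ssh-copy-id node3
$ ssh-copy-id node1
$ ssh node2 hostname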
  • Install Java; with a yum repository configured, a plain yum install is enough
$ yum -y install java java-devel
  • Configure JAVA_HOME in /etc/profile.d/java.sh
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
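    The exact OpenJDK directory depends on which update yum installed, so the path above may differ on your system. One way to find it (assuming java was installed through alternatives, as the yum package does) is to resolve the java binary and strip the trailing jre/bin/java:
$ readlink -f $(which java)
/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/bin/java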

5. Install and configure Hadoop

  • Download Hadoop (closer.cgi is a mirror-selection page, not the tarball itself, so fetch from a direct mirror such as the Apache archive)
$ cd /opt
$ wget https://archive.apache.org/dist/hadoop/common/hadoop-3.2.0/hadoop-3.2.0.tar.gz
  • Extract Hadoop
$ tar xf hadoop-3.2.0.tar.gz
  • Configure the Hadoop environment variables in /etc/profile.d/hadoop.sh
export HADOOP_HOME=/opt/hadoop-3.2.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME

export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_LIBEXEC_DIR=$HADOOP_HOME/libexec
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native:$JAVA_LIBRARY_PATH
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

export HDFS_DATANODE_USER=root
export HDFS_DATANODE_SECURE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export HDFS_NAMENODE_USER=root

export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
  • Reload the environment variables
$ source /etc/profile
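    A quick sanity check that the variables took effect; this should print the Hadoop 3.2.0 version banner:
$ hadoop version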
  • Create a directory for Hadoop data; on a production system it is better to put it on a dedicated filesystem
mkdir -p /data/hadoop
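    The directory must exist on every node, since the DataNodes and NodeManagers on node2 and node3 write under it as well; using the passwordless SSH set up earlier:
$ ssh node2 mkdir -p /data/hadoop
$ ssh node3 mkdir -p /data/hadoop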
  • Edit $HADOOP_HOME/etc/hadoop/core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://node1:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/data/hadoop/tmp</value>
    </property>
</configuration>
  • Edit $HADOOP_HOME/etc/hadoop/hdfs-site.xml to set the replication factor and the paths where data is stored
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/data/hadoop/hdfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/data/hadoop/hdfs/data</value>
    </property>
    <property>
        <name>dfs.namenode.http-address</name>
        <value>node1:50070</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>node2:50090</value>
    </property>
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
</configuration>
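    After editing, effective values can be checked with hdfs getconf, for example the replication factor (which should not exceed the number of DataNodes, three here):
$ hdfs getconf -confKey dfs.replication
3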
  • Edit $HADOOP_HOME/etc/hadoop/mapred-site.xml so that MapReduce jobs run on the YARN framework. Compared with earlier Hadoop versions, the second property below is new:
    without mapreduce.application.classpath set, MapReduce jobs fail at runtime with:
    Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.application.classpath</name>
        <value>
            /opt/hadoop-3.2.0/etc/hadoop,
            /opt/hadoop-3.2.0/share/hadoop/common/*,
            /opt/hadoop-3.2.0/share/hadoop/common/lib/*,
            /opt/hadoop-3.2.0/share/hadoop/hdfs/*,
            /opt/hadoop-3.2.0/share/hadoop/hdfs/lib/*,
            /opt/hadoop-3.2.0/share/hadoop/mapreduce/*,
            /opt/hadoop-3.2.0/share/hadoop/mapreduce/lib/*,
            /opt/hadoop-3.2.0/share/hadoop/yarn/*,
            /opt/hadoop-3.2.0/share/hadoop/yarn/lib/*
        </value>
    </property>
</configuration>
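    A less path-specific alternative (a sketch based on the Hadoop 3 cluster-setup documentation, not what this post uses) is to export HADOOP_MAPRED_HOME to the MapReduce processes inside the same <configuration> block, instead of listing every jar directory:
<property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=/opt/hadoop-3.2.0</value>
</property>
<property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=/opt/hadoop-3.2.0</value>
</property>
<property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=/opt/hadoop-3.2.0</value>
</property>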
  • Edit $HADOOP_HOME/etc/hadoop/yarn-site.xml to name the ResourceManager host and enable the MapReduce shuffle service
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>node1</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
  • Add all nodes to $HADOOP_HOME/etc/hadoop/workers (the file is named workers in Hadoop 3; it was slaves in Hadoop 2)
node1
node2
node3
  • Copy the environment file /etc/profile.d/hadoop.sh and the /opt/hadoop-3.2.0 directory to node2 and node3
$ scp /etc/profile.d/hadoop.sh node2:/etc/profile.d/hadoop.sh
$ scp /etc/profile.d/hadoop.sh node3:/etc/profile.d/hadoop.sh
$ scp -r /opt/hadoop-3.2.0 node2:/opt/
$ scp -r /opt/hadoop-3.2.0 node3:/opt/
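    node2 and node3 also need Java and the JAVA_HOME setting from step 4; assuming the same yum repositories are reachable there:
$ ssh node2 yum -y install java java-devel
$ ssh node3 yum -y install java java-devel
$ scp /etc/profile.d/java.sh node2:/etc/profile.d/java.sh
$ scp /etc/profile.d/java.sh node3:/etc/profile.d/java.sh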
  • Initialize the data directories by formatting the NameNode
$ hdfs namenode -format
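    Run the format once, on node1 only; reformatting later wipes the HDFS metadata. A successful run should log a line like:
INFO common.Storage: Storage directory /data/hadoop/hdfs/name has been successfully formatted.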
  • Start and stop commands (these scripts live in sbin, not bin)
$ $HADOOP_HOME/sbin/start-all.sh
$ $HADOOP_HOME/sbin/stop-all.sh
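    After start-all.sh, jps shows which daemons came up on each node. With the layout above, node1 should list NameNode, ResourceManager, DataNode and NodeManager; node2 SecondaryNameNode, DataNode and NodeManager; node3 DataNode and NodeManager:
$ jps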
  • Open http://192.168.56.201:50070 in a browser to view the NameNode web UI (the port matches the dfs.namenode.http-address set above)
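    As an end-to-end check, the examples jar shipped with the release can run a small job; this also exercises the mapreduce.application.classpath setting from above:
$ yarn jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.0.jar pi 2 10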
