[Big Data Engineer] [Spark] Installing Spark 2 on CentOS 7


Author: 炼狱腾蛇Eric | Published 2019-07-12 16:43

1. Official sites

https://www.scala-lang.org/
http://spark.apache.org/

2. Downloads

https://downloads.lightbend.com/scala/2.13.0/scala-2.13.0.tgz
http://spark.apache.org/downloads.html

3. Environment

  • OS: CentOS 7.6
  • Scala: 2.13
  • JDK: 1.8
  • Spark: 2.4

4. Base environment configuration

  • Extract Scala
$ tar xf scala-2.13.0.tgz -C /opt
  • Configure the environment variables in /etc/profile.d/scala.sh
export SCALA_HOME=/opt/scala-2.13.0
export PATH=${SCALA_HOME}/bin:$PATH
  • Configure the environment variables in /etc/profile.d/spark.sh
export SPARK_HOME=/opt/spark-2.4.3-bin-hadoop2.7
export PATH=${SPARK_HOME}/bin:$PATH
export PYTHONPATH=${SPARK_HOME}/python:${SPARK_HOME}/python/lib/py4j-0.10.7-src.zip:$PYTHONPATH
export SPARK_WORKER_MEMORY=1G
export SPARK_MASTER_HOST=node1
export SPARK_PID_DIR=/data/spark/pid
export SPARK_LOCAL_DIRS=/data/spark/spark_shuffle
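
Once both profile scripts are in place, a new login shell (or sourcing them) puts the Scala and Spark binaries on the search path. A minimal sketch of how the PATH entries compose, with the install paths assumed from the steps above:

```shell
# Sketch of what the two profile scripts do to PATH
# (install paths assumed from the steps above)
export SCALA_HOME=/opt/scala-2.13.0
export SPARK_HOME=/opt/spark-2.4.3-bin-hadoop2.7
export PATH=${SPARK_HOME}/bin:${SCALA_HOME}/bin:$PATH

# The Spark and Scala bin directories now lead the search path
echo "$PATH" | cut -d: -f1-2
```

With the variables set, `scala -version` and `spark-shell --version` should resolve without absolute paths.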

5. Install Spark

  • Extract Spark
$ tar xf spark-2.4.3-bin-hadoop2.7.tgz -C /opt
  • Edit $SPARK_HOME/conf/spark-defaults.conf and append:
spark.master                     spark://node1:7077
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs://node1:9000/eventLog
spark.serializer                 org.apache.spark.serializer.KryoSerializer
spark.driver.memory              1g
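
One caveat: with spark.eventLog.enabled set to true, the HDFS event log directory must already exist (e.g. `hdfs dfs -mkdir -p /eventLog`), or applications will fail on startup. The file uses Spark's simple whitespace-separated key/value format; a sketch of appending the settings, with a temp file standing in for $SPARK_HOME/conf/spark-defaults.conf:

```shell
# Sketch: append settings in spark-defaults.conf's key/value format
# (a temp file stands in for $SPARK_HOME/conf/spark-defaults.conf)
conf=$(mktemp)
cat >> "$conf" <<'EOF'
spark.master                     spark://node1:7077
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs://node1:9000/eventLog
spark.serializer                 org.apache.spark.serializer.KryoSerializer
spark.driver.memory              1g
EOF

# Every setting line starts with "spark."
grep -c '^spark\.' "$conf"   # → 5
```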
  • Edit $SPARK_HOME/conf/slaves:
node1
node2
node3
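
The slaves file is plain text with one worker hostname per line; here node1 doubles as both master and worker. A sketch of generating it, with a temp file standing in for $SPARK_HOME/conf/slaves:

```shell
# Sketch: the slaves file lists one worker hostname per line
# (a temp file stands in for $SPARK_HOME/conf/slaves)
slaves=$(mktemp)
printf '%s\n' node1 node2 node3 > "$slaves"
cat "$slaves"
```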
  • Copy Scala, Spark, and the environment variable scripts to node2 and node3
$ scp /etc/profile.d/scala.sh node2:/etc/profile.d/scala.sh
$ scp /etc/profile.d/scala.sh node3:/etc/profile.d/scala.sh
$ scp /etc/profile.d/spark.sh node2:/etc/profile.d/spark.sh
$ scp /etc/profile.d/spark.sh node3:/etc/profile.d/spark.sh
$ scp -r /opt/scala-2.13.0 node2:/opt/
$ scp -r /opt/scala-2.13.0 node3:/opt/
$ scp -r /opt/spark-2.4.3-bin-hadoop2.7 node2:/opt/
$ scp -r /opt/spark-2.4.3-bin-hadoop2.7 node3:/opt/
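
The eight scp commands above can be collapsed into a loop over the two target hosts. A sketch with echo prefixed as a dry run; drop the echo to perform the copies (hostnames assumed from above):

```shell
# Dry-run loop form of the copy steps above; remove "echo" to execute
for host in node2 node3; do
  echo scp /etc/profile.d/scala.sh /etc/profile.d/spark.sh "${host}:/etc/profile.d/"
  echo scp -r /opt/scala-2.13.0 /opt/spark-2.4.3-bin-hadoop2.7 "${host}:/opt/"
done
```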
  • Start/stop Spark (start-all.sh launches the workers over SSH, so passwordless SSH from node1 to every node is required)
$SPARK_HOME/sbin/start-all.sh
$SPARK_HOME/sbin/stop-all.sh
  • Visit http://192.168.56.201:8080/; the installation succeeded if the master web UI lists all three worker nodes.

Original link: https://www.haomeiwen.com/subject/figekctx.html