1. Official sites
https://www.scala-lang.org/
http://spark.apache.org/
2. Downloads
https://downloads.lightbend.com/scala/2.13.0/scala-2.13.0.tgz
http://spark.apache.org/downloads.html
3. Environment
- OS: CentOS 7.6
- Scala: 2.13
- JDK: 1.8
- Spark: 2.4
Note: the Spark 2.4.x prebuilt binaries bundle their own Scala (2.11 for 2.4.3), so the standalone Scala 2.13 install below is only used for the scala CLI; applications compiled against this cluster should target the bundled Scala version.
4. Base environment configuration
- Refer to the earlier Hadoop installation guide for the base setup: https://www.jianshu.com/p/3df30248d0fb
- Extract Scala to /opt (matching SCALA_HOME below)
$ tar xf scala-2.13.0.tgz -C /opt
- Configure the Scala environment variables
/etc/profile.d/scala.sh
export SCALA_HOME=/opt/scala-2.13.0
export PATH=${SCALA_HOME}/bin:$PATH
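To apply the variables in the current shell and confirm the install, a quick check (assumes the extraction into /opt above):
$ source /etc/profile.d/scala.sh
$ scala -version   # should report version 2.13.0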
- Configure the Spark environment variables
/etc/profile.d/spark.sh
export SPARK_HOME=/opt/spark-2.4.3-bin-hadoop2.7
export PATH=${SPARK_HOME}/bin:$PATH
export PYTHONPATH=${SPARK_HOME}/python:${SPARK_HOME}/python/lib/py4j-0.10.7-src.zip:$PYTHONPATH
export SPARK_WORKER_MEMORY=1G
export SPARK_MASTER_HOST=node1
export SPARK_PID_DIR=/data/spark/pid
export SPARK_LOCAL_DIRS=/data/spark/spark_shuffle
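SPARK_PID_DIR and SPARK_LOCAL_DIRS must exist (on every node) before the daemons start; a minimal sketch to create them and load the variables:
$ mkdir -p /data/spark/pid /data/spark/spark_shuffle
$ source /etc/profile.d/spark.sh
$ echo ${SPARK_MASTER_HOST}   # should print node1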
5. Install Spark
- Extract Spark to /opt (matching SPARK_HOME above)
$ tar xf spark-2.4.3-bin-hadoop2.7.tgz -C /opt
- Edit $SPARK_HOME/conf/spark-defaults.conf and add:
spark.master spark://node1:7077
spark.eventLog.enabled true
spark.eventLog.dir hdfs://node1:9000/eventLog
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.driver.memory 1g
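spark.eventLog.dir points at HDFS, so the target directory has to exist before the first application runs; assuming the Hadoop setup above with the NameNode at node1:9000:
$ hdfs dfs -mkdir -p /eventLog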
- Edit $SPARK_HOME/conf/slaves and list the worker hosts (the start scripts reach them over SSH; see the note after the list):
node1
node2
node3
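start-all.sh logs in to every host listed in slaves over SSH, so passwordless SSH from node1 to all three nodes is assumed; if that is not set up yet, roughly:
$ ssh-keygen -t rsa                                # skip if a key already exists
$ for h in node1 node2 node3; do ssh-copy-id $h; done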
- Copy Scala, Spark, and the environment variable files to node2 and node3
$ scp /etc/profile.d/scala.sh node2:/etc/profile.d/scala.sh
$ scp /etc/profile.d/scala.sh node3:/etc/profile.d/scala.sh
$ scp /etc/profile.d/spark.sh node2:/etc/profile.d/spark.sh
$ scp /etc/profile.d/spark.sh node3:/etc/profile.d/spark.sh
$ scp -r /opt/scala-2.13.0 node2:/opt/
$ scp -r /opt/scala-2.13.0 node3:/opt/
$ scp -r /opt/spark-2.4.3-bin-hadoop2.7 node2:/opt/
$ scp -r /opt/spark-2.4.3-bin-hadoop2.7 node3:/opt/
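A quick check that the copies landed and the variables resolve on the other nodes (the profile is sourced explicitly because a non-login SSH shell skips /etc/profile):
$ ssh node2 'source /etc/profile && scala -version && echo $SPARK_HOME'
$ ssh node3 'source /etc/profile && scala -version && echo $SPARK_HOME'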
- Start/stop Spark (verify with jps as shown below)
$SPARK_HOME/sbin/start-all.sh
$SPARK_HOME/sbin/stop-all.sh
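After start-all.sh, jps on each node should list the daemons: node1 runs a Master plus a Worker (node1 is also in slaves), node2 and node3 each run a Worker:
$ jps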
- Open http://192.168.56.201:8080/ in a browser; if all three worker nodes show up, the cluster is running.
(screenshot: Spark master web UI listing the three worker nodes)
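As a final smoke test, submit the bundled SparkPi example to the cluster; spark.master is already set in spark-defaults.conf, and the jar name assumes the Scala 2.11 prebuilt package of Spark 2.4.3:
$ spark-submit --class org.apache.spark.examples.SparkPi \
    ${SPARK_HOME}/examples/jars/spark-examples_2.11-2.4.3.jar 100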