Spark can run without Hadoop. As long as the results (including intermediate results) do not need to be stored on HDFS and the cluster manager is not YARN, no Hadoop installation is required.
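For illustration, a minimal sketch of a Java job that talks only to the standalone master and the local file system, with no HDFS or YARN involved (class name, app name and file path are placeholders, not from the original):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class StandaloneNoHadoopExample {
    public static void main(String[] args) {
        // Connect to the standalone master directly (port 7077, see the port table below).
        SparkConf conf = new SparkConf()
                .setAppName("standalone-no-hadoop")      // placeholder app name
                .setMaster("spark://master:7077");
        JavaSparkContext jsc = new JavaSparkContext(conf);
        // Read from the local/shared file system instead of HDFS
        // (the file must be visible at this path on every worker).
        long lines = jsc.textFile("file:///tmp/input.txt").count();   // placeholder path
        System.out.println("line count = " + lines);
        jsc.stop();
    }
}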
Version plan
Component | Version |
---|---|
Hadoop | 2.7.7 |
Spark | 2.1.0 |
Scala | 2.11.8 |
ZooKeeper | 3.4.13 |
Java | 1.8.0 |
Kafka | 2.12-2.1.0 |
MongoDB | 4.2.0-rc2 |
Kafka and MongoDB are used in later chapters; their versions are listed here in advance.
Ports involved
Port | Purpose |
---|---|
8080 | Spark web UI |
7077 | Master URL port |
6066 | REST URL port |
1. Cluster layout
IP | Hostname | Master | Worker | ZK |
---|---|---|---|---|
172.*.*.6 | master | Y | N | Y |
172.*.*.7 | slave1 | N | Y | Y |
172.*.*.8 | slave2 | N | Y | Y |
172.*.*.9 | slave3 | N | Y | N |
2. Set the hostnames
Set 172.*.*.6 to master
vi /etc/sysconfig/network
HOSTNAME=master
# takes effect after a reboot, or run the command below for the current session
hostname master
Set 172.*.*.7 to slave1
vi /etc/sysconfig/network
HOSTNAME=slave1
# takes effect after a reboot, or run the command below for the current session
hostname slave1
Set 172.*.*.8 to slave2
vi /etc/sysconfig/network
HOSTNAME=slave2
# takes effect after a reboot, or run the command below for the current session
hostname slave2
Set 172.*.*.9 to slave3
vi /etc/sysconfig/network
HOSTNAME=slave3
# takes effect after a reboot, or run the command below for the current session
hostname slave3
3. Map IPs to hostnames
Configure the following on all four nodes (.6, .7, .8 and .9):
vi /etc/hosts
172.16.14.6 master
172.16.14.7 slave1
172.16.14.8 slave2
172.16.14.9 slave3
4. Passwordless SSH
# generate a key pair on each node
ssh-keygen -t rsa
# append each node's public key to authorized_keys on master
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# restrict authorized_keys to mode 600
chmod 600 authorized_keys
# the final authorized_keys should look like this
[root@localhost .ssh]# cat authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAtEvxRj/3xPCtnO38Gy4Y/Y4gj6XX5s+G2hwG5xx19PiDQEKeW3BYUDE616OVdecStBo3X+0Plr2ioirI/3WGlUkm0todr/irpksy0MTpvsjCNUnCWGUHGFMUmrcw1LSiNLhoOSS02AcIq+hw3QJO0w0Wo0EN8xcOhrYwuAByoVv3CvqWd/2Vce2rNOXxLNSmc9tR0Dl3ZqOAq+2a55GM7cETj+eiexDeF5zEVJ2vykQdH3+sZ2XLrQu4WXOMn70xFosk7E1lwJ14QLy6lpfRcWnB1JVKJx9mglze6v3U35g59Vu/LP7t3ebW+dJIOD3/Attb5HcvN8MNfQVOX3JD4w== root@master
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAuU9KJmlmYCx7D+vfMCl2Fj/kz1mfWBrChco0jmZtbygpYY8MUSjmfnsC/wefWKMnFtEruJb+RrgBLxVY6lNzvVKXh+iVPhrjubzj54FoZjepR+1EEznIvwkKa+Y4fkcSJjmcSq/Wvjvz34j3/wVoa1qZtbQing+GzC8Xt0y5rQ6fD1gzD4Oniu43fHAeQDxpo2cVNnTdO2HEe56ZfhIctVRP63rc2CoEuD7d0Ea2WhV0Uruqri/ZKFHVAQQqQ7z/jdCgzTdTXJ5t5hpyeaK8+mYhUKEyOF3xrACW1Is6grUjhbjUxTLt2y2Ytw1d5voFxCUJ6MQcy91KFE/9Lfefyw== root@slave1
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEArucVUJdZBYXJD0r9WiX6VnR5S3F7BhoR7hB8UTkXs+WRJGEX9E44yjH+BjIJAPn2v/XwOCdqzSZrGPzLL/BG+XRhGN5NGmdplv8xI3C93hC5kZewRHrHlcAG5Kv4mcHlU+ugcWiyQbIaQvLaFXaq48ZVQHYrzXrz3ZT6QDpsaZtSeW4Z4KWeFmL+AwNyAqxK0nxYXR1zNQJ1r0IdApKmP1WNvbcblB2UKx5G7VMxOs62WY0R9LGdJK6Mmmr5QPlWlpn/g5vXlBvgD80pM6iixFAyz8q19aMQjErTWuULNvX8tdcm+StJV52N8EsiuNMOs+xLVO7L00yxZRtwrXKGgQ== root@slave2
# copy master's authorized_keys to slave1/slave2/slave3
scp ~/.ssh/authorized_keys root@slave1:~/.ssh/
scp ~/.ssh/authorized_keys root@slave2:~/.ssh/
scp ~/.ssh/authorized_keys root@slave3:~/.ssh/
# verify passwordless login from master
ssh slave1
ssh slave2
ssh slave3
5. Cluster configuration
master configuration (spark-env.sh)
export SCALA_HOME=/opt/middleware/scala-2.11.8
export JAVA_HOME=/usr/local/jdk
export SPARK_MASTER_IP=master
export SPARK_MASTER_HOST=master
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_CORES=2
export SPARK_WORKER_MEMORY=3g
export SPARK_MASTER_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=master:2181,slave1:2181,slave2:2181 -Dspark.deploy.zookeeper.dir=/opt/middleware/zookeeper-3.4.13"
slave1/slave2/slave3 configuration (spark-env.sh)
Identical to the master configuration above; copy the same spark-env.sh to every worker node.
slaves file (same on every node)
vi /opt/middleware/spark-2.1.0-bin-hadoop2.7/conf/slaves
Contents:
slave1
slave2
slave3
6. Start the cluster
# run on master
/opt/middleware/spark-2.1.0-bin-hadoop2.7/sbin/start-all.sh
7. Web UI
http://172.*.*.6:8080/

(The screenshots were taken with only two slaves; this does not affect the steps.)
Common problems
Problem 1
NoClassDefFoundError: org/apache/spark/internal/Logging
Solution:
Exclude the conflicting logging dependency from the Spring Boot jar:
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
    <version>${springboot.version}</version>
    <exclusions>
        <exclusion>
            <groupId>ch.qos.logback</groupId>
            <artifactId>logback-classic</artifactId>
        </exclusion>
    </exclusions>
</dependency>
Problem 2
NoSuchMethodError: javax.validation.BootstrapConfiguration.getClockProviderClassName()Ljava/lang/String;
Solution:
Delete validation-api-1.x.Final.jar from Spark's jars directory; it conflicts with Spring Boot 2.0.
Problem 3
cannot assign instance of java.lang.invoke.SerializedLambda to field
Solution:
The error appeared on org.apache.spark.api.java.JavaRDD operations; the root cause is unknown. Setting setJars() had no effect, so the lambda expressions were replaced with Function implementations, as in the sketch below.
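A minimal sketch of that change (the RDD and element types are illustrative, not taken from the author's job):

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;

public class LambdaToFunction {
    // Before (can trigger the SerializedLambda error when driver and executor
    // classpaths differ):
    // JavaRDD<Integer> lengths = lines.map(s -> s.length());

    // After: the same transformation written as an explicit Function implementation.
    public static JavaRDD<Integer> lineLengths(JavaRDD<String> lines) {
        return lines.map(new Function<String, Integer>() {
            @Override
            public Integer call(String s) {
                return s.length();
            }
        });
    }
}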
Problem 4
NoSuchMethodError: com.google.gson.GsonBuilder.setLenient()Lcom/google/gson/GsonBuilder
Solution:
Gson version conflict; delete the gson 2.2.4 jar that ships with Spark.
Problem 5
ClassNotFoundException: com.mongodb.spark.rdd.partitioner.MongoPartition
Solution:
Copy mongo-spark-connector_2.11-2.1.5.jar into Spark's jars directory.
Problem 6
ClassNotFoundException: com.mongodb.client.model.Collation
Solution:
Copy mongo-java-driver-3.10.2.jar into Spark's jars directory (it is not clear why the jars bundled with the application itself are not picked up).
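As an alternative to copying jars into every node's jars directory, shipping them with the application via SparkConf.setJars may be worth trying. This is only a sketch under my own assumptions (the author notes in Problem 3 that setJars() did not help there, and the paths below are placeholders):

import org.apache.spark.SparkConf;

public class MongoDependencyConf {
    public static SparkConf buildConf() {
        return new SparkConf()
                .setAppName("mongo-spark-example")            // placeholder name
                .setMaster("spark://master:7077")
                // Ship the MongoDB connector and driver jars to the executors
                // instead of copying them into $SPARK_HOME/jars on every node
                // (paths are placeholders).
                .setJars(new String[]{
                        "/opt/app/libs/mongo-spark-connector_2.11-2.1.5.jar",
                        "/opt/app/libs/mongo-java-driver-3.10.2.jar"
                });
    }
}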
Problem 7
ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD
Problem 8
19/06/04 15:46:40 WARN Utils: Service 'Driver' could not bind on port 0. Attempting port 1.
19/06/04 15:46:40 WARN Utils: Service 'Driver' could not bind on port 0. Attempting port 1.
19/06/04 15:46:40 WARN Utils: Service 'Driver' could not bind on port 0. Attempting port 1.
19/06/04 15:46:40 WARN Utils: Service 'Driver' could not bind on port 0. Attempting port 1.
Exception in thread "main" java.net.BindException: Cannot assign requested address: Service 'Driver' failed after 16 retries (starting from 0)! Consider explicitly setting the appropriate port for the service 'Driver' (for example spark.ui.port for SparkUI) to an available port or increasing spark.port.maxRetries.
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:127)
at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:501)
at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1218)
at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:506)
at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:491)
at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:965)
at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:210)
at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:353)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:408)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:455)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
at java.lang.Thread.run(Thread.java:748)
Solution:
Do not set SPARK_LOCAL_IP in spark-env.sh.
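Removing SPARK_LOCAL_IP was the fix here. If the driver host has several network interfaces, another knob worth knowing is setting the driver addresses explicitly; this is an assumption of mine, not part of the author's solution, and the addresses are placeholders:

import org.apache.spark.SparkConf;

public class DriverBindConf {
    public static SparkConf buildConf() {
        return new SparkConf()
                .setAppName("driver-bind-example")            // placeholder name
                .setMaster("spark://master:7077")
                // Address the master and executors use to reach the driver;
                // must be routable from the cluster (placeholder value).
                .set("spark.driver.host", "172.16.14.100")
                // Local interface the driver binds its services to (Spark 2.1+).
                .set("spark.driver.bindAddress", "0.0.0.0");
    }
}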
Problem 9
check your cluster UI to ensure that workers are registered and have sufficient resources
Cause:
SPARK_WORKER_MEMORY was configured as 2g,
while the application requested spark.executor.cores=3 and spark.executor.memory=1g; a single worker (2 cores, 2g) could not satisfy that request, so no executor could ever be allocated.
Solution:
In spark-env.sh, set
export SPARK_WORKER_MEMORY=3g
When submitting with SparkLauncher, set (see the sketch below)
spark.executor.cores=2
spark.executor.memory=1g
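A minimal SparkLauncher sketch with those settings, keeping each executor within the 2 cores / 3g a worker now offers (the application jar, main class and polling loop are placeholders, not from the original):

import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public class LaunchWithResources {
    public static void main(String[] args) throws Exception {
        SparkAppHandle handle = new SparkLauncher()
                .setSparkHome("/opt/middleware/spark-2.1.0-bin-hadoop2.7")
                .setMaster("spark://master:7077")
                .setAppResource("/opt/app/my-spark-job.jar")    // placeholder jar
                .setMainClass("com.example.MySparkJob")         // placeholder class
                // Stay within SPARK_WORKER_CORES=2 and SPARK_WORKER_MEMORY=3g.
                .setConf(SparkLauncher.EXECUTOR_CORES, "2")
                .setConf(SparkLauncher.EXECUTOR_MEMORY, "1g")
                .startApplication();
        // Wait until the application reaches a final state.
        while (!handle.getState().isFinal()) {
            Thread.sleep(1000);
        }
        System.out.println("final state: " + handle.getState());
    }
}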