Spark | WordCount

Author: icebreakeros | Published: 2019-07-05 20:34

wordcount

spark-shell

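# Connect to a single standalone master
./bin/spark-shell \
--master spark://192.168.219.51:7077 \
--executor-memory 2g

# Connect to a cluster with several masters (comma-separated list)
./bin/spark-shell \
--master spark://192.168.219.51:7077,192.168.219.52:7077 \
--executor-memory 2g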

val lines = sc.textFile("/opt/data/spark/hellospark")
lines.count()
lines.first()

# Change the log level
vim ./conf/log4j.properties
log4j.rootCategory=INFO, console

spark word count

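vim WordCount.scala
import org.apache.spark.{SparkConf, SparkContext}

object WordCount {

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("wordcount")
    val sc = new SparkContext(conf)
    // Read the input file as an RDD of lines
    val input = sc.textFile("/data/spark/demo/word_count")
    // Split each line on spaces to get an RDD of words
    val words = input.flatMap(line => line.split(" "))
    // Pair each word with 1, then sum the counts per word
    val counts = words.map(word => (word, 1)).reduceByKey(_ + _)
    // Write the (word, count) pairs; the output directory must not already exist
    counts.saveAsTextFile("/data/spark/demo/word_count_result")
    sc.stop()
  }
}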
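
To see what the flatMap, map and reduceByKey steps compute without a cluster, the same pipeline can be sketched on plain Scala collections. This is only an illustration: the object name WordCountSketch, the helper wordCount and the sample input are made up for this example, and groupBy followed by a sum stands in for Spark's reduceByKey.

```scala
// Plain Scala collections stand in for the RDD here; no Spark is required.
object WordCountSketch {

  // Mirrors flatMap -> map -> reduceByKey from the Spark job.
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(line => line.split(" "))   // one element per word
      .map(word => (word, 1))             // pair each word with a count of 1
      .groupBy { case (word, _) => word } // gather the pairs for each word
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // sum the 1s

  def main(args: Array[String]): Unit = {
    val sample = Seq("hello spark", "hello world") // made-up input
    // Prints (hello,2), (spark,1), (world,1), one pair per line
    wordCount(sample).toSeq.sortBy(_._1).foreach(println)
  }
}
```

In the Spark job, saveAsTextFile writes each pair using its string form, so the lines in the result files should look like the pairs printed here.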

File - Project Structure - Artifacts - "+" - Jar - from modules
Build - Build Artifacts

./bin/spark-submit \
--master spark://local.localdomain:7077 \
--class WordCount /data/spark/demo/jar/spark-demo.jar
